Prompting: the contract with the model
System vs user, clear instructions, few-shot and the limits of what prompting solves.
5 min read
You already have a capable model — but it has no idea what you want until you say so. The prompt is the contract: it defines role, rules, format, and examples before any response appears. Understanding this structure is what separates 'try until it works' from deliberately engineering behavior.
The three roles in a conversation
Every modern LLM receives messages labeled with a role. The three that matter are system, user, and assistant.
System is where you, the architect, speak to the model before the user does. It's the space to define identity, tone, constraints, and output format. Think of it as the briefing a manager gives an employee before a client meeting: the employee (model) will interact with the client (user), but the rules are already set.
User is the message from whoever is using the system — a human typing, or your application assembling a string programmatically. Assistant is the model's reply; in few-shot prompting (covered next) you also inject assistant messages to show expected response examples.
A common mistake is putting everything in the user role and skipping system entirely. The result is a model that ignores constraints because they arrived mixed in with the request, carrying no configuration authority. Business rules, persona, and format belong in system. User should carry only the variable data — the question, the text to process, the session context.
Anatomy of a well-structured prompt
Prompt assembly flow through to model response
- system · papel, regras, formato
- assistant (exemplos) · few-shot opcional
- user · dado variável da sessão
- Context Window · tokens concatenados
- LLM Inference · next-token prediction
- Resposta · no formato definido
- Validação · (app / schema)
What makes a prompt good
A good prompt has four ingredients: context, clear instruction, output format, and, when needed, examples.
Context is what the model needs to know so it doesn't invent: who the user is, what the domain is, what constraints apply. Without context, the model fills gaps with probability — and probability isn't always what you want.
Clear instruction means action verb + scope + constraint. "Summarize" is weak. "Summarize in at most three sentences, focusing on financial impact, without mentioning people's names" is a contract.
Output format reduces post-processing. If you need JSON, say so in system. If you need a numbered list, say so. The model follows format when instructed — not by default.
Examples (few-shot) are the most powerful shortcut when the instruction alone is ambiguous. Showing two or three expected input/output pairs calibrates the model better than any adjective like 'detailed' or 'professional'. Zero-shot works for simple, well-defined tasks; few-shot steps in when the output pattern is too specific to describe with words alone.
A pattern I use in production: system defines persona + rules + format; user carries only {{variable}}. This makes the prompt versionable, testable, and decoupled from the data.
Building a structured prompt — practical example
- 1
System: persona and rules
``
system: You are a technical support triage assistant. Respond ONLY in JSON with keys: { "categoria": string, "prioridade": "alta|media|baixa", "resumo": string }. Do not invent information absent from the ticket.`` - 2
Few-shot: two examples in history
``
user: "I can't log in since yesterday, error 401." assistant: {"categoria":"autenticacao","prioridade":"alta","resumo":"Login failure with 401 error for over 24h."} user: "I want to change the button color on the dashboard." assistant: {"categoria":"ui","prioridade":"baixa","resumo":"Aesthetic request on the dashboard."}`` - 3
User: session variable data
``
user: "{{ticket_text}}"`` Just that. The data changes; the contract doesn't.
Roles in a model conversation
Tap a concept, then its definition.
You'll hear a lot about 'think step by step' (chain-of-thought). What it does is simple: it forces the model to generate intermediate reasoning tokens before giving the final answer. Since the model predicts the next token based on everything that came before (lesson 03), having the reasoning written in the context window improves the final response on tasks requiring multiple logical steps. It's not a trick — it's a direct consequence of how the model works. I use it for complex classification and root-cause analysis; I avoid it for short responses where the extra reasoning only adds latency and cost.
The limits of prompting — what it doesn't solve
Here's the part I see ignored most often: a prompt does not give the model new knowledge. If the model was trained through March 2024 and you need it to answer about an October 2024 event, no prompt fixes that. The model will hallucinate or say it doesn't know — and both are correct behaviors given what it has.
Likewise, a prompt does not give the model tools. You can instruct the model to 'query the database', but it has no way to do that on its own. The ability to act in the world — calling APIs, retrieving documents, executing code — comes from architecture: RAG for external knowledge, tool calling for actions. Those are the next two lessons.
What a prompt controls: behavior (tone, format, constraints), reasoning strategy (chain-of-thought, problem decomposition), and example calibration (few-shot). What a prompt does not control: facts beyond training, real-time data, and code execution or external calls.
Understanding this boundary is essential to avoid building brittle systems. When a requirement doesn't fit in the prompt, the architectural answer is RAG or tools — not a bigger prompt.
Zero-shot vs Few-shot: when to use each
| Criterion | Zero-shot | Few-shot | |
|---|---|---|---|
| When to use | Simple, well-defined task | Specific or ambiguous output pattern | — |
| Token cost | Low | Medium–high (examples consume context) | — |
| Format consistency | Depends on written instruction | High — the example shows exactly what's expected | — |
| Maintenance | Simpler | Examples need to be reviewed with the model | — |
Key takeaways from this lesson
Frequently asked questions
Will prompt engineering disappear with better models?
Better models reduce the need for tricks, but don't eliminate the need for clear instruction. The more capable the model, the more important it is to say exactly what you want — because it will do exactly that, with higher fidelity.
Can I put security rules only in the system prompt?
Not as the only layer. System prompt matters, but it can be bypassed via prompt injection. Lesson 10 covers guardrails and why security needs defense in depth, not just instruction.
How many few-shot examples are enough?
Generally, two to five cover most cases. More than that rarely helps and consumes context that could be used for real data. Test with two and add more only if consistency degrades.
Module 1 Checkpoint
You've reached the end of the first module. We covered what AI is and where LLMs fit, how a model learns and what parameters are, how tokens and context work, what embeddings and semantic search are, and now how to structure the contract with the model via prompting. These five concepts are the foundation for everything ahead — RAG, tools, agents, evaluation. Next is an association exercise to consolidate conversation roles, then the module quiz. If you can answer without looking back, you're ready for Module 2.
Checkpoint — Module 1
1. What can prompting NOT solve on its own?
2. The system prompt is for…