Prompt engineering became a celebrated skill in 2022–2023. By 2026 the field has matured: many early techniques have been superseded by better models, while a smaller set of genuine principles remains important. Here is the honest state of the art.
What Has Become Less Important
Chain-of-thought hacks: in 2022, adding “let’s think step by step” to a prompt dramatically improved performance on reasoning tasks. Modern models reason step by step without being asked, and do it better than the prompt trick did.
Elaborate system prompt templates: complex system prompts with dozens of rules, nested instructions, and conditional logic were necessary when models were worse at following instructions. Modern frontier models (Claude 3.5+, GPT-4o) follow clear, direct instructions reliably, so a concise system prompt is often better than a complex one.
Output format forcing: prompts that included “respond only in JSON, never say anything else, if you don’t know output null” were necessary when models were bad at structured output. Today, use a structured output mode (JSON mode in OpenAI, tool use in Claude) and let the model handle formatting.
Jailbreaking as a benchmark: the creative prompt sequences that early users relied on to bypass safety guidelines are now largely caught by model-level safety training. Pursuing them is both less effective and ethically questionable.
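To make the structured-output point above concrete, here is a minimal sketch of what replaces the old “respond only in JSON” prose: a schema handed to the provider's structured output mode, plus a small check on the reply. The ticket fields, `TICKET_SCHEMA`, and `parse_ticket` are hypothetical names invented for illustration. The validator is a stdlib-only stand-in that covers just the schema features used here, not a full JSON Schema implementation.

```python
import json

# Hypothetical schema, invented for illustration. A structured output mode is
# typically given a schema like this instead of prose formatting rules.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "bug", "other"]},
        "urgent": {"type": "boolean"},
    },
    "required": ["category", "urgent"],
}

# Maps the two JSON Schema type names used above to Python types.
_PY_TYPES = {"string": str, "boolean": bool}


def parse_ticket(raw: str, schema: dict = TICKET_SCHEMA) -> dict:
    """Parse a model reply and check it against the small schema above.

    Only 'required', 'type', and 'enum' are checked.
    """
    data = json.loads(raw)  # malformed JSON raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for key in schema["required"]:
        if key not in data:
            raise ValueError(f"missing field: {key}")
    for key, rule in schema["properties"].items():
        if key not in data:
            continue
        value = data[key]
        if not isinstance(value, _PY_TYPES[rule["type"]]):
            raise ValueError(f"wrong type for {key}")
        if "enum" in rule and value not in rule["enum"]:
            raise ValueError(f"unexpected value for {key}: {value!r}")
    return data
```

Even with a structured output mode enabled, a cheap check like this at the boundary is a reasonable design choice, because it turns a silent format drift into an explicit error.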
What Still Matters: The Core Principles
Specificity over vagueness: “Write a 300-word summary of this contract’s termination clauses, targeting a non-lawyer reader, in bullet points” is reliably better than “summarize this contract.” The model cannot read your mind; specify the output format, length, audience, and purpose.
Persona and role specification: “You are an experienced backend engineer reviewing this code for security vulnerabilities” produces a better security review than “review this code.” The persona sets expectations about what the model should focus on and what level of expertise to apply.
Examples (few-shot): providing 2–3 examples of the desired input-output pattern dramatically improves consistency for structured tasks such as classification, extraction, and reformatting. For one-off tasks, examples often help less than people think.
Context placement: put the most important context at the beginning and end of a prompt, not buried in the middle. Models tend to under-attend to material in the middle of a long context, an effect often described as “lost in the middle.”
Iterative refinement: treating prompts as code (versioning them, testing them against examples, and tracking which changes improved which metrics) is the practice that separates professional prompt engineering from casual use.
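The few-shot and prompts-as-code ideas above can be sketched together as a small regression harness. Everything named here (`FEW_SHOT`, `PROMPTS`, `CASES`, `build_prompt`, `accuracy`, `stub_model`) is hypothetical and invented for illustration. In particular, `stub_model` is a keyword-matching stand-in so the sketch runs offline; in practice it would be replaced by a call to whichever model API is in use.

```python
from typing import Callable

# Hypothetical few-shot examples and labelled test cases.
FEW_SHOT = [
    ("The app crashes when I upload a photo.", "bug"),
    ("I was charged twice this month.", "billing"),
]
CASES = [
    ("I want a refund for last week.", "billing"),
    ("The page shows an error on login.", "bug"),
    ("Do you have a dark mode?", "other"),
]

# Prompt versions live in one place so changes can be compared and tracked.
PROMPTS = {
    "v1": "Classify the support message.",
    "v2": "Classify the support message as exactly one of: bug, billing, other. "
          "Reply with the label only.",
}


def build_prompt(instruction: str, message: str, examples=FEW_SHOT) -> str:
    """Assemble the instruction, the few-shot examples, and the new input."""
    parts = [instruction, ""]
    for text, label in examples:
        parts.append(f"Message: {text}\nLabel: {label}\n")
    parts.append(f"Message: {message}\nLabel:")
    return "\n".join(parts)


def accuracy(model: Callable[[str], str], version: str, cases=CASES) -> float:
    """Fraction of cases where the model's label matches the expected label."""
    hits = 0
    for message, expected in cases:
        reply = model(build_prompt(PROMPTS[version], message))
        hits += int(reply.strip().lower() == expected)
    return hits / len(cases)


def stub_model(prompt: str) -> str:
    """Offline stand-in for a real model: keyword match on the final message."""
    last = prompt.rsplit("Message:", 1)[1].lower()
    if "charged" in last or "refund" in last:
        return "billing"
    if "crash" in last or "error" in last:
        return "bug"
    return "other"


if __name__ == "__main__":
    for version in PROMPTS:
        print(version, accuracy(stub_model, version))
```

Because the stub ignores the instruction text, every version scores the same here. With a real model behind the same `Callable[[str], str]` interface, the per-version scores are what one would track from change to change.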
Advanced Techniques That Work
Self-critique prompting: asking the model to evaluate its own output against explicit criteria before finalising (“before responding, check: does this answer address the original question? Does it contain any factual claims you’re uncertain about?”). This is sometimes called “Constitutional AI prompting,” although Constitutional AI strictly refers to a training method rather than a prompting technique. The self-evaluation step can improve accuracy on complex tasks.
Structured reasoning elicitation: for complex analytical tasks, instructing the model to use a specific framework (“analyse this business decision using the MECE framework” or “list the pros, cons, and key uncertainties before recommending”) produces more thorough and structured outputs.
Negative constraints: telling the model what not to do is often more effective than adding more instructions about what to do. For example, “do not speculate about figures you don’t have in the document” is more reliable than “only use information from the document.”
Tool use over prompting: for tasks that require current information, calculation, or code execution, providing the model with tools (web search, code interpreter, calculator) is far more reliable than prompting the model to do these things from memory.
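As a rough illustration of the tool-use point, the sketch below shows the application side of a calculator tool: the model asks for a calculation, and ordinary code does the arithmetic. The request shape `{"name": ..., "arguments": {...}}`, the `TOOLS` registry, and `run_tool_call` are assumptions made for this example; each provider's API defines its own tool-call format.

```python
import ast
import operator

# Arithmetic operators the calculator tool is willing to evaluate.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculator(expression: str) -> float:
    """Evaluate a plain arithmetic expression without calling eval()."""

    def walk(node):
        # Numeric literals only; booleans and strings are rejected.
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


# Hypothetical tool registry keyed by tool name.
TOOLS = {"calculator": calculator}


def run_tool_call(call: dict):
    """Dispatch a tool request of the assumed form {"name", "arguments"}."""
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])
```

The result of `run_tool_call` would then be passed back to the model as the tool's output, so the final answer rests on computed arithmetic instead of recalled arithmetic. Walking the syntax tree and allowing only a fixed set of operators keeps model-supplied text from ever reaching `eval()`.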




