
Prompt engineering as a practice gave us the statement: “English is the new programming language”. It did indeed bestow real power for doing many tasks with language models. However, in the last year or so things have started to change in interesting ways, with the emergence of a new discipline called “context engineering”.
Let me break this down a bit.
AI agents and models need context to perform tasks, and they receive it through a construct called the context window: the finite span of tokens a model can attend to at once. Think of this as the working set of information required to complete the task.
Context engineering is the science, art, and systematic method of filling this context window just right.
If you look at the context window as a finite, expensive hardware resource (like RAM), this parallels the writing of resource-efficient code. It needs an engineering and software architect’s mindset to get right.
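To make the RAM analogy concrete, here is a minimal sketch of budgeting tokens before assembling a prompt. The function names, the priority-ordered sections, and the rough 4-characters-per-token heuristic are all illustrative assumptions, not any particular framework's API.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate (~4 characters per token for English)."""
    return max(1, len(text) // 4)

def fit_to_budget(sections: list[tuple[str, str]], budget: int) -> list[tuple[str, str]]:
    """Keep sections in priority order until the token budget is spent.

    `sections` is a list of (name, text) pairs, highest priority first.
    """
    kept, used = [], 0
    for name, text in sections:
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # this section doesn't fit; skip it, keep the budget intact
        kept.append((name, text))
        used += cost
    return kept

# Hypothetical prompt sections, most important first.
sections = [
    ("system", "You are a helpful assistant." * 2),
    ("task", "Summarise the quarterly report."),
    ("history", "x" * 4000),   # oversized chat history, too big to fit
    ("docs", "Relevant retrieved passage about Q3 revenue."),
]
print([name for name, _ in fit_to_budget(sections, budget=100)])
```

Just as with RAM, the scarce resource forces an explicit eviction policy: here the oversized history is dropped while higher-priority sections survive.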
A Systems Architect’s Lens on Context
This is akin to the challenges a systems architect faces in managing scarce resources such as memory, cache, and I/O bandwidth. Context engineering has very similar challenges, just operating on tokens instead of bytes.
LLMs have a training cut-off date; anything beyond it must be fed in as current information, by means of RAG and contextual session data. This is crucial for AI agents built on top of LLMs to reason and act with situational awareness.
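The RAG step above can be sketched minimally: retrieve the most relevant post-cutoff facts, then place them into the prompt. The corpus, the naive word-overlap scoring, and the prompt template are all illustrative assumptions; a real system would use embedding search and an actual LLM call.

```python
def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Score documents by naive word overlap with the query; return the top-k."""
    q_words = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Assemble a prompt that grounds the model in retrieved, current facts."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical post-cutoff knowledge the base model cannot know.
corpus = [
    "The v2 API was released in March and deprecates the /legacy endpoint.",
    "Our office coffee machine supports oat milk.",
    "Rate limits for the v2 API are 100 requests per minute.",
]
print(build_prompt("What are the rate limits of the v2 API?", corpus))
```

The point is the shape of the pipeline, not the scoring: relevant facts make it into the window, irrelevant ones (the coffee machine) do not.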
If context is not properly optimised, multiple challenges follow: degraded reasoning as the window fills with noise, ballooning token costs, and agents acting on stale or irrelevant information.
This is why complex agent systems often fail in production. The culprit is not the model or the agent; it is the systems architecture of the whole application, in which optimal context engineering must be a first-class concern.
How do we supply the right context, at the right time, in the right amount, to the right agent?
Well, there are four pillars here.
So as we can see, context engineering needs systems thinking. It is the foundation of scalable, reliable, intelligent agents.
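The "right context, to the right agent" question can be sketched as a routing problem: each context item carries a tag, and each agent declares which tags it needs, within its own token budget. The tags, agent roles, and data shapes here are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class ContextItem:
    tag: str       # e.g. "billing", "code", "chat_history"
    text: str
    tokens: int    # precomputed token cost of this item

def context_for(agent_tags: set[str], items: list[ContextItem], budget: int) -> list[ContextItem]:
    """Select only items whose tag the agent cares about, within a token budget."""
    selected, used = [], 0
    for item in items:
        if item.tag in agent_tags and used + item.tokens <= budget:
            selected.append(item)
            used += item.tokens
    return selected

items = [
    ContextItem("billing", "Invoice #1023 is overdue.", 8),
    ContextItem("code", "def refund(order_id): ...", 10),
    ContextItem("billing", "Customer plan: enterprise.", 7),
]
# A hypothetical billing agent sees only billing context, nothing else.
billing_view = context_for({"billing"}, items, budget=20)
print([i.text for i in billing_view])
```

This is systems thinking in miniature: scoping plus budgeting, rather than dumping the full shared history into every agent.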
This Series: Going Deep and Broad
In the coming posts, I’ll explore each pillar like a systems architect would:
How do you build a memory hierarchy?
How do you prevent context poisoning?
How do you make agents cost-efficient without losing fidelity?
How do you design sub-agents with isolated state?
How do you create adaptive summarization pipelines?
How do you run experiments and measure context quality?
How do you architect multi-agent systems that don’t collapse under their own history?
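As a taste of the first question above, a memory hierarchy can be sketched as two tiers: recent turns kept verbatim, older turns evicted into a compressed long-term store. The class below is a toy sketch; the summariser is a trivial stand-in (an assumption) where a real system would make an LLM summarisation call.

```python
from collections import deque

class MemoryHierarchy:
    def __init__(self, short_term_size: int = 3):
        self.short_term = deque(maxlen=short_term_size)  # verbatim recent turns
        self.long_term: list[str] = []                   # compressed older turns

    def add(self, turn: str) -> None:
        if len(self.short_term) == self.short_term.maxlen:
            evicted = self.short_term[0]  # oldest turn, about to be dropped
            self.long_term.append(self._summarise(evicted))
        self.short_term.append(turn)

    @staticmethod
    def _summarise(text: str) -> str:
        # Stand-in for an LLM summarisation call: keep the first few words.
        return " ".join(text.split()[:4]) + " ..."

    def context(self) -> str:
        """Long-term summaries first, then verbatim recent turns."""
        return "\n".join(self.long_term + list(self.short_term))

mem = MemoryHierarchy(short_term_size=2)
for turn in [
    "user: my order 42 never arrived and I want a refund",
    "agent: I can help with that",
    "user: it was placed last Tuesday",
]:
    mem.add(turn)
print(mem.context())
```

The eviction boundary is where the real engineering lives: what to compress, when, and how much fidelity to sacrifice for token savings.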
This is the frontier. And smart architects have a huge opportunity here — because this is no longer about prompts and hacks.
This is about architecture, systems design, optimization, state machines, abstractions, and scalable reasoning. So buckle up, it will be an interesting ride for all fellow Architects.