Your AI coding assistant can read every file in your project. It can grep for symbols, trace imports, and read function signatures all day long. What it cannot do is understand why you built things that way.

That gap between “what the code says” and “what the team intended” is where most AI-assisted development falls apart. The agent finds the authentication middleware. It reads the JWT validation logic. But it has no idea that you chose JWTs over sessions because of a compliance requirement from legal, or that the error handling strategy follows RFC 7807 because your API consumers demanded it. That context lives in your head, in a Slack thread from six months ago, or in a design document that no one has opened since it was written.

Context-driven development is the practice of closing that gap on purpose.

The Industry Is Already Converging on This

If you use AI coding tools, you have probably noticed something. Every major player has independently arrived at the same idea: a markdown file in your repo root that tells the AI how to behave.

Anthropic has CLAUDE.md. Gemini CLI has GEMINI.md. OpenAI’s Codex CLI uses AGENTS.md. Cursor has its rules directory. GitHub Copilot reads copilot-instructions.md. The Agentic AI Foundation, operating under the Linux Foundation, formally adopted AGENTS.md as an ecosystem standard in 2025. Over 60,000 repositories on GitHub now use the format.

This convergence tells us something. When five competing companies independently build the same feature, they are responding to the same pain. The pain is that AI agents without project context produce generic, often wrong, output. Give them a file that says “we use Postgres, not MySQL” and “all API responses follow this schema” and the output quality jumps immediately.
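What does such a file actually contain? Here is a minimal, invented example in the CLAUDE.md/AGENTS.md style; the section names and conventions below are illustrative, not any vendor's required schema:

```markdown
# CLAUDE.md

## Stack
- Postgres 16, not MySQL; migrations live in db/migrations
- All API responses follow the envelope described in docs/api-schema.md

## Conventions
- Errors use RFC 7807 problem+json; see docs/errors.md
- Don't add a dependency without updating docs/dependencies.md
```

A few dozen lines like these are enough to stop an agent from scaffolding a MySQL connection or inventing its own error shape.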

But a single markdown file is just the beginning. It is a sticky note on the monitor of a very fast, very literal junior developer. It helps. It is not enough.

The Problem with Search and Read

The default way AI coding agents explore a codebase is what I think of as the search-and-read loop. The agent receives a question. It greps for relevant keywords. It reads a few files. It greps again based on what it found. It reads more files. Eventually, it synthesizes an answer.

This works well for a certain class of question. “Where is the createOrder function defined?” Grep finds it in seconds. “What does this config file do?” The agent reads it and tells you. These are location-based questions. The developer already knows what they are looking for. They just need the agent to find it.

The loop breaks down on a different class of question. The conceptual ones. “How does authentication work in this project?” Now the agent has to guess which keywords to search for. auth? jwt? session? login? passport? Each search returns a dozen files. The agent reads four or five of them, burning through tokens and context window space. Some of the files turn out to be test fixtures or deprecated modules. After five or six tool calls and maybe 5,000 tokens of file reading, the agent has pieced together a partial answer from source code alone.

And it still does not know why any of it was built that way.

The real cost is not just the tokens. Every file the agent reads sits in the context window for the rest of the conversation. In a long working session where the developer asks several architectural questions, the search-and-read pattern can consume 25,000 to 35,000 tokens just on file reading. That is context window space that could hold actual conversation, actual instructions, actual work.

What RAG Changes

RAG stands for Retrieval-Augmented Generation. In plain terms, it means pre-indexing your documentation into a searchable database that understands meaning, not just keywords.

Here is how it works. You point the system at your documentation directory. It reads every file, splits them into chunks, and converts each chunk into a numerical representation called an embedding. These embeddings capture the semantic meaning of the text. When you later ask a question, the system converts your question into the same kind of embedding and finds the chunks whose meaning is closest to what you asked.

The difference from grep is fundamental. Grep finds the string validate_credentials. RAG finds the document that explains your authentication strategy, even if the word “authentication” never appears in the query and the word “validate” never appears in the document. It matches on meaning, not characters.
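The retrieval step can be sketched in a few lines. The vectors below are toy three-dimensional stand-ins for real embeddings (a model like all-MiniLM-L6-v2 produces 384-dimensional ones); the file names and vector values are invented for illustration:

```python
import math

# Toy "embeddings" standing in for a real model's output. Read the
# dimensions loosely as [auth-ness, database-ness, deployment-ness].
DOC_CHUNKS = {
    "docs/auth-strategy.md":  [0.9, 0.1, 0.0],
    "docs/db-decisions.md":   [0.1, 0.9, 0.1],
    "docs/deploy-runbook.md": [0.0, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

def retrieve(query_vec, k=1):
    """Return the k chunks whose embeddings are closest to the query."""
    ranked = sorted(DOC_CHUNKS,
                    key=lambda d: cosine(query_vec, DOC_CHUNKS[d]),
                    reverse=True)
    return ranked[:k]

# "How do we log users in?" shares no keywords with the auth doc,
# but its (pretend) embedding points in the auth direction.
print(retrieve([0.8, 0.05, 0.1]))  # ['docs/auth-strategy.md']
```

The whole trick is in that ranking: nearest-in-meaning, not nearest-in-spelling. Everything else is plumbing around it.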

For developer workflows, this changes which questions an AI agent can answer well.

A developer asks: “What is our error handling strategy?” With search-and-read, the agent has to grep for error, catch, throw, exception across the entire codebase and hope it reads the right files. With RAG over the project’s documentation, the query goes straight to the architecture document that explains the strategy. One tool call. A few hundred tokens. The right answer.

“How should I add a new microservice?” This is a procedural question. The answer is in a contributor guide or a runbook. Grep has nothing useful to search for in source code. RAG surfaces the runbook directly.

“Why did we choose Postgres over DynamoDB?” This answer lives in a design document or an ADR. Source code will show you Postgres connection strings, which tells you nothing about the decision. RAG finds the ADR.

The pattern is clear. RAG wins when the answer is distributed across files, when the answer is in prose rather than code, or when the question is about intent rather than location.

Being Honest About the Tradeoffs

I do not want to oversell this. RAG is not a replacement for search-and-read. It is a complement.

Anthropic’s own Claude Code team initially tried a local vector database for code search and moved away from it. Their reasoning was straightforward: for source code specifically, agentic search (the grep-and-read loop) generally works better. It is simpler, avoids staleness issues, and gives you the exact current state of the code every time.

They were right about source code. Code is semantically dense and context-dependent. The line const result = await process(input, opts) means almost nothing without knowing what process is and what file this lives in. Embeddings capture the meaning of natural language well. They capture the meaning of a random line of code poorly.

But documentation is natural language. Architecture docs, design decisions, runbooks, onboarding guides, API specifications. These are exactly what embedding models are good at. The hybrid approach is the honest answer: RAG for your documentation layer, search-and-read for your source code, both working together.
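A hybrid agent needs some way to decide which tool a question deserves. Here is one crude, invented heuristic; real systems decide this differently, and the marker lists below are assumptions, not anyone's shipping logic:

```python
import re

# Phrases that tend to open conceptual questions (invented list).
CONCEPTUAL_MARKERS = ("why", "how does", "how should",
                      "what is our", "what's our", "explain")

def looks_like_identifier(question):
    """camelCase, snake_case, or dotted names suggest a code symbol."""
    return bool(re.search(r"\b\w+(?:_\w+|\.\w+|[a-z][A-Z])\w*\b", question))

def route(question):
    q = question.lower()
    if any(q.startswith(m) or f" {m} " in q for m in CONCEPTUAL_MARKERS):
        return "rag"   # the answer lives in prose, in the docs
    if looks_like_identifier(question):
        return "grep"  # exact symbol: agentic search wins
    return "grep"      # default to reading the current source

print(route("Why did we choose Postgres over DynamoDB?"))  # rag
print(route("Where is createOrder defined?"))              # grep
```

The details of the routing matter less than the division of labor: intent questions go to the documentation index, location questions go to the code.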

The Virtuous Cycle (and the Hard Part)

Here is where this gets interesting. And also where it gets hard.

If RAG over documentation is valuable, then your documentation has to be good. Not “we have a wiki somewhere” good. Actually maintained, actually current, actually covering the decisions and patterns that a new developer (or an AI agent) would need to understand.

Harper Reed wrote about this in early 2025, describing a workflow where the specification document is the primary engineering artifact, not the code. He writes detailed specs before touching a code generator, and reports that his entire hack to-do list is empty because the approach let him build everything he had been putting off. The spec took fifteen minutes. The code generation followed from there.

This points at a broader shift. In a world where AI agents can generate code quickly, the bottleneck moves upstream. The bottleneck becomes: does the context exist for the agent to do good work? If the documentation is stale or missing, the agent is guessing. If the documentation is current and thorough, the agent has a foundation.

This creates a virtuous cycle. Better documentation powers better AI-assisted development. Better AI-assisted development frees up time. Some of that time goes back into documentation. The loop reinforces itself.

The hard part is that this loop requires discipline. Documentation has to be treated as infrastructure, not as an afterthought that gets updated when someone remembers. It needs the same attention as tests. You would not ship code without running the test suite. In a context-driven workflow, you should not ship code without updating the docs that power your development context.

Root: A Take on Making This Practical

I have been building a framework called Root that tries to make context-driven development practical rather than aspirational. It ships as a plugin for Claude Code and an extension for Gemini CLI.

The core idea is simple. When a developer starts a task, Root automatically gathers context from a local RAG database of ingested project documentation, classifies the work into tiers, and recommends specialist agents based on what the docs say about the relevant parts of the codebase. The RAG database runs entirely locally using the all-MiniLM-L6-v2 embedding model. No API calls, no data leaving the machine.

The part I think matters most is not the RAG search itself. It is the documentation lifecycle that Root enforces around it. The framework includes staleness detection (flagging docs older than six months or with mismatched dates), gap discovery (finding undocumented code and triaging by priority), quality rubrics (requiring every doc to include purpose, public API surface, dependencies, and usage examples), and frontmatter validation.
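To make the staleness idea concrete, here is a minimal sketch of a six-month check over frontmatter dates. This is not Root's implementation; the `last_updated` field name and the frontmatter format are assumptions for illustration:

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=183)  # roughly six months

def parse_frontmatter(text):
    """Pull simple `key: value` pairs from a ----fenced frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return meta

def is_stale(doc_text, today):
    meta = parse_frontmatter(doc_text)
    if "last_updated" not in meta:
        return True  # an undated doc is treated as stale
    updated = date.fromisoformat(meta["last_updated"])
    return today - updated > STALE_AFTER

doc = "---\nlast_updated: 2025-01-10\n---\n# Auth strategy\n"
print(is_stale(doc, date(2025, 12, 1)))  # True: well past six months
```

Run a check like this in CI and stale docs fail the build the same way a broken test would, which is the "documentation as infrastructure" stance in practice.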

I have run the gap-and-fix process across several key repos for my BrandCast project, and the results have been great. Closing the gaps in missing or stale documents cost a few evenings of boring reading and validation, but that backlog was my own fault for letting the docs drift. Now I get better plans, faster, with more complete context, and the docs in my repos have become self-improving.

The Argument

Context-driven development is not really about RAG databases or embedding models or any particular tool. It is about recognizing that AI coding assistants are only as good as the context they operate in. The code itself is necessary but not sufficient. The intent, the decisions, the patterns, the “why” behind the “what” are what separate a useful AI suggestion from a generic one.

The industry is converging on this understanding. The tools are getting better at consuming context. The remaining bottleneck is whether teams will invest in creating and maintaining context worth consuming.

I think Root is a good take on this problem. It is open source, MIT licensed, and ready to try. But the ideas behind it are more important than the tool. Give your AI assistants better context and they will do better work. Treat your documentation as infrastructure and it will pay for itself in ways that go beyond human readability.