I spend hours every day working with AI agents. A ton of Gemini CLI. A bunch of Claude Code. Other models when I need a comparison point or independent verification. The orchestration system running my marketing operations for BrandCast makes hundreds of LLM calls daily. JSON is the default format for passing data between agents. But a significant portion of the token budget goes to syntax overhead: curly braces, colons, and repeated field names instead of meaningful data.
This inefficiency raises a bigger question: are we designing data formats for the wrong audience?
The Token Efficiency Problem: Why JSON Wastes LLM Resources
JSON is everywhere in AI systems. It structures prompts, passes data between agents, and formats API responses. But JSON was designed in 2001 for human developers to debug HTTP responses. We’re now using it to communicate between machines that think in tokens.
Here’s a simple example. To pass 100 user records with 5 fields each to an LLM, JSON requires:
```json
[
  {
    "id": "001",
    "name": "Alice Chen",
    "role": "engineer",
    "status": "active",
    "joined": "2024-01-15"
  },
  {
    "id": "002",
    "name": "Bob Smith",
    "role": "designer",
    "status": "active",
    "joined": "2024-02-20"
  }
  // ... 98 more records
]
```
That’s roughly 2,500 tokens. The field names “id”, “name”, “role”, “status”, and “joined” repeat 100 times. At GPT-4 pricing ($30 per million input tokens), that JSON syntax costs money for every single agent interaction.
Scale this to production systems making thousands of LLM calls daily. That JSON overhead becomes a significant line item. More critically, it consumes context window space that could hold actual data.
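To make that line item concrete, here's a back-of-envelope calculation using the figures above. The daily call volume is a hypothetical assumption, not a measured number.

```python
# Back-of-envelope cost of JSON syntax overhead, using the figures above:
# ~2,500 tokens per payload, ~40% of it syntax, $30 per million input tokens.
# The call volume is a hypothetical example.

TOKENS_PER_PAYLOAD = 2_500
SYNTAX_FRACTION = 0.40        # braces, quotes, repeated field names
PRICE_PER_M_INPUT = 30.00     # GPT-4 input pricing, USD
CALLS_PER_DAY = 1_000         # hypothetical production volume

wasted_tokens_per_day = TOKENS_PER_PAYLOAD * SYNTAX_FRACTION * CALLS_PER_DAY
wasted_usd_per_month = wasted_tokens_per_day * 30 / 1_000_000 * PRICE_PER_M_INPUT

print(f"{wasted_tokens_per_day:,.0f} syntax tokens per day")    # 1,000,000
print(f"${wasted_usd_per_month:,.2f} per month on formatting")  # $900.00
```

At that volume, formatting alone burns a million tokens a day before any actual data is processed.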
TOON vs JSON: A Token Comparison for AI Agents
TOON launched in November 2025 with a simple premise: optimize data formats for token efficiency instead of human readability. The same 100 records in TOON look like this:
```
users[100]{id,name,role,status,joined}:
  001,Alice Chen,engineer,active,2024-01-15
  002,Bob Smith,designer,active,2024-02-20
  // ... 98 more records
```
Field names appear once. Array length is explicit. Indentation replaces braces. The result is about 1,500 tokens for the same data. That’s a 40% reduction.
The TOON specification achieves this through three key design choices:
1. Declared fields instead of repeated keys. Field names appear once in a header row, then only values follow. This is where most token savings come from.
2. Explicit array lengths. Instead of closing brackets, TOON uses [N] notation. This actually helps LLMs validate structure during parsing.
3. Indentation-based nesting. Curly braces become whitespace. This trades human visual clarity for token efficiency.
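The three design choices can be sketched in a few lines. This is an illustration of the encoding idea for flat, uniform records, not the official TOON SDK; it skips the quoting and escaping rules a real encoder needs.

```python
# Sketch of TOON-style tabular encoding for flat, uniform records.
# Illustrative only: no escaping of commas or special characters.

def encode_tabular(key: str, records: list[dict]) -> str:
    fields = list(records[0].keys())  # field names declared once, not per record
    header = f"{key}[{len(records)}]{{{','.join(fields)}}}:"  # explicit length
    rows = ["  " + ",".join(str(r[f]) for f in fields) for r in records]
    return "\n".join([header] + rows)

users = [
    {"id": "001", "name": "Alice Chen", "role": "engineer"},
    {"id": "002", "name": "Bob Smith", "role": "designer"},
]
print(encode_tabular("users", users))
# users[2]{id,name,role}:
#   001,Alice Chen,engineer
#   002,Bob Smith,designer
```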
According to benchmarks from the TOON GitHub repository, the format achieves 30-60% token reduction across various data types while maintaining 73.9% parsing accuracy compared to JSON’s 69.7%. The explicit structure helps models understand and validate data more reliably.
Why Markdown is Secretly the Most Efficient Format for LLMs
Agents using markdown for data exchange consistently use fewer tokens than those using JSON. This happens because markdown aligns with how LLMs were trained. Most training data includes natural language with markdown formatting. The models have deep statistical associations between markdown structure and semantic meaning.
Consider these characteristics:
Token density. Markdown uses minimal syntax overhead. Headers are #, not <h1></h1>. Lists are -, not <ul><li></li></ul>. Bold is **text**, not <strong>text</strong>. Every saved character is a saved token.
Natural language alignment. LLMs process markdown almost like prose. The hierarchical structure (headers → subheaders → content) mirrors how concepts relate in natural language. This reduces cognitive load during parsing.
Hierarchical structure benefits. The heading hierarchy tells models what’s a main idea versus a subpoint. This contextual information helps with understanding and retrieval in RAG systems.
Research from Webex Developers shows markdown improves RAG retrieval accuracy by 20-35% compared to HTML or plain text. The combination of token efficiency and semantic clarity makes markdown quietly excellent for agent communication.
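A crude way to see the overhead gap is to render the same content as markdown and as HTML and compare sizes. Character counts are only a rough proxy for token counts, since the exact numbers depend on the tokenizer.

```python
# The same content in markdown vs HTML. Character count is a crude
# proxy for token count, but the syntax overhead gap is obvious.

md = "## Results\n- fast\n- cheap\n**Bold claim**"
html = "<h2>Results</h2><ul><li>fast</li><li>cheap</li></ul><strong>Bold claim</strong>"

print(len(md), len(html))  # markdown is roughly half the size here
```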
Token-Oriented Object Notation (TOON): The JSON Alternative for AI
TOON fills a specific gap in the data format landscape. JSON excels at nested objects and complex structures. Markdown works well for documents and narrative content. TOON optimizes for tabular data in agent communication.
The format’s value emerges at scale. One API call with 10 records? The savings are negligible. But 1,000 agent interactions daily with 100 records each? That’s millions of tokens saved monthly.
When TOON excels:
- Multi-agent systems exchanging structured data
- RAG pipelines retrieving database results
- Agent tool definitions and parameter schemas
- Bulk data transfer in prompts (lists, tables, logs)
- Any scenario with repeated field structures
When to stick with JSON:
- Public APIs requiring broad compatibility
- Deeply nested object hierarchies
- Data structures that change frequently
- Systems where human debugging is critical
- Legacy integrations
The TOON ecosystem is nascent. The TypeScript SDK exists, but tooling is limited. Converting between JSON and TOON requires custom code. For production systems, this creates friction. But for greenfield agent architectures, the token economics are compelling.
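As an example of the "custom code" involved, here's a minimal decoder for a TOON-style tabular block back into Python dicts. It's a sketch that assumes flat records and no quoted or escaped values.

```python
# Minimal decoder for a TOON-style tabular block. Illustrative only:
# assumes flat records, no quoting, no escaping.

def decode_tabular(text: str) -> list[dict]:
    lines = text.splitlines()
    header, rows = lines[0], lines[1:]
    key_part, field_part = header.split("{", 1)
    fields = field_part.rstrip(":").rstrip("}").split(",")
    # the explicit [N] length lets us validate structure after parsing
    declared = int(key_part[key_part.index("[") + 1 : key_part.index("]")])
    records = [dict(zip(fields, row.strip().split(","))) for row in rows]
    assert len(records) == declared, "length header disagrees with row count"
    return records

toon = "users[2]{id,name,role}:\n  001,Alice Chen,engineer\n  002,Bob Smith,designer"
print(decode_tabular(toon))
```

Note how the declared length doubles as a structural check, the same property that helps LLMs validate what they parse.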
What About Binary Formats?
The obvious question: if token efficiency matters this much, why not use binary formats like Protocol Buffers or MessagePack? They’re more compact than JSON and widely used in microservices.
The answer is that binary formats solve a different problem. Protobuf and MessagePack optimize for transmission size and parsing speed in networked systems. They’re excellent for that. But LLMs don’t process binary data directly.
The LLM constraint: Models tokenize text. They expect strings, not bytes. Even if you base64-encode binary data to make it text-compatible, you’re adding overhead that eliminates the size advantage.
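The base64 overhead is easy to quantify: encoding inflates payloads by roughly a third, before the tokenizer even sees the result.

```python
# Base64 expands binary payloads by roughly one third, which is why
# wrapping Protobuf/MessagePack bytes in base64 for a prompt erases
# most of the binary format's size advantage.

import base64

payload = bytes(300)               # 300 raw bytes
encoded = base64.b64encode(payload)
print(len(payload), len(encoded))  # 300 -> 400 (+33%)
```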
Schema requirements: Protobuf requires predefined schemas compiled into code. This works when both endpoints are deterministic systems. But LLMs generate data dynamically. They need formats that don’t require compilation steps.
Human inspection: Binary formats are unreadable to humans without decoding tools. Even with token-optimized formats like TOON, developers can still read the raw data when debugging. With binary formats, you lose that completely.
That said, there are scenarios where binary formats make sense in AI systems. When passing large datasets between traditional services and AI agents, Protobuf or MessagePack can reduce network overhead. But for data that goes directly into prompts or model responses, text-based formats optimized for token efficiency are the better choice.
YAML gets mentioned occasionally as a JSON alternative. It’s more human-readable, supports comments, and drops most braces and quotes. But for lists of records, YAML still repeats every field name, and its dashes and indentation carry their own overhead. For LLM consumption, YAML offers little token advantage over JSON and adds parsing complexity.
The emerging consensus: text-based formats optimized for token efficiency (like TOON) beat both JSON and binary alternatives for direct LLM interaction.
Agent-Readable vs Human-Readable: The Paradigm Shift in Data Design
This leads to a bigger question. We’ve spent decades optimizing data formats for human developers. Variable names should be readable. Code should be self-documenting. JSON is preferred over binary protocols because engineers can inspect it.
But in multi-agent systems, humans rarely see the data in flight. Agents generate it, pass it to other agents, and consume it programmatically. The primary audience is machines.
This mirrors other abstraction shifts in computing. Assembly language was replaced by C because human productivity mattered more than execution speed. C was replaced by Python for the same reason. Each transition traded machine efficiency for human readability.
With AI agents as the dominant consumers of data, maybe the calculus flips. Maybe we should optimize for token efficiency and let debugging tools handle human inspection.
The counterargument is that humans still need to debug agent systems. When something goes wrong, readable data helps. But debugging tools can translate between formats. We’ve been doing this for decades with debuggers that show human-friendly representations of binary data.
The question isn’t whether to abandon human-readable formats entirely. It’s whether to make token-optimized formats the default and provide translation layers when humans need to inspect.
Structured Data Formats for Multi-Agent Systems
This shift connects to emerging agent communication protocols. The Model Context Protocol (MCP) defines how agents exchange context and capabilities. Agent Communication Protocol (ACP) and Agent-to-Agent Protocol (A2A) specify message formats for agent coordination.
These protocols face the same token efficiency challenge. When agents coordinate on complex tasks, they exchange substantial structured data. Prompt templates, tool definitions, execution results, and state information all consume tokens.
TOON-like formats could become standard in these protocols. The benefits compound in multi-agent scenarios:
Reduced context window pressure. Each agent maintains less context because data is more compact. This allows more agents to collaborate within model constraints.
Lower operational costs. Token savings scale with agent count; every message exchanged among ten collaborating agents gets cheaper with an optimized format.
Faster inference. Smaller inputs reduce processing time. In real-time agent systems, this matters.
Better semantic precision. Schema-aware formats like TOON help agents validate data structure. This reduces hallucinations and parsing errors.
The resurgence of structured operational data in AI systems isn’t just about efficiency. It’s about enabling more sophisticated agent architectures that were cost-prohibitive with JSON.
Context Window Optimization: How Data Format Affects LLM Performance
Context windows keep growing. GPT-4 Turbo supports 128k tokens. Claude 3.5 Sonnet handles 200k. But larger windows don’t make efficiency irrelevant. They make it more important.
With a 200k context window, you could load an entire codebase for analysis. But if 40% of those tokens are JSON syntax, you’re wasting 80k tokens on formatting. That’s the difference between including critical business logic or cutting it out.
Data format also affects how models use context. Research shows LLMs don’t process all context equally. In the “Lost in the Middle” study, when models needed to find specific facts across multiple documents, accuracy formed a U-shaped curve: over 80% when the relevant information was in the first or last documents, but under 40% when it was in the middle. Bloated formats push important data deeper into that low-attention middle zone.
Token-optimized formats improve information density. More meaningful data fits in the same window. This has second-order effects on agent performance. Better context leads to better reasoning leads to better outputs.
The economics matter too. Context window usage drives inference costs. Google’s Gemini and Anthropic’s Claude charge per input token. Optimization isn’t just about staying under limits. It’s about doing more with less.
Should We Optimize Systems for Humans or Agents?
When building automation systems where agents handle 90% of the work, this question becomes practical. Debugging failures requires understanding what happened. Readable data helps.
The pragmatic answer is probably “both, with translation layers.” Design core agent communication for token efficiency. Provide debugging tools that render human-friendly views. This is how binary protocols work today. Network engineers debug HTTP/2 binary frames through tools that show readable representations.
The implementation could be straightforward. Agent systems log interactions in TOON or other optimized formats. Debugging interfaces translate to JSON or markdown for human inspection. Most developers never see the optimized format unless specifically investigating token usage.
This approach preserves developer experience while capturing economic benefits. The cost is maintaining translation layers. But that’s a one-time engineering investment that pays ongoing dividends in token savings.
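Here's a minimal sketch of such a translation layer, with hypothetical names: interactions are stored in the compact format, and a debug view renders JSON only when a human asks for it.

```python
# Hypothetical translation layer: agent traffic is logged in the compact
# wire format; a debug view renders it as JSON only on human inspection.
# Class and method names are illustrative, not an existing tool.

import json

class AgentLog:
    def __init__(self):
        self._entries: list[str] = []  # compact format at rest

    def record(self, compact: str) -> None:
        self._entries.append(compact)

    def debug_view(self, index: int) -> str:
        """Render one entry as pretty JSON for human inspection."""
        header, *rows = self._entries[index].splitlines()
        fields = header.split("{", 1)[1].rstrip(":").rstrip("}").split(",")
        records = [dict(zip(fields, r.strip().split(","))) for r in rows]
        return json.dumps(records, indent=2)

log = AgentLog()
log.record("users[1]{id,name}:\n  001,Alice Chen")
print(log.debug_view(0))
```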
The longer-term question is whether this matters as model costs drop. If inference becomes nearly free, token efficiency becomes less critical. But “nearly free” hasn’t arrived yet. Current pricing makes optimization worthwhile for production systems.
Practical Implementation: When to Use TOON, Markdown, or JSON
Format selection depends on the specific use case:
Use JSON when:
- Interoperating with external systems or public APIs
- Data structure is deeply nested or irregular
- Human inspection during development is frequent
- Tooling compatibility matters more than efficiency
Use TOON when:
- Passing tabular data between agents (lists, tables, logs)
- Token costs are significant (high-volume systems)
- Data structure is regular and repeated
- Building greenfield multi-agent systems
Use markdown when:
- Documents or narrative content
- Hierarchical information (reports, summaries)
- RAG retrieval accuracy matters
- Mixed structured and unstructured data
Migration strategy: For existing JSON-heavy systems, wholesale rewrites rarely make sense. Instead, identify high-volume, high-token interactions. Optimize those first. Measure token savings. Expand if the economics justify it.
For new systems, TOON-first architecture is worth considering. The ecosystem is immature, but for internal agent communication, custom tooling is manageable. The token savings compound over the system’s lifetime.
The Future of Agent Communication Standards
TOON represents a broader trend. As AI agents become more prevalent, we’ll see more data formats optimized for machine communication. This isn’t just about token efficiency. It’s about enabling agent architectures that weren’t feasible before.
This mirrors the shift from human-operated computers to automated systems. Early computers required human operators to translate between instructions and machine code. We automated that translation. Now we’re automating the translation between human-readable and machine-optimized formats.
This opens up interesting possibilities:
Agent-native protocols. Communication standards designed for agents first, with human tooling as an afterthought. This inverts current design priorities.
Hybrid formats. Systems that embed metadata for human readers alongside token-optimized data for agents. Think of it like source maps in JavaScript.
Dynamic format selection. Agents negotiating optimal formats based on context. High-volume exchanges use TOON. Human inspection uses JSON. The system adapts.
Semantic compression. Formats that preserve meaning while aggressively minimizing tokens. This goes beyond structural optimization to semantic optimization.
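Dynamic format selection could be as simple as a heuristic like this sketch; the thresholds and function name are illustrative assumptions.

```python
# Hypothetical heuristic for dynamic format selection: compact tabular
# encoding for large uniform record sets, JSON otherwise. Thresholds
# and the function name are illustrative assumptions.

def choose_format(records: list[dict], human_in_loop: bool = False) -> str:
    # uniform = every record has the same field set
    uniform = len({tuple(sorted(r)) for r in records}) == 1
    if human_in_loop or not uniform or len(records) < 20:
        return "json"  # readability or irregular structure wins
    return "toon"      # high-volume uniform data: optimize tokens

print(choose_format([{"id": 1}] * 100))                 # toon
print(choose_format([{"id": 1}], human_in_loop=True))   # json
```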
The TOON specification is early. The ecosystem is small. But the core insight is sound: optimize data formats for the primary consumer. In agent systems, that’s increasingly machines, not humans.
Conclusion
We’re at an inflection point in how we design data systems. For decades, human readability guided our choices. JSON became ubiquitous because developers could debug it. But as AI agents become the primary consumers of data, the calculus changes.
TOON demonstrates that token-optimized formats can reduce costs by 30-60% while improving parsing accuracy. Markdown shows that formats aligned with LLM training data offer both efficiency and semantic benefits. Together, they point toward a future where agent-readable formats are the default and human tooling translates when needed.
This shift isn’t about abandoning human developers. It’s about recognizing that in multi-agent systems, machines are the primary audience. We can optimize for them while providing translation layers for human inspection.
The question isn’t whether this transition will happen. It’s how quickly we’ll adapt our systems to capture the benefits.
Resources
- TOON Specification - Official GitHub repository with detailed format documentation
- Markdown Efficiency for LLMs - Research on markdown’s advantages in RAG systems
- Structured Data for AI Agents - Analysis of structured data’s resurgence in agent systems
Building multi-agent systems at BrandCast while exploring what happens when code writes itself. More at jduncan.io.