Search APIs for humans and search APIs for agents have different jobs. A person can skim, compare, open tabs, and repair missing context. An agent usually receives one compressed response and must make its next decision from that response alone.
That makes the API contract the product. The response should carry enough structure to be useful without forcing the model to reverse-engineer the page.
What I optimize for
The first goal is predictable shape. Titles, URLs, snippets, dates, extracted markdown, and source metadata should appear in consistent places. The second goal is context density. Every token should either help answer the query, verify the source, or choose the next retrieval step.
The third goal is graceful uncertainty. If a result is weak, stale, blocked, or only partially extracted, the API should say that directly. Silent confidence is more expensive than honest incompleteness.
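The three goals can be sketched as a single result type. This is a minimal illustration, not a real API's schema: every field name here (`status`, `warnings`, and the rest) is an assumption chosen to show predictable shape plus explicit uncertainty.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SearchResult:
    # Goal 1: predictable shape -- the same fields in the same places.
    title: str
    url: str
    snippet: str
    published: Optional[str]   # ISO 8601 date, None when genuinely unknown
    markdown: Optional[str]    # extracted content, None when extraction failed
    source: dict = field(default_factory=dict)  # domain, fetch time, etc.
    # Goal 3: graceful uncertainty -- say directly when a result is weak,
    # stale, blocked, or only partially extracted.
    status: str = "ok"         # hypothetical values: "ok" | "partial" | "stale" | "blocked"
    warnings: list = field(default_factory=list)

result = SearchResult(
    title="Rate limiting strategies",
    url="https://example.com/rate-limits",
    snippet="Token buckets smooth bursts...",
    published=None,
    markdown=None,
    status="partial",
    warnings=["extraction truncated", "publish date missing"],
)
```

A downstream agent can branch on `status` and `warnings` instead of trusting a confidently worded snippet, which is the whole point of honest incompleteness.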
A practical response model
I like results that separate the source, the extracted content, and the ranking explanation. That makes it easier for a downstream LLM to cite sources, discard weak matches, and decide whether to search deeper.
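As a sketch of that separation, here is one hypothetical result layout with three sub-objects and the two downstream decisions it enables: which sources to cite and whether to search deeper. The field names and the 0.6 threshold are illustrative assumptions, not any particular API's contract.

```python
# Hypothetical results: source, content, and ranking explanation kept apart.
results = [
    {
        "source": {"url": "https://example.com/a", "domain": "example.com"},
        "content": {"markdown": "# Alpha\nDetailed treatment of the topic."},
        "ranking": {"score": 0.91, "reason": "strong semantic match on query terms"},
    },
    {
        "source": {"url": "https://example.com/b", "domain": "example.com"},
        "content": {"markdown": "Brief mention only."},
        "ranking": {"score": 0.30, "reason": "keyword overlap only"},
    },
]

THRESHOLD = 0.6  # assumed cutoff for a "strong" match

# Discard weak matches using the ranking explanation, not the prose.
strong = [r for r in results if r["ranking"]["score"] >= THRESHOLD]

# Cite from the source object, which was never mixed into the content.
citations = [r["source"]["url"] for r in strong]

# Decide whether to retrieve more before answering.
need_deeper_search = len(strong) < 2
```

Because the ranking carries its own `reason`, a model can justify dropping a result rather than silently ignoring it.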
For RAG and research agents, markdown is often a better transport layer than raw HTML because it keeps hierarchy and links while removing layout noise. The trick is to keep markdown clean, not decorative.
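"Clean, not decorative" can be made concrete with a small cleaning pass: keep headings and links, drop image embeds and horizontal rules, and collapse blank-line runs. This is a simplified sketch using only the standard library; a production extractor would handle far more cases.

```python
import re

def clean_markdown(md: str) -> str:
    """Keep hierarchy and links; strip decorative layout noise."""
    kept = []
    for line in md.splitlines():
        stripped = line.strip()
        # Drop image-only lines: decoration, not retrievable context.
        if re.fullmatch(r"!\[[^\]]*\]\([^)]*\)", stripped):
            continue
        # Drop horizontal rules (---, ***, ___).
        if re.fullmatch(r"[-*_]{3,}", stripped):
            continue
        kept.append(line.rstrip())
    # Collapse runs of blank lines so structure stays visible but cheap.
    text = re.sub(r"\n{3,}", "\n\n", "\n".join(kept))
    return text.strip()

raw = (
    "# Title\n\n![hero banner](img.png)\n\n---\n\n"
    "## Section\nBody text with a [link](https://example.com).\n\n\n\nMore body."
)
cleaned = clean_markdown(raw)
```

Headings survive for hierarchy, the link survives for verification, and the banner image and rule disappear, which is exactly the token budget trade the paragraph above argues for.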
The durable lesson
Agentic systems do not become reliable because one model call is clever. They become reliable when every tool response is shaped for inspection, compression, and recovery.