When you build an API with human clients in mind, you make a lot of implicit assumptions. A developer will read the error message and figure out what went wrong. A user will notice if pagination breaks and reload the page. Someone will check the logs if the integration starts behaving strangely.
Those assumptions break down when the client is an agent.
We have been living with this shift for long enough that the differences are no longer surprising. But they were surprising at first. An API that worked fine for human-driven clients began producing strange results once agents started calling it at volume, in parallel, with no one watching in real time. Some of those issues were bugs. Most of them were design assumptions that had never been tested against an automated consumer.
Error messages are part of the interface
For a human developer, an error message is a hint. They read it, infer what went wrong, and adjust. The exact wording does not matter much. “Invalid input” is annoying but workable.
For an agent, an error message is input to a decision process. If the message is vague, the agent either stalls, retries blindly, or escalates to the nearest human. None of those outcomes is free.
We changed how we write error responses after watching agents get stuck on messages that were technically accurate but practically useless. The standard we settled on: every error response includes what was wrong, why it was rejected, and when (if ever) the same request might succeed. That last part matters more than it sounds. An agent that receives “rate limit exceeded” with no indication of when to retry will either wait too long or retry too fast. An agent that receives “rate limit exceeded, retry after 30 seconds” can make a real decision.
This pushed us toward error shapes that look more like structured data than prose. Not because agents cannot parse prose, but because consistent shapes are more reliable than natural language inference, especially under time pressure.
{
  "error": "checkout_conflict",
  "message": "Issue is already checked out by another agent",
  "retryable": false,
  "conflictingAgentId": "...",
  "lockedAt": "2026-04-04T12:00:00Z"
}
That extra context costs almost nothing to generate server-side. For the agent receiving it, the difference between this and "error": "conflict" is the difference between stopping and knowing exactly what to do next.
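As a rough sketch of what this looks like server-side, here is one way to build such a payload. The helper, field names, and status codes here are illustrative, not our actual API; the point is that optional fields like a retry hint are part of the shape, not free-form prose.

```python
# Sketch of a structured error payload builder. All names are hypothetical;
# the shape mirrors the example above: what failed, why, and whether/when
# the same request might succeed.
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ApiError:
    error: str                                  # stable machine-readable code
    message: str                                # human-readable explanation
    retryable: bool                             # can this request ever succeed?
    retry_after_seconds: Optional[int] = None   # if retryable, when to try again

def error_response(err: ApiError, status: int) -> tuple[int, str]:
    """Serialize an error into a (status code, JSON body) pair,
    dropping fields that do not apply to this particular error."""
    body = {k: v for k, v in asdict(err).items() if v is not None}
    return status, json.dumps(body)

status, body = error_response(
    ApiError(
        error="rate_limit_exceeded",
        message="Too many requests from this run",
        retryable=True,
        retry_after_seconds=30,
    ),
    status=429,
)
```

An agent parsing this payload branches on `error` and `retryable` directly instead of inferring intent from the message text.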
Idempotency becomes load-bearing
Human-driven clients rarely retry the same request twice. If a form submission fails, the user usually starts over. If a button click triggers an error, they try again later.
Agents retry constantly. Network hiccups, timeouts, interrupted execution windows — all of these lead to a pattern where an agent has no reliable knowledge of whether its last request succeeded. The safe move is to retry. If the API is not idempotent, those retries cause problems: double-created resources, double-applied state changes, tasks checked out twice.
We made every mutating endpoint in our internal APIs accept a client-supplied idempotency key. The pattern is simple: the agent generates a stable identifier for the operation (usually derived from its run ID and the action it is taking), includes it in the request, and the server deduplicates on that key. The same operation attempted twice with the same key has the same effect as one attempt.
This required some thought about what “same effect” means for operations that have side effects. Creating a comment with the same content twice should not post the comment twice. Checking out a task that is already checked out by the same agent should return the current state, not an error. We found that treating idempotency as a first-class concern forced us to be more explicit about the intended semantics of each endpoint, which improved the design independently of the agent use case.
Pagination that assumes state persistence does not work
Cursor-based pagination is not new. But most internal APIs we have worked with still use offset-based pagination because it is simpler to implement and human clients rarely notice the edge cases.
Agent clients expose those edge cases constantly. Offset-based pagination breaks when records are inserted or deleted mid-traversal. For a human paging through results, an occasional duplicate or missing item is annoying. For an agent processing a full dataset, silent gaps are data corruption.
We moved to cursor-based pagination across our internal APIs. The cursor encodes position in the sorted set, not a numeric offset. Insertions and deletions between page requests do not affect traversal accuracy. Agents that process large datasets do so without silent data loss, even when the underlying records change between requests.
The related issue is page size. Human clients tend to request reasonable amounts by default. Agents do not always have calibrated intuitions about what “reasonable” means, especially when they are working through a large backlog. We added server-side caps and made the page size constraints explicit in the response, not just the documentation. If an agent requests 10,000 records and the maximum is 100, the response should say so clearly, not just truncate silently.
Observability changes shape
When human developers use an API, the primary debugging surface is their own client code. They can add logging, step through with a debugger, check the network tab.
When agents use an API, the debugging surface is largely on the server side. The agent’s execution context is a black box that may not be accessible at all, especially in production runs. The server either records what happened or it doesn’t.
This pushed us to be more aggressive about trace IDs and run correlation. Every request from an agent carries a run identifier that links it to a specific heartbeat execution. Every response includes a server-generated trace ID. When something goes wrong, we can pull up all requests from a given run and replay the sequence. Without this, diagnosing a failed agent run means making educated guesses from incomplete logs.
The other change was in audit trails. Human clients are accountable through the humans using them. Agent clients need their own accountability chain. We log not just what happened but which agent triggered it, under which authorization context, as part of which run. This made it possible to answer questions like “who changed this record and why” without needing to reconstruct intent from circumstantial evidence.
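The pieces above can be sketched as a single audit record per request: the client-supplied run ID, a server-generated trace ID, the acting agent, and its authorization context. Field names and the in-memory log are illustrative assumptions.

```python
# Run-correlation sketch: every request is logged with its run ID, agent,
# and auth context, and every response carries a server-generated trace ID.
import uuid
from datetime import datetime, timezone

AUDIT_LOG: list[dict] = []

def handle_request(run_id: str, agent_id: str, auth_context: str, action: str) -> dict:
    trace_id = str(uuid.uuid4())  # server-generated, returned to the caller
    AUDIT_LOG.append({
        "trace_id": trace_id,
        "run_id": run_id,
        "agent_id": agent_id,
        "auth_context": auth_context,
        "action": action,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return {"trace_id": trace_id}

def requests_for_run(run_id: str) -> list[dict]:
    """Debugging surface: every request a given run made, in order."""
    return [entry for entry in AUDIT_LOG if entry["run_id"] == run_id]

handle_request("run-7", "agent-a", "token:alpha", "checkout_task")
handle_request("run-7", "agent-a", "token:alpha", "post_comment")
handle_request("run-8", "agent-b", "token:beta", "checkout_task")
```

With this in place, "who changed this record and why" reduces to a filter on the log rather than a reconstruction from circumstantial evidence.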
What this means for new API design
The practical upshot is that APIs designed for agent consumers need to be more explicit about everything. More structured errors, more explicit pagination, more idempotency, more tracing. None of this is novel engineering. It is mostly discipline applied more consistently than most internal APIs demand.
The interesting thing is that these properties also make APIs better for human consumers. Structured errors help developers as much as agents. Cursor-based pagination is more correct than offset-based for anyone. Run correlation helps human operators investigate problems faster.
We build for agents now, but we’ve found that the resulting APIs are ones we’d want even if the only consumers were humans. That convergence feels like a good sign.