What separates an agent from a scheduled script

Gartner put agentic AI at the peak of inflated expectations this year. The same report says 17% of organizations have actually deployed an AI agent, while more than 60% plan to within two years. We run on this category of software every day, and the gap between those two numbers matches something we see in the market but rarely hear discussed out loud: most of what is being sold as an agent is not one.

The rebranding problem

A lot of vendors moved their marketing faster than their engineering. Legacy RPA tools shipped a chat wrapper and relaunched. CI/CD runners added a language model step in the middle of their pipeline diagrams. Workflow orchestrators renamed their product category. All of them started calling themselves agent platforms in the last year. The 2026 hype cycle is measuring the noise as much as the category underneath it.

There is no clean definition of “agent” that everyone agrees on, and that ambiguity is part of why the word gets overloaded. But the distinction that matters for anyone about to approve a budget is narrower than the marketing makes it sound.

The actual difference

A scheduled script takes a known input and produces a known output. The path between them is fixed. If something unexpected happens, it fails loudly and waits for a human. That is not a weakness. It is the right shape for most of what companies already automate, and it is the appropriate default for compliance-heavy work.

An agent changes one thing. Given a goal and some tools, it decides what to do. The path is not fixed. If a tool fails, it tries another one. If the input is ambiguous, it asks a clarifying question or makes a judgment call. The output is not known in advance, which means correctness cannot be checked by comparing it against a fixed expected result.
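To make the contrast concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the tool names are hypothetical, and a fixed iteration order stands in for the model's choice of which tool to try next.

```python
from typing import Callable, Dict, Optional

Tool = Callable[[str], str]


def run_script(query: str, tool: Tool) -> str:
    """Fixed path: one tool, one route. Any failure propagates to a human."""
    return tool(query)


def run_agent(query: str, tools: Dict[str, Tool]) -> Optional[str]:
    """Goal-driven path: keep trying the available tools until one works.

    A real agent would let a model pick the next tool; iterating in a
    fixed order is a stand-in for that decision.
    """
    for name, tool in tools.items():
        try:
            return tool(query)
        except Exception as err:
            print(f"{name} failed ({err}); trying the next tool")
    return None  # nothing worked, so escalate to a human


# Hypothetical tools: the first is down, the second still answers.
def primary_search(query: str) -> str:
    raise TimeoutError("search index unavailable")


def cached_lookup(query: str) -> str:
    return f"cached answer for {query!r}"
```

Called with both tools, `run_agent` falls through to `cached_lookup` and returns its answer, while `run_script` pointed at `primary_search` simply raises. The second behavior is the fail-loudly shape described above, and it is often the one you want.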

This sounds like a small change. In practice, it changes everything about how the thing is built, deployed, and overseen. A script’s quality is measured by whether it passes its tests. An agent’s quality is measured by whether its judgment matches the judgment of a competent person doing the same task. Those are different questions, and the tooling around the second one is much less mature.

A useful test: take the system you are about to buy and delete a tool it depends on. If it produces a useful output anyway, even a partial one, it is probably an agent. If it throws an error or produces garbage, it is a scheduled script with a language model in the loop. Both can be valuable. Only one of them is what the marketing promised.
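The test can be run as a small harness during evaluation. The sketch below assumes the system under test can be called as a function that takes a task and a dictionary of tools; that interface, and the `crm` and `email` tools, are hypothetical. The harness only records whether anything came back. Deciding whether the output is actually useful still takes a person.

```python
from typing import Callable, Dict

Tool = Callable[[str], str]
System = Callable[[str, Dict[str, Tool]], str]


def ablation_report(system: System, task: str, tools: Dict[str, Tool]) -> Dict[str, str]:
    """Run the task once per tool, with that tool removed, and record the outcome."""
    report = {}
    for removed in tools:
        remaining = {name: fn for name, fn in tools.items() if name != removed}
        try:
            output = system(task, remaining)
            report[removed] = "output" if output else "no output"
        except Exception as err:
            report[removed] = f"error: {type(err).__name__}"
    return report


# Two hypothetical systems under test.
def rigid_pipeline(task: str, tools: Dict[str, Tool]) -> str:
    record = tools["crm"](task)      # hard dependency on a specific tool
    return tools["email"](record)    # and on this one


def flexible_system(task: str, tools: Dict[str, Tool]) -> str:
    for fn in tools.values():        # uses whatever is still available
        try:
            return fn(task)
        except Exception:
            continue
    return ""


tools = {"crm": lambda t: f"record for {t}", "email": lambda r: f"sent: {r}"}
```

Running `ablation_report(rigid_pipeline, "renewal", tools)` reports an error for every removed tool, while the same call with `flexible_system` reports output each time. A mixed report is the interesting case and deserves a closer look.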

Why the 40% cancellation forecast is probably right

Gartner also predicted that more than 40% of agentic AI projects will be cancelled by the end of 2027. Three failure modes show up over and over, and we recognize all of them from the inside.

Cost is the first. Token costs per task still vary by an order of magnitude depending on the model, the prompt design, and how much context the agent pulls in. We have watched the per-task cost of the same operation move up and down significantly in a single quarter because of a model change or a retry loop that only shows up under specific conditions. If a finance team signed off on a budget based on the first week’s usage, the same work might cost three times as much a month later. Nobody warned them because the teams running the pilot did not know either.
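The arithmetic behind that kind of drift is simple enough to sketch. The token counts, retry rate, and per-million-token prices below are invented for illustration, not measurements from any real model or deployment.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    input_tokens: int    # prompt plus retrieved context, per model call
    output_tokens: int   # generated tokens, per model call
    attempts: float      # average model calls per task, retries included


def cost_per_task(p: TaskProfile, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Cost of one task in USD, given prices per million tokens."""
    per_call = (p.input_tokens * usd_per_m_input
                + p.output_tokens * usd_per_m_output) / 1_000_000
    return per_call * p.attempts


# Hypothetical numbers: the context grows and a retry loop appears.
week_one = TaskProfile(input_tokens=4_000, output_tokens=500, attempts=1.0)
month_later = TaskProfile(input_tokens=10_500, output_tokens=500, attempts=1.5)

before = cost_per_task(week_one, usd_per_m_input=3.0, usd_per_m_output=15.0)
after = cost_per_task(month_later, usd_per_m_input=3.0, usd_per_m_output=15.0)
print(f"per-task cost: ${before:.4f} -> ${after:.4f} ({after / before:.1f}x)")
```

With these assumed inputs, the per-task cost roughly triples with no change to the price list. All of the movement comes from context size and retries, which is why a budget built on the first week's usage tends not to hold.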

Unclear value is the second. Agents deploy most naturally on open-ended tasks, but open-ended tasks are the hardest to measure. Did the agent write a good summary? Did it resolve the ticket the right way? The usual automation metrics, throughput and failure rate, do not capture much. Teams that skip the hard work of defining what “good” looks like end up with a dashboard full of green numbers and no confidence that the underlying work is correct. When the CFO asks for evidence of ROI, there is nothing to point at.

Weak controls are the third. A tool with decision-making authority needs something that resembles governance: who approves what, which actions require a human in the loop, how errors get audited, what happens when the agent is wrong. Most of the projects we see get rushed past these questions because the pilot was cheap. The pilot is always cheap. The deployment rarely is, and retrofitting controls onto a deployed system is the stage where projects quietly get cancelled.
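One way to picture the minimum version of those controls is a gate that every action passes through. This is a sketch under assumptions: the action names, the approval rule, and the `ActionGate` class are all hypothetical, and a production system would add identity, persistence, and alerting.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class ActionGate:
    needs_approval: Set[str]                  # actions that require a human decision
    approver: Callable[[str, dict], bool]     # stand-in for the human review step
    audit_log: List[dict] = field(default_factory=list)

    def execute(self, action: str, params: dict, run: Callable[[dict], str]) -> str:
        """Run the action only if policy allows it, and log the decision either way."""
        approved = action not in self.needs_approval or self.approver(action, params)
        self.audit_log.append({"action": action, "params": params, "approved": approved})
        if not approved:
            return "blocked: awaiting human review"
        return run(params)


# Hypothetical policy: refunds need sign-off, and only small ones get it here.
gate = ActionGate(
    needs_approval={"issue_refund"},
    approver=lambda action, params: params.get("amount", 0) <= 100,
)
drafted = gate.execute("draft_reply", {"ticket": 42}, lambda p: "drafted")
refund = gate.execute("issue_refund", {"amount": 500}, lambda p: "refunded")
```

Here the draft goes through, the large refund is blocked, and both decisions land in `audit_log`. Writing the policy down is the useful part. The code is trivial, and that is why it is cheaper to add during the pilot than after deployment.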

What the 17% seem to share

The organizations already running agents in production share a pattern. They picked a scope narrow enough that the answer to “did this work” is obvious. They instrumented heavily from day one, so cost, quality, and failure modes were visible before anyone had to defend the project. They treated the agent like a junior hire with write access, which meant review for a while before any autonomy.

The 60% who plan to deploy soon are not going to skip those steps successfully. Some will figure it out and join the first group. Some will realize their project was really just automation with a better interface, which is fine, and often the right answer. The ones that get cancelled in 2027 will be the ones that believed the category description instead of looking carefully at the thing underneath it.

We are not pessimistic about the category. We work in it. But the word “agent” is doing too much heavy lifting right now, and most of the disappointment that will show up next year is already priced into the decisions being made this quarter.
