The Arup case is now more than two years old. A finance employee in the company’s Hong Kong office sat through a video meeting, watched the UK-based CFO and several colleagues request a sequence of transfers, and processed fifteen of them, totaling roughly twenty-five and a half million US dollars. None of the people on the call were real. The fraud went undetected for about a week.
Most of the public coverage has framed this as a detection problem. The video looked convincing enough. The voices sounded right. The participants moved naturally on screen. The technology has improved past the point where a person, watching a screen on a normal day, can reliably tell real from fake. The State of the Call 2026 report put a number on how widespread the experience has become: roughly one in four Americans say they received a deepfake voice call in the last twelve months. Sumsub puts deepfakes at about eleven percent of all global fraudulent activity. The detection question, framed as “is this video real,” is genuinely hard, and getting harder.
The detection framing is also, we think, the wrong frame.
The fraud succeeded because there was no second channel
The Arup employee did not approve a wire transfer because they could not tell the video was fake. They approved fifteen wire transfers because nothing in the workflow required a second channel of confirmation. The video meeting was the channel. The video meeting was also the only channel. If the call had been a phone call from the same fake CFO, asking for the same transfers, the loss would not have been smaller. The novelty of the case is not that the impersonation was visual. The novelty is that “I saw them on a video call” had become someone’s idea of a sufficient authorization for moving twenty-five million dollars.
That assumption is older than deepfakes. Most office processes inherited it from a prior decade where the cost of producing a convincing live-video impersonation was high enough that nobody bothered. Fraud was dominated by phishing emails and forged invoices, both of which had obvious in-band tells that staff were trained against. Live video, by comparison, looked like proof. The Arup case is the moment that assumption stopped paying for itself.
The second-channel problem is not new in any system that takes its threat model seriously. Out-of-band confirmation is the default in most engineering and security workflows. A deploy request that arrives through one channel cannot be approved in that same channel. A code review request comes in through one place and the merge happens in another. A credential reset arrives via email and confirms via SMS or an authenticator app. None of those workflows trust the original message in isolation. The reason the deepfake era feels so disorienting is that ordinary office work, especially in finance and operations, never adopted that discipline. The cost of the incident at Arup is partly the cost of fifteen transfers. It is mostly the cost of an industry catching up to a protocol it had skipped.
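To make the rule concrete, here is a minimal sketch of out-of-band confirmation in Python. The `Request` class and the channel names are hypothetical, chosen for illustration rather than taken from any real system. The only point it carries is that a confirmation arriving on the channel where the request originated never counts.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Hypothetical record of a sensitive request and its confirmations."""
    action: str
    channel: str  # channel the request arrived on
    confirmations: set[str] = field(default_factory=set)  # channels that confirmed

    def confirm(self, channel: str) -> None:
        self.confirmations.add(channel)

    def is_authorized(self) -> bool:
        # The originating channel is excluded: it cannot vouch for itself.
        return bool(self.confirmations - {self.channel})


req = Request(action="wire transfer", channel="video_call")

req.confirm("video_call")                # the call "confirming" itself
print(req.is_authorized())               # False

req.confirm("callback_to_known_number")  # a genuinely separate channel
print(req.is_authorized())               # True
```

However convincing the video call is, it contributes nothing to `is_authorized()`. Only the callback changes the answer.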
What the recommended rituals are actually doing
The advice circulating after each new deepfake fraud story converges on a small set of rituals: agree on a code phrase with family members, call back on a known phone number when an unusual request comes in, route any wire transfer above a threshold through a dual-approval workflow that touches at least two communication channels, ask the person on the call a question only the real person can answer. None of those rituals are sophisticated. They are the household and small-team version of out-of-band verification.
The interesting thing about the list is how much of it the engineering side of the same companies already does. A request to deploy to production from a chat channel is normally routed through a separate review tool. A request to rotate a key requires confirmation through a different system than the one where the request landed. A high-privilege action triggers a notification on a phone the requester doesn’t control. The household rituals are these same moves, restated for a meeting room.
What changes when scam compounds start hiring full-time fake “AI models” to staff video calls, as Malwarebytes documented this year, is the volume of attempts and the production quality. What does not change is the fix. There is no detection tool that becomes a long-term answer. The arms race between generation and detection runs in favor of generation, because every detector becomes the next generator’s training signal. The protocol fix does not have that property. A second channel is a second channel regardless of how good the first one looks.
The harder question is who carries the discipline
In an engineering organization, the second-channel discipline is carried by tooling. The user does not have to remember to require dual approval; the deploy script will refuse without it. Most office finance flows are not built that way. The discipline is carried by a person, often the most junior person in the chain, who is also the person under the most time pressure when the fake CFO calls about a deal that has to close before market open in another time zone. That mismatch is the actual point of attack, and it is the part the technology coverage tends to underweight.
The shift we expect over the next eighteen months is not better deepfake detectors deployed at the edge of every video call. It is finance and ops teams adopting the same kind of refusal-by-default tooling that engineering already runs on. Wire-transfer approval becomes a workflow, not a conversation. The system blocks the second transfer until a callback to a known number has been logged. The dual-approval requirement is enforced by the rail, not by the policy doc. None of that is novel work. It just has not been priced in for office finance the way it is priced in for production deploys.
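As a rough sketch of what “enforced by the rail” could look like, the Python below refuses to release a transfer until a callback has been logged and enough distinct approvers have signed off. Everything here is an assumption for illustration: the names `TransferRequest`, `TransferBlocked`, and `release`, and the threshold value, are hypothetical and do not describe any real payment system.

```python
from dataclasses import dataclass, field

# Hypothetical threshold above which two approvers are required.
DUAL_APPROVAL_THRESHOLD_USD = 10_000


class TransferBlocked(Exception):
    """Raised when a transfer does not meet the release requirements."""


@dataclass
class TransferRequest:
    amount_usd: float
    requested_by: str
    callback_logged: bool = False
    approvers: set[str] = field(default_factory=set)

    def log_callback(self) -> None:
        # Record that the requester was called back on a number already on file.
        self.callback_logged = True

    def approve(self, approver: str) -> None:
        if approver == self.requested_by:
            raise TransferBlocked("requester cannot approve their own transfer")
        self.approvers.add(approver)


def release(transfer: TransferRequest) -> str:
    """Refuse by default; release only when every check passes."""
    if not transfer.callback_logged:
        raise TransferBlocked("no callback to a known number has been logged")
    required = 2 if transfer.amount_usd >= DUAL_APPROVAL_THRESHOLD_USD else 1
    if len(transfer.approvers) < required:
        raise TransferBlocked(
            f"needs {required} approvers, has {len(transfer.approvers)}"
        )
    return "released"
```

The design choice that matters is that `release` raises rather than warns. The junior employee under time pressure does not have to remember the policy, because the transfer cannot move without the logged callback and the second approver.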
The cinematic reading of the Arup story, the reading where the fraud succeeded because the deepfake was good, is the version that makes the technology feel inevitable and the defense feel hopeless. The quieter reading is more useful. The fraud succeeded because, at the moment of the request, there was nothing in the workflow that asked a second question through a second channel. The deepfake era did not invalidate trust at work. It revealed how much of that trust was sitting on a single channel that had only ever been good enough by accident.