AI return on investment (ROI) is becoming harder to quantify. The problem is not that AI has no value; it is a matter of time and hindsight.
The problem is that many ROI models are built around the easiest parts of the AI story to count: tool adoption, output volume, time to generate, and vendor-promised productivity gains. Conventional wisdom treats those numbers as evidence that the investment is working. In contrast, the harder parts are arriving later, and they are not in the model.
The CFO problem with AI ROI is not one problem. It is three problems converging in the same space:
- Expected savings are underperforming
- Actual costs are becoming harder to control
- Human verification labor is still missing from the calculation
AI ROI is being squeezed from both sides before the human governance layer is even priced.
The Savings Story Is Not Arriving Cleanly
The AI investment case has been sold as a cost-reduction story: “fire humans, AI is cheap.” But recent data makes that story harder to accept at face value.
A Bain & Company survey reported by Bloomberg found that 40% of companies were showing AI cost reductions of 10% or less, while only 4% globally had reached savings above 30%. AI budgets are being approved in anticipation of savings that may not arrive on the original timeline, in the original magnitude, or in the original cost model.
If AI generates more output but requires more review, the gross speed gain is not the net savings. If AI reduces drafting time but increases senior verification time, the labor did not disappear. It moved into a more expensive layer.
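The gap between gross and net savings can be sketched in a few lines of Python. This is a minimal illustration, not a finance model: the function name `net_labor_savings`, the hourly rates, and the minute counts are all assumptions chosen for the example, not figures from the surveys cited here.

```python
def net_labor_savings(draft_minutes_saved, drafter_rate,
                      review_minutes_added, reviewer_rate):
    """Return (gross, net) savings per task in dollars.

    Gross counts only the drafting time removed. Net subtracts the
    review time that moved to a more expensive reviewer.
    """
    gross = draft_minutes_saved / 60 * drafter_rate
    review_cost = review_minutes_added / 60 * reviewer_rate
    return gross, gross - review_cost


# Assumed figures: 30 drafting minutes saved at $60/hour,
# 15 senior review minutes added at $150/hour.
gross, net = net_labor_savings(30, 60.0, 15, 150.0)
print(f"gross: ${gross:.2f}, net: ${net:.2f}")
```

Under these assumed numbers, a $30.00 gross saving per task becomes a $7.50 net loss, because the added review is billed at the higher rate.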
The Cost Side Is Becoming Less Predictable
The Information reported that Anthropic customers, including ServiceNow, are facing difficulty predicting what they will pay this year. ServiceNow said it had consumed its full-year budget for Anthropic AI tools before the year was out.
Companies pushed employees to use AI as proof of adoption, and some even tracked high token consumption as a success metric. Now the bill is arriving. Token volume is not ROI. For CFOs, the question is whether AI usage produced measurable operating value after token cost, review burden, and rework were counted.
More usage does not automatically mean more savings. More complex usage can mean more expense, more monitoring, and more review.
Token Spend Is a Behavior Problem
ServiceNow’s chief digital information officer said better telemetry would make it easier to identify employees who are “tokenmaxxing,” or using AI tools inefficiently.
Tokenmaxxing exposes the CFO flaw in AI ROI: AI consumption is treated as if it were evidence of value, when it may only be evidence of activity. The more employees are pushed to use AI, the more token consumption can rise. But higher usage does not prove that costs fell, judgment improved, rework declined, or business value increased.
Reuters Breakingviews has also described corporate AI sticker shock as companies confront tokenmaxxing, usage-based pricing, and AI bills that are harder to forecast. The ServiceNow discussion is part of the same pattern: AI spend is becoming behavioral, metered, and harder to control.
The Human Governance Layer Still Has to Be Priced
If a tool produces a draft in two minutes but a senior employee spends twenty minutes verifying, correcting, and contextualizing it, the ROI calculation cannot stop at the two-minute output. AI externalizes execution while internalizing judgment. If finance counts the first without measuring the second, the return calculation is incomplete.
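A short Python sketch makes the two-minute, twenty-minute example concrete. The two and twenty come from the scenario above; the manual baselines (how long the task took before AI) and the helper name `net_minutes_saved` are assumptions added for illustration.

```python
def net_minutes_saved(manual_minutes, ai_draft_minutes, verification_minutes):
    """Minutes saved per task once human verification is counted."""
    return manual_minutes - (ai_draft_minutes + verification_minutes)


AI_DRAFT = 2   # minutes, from the scenario above
VERIFY = 20    # minutes, from the scenario above

# Assumed manual baselines, in minutes, for three hypothetical task types.
for manual in (15, 22, 60):
    output_only = manual - AI_DRAFT  # what an output-only model reports
    full = net_minutes_saved(manual, AI_DRAFT, VERIFY)
    print(f"manual={manual:>2}  output-only saving={output_only:>2}  "
          f"saving after verification={full:>3}")
```

With these assumptions, the task that used to take 15 minutes now costs 7 minutes more than before, the 22-minute task breaks even, and only the 60-minute task shows a real saving. An output-only model reports a gain in all three cases.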
This is where the Power User Trap℠ becomes a finance issue. The power user is the employee through whom AI becomes usable inside real work: learning the failure modes, supplying institutional context, catching plausible errors, and translating machine output into something the organization can rely on.
Gartner warned CFOs in 2026 not to mistake AI deployment for value creation. That warning is the correct finance posture: deployment is not the same as durable operating value.
The CFO Question Has Changed
The CFO question is no longer simply: how much are we spending on AI? It is also: what did AI actually remove, what did it relocate, and what did it create?
The full cost model includes licenses, tokens, vendor fees, usage monitoring, deployment labor, internal review burden, rework, and the power users whose judgment stabilizes the system without appearing in any budget line.
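One way to see the effect of the missing lines is to total them explicitly. The sketch below is a hypothetical structure, not a standard accounting template: the class name `AICostModel`, the choice of which lines count as budgeted, the dollar amounts, and the benefit figure are all assumptions for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class AICostModel:
    # One field per cost line named in the paragraph above (annual dollars).
    licenses: float
    tokens: float
    vendor_fees: float
    usage_monitoring: float
    deployment_labor: float
    internal_review: float
    rework: float
    power_user_time: float

    # Assumption: only these lines appear in the approved AI budget.
    BUDGETED = ("licenses", "tokens", "vendor_fees")

    def total(self) -> float:
        """Sum of every cost line, budgeted or not."""
        return sum(getattr(self, f.name) for f in fields(self))

    def budgeted(self) -> float:
        """Sum of only the lines assumed to be visible in the budget."""
        return sum(getattr(self, name) for name in self.BUDGETED)


def roi(benefit: float, cost: float) -> float:
    """Simple ROI: net gain divided by cost."""
    return (benefit - cost) / cost


# Illustrative, assumed figures only.
model = AICostModel(
    licenses=200_000, tokens=150_000, vendor_fees=50_000,
    usage_monitoring=30_000, deployment_labor=70_000,
    internal_review=120_000, rework=40_000, power_user_time=90_000,
)
benefit = 600_000  # assumed annual operating benefit

print(f"ROI on budgeted cost: {roi(benefit, model.budgeted()):.0%}")
print(f"ROI on full cost:     {roi(benefit, model.total()):.0%}")
```

With these assumed inputs, the same benefit reads as a 50% return against the budgeted lines and a 20% loss against the full cost model. The point is the sign change, not the specific numbers.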
The financial question is not whether AI can produce output faster. The financial question is whether the organization can prove that faster output became durable operating value after the full human and technical cost was counted.
The Materiality Question Set
Before the next board meeting or budget cycle, the organization should be able to answer:
- Which AI investments are being evaluated through adoption or output volume rather than verified operating value?
- If verification, correction, and informal training are priced in, does the AI investment still clear the approval threshold?
- Where are productivity gains being attributed to AI when the output is being stabilized by unmeasured human judgment?
- What evidence substantiates AI efficiency claims made to the board or investors beyond output volume or adoption rate?
Series Context
This is the fourth article in Lozen Advisory’s AI Workforce Materiality series. The next article, “AI Productivity Claims Are Becoming a Disclosure-Control Problem,” examines whether organizations can support public claims about AI-driven productivity when the underlying work depends on unmeasured human verification.
Commission a Strategic Briefing
AI adoption is creating unpriced liability for CFOs, substantiation risk for General Counsel, and continuity exposure for boards. Lozen Advisory delivers private advisory on AI implementation risk, unmeasured verification strain, and the human-capital exposure organizations are building without measuring.