You Are Using The Wrong AI Metric
Adoption metrics don’t track success. They manufacture it.
Monday morning. Bob opens his laptop and converts his bullet points into a polished presentation using AI. Professional slides, coherent flow, impressive visuals. He hits send.
Alice receives it. Twenty slides. No time for this. She feeds it to her AI assistant: “Summarize the key points.”
The AI returns bullet points.
Bob’s bullets became a presentation became bullets. Content made a round trip through artificial intelligence only to arrive exactly where it started. Except now the organization paid for it twice... and neither Bob nor Alice engaged with the substance at all.
On the dashboard, everything looks great. Two employees adopted AI. Adoption rate: up. Prompts used: up. Content generated: up.
Welcome to the Round-Trip Economy.
Looking for Measurement
Measurement isn’t neutral. It’s a strategy.
Organizations measure AI adoption because they can’t measure AI value. As Christopher S. Penn puts it: “If you don’t know the ROI of what you’re doing today, you cannot calculate the ROI of AI’s impact on it.”
Most organizations never measured knowledge work value in the first place. So they default to what’s visible: activity. But when activity becomes the target, it ceases to measure anything real. Goodhart’s Law at enterprise scale.
According to Worklytics’ 2025 survey, 74% of organizations say better metrics are critical. Only 17% consider themselves effective at measuring real value.
Erik Brynjolfsson calls this the “productivity J-curve”: inputs rise before outputs do. Real transformation requires redesigned workflows, upskilled teams, new processes. But quarterly KPIs penalize exactly those investments.
Wrong metrics don’t just fail to measure success. They actively incentivize behaviors that destroy value.
What Activity Metrics Create at Scale
41% of employees now receive what Harvard Business Review calls “workslop”: AI-generated content that looks professional but contains no substance. Each instance costs roughly two hours to decode, verify, and redo. As one retail director put it: “I had to waste my own time having to redo the work myself.” Researchers estimate that adds up to $9 million annually for a 10,000-person organization.
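The $9 million figure is easy to sanity-check with back-of-envelope arithmetic. A minimal sketch, assuming one workslop incident per affected employee per month and a $90 loaded hourly cost (both assumptions of mine, not figures from the research):

```python
# Back-of-envelope estimate of the annual "workslop tax" for a large
# organization. The 41% share and ~2 hours per incident come from the
# article; the incident rate and loaded hourly cost are assumptions.

def annual_workslop_cost(headcount, share_affected=0.41,
                         incidents_per_month=1.0, hours_per_incident=2.0,
                         loaded_hourly_cost=90.0):
    """Estimated yearly cost of decoding, verifying, and redoing AI filler."""
    affected = headcount * share_affected
    yearly_incidents = affected * incidents_per_month * 12
    return yearly_incidents * hours_per_incident * loaded_hourly_cost

cost = annual_workslop_cost(10_000)
print(f"${cost / 1e6:.1f}M per year")  # roughly $8.9M, in line with the ~$9M estimate
```

Under those assumptions the model lands within rounding distance of the researchers’ number, which suggests the headline estimate is ordinary lost-hours arithmetic rather than anything exotic.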
The financial cost is bad enough. The social damage is worse. 32% of workslop recipients are less likely to work with the sender again. Metrics optimized for individual productivity destroy organizational collaboration.
Most organizations treat AI as a faster version of what they already had. Penn identifies five dimensions where it actually transcends human limits: speed, scale, flexibility, complexity, patience. Real transformation comes from combining at least three simultaneously. But that requires building the motorway first—redesigning processes around AI’s actual capabilities.
Most organizations aren’t building motorways. They’re measuring how fast employees pedal.
Sometimes the damage doesn’t show up in any dashboard. Consider meetings. Organizations now record everything: for transcription, for AI summaries, for “knowledge capture.” But when recorded, some people stop saying what they think. They hedge. They wait for the recording to stop before speaking candidly.
On the dashboard: meetings recorded, transcripts generated, summaries delivered. Off the dashboard: the question someone didn’t ask, the idea that stayed unspoken, the challenge to the plan that would have changed everything.

Here’s a pattern that confuses organizations: individual tasks get faster, but organizational output stays flat.
AI coding assistants increase developer output, but not company productivity. Faros AI tracks it precisely: “Downstream bottlenecks absorb the value.”
The dynamic is consistent across functions. Developers ship code 35% faster, but QA can’t keep up and features pile up in review queues. Content teams triple their output, but legal review becomes the chokepoint. Marketing creates more campaigns, but creative approval grinds to a halt. The constraint doesn’t disappear. It moves downstream to wherever AI hasn’t been deployed, and intensifies there.
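The arithmetic behind this is basic theory of constraints: a pipeline delivers only as fast as its slowest stage, so accelerating an upstream stage grows a queue instead of output. A toy sketch (stage names and rates are illustrative, not from the article):

```python
# Toy pipeline model: end-to-end throughput is capped by the slowest
# stage, so speeding up one stage with AI just grows the queue in front
# of the unchanged bottleneck. All rates here are illustrative.

def throughput(stage_rates):
    """Items per week the whole pipeline can actually deliver."""
    return min(stage_rates.values())

before = {"dev": 10.0, "qa": 8.0, "release": 9.0}   # features/week per stage
after  = {"dev": 13.5, "qa": 8.0, "release": 9.0}   # dev +35% with AI assistants

print(throughput(before), throughput(after))        # 8.0 8.0 -- delivery unchanged
backlog_growth = after["dev"] - throughput(after)   # 5.5 features/week pile up at QA
```

Delivered output stays pinned at the QA rate; the only thing the faster stage produces is a growing review queue, which is exactly the “downstream bottlenecks absorb the value” pattern.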
Meanwhile, 44% of marketers now use AI for content summarization. The loop closes: AI generates content humans can’t process, so organizations deploy more AI to summarize what AI created.
The primary skill becomes knowing what to ignore. That’s not productivity. That’s triage.
What Actually Separates the 5%
Only 5% of companies achieve what Boston Consulting Group calls “future-built” status—five times the revenue growth, three times the cost reductions. Tempting to ask what they do differently.
Less tempting to sit with the honest answer: they think harder about the question behind the question.
Harvard and Mayo Clinic researchers found that combining physician intuition with algorithmic analysis cuts hospital readmissions by 26%; neither alone performs as well. The metric that mattered wasn’t “percentage of doctors using AI.” It was readmission rate. But the reason this worked wasn’t a methodology. Physicians and data scientists had to figure out, together, what the AI could see that humans couldn’t.
Most organizations can’t define what AI success looks like because they’ve never clearly defined what the underlying work was supposed to accomplish. The easy answer is “measure outcomes instead of activity.” The honest answer is harder: outcomes in knowledge work are difficult to specify, results take longer to materialize than any quarterly cycle allows, and the J-curve means you’ll look like you’re failing before you’re winning. There’s no shortcut around that.
What the 5% share is less a playbook than a mindset. They ask, seriously, which problems actually warrant deploying AI, and which ones just seem convenient. They know that no organization wins because they automated their meeting recaps. Competitive advantage comes from deploying AI where its scale, speed, and pattern recognition create something genuinely new: an analysis no human team could have produced, a decision informed by signals no individual could have tracked. That’s a different category from faster slide decks.
HBR research distinguishes “pilots” from “passengers”: pilots use AI with intent, to extend what they’re capable of; passengers use it to offload work they didn’t want to do. Pilots use AI 75% more, and to better effect. The difference isn’t skill. It’s understanding. Pilots know why they’re using AI on a given task. Passengers don’t.
That’s the real diagnostic: not how many prompts your team generates, but whether they can articulate what they’re trying to accomplish and why AI belongs in that work. Adoption metrics have their place, but only alongside honest outcome thinking.
The 5% who see transformative returns built the motorway before they bought the supercar.
Liked this article? You may want to read the following, which explores the same paradox from the individual’s perspective. Same question, different scale.
Why AI Disappoints At Productivity - But Excels At Ambition
The best AI users don't spend less time on their projects; they spend more. We thought we would get a speed machine. We got a depth machine instead.




