Dualo
AI in Practice

AI ROI & Metrics — how to measure real value

Most AI projects fail on measurement, not tech. Here's what to track to know if it's actually working and to justify scaling the budget.


The most common reason AI projects die is unmeasured value. Executive excitement funds the proof of concept (POC); six months later there is no data to justify the next budget, and the project stalls. Measure from day one.

**4 metric families** to track for any AI project: (1) **Quality** (does it work? Accuracy, F1, CSAT on AI answers), (2) **Efficiency** (does it save time or money? Minutes saved per task, cost per query, deflection rate), (3) **Adoption** (do people use it? DAU, % of eligible tasks actually done with AI, time-to-engagement), (4) **Risk** (what's going wrong? Hallucination rate, incident count, complaint volume).

ROI calculation pattern: [value created (time saved × hourly rate + errors avoided × cost per error + new revenue)] − [cost (API calls + infra + engineering time + change management)]. Do the math twice: upfront as a hypothesis, and at 3 months with measured figures. If the two diverge by 3× or more, treat it as a calibration of your assumptions rather than a failure.
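A minimal Python sketch of the pattern above, comparing a hypothesis against a measurement. The class and function names (`RoiInputs`, `net_roi`, `diverges`) and every figure are illustrative assumptions, not values from any real project; treating a zero or negative net value as "diverged" is also a choice made here for simplicity.

```python
from dataclasses import dataclass


@dataclass
class RoiInputs:
    """One period of ROI inputs. Field names are illustrative, not a standard."""
    hours_saved: float
    hourly_rate: float
    errors_avoided: float
    cost_per_error: float
    new_revenue: float
    api_cost: float
    infra_cost: float
    engineering_cost: float
    change_mgmt_cost: float


def value_created(x: RoiInputs) -> float:
    # time saved x hourly rate + errors avoided x cost per error + new revenue
    return (x.hours_saved * x.hourly_rate
            + x.errors_avoided * x.cost_per_error
            + x.new_revenue)


def total_cost(x: RoiInputs) -> float:
    # API calls + infra + engineering time + change management
    return x.api_cost + x.infra_cost + x.engineering_cost + x.change_mgmt_cost


def net_roi(x: RoiInputs) -> float:
    return value_created(x) - total_cost(x)


def diverges(hypothesis: float, measured: float, factor: float = 3.0) -> bool:
    """True when the two net figures differ by `factor`x or more."""
    if min(hypothesis, measured) <= 0:
        # A zero or negative net value cannot be compared as a ratio; flag it.
        return True
    return max(hypothesis, measured) / min(hypothesis, measured) >= factor


# Made-up figures for one quarter, in euros.
hypothesis = RoiInputs(hours_saved=1200, hourly_rate=50, errors_avoided=40,
                       cost_per_error=200, new_revenue=0, api_cost=5000,
                       infra_cost=3000, engineering_cost=20000,
                       change_mgmt_cost=4000)
measured = RoiInputs(hours_saved=800, hourly_rate=50, errors_avoided=25,
                     cost_per_error=200, new_revenue=0, api_cost=8000,
                     infra_cost=3000, engineering_cost=22000,
                     change_mgmt_cost=6000)

print(net_roi(hypothesis), net_roi(measured),
      diverges(net_roi(hypothesis), net_roi(measured)))
# prints: 36000 6000 True
```

Keeping the hypothesis and the measurement in the same structure makes it easy to see which input moved: here, fewer hours saved and higher running costs.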

Typical value benchmarks (for sizing the business case): customer support deflection of 20-40% of volume = hundreds of thousands of euros per year at a mid-size company. Internal Q&A (HR/IT/legal bots): 10-15 min saved × 1000 queries/week ≈ 170-250 hours/week, or about 4-6 FTE of gross time at a 40-hour week (expect to realize only part of it). Document extraction: 80% automation × 10k docs/month = 2-5 FTE. Sales email drafting: 30% faster = ~5h/week per sales rep.
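To sanity-check a benchmark like these, it can help to convert volumes into hours and then into FTE. The sketch below does that for the document extraction case. The helper name `fte_saved`, the 3 minutes saved per document, and the 160 working hours per FTE-month are assumptions chosen for illustration; the text only gives the volume and the automation rate.

```python
def fte_saved(units_per_month: float,
              minutes_saved_per_unit: float,
              automation_rate: float = 1.0,
              hours_per_fte_month: float = 160.0) -> float:
    """Gross full-time equivalents freed per month (before any discounting)."""
    hours = units_per_month * automation_rate * minutes_saved_per_unit / 60.0
    return hours / hours_per_fte_month


# 10k docs/month at 80% automation (from the text),
# 3 minutes saved per automated document (assumed).
docs_fte = fte_saved(units_per_month=10_000,
                     minutes_saved_per_unit=3,
                     automation_rate=0.8)
print(f"{docs_fte:.1f} FTE")  # prints: 2.5 FTE
```

The result is gross time. How much of it turns into real value depends on what people do with the freed hours.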

Watch out for four traps: (i) vanity metrics ('the bot answered 10k questions' is useless without CSAT); (ii) displaced time ('I saved 10 min', but that time went to scrolling rather than to higher-value work); (iii) hidden costs (engineering maintenance, prompt iteration, eval set upkeep: typically 20-30% of the initial build); (iv) sample bias (the N=20 who love it ≠ the 500 eligible users).

The leading indicator of success: user adoption at 90 days. If fewer than 30% of eligible people use it regularly, the problem is change management, not technology. Rework the UX, the onboarding, and the integrations with existing tools (Slack, CRM, email).
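One way to compute that adoption figure from a usage log, as a hedged sketch. The text does not define "regularly", so the definition used here (active on at least 4 distinct days in the 28 days before the check) is an assumption, as are the function name, the log format, and the sample data. The denominator is the full eligible population, not just the people who tried the tool.

```python
from datetime import date, timedelta

ADOPTION_THRESHOLD = 0.30  # from the text: below 30%, look at change management


def adoption_rate(eligible: set[str],
                  usage: dict[str, list[date]],
                  as_of: date,
                  window_days: int = 28,
                  min_active_days: int = 4) -> float:
    """Share of eligible users who were 'regular' in the window before as_of."""
    if not eligible:
        raise ValueError("eligible population is empty")
    window_start = as_of - timedelta(days=window_days)
    regular = 0
    for user in eligible:
        # Distinct days with activity inside the window (window_start, as_of].
        active_days = {d for d in usage.get(user, [])
                       if window_start < d <= as_of}
        if len(active_days) >= min_active_days:
            regular += 1
    return regular / len(eligible)


launch = date(2025, 1, 6)              # hypothetical launch date
check_day = launch + timedelta(days=90)
weekly = [check_day - timedelta(days=k) for k in (1, 8, 15, 22)]

eligible_users = {f"u{i}" for i in range(10)}
usage_log = {
    "u0": weekly,            # regular
    "u1": weekly,            # regular
    "u2": weekly[:1],        # tried it once: not regular
    "contractor": weekly,    # not in the eligible population: ignored
}

rate = adoption_rate(eligible_users, usage_log, as_of=check_day)
print(f"{rate:.0%}", "below threshold" if rate < ADOPTION_THRESHOLD else "ok")
# prints: 20% below threshold
```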

Grounded on https://www.anthropic.com/customers
