The Metrics
What an engagement looks like measured. A redacted view of a working KPI dashboard, with five years of anonymized client data. The structure, the visualizations, and the numbers are real — only the client and project specifics have been removed.
Knowledge work has always been hard to measure objectively. Story points get gamed; subjective reviews carry bias; manager intuition drifts. The AI Bench instruments output continuously — per agent, per engineer, per ticket, per hour. For the first time, engineering productivity is as observable as physical work.
Two-week sprint, four engineers
Same team, same sprint, with and without a tuned AI Bench.
Chart panels: Without AI Bench, With AI Bench.
Same team, same sprint, twenty-five times the output. Cost per story point drops from $750 to about $31 — the kind of number leadership can actually do something with. The team chooses how to spend the gain: more features, or the same features in days instead of weeks.
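As a rough check on that arithmetic: if the sprint's labor cost stays fixed, cost per story point falls in proportion to output. The sketch below assumes labor is the only cost unless told otherwise. The function name and the `extra_cost_per_point` parameter are illustrative and do not come from the dashboard.

```python
def cost_per_point(baseline_cost_per_point: float,
                   output_multiplier: float,
                   extra_cost_per_point: float = 0.0) -> float:
    """Cost per story point after an output multiplier.

    Assumes the sprint's labor cost is unchanged, so labor cost per point
    falls in proportion to output. extra_cost_per_point is a hypothetical
    knob for non-labor costs (for example compute) spread over each point.
    """
    if output_multiplier <= 0:
        raise ValueError("output_multiplier must be positive")
    return baseline_cost_per_point / output_multiplier + extra_cost_per_point


# Figures quoted above: $750 per point before, 25x the output after.
labor_only = cost_per_point(750.0, 25.0)
print(f"labor-only cost per point: ${labor_only:.2f}")
```

On labor alone this comes to $30 per point. The quoted figure of about $31 is slightly higher, which would be consistent with a small non-labor cost per point, though the page does not break that down.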
Time to ship — 9,550 tickets, five years
Distribution of ticket resolution time before and after the AI Bench was introduced. Log-spaced bins. Median lines marked.
Quality mix over time
Share of total effort by ticket category, by year. Stable until the AI Bench is introduced — then bug-fix shrinks, and feature, refactor, exploration, and documentation work all grow.
1 day, 1 engineer, 7 agents in the Bench
145h human estimate → 7h 29m human hands-on (plus 13h of Bench time) · 19× less human time · 39 tasks closed.
| Agent | Tasks | Human estimate | Human hands-on | AI actual | Ratio | Tokens (M) | Commits |
|---|---|---|---|---|---|---|---|
| A | 20 | 63h | 20m | 1.2h | 53.2× | 84.02 | 4 |
| B | 1 | 4.0h | 5m | 16m | 15.0× | 18.23 | 1 |
| C | 1 | 4.0h | 5m | 18m | 13.3× | 7.35 | 1 |
| D | 10 | 52h | 37m | 5.2h | 10.1× | 201.72 | 9 |
| E | 1 | 4.0h | 10m | 47m | 5.1× | 37.04 | 8 |
| F | 5 | 17h | 1h 35m | 4.4h | 3.9× | 285.98 | 13 |
| G | 1 | 1.0h | 12m | 32m | 1.9× | 20.33 | 1 |
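
The table's totals and ratios can be recomputed from its own rows. The sketch below reads the Ratio column as human estimate divided by AI actual time. That reading is inferred from the numbers, not stated by the source, and the helper `to_hours` is illustrative.

```python
def to_hours(text: str) -> float:
    """Parse durations written like '63h', '4.0h', '20m', or '1h 35m'."""
    hours = 0.0
    for part in text.split():
        if part.endswith("h"):
            hours += float(part[:-1])
        elif part.endswith("m"):
            hours += float(part[:-1]) / 60.0
        else:
            raise ValueError(f"unrecognized duration: {part!r}")
    return hours


# (agent, tasks, human estimate, AI actual) copied from the table above.
ROWS = [
    ("A", 20, "63h", "1.2h"),
    ("B", 1, "4.0h", "16m"),
    ("C", 1, "4.0h", "18m"),
    ("D", 10, "52h", "5.2h"),
    ("E", 1, "4.0h", "47m"),
    ("F", 5, "17h", "4.4h"),
    ("G", 1, "1.0h", "32m"),
]

# Assumed definition: ratio = human estimate / AI actual time.
ratios = {agent: to_hours(est) / to_hours(ai) for agent, _, est, ai in ROWS}
total_tasks = sum(tasks for _, tasks, _, _ in ROWS)
total_estimate = sum(to_hours(est) for _, _, est, _ in ROWS)
total_ai = sum(to_hours(ai) for _, _, _, ai in ROWS)

for agent, ratio in ratios.items():
    print(f"{agent}: {ratio:.1f}x")
print(f"{total_tasks} tasks, {total_estimate:.0f}h estimated, {total_ai:.1f}h AI time")
```

The task and estimate totals match the headline (39 tasks, 145h), and AI time sums to roughly the 13h of Bench time quoted. Per-agent ratios land close to the published column; small differences, such as row A, presumably come from the table showing rounded hours.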
Actual engagement dashboards go further: per-agent trends over time, share-of-attention breakdowns, tokens-per-hour comparisons, weekly aggregates with prior-day deltas. The table above is only the daily headline.
Weekly trend — AI hours versus human hands-on
Daily breakdown across one work week, with weekly means shown as dashed horizontal references.
The day's economics
What the same day would have cost in traditional engineering versus what it actually cost. Adjust the engineer rate to your own cost basis — offshore hourly, onshore annual, or anywhere in between.
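The comparison comes down to simple arithmetic. This sketch assumes the day in question is the one in the headline figures earlier on this page (145h estimated, 7h 29m hands-on). The hourly rate, the compute cost, and the 2,080-hour working year are placeholders to replace with your own figures. None of them comes from the source, and the function names are illustrative.

```python
ESTIMATED_HOURS = 145.0          # human estimate for the day's 39 tasks
HANDS_ON_HOURS = 7 + 29 / 60     # 7h 29m of human hands-on time


def day_economics(hourly_rate: float, compute_cost: float = 0.0) -> dict:
    """Compare traditional cost with actual cost for one day.

    hourly_rate: fully loaded engineer cost per hour (your own figure).
    compute_cost: token/compute spend for the day (hypothetical input;
    the source does not state it).
    """
    traditional = ESTIMATED_HOURS * hourly_rate
    actual = HANDS_ON_HOURS * hourly_rate + compute_cost
    return {
        "traditional": traditional,
        "actual": actual,
        "avoided": traditional - actual,
    }


def hourly_from_annual(annual_cost: float, hours_per_year: float = 2080.0) -> float:
    """Convert an annual cost basis to hourly; 2080 = 52 weeks x 40 hours (assumed)."""
    return annual_cost / hours_per_year


# Example with a placeholder rate of $100/hour and no compute cost entered.
result = day_economics(hourly_rate=100.0)
print({k: round(v, 2) for k, v in result.items()})
```

For an annual cost basis, convert first with `hourly_from_annual` and pass the result as `hourly_rate`. Entering a real `compute_cost` reduces the avoided figure by the same amount.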
Engineer time, redirected
The Bench takes the routine work, and the work that always slipped — documentation, cleanup, the refactors no one got to. Your engineers spend their day on vision, strategy, and the tough problems you hired them to solve.
The numbers above the bar chart are labor cost avoided. The numbers below it are the opportunity created on top — engineer hours redirected from writing routine code to the higher-leverage work expensive engineers were hired to do.
Compute and tokens
Ticket throughput
This is what an engagement looks like measured. Real numbers, in your repository, available to your team and to whoever needs them.
If you want this kind of visibility over your own team's work, write us.
hello@fromeach.com
Source: data anonymized from five years of client engagements using Jira with Tempo for timesheet collection. We do not make up numbers. We put them on charts and teach your team how to collect the right data to tell the right story. No vibes, no feelings: this is what AI-first development actually looks like when you measure it.