Why Delivery Metrics Finally Matter to Executives
Engineering teams have always known that deployment frequency and incident recovery time matter. But for years, these metrics lived in developer tooling, invisible to the business. DORA changed that.
The DevOps Research and Assessment (DORA) program, now part of Google Cloud, analyzed thousands of engineering organizations over a decade. Their finding: four specific metrics predict both software delivery performance and organizational performance (profitability, market share, productivity). Teams that score in the top tier — "Elite" performers — deploy 973x more frequently than low performers and recover from incidents 6,570x faster.
These numbers are not a typo.
The Four DORA Metrics
1. Deployment Frequency
How often does your team deploy to production? This is the most visible measure of delivery throughput and batching behavior. Teams that deploy in large batches deploy infrequently — and large batches mean large blast radii when something goes wrong.
Elite: Multiple times per day (continuous deployment)
High: Once per day to once per week
Medium: Once per week to once per month
Low: Once per month to once every six months
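The bands above can be computed directly from your deploy log. Here is a minimal sketch that classifies a team from a list of production deployment timestamps; the input format and the per-day thresholds are my own simplified reading of DORA's published bands, not an official implementation:

```python
from datetime import datetime, timedelta

def deployment_frequency_tier(deploy_times: list[datetime]) -> str:
    """Classify a team into an approximate DORA deployment-frequency tier.

    `deploy_times` is a hypothetical input: production deployment
    timestamps covering the measurement window, pulled from your
    CI/CD system's deploy log.
    """
    if len(deploy_times) < 2:
        return "Low"  # too little data to claim anything better
    window_days = (max(deploy_times) - min(deploy_times)).days or 1
    deploys_per_day = len(deploy_times) / window_days
    if deploys_per_day >= 2:       # multiple times per day
        return "Elite"
    if deploys_per_day >= 1 / 7:   # at least weekly
        return "High"
    if deploys_per_day >= 1 / 30:  # at least monthly
        return "Medium"
    return "Low"
```

Counting against the observed window, rather than a fixed 30 days, keeps the classification honest for teams with sparse history.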
2. Lead Time for Changes
The time from a code commit being made to that commit running in production. This measures the efficiency of your entire delivery pipeline: PR review, CI, approval gates, and deployment automation.
Elite: Less than one hour
High: One day to one week
Medium: One week to one month
Low: One to six months
Long lead times usually indicate a combination of large PRs (more review time), slow CI pipelines, manual approval gates, or infrequent deployment windows. Each is independently fixable.
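Measuring this is a matter of pairing each commit with the deploy that shipped it. A sketch, assuming your pipeline can attach commit timestamps to deploy timestamps (the tuple schema here is hypothetical):

```python
from datetime import datetime, timedelta
from statistics import median

def lead_time_for_changes(changes: list[tuple[datetime, datetime]]) -> timedelta:
    """Median time from commit to production deploy.

    Each tuple is (commit_time, deploy_time) for one change; the pairing
    is assumed to come from pipeline metadata such as the commit SHAs
    attached to a deploy. Median is used rather than mean so a single
    long-stalled PR doesn't dominate the number.
    """
    deltas = [deploy - commit for commit, deploy in changes]
    return median(deltas)
```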
3. Change Failure Rate
What percentage of deployments cause a degradation requiring a hotfix, rollback, or patch? This is the quality signal in the DORA framework.
Elite: 0–15%
High/Medium/Low: 16–30% (recent DORA reports show little to no separation between the non-elite tiers on this metric)
Counterintuitively, teams with high deployment frequency often have lower change failure rates. Smaller, more frequent deployments are easier to test, easier to review, and dramatically easier to roll back. The "deploy less to break less" instinct is wrong.
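The computation itself is trivial once you can link incidents, rollbacks, or hotfixes back to the deploy that caused them; that linkage is the hard part. A sketch, assuming each deploy record carries a `failed` flag (a hypothetical schema):

```python
def change_failure_rate(deploys: list[dict]) -> float:
    """Fraction of deployments that degraded production.

    Each deploy record is assumed to have a `failed` flag, set when the
    deploy required a hotfix, rollback, or patch. Deriving that flag
    means joining your incident log against your deploy log.
    """
    if not deploys:
        return 0.0
    failures = sum(1 for d in deploys if d.get("failed"))
    return failures / len(deploys)
```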
4. Mean Time to Restore (MTTR)
When a service incident or deployment failure occurs, how long does it take to restore service? This measures your team's ability to detect, diagnose, and recover.
Elite: Less than one hour
High: Less than one day
Medium: One day to one week
Low: One week to one month
MTTR is primarily a function of observability and on-call practice, not just technical architecture. Teams with good dashboards, structured runbooks, and practiced incident response recover faster than teams with better infrastructure but poor observability.
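One instrumentation detail worth pinning down: start the clock at detection, not at root-cause identification, or the metric will flatter you. A minimal sketch over incident records from a paging tool (the tuple schema is an assumption):

```python
from datetime import datetime, timedelta

def mean_time_to_restore(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Mean of (restored_at - detected_at) across incidents.

    Tuples are (detected_at, restored_at), e.g. exported from your
    paging tool's incident log. The clock starts when the incident is
    detected, not when the root cause is found.
    """
    durations = [restored - detected for detected, restored in incidents]
    return sum(durations, timedelta()) / len(durations)
```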
How to Measure Without Gaming the Numbers
DORA metrics get gamed when they're used to evaluate the performance of individuals or teams. When deployment frequency becomes a target, engineers split trivial changes into separate deployments to inflate the count. When lead time becomes a target, teams rush or bypass code review.
The right framing: use DORA metrics as a diagnostic tool for the system, not as a report card for people. Collect them automatically from your delivery pipeline (GitHub Actions, CircleCI, PagerDuty, or a tool like Sleuth or LinearB) — not from self-reported data.
Where to Start
If your team has never measured DORA metrics, start with Deployment Frequency and MTTR — they're the most straightforward to instrument and give the most immediate signal. A team deploying once every two weeks with a four-hour MTTR has very different problems from one deploying daily with a 48-hour MTTR.
Once you have baseline measurements, identify the single biggest constraint in your pipeline — the one metric furthest from Elite — and focus improvement effort there before touching the others.
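That "furthest from Elite" rule is mechanical enough to encode. A sketch, assuming you have already assigned each metric a tier label; the ranking and tie-breaking here are my own simplification, not DORA guidance:

```python
# Rank tiers by distance from Elite; the metric with the worst tier
# is the single constraint to focus on first.
TIER_RANK = {"Elite": 0, "High": 1, "Medium": 2, "Low": 3}

def biggest_constraint(metric_tiers: dict[str, str]) -> str:
    """Return the metric whose tier is furthest from Elite.

    `metric_tiers` maps metric name -> tier label. Ties go to an
    arbitrary metric; break them with judgment, not code.
    """
    return max(metric_tiers, key=lambda m: TIER_RANK[metric_tiers[m]])

focus = biggest_constraint({
    "deployment_frequency": "High",
    "lead_time": "Medium",
    "change_failure_rate": "Elite",
    "mttr": "Low",
})
# focus == "mttr"
```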
Ready to build an engineering dashboard?
Browse our production-ready templates with realistic mock data and real KPI configurations.
Browse Dashboard Templates