DORA Metrics
DORA (DevOps Research and Assessment) metrics measure software delivery performance. They originated in the Accelerate research programme (Forsgren / Humble / Kim) and are maintained today at dora.dev. The CDF announcement of Oct 2025 codified the move from four metrics to the current canonical set of five.
LumenFlow calculates all five locally and — when a control plane endpoint is configured — ships them to the cloud as structured telemetry. For CLI usage, tag taxonomy, and batching semantics, see the advanced metrics guide.
The five metrics
DORA groups the metrics into throughput (how fast good change flows) and instability (how often change causes rework or failure).
Throughput
Deployment Frequency
How often the team releases code to production.
- Formula: `commits_in_window / days_in_window × 7`
- Unit: deploys per week
- Aggregation: mean (DORA canonical — median doesn’t make sense for a rate)
- Normalisation: LumenFlow scales the count to per-week regardless of the `--days` window, so `--days 7` and `--days 30` produce comparable values.
| Tier | Threshold |
|---|---|
| Elite | > 5 / week |
| High | 1 – 5 / week |
| Medium | 0.25 – 1 / week |
| Low | < 0.25 / week |
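The normalisation and tier classification above can be sketched as follows. The function names are illustrative, not LumenFlow's API, and the inclusivity of each tier boundary is an assumption:

```python
def deploy_frequency(commits_in_window: int, days_in_window: int) -> float:
    """Normalise a raw commit count to deploys per week."""
    return commits_in_window / days_in_window * 7

def frequency_tier(per_week: float) -> str:
    """Classify against the tier table; boundary inclusivity is an assumption."""
    if per_week > 5:
        return "elite"
    if per_week >= 1:
        return "high"
    if per_week >= 0.25:
        return "medium"
    return "low"

# A 30-day window with 30 deploys normalises to 7 per week, same as 7 in 7 days.
print(frequency_tier(deploy_frequency(30, 30)))  # elite
```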
Lead Time for Changes
Time from code change to running in production.
- Formula: WU cycle time = `completed_at − claimed_at`
- Unit: hours
- Aggregation: median (DORA canonical; mean is skewed by long-tail PRs)
- LumenFlow also reports mean and p90 so dashboards can track both central tendency and tail behaviour.
| Tier | Threshold (median) |
|---|---|
| Elite | < 24 h |
| High | < 168 h (7 d) |
| Medium | < 720 h (30 d) |
| Low | ≥ 720 h |
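A minimal sketch of the aggregation, assuming the reported field names (`medianHours`, `meanHours`, `p90Hours`) and a nearest-rank p90; the exact percentile method LumenFlow uses is not specified here:

```python
import math
import statistics

def lead_time_stats(hours: list[float]) -> dict[str, float]:
    """Median drives tier classification; mean and p90 ride along."""
    s = sorted(hours)
    p90 = s[math.ceil(0.9 * len(s)) - 1]  # nearest-rank percentile (assumption)
    return {
        "medianHours": statistics.median(s),
        "meanHours": statistics.mean(s),
        "p90Hours": p90,
    }

# One long-tail change skews the mean but leaves the median alone.
stats = lead_time_stats([4, 6, 8, 10, 300])
print(stats)  # medianHours=8 (elite, < 24 h) vs meanHours=65.6
```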
Failed Deployment Recovery Time (FDRT)
Time to recover from a deploy-caused failure.
- Formula: time between paired EMERGENCY-tagged commits (`break_commit.timestamp → fix_commit.timestamp`)
- Unit: hours
- Aggregation: median (DORA canonical)
| Tier | Threshold (median) |
|---|---|
| Elite | < 1 h |
| High | < 24 h |
| Medium | < 168 h (7 d) |
| Low | ≥ 168 h |
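The pairing can be sketched like this. Treating tagged commits as strictly alternating break → fix pairs is a simplification; the real matching logic may be stricter:

```python
def fdrt_hours(commits: list[tuple[float, str]]) -> list[float]:
    """Pair consecutive EMERGENCY-tagged commits (break, fix) in
    chronological order and return recovery durations in hours."""
    tagged = [t for t, msg in commits if "EMERGENCY" in msg]
    # Even-indexed tags are breaks, odd-indexed tags are the paired fixes.
    return [fix - brk for brk, fix in zip(tagged[::2], tagged[1::2])]

commits = [
    (0.0, "feat: new endpoint"),
    (5.0, "EMERGENCY: deploy broke checkout"),
    (5.5, "fix(EMERGENCY): revert checkout change"),
    (20.0, "chore: cleanup"),
]
print(fdrt_hours(commits))  # [0.5] -> median 0.5 h, elite (< 1 h)
```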
Instability
Change Failure Rate
Share of deployments that cause a failure.
- Formula: `failures / total_deployments × 100`
- Unit: percent
| Tier | Threshold |
|---|---|
| Elite | < 15 % |
| High | < 30 % |
| Medium | < 45 % |
| Low | ≥ 45 % |
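As a sketch, the ratio is a one-liner; treating zero deployments as 0 % rather than an error is an assumption:

```python
def change_failure_rate(failures: int, total_deployments: int) -> float:
    """Percent of deployments that caused a failure."""
    if total_deployments == 0:
        return 0.0  # assumption: no deploys yields no signal, not an error
    return failures / total_deployments * 100

print(change_failure_rate(3, 25))  # 12.0 -> elite (< 15 %)
```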
Deployment Rework Rate
The fifth canonical metric, added by the CDF in Oct 2025. It measures reactive churn — reverts and hotfixes — as a share of total deploys.
- Formula: `(revert + hotfix commits) / total_deployments × 100`
- Unit: percent
- Deduplication: a commit matching both the `revert:` and `hotfix` patterns is counted once.
| Tier | Threshold |
|---|---|
| Elite | < 5 % |
| High | < 10 % |
| Medium | < 20 % |
| Low | ≥ 20 % |
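The deduplication rule falls out naturally if each commit is tested once with a single `or`. The exact patterns here (a `revert:` prefix, `hotfix` anywhere in the message) are assumptions based on the bullet above, not LumenFlow's real matchers:

```python
import re

REVERT = re.compile(r"^revert:", re.IGNORECASE)
HOTFIX = re.compile(r"hotfix", re.IGNORECASE)

def rework_rate(messages: list[str], total_deployments: int) -> float:
    """Percent of deploys that are reverts or hotfixes; a commit
    matching both patterns is counted once."""
    rework = sum(1 for m in messages if REVERT.search(m) or HOTFIX.search(m))
    return rework / total_deployments * 100 if total_deployments else 0.0

msgs = [
    "revert: bad config",
    "hotfix: patch nil check",
    "revert: hotfix gone wrong",  # matches both patterns, counted once
    "feat: unrelated work",
]
print(rework_rate(msgs, 20))  # 15.0 -> medium (< 20 %)
```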
Why FDRT replaced MTTR
In 2023, DORA renamed Mean Time to Recovery to Failed Deployment Recovery Time and narrowed the definition. The rationale:
- MTTR conflated deploy failures with unrelated incidents — infra outages, third-party degradations, and platform events were landing in the same bucket as regressions, making the metric useless as a signal about code-change quality.
- Mean masked long-tail incidents. A single multi-day incident would drag the mean into the medium/low tier even if the team consistently recovered from normal deploy failures in minutes. DORA standardised on median as the canonical aggregation.
FDRT is the fix: scope = deploy-caused failures only, aggregation = median.
Why Deployment Rework Rate was added
Change Failure Rate catches failures that trigger alerts or user reports. It doesn’t catch the softer signal of teams that ship, quietly revert, and then re-ship a fix — where nothing failed visibly but a deploy effectively didn’t stick. The CDF added Deployment Rework Rate in Oct 2025 to surface this reactive churn.
Together, CFR + Rework Rate give a more complete instability picture:
- CFR high, Rework low: failures are visible; fix culture is reactive but corrective.
- CFR low, Rework high: silent instability — lots of “oops, revert, redo” that doesn’t register as incidents.
- Both low: genuine stability.
- Both high: systemic problem.
Mean vs median aggregation
DORA canonical guidance:
| Metric | Aggregation |
|---|---|
| Deployment Frequency | mean |
| Lead Time for Changes | median |
| Failed Deployment Recovery Time | median |
| Change Failure Rate | ratio |
| Deployment Rework Rate | ratio |
Median-over-mean matters because software delivery time distributions are heavy-tailed — one long-running PR or one stubborn incident can lift the mean into a tier that misrepresents the normal case. The median is robust to outliers and answers the question DORA research cares about: what does a typical change look like?
LumenFlow follows this exactly:
- Lead time classification is driven by `medianHours`.
- FDRT classification is driven by `medianHours`.
- Both metrics still report mean and p90 alongside the median so dashboards can plot all three series.
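The heavy-tail argument is easy to see with invented numbers — one stubborn change lifts the mean by an order of magnitude while the median stays put:

```python
import statistics

# Five lead times in hours; one long-running PR dominates the tail.
lead_times = [3, 5, 6, 8, 400]

median = statistics.median(lead_times)  # 6    -> elite (< 24 h)
mean = statistics.mean(lead_times)      # 84.4 -> high tier, misrepresenting the typical change
print(median, mean)
```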
Local proxies in LumenFlow OSS
LumenFlow computes DORA metrics from signals that exist in every repository — no CI integration required. The trade-off is that two metrics are proxies rather than ground-truth production signals:
- Change Failure Rate uses the `.lumenflow/skip-gates-audit.ndjson` log as the failure signal. This catches “we shipped despite failing gates” but misses actual production incidents that never touched skip-gates.
- Failed Deployment Recovery Time pairs consecutive commits containing the `EMERGENCY` token or `fix(EMERGENCY)` scope. It catches deploy-failure pairs tagged by the team; it misses recoveries that weren’t explicitly tagged.
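Consuming an NDJSON audit log is one JSON parse per line. The per-entry schema below (a numeric `timestamp` field in epoch seconds) is purely an assumption for illustration — the real skip-gates entry format is owned by LumenFlow:

```python
import json

def count_skip_gate_entries(ndjson_text: str, since_ts: float) -> int:
    """Count audit entries at or after `since_ts`.

    Assumes each line is a JSON object with a numeric `timestamp`
    field (epoch seconds); the real schema is owned by LumenFlow.
    """
    count = 0
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the log
        entry = json.loads(line)
        if entry.get("timestamp", 0) >= since_ts:
            count += 1
    return count

log = '{"timestamp": 100}\n{"timestamp": 50}\n\n{"timestamp": 200}\n'
print(count_skip_gate_entries(log, 90))  # 2
```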
Upgrading these to true production incident signals requires CI-side event integration (webhook from the incident tool, annotated deploy events) and is tracked as a follow-up WU — out of scope for the local proxy implementation.
See also
- Advanced: Flow Metrics & Analytics — CLI usage, tag taxonomy, cloud sync
- dora.dev — Using the four (now five) keys to measure your DevOps performance
- CDF — DORA’s Five Metrics (Oct 2025)
- Workspace spec — `control_plane` configuration for cloud sync