Engineering metric

Change Failure Rate. Whether you're shipping fast — or just breaking things faster.

Change Failure Rate is the percentage of your deployments that cause a production incident, rollback, or hotfix. It's the DORA metric that keeps deployment frequency honest: shipping twenty times a day is only impressive if those twenty deploys hold. A low change failure rate is what makes everything else possible — it's the reason an elite team can deploy in the middle of the day, without fear, and trust that customers won't feel it. Without it, frequent deployment isn't velocity; it's just breaking production more often.

What it is

The share of deployments that result in a production failure — an incident, a rollback, or an emergency hotfix. Lower is better. It measures the quality and safety of your shipping, the counterweight to how often you ship.

Measurement period

Rolling.

Tracked as a rolling percentage across recent deploys. Elite teams sit at 0–15% — most ships are clean, and the rare failure is small and quickly reverted.
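A rolling percentage can be sketched as follows. This is a minimal illustration, not a prescribed method: the `Deploy` record, the `rolling_cfr` helper, and the 30-day window are all assumptions made for the example, since the window length is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deploy:
    # Hypothetical record shape; adapt to whatever your deploy log stores.
    deployed_at: datetime
    failed: bool  # True if the deploy caused an incident, rollback, or hotfix


def rolling_cfr(deploys: list[Deploy], as_of: datetime,
                window_days: int = 30) -> Optional[float]:
    """Change failure rate (%) over deploys in the window ending at as_of."""
    start = as_of - timedelta(days=window_days)
    recent = [d for d in deploys if start < d.deployed_at <= as_of]
    if not recent:
        return None  # no deploys in the window, so the rate is undefined
    failures = sum(d.failed for d in recent)
    return 100 * failures / len(recent)


now = datetime(2024, 6, 30)
history = [
    Deploy(datetime(2024, 6, 29), failed=False),
    Deploy(datetime(2024, 6, 20), failed=True),
    Deploy(datetime(2024, 6, 10), failed=False),
    Deploy(datetime(2024, 6, 5), failed=False),
    Deploy(datetime(2024, 4, 1), failed=True),  # outside the 30-day window
]
print(rolling_cfr(history, as_of=now))  # 1 failure in 4 recent deploys -> 25.0
```

Returning `None` for an empty window keeps "no deploys" distinct from "no failures", which would otherwise both read as 0%.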

Formula

Change Failure Rate = (Deploys causing a failure ÷ Total deploys) × 100

Lower is better. Read it next to deployment frequency — speed and stability are one story.
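As a small sketch, the formula translates directly into code. The function name `change_failure_rate` and its input validation are illustrative choices, not part of any standard tooling.

```python
def change_failure_rate(failed_deploys: int, total_deploys: int) -> float:
    """Return the change failure rate as a percentage between 0 and 100."""
    if total_deploys <= 0:
        raise ValueError("total_deploys must be positive")
    if not 0 <= failed_deploys <= total_deploys:
        raise ValueError("failed_deploys must be between 0 and total_deploys")
    # Multiply before dividing so whole-number results stay exact.
    return 100 * failed_deploys / total_deploys


print(change_failure_rate(3, 60))  # 3 failed deploys out of 60 -> 5.0
```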

When to review

Weekly.

Watch the trend weekly. A climbing failure rate means you're trading stability for speed — the warning sign that velocity is becoming chaos.
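One way to make "climbing" concrete is to check whether the weekly rate has risen several weeks in a row. The sketch below assumes you already have one failure-rate value per week, oldest first. The `is_climbing` helper and the three-week threshold are hypothetical choices for illustration.

```python
def is_climbing(weekly_rates: list[float], weeks: int = 3) -> bool:
    """True if the rate rose in each of the last `weeks` week-over-week steps."""
    if len(weekly_rates) < weeks + 1:
        return False  # not enough history to judge a trend
    recent = weekly_rates[-(weeks + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


print(is_climbing([10.0, 9.0, 12.0, 14.0, 18.0]))   # three straight rises -> True
print(is_climbing([10.0, 12.0, 11.0, 13.0, 15.0]))  # dipped along the way -> False
```

Requiring consecutive rises is deliberately strict. It filters out a single noisy week, at the cost of reacting a little later to a real trend.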

Why it matters

A low failure rate is what lets you ship without fear.

The reason a healthy engineering team can deploy whenever it wants — mid-day, on a Tuesday, no ceremony — and never really hurt customers is that its change failure rate is low. That's the foundation continuous deployment is built on. When the overwhelming majority of your deploys are clean, shipping stops being a high-stakes event and becomes routine. When a large fraction of them break things, every deploy is a gamble, the team gets cautious, releases batch up, and you lose the velocity that frequent shipping was supposed to give you. Speed and stability aren't opposites — stability is what enables sustainable speed.

That's why you can never read deployment frequency on its own. A team shipping twenty times a day with a 5% failure rate is elite. A team shipping twenty times a day with a 40% failure rate is breaking production eight times a day and calling it velocity. Same frequency, opposite businesses. Change failure rate is the number that tells the two apart — and it's the one that protects your uptime, since most downtime traces back to a change someone shipped.
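The arithmetic behind that comparison is simple enough to sketch. The helper name `failures_per_day` is made up for this example, and it treats the failure rate as a steady average.

```python
def failures_per_day(deploys_per_day: int, failure_rate_percent: float) -> float:
    """Expected number of failed deploys per day at a given failure rate."""
    return deploys_per_day * failure_rate_percent / 100


print(failures_per_day(20, 5))   # -> 1.0 failed deploy a day
print(failures_per_day(20, 40))  # -> 8.0 failed deploys a day
```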


Benchmarks

The DORA bands — and yes, lower is better here.

Unlike most metrics, a smaller number is the good one, so these bands run best at the top and worst at the bottom. They're adapted from the DORA performance tiers for the share of deploys that cause a failure. Read this number next to deployment frequency: the goal is frequent and low-failure.

Elite: 0–15%

Most deploys are clean, and the rare failure is small and quickly reverted. This is the band that makes mid-day, on-demand deployment safe — the foundation of continuous deployment. Shipping is routine, not a gamble.

Healthy: 15–30%

A solid, workable rate for most teams. Failures happen but aren't the norm, and the team can still ship with reasonable confidence. Worth pushing lower as test coverage and rollback speed improve, but a respectable place to operate.

Watch: 30–45%

Between roughly a third and nearly half of your deploys are causing problems. The team is trading stability for speed, and it's starting to show. This usually means thin test coverage or batches that are too large. Slow down, shrink the changes, and invest in confidence before the chaos compounds.

Unstable: over 45%

Close to half of your deploys, or more, break production. This isn't velocity, it's instability: the team is firefighting more than building, uptime is at risk, and frequent deployment is actively hurting customers. Stop, fix the pipeline and tests, and rebuild shipping confidence before adding speed.
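The four bands can be expressed as a small lookup. This is a sketch under one assumption: the bands share their boundary values, so the code treats each upper bound as belonging to the better band (exactly 15% counts as Elite, exactly 45% as Watch). The function name `cfr_band` is illustrative.

```python
def cfr_band(rate_percent: float) -> str:
    """Map a change failure rate (%) to its benchmark band."""
    if not 0 <= rate_percent <= 100:
        raise ValueError("rate_percent must be between 0 and 100")
    # Assumption: a rate sitting exactly on a boundary falls in the better band.
    if rate_percent <= 15:
        return "Elite"
    if rate_percent <= 30:
        return "Healthy"
    if rate_percent <= 45:
        return "Watch"
    return "Unstable"


print(cfr_band(5))   # -> Elite
print(cfr_band(40))  # -> Watch
```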

When the failure rate is climbing

Three plays that actually move it.

A high failure rate almost always traces to batch size, test coverage, or recovery speed. Each play attacks one of them, and the first move, counterintuitively, is to slow down in order to go faster.

— 01 Shrink the change

Smaller deploys fail less, and fail smaller.

The single biggest driver of failure rate is batch size. A large, multi-feature release has more surface area to break and is harder to reason about, so it fails more often and the failures are harder to isolate. Ship smaller changes more frequently, and each deploy is easier to test, safer to release, and trivial to roll back. Counterintuitively, deploying more often — in smaller pieces — usually lowers the failure rate.
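A toy model shows why batch size matters so much. Assume, purely for illustration, that every individual change has the same small chance of breaking production and that changes fail independently. Real changes are neither uniform nor independent, so treat the numbers as intuition rather than prediction. The function name and the 2% per-change figure are hypothetical.

```python
def batch_failure_probability(per_change_failure: float, changes_in_batch: int) -> float:
    """Chance that a deploy fails when it bundles several independent changes."""
    if not 0 <= per_change_failure <= 1:
        raise ValueError("per_change_failure must be between 0 and 1")
    if changes_in_batch < 1:
        raise ValueError("changes_in_batch must be at least 1")
    # The deploy is clean only if every change in the batch is clean.
    return 1 - (1 - per_change_failure) ** changes_in_batch


# Print the failure chance (%) for growing batch sizes at a 2% per-change risk.
for size in (1, 5, 10, 20):
    print(size, round(100 * batch_failure_probability(0.02, size), 1))
```

Under these assumptions, a ten-change release fails roughly 18% of the time while a single-change deploy fails 2% of the time, which is the gap between the top two bands and a comfortably Elite rate.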

— 02 Build the test and CI safety net

Catch failures before customers do.

A climbing failure rate is often a test-coverage problem — bugs reaching production that automated tests should have caught. Investing in a solid test suite and continuous integration moves failure detection earlier, from "customer reports an incident" to "the pipeline blocks a bad deploy." This is the unglamorous engineering work that converts a scary, failure-prone release process into a confident one, and it's where the durable gains live.

— 03 Make recovery instant

A failure you revert in minutes barely hurts.

Some failures will always slip through, so the other lever is shrinking their impact. Fast, reliable rollback and a low time to restore turn a failed deploy from an outage into a blip — the change goes out, something looks wrong, it's reverted in minutes, and customers barely notice. When recovery is instant, the cost of a failure drops, and the team can keep shipping confidently while it works the rate down.

Common mistakes operators make with Change Failure Rate.

Reading deployment frequency without it.

Deployment frequency and change failure rate are one story, not two. A high deploy count with a high failure rate isn't velocity — it's breaking production faster. Always read them together: frequent and low-failure is elite; frequent and high-failure is chaos. Celebrating the deploy count while ignoring the failure rate is how teams convince themselves instability is speed.

Pushing for more speed while the rate climbs.

When the failure rate is rising, the answer is usually to slow down and fix the foundation — smaller batches, better tests, faster rollback — not to push for more deploys. Adding speed to an unstable pipeline just multiplies the failures. Counterintuitively, the path back to sustainable velocity runs through a temporary slowdown to rebuild shipping confidence.

Blaming individuals instead of the system.

A high change failure rate is a process and tooling problem — thin tests, large batches, slow rollback — not a sign that engineers are careless. Treating failures as individual mistakes creates a fear culture that makes the rate worse, because frightened engineers batch up changes and ship rarely. Fix the system that lets failures through, and the rate falls without anyone being blamed.

Ignoring its link to uptime.

Most downtime traces to a change someone shipped, so change failure rate is one of the biggest levers on uptime — and on the SLA promise behind it. A team chasing more availability nines through infrastructure while shipping unstable releases is solving the wrong half of the problem. Lower the failure rate and you protect uptime at its most common source.

Read alongside

Failure rate and frequency are one story.

Change failure rate only means something next to how often you ship. Frequent and low-failure is elite; frequent and high-failure is chaos. Read the two together, always — they're the velocity-and-stability pair at the heart of DORA.

Deployment Frequency guide →

How Upbeat helps

Speed and stability, read together.

Deployment frequency and change failure rate only tell the truth side by side — and most dashboards show one without the other. Upbeat puts both on the weekly leadership scorecard, so a rising failure rate is visible as the warning it is before the team is firefighting instead of building, and "we ship fast" never quietly becomes "we break things fast."
