Quality and Goodhart's Law: When Your Organization's Metrics Become the Target and Excellence Becomes the Casualty

When a measure becomes a target, it ceases to be a good
measure, and the KPIs everyone chased end up describing a quality
nobody actually delivered.


The Metric That Ate the Mission

It started with the best of intentions. The VP of Quality stood
before the leadership team and announced a new focus: first-pass yield.
“From this quarter forward,” she declared, “every plant will track FPY
daily. We’ll post it on the shop floor. We’ll review it in every
operations meeting. And we’ll tie management bonuses to hitting
98.5%.”

Six months later, first-pass yield hit 99.1%. The charts looked
beautiful. The trend lines pointed upward. The PowerPoint decks glowed
green.

But something else was happening that the charts didn’t show.

Scrap costs had risen 23%. Customer complaints had doubled. Three key
accounts had sent formal escalation letters. And the incoming inspection
team at the largest customer had found a pattern: parts that should have
been rejected were being reworked on the line, reclassified, and counted
as “first-pass good.”

The organization hadn’t improved its quality. It had improved its
ability to report quality.

This is Goodhart’s Law in action, and it may be the most dangerous
force operating inside your quality system right now.

What Is Goodhart’s Law?

Named after British economist Charles Goodhart, the principle was
originally formulated in the context of monetary policy: “Any observed
statistical regularity will tend to collapse once pressure is placed
upon it for control purposes.”

The more popular version, articulated by anthropologist Marilyn
Strathern, cuts closer to the bone: “When a measure becomes a
target, it ceases to be a good measure.”

The logic is simple and devastating. A metric is useful because it
approximates something real — quality, performance, efficiency,
customer satisfaction. But the moment you attach consequences to that
metric — bonuses, promotions, reputations, penalties — the people being
measured begin optimizing for the metric itself, not for the underlying
reality it was supposed to represent.

The measure and the thing measured diverge. And the harder you push
on the measure, the faster they separate.

Why Quality Systems Are Uniquely Vulnerable

Every management discipline deals with Goodhart’s Law to some degree.
But quality management is especially susceptible, for three structural
reasons.

First, quality metrics are proxies for complex realities.
“Defect rate” sounds straightforward, but what counts as a defect? Who
makes that call? Under what conditions? At what point in the process?
The moment defect rate becomes a target, every one of those judgment
calls gets subtly influenced by the target.

Second, quality outcomes have powerful stakeholders.
When production bonuses depend on throughput and quality bonuses depend
on defect rates, you’ve created two opposing forces. The system will
find a way to satisfy both — usually by distorting the one that’s easier
to game.

Third, quality measurement happens in organizations, not laboratories.
The people measuring quality are not disinterested observers. They’re
employees with careers, managers with budgets, and teams with cultures.
The measurement system is embedded in the social system, and the social
system will protect itself.

The Taxonomy of Metric Corruption

Goodhart’s Law doesn’t operate in just one way. In quality
organizations, it manifests through at least four distinct mechanisms.
Understanding which one you’re dealing with is the first step toward
resistance.

1. Classification Drift

This is the most common and most insidious form. When the pressure to
hit a target increases, the definition of what counts begins to
shift — not through any formal policy change, but through thousands of
small, individual judgment calls.

An inspector who would have rejected a part six months ago now sends
it back for “minor rework” that doesn’t count as a defect. A supervisor
who would have logged a near-miss as a quality event now files it as a
“process observation.” An engineer who would have classified a failure
mode as critical now argues it’s moderate.

Nobody is lying. Nobody is breaking rules. The boundary between
acceptable and unacceptable simply migrates, millimeter by millimeter,
until the metric says 99% but the customer is receiving 94% quality.

I watched this happen at an automotive supplier where the target was
“zero customer line stops.” The metric was achieved for three
consecutive quarters. The celebration was real. What the metric didn’t
capture was that the customer’s receiving inspectors had effectively
been retrained by the supplier’s quality team to accept parts they would
have previously rejected. The line stops went to zero because the
acceptance criteria went down, not because the parts got better.

2. Strategic Narrowing

When you measure one dimension of quality, organizations optimize
that dimension at the expense of all others.

Set a target for on-time delivery, and inventory accuracy
mysteriously improves while product testing thoroughness quietly
declines. Reward defect reduction, and cycle times stretch as operators
slow down to be more careful — but only on the measured steps, not on
the unmeasured ones. Focus on customer complaint response time, and your
team becomes excellent at closing tickets and terrible at solving the
underlying problems.

The classic example: a pharmaceutical company that set aggressive
targets for batch release time. The metric improved dramatically. What
degraded was the depth of deviation investigation. Batch records were
reviewed faster, but the reviewers stopped asking the uncomfortable
questions that would have triggered costly investigations. The metric
celebrated speed. The hidden cost was rigor.

3. Temporal Manipulation

When you measure quality over a fixed period — a shift, a week, a
month — the boundary between periods becomes a zone of intense strategic
behavior.

End-of-month spikes in “good” parts. Defects discovered on Friday
afternoon that somehow become Monday morning’s problem, attributed to
the new reporting period. Preventive maintenance deferred because taking
a machine down would hurt this month’s OEE numbers, even though the
deferral virtually guarantees a breakdown next month.

I saw a plant where the monthly quality report always showed
improvement in the last three days of the month. Always. For eighteen
consecutive months. The pattern was so consistent you could set your
watch by it. When I asked the quality manager about it, he smiled and
said, “The team knows what the target is. They just… find a way.”

He wasn’t wrong. They found a way. The question was whether what they
found was quality.
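
A pattern like that is easy to test for once you stop looking at the monthly total. The following is a minimal sketch, assuming you have one defect rate per production day; the function name `month_end_gap` and the three-day window are illustrative choices, not part of any standard.

```python
import calendar
from datetime import date
from statistics import mean

def month_end_gap(daily_rates, window=3):
    """Mean defect rate in the last `window` days of each month minus the mean for all other days.

    daily_rates: dict mapping datetime.date -> defect rate as a fraction (assumed input shape).
    A clearly negative result means the numbers look better right before the period closes.
    """
    tail, body = [], []
    for day, rate in daily_rates.items():
        last_day = calendar.monthrange(day.year, day.month)[1]
        if day.day > last_day - window:
            tail.append(rate)   # falls inside the month-end window
        else:
            body.append(rate)   # any earlier day of the month
    if not tail or not body:
        raise ValueError("need data from both month-end days and earlier days")
    return mean(tail) - mean(body)
```

A gap near zero is what an unmanipulated process should produce over many months. A gap that stays negative for a year and a half is not proof of gaming, but it is a question worth asking on the shop floor.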

4. Metric Migration

Perhaps the most sophisticated form: when organizations achieve a
target by moving the measurement point rather than changing the
underlying process.

A supplier achieves “zero defects at customer incoming inspection” by
adding an expensive 100% sort at their own shipping dock — without
addressing the process variation that’s causing the defects in the first
place. The metric improves. The cost of quality explodes. The root cause
festers.

A plant achieves its target for “first article inspection pass rate”
by conducting pre-inspections before the official inspection and
reworking any failures before they’re logged. The official metric is
pristine. The unofficial rework rate is staggering.

The metric looks perfect because the organization has built a
parallel quality system — one that does the real work and one that
reports the numbers.
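
The distance between the two systems can be made visible with very little arithmetic. This sketch assumes each unit record carries two flags, `passed_final` and `reworked`; both field names are hypothetical, and capturing unlogged rework honestly is the hard part that no code solves.

```python
from dataclasses import dataclass

@dataclass
class Unit:
    passed_final: bool  # unit was eventually accepted
    reworked: bool      # unit needed any rework, logged officially or not

def reported_fpy(units):
    """Yield as it appears when reworked units are counted as first-pass good."""
    return sum(u.passed_final for u in units) / len(units)

def true_fpy(units):
    """First-pass yield proper: accepted with no rework at all."""
    return sum(u.passed_final and not u.reworked for u in units) / len(units)

def hidden_rework_gap(units):
    """How much of the reported yield is really rework in disguise."""
    return reported_fpy(units) - true_fpy(units)

# Hypothetical lot: 90 clean passes, 9 passes after rework, 1 scrapped.
lot = [Unit(True, False)] * 90 + [Unit(True, True)] * 9 + [Unit(False, True)]
print(f"reported {reported_fpy(lot):.2f}, true {true_fpy(lot):.2f}")
```

For the hypothetical lot, the reported figure is 0.99 and the true figure is 0.90. Tracking the gap itself, rather than either number alone, is one way to notice when the parallel system is growing.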

The Cascade Effect: How One Metric Corrupts Many

Goodhart’s Law rarely operates in isolation. When one metric becomes
a target, it sets off a cascade of distortions through the entire
measurement ecosystem.

Consider a typical scenario: the organization sets a target for OEE
(Overall Equipment Effectiveness). Suddenly, every team has an incentive
to maximize OEE. But OEE is a composite of availability, performance,
and quality rate — and maximizing all three simultaneously is often
impossible.

So teams begin trading. They extend runs to avoid changeovers
(boosting availability) but increase inventory and reduce
responsiveness. They speed up machines (boosting performance) but
increase wear and defect rates. They narrow inspection criteria
(boosting quality rate) but ship more marginal product.

Each individual trade-off makes sense from the perspective of the
person making it. But the aggregate effect is an organization that looks
incredibly efficient on paper while becoming progressively less capable
in reality.
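
Because OEE is the product of its three factors, very different trades can land on the same headline number. A minimal sketch, with both example lines and their figures invented for illustration:

```python
def oee(availability, performance, quality_rate):
    """OEE as the product of its three factors, each a fraction between 0 and 1."""
    factors = {
        "availability": availability,
        "performance": performance,
        "quality_rate": quality_rate,
    }
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return availability * performance * quality_rate

# Hypothetical lines: one protects quality, the other protects uptime.
careful_line = oee(availability=0.90, performance=0.95, quality_rate=0.99)
pushed_line = oee(availability=0.99, performance=0.95, quality_rate=0.90)
print(f"{careful_line:.3f} {pushed_line:.3f}")
```

Both lines report an OEE of about 0.846, yet the second one produces ten times the share of bad parts. Reviewing the three factors separately, instead of only the composite, is what exposes the trade.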

The deeper problem is that the cascade is invisible to the
people inside it. When your bonus depends on OEE, you don’t experience
your decisions as compromises. You experience them as optimization.
You’re doing your job well. The metric says so.

The Executive’s Blind Spot

There’s a cruel irony at the heart of Goodhart’s Law: the people most
vulnerable to it are the ones most confident they’ve accounted for
it.

Executives understand Goodhart’s Law intellectually. They’ve read
about it. They’ve nodded during presentations about it. They’ve even
warned their teams about it. And then they set aggressive targets
anyway, convinced that this time the target is the right one,
the team is the right team, and the culture is strong enough to resist
gaming.

It’s not.

The reason it’s not is that Goodhart’s Law isn’t a failure of
character or competence. It’s a structural feature of measurement
systems. If you attach consequences to a metric, the metric will be
gamed. Not because people are bad, but because people are rational
actors responding to incentives, which is exactly what the measurement
system is supposed to achieve.

You can’t prevent Goodhart’s Law through willpower. You can’t prevent
it through culture. You can’t prevent it through values statements or
ethical training or “tone at the top.” You can only contain it, and
only through system design.

Designing Systems That Resist Goodhart’s Law

The solution isn’t to abandon measurement. It’s to build measurement
systems that are resilient to the pressures that corrupt them.

Measure Multiple Things Simultaneously

The single most effective defense against Goodhart’s Law is
redundancy in measurement. When you track one metric, it’s easy to game.
When you track five related but distinct metrics, gaming one usually
shows up as degradation in another.

Instead of just tracking defect rate, track defect rate and
scrap cost and customer complaint rate and rework
hours and warranty claims. If defect rate improves but scrap
cost increases, you’ve caught the game. If customer complaints drop but
warranty claims rise, the signal is clear.

The key is that the metrics must be independent enough to
catch different distortions but related enough to tell a
coherent story about the same underlying reality.
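
The cross-check described above can be sketched in a few lines. This assumes every metric is expressed so that lower is better, and that you compare one reporting period against the previous one; the function name, the metric names, and the 5% noise tolerance are all illustrative assumptions.

```python
def diverging_companions(previous, current, target, tolerance=0.0):
    """List companion metrics that got worse while the target metric got better.

    previous, current: dicts of metric name -> value, LOWER is better for every metric.
    target: name of the metric that carries the incentive.
    tolerance: relative worsening to ignore as noise (0.05 means 5%).
    """
    if current[target] >= previous[target]:
        return []  # target did not improve, so there is nothing to cross-check
    flagged = []
    for name, before in previous.items():
        if name == target:
            continue
        if current[name] > before * (1 + tolerance):
            flagged.append(name)  # worsened beyond the noise band
    return flagged

# Hypothetical quarter: defect rate halves while scrap cost climbs 23%.
last_q = {"defect_rate": 0.02, "scrap_cost": 100.0, "rework_hours": 40.0, "warranty_claims": 5}
this_q = {"defect_rate": 0.01, "scrap_cost": 123.0, "rework_hours": 40.0, "warranty_claims": 5}
print(diverging_companions(last_q, this_q, target="defect_rate", tolerance=0.05))
```

A flagged companion is a prompt to go look, not a verdict. Real improvement can raise a companion metric temporarily, which is why the output is a list of questions rather than a score.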

Separate Measurement From Consequence

The closer the person measuring is to the person being measured, the
stronger the pressure to distort. Wherever possible, create separation
between measurement and consequence.

Use independent auditors for critical quality checks. Rotate
inspection personnel. Implement blind measurement systems where the
inspector doesn’t know which shift, which team, or which operator
produced the part. The more distance you put between the measurement and
the stakes, the more honest the measurement will be.

This doesn’t eliminate Goodhart’s Law — the organization will still
find ways to optimize around the measurement system. But it raises the
cost and complexity of gaming, which is often enough to keep the
distortions manageable.
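
Blind measurement is partly a record-keeping problem. One possible sketch, assuming part records are simple dictionaries with hypothetical fields such as `serial`, `shift`, `team`, and `operator`: the inspector works from opaque labels, and the key linking labels back to parts is held by someone outside the line.

```python
import secrets

def blind_labels(parts):
    """Split part records into what the inspector sees and a sealed key.

    parts: list of dicts describing each part (field names are assumptions).
    Returns (queue, key): queue holds only opaque labels in shuffled order,
    key maps each label back to the original record.
    """
    key = {}
    for part in parts:
        label = secrets.token_hex(4)       # 8 hex characters, carries no meaning
        while label in key:                # regenerate on the rare collision
            label = secrets.token_hex(4)
        key[label] = part
    queue = list(key)
    secrets.SystemRandom().shuffle(queue)  # hide production order as well
    return queue, key
```

The inspector records results against the labels in `queue`, and results are joined back to shifts and operators only after inspection is complete. This does nothing about parts that are physically recognizable, so it is a complement to rotation and independent audits, not a replacement.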

Measure Process, Not Just Outcome

Outcome metrics — defect rate, yield, customer returns — are the most
vulnerable to Goodhart’s Law because they’re the furthest from the
process and the easiest to manipulate through classification and
selection.

Process metrics — control chart adherence, calibration compliance,
standardized work conformance, FMEA review completion — are harder to
game because they measure behaviors rather than results. You
can’t easily fake whether a control chart is being updated daily. You
can’t easily game whether operators are following standardized work.

The best quality measurement systems balance outcome metrics (which
tell you whether you’re winning) with process metrics (which tell you
whether you’re playing the right game).

Change the Metrics Periodically

This is the most counterintuitive recommendation, and the one that
meets the most resistance. But it’s also one of the most effective.

When you keep the same metrics for years, the organization becomes
expert at optimizing for those metrics — and the gap between metric and
reality widens. By rotating metrics periodically — not randomly, but
thoughtfully — you force the organization to continually re-engage with
the underlying reality rather than settling into a comfortable
optimization of a familiar target.

This doesn’t mean changing your quality objectives. It means changing
the proxies you use to track them. If you’ve been measuring
defect rate for three years, switch to measuring cost of poor quality
for a year. You’ll learn things about your quality system that the
defect rate metric had been hiding.

Audit the Auditors

Finally, every measurement system needs a meta-measurement system.
Who’s checking whether the metrics themselves are still meaningful?
Who’s looking for the gaps between what the dashboard says and what the
shop floor actually looks like?

This is where the Gemba walk becomes essential. Not as a management
ritual, but as a reality check against the measurement system. When the
dashboard says quality is improving but the shop floor feels worse,
trust the shop floor. The dashboard is measuring the dashboard. The shop
floor is measuring reality.

The Deeper Lesson

Goodhart’s Law isn’t really about metrics. It’s about the
relationship between measurement and meaning.

Every quality system relies on a fundamental act of faith: the belief
that what we can measure tells us something true about what we can’t.
That defect rates reflect real defects. That yield numbers reflect real
quality. That customer satisfaction scores reflect real customer
experience.

This faith is necessary. We can’t manage what we can’t measure, and
we can’t improve what we can’t track. Measurement is the foundation of
quality management.

But the faith must be earned. It must be tested continuously. And it
must be balanced with a deep humility about the gap between the map and
the territory.

The organizations that master quality aren’t the ones with the best
metrics. They’re the ones that never stop questioning whether their
metrics are still telling them the truth.

The measure is not the thing.

The target is not the mission.

And the dashboard is not the reality.

The moment you forget that, Goodhart’s Law reminds you — usually
through a customer complaint, a warranty claim, or a recall that the
metrics never saw coming.


Peter Stasko is a Quality Architect with 25+ years of experience
transforming organizations across automotive, aerospace, and
pharmaceutical industries.
