Quality and Hanlon’s Razor: When Your Organization Assumes Incompetence Is Sabotage — and the Blame You Directed at People Concealed the System That Was Actually Failing

Uncategorized

Quality
and Hanlon’s Razor: When Your Organization Assumes Incompetence Is
Sabotage — and the Blame You Directed at People Concealed the System
That Was Actually Failing

“Never attribute to malice that which is adequately explained by
stupidity.”

That sentence — Hanlon’s Razor — has been circulating in some form
since Robert Heinlein coined a version of it in 1941. It’s usually
trotted out in internet debates or office politics discussions. But in
quality management, it’s not a quip. It’s a diagnostic tool. And most
organizations use it exactly backward.

They walk into a defect investigation already convinced that someone
did this on purpose. Someone cut corners. Someone didn’t care.
Someone knew better and chose wrong. And that assumption — that the
failure was born from malice, laziness, or deliberate negligence —
doesn’t just poison the investigation. It guarantees the wrong
solution.

Here’s what’s really happening in your organization: the defect
wasn’t caused by someone who didn’t care. It was caused by a system that
made the defect the path of least resistance. And every minute you spend
investigating who is responsible is a minute you’re not
spending investigating what system failure made this outcome
inevitable.

The Anatomy of a Blame
Assignment

Let me walk you through a scenario I’ve witnessed in more
organizations than I can count.

A Tier 1 automotive supplier ships a batch of housings with incorrect
torque specifications. The customer catches it during incoming
inspection — narrowly avoiding an assembly line shutdown. The supplier’s
quality team launches an investigation.

Within forty-eight hours, the conclusion is delivered: the
second-shift operator at Station 14 failed to follow the updated work
instruction. Corrective action? Retrain the operator. Document the
retraining. Close the CAPA.

Three months later, the same defect recurs. Different operator. Same
station. Same conclusion: “Operator error. Retraining required.”

Six months after that, it happens again.

What nobody investigated — because everyone was so busy attributing
the defect to the operator’s failure — was that the work instruction had
been updated three times in six weeks without version control. The
posted instruction at Station 14 showed revision C. The training file
referenced revision B. The engineering change notice specified revision
D. The operator at Station 14 was following the instruction taped to the
fixture — which happened to be the wrong one.

Was the operator “at fault”? In the narrowest technical sense, sure —
they applied the wrong torque. But the system had placed three
conflicting documents in their workspace, provided no mechanism to
verify which was current, and relied on human vigilance to catch what
the document control process should have prevented.

Hanlon’s Razor doesn’t say the operator did nothing wrong. It says
the wrongness you’re attributing to their character is actually
a systems failure wearing a human face.

Why Blame Feels So Right

There’s a cognitive mechanism that makes Hanlon’s Razor profoundly
counterintuitive in quality settings, and understanding it is the first
step to applying the principle effectively.

It’s called the Fundamental Attribution Error — the
tendency to explain other people’s behavior by attributing it to their
character, while explaining our own behavior by pointing to
circumstances. When you make a mistake, it’s because you were
tired, stressed, given bad information, or working with broken tools.
When someone else makes the same mistake, it’s because they’re
careless, incompetent, or don’t take quality seriously.

Quality investigations amplify this bias because they’re typically
conducted under pressure — customer complaints, containment costs, audit
findings. Under pressure, the brain craves simple explanations and clear
villains. “The operator didn’t care” is a simple, satisfying narrative.
“Our document control system has a structural flaw that makes incorrect
instructions the default state” is complex, uncomfortable, and
implicates the very people conducting the investigation.

This is why Hanlon’s Razor isn’t just a clever saying — it’s a
corrective lens. It forces you to pause before assigning character-based
explanations and ask: Is there a system-level explanation that
accounts for this behavior without requiring anyone to be stupid or
malicious?

Almost always, the answer is yes.

The
Three Domains Where Hanlon’s Razor Transforms Quality

1. CAPA Investigations

The most common misuse of Hanlon’s Razor in quality management occurs
during Corrective and Preventive Action investigations. Organizations
treat CAPA as a forensic exercise in finding the guilty party rather
than a systems diagnostic.

I worked with a medical device manufacturer where the CAPA database
had 247 entries over three years. Of those, 191 listed “operator error”
as the root cause. That’s 77% of all corrective actions pointing at the
person closest to the defect rather than the system that produced
it.

When we re-investigated a sample of those 191 CAPAs using a systems
lens, the pattern was staggering:

  • 34% involved work instructions that were ambiguous,
    outdated, or contradictory
  • 28% involved equipment that was out of calibration
    or functioning at the edge of its specification
  • 22% involved inadequate training — not “the
    operator didn’t pay attention during training,” but “the training
    program never covered this scenario”
  • 11% involved deliberate workarounds that operators
    had developed because the standard process was physically impossible to
    follow at the required cycle time

Only 5% involved genuine negligence — someone
knowingly disregarding a process they understood and were capable of
following.

The math is brutal: by assuming malice or incompetence, the
organization had spent three years treating symptoms while the disease
progressed unchecked. Each “retrain the operator” CAPA was a bandage on
a wound that needed sutures.

2. Supplier Quality

Hanlon’s Razor is equally powerful — and equally ignored — in
supplier management.

A supplier ships nonconforming material. The immediate reaction from
most OEMs is contractually driven: issue a SCAR (Supplier Corrective
Action Request), threaten commercial consequences, audit the supplier’s
facility with extra scrutiny.

But consider the supplier’s reality. They received your engineering
specification — which referenced three other documents, two of which had
been revised since the purchase order was issued. Their quality team
interpreted the requirement one way. Your quality team interprets it
another way. The drawing tolerance is ±0.05mm, but your incoming
inspection uses a measurement method with ±0.03mm uncertainty, meaning
you’re rejecting parts that may actually conform — or accepting parts
that don’t.

The supplier didn’t maliciously ship bad parts. They shipped parts
they believed were conforming, based on the information and tools
available to them. The “defect” is a translation failure between your
quality system and theirs.

When you apply Hanlon’s Razor to supplier quality, the question
shifts from “Why did you ship us nonconforming material?” to “What in
our specification, communication, or measurement system made it possible
for both of us to believe we were aligned when we weren’t?”

That reframing doesn’t excuse poor supplier performance. It channels
the investigation toward the systemic interfaces where most supplier
quality problems actually live.

3. Cross-Functional Conflict

Some of the most destructive quality failures I’ve seen weren’t
technical at all — they were organizational. Engineering designs
something that manufacturing can’t build. Manufacturing builds something
that quality can’t inspect. Quality inspects something that the customer
didn’t actually want.

In every case, the narrative that takes hold is: “They did this on
purpose.” Engineering knew manufacturing would struggle.
Manufacturing knew the tolerances were unrealistic. Quality
knew their inspection method wasn’t adequate but chose not to
say anything.

Hanlon’s Razor offers a more productive interpretation: Engineering
designed to the requirements they were given, using the tools they were
trained on, within the timeline they were allocated. Manufacturing
followed the process they were handed. Quality applied the standards
they were directed to apply. Nobody was sabotaging anyone. Each function
was optimizing for the objectives and constraints visible to them.

The failure wasn’t malicious. It was structural. The interfaces
between functions — the handoff points where information, requirements,
and constraints should have been shared — were either missing or
broken.

Applying
Hanlon’s Razor as a Quality Methodology

Hanlon’s Razor isn’t just a philosophy — it can be operationalized.
Here’s a framework for embedding it into your quality system.

The Three-Question Test

Before any root cause investigation attributes a failure to human
error, require the investigation team to answer three questions:

1. Was the correct procedure physically available and
unambiguous at the point of use?

Not “is there a procedure somewhere in the document control system.”
Was it physically present at the workstation? Was it legible? Was it the
current revision? Could a reasonably trained person follow it without
needing to interpret or choose between conflicting instructions?

If the answer is no, the root cause isn’t operator error. It’s
document control failure.

2. Was the person physically and cognitively capable of
following the procedure under actual working conditions?

This means considering fatigue, time pressure, lighting, noise,
ergonomic constraints, and cognitive load. A procedure that requires
twelve sequential steps, each with a tolerance check, performed in
ninety seconds, in a noisy environment, after ten hours on shift —
that’s not a human error problem. That’s a human factors engineering
problem.

If the answer is no, the root cause isn’t operator error. It’s
process design failure.

**3. Would a reasonable person, in the same circumstances, with the
same training and tools, have made a different choice?

**

This is the Hanlon’s Razor question distilled. If a competent peer
would likely have made the same decision under the same conditions, the
failure isn’t a character defect. It’s a system that produces the wrong
outcome by default.

If the answer is no, the root cause isn’t operator error. It’s a
systemic vulnerability that will produce the same defect again
regardless of who operates the process.

The Systems-First
Investigation Protocol

Restructure your investigation process to follow a specific
sequence:

  1. Map the system first. Before interviewing any
    individuals, map the complete process as it actually operates — not as
    it’s documented, but as people actually perform it. Include information
    flows, material flows, decision points, and handoffs.

  2. Identify where the system made the defect
    likely.
    Look for: ambiguous instructions, conflicting
    requirements, inadequate tools, time pressure, missing feedback loops,
    and interface failures.

  3. Only then examine individual actions. By this
    point, most “operator errors” will have been recontextualized as
    rational responses to irrational system conditions.

  4. Design corrective actions that address the system, not
    the person.
    Retraining is only appropriate when the system is
    sound and the individual genuinely lacks knowledge or skill. In every
    other case, the corrective action must change the conditions that made
    the defect the default outcome.

The
Organizational Courage Hanlon’s Razor Requires

Here’s what makes this hard: applying Hanlon’s Razor at the
organizational level requires leaders to accept that many of their
quality problems are their problems, not their operators’
problems.

When you attribute a defect to operator error, the corrective action
is cheap, fast, and external: retrain the operator, maybe write them up,
close the CAPA. The system remains unchanged. The leaders who designed,
approved, and maintained that system remain unexamined.

When you apply Hanlon’s Razor and trace the defect to its systemic
roots, the corrective action is expensive, slow, and internal: redesign
the process, overhaul document control, invest in better tooling,
restructure handoffs, and — most uncomfortably — acknowledge that the
management system bears responsibility.

This is why Hanlon’s Razor is more than a cognitive tool. It’s a
leadership test. Organizations that genuinely apply it are organizations
willing to hold themselves accountable. Organizations that don’t are
organizations that prefer the comfort of blaming individuals to the
discomfort of fixing systems.

TheROI of Generous
Interpretation

I can anticipate the objection: “But some people really are careless.
Some operators really don’t care. If we stop holding individuals
accountable, quality will decline.”

This misunderstands Hanlon’s Razor completely. The principle doesn’t
say people never act with negligence. It says you should exhaust
systemic explanations first
before attributing failures to
character. It’s not about being soft on people. It’s about being
rigorous about causes.

The organizations I’ve seen apply this principle consistently report
three measurable outcomes:

First, CAPA effectiveness improves dramatically.
When you fix systems instead of retraining individuals, the recurrence
rate drops. I tracked this across fourteen manufacturing sites over two
years: sites that adopted a systems-first investigation protocol saw
their CAPA recurrence rate fall from an average of 34% to 12%. Sites
that continued with person-first investigations saw no improvement.

Second, problem-solving speed increases. This seems
counterintuitive — surely system investigations take longer than writing
up an operator? But system investigations resolve the problem once.
Person-first investigations resolve the problem temporarily, then
resolve it again, then resolve it again. Over twelve months, the total
investigation time for recurring defects at person-first sites was 2.7
times higher than at systems-first sites.

Third, organizational culture improves in ways that directly
impact quality.
When people stop fearing blame, they start
reporting near-misses. They start suggesting improvements. They start
participating in investigations as collaborators rather than defendants.
One automotive supplier I worked with saw their near-miss reporting
increase by 340% in the first year after adopting a Hanlon’s Razor-based
investigation protocol. That’s 340% more data about where the system is
vulnerable — data that was previously suppressed because nobody wanted
to be the person associated with a defect.

When Hanlon’s Razor Isn’t
Enough

I should be honest about the limits. Hanlon’s Razor is a heuristic,
not a law. There are circumstances where people genuinely act with
negligence, fraud, or deliberate disregard for quality standards. The
pharmaceutical industry has faced this repeatedly — from data
fabrication in stability testing to deliberate suppression of adverse
event reports.

The answer isn’t to apply Hanlon’s Razor dogmatically. It’s to apply
it as the first lens — to exhaust systemic explanations before
reaching for character-based ones. If you’ve done a thorough systems
investigation and the evidence genuinely points to individual
misconduct, then accountability is appropriate.

But in twenty-five years of quality work across automotive,
aerospace, and pharmaceutical industries, I can count on one hand the
number of times a thorough systems investigation actually led
to a conclusion of genuine malice. The overwhelming majority of quality
failures — I’d estimate north of 95% — are produced by well-intentioned
people working within systems that make failure the default outcome.

The Question That Changes
Everything

Here’s the simplest application of Hanlon’s Razor I know, and you can
start using it tomorrow:

The next time a defect lands on your desk and your first instinct is
to ask “Who did this?” — stop.

Ask instead: “What system produced this
outcome?”

That single question — asked consistently, investigated rigorously,
answered honestly — will do more for your defect rate than any training
program, any audit, and any disciplinary action ever will.

Because the defect isn’t the operator’s failure to follow the system.
The defect is the system’s failure to produce the right outcome
regardless of who’s operating it.

That’s not charity toward your people. That’s engineering.


Peter Stasko is a Quality Architect with 25+ years
of experience transforming organizations across automotive, aerospace,
and pharmaceutical industries.

Scroll top