The Promise Every Organization Makes and Breaks
Every quality management system in the world rests on a simple
promise: when something goes wrong, you fix it. And then — here’s the
part almost everyone forgets — you make sure it never happens again.
That’s CAPA. Corrective Action to address what already failed.
Preventive Action to stop what hasn’t failed yet from ever getting the
chance.
ISO 9001 demands it, with prevention folded into risk-based thinking
since the 2015 revision. The FDA requires it in medical device
manufacturing under 21 CFR 820.100. AS9100 builds it into aerospace.
IATF 16949 embeds it in automotive. Every auditor who has ever walked a
factory floor has asked the same question: “Show me your CAPA system.”
And every quality manager has opened the same binder, pointed to the
same forms, and made the same claim: “Yes, we have one.”
But having a CAPA system and having a functioning CAPA system are two
entirely different things. The difference is the gap between an
organization that learns from its failures and an organization that
merely documents them.
What CAPA Actually Means (Before We Discuss How It Goes Wrong)
Let’s be precise.
Corrective Action is reactive. Something failed — a
nonconformance, a customer complaint, an audit finding, a defect that
escaped to the field. Corrective action asks: what happened, why did it
happen, and what will we do to ensure this specific failure doesn’t
recur? It’s the fire investigation after the fire.
Preventive Action is proactive. Nothing has failed
yet, but you’ve identified a risk — through trend analysis, through
near-miss reporting, through process monitoring, through someone in
production who said “this feels like it’s going to be a problem.”
Preventive action asks: what could go wrong, and what will we do to stop
it before it does? It’s the fire inspection before the fire.
Together, they form a closed loop: detect, investigate, act, verify,
and learn. The learning is the whole point. Without learning, CAPA is
just a documentation exercise — a way of appearing to address problems
while actually just recording them in a more formal format.
The Five Ways CAPA Systems Fail
After twenty-five years in quality management across automotive,
electronics, medical devices, and heavy industry, I’ve watched CAPA
systems fail in the same five patterns, over and over, in organization
after organization. The patterns are so consistent that I can walk into
a factory I’ve never seen before, ask to review ten closed CAPA records,
and tell you within thirty minutes which of these failure modes their
system suffers from.
Failure Mode 1: The Root Cause Shortcut
The most common failure. A nonconformance occurs. The CAPA form gets
opened. The “root cause” field gets filled in with the first plausible
explanation that comes to mind — usually “operator error” or “procedural
deviation.” No 5 Whys analysis. No Ishikawa diagram. No actual
investigation. Just the fastest answer that gets the form moving toward
closure.
The problem isn’t that these answers are always wrong. Sometimes an
operator does make an error. Sometimes a procedure isn’t followed. The
problem is that “operator error” is almost never a root cause — it’s a
symptom. Why did the operator make the error? Were the instructions
unclear? Was the training inadequate? Was the process designed in a way
that made the error easy to make and hard to catch? Was the operator
fatigued from a schedule that production, not quality, dictated?
When the root cause field says “operator error” and the corrective
action field says “retrained the operator,” you have a CAPA that will
fail again. You haven’t corrected anything. You’ve documented your
failure to investigate, and you’ve dressed it up in the language of
quality.
I’ve seen CAPA records where the same “operator error” root cause
appears for the same process failure in the same workstation six times
in eighteen months. Each time, the corrective action was “retrained the
operator.” Six times. Same operator, in some cases. The system was
telling them, in every language it had, that the process was the
problem. They kept blaming the person and calling it a solution.
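The repeat pattern described above is easy to surface once CAPA records are in a queryable form. The following is a minimal sketch, not a prescribed method: the `CapaRecord` fields, the example workstation names, and the threshold of three repeats are all assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CapaRecord:
    # Hypothetical minimal record; a real CAPA system carries many more fields.
    process: str
    workstation: str
    root_cause: str


def repeated_root_causes(records, threshold=3):
    """Return (process, workstation, root_cause) keys seen at least `threshold` times."""
    counts = Counter(
        (r.process, r.workstation, r.root_cause.strip().lower()) for r in records
    )
    return {key: n for key, n in counts.items() if n >= threshold}


# Illustrative data only: six identical findings plus one unrelated record.
records = [CapaRecord("torque check", "WS-14", "Operator error")] * 6 + [
    CapaRecord("labeling", "WS-02", "Worn print head")
]
flagged = repeated_root_causes(records)
print(flagged)
```

Any key this returns is a candidate for reopening the investigation with the process, not the person, as the suspect.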
Failure Mode 2: The Action-Without-Verification Trap
A proper CAPA system has a verification step. After you implement a
corrective action, someone independent of the implementation checks
whether it actually worked. Did the nonconformance stop? Did the process
improve? Is the fix sustained over time — thirty days, sixty days,
ninety days?
Most CAPA systems skip this step entirely. Or worse, they perform it
as a checkbox exercise: “Verified effective — process monitored for two
weeks, no recurrence.” Two weeks is not verification. Two weeks is
optimism dressed up as evidence.
The timeline matters because many process changes have lag effects.
You adjust a machine parameter, and the immediate results look good —
but three months later, the change has introduced a subtle drift in a
related measurement that nobody is watching. You rewrite a procedure,
and the new version eliminates the original failure mode but creates a
new one because nobody thought through the interaction between the
revised step and the downstream process.
Verification means waiting long enough to know. It means checking the
right metrics. It means having someone who didn’t implement the fix
assess whether the fix actually fixed anything. Without verification,
your CAPA system is an opinion machine — and the opinion is always that
the action was effective, because the person who implemented it is the
same person checking the box.
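The two conditions argued for above, an independent verifier and a long enough monitoring window, can be expressed as a simple gate before a record is marked effective. This is a sketch under stated assumptions: the `Verification` fields are hypothetical, and the 90-day default is one of the timeframes mentioned earlier, not a universal rule.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Verification:
    # Hypothetical fields for illustration.
    implemented_by: str
    verified_by: str
    implemented_on: date
    verified_on: date
    recurrences: int


def verification_problems(v, min_days=90):
    """List reasons the verification should not count as evidence of effectiveness."""
    problems = []
    if v.verified_by == v.implemented_by:
        problems.append("verifier is the implementer")
    if (v.verified_on - v.implemented_on).days < min_days:
        problems.append(f"monitoring window shorter than {min_days} days")
    if v.recurrences > 0:
        problems.append("nonconformance recurred during monitoring")
    return problems


# A two-week self-check, as in the checkbox example above.
weak = Verification("j.doe", "j.doe", date(2024, 3, 1), date(2024, 3, 15), 0)
print(verification_problems(weak))
```

An empty list does not prove the fix worked; it only means the record clears the minimum bar for the question to be asked honestly.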
Failure Mode 3: The Preventive Action Vacuum
Here’s a pattern I’ve seen in nearly every organization: the CAPA
system is 95% corrective and 5% preventive. Sometimes it’s 100%
corrective. The preventive action section of the form is either left
blank, filled with “N/A,” or populated with a vague statement like “will
monitor for similar issues.”
Preventive action is harder than corrective action because it
requires imagination. Corrective action responds to something that
already happened — the evidence is in front of you. Preventive action
requires you to look at a process that is currently working fine and
ask: what could break this? What trend am I not seeing? What near-miss
almost became a failure, and what would I do differently if it had?
Most organizations don’t do this because most organizations are
firefighting. When your quality team spends its day reacting to
nonconformances, customer complaints, and audit findings, nobody has the
bandwidth to sit down and think about what hasn’t gone wrong yet.
Preventive action feels like a luxury — something you do when things are
calm. And things are never calm.
But this is precisely why problems keep recurring. The preventive
work that would have stopped the next crisis doesn’t get done because
the current crisis consumes all available attention. It’s a vicious
cycle: more corrective action, less preventive action, more problems,
more corrective action, less preventive action.
Organizations that break this cycle do it deliberately. They carve
out time for preventive work — scheduled, protected, non-negotiable
time. They treat near-miss reporting as a first-class quality activity,
not an afterthought. They analyze trends not just in failures but in
process behavior, looking for drift before it becomes deviation.
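One common way to look for drift before it becomes deviation is a run rule from statistical process control: a long run of consecutive points on one side of the center line suggests the process has shifted even if every point is still in specification. The sketch below assumes a run length of eight, which matches some published rule sets while others use nine; the sample values are made up for illustration.

```python
def has_run(values, center, run_length=8):
    """True if `run_length` consecutive values fall on the same side of `center`."""
    streak, last_side = 0, 0
    for v in values:
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            streak += 1
        else:
            # A point on the line or a change of side restarts the count.
            streak = 1 if side != 0 else 0
        last_side = side
        if streak >= run_length:
            return True
    return False


# Illustrative measurements around a nominal center of 10.0.
drifting = [10.1, 10.2, 10.1, 10.3, 10.2, 10.1, 10.2, 10.3]
stable = [10.1, 9.9, 10.1, 9.9, 10.2, 9.8, 10.1, 9.9]
print(has_run(drifting, center=10.0), has_run(stable, center=10.0))
```

A signal like this is a prompt for a preventive action, not a nonconformance: nothing has failed yet.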
Failure Mode 4: The Closure Obsession
In many organizations, CAPA metrics drive CAPA behavior, and the
primary metric is closure rate. How many open CAPAs do you have? How old
is your oldest one? How quickly are you closing them?
These sound like reasonable questions, and they are — but they create
a perverse incentive. When the goal is to close CAPAs quickly, people
close them quickly. Not correctly. Quickly.
I’ve seen quality managers under pressure to reduce their open CAPA
count who systematically closed records that weren’t ready for closure.
The verification step hadn’t been completed. The corrective action
hadn’t been implemented long enough to demonstrate effectiveness. The
root cause investigation was superficial. But the metric said “close
them,” so they closed them.
The result is a CAPA system that looks healthy on a dashboard and is
completely rotten underneath. Audit-ready numbers. Audit-proof problems.
When an auditor digs into the closed records — and good auditors always
do — they find the same failure modes recurring, the same shallow
investigations, the same unverified actions stamped “effective.”
The metric should not be how fast you close CAPAs. The metric should
be whether the problems you addressed stayed addressed. A CAPA system
with a high closure rate and a high recurrence rate is not a functioning
system. It’s a recycling program.
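The contrast between the two metrics can be made concrete with a small calculation. This is a sketch only: the dictionary keys `closed` and `recurred` are hypothetical, and how an organization decides that a problem "recurred" is a judgment the code does not settle.

```python
def closure_rate(capas):
    """Fraction of all CAPAs that are closed."""
    return sum(c["closed"] for c in capas) / len(capas)


def recurrence_rate(capas):
    """Fraction of closed CAPAs whose problem came back after closure."""
    closed = [c for c in capas if c["closed"]]
    if not closed:
        return 0.0
    return sum(c["recurred"] for c in closed) / len(closed)


# Illustrative records, not real data.
capas = [
    {"id": "C-001", "closed": True, "recurred": True},
    {"id": "C-002", "closed": True, "recurred": False},
    {"id": "C-003", "closed": True, "recurred": True},
    {"id": "C-004", "closed": False, "recurred": False},
]
print(f"closure rate: {closure_rate(capas):.0%}")
print(f"recurrence rate: {recurrence_rate(capas):.0%}")
```

In this made-up sample, three of four CAPAs are closed, which looks healthy on a dashboard, yet two of those three problems came back.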
Failure Mode 5: The Learning Disconnect
This is the most subtle and arguably the most damaging failure mode.
Even when a CAPA is done correctly — thorough root cause, effective
action, proper verification — the learning stays trapped in the record.
The CAPA form sits in a database. The knowledge gained from the
investigation doesn’t propagate to other processes, other lines, other
facilities.
You see this in multi-site organizations constantly. Plant A has a
CAPA for a problem that Plant B is about to have. The knowledge exists
within the organization. But there’s no mechanism to share it. Plant B
will discover the same problem, run the same investigation, reach the
same conclusion, and implement the same fix — six months later, at full
cost.
Even within a single facility, the learning disconnect is real. A
CAPA investigating a failure in the machining cell identifies a systemic
issue with tool change procedures. The fix is implemented in that cell.
But the assembly cell, which has similar tool change procedures, doesn’t
learn anything. The knowledge is local when it should be global.
A learning organization doesn’t just fix problems. It distributes the
knowledge gained from fixing them. Every CAPA should end with the
question: who else needs to know this? And the answer should trigger
communication — a training update, a procedure revision, a
cross-functional review, a lessons-learned briefing.
What a Functional CAPA System Looks Like
After diagnosing these failure modes in dozens of organizations, I’ve
developed a simple description of what a functional CAPA system looks
like. It has four characteristics:
Investigation over documentation. The root cause
analysis is the most important part of the CAPA. It gets the most time,
the most resources, and the most rigor. If the investigation is shallow,
everything downstream is wasted effort. Use structured methods — 5 Whys,
Ishikawa, fault tree analysis — and use them properly, not as
performative exercises.
Verification is non-negotiable. Every corrective
action is verified by someone who didn’t implement it, over a meaningful
timeframe, using objective evidence. “No recurrence in two weeks” is not
verification. “Process capability maintained at Cpk > 1.33 over
ninety days with no special-cause signals on the control chart” is
verification.
Preventive action gets real resources. Preventive
action isn’t an afterthought. It has dedicated time, dedicated people,
and dedicated metrics. Near-miss reporting is actively encouraged, not
quietly discouraged. Trend analysis is performed regularly, not when
someone has spare time.
Learning propagates. Every closed CAPA generates a
lessons-learned output that reaches every relevant part of the
organization. The knowledge doesn’t stay in the database. It goes to the
people who need it, in a format they can use, through a channel they
actually check.
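The verification criterion quoted in the second characteristic relies on the process capability index, which compares the distance from the process mean to the nearer specification limit against three standard deviations:

    Cpk = min(USL - mean, mean - LSL) / (3 * sigma)

A minimal sketch of that calculation follows. The specification limits and measurements are invented for illustration, and sigma is estimated here from the overall sample standard deviation; SPC software often uses a within-subgroup estimate instead, so results may differ from a control-chart package.

```python
import statistics


def cpk(samples, lsl, usl):
    """Process capability index from the sample mean and sample standard deviation."""
    mean = statistics.mean(samples)
    sigma = statistics.stdev(samples)
    return min(usl - mean, mean - lsl) / (3 * sigma)


# Illustrative measurements against hypothetical limits of 9.90 and 10.10.
measurements = [10.02, 9.98, 10.01, 9.99, 10.00, 10.03, 9.97, 10.00]
value = cpk(measurements, lsl=9.90, usl=10.10)
print(f"Cpk = {value:.2f}, above 1.33: {value > 1.33}")
```

Eight points are far too few for a real capability study; the point is only that "effective" can be tied to a number and a time window rather than to an opinion.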
The Cost of a Broken CAPA System
Organizations tolerate broken CAPA systems because the cost of the
dysfunction is hidden. It doesn’t show up as a line item. But it’s there
— in the warranty claims that keep coming, in the customer complaints
that sound familiar, in the audit findings that repeat year after year,
in the rework costs that never decrease, in the employee turnover that
happens because people get tired of working in a system that never
improves.
The cost of a broken CAPA system is the cost of solving the same
problems forever. It’s the premium you pay for not learning. And it
compounds — because every problem you don’t truly solve creates the
conditions for the next one.
If your organization has more than ten open CAPAs older than ninety
days, you have a broken CAPA system. If your CAPA recurrence rate is
above 15%, you have a broken CAPA system. If your
preventive-to-corrective ratio is below 1:5, you have a broken CAPA
system. These aren’t aspirational benchmarks. They’re the minimum
threshold for a system that does what CAPA is supposed to do: prevent
the preventable.
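The three thresholds above translate directly into a health check. The function and parameter names below are hypothetical, and the numbers are the ones stated in this article rather than values taken from any standard.

```python
def capa_health_flags(open_over_90_days, recurrence_rate, preventive, corrective):
    """Return the threshold breaches for a CAPA system, using the limits stated above."""
    flags = []
    if open_over_90_days > 10:
        flags.append("more than ten open CAPAs older than ninety days")
    if recurrence_rate > 0.15:
        flags.append("recurrence rate above 15%")
    # Ratio of preventive to corrective actions; skipped if there are no correctives.
    if corrective > 0 and preventive / corrective < 1 / 5:
        flags.append("preventive-to-corrective ratio below 1:5")
    return flags


# Illustrative inputs only.
print(capa_health_flags(open_over_90_days=12, recurrence_rate=0.20,
                        preventive=2, corrective=40))
```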
The Uncomfortable Truth
CAPA is not complicated. It’s a straightforward loop: find a problem,
understand it, fix it, check the fix, share the learning. The
methodology is well-documented. The tools are widely available. The
training is accessible.
CAPA systems fail not because people don’t know how to run them, but
because running them properly requires something most organizations
aren’t willing to provide: the time and the honesty to investigate
problems thoroughly, even when the investigation reveals uncomfortable
truths about management decisions, resource allocation, or process
design.
A CAPA system that blames operators will always fail because
operators are rarely the root cause. A CAPA system that stops at the
first plausible answer will always fail because the first plausible
answer is usually a symptom. A CAPA system optimized for closure speed
will always fail because real problem-solving takes time.
The organizations that get CAPA right are the organizations that are
honest with themselves. They look at failures and ask what the system
did wrong, not who got it wrong. They invest in
investigation because they understand that a thorough investigation is
cheaper than a recurrence. They verify because they’ve learned that
unverified fixes are just deferred failures.
CAPA isn’t paperwork. It’s the institutional memory of an
organization’s quality journey. Treat it as paperwork, and you’ll have a
quality journey that goes in circles. Treat it as learning, and you’ll
have a quality journey that actually goes somewhere.
About the Author: Peter Stasko is a Quality
Architect with over 25 years of experience implementing and auditing
quality management systems across automotive, electronics, medical
device, and heavy manufacturing industries. He specializes in
transforming dysfunctional quality processes into learning systems that
drive measurable improvement.