FMEA: When Your Failure Mode Analysis Becomes a Risk Spreadsheet Nobody Updates, and the Hazards You Identified Become a Checkbox You Filed Away and Forgot

Failure Mode and Effects Analysis is supposed to be the intellectual
conscience of your engineering process. It is the document where your
team sits down, imagines everything that could go wrong, assesses how
bad it would be, how often it might happen, and how likely you are to
catch it before it reaches the customer — and then uses that analysis to
drive preventive action. It is, in principle, one of the most powerful
quality tools ever devised. The fact that it has become one of the most
hated, most abused, and most meaningless documents in modern
manufacturing is not a failure of the tool. It is a failure of us.

I have spent decades watching organizations turn FMEA into the
exact opposite of what it was designed to do. The irony is bitter. A
tool created to prevent failures has become a failure in its own right —
a bureaucratic exercise so divorced from engineering reality that the
people filling it out often have no understanding of the process they
are analyzing, and the people who do understand the process have long
since stopped attending the meetings. What remains is a spreadsheet full
of numbers that nobody believes, serving no purpose other than
satisfying a customer requirement or an audit checklist.

The story of how this happened is worth telling, because it reveals
something fundamental about how quality tools degrade: not through
malice, but through a slow, comfortable erosion of meaning that nobody
notices until the very failure the FMEA was supposed to prevent
finally arrives.

The Origin: A Tool Born From Necessity

FMEA was not invented by a consultant looking to sell templates. It
grew out of U.S. military reliability work in the late 1940s and was
refined in aerospace programs over the following decades, where the
cost of failure was measured not in dollars but in lives. When a rocket
explodes on the launch pad or an aircraft system
fails mid-flight, you cannot simply issue a recall and fix it in the
next production run. The consequences are immediate, catastrophic, and
irreversible. The engineers who built those systems needed a structured
way to think through every possible failure mode — every way each
component, subsystem, and interface could go wrong — before the system
was ever built or flown.

The method they developed was elegant in its simplicity. For each
potential failure mode, you ask three questions: How severe would the
consequence be? How likely is this failure to occur? How likely are you
to detect it before it causes harm? You score each on a scale (typically
1 to 10, where a higher detection score means the failure is harder to
catch), multiply the three scores together, and the result, the Risk
Priority Number or RPN, gives you a ranked list of where to focus your
preventive efforts. The highest RPNs get attention first. Resources go
where the risk is greatest. It was, and remains, a rational framework
for making decisions under uncertainty.
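
The scoring described above is simple enough to sketch in a few lines. What follows is a minimal illustration, not a reference implementation: the class name, the function name, and the three example rows are all invented for this sketch, and the 1 to 10 scale is the typical one rather than a universal rule.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One FMEA row. Each rating is assumed to sit on a 1-10 scale."""
    name: str
    severity: int
    occurrence: int
    detection: int  # 1 = almost certain to be caught, 10 = almost certain to escape

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 1-10 scale.
        for field_name in ("severity", "occurrence", "detection"):
            value = getattr(self, field_name)
            if not 1 <= value <= 10:
                raise ValueError(f"{field_name} must be 1-10, got {value}")

    @property
    def rpn(self) -> int:
        """Risk Priority Number: the product of the three ratings."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes: list[FailureMode]) -> list[FailureMode]:
    """Highest RPN first; ties keep their input order because sorted() is stable."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical rows, invented purely for illustration.
modes = [
    FailureMode("seal leak", severity=7, occurrence=3, detection=4),
    FailureMode("connector corrosion", severity=5, occurrence=6, detection=5),
    FailureMode("label misprint", severity=2, occurrence=4, detection=2),
]

for mode in rank_by_rpn(modes):
    print(f"{mode.rpn:>4}  {mode.name}")
```

The ranked list is only a sorting aid. Nothing in the arithmetic knows whether the ratings behind it were earned.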

The automotive industry adopted FMEA in the 1970s, and from there it
spread to virtually every manufacturing sector. IATF 16949 requires it,
and organizations working to ISO 9001, AS9100, and dozens of other
standards routinely use it to meet their risk-management requirements.
AIAG and VDA published a harmonized handbook in 2019. Software tools
were built to automate it. Training programs certified practitioners by
the thousands. FMEA became institutional, and that is precisely when the
trouble began.

The Degeneration: From Engineering Tool to Compliance Artifact

Here is how FMEA dies in an organization. It does not die
dramatically. It does not die in a single meeting or through an explicit
decision to stop doing it. It dies the way most quality initiatives die
— through a thousand small compromises, each of which seems reasonable
at the time, until the document that remains bears no resemblance to the
engineering analysis it was supposed to be.

Step one: The FMEA becomes a template. Someone,
somewhere, creates a master FMEA template — maybe from a previous
project, maybe downloaded from a supplier portal, maybe purchased from a
consulting firm. This template has columns for everything: function,
failure mode, effect, severity, cause, occurrence, detection, RPN,
recommended action, responsibility, target date, revised ratings. It
looks comprehensive. It looks professional. And it creates the illusion
that FMEA is about filling in columns rather than thinking about
failures.

The template is copied for each new project. The team opens it, looks
at the familiar structure, and begins populating it — not by conducting
a fresh analysis of the new design or process, but by adapting what was
written last time. Failure modes from the previous product are carried
over. Severity ratings are reused. Occurrence scores are copied.
Detection scores are assumed. The analysis is not analysis; it is
editing. And because the template always looks complete when the columns
are filled, nobody questions whether the content reflects genuine
engineering thought.

Step two: The RPN becomes a number game. The Risk
Priority Number was designed to be a prioritization tool — a rough
sorting mechanism to help teams decide where to focus. Instead, it
becomes a target. Customers set RPN thresholds: any RPN above 100 (or
80, or 50, depending on the customer) must have an action plan. Auditors
flag high RPNs as nonconformances. Organizations track average RPN as a
quality metric.

The result is entirely predictable. Teams learn to engineer their
RPNs, not their products. They adjust severity ratings downward — “Well,
the customer might notice the cosmetic defect, but they probably will
not return the part, so let us call it a 3 instead of a 5.” They lower
occurrence ratings — “We have never seen this failure in production, so
it must be unlikely” — ignoring that the reason they have never seen it
is that the product was just launched and the volume is low. They
overstate their detection capability and score detection too favorably
(“Our final inspection will catch it”), even
though the final inspection has a known 15 percent escape rate. The RPN
comes in below the threshold, the customer is satisfied, and the risk
that was supposed to be analyzed has been negotiated away through a
spreadsheet.
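
To see how little effort the negotiation takes, consider the following sketch. The ratings and the threshold of 100 are hypothetical, chosen only to mirror the kind of adjustments described above. Nothing about the product changes between steps; only the numbers do.

```python
THRESHOLD = 100  # hypothetical customer limit; real limits vary by customer


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the product of the three ratings."""
    return severity * occurrence * detection


# Hypothetical ratings for a single failure mode, adjusted one argument at a time.
# Each tuple is (what changed, severity, occurrence, detection).
steps = [
    ("as first scored", 5, 4, 6),
    ("severity 5 -> 3 ('they probably will not return the part')", 3, 4, 6),
    ("occurrence 4 -> 3 ('we have never seen it in production')", 3, 3, 6),
    ("detection 6 -> 4 ('final inspection will catch it')", 3, 3, 4),
]

for label, s, o, d in steps:
    value = rpn(s, o, d)
    status = "action required" if value > THRESHOLD else "below threshold"
    print(f"RPN {value:>3}  {status:<16} {label}")
```

In this example the very first adjustment is enough to drop the row below the threshold. The remaining two simply add a comfortable margin.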

Step three: The cross-functional team stops showing
up. A meaningful FMEA requires diverse perspectives: design
engineers who understand the product architecture, process engineers who
understand the manufacturing flow, quality engineers who understand
inspection capabilities, and operators who understand what actually
happens on the floor. The method’s power comes from combining these
perspectives — the design engineer identifies a failure mode that the
process engineer would never think of, the operator describes a
real-world condition that the design engineer never anticipated, and the
quality engineer identifies a detection gap that neither of them
noticed.

But cross-functional meetings take time. They require scheduling,
preparation, and follow-through. As the FMEA becomes a template
exercise, the meetings become formulaic. The design engineer sends a
junior delegate who has not been involved in the design decisions. The
process engineer joins by phone and multitasks through the meeting. The
operator is not invited because production cannot spare the headcount.
The quality engineer facilitates, but facilitation is not engineering
analysis — it is keeping the meeting moving and the columns filled.

Eventually, the FMEA meeting becomes a one-person exercise. The
quality engineer sits alone with the template, fills in what they can,
circulates it for review, receives no comments, and declares it
complete. The cross-functional analysis that was supposed to surface
hidden risks has been replaced by a single person’s best guess,
validated by silence.

Step four: The action plan becomes a lie. The FMEA
process requires that high-risk failure modes have action plans —
specific tasks, with owners and deadlines, designed to reduce the risk.
These actions are the entire point of the exercise. Without them, the
FMEA is just a catalog of things that might go wrong, which is
interesting but useless.

In practice, the action column is where the FMEA goes to die. The
actions written there are vague: “Monitor in production.” “Improve
operator training.” “Review inspection method.” Nobody defines what
monitoring means, what the training will cover, or what the review will
produce. The owner column lists a department, not a person. The target
date says “Q3” or “ongoing” or is simply left blank. When the audit
comes, the FMEA is reviewed for completeness — all columns filled, all
actions listed — and it passes. Whether the actions were ever executed
is a question nobody asks.
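
A completeness check asks whether the columns are filled. A slightly more useful check asks whether what fills them could ever be executed. The sketch below is one hedged way to express that difference; the phrase list, the department list, the field names, and both example actions are assumptions made for illustration, not part of any standard.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical lists; a real team would maintain its own.
VAGUE_PHRASES = ("monitor", "improve operator training", "review inspection method")
DEPARTMENTS = {"quality", "production", "engineering", "maintenance"}


@dataclass
class Action:
    description: str
    owner: str
    due: Optional[date]  # None stands in for "Q3", "ongoing", or a blank cell
    deliverable: str = ""


def audit_action(action: Action) -> list[str]:
    """Return the reasons this action cannot realistically be tracked to closure."""
    problems = []
    text = action.description.lower()
    if any(phrase in text for phrase in VAGUE_PHRASES) and not action.deliverable:
        problems.append("vague description with no defined deliverable")
    owner = action.owner.strip().lower()
    if not owner or owner in DEPARTMENTS:
        problems.append("owner is missing or is a department, not a person")
    if action.due is None:
        problems.append("no concrete target date")
    return problems


# Two invented examples: one typical of the pattern above, one trackable.
weak = Action("Monitor in production.", owner="Quality", due=None)
strong = Action(
    "Add go/no-go gauge at the crimp station",
    owner="J. Rivera",  # hypothetical named owner
    due=date(2025, 3, 14),
    deliverable="validated gauge and revised work instruction",
)

for action in (weak, strong):
    print(action.description, "->", audit_action(action) or "trackable")
```

A check like this cannot tell you whether an action is the right one. It can only tell you that an action with no person, no date, and no deliverable was never going to happen.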

I have reviewed hundreds of FMEAs across dozens of organizations, and
the pattern is remarkably consistent. The recommended action column is a
graveyard of good intentions. If you tracked every FMEA action plan ever
written to completion, my estimate is that you would find fewer than 20
percent fully implemented. The rest exist only on paper,
creating a false sense of security that the risks have been addressed
when they have merely been documented.

The Action FMEA: A Different Approach

There is a better way, and it does not require a new template or a
new software tool. It requires a different relationship with the
document entirely.

The most effective organizations I have worked with treat the FMEA as
a living engineering document, not a compliance record. It is opened at
the start of a project, when the design is still fluid and the process
is still being defined. It is updated when new information arrives — a
field failure, a test result, a supplier change, a process modification.
It is reviewed in engineering reviews, not just quality audits. And its
action items are tracked with the same rigor as any other engineering
deliverable: with owners, milestones, deliverables, and closure
criteria.

These organizations have made a few key shifts that distinguish their
FMEA practice from the compliance exercise:

They start with functions, not failure modes. Before
you can analyze how something might fail, you need to understand what it
is supposed to do. The best FMEAs I have seen spend significant time on
the function column — defining, for each component and interface, what
the intended function is, under what conditions, and to what performance
level. Only when the function is clearly defined does the team ask: what
could prevent this function from being fulfilled? This sequencing
matters because it forces the team to think about the design intent
before they think about the failure. It grounds the analysis in
engineering reality.

They use the RPN as a starting point, not an
endpoint. The RPN is a conversation starter, not a decision. A
high RPN does not automatically trigger a mandatory action — it triggers
a discussion. Is this risk real? Is the severity rating accurate? Do we
have data on occurrence, or are we guessing? Is the detection rating
based on actual inspection capability data, or on what the procedure
says should happen? The discussion is the value. The number is just the
prompt.
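
One way to keep that discussion honest is to record, next to each rating, what the rating is based on. The sketch below assumes a small set of evidence categories and a row structure that I have invented for illustration. It turns every rating that is not backed by current data into a question for the review.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    """Hypothetical categories for where a rating came from."""
    CURRENT_DATA = "backed by data from the current product and process"
    CARRIED_OVER = "carried over from a previous product or process"
    PROCEDURE = "based on what the procedure says should happen"
    GUESS = "a team estimate with no data behind it"


@dataclass(frozen=True)
class Rating:
    value: int
    basis: Basis


@dataclass(frozen=True)
class Row:
    failure_mode: str
    severity: Rating
    occurrence: Rating
    detection: Rating

    @property
    def rpn(self) -> int:
        return self.severity.value * self.occurrence.value * self.detection.value


def questions_for_review(row: Row) -> list[str]:
    """One question per rating that is not supported by current data."""
    questions = []
    for label in ("severity", "occurrence", "detection"):
        rating = getattr(row, label)
        if rating.basis is not Basis.CURRENT_DATA:
            questions.append(
                f"{label} = {rating.value} is {rating.basis.value}; "
                "what evidence would confirm it?"
            )
    return questions


# Hypothetical row, invented for illustration.
row = Row(
    "weld crack",
    severity=Rating(8, Basis.CURRENT_DATA),
    occurrence=Rating(2, Basis.CARRIED_OVER),
    detection=Rating(3, Basis.PROCEDURE),
)

print(f"RPN {row.rpn}")
for question in questions_for_review(row):
    print(" -", question)
```

The RPN for this row is low, yet two of its three inputs are unverified. That is exactly the kind of row a threshold would wave through and a conversation would stop.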

They close the loop. Every action has a defined
deliverable — a design change, a process modification, a Poka-Yoke
device, a new inspection method, a revised work instruction. When the
action is complete, the FMEA is updated: the severity, occurrence, or
detection rating is revised to reflect the improvement, and a new RPN is
calculated. This creates a visible record of risk reduction over time,
which is far more meaningful than a static snapshot of risk at project
launch.
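
Closing the loop amounts to re-scoring after each completed action and keeping the trail. Here is a minimal sketch of that record, with hypothetical names and ratings throughout. It assumes the ratings are revised only once the action has been verified as effective.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RiskRow:
    failure_mode: str
    severity: int
    occurrence: int
    detection: int
    # Snapshots of (what happened, RPN afterwards), oldest first.
    history: list[tuple[str, int]] = field(default_factory=list)

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection

    def close_action(
        self,
        note: str,
        *,
        severity: Optional[int] = None,
        occurrence: Optional[int] = None,
        detection: Optional[int] = None,
    ) -> None:
        """Record the starting RPN once, apply revised ratings, record the new RPN."""
        if not self.history:
            self.history.append(("initial assessment", self.rpn))
        if severity is not None:
            self.severity = severity
        if occurrence is not None:
            self.occurrence = occurrence
        if detection is not None:
            self.detection = detection
        self.history.append((note, self.rpn))


# Hypothetical row and actions, invented for illustration.
row = RiskRow("connector not fully seated", severity=7, occurrence=5, detection=6)
row.close_action("Poka-Yoke fixture installed and verified", occurrence=2)
row.close_action("in-line seating sensor validated", detection=2)

for note, value in row.history:
    print(f"RPN {value:>3}  {note}")
```

Note that severity never moves in this example. Process controls change how often a failure happens and how reliably it is caught; changing how bad it is usually takes a design change.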

They involve the right people at the right time. The
initial FMEA session is small — three to five people with deep knowledge
of the design and process. But the document is then circulated to a
wider audience: operators, suppliers, customers (where appropriate), and
cross-functional reviewers. Feedback is incorporated. The document
evolves. The FMEA is not a meeting; it is a process.

The Cost of Getting It Wrong

The consequences of a hollow FMEA practice are not theoretical. I
have seen them play out in factory after factory, in industry after
industry.

A Tier 1 automotive supplier filed an FMEA identifying a potential
failure mode in a steering component. The severity was correctly rated
at 10 — catastrophic, potential for loss of life. The occurrence was
rated at 2 — unlikely. The detection was rated at 3 — almost certain to
be caught. The RPN was 60, below the customer threshold, so no action
was required. The FMEA was approved and filed.

What the FMEA did not capture was that the occurrence rating was
based on a failure rate from a previous generation of the product,
manufactured on different equipment, with a different material supplier.
The detection rating was based on a 100 percent inline inspection that
had been added to the control plan but never validated for effectiveness
— the inspection could not reliably detect the specific failure mode
identified in the FMEA. The risk was real, it was high, and the FMEA had
documented it without understanding it.
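
The arithmetic in this case is worth spelling out, because it shows what a threshold cannot see. In the sketch below, the steering ratings are the ones reported above. The comparison row and the threshold of 100 are hypothetical, added only to show that the RPN alone cannot tell the two apart.

```python
THRESHOLD = 100  # hypothetical; the case states only "below the customer threshold"


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the product of the three ratings."""
    return severity * occurrence * detection


# Each value is (severity, occurrence, detection).
rows = {
    "steering component (ratings from the case above)": (10, 2, 3),
    "cosmetic scratch (hypothetical comparison row)": (3, 4, 5),
}

for name, (s, o, d) in rows.items():
    value = rpn(s, o, d)
    verdict = "action required" if value > THRESHOLD else "no action required"
    print(f"S={s:<2} O={o} D={d}  RPN={value}  {verdict}  {name}")
```

Both rows score the same and both pass. A failure mode with potential loss of life and a cosmetic blemish become interchangeable once the three ratings are collapsed into one number, which is why the ratings themselves, and the evidence behind them, deserve the scrutiny.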

The failure occurred in production. Parts escaped. A recall was
issued. The investigation traced the root cause back through the FMEA,
which was found to be complete, properly formatted, and utterly wrong in
its risk assessment. The supplier had done everything right on paper and
everything wrong in practice.

This is the paradox of FMEA: the more thoroughly you document the
risk without actually understanding it, the more dangerous your
organization becomes — because the document creates a confidence that
the analysis did not earn. An organization that has never done an FMEA
is cautious. An organization that has done an FMEA and believes its own
numbers is reckless. The difference between caution and recklessness is
not the presence of the document; it is the quality of the thinking
behind it.

Rebuilding the Practice

If your organization’s FMEA practice has degraded into the compliance
exercise described above — and if you are honest, it probably has — the
path back is not complicated, but it requires something harder than a
new template or a training program. It requires your engineering leaders
to treat the FMEA as engineering work, not paperwork.

This means allocating real time for FMEA sessions — not thirty
minutes sandwiched between a design review and a production meeting, but
dedicated blocks where the team can think deeply about failure modes
without watching the clock. It means assigning the FMEA to an engineer
who understands the product, not to a quality facilitator whose job is
to fill in the template. It means reviewing the FMEA in design reviews,
not just in quality audits — and asking hard questions about the
ratings, not just checking that the columns are filled.

It means accepting that a good FMEA will make your project look
riskier, not safer, in the short term. A team that takes the analysis
seriously will identify more failure modes, assign higher severity
ratings, and surface more detection gaps than a team going through the
motions. This is not a sign that your product is worse than you thought.
It is a sign that your understanding is better than it was. The risk was
always there. The FMEA has simply made it visible, which is the
prerequisite for doing something about it.

And it means tracking actions to closure with the same discipline you
apply to any other engineering deliverable. If your FMEA identifies a
risk serious enough to warrant an action, then that action is serious
enough to warrant follow-through. An open FMEA action is not a paperwork
gap; it is an unmitigated risk that you have identified and chosen to
ignore. That is worse than not identifying it at all, because you can no
longer claim ignorance.

The Real FMEA

The FMEA was never meant to be a document. It was meant to be a
discipline — a way of thinking systematically about what could go wrong,
grounded in engineering knowledge, driving preventive action. The
document is merely the artifact of that discipline, the record that the
thinking occurred. When the thinking stops and only the document
remains, the FMEA becomes worse than useless. It becomes a lie — a
carefully formatted assertion that risks have been analyzed and
addressed when they have merely been listed and filed.

Every organization I have worked with that excels at quality has one
thing in common: their FMEA is a living document, owned by engineers,
debated in reviews, updated with new information, and taken seriously as
a driver of design and process decisions. Every organization that
struggles with quality has the opposite: a spreadsheet owned by the
quality department, completed at project launch, filed in the system,
and never looked at again.

The difference between these two organizations is not their template,
their software, their training budget, or their RPN thresholds. The
difference is whether they treat the FMEA as thinking or as paperwork.
You can call it whatever you want. You can format it however you like.
But the question is always the same: when your team filled out the FMEA,
did they actually think about what could go wrong — or did they fill in
the columns?

Your answer to that question is your answer to whether your FMEA is
worth the paper it is printed on.

Peter Stasko is a Quality Architect with over 25
years of experience in manufacturing quality management, process
improvement, and engineering problem-solving across automotive,
aerospace, and industrial manufacturing sectors. He has implemented and
audited quality systems to ISO 9001, IATF 16949, and AS9100, and has
spent decades in the gap between what quality tools are supposed to do
and what they actually do on factory floors. He writes about quality not
as an academic exercise but as a practitioner who has seen every side of
it — the good, the bad, and the bureaucratic.
