The Promise That Sold It
Total Productive Maintenance arrived in your plant with a seductive
pitch: zero breakdowns, zero defects, zero accidents. The consultants
showed you the eight pillars — Autonomous Maintenance, Planned
Maintenance, Focused Improvement, Quality Maintenance, Early Equipment
Management, Training and Skills Development, Safety and Environment, and
Office TPM — and you nodded along, imagining a floor where operators
knew their machines the way a driver knows a car that’s been theirs for
twenty years. You pictured operators who could hear a bearing going bad
before the vibration analyst’s monthly visit, who could feel a slight
drag in a slide and adjust it before it became a jam, who took personal
pride in uptime numbers because those numbers were theirs.
The goal was world-class: OEE above 85%, mean time between failures
measured in months instead of days, maintenance costs as a percentage of
asset replacement value trending steadily downward. And the mechanism
was just as appealing — blend the skills of operators and maintenance
technicians so completely that the line between them disappears.
Operators do the daily care. Maintenance does the deep technical work.
Everybody wins.
It sounded like the answer to every breakdown that had ever cost you
a shipment.
What Actually Happened
Fast-forward eighteen months. Walk the floor and here is what TPM
looks like in practice:
The Autonomous Maintenance steps — the ones where operators were
supposed to learn their machines by cleaning them, inspecting them,
lubricating them, and gradually taking ownership of routine care — have
been reduced to a laminated card clipped to each machine. The card says
“Clean. Inspect. Lubricate. Tighten.” It has check boxes. Operators
check the boxes. They check them the same way they check the boxes on
every other form they’ve been handed — quickly, without thinking,
sometimes for machines they haven’t actually looked at.
The cleaning itself has become the visible signal that TPM is
“happening.” When executives walk through, they see gleaming machines
and color-coded labels and shadow boards with precisely placed tools.
They see operators wiping down housings and they think: this is
equipment excellence. What they don’t see is that the cleaning was
supposed to be the entry point to understanding — not the
destination. The act of cleaning a machine was supposed to teach the
operator where the heat builds up, where the chips accumulate, where the
oil migrates, where the wear shows first. It was supposed to be an
inspection disguised as a chore. Instead, the chore survived and the
inspection died.
The Seven-Step Decay
Autonomous Maintenance was designed as a seven-step progression:
initial cleaning, countermeasures to sources of contamination, standards
for cleaning and lubrication, general inspection training, autonomous
inspection, standardization, and full autonomous management. Each step
was supposed to be a gate — you don’t move forward until the current
step is genuinely mastered. But gates are only as good as the people
guarding them, and in most implementations, the gates were opened wide
because the project timeline said it was time to move on.
So operators who never truly learned general inspection — who
couldn’t tell you what a bearing race looks like when it’s starting to
spall, who couldn’t identify the sound a loose coupling makes under load
— were “progressed” to autonomous inspection. They were handed
inspection sheets filled with technical terms they’d never been taught
to interpret. They went through the motions. The sheets came back clean,
not because the machines were in good condition, but because the
inspectors couldn’t tell the difference.
Step seven — full autonomous management — was declared “achieved” on
a spreadsheet in a conference room while the maintenance manager stared
out the window knowing full well that the operator on Line 3 still calls
maintenance every time the hydraulic pressure fluctuates, because nobody
ever taught them what fluctuation means.
The Focused Improvement Theater
Then there’s Focused Improvement, or Kobetsu Kaizen — the pillar
dedicated to cross-functional teams attacking the biggest equipment
losses with structured problem-solving. In theory, these are your best
people spending dedicated hours analyzing why Machine X loses 200 hours
a quarter to minor stops, using tools like Why-Why analysis and Pareto
charts to drive those losses down permanently.
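To make the Pareto step concrete, here is a minimal Python sketch. The loss causes and their hours are invented for illustration (only the 200-hour total echoes the example above), and `pareto` is a hypothetical helper name, not part of any TPM toolkit.

```python
def pareto(loss_hours):
    """Rank loss causes by hours lost and attach the cumulative share of the total."""
    total = sum(loss_hours.values())
    ranked = sorted(loss_hours.items(), key=lambda item: item[1], reverse=True)
    rows, running = [], 0
    for cause, hours in ranked:
        running += hours
        rows.append((cause, hours, running / total))
    return rows


# Hypothetical minor-stop log for Machine X over one quarter (hours per cause).
losses = {
    "part jam at infeed": 60,
    "sensor misread": 90,
    "guard door opened": 30,
    "label misfeed": 15,
    "other": 5,
}

rows = pareto(losses)
for cause, hours, cumulative in rows:
    print(f"{cause:<20} {hours:>4} h  {cumulative:6.1%}")
```

With these made-up numbers, the top two causes account for 75% of the lost hours, which is where a team with protected time would start its Why-Why analysis.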
In practice, Focused Improvement became a monthly meeting where
people who were already stretched thin on their actual jobs sat in a
room for forty-five minutes and brainstormed ideas on a flipchart. The
ideas were assigned as action items to people who already had action
items from three other pillars. The action items were chased at the next
meeting, where it was discovered that nobody had time to complete them.
The meeting adjourned with renewed commitment. The losses continued.
The problem isn’t laziness. The problem is that Focused Improvement
requires something most plants won’t give it: protected time. Real
Kobetsu Kaizen means pulling your best operator off the line for hours —
maybe days — to observe, measure, analyze, and experiment. It means
accepting that the short-term production hit is worth the long-term
reliability gain. Most plants can’t make that trade because the
short-term number is what the weekly report tracks, and nobody has
figured out how to put “we prevented a future breakdown” into this
week’s OEE.
Planned Maintenance Without the Planning
Planned Maintenance was supposed to be the pillar where your
maintenance organization transitions from firefighting to proactive
care. Condition-based monitoring, predictive maintenance, and a
maintenance calendar built on actual equipment history rather than
manufacturer guesses.
What actually happened is that your maintenance team — already
understaffed and overwhelmed — adopted the calendar but skipped the
analysis. PMs got scheduled based on convenience and available downtime,
not based on failure data. The result is that some machines get serviced
far more often than they need to, consuming parts and labor that could
be deployed elsewhere, while other machines run to failure because their
PM keeps getting deferred when something more urgent pops up — and
something is always more urgent.
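As a rough illustration of what "based on failure data" could mean, the Python sketch below compares a scheduled PM interval with the observed average gap between failures. This is a deliberately crude screen, not a reliability model: the function names, the failure dates, and the 10% over-service threshold are all assumptions made for the example.

```python
from datetime import date


def mean_days_between_failures(failure_dates):
    """Average gap in days between consecutive failures, or None with fewer than two."""
    ordered = sorted(failure_dates)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def review_pm_interval(pm_interval_days, failure_dates, over_service_ratio=0.1):
    """Flag PM intervals that look out of line with the failure history.

    over_service_ratio is an arbitrary illustrative threshold (assumption).
    """
    gap = mean_days_between_failures(failure_dates)
    if gap is None:
        return "no failure history"
    if pm_interval_days >= gap:
        return "interval too long"
    if pm_interval_days < over_service_ratio * gap:
        return "possibly over-serviced"
    return "plausible"


# Hypothetical failure log for one machine.
failures = [date(2024, 1, 1), date(2024, 3, 1), date(2024, 4, 30)]

print(mean_days_between_failures(failures))
print(review_pm_interval(90, failures))
print(review_pm_interval(3, failures))
print(review_pm_interval(30, failures))
```

A real program would go further (failure modes, wear-out versus random failures, cost of failure), but even a screen this simple separates intervals chosen from history from intervals chosen by convenience.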
The predictive maintenance tools that were supposed to transform this
— vibration analysis, oil analysis, thermography — were purchased,
partially deployed, and then partially abandoned. Not because they don’t
work, but because nobody was given the dedicated time to learn them
properly, build the baseline data, and integrate the findings into the
maintenance planning cycle. You have a vibration analyst who comes in
once a month and generates reports that get filed. The critical
findings get addressed when someone has time, which is to say,
sometimes after the bearing has already failed.
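Building a baseline and acting on deviations from it can be sketched in a few lines of Python. Everything here is an assumption for illustration: the readings are invented, the six-reading baseline window and the three-sigma alarm limit are arbitrary choices, and a real program would normally rely on the analyst's severity criteria rather than this simple rule.

```python
from statistics import mean, stdev


def first_alarm_index(readings, baseline_count=6, sigmas=3.0):
    """Return the index of the first reading above baseline mean + sigmas * stdev.

    The first baseline_count readings define "normal"; later readings are checked
    against that limit. Returns None if no reading exceeds it.
    """
    baseline = readings[:baseline_count]
    limit = mean(baseline) + sigmas * stdev(baseline)
    for index, value in enumerate(readings[baseline_count:], start=baseline_count):
        if value > limit:
            return index
    return None


# Hypothetical monthly vibration velocity readings for one bearing (mm/s).
monthly_readings = [2.0, 2.1, 1.9, 2.0, 2.1, 1.9, 2.0, 2.2, 2.6, 3.1]

alarm_at = first_alarm_index(monthly_readings)
print(alarm_at)
```

The arithmetic is the easy part. The value comes from what happens next: the flagged reading has to become a prioritized work order, not another filed report.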
Training and Skills Development: The Promise and the Pipeline
The Training and Skills Development pillar is where TPM was supposed
to close the skills gap — to create a matrix of competencies for every
role, assess every operator and technician against that matrix, and
build deliberate training plans to close the gaps.
Most plants did the matrix. They built a beautiful skills matrix on a
whiteboard in the break room, with green, yellow, and red dots showing
who was proficient, who was developing, and who wasn’t yet trained on
each machine. The matrix was updated quarterly. Then it was updated
annually. Then it was updated when someone remembered.
The training itself — the actual transfer of knowledge from
experienced to inexperienced operators — was left to “on-the-job
training,” which is manufacturing’s universal euphemism for “follow Bob
around for two days and try to pick it up.” Bob is your most experienced
operator, which means Bob is also your busiest operator, which means
Bob’s “training” consists of showing the new hire where the start button
is and telling them to call if something looks weird.
This is not the fault of the operators. This is the fault of an
organization that declared a skills development pillar without
allocating dedicated training time, without building structured training
materials, and without measuring training effectiveness. The matrix
measures intent. It does not measure capability.
The Safety and Environment Checkbox
Safety and Environment — the pillar that was supposed to ensure TPM
improvements never compromise worker safety or environmental compliance
— became a line item at the bottom of every TPM report: “Zero safety
incidents this month.” Which is good. But it’s also the same line that
was on the report before TPM existed, because safety was already being
tracked by a separate department with its own metrics, its own audits,
and its own culture.
TPM was supposed to integrate safety into the fabric of equipment
management — operators who identify hazards as part of their daily
rounds, maintenance procedures designed for safe execution, machine
modifications that eliminate ergonomic strain. Some of that happened.
Most of it didn’t, because the safety department still owns safety, the
maintenance department still owns maintenance, and TPM is just the name
for the binder that sits on the shelf between them.
Why It Breaks: The Three Structural Failures
Strip away the individual pillar failures and you find three
structural problems that explain why TPM so rarely delivers on its
promise.
The first is the ownership problem. TPM requires a
fundamental shift in who owns equipment reliability. Operators are asked
to take ownership of machines they don’t fully understand, using skills
they were never properly taught, with accountability for metrics they
can’t meaningfully influence. Maintenance technicians are asked to share
ownership with operators whom they see as unqualified, in a culture where
knowledge has always been power and sharing it feels like giving that
power away. The ownership transfer was supposed to be gradual and
supported. In most plants it was abrupt and unsupported, which means it
never actually happened.
The second is the time problem. TPM is not a project
you overlay on top of existing work. It is a different way of working
that requires reallocation of time — operator time for maintenance
tasks, cross-functional team time for improvement work, training time
for skill development. Plants that try to add TPM to an already-full
workload get partial execution at best. The cleaning gets done because
it’s visible and simple. The analysis, the training, the improvement
work — anything that requires sustained cognitive effort — gets
shortchanged because there’s no time carved out for it.
The third is the measurement problem. TPM’s success
metrics — OEE, MTBF, MTTR, maintenance cost ratio — are lagging
indicators that are influenced by dozens of factors beyond TPM
implementation. When OEE goes up, TPM gets the credit. When it goes
down, the market, the material quality, or the aging equipment gets the
blame. This measurement ambiguity makes it impossible to hold TPM
accountable in a rigorous way, which means nobody can definitively say
whether their TPM program is working or just consuming effort. And when
you can’t measure whether something is working, the path of least
resistance is to declare it working and move on to the next
initiative.
The OEE Trap Within TPM
Here’s a pattern that plays out repeatedly: a plant launches TPM,
declares OEE as its north-star metric, and then watches the OEE number
climb month after month. Celebrations ensue. Bonuses are paid. And then,
two years later, a major breakdown takes down a critical line for a
week, and the investigation reveals that the machine had been
deteriorating for months — visible in the vibration data, audible in the
bearing noise, detectable in the rising minor-stop frequency — but
nobody noticed because the OEE number looked fine.
This happens because OEE, as calculated in most plants, is the product
of three factors (availability, performance, and quality), and each of
them can be optimized in ways that mask underlying problems.
Availability can be propped up by deferring maintenance. Performance can
be inflated by running faster than the process was designed for, trading
quality for speed. And quality — the one factor that should be
non-negotiable — can be quietly redefined to exclude certain defect
categories that are “being addressed separately.”
The OEE number that TPM was supposed to improve through genuine
equipment excellence instead gets improved through accounting, and the
improvement becomes self-validating evidence that TPM is working.
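A short Python sketch shows how little it takes to move the number. It uses the common textbook definition of OEE as availability times performance times quality. The shift figures are invented, and the "excluded" defect category is a hypothetical stand-in for the redefinition described above.

```python
def oee(planned_min, downtime_min, ideal_cycle_min, total_count, good_count):
    """OEE as availability x performance x quality (common textbook definition)."""
    run_min = planned_min - downtime_min
    availability = run_min / planned_min
    performance = (ideal_cycle_min * total_count) / run_min
    quality = good_count / total_count
    return availability * performance * quality


# Hypothetical shift: 480 planned minutes, 60 minutes down, 1.0 min ideal cycle,
# 378 parts produced, 38 of them defective.
honest = oee(480, 60, 1.0, 378, good_count=378 - 38)

# Same shift, but 30 defects sit in a category "being addressed separately"
# and are no longer counted against quality.
reported = oee(480, 60, 1.0, 378, good_count=378 - 8)

print(f"honest OEE:   {honest:.1%}")
print(f"reported OEE: {reported:.1%}")
```

Nothing about the machine changed between the two calls. Because the three factors multiply out to good parts times ideal cycle time divided by planned time, any change to what counts as "good" moves OEE directly.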
What Genuine TPM Looks Like
For contrast, consider what TPM looks like in organizations where it
actually works.
In these organizations, the operators do know their
machines. Not because they were handed a checklist, but because they
were given months of structured training — classroom time, hands-on time
with cutaway models, mentoring from maintenance technicians, and
examinations that tested real understanding. They can describe what each
gauge means, what each sound indicates, what each temperature reading
implies. When something changes, they notice — not because a form tells
them to inspect, but because they have a mental model of what “normal”
looks like and the deviation registers automatically.
In these organizations, maintenance planners have protected time to
plan. They don’t get pulled into firefighting because the firefighting
has been reduced by the autonomous maintenance work that operators
genuinely own. The predictive maintenance program has baselines built
over years, with trending that actually predicts failures. When the
vibration analyst flags a bearing, the work order is generated,
prioritized, and scheduled — and the bearing is replaced during planned
downtime, not after a catastrophic failure.
In these organizations, the Focused Improvement team is a real team
with real time — not forty-five minutes a month, but dedicated hours
each week, with the authority to make changes and the resources to
implement them. Their improvements are documented, standardized, and
shared across the plant so that the same problem doesn’t get solved
three times on three different lines.
And in these organizations, nobody talks about TPM as a program. They
talk about it as the way they work. The pillars don’t have launch dates
because they don’t have endpoints. The initiative didn’t end — it became
the culture.
The Path Back
If your TPM program has become what most TPM programs become — a
cleaning schedule with pillars — the path back starts with honesty about
where you are.
Stop measuring pillar maturity on a five-point scale and start
measuring outcomes: how many of your operators can accurately describe
the top three failure modes of their primary machine? How many of your
PMs are based on actual failure data versus default schedules? What
percentage of your maintenance budget goes to planned versus unplanned
work — and is that ratio actually improving, or has it been flat for two
years while your TPM dashboard shows green?
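The planned-versus-unplanned question is one you can answer from work order history alone. The Python sketch below is one possible way to do it. The record layout (year, kind, cost) and the figures are assumptions for illustration, not an export format from any particular maintenance system.

```python
from collections import defaultdict


def planned_share_by_year(work_orders):
    """Share of maintenance cost spent on planned work, per year.

    work_orders is an iterable of (year, kind, cost) tuples, where kind is
    "planned" or "unplanned" (hypothetical record layout).
    """
    totals = defaultdict(float)
    planned = defaultdict(float)
    for year, kind, cost in work_orders:
        totals[year] += cost
        if kind == "planned":
            planned[year] += cost
    return {
        year: planned[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


# Hypothetical two years of maintenance spending.
orders = [
    (2023, "planned", 40_000), (2023, "unplanned", 60_000),
    (2024, "planned", 41_000), (2024, "unplanned", 59_000),
]

shares = planned_share_by_year(orders)
for year, share in shares.items():
    print(f"{year}: {share:.0%} planned")
```

In this made-up case the planned share moves from 40% to 41% in a year, which is close to flat, whatever color the dashboard shows.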
Re-invest in training. Real training. Not a lunch-and-learn or a
video module, but structured, multi-week, hands-on training with
assessments that test actual capability. This is expensive. It is also
the single highest-return investment in equipment reliability that
exists.
Protect time for improvement work. If you can’t afford to pull people
off the line for four hours a week to work on the losses that are
costing you twenty hours a week in downtime, then you have decided that
the downtime is acceptable. Say that out loud. Then decide if you mean
it.
And maybe most importantly, stop telling yourselves that TPM is
working because the dashboard says so. The dashboard measures activity.
Activity is not improvement. Improvement is what happens when an
operator catches a developing fault that would have been a four-hour
line stoppage, and the reason they caught it is that someone taught them
what to look for and gave them the authority to act on what they
found.
That’s the TPM you were promised. That’s the TPM almost nobody
has.
Peter Stasko is a Quality Architect with over 25
years of experience in manufacturing quality, process improvement, and
operational excellence. He has implemented and rescued quality systems
across automotive, electronics, and heavy industry on three
continents.