Andon Systems: When Your Stop-the-Line Authority Becomes a Button Nobody Presses — and the Empowerment You Engineered Became the Culture of Looking the Other Way

The word “andon” (行灯) originally referred to paper lanterns —
portable lights that illuminated the path in old Japan. In
manufacturing, the andon system serves the same metaphorical purpose: it
makes problems visible. When something goes wrong on the production
line, the operator pulls a cord, presses a button, or activates a
signal, and the entire line stops. Lights flash. Music plays.
Supervisors come running. The problem gets fixed. Production
resumes.

It sounds simple. It is simple. And yet, in the vast majority of
factories that implement andon systems, the cord hangs untouched, the
button gathers dust, and the lights that were supposed to illuminate
problems serve instead as decorative reminders of a culture that once
aspired to transparency.

This is the story of how the most powerful quality tool ever devised
becomes, through a thousand small compromises, nothing more than
expensive overhead lighting.

The Promise

The andon system is built on a radical principle: the operator closest
to the work knows best when something is wrong. Not the engineer in the
office. Not the manager in the meeting. The person whose hands are on
the product, who sees it, feels it, and hears it every forty-five
seconds, all day long.

When Toyota implemented andon as part of the Toyota Production
System, it made a promise to every worker on the line: if you see a
problem, stop the line. We will come. We will help. You will not be
punished.

This promise is the foundation of jidoka: automation with a human
touch, often rendered in English as “autonomation.” It is the mechanism
by which defects are caught at the source
rather than discovered downstream, where they become expensive,
invisible, and contagious. The andon cord is not a technical device. It
is a social contract.

And like most social contracts, it works beautifully until it
doesn’t.

The Slow Death

Here is how the andon system dies in most organizations. Not with a
bang, but with a shrug.

Week One: The andon system is installed. Operators
are trained. “Pull the cord for any abnormality,” the trainer says. “You
have the authority to stop the line.” Operators nod. They are skeptical
but willing.

Week Two: An operator pulls the cord. The line
stops. Lights flash. And then… nothing. The supervisor is in a
meeting. The team leader is on break. The maintenance technician is
working on another machine. The line stays down for eleven minutes. When
the supervisor finally arrives, he asks, “What happened?” The operator
explains. The supervisor sighs, writes something on a clipboard, and
says, “Okay, let’s get it running.” No root cause analysis. No
countermeasure. Just restart.

Week Three: Another operator pulls the cord. This
time, the supervisor arrives quickly — but he’s frustrated. “Is it
really a problem?” he asks. “Can we keep running and fix it at the
break?” The operator, who has now been trained by experience to read the
room, says, “Yeah, it’s probably fine. Let’s keep running.”

Month Two: An operator pulls the cord for a
recurring issue — a fixture that doesn’t quite hold the part correctly.
The line stops. The supervisor arrives, looks at the problem, and says,
“We know about this one. Engineering is working on it. Let’s just run
for now.” The operator nods. The line restarts. The fixture continues to
misbehave.

Month Three: The cord is not pulled for this issue
anymore. The operator has learned that pulling it produces no action,
only delay and social discomfort. The defect continues. It is discovered
at final inspection, where it costs twelve times more to fix. But nobody
connects this cost to the unpulled cord.

Month Six: A new operator joins the line. On her
first day, she notices something wrong and reaches for the cord. The
operator next to her says, “Don’t pull that. They’ll just get mad. Just
run it.” She learns the system in thirty seconds — the real system, not
the one in the training manual.

Year One: The andon board shows zero activations for
the past month. Management congratulates itself on how well the line is
running. The quality data shows a steady increase in escape defects, but
nobody connects this to the silent andon system. The two datasets live
in different spreadsheets, reviewed by different people, in different
meetings.

Year Two: A consultant visits the factory and notes
the andon system in her report. “Mature andon implementation,” she
writes. “Very few line stops — indicating stable process.” The factory
manager includes this quote in his annual review presentation.

What Went Wrong

The failure of andon is never a failure of the hardware. The lights
work. The cords work. The buttons work. The failure is always, always,
always cultural.

Here are the specific mechanisms by which organizations destroy their
andon systems:

1. The Response Time Erosion

An andon system is only as good as the response it generates.
Toyota’s standard is ninety seconds — a supervisor reaches the problem
location within ninety seconds of the pull. Not ninety minutes. Not
“when the meeting is over.” Ninety seconds.

Most companies never set a response-time standard. The response
arrives when it arrives. Operators learn that pulling the cord means
waiting fifteen minutes for someone who may or may not be able to help.
The math is brutal: if the line costs $2,000 per minute and the average
response takes fifteen minutes, each pull costs $30,000 in lost
production time. Very quickly, the unspoken cost-benefit analysis
shifts: it is cheaper to let the defect through than to stop the line
for an issue that “might not matter.”
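The arithmetic above is easy to rerun with your own numbers. The Python sketch below uses the $2,000-per-minute figure and the two response times mentioned in this section; the function name `pull_cost` is only illustrative, and the calculation counts waiting time alone, not the time needed to actually fix the problem.

```python
def pull_cost(line_cost_per_minute: float, response_minutes: float) -> float:
    """Lost production while the stopped line waits for a responder."""
    return line_cost_per_minute * response_minutes


LINE_COST_PER_MINUTE = 2_000  # example figure from the text

slow = pull_cost(LINE_COST_PER_MINUTE, 15)   # fifteen-minute response
fast = pull_cost(LINE_COST_PER_MINUTE, 1.5)  # ninety-second response

print(f"15-minute response: ${slow:,.0f} per pull")
print(f"90-second response: ${fast:,.0f} per pull")
```

Under these assumptions the same pull costs $30,000 with a fifteen-minute response and $3,000 with a ninety-second one, which is why the response time, not the pull itself, drives the operator's unspoken calculation.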

2. The Blame Migration

When an operator pulls the cord, the question that follows reveals
everything about the culture:

  • Healthy culture: “What happened? What did you
    see?”
  • Broken culture: “Why did you stop the line?”

The second question is lethal. It reframes the andon pull as an act
of disruption rather than an act of quality. The operator who hears it
understands immediately: I am being blamed for the stop, not thanked
for the catch. The next time she sees the same problem, she will not
pull the cord.

3. The Countermeasure Vacuum

An andon pull is not an endpoint. It is the beginning of a
problem-solving process. The cord gets pulled because something is
wrong; the real work begins after the line stops.

In most organizations, this problem-solving step is missing. The
supervisor arrives, confirms the problem, and restarts the line with a
workaround: “Just put it on hold and tag it.” “Run it through and we’ll
sort at the end.” “Use the backup fixture.” The defect is deferred, not
solved. The operator sees that pulling the cord produces a temporary
patch, not a real fix. The same problem will recur tomorrow, and pulling
the cord again will produce the same non-solution. Why bother?

4. The Volume Problem

Here’s a paradox: too many andon pulls are a sign of problems, but
too few are a sign of fear. Healthy systems have a moderate, steady rate
of pulls — not zero and not constant. New systems often have high pull
rates as operators discover and report long-tolerated issues. This is
good. It means the system is working.

But management sees the high pull rate and interprets it as
instability. “Why are we stopping the line so often?” They push for
fewer stops. Supervisors pass the pressure to operators. The pull rate
drops. Management is satisfied. The hidden defect rate rises.

The correct response to a high andon rate is not fewer pulls. It is
faster problem-solving so the same issues don’t recur.

5. The Hierarchy Override

In many factories, the operator’s authority to stop the line is
theoretical. The practical authority belongs to the supervisor, who can
override the andon pull and restart the line against the operator’s
judgment. This happens most often at the end of a shift when production
targets are at risk.

Operator: “There’s something wrong with this part. I want to
stop.”

Supervisor: “We’re thirty units short. Run it and tag it.”

This single interaction destroys the andon system more effectively
than any mechanical failure. It tells every operator on the line:
your judgment is subordinate to the production schedule. The
cord is a suggestion, not a right. And once that understanding spreads —
and it spreads within hours — the system is dead.

The Cost of Silence

The most insidious aspect of andon failure is its invisibility. When
a cord is not pulled, nothing happens. No lights flash. No music plays.
No supervisor runs over. The line continues smoothly, the numbers look
good, and the defects flow downstream like sediment in a river —
invisible until they accumulate somewhere expensive.

Consider: a casting process has a temperature drift. The operator
notices the parts feel slightly different — warmer than usual, or the
color is off. In a healthy andon culture, she pulls the cord. The
process is checked. The temperature controller is recalibrated. Ten
minutes of line time is lost. Twenty parts are quarantined. Problem
solved.

In a broken andon culture, she doesn’t pull the cord. The parts
continue. Three hundred parts later, a downstream operation discovers
that the entire batch has excessive porosity. The root cause
investigation takes two weeks. The scrap cost is $47,000. The customer
ship date is missed. And the original observation — “the parts felt
different” — never surfaces because nobody asks and nobody records.

The andon cord that was not pulled cost forty-seven thousand dollars in
scrap alone. The ten minutes of line time that were saved were
negligible by comparison. But the ten minutes were visible and the
$47,000 was not, so the organization optimized for the number it could
see and ignored the one it couldn’t.
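To make the comparison concrete, here is a small Python sketch of the trade-off in the casting example. The $47,000 scrap cost and the ten-minute stop come from the example; the $2,000-per-minute line cost is an assumption borrowed from the response-time discussion earlier, and the investigation time, the missed ship date, and the twenty quarantined parts are left out because the text gives no figures for them.

```python
SCRAP_COST = 47_000   # scrap cost when the cord is not pulled (from the example)
STOP_MINUTES = 10     # line time lost when the cord is pulled (from the example)


def stop_cost(line_cost_per_minute: float, minutes: float = STOP_MINUTES) -> float:
    """Production lost to the stop itself."""
    return line_cost_per_minute * minutes


# Line cost per minute at which the stop would cost as much as the scrap.
break_even_rate = SCRAP_COST / STOP_MINUTES

# Assumed line cost, borrowed from the earlier $2,000-per-minute example.
assumed_rate = 2_000
net_saving = SCRAP_COST - stop_cost(assumed_rate)

print(f"Break-even line cost: ${break_even_rate:,.0f} per minute")
print(f"Net saving from pulling at ${assumed_rate:,} per minute: ${net_saving:,.0f}")
```

On these numbers the stop pays for itself on any line that costs less than $4,700 per minute to idle, before counting the costs the sketch ignores.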

Rebuilding What’s Broken

If your andon system is moribund — if the cord hangs untouched and
the board shows zeros — you can rebuild it. But it requires a sequence
that most organizations get backwards.

Step 1: Fix the response before you fix the pulling.
Before asking operators to pull the cord more often, ensure that every
pull produces a fast, respectful, effective response. Assign dedicated
responders. Set a response-time standard (Toyota’s ninety seconds is the
benchmark). Measure it. Post it publicly. If you cannot guarantee a
response, you have no right to ask for a pull.
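One possible way to measure the response is sketched below in Python. The log format, the timestamps, and the names `pulls` and `response_summary` are hypothetical; only the ninety-second standard comes from the text. Pulls that nobody answered are counted as misses rather than dropped, since they are the ones operators remember.

```python
from datetime import datetime, timedelta
from statistics import median

RESPONSE_STANDARD = timedelta(seconds=90)

# Hypothetical log: (time of pull, time a responder arrived, or None)
pulls = [
    (datetime(2024, 3, 4, 8, 15, 0), datetime(2024, 3, 4, 8, 16, 10)),  # 70 s
    (datetime(2024, 3, 4, 9, 40, 0), datetime(2024, 3, 4, 9, 51, 0)),   # 660 s
    (datetime(2024, 3, 4, 13, 5, 0), datetime(2024, 3, 4, 13, 6, 30)),  # 90 s
    (datetime(2024, 3, 4, 15, 20, 0), None),                            # nobody came
]


def response_summary(pulls, standard=RESPONSE_STANDARD):
    """Summarize how often a responder arrived within the standard."""
    answered = [arrived - pulled for pulled, arrived in pulls if arrived is not None]
    within = sum(1 for r in answered if r <= standard)
    return {
        "pulls": len(pulls),
        "unanswered": len(pulls) - len(answered),
        "within_standard": within,
        # Unanswered pulls stay in the denominator: they count as misses.
        "share_within_standard": within / len(pulls) if pulls else 0.0,
        "median_response_s": (
            median(r.total_seconds() for r in answered) if answered else None
        ),
    }


summary = response_summary(pulls)
for key, value in summary.items():
    print(f"{key}: {value}")
```

A summary like this is what would be posted publicly: the number of pulls, how many were answered within the standard, and how many were not answered at all.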

Step 2: Celebrate the pulls, not just the catches.
When an operator pulls the cord, thank them. Publicly. Every time. Even
if it turns out to be a false alarm. The behavior you want is the
pulling, not the accuracy of the diagnosis. Diagnosis improves with
practice; pulling dies without reinforcement.

Step 3: Close the loop. Every andon pull should
produce a visible countermeasure within 24 hours. If the fix takes
longer, communicate the timeline. The operator who pulled the cord
should know exactly what happened as a result of their action. If the
loop is not closed, the operator assumes nothing happened — because from
their perspective, nothing did.
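A loop-closing rule like this can be checked mechanically. The Python sketch below assumes a hypothetical record per pull with two optional timestamps, one for a posted countermeasure and one for a communicated timeline; the field names, IDs, and dates are invented for illustration. It lists every pull whose 24-hour window has passed with neither.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

LOOP_DEADLINE = timedelta(hours=24)


@dataclass
class Pull:
    pull_id: str
    pulled_at: datetime
    countermeasure_at: Optional[datetime] = None  # countermeasure made visible
    timeline_told_at: Optional[datetime] = None   # operator told when the fix is due


def missed_deadline(pulls, now):
    """IDs of pulls whose 24-hour window passed without a countermeasure
    or a communicated timeline arriving inside that window."""
    missed = []
    for p in pulls:
        deadline = p.pulled_at + LOOP_DEADLINE
        if now < deadline:
            continue  # still inside the window, nothing to flag yet
        responses = [t for t in (p.countermeasure_at, p.timeline_told_at) if t is not None]
        if not any(t <= deadline for t in responses):
            missed.append(p.pull_id)
    return missed


now = datetime(2024, 3, 6, 12, 0)
log = [
    Pull("P1", datetime(2024, 3, 4, 8, 0), countermeasure_at=datetime(2024, 3, 4, 16, 0)),
    Pull("P2", datetime(2024, 3, 4, 10, 0)),                                      # no answer
    Pull("P3", datetime(2024, 3, 5, 9, 0), timeline_told_at=datetime(2024, 3, 5, 15, 0)),
    Pull("P4", datetime(2024, 3, 6, 7, 0)),                                       # window open
    Pull("P5", datetime(2024, 3, 4, 11, 0), countermeasure_at=datetime(2024, 3, 6, 9, 0)),
]

print(missed_deadline(log, now))
```

In this made-up log, P2 and P5 are flagged: P2 never received an answer, and P5 received one only after the window had closed. Whether a late answer should count as a miss is a policy choice; the sketch treats it as one.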

Step 4: Track the pull rate as a health metric. Zero
pulls is not a goal. It is a warning sign. A healthy line has a baseline
pull rate that reflects the reality of the process. Track deviations
from this baseline — both up and down. A sudden drop in pulls is as
concerning as a sudden spike in defects.
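As a minimal sketch of tracking deviations in both directions, the Python below compares weekly pull counts against a baseline taken from the first weeks of data. The weekly counts, the eight-week baseline, and the 50 percent tolerance are all assumptions chosen for illustration, not recommended values.

```python
from statistics import mean


def flag_pull_rate(weekly_pulls, baseline_weeks=8, tolerance=0.5):
    """Flag weeks whose pull count deviates from the baseline by more than
    `tolerance` (as a fraction of the baseline) in either direction."""
    baseline = mean(weekly_pulls[:baseline_weeks])
    flags = []
    later_weeks = enumerate(weekly_pulls[baseline_weeks:], start=baseline_weeks + 1)
    for week, count in later_weeks:
        if count == 0:
            flags.append((week, count, "zero pulls"))
        elif count < baseline * (1 - tolerance):
            flags.append((week, count, "drop"))
        elif count > baseline * (1 + tolerance):
            flags.append((week, count, "spike"))
    return baseline, flags


# Hypothetical weekly pull counts for one line.
weekly_pulls = [12, 10, 11, 13, 12, 9, 11, 10, 11, 5, 3, 0, 19]

baseline, flags = flag_pull_rate(weekly_pulls)
print(f"baseline: {baseline} pulls per week")
for week, count, label in flags:
    print(f"week {week}: {count} pulls ({label})")
```

With these invented counts the baseline is 11 pulls per week, and weeks 10 through 13 are flagged: two drops, one week of zero pulls, and one spike. The point of the design is that a quiet week triggers the same review as a noisy one.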

Step 5: Protect the operator’s authority. Make it
structurally impossible for a supervisor to override an andon pull for
production reasons. The operator’s authority to stop the line must be
absolute, written into policy, and enforced by leadership. If production
pressure can override quality judgment, the system is a theater set —
convincing from the audience, hollow backstage.
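Where the andon is software-controlled, one way to read "structurally impossible" is that the restart path simply has no override branch. The Python sketch below is a hypothetical model, not a description of any real andon controller: the class `LineStop`, the function `restart_line`, and the IDs are invented, and a real policy would also need rules for shift changes and absent operators.

```python
from dataclasses import dataclass


@dataclass
class LineStop:
    pulled_by: str          # ID of the operator who pulled the cord
    released: bool = False

    def release(self, operator_id: str) -> None:
        """Only the operator who pulled the cord may clear the stop."""
        if operator_id != self.pulled_by:
            raise PermissionError(
                "only the operator who pulled the cord can release the stop"
            )
        self.released = True


def restart_line(stop: LineStop) -> str:
    """Restart is refused until the stop has been released; there is
    deliberately no supervisor or production-target bypass."""
    if not stop.released:
        raise PermissionError("stop has not been released by the operator")
    return "line running"


stop = LineStop(pulled_by="op-17")
try:
    stop.release("sup-02")  # a supervisor tries to clear the stop
except PermissionError as err:
    print("refused:", err)

stop.release("op-17")       # the operator confirms the problem is handled
print(restart_line(stop))
```

The design choice is that authority is enforced by the absence of a code path rather than by a rule people are trusted to follow under end-of-shift pressure.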

The Lantern’s Light

The original andon lanterns guided travelers along dark roads. They
didn’t eliminate the darkness; they made the path visible. The
manufacturing andon serves the same purpose. It does not prevent defects
by itself. It makes defects visible — immediately, at the source, by the
person best positioned to see them.

When you install an andon system, you are not installing hardware.
You are installing trust. You are telling every operator: We believe
you. We will come when you call. We will fix what you find. We will not
punish you for seeing.

When that trust is broken — by slow responses, by blame, by
production overrides, by unsolved problems — the hardware remains but
the system is gone. The lights are on, but nobody’s home.

The question is not whether your factory has an andon system. The
question is whether your operators believe it works.

Ask them. Their answer is your andon.


Peter Stasko is a Quality Architect with over 25 years of
experience in manufacturing quality management, process improvement, and
organizational transformation. He has implemented andon systems across
automotive, electronics, and heavy industry — and watched too many of
them die the slow death of good intentions.
