Quality and the Flaw of Averages: When Your Organization Designs for the Average Customer and Fails Every Single One

The Dimension That Should Have Worked

The part measured 12.450 mm — dead center of the specification. Every
sample the lab pulled from the first production run landed within
tolerance. The Cpk was 1.67, the histogram was a beautiful bell curve,
and the quality engineer signed off with confidence.

Six weeks later, customers had returned 40% of the shipment.

Not because the average was wrong. The average was perfect. The
problem was that no individual customer experienced the average.
Customer A used the part in a high-temperature environment where thermal
expansion pushed it out of fit. Customer B assembled it with a mating
component on the tight end of its own tolerance. Customer C ran it at
high speed, where vibration amplified a minor imbalance. Each customer
experienced the part in a specific context — and the average
specification addressed none of them.

This is the Flaw of Averages, and it is quietly undermining more
quality systems than any of us would like to admit.

What Is the Flaw of Averages?

The concept was formalized by Sam Savage, a Stanford professor who
spent decades showing that plans based on average assumptions are wrong
on average. The idea is deceptively simple: when you replace an
uncertain quantity with its average value, the results you calculate are
systematically, sometimes catastrophically, incorrect.

In quality management, this manifests in ways that are both pervasive
and invisible:

- You set tolerance windows around the nominal dimension, ignoring the
  distribution of actual conditions your product will face
- You plan capacity based on average demand, leaving you helpless
  during peaks and wasteful during valleys
- You design ergonomics for the average operator, creating
  workstations that fit nobody perfectly
- You calculate failure rates as averages across product families,
  masking the fact that one configuration accounts for 80% of field
  returns
- You schedule preventive maintenance at average intervals,
  over-servicing some machines and under-servicing others

The average is a statistical abstraction. It does not exist in
reality. No customer is average. No process runs at average. No operator
is average. No shift is average. And yet entire quality systems are
built on the assumption that designing for the average is the same as
designing for everyone.

Why It Feels So Right

The Flaw of Averages is seductive because it simplifies a complex
world. When your boss asks “What’s the dimension?” it’s easier to say
“12.45 mm” than to hand over a probability distribution. When your
production planner asks “How many do we make?” it’s easier to cite the
monthly average than to explain demand variability.

Reduction to the average feels like clarity. It feels like
professionalism. It feels like you have the answer.

But you don’t have the answer. You have a number that is, by
definition, wrong for every individual case.

Consider the old joke about the statistician who drowns in a river
that averages three feet deep. It’s funny until you realize that your
quality system is the statistician, and your customers are drowning in
the deep spots while you point at the average depth and declare it
safe.

The Manufacturing Manifestations

Tolerance Stack-Up

In assembly, individual parts may each sit comfortably within
tolerance, but when you stack them together, the cumulative variation
can push the assembly out of spec. Engineers who design to nominal
dimensions — each part centered on its average — miss the fact that real
parts cluster, shift, and drift. The average stack-up looks perfect. The
actual stack-up is a disaster waiting to happen.

Tolerance analysis using RSS (Root Sum of Squares) or Monte Carlo
simulation accounts for the distribution, not just the average. But many
organizations still design to nominal and hope for the best. Hope is not
a quality strategy.
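
To make the difference concrete, here is a minimal sketch comparing a worst-case stack, an RSS stack, and a Monte Carlo estimate. The four-part stack, its tolerances, and the assumption that each part is normally distributed with its tolerance at three standard deviations are all hypothetical, chosen only for illustration.

```python
import math
import random

# Hypothetical stack of four parts: (nominal mm, symmetric tolerance mm).
# These values are illustrative, not taken from any real drawing.
PARTS = [(12.45, 0.05), (8.00, 0.04), (20.00, 0.08), (5.50, 0.03)]


def worst_case(parts):
    """Arithmetic stack: every part at its limit at the same time."""
    return sum(tol for _, tol in parts)


def rss(parts):
    """Root Sum of Squares: statistical stack for independent parts."""
    return math.sqrt(sum(tol ** 2 for _, tol in parts))


def monte_carlo_out_of_spec(parts, assembly_tol, trials=100_000, seed=1):
    """Estimate the fraction of assemblies whose total length falls
    outside nominal +/- assembly_tol.

    Assumes each part is normally distributed, centered on nominal,
    with its tolerance equal to 3 standard deviations.
    """
    rng = random.Random(seed)
    nominal_total = sum(nom for nom, _ in parts)
    failures = 0
    for _ in range(trials):
        total = sum(rng.gauss(nom, tol / 3) for nom, tol in parts)
        if abs(total - nominal_total) > assembly_tol:
            failures += 1
    return failures / trials


if __name__ == "__main__":
    print(f"Worst-case stack: +/-{worst_case(PARTS):.3f} mm")
    print(f"RSS stack:        +/-{rss(PARTS):.3f} mm")
    rate = monte_carlo_out_of_spec(PARTS, assembly_tol=0.08)
    print(f"Simulated out-of-spec rate at +/-0.08 mm: {rate:.2%}")
```

At nominal, this stack has zero error. The simulation shows what the nominal hides: if the assembly tolerance is tighter than the RSS stack, a measurable share of assemblies falls outside it even though every part is in tolerance. Real parts are rarely perfectly centered or normal, so measured distributions should replace these assumptions wherever they are available.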

Process Capability Misinterpretation

A Cpk of 1.33 tells you that the process mean sits about four standard
deviations from the nearer specification limit. But that single number
hides critical information: Is the distribution normal? Is it skewed?
Are there multiple modes? Does it drift from shift to shift? A process
with a beautiful Cpk can still produce defects if the distribution is
anything other than what you assumed.

I once audited a plant that proudly reported a Cpk of 2.0 on a
critical dimension. When I asked to see separate x-bar charts for each
shift, we discovered that Shift A ran consistently at the low end of
tolerance and Shift B ran at the high end. The combined data averaged
out beautifully. But no individual shift was centered, and the customer
experienced completely different parts depending on when they were
made.
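
A minimal sketch of how this can happen, using simulated data rather than the plant's measurements. It assumes the headline figure was computed from the grand mean and the within-shift spread, which is one common way such a number arises. The spec limits, shift centers, and sigma are all hypothetical.

```python
import random
import statistics

LSL, USL = 12.40, 12.50  # hypothetical spec limits in mm


def cpk(mean, sigma, lsl=LSL, usl=USL):
    """Distance from the mean to the nearer limit, in units of 3 sigma."""
    return min(usl - mean, mean - lsl) / (3 * sigma)


def simulate_shift(center, sigma, n, rng):
    """Draw n normally distributed measurements around a shift's center."""
    return [rng.gauss(center, sigma) for _ in range(n)]


rng = random.Random(42)
# Hypothetical data: both shifts are tight, but sit at opposite ends.
shift_a = simulate_shift(12.42, 0.008, 500, rng)
shift_b = simulate_shift(12.48, 0.008, 500, rng)
combined = shift_a + shift_b

# Pooled within-shift spread, which ignores the offset between shifts.
within_sigma = (
    (statistics.variance(shift_a) + statistics.variance(shift_b)) / 2
) ** 0.5

# Headline number: grand mean combined with within-shift spread.
cpk_headline = cpk(statistics.mean(combined), within_sigma)
# Stratified view: each shift judged on its own mean and spread.
cpk_a = cpk(statistics.mean(shift_a), statistics.stdev(shift_a))
cpk_b = cpk(statistics.mean(shift_b), statistics.stdev(shift_b))
# Overall view: grand mean with the spread of all parts together.
ppk_overall = cpk(statistics.mean(combined), statistics.stdev(combined))

print(f"Headline Cpk (grand mean, within-shift sigma): {cpk_headline:.2f}")
print(f"Shift A Cpk: {cpk_a:.2f}   Shift B Cpk: {cpk_b:.2f}")
print(f"Overall capability on all parts:               {ppk_overall:.2f}")
```

With these made-up inputs the headline number lands near 2, while each shift on its own, and the overall spread of parts actually shipped, fall below 1. The same data support both stories. Only the stratified view matches what the customer receives.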

Inventory and Supply Chain

Planning inventory based on average lead times and average demand is
a recipe for stockouts during peaks and excess during valleys. The
average says you have enough. Reality says you ran out last Thursday and
won’t see the next shipment until Monday.
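
A minimal sketch of the effect, assuming weekly demand is roughly normal and stock is replenished to the same level every week. The demand figures are hypothetical, and both assumptions are simplifications.

```python
import random
from statistics import NormalDist

MEAN_DEMAND = 1000  # hypothetical units per week
SD_DEMAND = 200     # hypothetical week-to-week standard deviation


def stockout_rate(stock_level, periods=10_000, seed=7):
    """Fraction of simulated weeks in which demand exceeds stock on hand."""
    rng = random.Random(seed)
    short = 0
    for _ in range(periods):
        # Demand cannot be negative, so truncate the normal draw at zero.
        demand = max(0.0, rng.gauss(MEAN_DEMAND, SD_DEMAND))
        if demand > stock_level:
            short += 1
    return short / periods


# Plan A: stock exactly the average weekly demand.
stock_at_mean = MEAN_DEMAND
# Plan B: stock the 95th percentile of weekly demand.
stock_at_p95 = NormalDist(MEAN_DEMAND, SD_DEMAND).inv_cdf(0.95)

print(f"Stock {stock_at_mean:.0f}: stockout in "
      f"{stockout_rate(stock_at_mean):.0%} of weeks")
print(f"Stock {stock_at_p95:.0f}: stockout in "
      f"{stockout_rate(stock_at_p95):.0%} of weeks")
```

Stocking to the average means running short in roughly half of all weeks under these assumptions. The size of the buffer needed to avoid that depends on the spread of demand, which the average alone cannot tell you.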

Ergonomics and Workstation Design

Workstations designed for the “average” operator — average height,
average reach, average grip strength — fit almost no one. The 5th
percentile female operator can’t reach the controls. The 95th percentile
male operator hunches over all day. Both develop repetitive strain
injuries that manifest as quality defects: fatigue destroys attention,
discomfort creates shortcuts, and pain produces mistakes.

Maintenance Intervals

Setting maintenance schedules based on average machine runtime means
some machines get serviced too early (wasting money and downtime) while
others run too long between services (accumulating wear that degrades
precision and increases defect rates). Condition-based maintenance —
monitoring actual vibration, temperature, and wear — replaces the
average with reality.

The Statistical Foundation

Jensen’s Inequality, a fundamental result in probability theory,
provides the mathematical backbone for why the Flaw of Averages matters.
In its simplest form, it states that for a convex function, the average
of the function’s outputs is greater than or equal to the function
evaluated at the average input.

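What this means in practice: if the relationship between an input and
its outcome is nonlinear (and in quality management it almost always
is), then planning based on the average input will systematically
misjudge the real outcome. Where the relationship is convex, as with
costs or wear that accelerate toward the extremes, the plan
underestimates; where it is concave, the plan overestimates.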

Consider a simple example. A machine’s wear rate accelerates
nonlinearly with temperature. At 20°C, wear is negligible. At 40°C, it’s
moderate. At 60°C, it’s catastrophic. If your operating environment
averages 40°C but swings between 20°C and 60°C, planning for 40°C misses
the fact that the 60°C episodes are doing disproportionate damage. The
average plan says the machine will last 10,000 hours. Reality says it
fails at 6,000 because of the peaks.
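
The inequality can be checked numerically with a toy model. The wear curve below, in which wear doubles every 10°C, is an assumption made up for this sketch, as is the equal split of operating time between the three temperatures. It is not meant to reproduce the hour figures above.

```python
import statistics


def wear_rate(temp_c):
    """Hypothetical convex wear model: wear doubles every 10 degC.

    Units are arbitrary; the curve is an assumption for illustration only.
    """
    return 2 ** ((temp_c - 20) / 10)


# Hypothetical duty cycle: equal time at 20, 40 and 60 degC (mean = 40).
temps = [20, 40, 60]

# f(E[X]): the wear you plan for if you only look at the average temperature.
wear_at_mean_temp = wear_rate(statistics.mean(temps))
# E[f(X)]: the wear the machine actually accumulates, averaged over time.
mean_of_wear = statistics.mean([wear_rate(t) for t in temps])

print(wear_at_mean_temp)  # 4.0
print(mean_of_wear)       # 7.0
```

Under this assumed curve the machine wears 1.75 times faster than the average-temperature plan predicts, purely because of the time spent at 60°C. The steeper the curve, the wider that gap becomes.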

How to Fight the Flaw

Think in Distributions, Not Numbers

Train your engineers and quality professionals to think in
distributions. When someone presents an average, ask: “What does the
distribution look like?” When someone proposes a target, ask: “What’s
the spread, and what happens at the tails?”

This isn’t about making things more complicated. It’s about making
them more honest. A distribution tells you what the number hides. The
tails tell you where your defects live.

Use Monte Carlo Simulation

Monte Carlo simulation replaces single-point estimates with thousands
of scenarios drawn from actual distributions. Instead of asking “What
happens at the average?” it asks “What happens across the full range of
possibilities?”

For tolerance stack-ups, process capability projections, financial
planning of quality costs, and reliability predictions, Monte Carlo
turns the Flaw of Averages from a hidden trap into a visible, manageable
risk.

Design for the Extremes, Not the Center

In many quality applications, the extremes matter more than the
center. Designing a workstation? Design for the 5th and 95th percentile
operators, not the 50th. Setting safety stock? Plan for demand at the
95th percentile, not the mean. Establishing spec limits? Consider
worst-case usage conditions, not typical ones.

This is the principle behind robust design (Taguchi methods):
optimize the product so that it performs well across the full range of
operating conditions, not just at the nominal point.
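
As a sketch of what designing for the 5th and 95th percentiles looks like in data terms, the example below derives an adjustment range from a sample. The reach measurements are simulated and hypothetical, not real anthropometric data, and the 3 cm comfort band is likewise an assumption.

```python
import random
import statistics

rng = random.Random(3)
# Hypothetical operator reach data in cm (simulated, not real measurements).
reach_cm = [rng.gauss(75, 6) for _ in range(2000)]

# quantiles(n=20) returns 19 cut points: index 0 is the 5th percentile,
# index 18 is the 95th percentile.
cuts = statistics.quantiles(reach_cm, n=20)
p05, p95 = cuts[0], cuts[18]
mean_reach = statistics.mean(reach_cm)

# Assumed comfort band: a fixed workstation suits operators whose reach
# is within 3 cm of the value it was designed for.
COMFORT_CM = 3.0
fits_fixed = sum(abs(r - mean_reach) <= COMFORT_CM for r in reach_cm) / len(reach_cm)

print(f"Design-for-average target: {mean_reach:.1f} cm")
print(f"Share of operators it suits: {fits_fixed:.0%}")
print(f"Adjustable range needed (5th to 95th percentile): "
      f"{p05:.1f} to {p95:.1f} cm")
```

Under these assumptions the fixed, average-based design suits well under half of the operators, while an adjustable range between the two percentiles covers about 90% of them by construction.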

Stratify Your Data

When you aggregate data across shifts, machines, operators, or
product variants, the average can obscure critical differences. Always
ask whether the data should be stratified before it’s summarized.

The plant I mentioned earlier — the one with the beautiful Cpk of 2.0
— would have caught its problem immediately if it had been calculating
capability by shift instead of across shifts. The aggregate average was
lying. The stratified data was telling the truth.

Replace “What’s the Average?” With “What’s the Range?”

Change the language in your organization. When someone asks “What’s
the average defect rate?” respond with “The average is X, but it ranges
from Y to Z depending on [factor].” When a report presents a mean,
require that it include the standard deviation, the confidence interval,
or at minimum the min and max.
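
One lightweight way to enforce this is a reporting helper that never returns a mean on its own. The function name and the defect-rate figures below are hypothetical.

```python
import statistics


def describe(values):
    """Return the mean together with the context an average hides."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical weekly defect rates (%) for two lines with the same mean.
line_1 = [2.0, 2.1, 1.9, 2.0, 2.0]
line_2 = [0.5, 0.4, 0.6, 0.5, 8.0]

for name, data in (("Line 1", line_1), ("Line 2", line_2)):
    d = describe(data)
    print(f"{name}: mean {d['mean']:.2f}%, stdev {d['stdev']:.2f}, "
          f"range {d['min']:.1f}% to {d['max']:.1f}%")
```

Both lines report the same average defect rate, but the second line's range exposes a single bad week that the mean alone would hide.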

Culture change starts with vocabulary change. If your organization
stops treating averages as answers and starts treating them as
incomplete questions, you’ve already made progress against the Flaw of
Averages.

A Personal Observation

Over 25 years in quality, I’ve seen the Flaw of Averages cause more
damage than outright incompetence. Incompetence is visible — people know
when they don’t know something. The Flaw of Averages is invisible
because the average feels like knowledge. It feels precise. It feels
sufficient.

I remember sitting in a management review where a quality manager
reported that “average customer satisfaction is 4.2 out of 5.” The board
was pleased. Nobody asked about the distribution. I pulled the data
afterward and found that while roughly 60% of customers rated the
company 5/5, roughly 25% rated it 1/5 or 2/5. The average of 4.2 was
masking a bimodal
distribution — the company was simultaneously delighting a majority and
alienating a significant minority. The average said everything was fine.
The distribution said the company had a serious problem with a specific
segment that was being systematically ignored.

The customers giving 1/5 ratings were all from one geographic region
with unique environmental conditions the product wasn’t designed for.
The product was designed for average conditions. Those customers weren’t
average. And they were leaving.

The Bottom Line

The Flaw of Averages isn’t a statistical curiosity. It’s a practical
quality failure that plays out daily in manufacturing plants around the
world. Every time you design to the average, plan to the average, or
report the average without context, you are making decisions based on a
number that no real customer, real process, or real product will ever
actually experience.

Your customers are not average. Your processes are not average. Your
operators are not average. Your machines are not average. Stop
designing, planning, and measuring as if they were.

The average is not your customer. Design for reality.

Peter Stasko is a Quality Architect with 25+ years of experience
transforming organizations across automotive, aerospace, and
pharmaceutical industries.
