Quality and the Law of Large Numbers: When Your Organization’s Small Samples Lead to Big Mistakes — and the Three Defects You Saw This Morning Were Never Enough to Tell You Anything Real

The Inspector Who Knew Too Little

Carlos had been running the final inspection line at a precision
machining plant for eleven years. He knew every tolerance, every surface
finish requirement, every visual defect category by heart. When a batch
of 2,000 aerospace housings came through on a Tuesday morning, he pulled
his standard sample of five units, checked them against the spec, signed
off the batch, and moved on.

Three of those five were perfect. Two had minor dimensional variation
— still within spec, but trending toward the upper limit. Carlos noted
it mentally and said nothing. The batch shipped.

Eight weeks later, a field failure on an aircraft hydraulic system
was traced back to those housings. The investigation revealed that a
tool wear pattern had been causing a gradual shift in a critical bore
dimension. Out of the full batch of 2,000 units, 34% were actually out
of specification.

Carlos had inspected five.

He missed it not because he was incompetent. He missed it because he
trusted five data points to represent two thousand. And that trust —
that deep, pervasive, almost invisible trust in small samples — is one
of the most dangerous assumptions in quality management today.

This is the story of the Law of Large Numbers, and why understanding
it separates organizations that control their processes from
organizations that merely hope they do.

What the Law of Large Numbers Actually Says

The Law of Large Numbers is one of the foundational principles of
probability theory. Its first version was proved by mathematician Jacob
Bernoulli and published posthumously in 1713. In its simplest form, it
states that as the size of a random sample increases, the sample average
converges toward the true population average.

This sounds abstract. It is not.

Every time your inspector checks five parts out of five hundred,
every time your quality engineer calculates a Cpk from thirty
measurements, every time your manager reviews a month of complaint data
and declares a trend, the Law of Large Numbers is either your ally or
your enemy. There is no neutral ground.
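The convergence the law describes is easy to watch in a short simulation. The sketch below is illustrative only: the function name, the fixed seed, and the 34% defect rate (borrowed from the opening story) are assumptions, and it treats every part as an independent draw.

```python
import random

def running_defect_rate(true_rate, sample_sizes, seed=0):
    """Simulate inspecting parts one at a time and return the observed
    defect rate at each checkpoint in sample_sizes (ascending order)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    defects = 0
    inspected = 0
    results = {}
    for n in sample_sizes:
        while inspected < n:
            defects += rng.random() < true_rate  # True counts as 1
            inspected += 1
        results[n] = defects / n
    return results

# Hypothetical process whose true defect rate is 34%
for n, rate in running_defect_rate(0.34, [5, 50, 500, 5000, 50000]).items():
    print(f"n={n:>6}: observed defect rate = {rate:.3f}")
```

The small checkpoints can land almost anywhere; the large ones settle close to the true rate.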

The practical implications are staggering:

  • Small samples are noisy. A sample of five parts
    tells you almost nothing about the true defect rate of a batch of two
    thousand. The random variation in such a tiny sample can make a good
    process look terrible or a terrible process look fine.
  • Convergence is not optional. As your sample grows,
    your estimate of reality improves. This is not a hope or a guideline.
    For randomly drawn samples, it is a mathematical certainty.
  • The rate of convergence is predictable. The
    standard error of your estimate is inversely proportional to the square
    root of your sample size. Double your sample, and your uncertainty drops
    by roughly 30%. Quadruple it, and you cut it in half.

These are not suggestions. They are mathematical facts that govern
every measurement you take, every inspection you perform, and every
decision you make based on data.
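The square-root relationship can be checked directly. This is a minimal sketch using the textbook standard error of a proportion; the 5% defect rate and the base sample of 50 are made-up values for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of an observed proportion p from n independent units."""
    return math.sqrt(p * (1 - p) / n)

base_n = 50          # hypothetical starting sample size
defect_rate = 0.05   # hypothetical true defect rate
base_se = standard_error(defect_rate, base_n)

for multiple in (1, 2, 4):
    se = standard_error(defect_rate, base_n * multiple)
    print(f"n={base_n * multiple:>3}: standard error = {se:.4f} "
          f"({se / base_se:.0%} of the original)")
```

Doubling the sample leaves about 71% of the original uncertainty; quadrupling it leaves 50%.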

The Three Traps of Small Sample Thinking

Trap One: The Illusion of Precision

A quality engineer measures ten parts and calculates a defect rate of
2.3%. The report goes to management. Decisions are made. Resources are
allocated.

Here is what the report does not say: ten parts cannot resolve a defect
rate to a tenth of a percent. Counting defectives in ten units can only
yield 0%, 10%, 20%, and so on. Even a sample with zero defectives is
consistent, at 95% confidence, with a true defect rate anywhere from 0%
to about 30%. The precision of the number, that “.3”, creates an
illusion of accuracy that the sample size cannot support.

I watched this play out at a medical device manufacturer. The quality
team reported a process yield of 98.7% based on a weekly sample of
fifteen units. For three consecutive months, the yield hovered between
98.2% and 99.1%. Management was delighted. The Six Sigma dashboard
showed beautiful, stable numbers.

Then a customer audit required them to test fifty units instead of
fifteen. The yield dropped to 91.3%. The process had never been at
98.7%. It had been running at roughly 92% all along. The apparent
stability and precision were artifacts of insufficient data, masked by
the false confidence of decimal-point precision.

The rule: The precision you report should never
exceed what your sample size can justify. A defect rate estimated from
ten samples should be reported as a range, such as “approximately
0-30%”, not as “2.3%.”
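To see where a range like that comes from, here is a sketch of an exact (Clopper-Pearson) binomial interval using only the standard library. The helper names are hypothetical and the bisection solver is just one simple way to find the bounds; it assumes each unit is an independent pass/fail trial.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _bisect(func, target, increasing):
    """Find p in [0, 1] where a monotonic func(p) equals target."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(defects, n, confidence=0.95):
    """Clopper-Pearson interval for the true defect rate."""
    alpha = 1 - confidence
    if defects == 0:
        lower = 0.0
    else:
        # P(X >= defects) rises with p; find where it equals alpha / 2
        lower = _bisect(lambda p: 1 - binom_cdf(defects - 1, n, p), alpha / 2, True)
    if defects == n:
        upper = 1.0
    else:
        # P(X <= defects) falls with p; find where it equals alpha / 2
        upper = _bisect(lambda p: binom_cdf(defects, n, p), alpha / 2, False)
    return lower, upper

low, high = exact_interval(defects=0, n=10)
print(f"0 defects in 10 parts: true rate between {low:.1%} and {high:.1%}")
```

With zero defectives in ten parts, the upper bound comes out near 31%, which is the honest width of what ten parts can tell you.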

Trap Two: The Phantom Trend

A production manager notices that defects increased from two to five
over three consecutive weeks. She calls an emergency meeting. The team
scrambles. A corrective action request is issued. Engineers are pulled
from other projects to investigate.

Three weeks later, after extensive analysis, they find… nothing.
Because there was nothing to find.

When your baseline defect rate is low and your sample sizes are
small, random fluctuation will regularly produce sequences that look
like trends. Two defects one week, three the next, five the following —
this can happen purely by chance. The human brain, wired for pattern
recognition, sees a trajectory where none exists.

This is not a hypothetical risk. It is happening in your organization
right now.

At an automotive components plant, the quality team maintained a “Top
5 Defects” dashboard that was updated weekly based on the previous
week’s production. Every week, the ranking shifted. Every week,
engineers were reassigned to address the “new” top defect. Every week,
the previous week’s crisis was forgotten as a new one took its
place.

After three months of this chaos, I asked a simple question: “How
many of these shifts are statistically significant?”

The answer, after analysis: none of them. Every single ranking change
was within the range of normal random variation for the sample sizes
involved. The team had been chasing ghosts for a quarter of a year,
burning through engineering hours and organizational attention,
responding to statistical noise as if it were signal.

The rule: Before you declare a trend, calculate
whether the change you’re seeing is larger than what random variation
could produce. If your sample is small, almost nothing you observe can
be distinguished from noise.
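One common way to make that calculation for weekly defect counts is a c-chart, which places limits at three standard deviations around the average count. The sketch below assumes the counts behave like Poisson data from roughly equal weekly volumes; the baseline numbers are invented for illustration.

```python
import math

def c_chart_limits(baseline_counts):
    """Return (LCL, centre line, UCL) for a c-chart of defect counts."""
    c_bar = sum(baseline_counts) / len(baseline_counts)
    spread = 3 * math.sqrt(c_bar)  # Poisson: variance equals the mean
    return max(0.0, c_bar - spread), c_bar, c_bar + spread

def is_signal(count, limits):
    """True only if the count falls outside the control limits."""
    lcl, _, ucl = limits
    return count < lcl or count > ucl

# Hypothetical baseline: twenty weeks of defect counts
baseline = [2, 3, 5, 1, 4, 3, 2, 4, 3, 3, 2, 3, 5, 1, 4, 3, 2, 4, 3, 3]
limits = c_chart_limits(baseline)
print(f"LCL={limits[0]:.1f}, centre={limits[1]:.1f}, UCL={limits[2]:.1f}")

for week_count in (2, 3, 5):
    print(f"{week_count} defects: signal = {is_signal(week_count, limits)}")
```

Against a baseline averaging three defects a week, a run of two, three, then five stays well inside the limits, so it would not justify an emergency meeting.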

Trap Three: The False Conclusion About Individuals

An operator makes three errors in one week. The supervisor concludes
the operator is struggling and assigns them to retraining.

But what if the error rate for all operators on that process is one
error per 250 opportunities, and this operator had 800 opportunities
that week? At that rate, the expected count is 800 / 250 = 3.2 errors.
Three errors out of 800 opportunities is not unusual. It is almost
exactly what you would expect.

When you judge individual performance based on small numbers, you
inevitably punish people for random variation and reward people for
statistical luck. This is not a management philosophy issue. It is a
math issue.

I saw the corrosive effect of this at a pharmaceutical packaging line
where operators were ranked monthly by their error count. The “best” and
“worst” operators changed every month — not because performance was
fluctuating, but because the sample sizes were too small to reliably
distinguish one operator from another. The ranking system, intended to
motivate excellence, instead created anxiety, resentment, and a culture
where operators hid near-misses to protect their numbers.

The rule: You cannot reliably evaluate individual
performance from small samples. Before you judge a person, calculate
whether the variation you’re seeing could be explained by random chance
alone.
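For the operator example above, that calculation is a binomial tail probability. This sketch assumes every opportunity is independent and carries the same 1-in-250 error chance; the function name is illustrative.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1 - below_k

# An average operator: 800 opportunities at a 1-in-250 error rate
chance = prob_at_least(3, 800, 1 / 250)
print(f"Chance of three or more errors in a week: {chance:.0%}")
```

A perfectly average operator would log three or more errors in well over half of such weeks, so the count alone says nothing about that person.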

What the Law Demands From Your Quality System

Understanding the Law of Large Numbers is not academic. It has
direct, practical implications for how you design every element of your
quality system.

Acceptance Sampling Must Match the Risk

If you are sampling from a batch, your sample size must be large
enough to detect the defect rate that would be unacceptable to your
customer. A sample of five from a batch of two thousand can reliably
detect a 50% defect rate. It cannot reliably detect a 5% defect rate. If
5% defective is unacceptable — and in most industries, it is — then your
sampling plan is not fit for purpose.

Use AQL tables. Use statistical sampling plans like ANSI/ASQ Z1.4 or
ISO 2859-1. These exist precisely because the Law of Large Numbers
demands that your sample size be matched to the precision you need.
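A quick way to sanity-check a plan before reaching for the tables is to ask how likely the sample is to contain at least one defective unit. The sketch below is a simplified zero-acceptance calculation under a binomial approximation (large batch, random sampling). It is not a substitute for the Z1.4 or ISO 2859 tables, and the function names are assumptions.

```python
import math

def detection_probability(defect_rate, n):
    """Chance that a random sample of n contains at least one defective."""
    return 1 - (1 - defect_rate) ** n

def sample_size_to_detect(defect_rate, confidence=0.95):
    """Smallest n whose detection probability reaches the given confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - defect_rate))

for rate in (0.50, 0.34, 0.05):
    print(f"defect rate {rate:.0%}: sample of 5 detects it "
          f"{detection_probability(rate, 5):.0%} of the time; "
          f"95% detection needs n = {sample_size_to_detect(rate)}")
```

A sample of five clears the 95% bar only for defect rates near 50%. For a 5% defect rate it detects a problem less than a quarter of the time, and the sample has to grow to 59 units to reach 95%.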

SPC Requires Sustained Data

A control chart with five data points is a decoration, not a process
control. You need a minimum of twenty to twenty-five data points to
establish control limits that have any meaning. Even then, the limits
will have confidence intervals of their own.

I have walked into plants where control charts were drawn from the
first five parts of every shift, limits calculated fresh each day. These
charts provided zero information about the process. They were
statistical theater — impressive on the wall, meaningless in
reality.

The Law of Large Numbers tells us that our estimates improve with
more data. This means your SPC system must be designed to accumulate
data over time, not reset it constantly. Control limits should be based
on long-run data, not yesterday’s sample.
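As an illustration, here is a sketch of limits for an individuals (I-MR) chart that refuses to run on too little data. The 20-point floor mirrors the minimum mentioned above; the function name, the floor parameter, and the measurements are assumptions.

```python
def individuals_chart_limits(values, min_points=20):
    """Return (LCL, centre line, UCL) for an individuals (I-MR) chart."""
    if len(values) < min_points:
        raise ValueError(
            f"only {len(values)} points; need at least {min_points} "
            "before the limits mean anything"
        )
    centre = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    # 2.66 is the standard constant 3 / d2, with d2 = 1.128 for ranges of two
    spread = 2.66 * mr_bar
    return centre - spread, centre, centre + spread

# Hypothetical bore diameters (mm) accumulated across several shifts
history = [10.0, 10.2] * 12
lcl, centre, ucl = individuals_chart_limits(history)
print(f"LCL={lcl:.3f}, centre={centre:.3f}, UCL={ucl:.3f}")
```

Passing only the first five parts of a shift raises an error instead of producing limits, which is the behaviour you want from a system meant to accumulate data rather than reset it.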

Capability Studies Need Real Sample Sizes

A process capability study with thirty measurements is a starting
point, not a conclusion. The 95% confidence interval on a Cpk calculated
from thirty samples is wide enough that a reported Cpk of 1.33 could
actually represent a true process capability anywhere from roughly 1.0
to 1.7.

For critical processes — aerospace, medical, automotive safety —
thirty measurements is almost never sufficient. You need fifty, a
hundred, or more to claim capability with the confidence your customer
requires.
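The effect of sample size on a capability claim can be approximated with Bissell's formula, a widely used normal approximation for the confidence interval of Cpk. This is a sketch under that approximation, assuming normally distributed, in-control data; the function name is illustrative.

```python
import math
from statistics import NormalDist

def cpk_confidence_interval(cpk, n, confidence=0.95):
    """Approximate two-sided interval for the true Cpk (Bissell's formula)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_err = math.sqrt(1 / (9 * n) + cpk**2 / (2 * (n - 1)))
    return cpk - z * std_err, cpk + z * std_err

for n in (30, 50, 100):
    low, high = cpk_confidence_interval(1.33, n)
    print(f"n={n:>3}: reported Cpk 1.33, true Cpk plausibly {low:.2f} to {high:.2f}")
```

Under this approximation, thirty measurements give an interval of about 0.97 to 1.69, in the same territory as the range quoted above. A hundred measurements narrow it to roughly 1.13 to 1.53.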

Trend Analysis Requires Statistical Significance

Before you declare that defect rates are increasing, calculate a
confidence interval. Before you announce that a new process is better
than the old one, run a hypothesis test. Before you restructure your
quality team around this quarter’s defect data, ask whether the pattern
you’re seeing is larger than the noise.

This does not require advanced statistics. It requires basic
discipline. Every spreadsheet can calculate a confidence interval. Every
quality engineer should know how.
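As one example of that discipline, here is a sketch of a two-proportion z-test comparing defect rates from two periods. The counts are invented, the function name is an assumption, and the normal approximation it relies on gets rough when defect counts are very small, where an exact test would be the safer choice.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(defects_a, n_a, defects_b, n_b):
    """Return (z, two-sided p-value) for the difference in defect rates."""
    p_a, p_b = defects_a / n_a, defects_b / n_b
    pooled = (defects_a + defects_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 0.0, 1.0  # no defects (or all defects) in both periods
    z = (p_b - p_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical: defects rise from 2 to 5, with 200 units inspected each period
z, p = two_proportion_z_test(2, 200, 5, 200)
print(f"small samples: z = {z:.2f}, p-value = {p:.2f}")

# The same rates observed on 2,000 units per period
z, p = two_proportion_z_test(20, 2000, 50, 2000)
print(f"large samples: z = {z:.2f}, p-value = {p:.4f}")
```

The jump from two defects to five is indistinguishable from noise at 200 units per period. The same rates measured on 2,000 units per period would be a real signal.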

The Organization That Understood

There is a positive version of this story. At a Tier 1 automotive
supplier I worked with, the quality director made a decision that seemed
radical at the time: she doubled the inspection sample size on their
three most critical product lines.

The cost was significant. The inspection team needed additional
staff. The measurement cycle time increased. The production manager
objected. The finance director objected. The plant manager asked if
there was a cheaper alternative.

She held firm. Her argument was simple: “We are currently making
decisions based on samples that cannot reliably detect the defect rates
our customers consider unacceptable. We are not saving money by
inspecting less. We are gambling, and we don’t even know the odds.”

Within six months, the larger samples revealed two process shifts
that the previous sampling plan had been systematically missing. One was
a gradual tool wear pattern in a CNC operation. The other was a material
lot-to-lot variation that their supplier had never disclosed. Both were
causing defects that the old sample size was statistically incapable of
detecting.

The corrective actions saved the company four times what the
additional inspection cost. More importantly, they stopped shipping
latent defects that would have eventually reached customers.

This is what the Law of Large Numbers offers your organization: not a
guarantee of perfection, but a guarantee that your data is actually
connected to reality. Without that connection, every quality decision
you make is partly a guess.

The Deeper Implication: You Cannot Inspect Quality In

The Law of Large Numbers exposes a fundamental truth that most
quality professionals know intellectually but few organizations act on:
you cannot inspect quality into a product.

If your process produces defects at a rate of 1%, and you sample 10
units out of 1,000, you have roughly a 90% chance of finding zero
defects in your sample. Nine times out of ten, your inspection will tell
you the batch is clean even though it contains about ten defective
units.

This is not a failure of inspection. It is a mathematical certainty.
The only way to reliably deliver quality is to build quality into the
process so that the defect rate approaches zero before inspection ever
occurs. Inspection is a verification step, not a creation step. And
verification is only as reliable as the sample size allows.
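The 90% figure can be verified exactly for a finite batch with the hypergeometric distribution. This sketch assumes the batch contains exactly ten defective units (1% of 1,000) and that the sample is drawn at random without replacement; the function name is illustrative.

```python
from math import comb

def prob_zero_defects_in_sample(lot_size, defectives_in_lot, sample_size):
    """Chance that a random sample drawn without replacement is all good."""
    good_units = lot_size - defectives_in_lot
    return comb(good_units, sample_size) / comb(lot_size, sample_size)

for n in (10, 50, 300):
    miss = prob_zero_defects_in_sample(1000, 10, n)
    print(f"sample of {n:>3}: {miss:.1%} chance of seeing no defects at all")
```

Even a sample of 300 from this batch leaves a small chance of seeing nothing, which is why inspection can only verify a process, not rescue it.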

W. Edwards Deming argued this point for decades. He was not being
philosophical. He was being mathematical. The Law of Large Numbers was
on his side.

Practical Steps for Your Organization

Audit your sampling plans. Pull every sampling plan
in your quality system and calculate what defect rate each plan can
reliably detect. You may be disturbed by what you find.

Add confidence intervals to every reported metric.
Defect rates, yields, capability indices, supplier quality scores — all
of them should include confidence intervals that make the uncertainty
visible. This transforms reports from false certainties into honest
assessments.
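One interval that behaves well for yields and defect rates, even near 0% or 100%, is the Wilson score interval. The sketch below is one reasonable choice among several, not a prescribed method; the function name and the fifteen-unit example are assumptions.

```python
import math
from statistics import NormalDist

def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion such as yield."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)

# Hypothetical weekly sample: 15 units tested, all 15 pass
low, high = wilson_interval(15, 15)
print(f"Observed yield 100%, plausible true yield {low:.1%} to {high:.1%}")
```

A perfect week of fifteen units is still consistent with a true yield just under 80%, and a report that prints that range next to the headline number is much harder to misread.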

Stop evaluating individuals on small numbers. If
your operator error tracking system ranks people on weekly data, either
aggregate over longer periods or abandon the ranking. The damage from
false conclusions about individuals far exceeds the motivational benefit
of a leaderboard.

Extend your SPC baseline. Stop recalculating control
limits from each new day’s or month’s data. Build them from at least
fifty data points and revise them on a deliberate schedule, such as
quarterly. Let the Law of Large Numbers work for you instead of against
you.

Require statistical justification for trends. Before
any corrective action is initiated for an alleged trend, require a basic
statistical test. Is the change significant? Is it outside the range of
normal variation? If not, observe and wait. Reacting to noise is more
dangerous than ignoring it, because reacting to noise consumes resources
and attention that should be directed at real problems.

The Uncomfortable Truth

Carlos, the inspector from the beginning of this story, was not a bad
inspector. He was an inspector working inside a system that asked him to
make judgments his sample size could not support. The system failed, not
the person.

The Law of Large Numbers is not a suggestion. It is not a guideline.
It is not a best practice. It is a mathematical law that governs every
measurement you take and every decision you base on that
measurement.

Organizations that respect this law build quality systems that are
grounded in reality. Organizations that ignore it build quality systems
that are grounded in hope.

In quality, hope is not a strategy. Mathematics is.


Peter Stasko is a Quality Architect with 25+ years
of experience transforming organizations across automotive, aerospace,
and pharmaceutical industries. He specializes in bridging the gap
between statistical theory and practical quality management — helping
leaders understand that the numbers they trust are only as reliable as
the samples behind them.
