Famous Spurious Correlations in History

The history of statistics is littered with correlations that look convincing on a chart but fall apart under scrutiny. Some were honest mistakes, some were deliberate satire, and some have been weaponized to push bad science. Here are seven of the most famous examples — and what each one teaches us about thinking clearly with data.

1. Ice Cream Sales and Drowning Deaths

This is the textbook example, a staple of introductory statistics courses. Ice cream sales and drowning deaths both spike during the summer months. The confounding variable is temperature: hot weather drives people both to buy ice cream and to swim, two activities that otherwise have nothing to do with each other.

Confound: Summer heat / temperature
Lesson: Always look for a seasonal or environmental factor that could drive both trends.
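The role of a confounder can be illustrated with a small simulation. The sketch below uses made-up numbers (the temperature range, the coefficients, and the noise levels are all assumptions, not real sales or mortality data) in which temperature drives both series. The raw correlation comes out strong, but it should largely disappear once temperature is regressed out of each series.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: one year of daily temperatures in degrees Celsius.
temperature = [rng.uniform(5, 35) for _ in range(365)]
# Both outcomes depend on temperature only; neither depends on the other.
ice_cream = [20 * t + rng.gauss(0, 60) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 2) for t in temperature]

raw_r = pearson(ice_cream, drownings)
# Partial correlation: correlate what remains after removing temperature.
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(drownings, temperature))

print(f"raw correlation:             {raw_r:.2f}")
print(f"controlling for temperature: {partial_r:.2f}")
```

Regressing out the confounder is only one way to control for it, and it assumes the relationship with temperature is roughly linear, which is true here by construction.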

2. Pirates and Global Warming

Bobby Henderson famously pointed out that as the number of pirates declined over the past 200 years, global temperatures rose. He used this to satirize poor causal reasoning in his 2005 open letter about the Flying Spaghetti Monster, addressed to the Kansas State Board of Education. The correlation is real in the sense that both trends happened, but attributing one to the other is absurd.

Confound: The passage of time / industrialization
Lesson: Two things changing over the same long time period will almost always correlate. Time is the ultimate confounding variable.
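A shared time trend is easy to reproduce. The sketch below builds two independent, made-up series (the starting values, slopes, and noise levels are assumptions chosen only for illustration, not historical pirate counts or temperature records), one drifting down and one drifting up. Their levels correlate strongly, while their year-to-year changes, which no longer contain the trend, should show almost no relationship.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def diffs(xs):
    """Year-to-year changes: each value minus the one before it."""
    return [b - a for a, b in zip(xs, xs[1:])]

rng = random.Random(1)  # fixed seed so the run is repeatable
years = range(200)

# Hypothetical series with independent noise; only time links them.
pirates = [5000 - 20 * t + rng.gauss(0, 300) for t in years]
temperature = [14 + 0.005 * t + rng.gauss(0, 0.15) for t in years]

level_r = pearson(pirates, temperature)                  # dominated by the trends
change_r = pearson(diffs(pirates), diffs(temperature))   # trends removed

print(f"correlation of levels:  {level_r:.2f}")
print(f"correlation of changes: {change_r:.2f}")
```

Differencing is a common first check for trending series, though it is not the only way to detrend and it will not catch every kind of shared time structure.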

3. Organic Food Sales and Autism Diagnoses

Both organic food sales and autism diagnoses rose steeply from the late 1990s onward. Anti-science advocates have tried to draw causal links from charts like this. In reality, organic food grew because of marketing and consumer trends, while autism diagnoses increased due to broader diagnostic criteria, greater awareness, and improved screening.

Confound: Growing public awareness, broadening definitions, population growth
Lesson: When two things are both growing, check whether the growth has independent causes before assuming a link.

4. Nicolas Cage Films and Swimming Pool Drownings

Tyler Vigen's original Spurious Correlations project made this pairing famous. The number of films Nicolas Cage appeared in each year correlated with the number of people who drowned in swimming pools. It became a viral illustration of how absurd correlations can appear statistically strong.

Confound: Pure coincidence + small sample sizes
Lesson: With enough datasets, you will find correlations everywhere. Statistical significance is not the same as real-world significance.

5. Shoe Size and Reading Ability in Children

Studies have found a positive correlation between children's shoe sizes and their reading test scores. Bigger feet, better readers? Of course not. Older children have both larger feet and more years of reading practice. Age is the hidden variable driving both measurements.

Confound: Age
Lesson: When studying a population that changes over time (like growing children), look for developmental confounds.
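One way to check for a developmental confound is to compare children of the same age only. The sketch below simulates hypothetical children (the age range, shoe-size formula, and reading-score formula are invented for illustration) whose shoe size and reading score both depend on age but not on each other. The pooled correlation is strong, while the correlation within each age group should hover near zero.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(2)  # fixed seed so the run is repeatable

# Hypothetical children: (age, shoe size, reading score).
children = []
for _ in range(600):
    age = rng.randint(6, 11)
    shoe = 18 + 0.8 * age + rng.gauss(0, 1)   # grows with age only
    reading = 10 * age + rng.gauss(0, 8)      # grows with age only
    children.append((age, shoe, reading))

# Pooled across all ages, bigger feet appear to go with better reading.
overall_r = pearson([c[1] for c in children], [c[2] for c in children])

# Stratify: compute the same correlation separately for each age.
within_r = {}
for age in range(6, 12):
    group = [c for c in children if c[0] == age]
    within_r[age] = pearson([c[1] for c in group], [c[2] for c in group])

print(f"all ages pooled: {overall_r:.2f}")
for age, r in within_r.items():
    print(f"age {age:>2} only:    {r:+.2f}")
```

Stratifying works here because age takes only a few discrete values; with a continuous confounder, binning or a regression adjustment would be the more usual choice.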

6. Stork Populations and Birth Rates in Europe

A well-known ecological study found that countries with more storks tended to have higher birth rates. This was not evidence that storks deliver babies. Rural areas have both more stork habitat and, historically, higher birth rates due to socioeconomic factors. The correlation was driven by urbanization levels.

Confound: Rural vs. urban geography, socioeconomic factors
Lesson: Ecological correlations (comparing regions or countries) are especially prone to confounding because they aggregate millions of individual differences into a single data point.

7. Per Capita Cheese Consumption and Bedsheet Tangling Deaths

Another gem from Tyler Vigen: the per capita consumption of cheese in the United States correlated almost perfectly with the number of people who died by becoming tangled in their bedsheets. The absurdity is the point. With thousands of datasets and enough years, coincidental alignment is inevitable.

Confound: Random chance over a shared time period
Lesson: The more comparisons you make, the more false positives you'll find. This is called the multiple comparisons problem.
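The multiple comparisons problem can be demonstrated with nothing but random numbers. The sketch below is an illustration under assumed settings (a ten-point target series and 2,000 candidate series, both arbitrary choices, all drawn as independent noise). It searches the candidates for the one that best matches the target, and strong-looking matches turn up even though no series has anything to do with any other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(3)  # fixed seed so the run is repeatable

n_points = 10        # assumed length of each yearly series
n_candidates = 2000  # assumed number of unrelated datasets to search

# Every series is independent random noise; there is nothing to find.
target = [rng.gauss(0, 1) for _ in range(n_points)]
candidates = [[rng.gauss(0, 1) for _ in range(n_points)]
              for _ in range(n_candidates)]

correlations = [pearson(target, c) for c in candidates]
best_r = max(correlations, key=abs)                    # strongest match found
strong = sum(1 for r in correlations if abs(r) > 0.8)  # "impressive" matches

print(f"strongest correlation found: {best_r:+.2f}")
print(f"candidates with |r| > 0.8:   {strong} of {n_candidates}")
```

Short series make this worse: with only ten points, a high correlation is far easier to reach by chance than it would be with hundreds of observations.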

The Takeaway

Every one of these examples involves real data and a real statistical correlation. None of them proves causation. The pattern is always the same: two variables move together because of a hidden third factor, a shared time trend, or pure coincidence that surfaces when enough datasets are compared.

The next time you see a headline claiming that X is “linked to” Y, ask yourself: is this an ice-cream-and-drowning situation? It may well be.