Famous Spurious Correlations in History
The history of statistics is littered with correlations that look convincing on a chart but fall apart under scrutiny. Some were honest mistakes, some were deliberate satire, and some have been weaponized to push bad science. Here are seven of the most famous examples — and what each one teaches us about thinking clearly with data.
1. Ice Cream Sales and Drowning Deaths
This is the textbook example, a staple of introductory statistics courses. Ice cream sales and drowning deaths both spike during the summer months. The confounding variable is obvious: temperature. Hot weather drives people both to buy ice cream and to swim, and more swimming means more drownings. Ice cream itself has nothing to do with it.
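A small simulation can make the confounding mechanism concrete. The sketch below uses made-up numbers, not real sales or mortality data: the variable names, coefficients, and noise levels are all assumptions chosen for illustration. Temperature drives both series, neither series influences the other, and the correlation is measured before and after the temperature effect is removed by a simple least-squares fit.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

random.seed(0)

# Hypothetical daily data: temperature drives both series,
# and neither series has any effect on the other.
temperature = [random.uniform(5, 35) for _ in range(2000)]
ice_cream = [20 + 3.0 * t + random.gauss(0, 10) for t in temperature]
drownings = [1 + 0.2 * t + random.gauss(0, 1) for t in temperature]

# Correlation of the raw series.
raw_r = pearson(ice_cream, drownings)

# Correlation after the linear effect of temperature is removed from both.
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(drownings, temperature))

print(f"raw correlation:            {raw_r:.2f}")
print(f"after removing temperature: {partial_r:.2f}")
```

Under these assumptions the raw correlation should come out strongly positive, while the temperature-adjusted one should sit close to zero. The adjustment only works here because the confounder is known and measured, which is rarely so convenient with real data.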
2. Pirates and Global Warming
Bobby Henderson famously pointed out that as the number of pirates declined over the past 200 years, global temperatures rose. He used this to satirize poor causal reasoning in his open letter about the Flying Spaghetti Monster. The correlation is real — both trends happened — but attributing one to the other is absurd.
3. Organic Food Sales and Autism Diagnoses
Both organic food sales and autism diagnoses rose steeply from the late 1990s onward. Anti-science advocates have tried to draw causal links from charts like this. In reality, organic food grew because of marketing and consumer trends, while autism diagnoses increased due to broader diagnostic criteria, greater awareness, and improved screening.
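The shared-trend effect is easy to reproduce with synthetic data. The sketch below does not use real organic-food or autism figures: both series are hypothetical random walks with upward drift, generated independently of each other, and the helper names are invented for this example. It compares the correlation of the raw levels with the correlation of the period-to-period changes.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def random_walk_with_drift(steps, drift, noise_sd):
    """A series that climbs by `drift` per step on average, plus random noise."""
    level, path = 0.0, []
    for _ in range(steps):
        level += drift + random.gauss(0, noise_sd)
        path.append(level)
    return path

def changes(series):
    """Step-to-step differences, which strip out the accumulated trend."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

random.seed(1)

# Two independent, synthetic series that both happen to trend upward.
sales_index = random_walk_with_drift(300, drift=2.0, noise_sd=1.0)
diagnosis_index = random_walk_with_drift(300, drift=1.0, noise_sd=1.0)

level_r = pearson(sales_index, diagnosis_index)
change_r = pearson(changes(sales_index), changes(diagnosis_index))

print(f"correlation of levels:  {level_r:.2f}")
print(f"correlation of changes: {change_r:.2f}")
```

The levels should correlate very strongly simply because both lines go up over time, while the changes, which remove the trend, should show almost no relationship. Differencing is one common way to check for this; it is offered here as an illustration, not as a complete test for causation.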
4. Nicolas Cage Films and Swimming Pool Drownings
Tyler Vigen's original Spurious Correlations project made this pairing famous. The number of films Nicolas Cage appeared in each year correlated with the number of people who drowned in swimming pools. It became a viral illustration of how absurd correlations can appear statistically strong.
5. Shoe Size and Reading Ability in Children
Studies have found a positive correlation between children's shoe sizes and their reading test scores. Bigger feet, better readers? Of course not. Older children have both larger feet and more years of reading practice. Age is the hidden variable driving both measurements.
6. Stork Populations and Birth Rates in Europe
A well-known ecological study found that countries with more storks tended to have higher birth rates. This was not evidence that storks deliver babies. Rural areas have both more stork habitat and, historically, higher birth rates due to socioeconomic factors. The correlation was driven by urbanization levels.
7. Per Capita Cheese Consumption and Bedsheet Tangling Deaths
Another gem from Tyler Vigen: the per capita consumption of cheese in the United States correlated almost perfectly with the number of people who died by becoming tangled in their bedsheets. The absurdity is the point. Compare thousands of datasets against one another over a short span of years, and some pairs will line up by chance alone.
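The role of sheer search volume can be shown with pure noise. The sketch below is an illustration under stated assumptions, not a reconstruction of Vigen's method: it generates one random 10-point "target" series and 2,000 unrelated random candidates (both numbers are arbitrary choices), then reports the strongest correlation found alongside the typical one.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def noise_series(length):
    """A series of independent random values with no structure at all."""
    return [random.gauss(0, 1) for _ in range(length)]

random.seed(2)

YEARS = 10         # assumed length of each annual series
CANDIDATES = 2000  # assumed number of unrelated datasets searched

target = noise_series(YEARS)

# Absolute correlation of the target with every unrelated candidate, sorted.
scores = sorted(abs(pearson(target, noise_series(YEARS)))
                for _ in range(CANDIDATES))

best_r = scores[-1]
median_r = scores[CANDIDATES // 2]

print(f"typical |r| against a random series: {median_r:.2f}")
print(f"best |r| found after searching all:  {best_r:.2f}")
```

A typical candidate should correlate only weakly with the target, yet the best of the 2,000 should look impressively strong. Short series make this easier, because with only ten points there is little room for a chance alignment to break down.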
The Takeaway
Every one of these examples involves real data and a real statistical correlation. None of them proves causation. The pattern is always the same: two variables move together because of a hidden third factor, a shared time trend, or pure coincidence that surfaces when enough datasets are compared.
The next time you see a headline claiming that X is “linked to” Y, ask yourself: is this an ice-cream-and-drowning situation? It may well be.