Global data created per year vs. Phishing attacks reported annually (worldwide)
It turns out that the more information humanity creates, the more determined a subset of humanity becomes to trick the rest into handing it over, which is rather like observing that the bigger your house gets, the more enthusiastically burglars case your neighbourhood. Between 2015 and 2022, these two metrics climbed in almost perfect synchronisation, as if phishing attacks had decided to scale proportionally with the digital universe's expansion, the way a particularly ambitious parasite might grow in proportion to its host. One wonders if the criminals simply read the quarterly reports and thought, well, if they're making more data, we'd better get busier.
The most plausible explanation is almost certainly mundane: more people online means more data created and more targets for phishing schemes simultaneously. The global internet-using population grew from roughly 3.6 billion in 2015 to 5.1 billion by 2022, which is a lot of newly available humans, each one theoretically capable of both generating data through their various digital exertions and falling for a poorly spelled email about their bank details. Add to this the fact that the infrastructure supporting both data creation and cybercrime improved in lockstep (better cloud services, better APIs, better tools for mass-mailing schemes), and you have less a causal link and more a shared acceleration driven by simple digital proliferation.
What we're really observing here is the boring but unavoidable truth that crime scales with opportunity, and opportunity scales with participation. The correlation tells us something true but not particularly surprising: as the digital economy expands, so do both its benefits and its liabilities, moving together like a bicycle and its shadow. Neither dataset caused the other, which is rather the point.
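For the sceptical, the shared-driver story can be poked at with a toy simulation. This is a minimal sketch on entirely synthetic data, not the real series: the only figures borrowed from the text are the 3.6 to 5.1 billion user range, while the slope of 10, the noise level, the sample size and the function names (`pearson`, `residuals`, `simulate`) are all made-up assumptions. Two series that depend only on the number of users, and never on each other, are generated; their correlation is then measured before and after the user count is regressed out of both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=2000, seed=0):
    """Return (raw, partial) correlation for two series with a shared driver.

    Everything here is synthetic: the slope of 10 and the unit-variance
    noise are arbitrary illustrative choices, not estimates of anything.
    """
    rng = random.Random(seed)
    # Hypothetical confounder: internet users in billions, 3.6 to 5.1.
    users = [rng.uniform(3.6, 5.1) for _ in range(n)]
    # Both series depend on users plus independent noise, never on each other.
    data_created = [10 * u + rng.gauss(0, 1) for u in users]
    phishing = [10 * u + rng.gauss(0, 1) for u in users]
    raw = pearson(data_created, phishing)
    # Partial correlation: correlate what remains once users is regressed out.
    partial = pearson(residuals(data_created, users),
                      residuals(phishing, users))
    return raw, partial


if __name__ == "__main__":
    raw, partial = simulate()
    print(f"raw correlation:     {raw:.3f}")
    print(f"partial correlation: {partial:.3f}")
```

Under these assumptions the raw correlation should come out high and the partial correlation close to zero, which is the bicycle-and-shadow arrangement in numerical form: remove the bicycle and the shadow has nothing left to follow.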