Global data created per year vs. Americans identifying as LGBTQ+
Global data creation and the percentage of Americans identifying as LGBTQ+ have grown together since 2015 with an r of 0.9666, which is either a statement about the data-generating power of a more openly self-expressive population or simply what happens when you compare any two things that went up during the smartphone era. The zettabytes pile up. The percentages climb. Somewhere in a data center, a server is storing this very correlation, adding infinitesimally to both sides of the equation simultaneously. It's correlations all the way down.
Global data creation has followed an exponential trajectory, growing from roughly 15 zettabytes in 2015 to over 120 zettabytes by 2023, driven by cloud computing, IoT devices, social media, streaming, and mobile usage. LGBTQ+ identification rates have grown roughly linearly through generational replacement and declining stigma, with the steepest rise among Gen Z. Both are measured annually, both trend upward over this short nine-year window, and each is driven by its own unrelated structural shift: digital infrastructure expansion for one, social norm evolution for the other. With only nine data points, each drawn from a steadily rising series, a high r is statistically unremarkable.
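To make the point concrete, here is a minimal sketch of how two series with no shared cause can still produce a high r. Only the endpoints of the data curve (15 and 120 zettabytes) come from the text above; the intermediate values are a smooth synthetic exponential, and the second series is a hypothetical straight line, not actual survey figures. The names `pearson_r`, `data_zb`, and `share_pct` are made up for this illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Nine annual points, t = 0 for 2015 through t = 8 for 2023.
years = range(9)

# Synthetic exponential curve running from 15 to 120 (an 8x increase).
# Assumption: smooth growth between the two endpoints quoted in the text.
data_zb = [15 * 8 ** (t / 8) for t in years]

# Hypothetical straight line standing in for a steadily rising percentage.
# Assumption: these are illustrative values, not real survey data.
share_pct = [3.0 + 0.5 * t for t in years]

r = pearson_r(data_zb, share_pct)
print(f"r = {r:.4f}")
```

Under these assumptions r comes out above 0.9, even though one series is just a curve and the other just a line. The particular slope and intercept chosen for the line do not matter: Pearson's r is unchanged by any positive linear rescaling of either series, so any rising straight line gives the same result.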
In an era when nearly everything is growing (data, populations, markets, identities), finding two upward-sloping lines that correlate is less a discovery than a default condition. The question is never whether two things correlate; it is whether the correlation survives a plausible mechanism.
Want to learn more about why correlations like "Global data created per year" vs. "Americans identifying as LGBTQ+" don't prove causation? Read our guide to statistical thinking.