The many flaws of Big Data

“Increasing data size shrinks confidence intervals but magnifies the effect of survey bias: an instance of the Big Data Paradox,” a study in the journal Nature has stated. What this means is that Big Data can give you great confidence in being entirely wrong.

Organizations are suffering from data overload. And the problem will get exponentially worse over the coming years. Expect to hit a series of data overload crises between now and 2030, as Big Data floods organizations and sinks their capacity to manage it properly. Many organizations have no clue how much data they have, let alone how to manage it. Artificial intelligence systems are being trained on all sorts of dodgy data, some of it plain wrong, some of it pure rubbish, and some of it racist, bigoted and biased, since it reflects the historical prejudices of those who created it.

This is all a reflection of the Cult of Technology combined with the Cult of Volume. It seems that in modern times each rise in individual intelligence is matched by a drop in individual wisdom. Ever since I started working on websites back around 1995, I have been told that the content management system was going to solve the content management problem and that the search engine was going to solve the content organization problem. I have never seen technology deliver on these promises and I never expect to see it either, as long as humans are the ones who create the content.

Quality content requires quality people. Ethical content requires ethical people. Well organized content requires well organized people. What technology has generally given us is a huge capacity to create huge quantities of mainly low-value, often attention-grabbing content. What technology has also done well is understand how to get us to click, understand our most base and primitive instincts, and exploit them to the max.

Facebook is an advertising engine. Google is an advertising engine. Their core purpose is to use our data to get us to consume, consume, consume. The destruction of society, the destruction of the climate, is all good as long as it fuels Facebook and Google growth. Facebook and Google have used Big Data to change us from consumers to devourers. That’s what they consider success.

In so many other areas, Big Data just slouches around, boasting of its volume and its mega energy-consuming processing power, talking loud and saying nothing. According to the Nature study, a survey run on Facebook got about 250,000 responses per week on COVID vaccine uptake. It overestimated uptake by 17 percentage points. A quality, properly prepared, properly managed survey by Axios–Ipsos, which got about 1,000 responses, provided much more reliable estimates.
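To see why volume does not rescue a biased survey, here is a minimal simulation sketch. It is not the method used in the Nature study. The true uptake rate of 60% and the assumption that vaccinated people are twice as likely to answer the big survey are made-up numbers, chosen only to illustrate the effect, and the function name `survey_estimate` is hypothetical.

```python
import math
import random


def survey_estimate(n, true_rate, response_ratio, rng):
    """Simulate n survey responses and return (estimate, 95% margin of error).

    response_ratio is how many times more likely a vaccinated person is to
    respond than an unvaccinated one (1.0 means an unbiased sample).
    """
    # Share of vaccinated people among those who actually respond.
    weighted_yes = true_rate * response_ratio
    p_respondent = weighted_yes / (weighted_yes + (1 - true_rate))
    hits = sum(rng.random() < p_respondent for _ in range(n))
    estimate = hits / n
    # Textbook margin of error, which only accounts for sample size, not bias.
    margin = 1.96 * math.sqrt(estimate * (1 - estimate) / n)
    return estimate, margin


rng = random.Random(0)
TRUE_RATE = 0.60  # assumed true uptake, for illustration only

# Big but biased: 250,000 responses, vaccinated people over-represented.
big = survey_estimate(250_000, TRUE_RATE, 2.0, rng)
# Small but representative: 1,000 responses, no response bias.
small = survey_estimate(1_000, TRUE_RATE, 1.0, rng)

for name, (estimate, margin) in (("big biased", big), ("small unbiased", small)):
    print(f"{name}: {estimate:.3f} +/- {margin:.3f} (truth {TRUE_RATE:.2f})")
```

Under these assumptions, the big survey reports a very narrow margin of error around the wrong number, while the small survey reports a wider margin that sits close to the truth. More responses shrink the interval but do nothing about the bias.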

Is this any surprise to anyone who knows anything about quality content and quality data? Not at all. And yet the tech bro Big Data crunchers are the rock stars of our age, while the people who focus on quality are dismissed as too expensive, too slow.

I’m old enough to remember a time when we used to have professionals called editors who cared about content and data quality. Those were wiser times. We’re so much more intelligent now.

Unrepresentative big surveys significantly overestimated US vaccine uptake, Valerie C. Bradley, Shiro Kuriwaki, Michael Isakov, Dino Sejdinovic, Xiao-Li Meng & Seth Flaxman, Nature, 2021
