It’s not simply crap content. Computer code bloat is everywhere. For starters, most software, most features, serve no useful function. A pile of crap software is launched and then either it dies a death, or else for years badly designed fixes are made to try to get it working in even the most basic manner, all while making it more complex and bloated. Most commercial software apps hardly even get downloaded. Those that do hardly ever get used. Thirty days after news apps, shopping apps, entertainment apps and education apps have been downloaded, most will have lost over 90% of their users.
Big Tech laughs about all this crap production. It is how Big Tech makes so much money from its data centers. It sells them as plush five-star hotels for superior VIP data, when in reality it’s running a datafill, a data dump. If most of the data stored in a typical data center were processed and accessed every day, everything would explode; the servers would fry. The demand would crash everything. The data center business case depends on most people never accessing the data they’ve stored. You’re paying for a data graveyard.
To protect their profits, Big Tech has historically made it very hard for you to delete. The artist Honor Ash observed: “Initially, Gmail didn’t even include a delete button. Not only was it no longer necessary to delete emails regularly to make space in your inbox, but it was actually not even possible. This shift broke everyone’s deletion habit—it ended the ritualistic appraisal of what should be kept, and ushered in a default in which literally everything should.”
In almost thirty years of professional content management work, I’ve had to deal with the argument that we have to keep everything because you never know what will be important in the future. This argument was made in the middle of intranets, websites and computer systems where nobody could find anything, because of terrible search design and awful information architecture. And what people did find was usually some sort of dodgy draft, a copy of a copy, or something that was way out of date, inaccurate or useless. Keeping all this junk data does not simply make things harder to find; it also increases cybersecurity risk. Huge quantities of poorly structured and badly maintained data are an invitation to hacking and other risks.
Even if we could put a data center in every town and village in the world, we couldn’t keep everything. There is too much data being produced, vastly too much. We are now in the era of the zettabyte. As my previous book, World Wide Waste, explained:
A zettabyte is 1,000,000,000,000,000 MB, or one quadrillion MB. If a zettabyte was printed out in 100,000-word books, with a few images thrown in, then we would have one quadrillion books. It would take 20,000,000,000,000 (20 trillion) trees’ worth of paper to print these books. It is estimated that there are currently three trillion trees on the planet. To print a zettabyte of data would thus require almost seven times the number of trees that currently exist to be cut down and turned into paper.
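The arithmetic behind that comparison can be checked with a quick back-of-envelope sketch. The figures below assume roughly 1 MB per 100,000-word book and about 50 books’ worth of paper per tree; neither number is stated in the passage, they are simply what the quoted totals imply.

```python
# Back-of-envelope check of the zettabyte-to-trees comparison quoted above.
# Assumed (not stated in the source, only implied by its figures):
#   - one 100,000-word book with a few images is roughly 1 MB
#   - the paper from one tree prints roughly 50 such books

ZETTABYTE_MB = 10**15          # 1 ZB = 10^21 bytes = one quadrillion MB
MB_PER_BOOK = 1                # assumption: ~1 MB per book
BOOKS_PER_TREE = 50            # assumption: ~50 books per tree
TREES_ON_EARTH = 3 * 10**12    # ~3 trillion trees, per the estimate quoted

books = ZETTABYTE_MB / MB_PER_BOOK            # ~1 quadrillion books
trees_needed = books / BOOKS_PER_TREE         # ~20 trillion trees
multiple_of_earth = trees_needed / TREES_ON_EARTH

print(f"Books to print 1 ZB: {books:.1e}")                          # ~1.0e+15
print(f"Trees' worth of paper needed: {trees_needed:.1e}")          # ~2.0e+13
print(f"Multiple of all trees on Earth: {multiple_of_earth:.1f}")   # ~6.7
```

Running it reproduces the quoted figures: about a quadrillion books, about 20 trillion trees, and a ratio of roughly 6.7, which is the “almost seven times” in the passage.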
Soon, we will be producing thousands of zettabytes a year. It’s a tsunami of data every day, every hour of every day, every minute of every hour, every second of every minute. As a result, important data that definitely does need storing is getting lost. In research, for example, we are flooding our research environments with low-quality, often AI-produced, paper garbage. Storing all this stuff is becoming more and more expensive. Research repositories are thus disappearing, and lots of good research is being lost. In a thousand years, there may be more quality data artifacts on the Maya and Inca than on our digital generation. Digital is fragile, transient, and it will sink in its own crap.