Fallible data

Data is not fact, and fact is often just a hypothesis anyway. We humans design how data is created, and we humans are the ones who interpret data and draw conclusions from it. Therefore, data will always be inherently fallible. Unless we approach it with a sense of humility and a willingness to acknowledge our opinions as hypotheses to be tested, we will end up using data to entrench the gut-instinct behaviors we claim we want it to replace.

To many, data is the new god. It can do no wrong. “The data says” has a certainty and absoluteness to it. “Data doesn’t say anything,” explains Andrea Jones-Rooy in an excellent article for Quartz magazine. “Humans say things. They say what they notice or look for in data—data that only exists in the first place because humans chose to collect it, and they collected it using human-made tools.”

For those involved in customer or user experience, or in the creation of content, or in the digital design world in general, data can be a powerful advocate. Again and again, I meet web professionals who struggle with organizational politics and ego. The opinions of other stakeholders often demand the creation of more and more digital stuff, much of it unnecessary, and some of it counterproductive to a good customer experience.

Many of these stakeholders are in senior management, so if your response to them is your opinion, it is their opinion that will invariably win. Even if you win, your career will lose because, whether we like it or not, pleasing our superiors by agreeing with and implementing their opinions is still the safest way to promotion. Those who champion the customer are often seen as awkward, obstructive, stubborn. Not a great career move.

Data can take the heat for you. You can use data to convince others. In the best scenario, data can help you change minds and opinions, so that stakeholders develop new opinions based on data. That’s a genuine win-win situation.

However, once you enter the arena of data, you have to be careful. Some will see your behavior as a claim to have a higher truth, to have more robust facts. Some will challenge your data. You need to be able to ensure that your data is as valid as possible, while at the same time clearly outlining its weaknesses and limitations.

“Data is an imperfect approximation of some aspect of the world at a certain time and place,” Jones-Rooy explains. “It’s what results when humans want to know something about something, try to measure it, and then combine those measurements in particular ways.”

According to Jones-Rooy, we can introduce imperfections into data in a number of ways. Random errors occur as a result of faulty equipment or human mistakes. Systematic errors arise from the way the data is collected. “Using data from Twitter posts to understand public sentiment about a particular issue is flawed because most of us don’t tweet—and those who do don’t always post their true feelings,” Jones-Rooy states.

You may choose the wrong things to measure. You might measure how long people spend on your website, thinking that the longer they spend the better the experience, when it may instead reflect time wasted because of confusing navigation and low-quality search results and content. Errors of exclusion happen when you exclude a particular segment of the population and then assume that the data applies to them too.
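To make an error of exclusion concrete, here is a minimal sketch in Python. The two visitor segments and every score in it are invented for illustration and are not real data. It assumes a satisfaction survey that only reached desktop visitors, and compares that result with what the average would have been had mobile visitors been measured too.

```python
from statistics import mean

# Hypothetical satisfaction scores (1-10) for two visitor segments.
# All numbers are invented for illustration.
desktop_scores = [8, 9, 7, 8, 8]
mobile_scores = [4, 5, 3, 4, 4]

# Assumption: the survey only ran on desktop, so mobile visitors were excluded.
surveyed_mean = mean(desktop_scores)

# What the average would look like if everyone had been measured.
true_mean = mean(desktop_scores + mobile_scores)

print(f"Surveyed (desktop only): {surveyed_mean:.1f}")
print(f"All visitors:            {true_mean:.1f}")
print(f"Overstatement:           {surveyed_mean - true_mean:.1f}")
```

In this made-up case the survey is perfectly accurate for the people it reached. The mistake lies in assuming that its result also describes the people it left out.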

I’m a data scientist who is skeptical about data
