Pickyreads

The Data Detective Summary – July 2022

Author: Tim Harford

Short Summary
The Data Detective (2021) is a book that makes statistics and raw data digestible, fun, and even beautiful. Instead of long descriptive text packed with complicated formulas and calculations, it shows readers how to compare, analyze, and interpret the numbers they meet every day.

Detailed Summary

We’ve all heard the saying “correlation does not imply causation,” but how often do we see it in practice? Placing two different sets of data side by side makes it easy to spot links that stand out as correlations, without telling us whether one thing actually causes the other. Some examples include:

  • The most densely populated countries also have more babies than those with lower density.
  • The unemployment rate of a country is negatively correlated with its population growth.
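
To see why a correlation on its own proves little, here is a minimal Python sketch. It is an illustration only and does not come from the book: the function names (`pearson`, `simulate`), the hidden factor, and all the numbers are assumptions made up for the demonstration. Two series are both driven by the same hidden factor, so they correlate strongly even though neither one causes the other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=2000, seed=0):
    """Build two series that share a hidden common cause (hypothetical data)."""
    rng = random.Random(seed)
    hidden = [rng.gauss(0, 1) for _ in range(n)]      # the confounder
    a = [h + rng.gauss(0, 0.5) for h in hidden]       # series A = hidden + noise
    b = [h + rng.gauss(0, 0.5) for h in hidden]       # series B = hidden + noise
    return hidden, a, b


hidden, a, b = simulate()

# A and B never influence each other, yet they correlate strongly.
r_ab = pearson(a, b)

# Subtract the hidden factor from both series and correlate what is left.
resid_a = [x - h for x, h in zip(a, hidden)]
resid_b = [y - h for y, h in zip(b, hidden)]
r_resid = pearson(resid_a, resid_b)

print(f"correlation of A and B:            {r_ab:.2f}")
print(f"after removing the hidden factor:  {r_resid:.2f}")
```

Once the hidden factor is subtracted out, the correlation between the two series nearly vanishes. That is the kind of check worth making before claiming that one thing causes another.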

The Data Detective offers a fun and easy way to understand correlations like these. The book uses data not to uncover dark secrets, but to help you solve everyday problems, treating each number as a mystery to be investigated, just like Sherlock Holmes.

This book is exactly what its title suggests. The author doesn’t just give you a summary of what you should know about using statistics to your advantage. He teaches you how to use statistics yourself, not by trying to memorize every formula or statistical model, but by learning to think critically about the data at hand.

The Data Detective Key Points

Notice your emotions and trust the statistics based on your experiences

First, we have to know how to recognize emotionally charged statistics. These are the numbers that provoke strong reactions and bias our judgment. They can stir up emotions such as anger and distrust, which can lead us to believe that a number is correct simply because of how it makes us feel.

Imagine you are voting in an election and you have a strong political preference. You are given two choices: your preferred candidate and a candidate who is strongly against your political views. Then you read a statistic saying that your preferred candidate has an 80% chance of winning the election. A number like that is easy to believe because you want it to be true, yet the statistics we read can be politically charged and may be completely fake. Statistics are used in different ways. Here is one example:

Politicians and marketers may use statistics to convince us of their beliefs or to sell their products or candidates. For example, a recent survey by a reputable news channel found that 50 percent of people who voted for Trump were uneducated. The channel then showed the top ten states in America with the highest illiteracy rates.

Well, always trust your emotions that are based on your personal experiences, right? Nope. Not always. Personal experience is prone to ‘selection bias’, which means judging from the cases we happen to see instead of looking at the whole picture. Statistics on sensitive topics such as sexual assault can suffer from the same problem when the cases that get counted are not representative.

You should have the will to change or revise your opinion

The best forecasters are not necessarily the experts. This is surprising at first glance because the people you’d expect to be best at making forecasts are the ones with the most experience and expertise: think of weather forecasters and financial analysts. But that’s not who researchers studying forecasting found to be best. Instead, they found that the best forecasters were more open-minded people who revised their initial predictions as new information came in.

How? The researchers identified some attributes that made people better predictors: they were more open-minded, and they were more willing to revise their opinions. But these were not fixed traits. What if people could be made more open-minded at the time they make a forecast? Consider a question that matters a great deal to the person making the prediction: Will I get into medical school?

The applicant is asked for a prediction, then given the results of a test that measures willingness to update an opinion in case it is wrong. Then the applicant makes another prediction and gets another test result. An applicant who does not change the initial opinion makes a second prediction that is just as likely to be wrong as the first.

The takeaway is that people who are willing to change their minds turn out to be better at forecasting than others. Now, this doesn’t mean everyone willing to change their mind will be a good forecaster, just as not everyone who has a high IQ will be a good forecaster.

Not every statistic applies to everyone

A couple of months ago, it struck me that most of our data literacy education focuses on finding key insights into big data: what big data is, how to get it, how to clean it, and so on. While preparing for a class I teach on Data Literacy, I started to wonder: What happens when you don’t have any big data?

Many journalists don’t have the luxury of a large social media audience or access to a large store of historical data. Without data, it is tempting to generalize from what we already know. The problem with this tendency is that it can lead to a false sense of homogeneity and comfort. We assume something is true because it feels true, or because we subconsciously think it is “normal”. The truth is that every slice of human life has a different profile. The key is understanding how to read this data. Take books as an example, where the information usually lies in the detail. We understand that not everyone likes the same book or author. Author, age, and genre are three attributes that are useful in understanding someone’s book preferences, and all three have something to tell us about the story of how we read.

Every so often I get a call from someone asking for some statistics about a particular business or issue. The question is usually along the lines of “How many people do X,” or “How big is the market for Y,” but the thing being asked about is rarely easy to count. It might be, “I’m wondering how many people will have to have heart surgery within the next five years,” or “I wish there were an accurate count of how many people will be affected by this policy change.”

When I hear questions like this, my ears perk up. In my opinion, asking for a count of “everyone who has this disease and then some” is usually not the right question to ask.

The Data Detective Quotes

“Testing a hypothesis using the numbers that helped form the hypothesis in the first place is not OK.” –Tim Harford

“Whatever we’re trying to understand about the world, each other, and ourselves, we won’t get far without statistics – any more than we can hope to examine bones without an X-ray, bacteria without a microscope, or the heavens without a telescope.” –Tim Harford

The Data Detective Review

This is one of the important books on how people misuse data. One thing that has surprised me is that even though we don’t use any big data, with no web analytics and no search analytics, we still have a lot of information about our users and their reading habits. Recommended.

To whom would I recommend The Data Detective Summary?

  • Anyone who wants to become more aware of how statistics are used.
  • Anyone who wants to learn more and more.
  • Anyone who trusts every statistic.

Link: https://amzn.to/3zr9so4