
The Data Detective by Tim Harford

Intellectually tepid but harmless. A discussion of the value of statistics in explaining and understanding the modern world, with advice in the form of simple--and sometimes vacuous--rules. The author tries for a Malcolm Gladwell style but he's not quite the writer Gladwell is, and the result is a book that offers readers less insight and less enjoyment than it might.

Notes: 
1) The author opens with a strange critique of the short (and in my opinion very useful) 1950s-era book How To Lie With Statistics, claiming it made people collectively suspicious of statistics. Interestingly, he makes this claim with no evidence given, which is an unfortunate way to open a book about finding the truth with statistics.

This may seem a petty thing to nitpick, but it is never nitpicking to expect a writer to back up his claims with at least some evidence. If an author makes the specific claim that a book written in 1954 produced broad, collective cynicism about statistics, and does not cite a single example of that cynicism, the reader is left vaguely appalled to see such a transparently unproved assertion made so confidently in the very introduction of a book. But the author just moves on. 

2) I say "harmless" above, but the author appears unaware of the *harm* sometimes caused by statistics, particularly the Gaussian/normal distribution statistics that he celebrates in the many studies that he refers to in his examples and arguments. It is a critical error the author could have easily cured if he had read any of Nassim Taleb's work (Fooled By Randomness would be a good start for him). The category error here is to assume that the world (especially domains with extreme complexity or unusual distributions) conforms to a Gaussian/normal distribution and can be analyzed as such. Certainly many domains are normally distributed and can be described very helpfully using Gaussian statistical techniques, but it is a grave error to assume therefore that all domains can be modeled or described effectively this way. 

Sadly, the author cites Taleb twice in the book, but it is plain that he hasn't yet integrated any of Taleb's central ideas. 
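
To make the point in note 2 concrete, here is a minimal sketch (mine, not from the book) comparing a thin-tailed and a heavy-tailed sample. It asks how much of the sample's total comes from the single largest observation. The Pareto tail exponent of 1.1, the sample size, and the function names are all assumptions chosen for illustration.

```python
import random


def max_share(samples):
    """Fraction of the sample total contributed by the single largest value."""
    return max(samples) / sum(samples)


def simulate(n=100_000, alpha=1.1, seed=0):
    """Return (gaussian_share, pareto_share) for n draws from each distribution."""
    rng = random.Random(seed)
    # Thin-tailed case: absolute values of standard normal draws.
    gaussian = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
    # Heavy-tailed case: Pareto draws; alpha is an assumed, illustrative value.
    pareto = [rng.paretovariate(alpha) for _ in range(n)]
    return max_share(gaussian), max_share(pareto)


if __name__ == "__main__":
    g, p = simulate()
    print(f"largest observation's share of total, Gaussian: {g:.6f}")
    print(f"largest observation's share of total, Pareto:   {p:.6f}")
```

In the Gaussian sample no single observation matters much. In the Pareto sample one draw can account for a visible share of the whole total, so averages and standard deviations estimated from past data say far less about what comes next.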

3) Another oddity, in light of the author's criticism of How To Lie With Statistics: see Chapter 9, where the author cites--without any criticism--a book called How Charts Lie in a chapter that basically describes how to lie with statistics using charts. Wait a minute: if a generation of readers collectively became cynical about statistics from reading the first book, won't a new generation of readers become cynical about charts and lose their trust in charts because of this book? Inconsistent thinking like this makes a reader wonder: is there a coherent message in this book or is it just a salad of words?

4) Harford writes this book with the enthusiastic tone of a man who has discovered an incredibly useful tool (statistics) and wants to share his appreciation for it. Imagine a man who's discovered a hammer for the first time, and is enthusiastically explaining all the uses for it. The problem is: to the man with a hammer all the world is a nail. All of us go through this journey in our discovery of statistics: it is a journey of epistemic humility, where we discover where statistics can and should be used, and where it shouldn't and can't. There is thus a kind of enthusiastic naive empiricism about this author, which on one hand is endearing, but on the other hand would be quickly dispelled if he would just read Fooled By Randomness.

5) Another interesting structural problem to think about: How do you write a book to encourage readers to take pleasure in statistical analysis (and also enjoy the benefits of the many scientific studies produced using these statistical methods), while at the same time admitting one of the gravest crises in "studies show" science, the reproducibility crisis? The author first addresses the reproducibility crisis some 100 pages into the book, yet the book is filled front to back with citations of studies--almost none of which have actually been reproduced! This is a very interesting structural problem and I have no idea how I would handle it either.

Rule 1: Search Your Feelings
This chapter is actually somewhat helpful: it gives an attentive reader two very useful metaquestions to ask whenever news or information provokes a strong feeling. First ask "how does this make me feel?" Then ask "why does it make me feel this way?" You will be impervious to unethical rhetoric if you remember to ask these questions of yourself.

Rule 2: Ponder Your Personal Experience
This chapter contains excellent examples of Goodhart's law, with which everyone should be familiar: "when a measure becomes a target, it ceases to be a good measure."

Rule 3: Avoid Premature Enumeration
Interesting discussion on arbitrary cutoff dates for declaring live births as a driver for widely disparate statistics on infant mortality. The author could have done a lot of interesting things discussing our reactions to this information, which would have tied together all the chapters so far in the book. He missed the opportunity. 

Rule 4: Step Back and Enjoy the View
Genuinely intriguing idea to consider a newspaper that was delivered every 25, 50 or 100 years rather than every day. What would it cover? How would it cover the news? What actually would be news looked at from such a long-term perspective?

Thinking of news delivered at a slower rhythm helps broaden your perspective and helps you see what is more likely to be noise than signal. There are definitely applications here for investing.

It struck me as rather weird that this author cites Nassim Taleb and the disgraced Rolf Dobelli in the same short paragraph and yet fails to acknowledge (more likely fails even to know about) the embarrassing plagiarism Dobelli committed against Taleb. One strong possibility is this author doesn't really know as much about his sources and subject as he should. Which takes me to a heuristic: the more glibly written the book, the further outside his circle of competence the author is.

Rule 5: Get the Backstory
How many books have I read that contain the infamous jam tasting study, along with meta-analysis of what it means? Even this thick-headed food blogger wrote about it years ago. :))

Note however that this chapter is a good starter text for people to learn about the various catastrophic problems in "studies show" science. The author could have and probably should have gone further and been more aggressive in his criticisms: there's plenty to criticize and plenty of atrocious examples, such as foundational psychological studies found to contain fake data, p-hacking techniques, the decline effect, etc.
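
For readers who haven't seen p-hacking in action, here is a minimal sketch (mine, not the book's) of one common form of it: measure many outcomes and report whichever one clears p < 0.05. Every "study" below compares two groups drawn from the same distribution, so any "finding" is pure noise. The group size, the 20 outcomes per study, the z-test with known variance, and the function names are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean


def null_p_value(rng, n=30):
    """Two-sided z-test p-value for two groups drawn from the SAME N(0, 1)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # The variance is known to be 1, so the difference of means has sd sqrt(2/n).
    z = (fmean(a) - fmean(b)) / (2.0 / n) ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def share_with_a_finding(studies=1000, outcomes=20, threshold=0.05, seed=0):
    """Share of pure-noise studies with at least one outcome below threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        if any(null_p_value(rng) < threshold for _ in range(outcomes)):
            hits += 1
    return hits / studies


if __name__ == "__main__":
    print(f"simulated share: {share_with_a_finding():.3f}")
    print(f"expected share:  {1 - 0.95 ** 20:.3f}")
```

With 20 independent outcomes, the expected share of noise-only studies with something "significant" to report is 1 - 0.95^20, or roughly 64%, and the simulation should land close to that.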

Rule 6: Ask Who Is Missing
The author doesn't appear to be conversant with many of the controversies surrounding design flaws in Milgram's famous obedience experiments. 

Rule 7: Demand Transparency When the Computer Says No
Google Flu Trends: its early success and its embarrassing later failure.
Useful chapter to avoid overly credulous belief in the value of big data analytics.

Note several examples here where the author thinks private companies should be forced to release not only their data but their algorithms to the public for "scientific value." I hate big tech companies as much as the next person, but this idea is rife with disastrous second order consequences (of which the author seems blissfully unaware).

I'm not sure if the author realizes his many inchoate conclusions often seem contradictory: we're supposed to trust statistics, but we're NOT supposed to trust anyone who uses statistics in the privacy of their own corporate headquarters. Or: sure, we should have privacy, but if our data happens to be captured by and used by some corporate algorithm it should be released to the public to be "assessed rigorously," and evaluated by "independent experts" for "accountability" and for "scientific value."

Rule 8: Don't Take Statistical Bedrock for Granted

Rule 9: Remember That Misinformation Can Be Beautiful Too 
Note the quote from and positive mention of the book How Charts Lie here in light of the earlier negative criticism of How to Lie with Statistics. I'm beginning to have grave concerns--truly grave concerns--that a generation of readers will become cynical about charts and lose their trust in charts because of this book. The author appears totally blind to this grave, grave risk. 

Nice to hear the story of Florence Nightingale, but I think this chapter's message on not being tricked by charts could be replaced with a simple one-sentence heuristic, borrowed from The Last Psychiatrist: "what do they want you to believe?"

Rule 10: Keep An Open Mind
Interesting condescension towards Irving Fisher here: he's often used as a punching bag in many books thanks to his unfortunate quotes about the stock market in 1929. 

Once again, it's rather odd to have the author finally address the reproducibility crisis in his book, and yet cite various studies throughout to make his points... without ever addressing the reproducibility of the specific studies he's citing. Kind of a strange structural (and circular) problem. 

Conclusion: Be Curious
When I mentioned the author's "sometimes vacuous" rules, these last two were front of mind. 
