*The Book of Why: The New Science of Cause and Effect* by Judea Pearl and Dana MacKenzie

Judea Pearl is a professor of computer science at UCLA with a long, distinguished career in software and artificial intelligence. This book is his third in a series on causality. Dana MacKenzie is a professional writer.

The point of *The Book of Why* is that expressing causal linkages with diagrams and mathematical symbols removes much of the fuzziness from searches for the cause of – well – almost anything. Language is ambiguous, sometimes intentionally so, and mere statistical association does not prove what causes what. (Correlation is not causation.)

Over the past 25 years, Pearl concluded that Artificial Intelligence cannot advance to human-like reasoning as long as programming relies on statistical probability alone.

If you are math averse, Pearl can be heavy going. After all, he advocates mathematical symbolism. But all readers will revel in the arrogance of statistical pioneers like Sir Ronald Fisher and Karl Pearson, whose stories illustrate how behaviorally difficult it is for statistical testing to overcome biases. A particularly egregious example involves Fisher and Jacob Yerushalmy. Both were lifelong smokers on the team of statisticians investigating whether smoking causes cancer while being paid as consultants for tobacco companies.

The purpose of statistical testing is to attribute cause. Data is not always necessary. For example, we can see mechanical breakage in a malfunctioning machine without using data or estimating probabilities. However, if invisible or unknown factors contribute to causality, causal reasoning must go beyond direct observation. Absent statistical data, how do humans do this?

Pearl concludes that human causal reasoning does not key on probabilities, although it can be expressed in that way. Also, *The Book of Why* assumes that we really want to identify a cause, not scapegoat or confirm what we want to believe – a big assumption. False attributions and competing lies run rampant. Even setting aside this mental rubbish, statistical analyses can still fool professional statisticians as well as the public.

One reason is misunderstanding an intermediate cause. For example, the British Navy had outbreaks of scurvy long after discovering that sucking lime juice prevented it. Before vitamin C was discovered, scientists thought that lime’s acidity did it. Some naval commanders learned the hard way that substituting an acid without vitamin C did not work.

But Pearl is at his best illustrating how the design of a statistical test can be biased, if for no other reason than inability to control for confounding variables – lots of potential partial causes. The least biased test is a double-blind, randomized controlled trial, when such an experiment can be run. Often it can’t.

A recent illustration of how statistics can bias causal conclusions is the study of the effectiveness of a wellness program in a large organization. (This is not in Pearl’s book.) With 5000 persons surveyed, the effects were first tested by a classic observational comparison: compare enrollees who used the service with enrollees who did not. The participants ate better, exercised more, etc. – significant differences.

However, comparing all wellness enrollees, including those who did not participate in its services, with non-enrollees showed no significant differences. That is, the effectiveness of the wellness program depended on lots of confounding variables and the inclinations of people. “Health nuts” might live healthily without a program. If you were the management, would you continue this wellness program? If you want to improve wellness of the total employee population, what should you do?
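The contrast between the two comparisons can be reproduced with a small simulation. This is only a sketch under stated assumptions, not the study's data or method: enrollment is assumed to be assigned by coin flip, a hidden "health inclination" drives both participation and the outcome, and the program itself is assumed to have no effect. The function name `simulate_wellness` and all numbers are hypothetical.

```python
import random
import statistics


def simulate_wellness(n=5000, true_effect=0.0, seed=0):
    """Return (naive_gap, enrollment_gap) for a simulated workforce.

    All parameters are illustrative assumptions, not figures from the study.
    """
    rng = random.Random(seed)
    participants, non_participants = [], []
    enrolled, not_enrolled = [], []

    for _ in range(n):
        # Hidden confounder: how health-minded this person already is.
        inclination = rng.gauss(0.0, 1.0)
        # Assumption: enrollment is decided by a fair coin flip.
        is_enrolled = rng.random() < 0.5
        # Health-minded enrollees are more likely to use the services.
        participates = is_enrolled and inclination + rng.gauss(0.0, 1.0) > 0
        # Outcome depends on inclination, plus the program effect if used.
        outcome = inclination + rng.gauss(0.0, 1.0)
        if participates:
            outcome += true_effect

        if is_enrolled:
            enrolled.append(outcome)
            if participates:
                participants.append(outcome)
            else:
                non_participants.append(outcome)
        else:
            not_enrolled.append(outcome)

    # Observational comparison: users vs. non-users among enrollees.
    naive_gap = statistics.mean(participants) - statistics.mean(non_participants)
    # Comparison of everyone enrolled vs. everyone not enrolled.
    enrollment_gap = statistics.mean(enrolled) - statistics.mean(not_enrolled)
    return naive_gap, enrollment_gap


if __name__ == "__main__":
    naive, by_enrollment = simulate_wellness()
    print(f"participants vs. non-participants: {naive:.2f}")
    print(f"enrolled vs. not enrolled:         {by_enrollment:.2f}")
```

Even with the program effect set to zero, the first gap comes out large because the "health nuts" select themselves into participation, while the second gap stays near zero. That is the pattern described above.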

Pearl proposes formal causal logic to knife through statistical thickets. In particular, he despairs of observational comparisons that “control for” a string of confounding variables, like male-female, age range, smoke-non-smoke, and so on. He zeros in on the analysis of “counterfactuals,” what could be, or what might have been, rather than what is. Many of our most crucial decisions involve counterfactuals. These can only be assessed indirectly by inference from evidence that we can see.

Pearl proposes causal diagrams and inferential logic based on his Ladder of Causation. The causal logic has to be rigorous. For example, for Y to occur, X had to exist, but the existence of X does not mandate Y. Or the existence of X implies that Y could follow, but not with certainty; Z might follow.

Probabilities suggest a static state – even if it’s a future static state. Causality is always about change, the direction going from one state to another and why. Pearl blends the two into a pattern of logic for “answering a query” that may or may not include gathering data and estimating probabilities, but it always includes a causal model. Drawing a causal model, even if simplistic, usually clarifies to everyone exactly what we’re trying to prove.

**Figure 1. Examples of Causality Diagrams**

To prove anything, we use the diagram as a guide to what data must be collected and how to analyze it. The diagram makes this seem simpler than it is. For example, in the low birth weight case, the data had to back door into a probability of a smoking mother having a baby with a birth defect. All this hangs on Pearl’s Ladder of Causation. It is suggestive of the older ladder of inference, which has been presented in many ways, but Pearl cuts a couple of rungs off it by assuming that people are willing to change beliefs, and that they can exercise logic more rigorously (the reason for the math symbolism).
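For readers who want to see what "back door" means in symbols, here is the general form of Pearl's back-door adjustment. It is offered as a sketch, not as the specific calculation used in the birth weight case. It assumes a set of observed variables Z (a symbol introduced here for illustration) that blocks every back-door path from the treatment X to the outcome Y.

```latex
P(Y = y \mid \mathrm{do}(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)
```

The left side asks what happens if we intervene to set X. The right side uses only probabilities that can be estimated from observed data, which is why the diagram can tell us what data to collect.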

**Figure 2. Pearl’s Ladder of Causation**

Pearl contends that statistical reasoning is usually stuck at level 2 because it does not want to delve into the subjectivity of level 3. He presents a logical format for analysis at level 3.

The diagramming is deceptively simple. To reduce something complex to a diagram, one has to think hard. Beyond that, I doubt that hordes of people will sharpen their reasoning with mathematical symbolism, but just diagramming causal relationships would advance public discourse. If Pearl can stimulate us to do that, he opens a new chapter in how we can resolve our problems instead of endlessly arguing about them.