In the age of data deluge, we are often mesmerized by patterns. But as any good detective or data scientist knows, patterns alone don’t solve mysteries. We must ask deeper questions: not just what happened, but why. This is where Judea Pearl’s "Ladder of Causation" offers a powerful framework for understanding the world, helping us move from passive observation to active reasoning.
To bring these abstract ideas to life, let’s take a walk through 221B Baker Street, where Sherlock Holmes shows us how causal reasoning unfolds in layers.
The Ladder of Causation
Judea Pearl categorizes reasoning into three levels, which he calls the "Ladder of Causation":
Level 1: Association (Seeing)
This is the world of pattern recognition. We observe correlations: the presence of smoke and fire, footprints and a crime.
Dr. Watson notices footprints leading away from the crime scene. He records them dutifully. Holmes, however, knows this is only the beginning.
“Association is data,” Holmes reminds us. “It is not yet insight.”
Level 2: Intervention (Doing)
Now we ask: what happens if we do something? This involves manipulating a variable to see the effect.
Holmes tests whether the back door creaks when opened. He reenacts the suspect's movements, examining what outcomes those actions might produce. This is the act of intervention.
Holmes doesn't just watch; he experiments. He steps into the mystery to see how it responds.
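The gap between seeing and doing can be made concrete with the smoke-and-fire pair from Level 1. The sketch below is only an illustration: the probabilities and the names P_FIRE, P_SMOKE_GIVEN_FIRE, worlds, and p_fire_given_smoke are assumptions invented for this example. Observing smoke makes a fire far more likely, but producing smoke ourselves (say, with a smoke machine) tells us nothing about fire, because the intervention cuts the arrow that runs from fire to smoke.

```python
from fractions import Fraction

# Assumed numbers, chosen only for illustration.
P_FIRE = Fraction(1, 10)
P_SMOKE_GIVEN_FIRE = {True: Fraction(9, 10), False: Fraction(1, 100)}


def worlds(force_smoke=False):
    """Yield (fire, smoke, probability) for every possible world.

    With force_smoke=True we intervene: smoke is present no matter what fire does.
    """
    for fire in (True, False):
        p_fire = P_FIRE if fire else 1 - P_FIRE
        for smoke in (True, False):
            if force_smoke:
                p_smoke = Fraction(1) if smoke else Fraction(0)
            else:
                p_yes = P_SMOKE_GIVEN_FIRE[fire]
                p_smoke = p_yes if smoke else 1 - p_yes
            yield fire, smoke, p_fire * p_smoke


def p_fire_given_smoke(force_smoke=False):
    """Probability of fire among the worlds that contain smoke."""
    table = list(worlds(force_smoke))
    p_smoke = sum(p for _, smoke, p in table if smoke)
    p_both = sum(p for fire, smoke, p in table if fire and smoke)
    return p_both / p_smoke


seeing = p_fire_given_smoke()                  # rung 1: we merely observe smoke
doing = p_fire_given_smoke(force_smoke=True)   # rung 2: we make the smoke ourselves
print(seeing, doing)
```

With these assumed numbers, seeing smoke lifts the probability of fire from 1/10 to 10/11, while making the smoke ourselves leaves it at 1/10.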
Level 3: Counterfactuals (Imagining)
This is the deepest and most human form of reasoning. We ask, “What would have happened if things had been different?”
Holmes ponders, "Had the dog barked, the thief would have been a stranger. But the dog did not bark, so the thief was known."
Counterfactuals allow Holmes to reason backward from facts to motives and from outcomes to alternate realities.
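Holmes's reasoning about the dog loosely follows the three steps Pearl describes for counterfactuals: abduction (work back from the evidence to the background facts), action (change one thing), and prediction (replay the story). The toy model below is a sketch under assumptions that are not in the story as told here: the rule in dog_barks, the idea that we know the dog was alert, and every function name are hypothetical.

```python
from itertools import product


def dog_barks(thief_is_stranger, dog_is_alert):
    """Assumed rule: an alert dog barks at strangers, and only at strangers."""
    return thief_is_stranger and dog_is_alert


def abduce(observed_bark, known_alert=None):
    """Abduction: list the (stranger, alert) worlds that fit the evidence."""
    return [
        (stranger, alert)
        for stranger, alert in product((True, False), repeat=2)
        if dog_barks(stranger, alert) == observed_bark
        and (known_alert is None or alert == known_alert)
    ]


def counterfactual_bark(world, thief_is_stranger):
    """Action and prediction: swap the thief, keep the same dog, replay the rule."""
    _, alert = world
    return dog_barks(thief_is_stranger, alert)


# Evidence: the dog stayed silent, and we assume it was alert that night.
fitting = abduce(observed_bark=False, known_alert=True)
print(fitting)  # the worlds consistent with the evidence
print([counterfactual_bark(world, True) for world in fitting])
```

Under the assumption that the dog was alert, only one world fits the silence: the thief was known to the dog. In that same world, a stranger would have made it bark. Drop the alertness assumption by calling abduce(observed_bark=False) and three worlds fit, including one with a stranger and a dozing dog, which shows how much the conclusion leans on background knowledge.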
The Invisible Hand: Confounding Variables
Before we climb this ladder with confidence, we must watch out for misleading clues, especially those caused by confounding variables. A confounder is an unseen factor that influences both the cause and the effect, giving the illusion of a direct relationship.
Imagine Holmes is investigating a murder at a lavish dinner party. All guests who ate the fish fell ill, so Watson declares: "It must be the fish!" But Holmes digs deeper and finds that everyone who chose the fish also drank the imported white wine. The true culprit? The wine was poisoned. Here, the wine is the confounding variable, linked to both the apparent cause (fish) and the outcome (sickness).
Confounders are the hidden threads in the web of causality. Without uncovering them, we risk mistaking coincidence for truth.
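The dinner party can be turned into a small numerical check. The guest counts below are invented for illustration, and they depart from the story in one respect: a few guests are assumed to have had fish without wine, or wine without fish, so that the two can be compared separately. The function names illness_rate, naive_effect, and adjusted_effect are likewise hypothetical. The sketch compares the raw gap in illness between fish eaters and everyone else with the gap that remains once guests are compared only within the same wine group.

```python
from collections import Counter

# Hypothetical head counts: (ate_fish, drank_wine, fell_ill) -> number of guests.
# Illness follows the wine exactly; fish and wine merely tend to go together.
GUESTS = Counter({
    (1, 1, 1): 18,  # fish and wine, ill
    (1, 0, 0): 2,   # fish, no wine, well
    (0, 1, 1): 2,   # no fish, wine, ill
    (0, 0, 0): 18,  # neither, well
})


def illness_rate(guests, fish, wine=None):
    """Share of guests who fell ill, given their fish (and optionally wine) choice."""
    total = ill = 0
    for (f, w, sick), n in guests.items():
        if f == fish and (wine is None or w == wine):
            total += n
            ill += n * sick
    return ill / total


def naive_effect(guests):
    """Watson's view: compare fish eaters with everyone else, ignoring the wine."""
    return illness_rate(guests, 1) - illness_rate(guests, 0)


def adjusted_effect(guests):
    """Holmes's view: compare within each wine group, then average by group size."""
    n_all = sum(guests.values())
    effect = 0.0
    for wine in (0, 1):
        n_wine = sum(n for (_, w, _), n in guests.items() if w == wine)
        gap = illness_rate(guests, 1, wine) - illness_rate(guests, 0, wine)
        effect += (n_wine / n_all) * gap
    return effect


print(round(naive_effect(GUESTS), 2), round(adjusted_effect(GUESTS), 2))
```

On these made-up counts, fish eaters are 80 percentage points more likely to fall ill, yet the gap vanishes once wine drinkers are compared with wine drinkers and abstainers with abstainers.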
Most of today’s machine learning systems are stuck on the first rung. They excel at finding patterns but falter when asked to reason about causes or imagine alternatives. Pearl’s work gives us a way to build systems that think more like Holmes: systems that can explore interventions and counterfactuals.
Let’s now enlist Holmes to investigate the causes of payment delays, using the supply chain graph of the fictitious supplier TimberFlow Inc. that I introduced in my previous article, Navigating Payment Behavior Using Graph Data Science.
Sherlock Holmes and the Case of the Leather Crisis
Sherlock Holmes leaned back in his creaky office chair at TimberFlow Inc., eyes fixed on the web of nodes glowing on the supply chain graph.
“Watson,” he murmured, “I hear that the distributors are complaining about other clients too, and everyone’s blaming LeatherLux Ltd. for these late payments. Something smells off.”
Watson raised an eyebrow. “But look at the connections! LeatherLux Ltd. shows up in every major delay. They’re clearly the bottleneck.”
Holmes spun the screen toward his companion. “True. But that’s correlation, not causation. Let’s trace the timeline.”
He pointed his finger at the top of the graph. “Look here—August 1st, 2025: EU Leather Tariff Hike. A sudden 15% levy on all leather imports. The news hit LeatherLux Ltd. like a hammer.”
Watson leaned closer. “So, their input costs skyrocketed?”
“Precisely,” said Holmes. “Shipments were delayed, contracts were renegotiated, and, naturally, deliveries to their client CraftFurn Designs, which TimberFlow Inc. also supplies and awaits payment from, were stalled.”
Holmes tapped four more nodes. “Now observe these clients: UrbanHide Accessories, StrideWear Footwear, LuxPurse Co., and ClassicLeather Wallets. All four rely on high-grade leather sourced through LeatherLux Ltd. They also suffered payment delays. But was it because LeatherLux was late, or because the tariff also affected their downstream margins and cash flow?”
Watson’s eyes widened. “So, the EU Tariff didn’t just hit the supplier—it hit the whole chain. And by assuming LeatherLux Ltd. was solely to blame, we missed the bigger culprit.”
Holmes smiled. “A classic case of confounding, Watson. The tariff impacted both the cause and the effect, making LeatherLux Ltd. look guilty by association.”
He tapped the table for emphasis. “In the world of data, you mustn’t just ask what happened, but why. And sometimes, the why hides behind another why.”
“Brilliant!” Watson exclaimed. “And by mapping this as a causal structure, we could eventually isolate the real driver of payment delay.”
“We could also simulate alternate realities,” Holmes finished, “to see what the payment behavior would have looked like. But that’s a journey for another day.”
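Watson's idea of mapping the case as a causal structure can be sketched in a few lines. This is an illustration under stated assumptions, not the TimberFlow Inc. graph from the earlier article: the node labels, the edge list, and the helpers ancestors and confounder_candidates are all hypothetical. The sketch looks for any node that feeds both the suspected cause and the outcome through a path that does not pass through the suspect itself.

```python
# Hypothetical causal edges: each key points to the things it directly influences.
CAUSAL_EDGES = {
    "EU Leather Tariff Hike": [
        "LeatherLux Ltd. delivery delay",
        "Client payment delay",
    ],
    "LeatherLux Ltd. delivery delay": ["Client payment delay"],
}


def ancestors(edges, node, blocked=None):
    """All nodes with a directed path into `node` that avoids `blocked`."""
    parents = {}
    for source, targets in edges.items():
        for target in targets:
            parents.setdefault(target, set()).add(source)
    found, stack = set(), [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent != blocked and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


def confounder_candidates(edges, cause, effect):
    """Nodes upstream of the cause that also reach the effect without going through it."""
    return ancestors(edges, cause) & ancestors(edges, effect, blocked=cause)


print(confounder_candidates(
    CAUSAL_EDGES,
    cause="LeatherLux Ltd. delivery delay",
    effect="Client payment delay",
))
```

On this toy graph the only candidate is the tariff. A shared upstream node is a suspect rather than a verdict: it marks what has to be accounted for before blaming LeatherLux Ltd. for the delays.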