How to Explore the Effect of Doing Something? (Part 1)

Applied Causal Inference 101: Counterfactual Worlds and The Experimental Ideal


When you saw the subtitle of this article, did you see the word “casual”? Maybe “interference” as well?

If you did, you are not alone! We humans are wired to make sense of the world based on our prior experiences. Because we encounter the word “casual” more often than the word “causal,” many of us are likely, at first glance, to see the former even though the writer wrote the latter.

It took me a long time to realize that humans inherently think about the world and communicate with each other in causal terms. In our everyday language, we often build sentences around causal connectives such as because, therefore, thus, and hence. Moreover, many verbs are inherently causal: for instance, affect, impact, influence, increase, and decrease.

As another example, when we think about the relationship between rainfall and rice production, we make statements such as “this year, the production of rice was lower because of a lack of rainfall.” How often do you hear a farmer say that rainfall and agricultural yield are positively correlated/associated? Growing up in a society where ~80% of the population were farmers, I never heard anything even close to such a statement. Outside academia and the data science industry, humans do not usually talk that way. In fact, before I started studying statistics, I did not have a consistent, intuitive understanding of any non-causal correlation/association between two events.

In our daily lives, we look at events that happen together (or one after another) and make a causal inference based on the events’ concurrence (or succession). For example, every time I catch a cold, I consume a holy basil paste, and I get better within a week. Based on this real-world observation (some may even call it an evidence-based conclusion), I “believe” the consumption of holy basil cures my cold symptoms. However, such a framework of inferring a causal relation from two concurrent/successive events often does not work as a reliable method of generating knowledge. In fact, one can make a decent argument by claiming that many of our superstitions, dogmas, and bad policy actions are fundamentally grounded in this type of thinking process.

The critical question, then, is: which approach should we follow to understand the effect of doing something?

In this article, we are going to explore the fundamentals of counterfactual causal reasoning, a method used by social, behavioral, and health scientists to make causal inferences.

Causal Effect of Doing X on Y

The causal effect of doing something (for example, receiving a treatment X) on an outcome (Y) is the difference between the value of the outcome (Y) when we do X and the value of the outcome (Y) when we do not do X, keeping all other things constant.

Let’s begin by defining the treatment/action/intervention (i.e., the thing I am doing), the outcome, and the time period of interest.

I define my treatment as a binary one, i.e., consuming 2 tablespoons of holy basil paste 3 times a day or not receiving any treatment.

The outcome of interest is recovery from the symptoms of the common cold. I define it as a binary outcome, i.e., whether I get a complete recovery.

And the time period of interest is 3 days from the day I start receiving the treatment.

Now that we have defined our key nuts and bolts, let’s think about how to identify the causal effect of interest. In theory, we need to figure out two outcomes:

Outcome #1: My recovery status (i.e., whether I completely recover or not) after 3 days in a world where I consume holy basil paste 3 times a day.

Outcome #2: My recovery status (i.e., whether I completely recover or not) after 3 days in another world where everything else stays exactly the same as in world 1 except I do not consume holy basil paste 3 times a day.

The causal effect is the value of outcome #1 minus the value of outcome #2.

More importantly, let’s take a moment to understand why merely observing my recovery after consuming holy basil paste in this real world is not sufficient to claim that holy basil cures the symptoms of the common cold.

Consider two possible scenarios, #1 and #2. In both, I take the holy basil paste and observe my full recovery, but I can claim that the paste cures cold symptoms only in scenario #1!

Why so? 😳

Because if scenario #1 is true, I will fully recover from all cold symptoms after 3 days when I consume the holy basil paste, but I will keep suffering if I do not take it. Please keep in mind the critical assumption that everything else stays exactly the same in the two worlds, except in one world, I take the treatment, while in the other, I do not. In this case, we conclude that the holy basil paste treatment positively affects recovery from the common cold symptoms after 3 days.

However, if scenario #2 is true, I will fully recover from all cold symptoms after 3 days even if I do not consume the holy basil paste (which may be solely due to my natural immunity). Here, we conclude that the holy basil treatment has no effect on the recovery from cold symptoms after 3 days.
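The two scenarios can be written down as a small sketch. The class name `PotentialOutcomes` and the 0/1 coding of recovery are my own illustrative assumptions, not part of any library. The point is simply that the observed outcome is the same in both scenarios while the causal effect is not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PotentialOutcomes:
    """Recovery status after 3 days in each world (1 = full recovery, 0 = not)."""

    with_treatment: int      # outcome #1: the world where I take the paste
    without_treatment: int   # outcome #2: the same world, minus the paste

    def causal_effect(self) -> int:
        # Causal effect = value of outcome #1 minus value of outcome #2.
        return self.with_treatment - self.without_treatment


# Scenario #1: I recover only if I take the paste.
scenario_1 = PotentialOutcomes(with_treatment=1, without_treatment=0)
# Scenario #2: I recover either way (e.g., thanks to natural immunity).
scenario_2 = PotentialOutcomes(with_treatment=1, without_treatment=1)

for name, scenario in [("scenario #1", scenario_1), ("scenario #2", scenario_2)]:
    observed = scenario.with_treatment  # all I can see in the real world
    print(f"{name}: observed recovery = {observed}, "
          f"causal effect = {scenario.causal_effect()}")
```

In both scenarios the observed recovery is 1, yet the causal effect is 1 in scenario #1 and 0 in scenario #2. Observing my recovery alone cannot tell the two apart.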

Conceptualizing Counterfactual Worlds with Real-World Data/Evidence

You might be wondering, “Okay, that makes sense! But how can we create that counterfactual world where everything else stays constant except we do not receive the treatment? After all, we (seemingly) exist only in this world and not in any counterfactual worlds 🙄”

Indeed, it is a difficult challenge! Unfortunately, we do not yet know any reliable way of identifying the causal effect of a treatment on an outcome for any individual human being. However, we do have a pretty reliable method of identifying the causal effect for the typical (average) person of a target population. In this sense, causal inference in social, behavioral, and health sciences is inherently “probabilistic,” not “deterministic” in nature. For example, when we say smoking cigarettes causes lung cancer, it does not imply that every smoker will develop lung cancer; instead, it means if we compare the outcomes of a group of people in two different worlds — in which everything else stays constant, except in one world, they smoke cigarettes and in the other, they do not — on average, the probability that they will develop lung cancer is higher in the world in which they smoke compared to the world in which they do not smoke.

Okay, coming back to the creation of counterfactual worlds!

Ideally, we follow a two-step process. First, we define our target population (e.g., generation Z in the U.S.) and select a random sample from that population. This is done because running an experiment — maintaining all the scientific rigor — on an entire population can be extremely complicated. Next, we randomly assign our sample into two groups. Interestingly, random assignment creates two groups of people who are, on average, identical to each other. For example, if the average age in one group is 20, the average age in the other group is expected to be 20 (**at least pretty close to 20 unless you are extremely unfortunate**).

**For practical reasons, in academic research, the first step (i.e., the random selection of research participants from a target population) is often avoided; in many cases, experiments are run on participants who are “convenient” to find. This limits the generalizability of the findings beyond the studied sample and the unknown population it represents. Interestingly, big-tech companies do not need to select a random sample as they can run experiments on their entire user population if they want. Think about the user interface of the tipping option in your UberEats/GrubHub/DoorDash app. They can tweak a specific feature in their app and randomly assign their users to treatment and control groups to test whether the tweak really worked.**

How do we randomly assign people? In the simplest case, we can give each participant a coin to flip: if it lands heads, we assign them to the treatment group, and if it lands tails, we assign them to the control group. The treatment group, obviously, gets the treatment, whereas the control group does not. **In medical trials, the control group receives a placebo (i.e., something we know for sure will not directly affect the outcome)**. This process is called a “randomized experiment,” which many academics, but not all, consider to be the “gold standard” of causal inference.

Most importantly, the typical (average) participant of the treatment group and the control group in a randomized experiment can be considered counterfactuals of each other. In simpler words, randomized experiments help us make an apples-to-apples comparison. If randomization works as intended, we treat the difference between the treatment group’s average outcome and the control group’s average outcome as an estimate of the causal effect (a.k.a. the average treatment effect).
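To make the difference-in-averages idea concrete, here is a minimal simulation sketch that uses only Python's standard library. Every number in it is made up for illustration: the recovery probabilities, the assumed 0.2 boost from the treatment, and the sample size. In a simulation, unlike in real life, we can generate both potential outcomes for each person and check the estimate against the truth.

```python
import random
from statistics import mean


def run_experiment(n_participants: int = 10_000, seed: int = 0) -> tuple[float, float]:
    """Simulate a randomized experiment; return (true ATE, estimated ATE)."""
    rng = random.Random(seed)
    treated_outcomes, control_outcomes, individual_effects = [], [], []
    for _ in range(n_participants):
        # Hypothetical recovery probabilities; natural immunity varies by person.
        p_without = rng.uniform(0.2, 0.6)
        p_with = p_without + 0.2  # assumed boost from the treatment
        # The simulation knows both potential outcomes for every participant.
        y_without = int(rng.random() < p_without)
        y_with = int(rng.random() < p_with)
        individual_effects.append(y_with - y_without)
        # Coin flip: heads -> treatment group, tails -> control group.
        # Only the outcome matching the assigned group is "observed".
        if rng.random() < 0.5:
            treated_outcomes.append(y_with)
        else:
            control_outcomes.append(y_without)
    true_ate = mean(individual_effects)
    estimated_ate = mean(treated_outcomes) - mean(control_outcomes)
    return true_ate, estimated_ate


true_ate, estimated_ate = run_experiment()
print(f"true ATE = {true_ate:.3f}, estimated ATE = {estimated_ate:.3f}")
```

With a sample this large, the estimate should land close to the assumed effect of 0.2, even though no participant is ever observed in both worlds.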

**I must admit, the magic of randomization did not seem intuitive to me when I first learned about it. I wondered, “how can something as chaotic as a random process create an outcome as orderly as two identical groups?🤔” Again, I was thinking like a normal human being; uncertainty is innately uncomfortable to me. Later, I convinced myself of the benevolence of randomization after observing the outcomes of computer simulations. Now I keep faith in the holy randomization 🙏**
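A simulation of the kind mentioned above can be sketched in a few lines. This is not the simulation I originally ran, only an illustrative one with hypothetical ages drawn between 18 and 25. It repeatedly splits a sample at random and records how far apart the two groups' average ages end up.

```python
import random
from statistics import mean


def age_gap_after_randomization(n_participants: int, rng: random.Random) -> float:
    """Randomly split a sample into two groups; return the gap in average age."""
    ages = [rng.randint(18, 25) for _ in range(n_participants)]  # hypothetical ages
    rng.shuffle(ages)  # random assignment: first half treatment, second half control
    half = n_participants // 2
    treatment, control = ages[:half], ages[half:]
    return abs(mean(treatment) - mean(control))


rng = random.Random(42)
typical_gap = {}
for n in (20, 200, 2_000):
    # Repeat the random split 200 times to see the typical gap for this sample size.
    gaps = [age_gap_after_randomization(n, rng) for _ in range(200)]
    typical_gap[n] = mean(gaps)
    print(f"n = {n:>5}: typical gap in average age = {typical_gap[n]:.2f} years")
```

The gap shrinks as the sample grows, which is the sense in which randomization makes the two groups identical "on average": the balance is a tendency that strengthens with sample size, not a guarantee for any single small experiment.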

Nevertheless, randomization, all by itself, is not enough for valid counterfactual reasoning. We have to make two additional assumptions:

1. No interference: The outcome of one participant in the experiment depends only on her treatment assignment, not on the treatment assignment of others. For example, if I am part of the experiment considered earlier, whether I get full recovery from the symptoms of the common cold depends only on whether I consume the holy basil paste and not on whether other participants consume holy basil paste.

**Do you think the above assumption holds in the context of contagious outcomes? 😳**

2. No two versions of the same treatment: Everyone within a particular treatment category must receive the same version of the treatment. For example, everyone who gets the holy basil paste treatment must consume 2 tablespoons of it 3 times a day.
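To see why the first assumption matters, here is a rough sketch of what can happen when it fails. The outcome model is entirely hypothetical: I assume each person's recovery probability rises by 0.2 if they take the paste and by up to another 0.2 depending on the share of other participants who take it (think of a contagious illness). The names `BASE`, `DIRECT`, and `SPILLOVER` are illustrative.

```python
import random
from statistics import mean

# Hypothetical parameters of the recovery probability.
BASE, DIRECT, SPILLOVER = 0.3, 0.2, 0.2


def recovery_probability(own_treatment: int, share_of_others_treated: float) -> float:
    # Interference: my recovery depends on other participants' treatment as well.
    return BASE + DIRECT * own_treatment + SPILLOVER * share_of_others_treated


rng = random.Random(1)
n = 1_000
assignment = [int(rng.random() < 0.5) for _ in range(n)]  # coin-flip assignment
n_treated = sum(assignment)

# Expected recovery for each participant, given everyone's assignment.
probabilities = [
    recovery_probability(z, (n_treated - z) / (n - 1)) for z in assignment
]

treated_mean = mean(p for p, z in zip(probabilities, assignment) if z == 1)
control_mean = mean(p for p, z in zip(probabilities, assignment) if z == 0)
difference_in_means = treated_mean - control_mean

# The contrast we usually care about: everyone treated vs. no one treated.
all_vs_none = recovery_probability(1, 1.0) - recovery_probability(0, 0.0)

print(f"difference in means      = {difference_in_means:.3f}")
print(f"everyone vs. no one      = {all_vs_none:.3f}")
```

Under these assumptions, the comparison between the treatment and control groups picks up roughly the direct effect only, because the control group also benefits from the treated participants around them. The gap between the two printed numbers is the part of the effect that the simple comparison misses.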

Real-World Implications

Running experiments on human beings is difficult. Sometimes running an experiment is unethical, if not entirely impossible. For example, let’s pretend we are interested in investigating whether “consensual non-monogamy” makes couples happier. As I described earlier, to investigate the above question, first, we need to clearly define the treatment, the outcome, the time period, and the population of interest.

For simplicity, let’s again define the treatment as a binary one: either a couple is in a monogamous relationship or a consensual non-monogamous relationship. Can we randomly assign couples to these two groups? Even if we find some willing participants, the ethics review committee will probably not allow running such an experiment.

The outcome is happiness, which we can measure using one of the existing happiness scales (usually survey questionnaires). Many of these scales are thought to be reasonably accurate and precise.

How about the time period? This one is tricky as well! I mean, even if we can run this experiment, will participants let us measure their happiness on a frequent basis after they receive the treatment? And for how long will they allow us to do so?

These questions are worth asking and ruminating over. Arguably, we all aspire to know more about human behavior and wellbeing. As “consumers of knowledge,” we must train ourselves to differentiate between knowledge derived through a reliable method and knowledge derived through a not-so-reliable method.

I will end by sharing a provocative article by the BBC titled “.” As a reader, I unknowingly get enticed into reading the article due to its usage of the causal verb “benefit” (**who doesn’t want to be benefitted? 🤷🏾‍♂️️**). I start reading it to be enlightened on the effect of being non-monogamous on happiness. Again, often, I do not know how to wrap my head around a non-causal statistical association between two events. And the author knows that! (**and avoids writing these titles: having many lovers is associated with/linked to/related to more benefits🧠✨**)

The article is, indeed, a fascinating one. It includes the narratives of real people, theoretical and conceptual arguments, and expert opinions. Earlier in the article, the author acknowledges that the effect of consensual non-monogamy on wellbeing is unclear. However, afterward, the way the author develops the article, I — being an ordinary and non-expert reader of the issue — cannot help thinking that non-monogamy causes more happiness, at least for certain groups of people.

What causal inference would you make based on the BBC article? And why would you make it? Please do share your take on this!

**Hint: think about the 3 fundamental principles of causal inference: 1) apples-to-apples comparison, 2) no interference, and 3) no two versions of the same treatment**

This article’s second and final part discusses how social, behavioral, and health scientists infer causality when running an experiment is not an option. Here is the link to it:

**For drawing the figures, I used vector images from , which offers copyright-free vector images in popular .eps, .svg, .ai and .cdr formats.



Sharing synthesized ideas on Data Analysis, Data Literacy, Causal Inference, and Wellbeing | Ph.D. candidate @UW-Madison | More:

Vivekananda Das
