A cat with big ears

Magical correlation

Today I’ll teach you how to cook up first-class modern science. The method is simple and it works: I’ve tested it myself. Fifteen minutes, and I’m ready to publish three research papers. Here are my findings:

– In a sample of 80 cats, the individuals with higher body mass have larger ears. It means that 99% of the body mass in cats is concentrated in the ears, and the ears determine the total mass.
– Pregnancy is the result of drinking water. I found out that 100% of pregnant females drank water. Meaning that in males there is an anomaly of water metabolism, hence no pregnancies.
– In Moscow, room heating is on when it’s winter. Which means that the heating can cause winters in Moscow!

You’re not persuaded, I bet. But would you be surprised to know that I am far from the first scientist in the world to ever use this technique? It’s everywhere, in every field of science. It’s called “Correlation = Causation”.

Correlation is a relationship between two values: when one changes, the other tends to change along with it. For example, here’s my plot from the cats’ ears research:

Cat ear length vs body mass

You can see how the ear length changes along with the body mass. The values correlate, and we can use one to guess the other. Say I know that some cat is heavy: then I can guess that its ears are long. And most likely, I’ll be right.
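To make “correlate” and “guess” concrete, here is a minimal sketch in Python. The cats are invented: the hidden “body size” factor, the coefficients and the noise levels are all my assumptions for illustration, not data from the plot. The sketch computes the Pearson correlation coefficient and fits a least-squares line that guesses ear length from mass.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Made-up cats (assumption): a hidden "body size" factor drives
# both the mass and the ear length. Ears do not feed into mass.
size = [rng.uniform(0.8, 1.2) for _ in range(80)]
mass_kg = [4.0 * s + rng.gauss(0, 0.2) for s in size]
ear_cm = [6.0 * s + rng.gauss(0, 0.3) for s in size]

r = pearson(mass_kg, ear_cm)

# Least-squares line for guessing ear length from mass.
mean_mass = sum(mass_kg) / len(mass_kg)
mean_ear = sum(ear_cm) / len(ear_cm)
slope = (sum((m - mean_mass) * (e - mean_ear) for m, e in zip(mass_kg, ear_cm))
         / sum((m - mean_mass) ** 2 for m in mass_kg))
intercept = mean_ear - slope * mean_mass

def guess_ear_cm(mass):
    """Guess a cat's ear length (cm) from its mass (kg)."""
    return intercept + slope * mass

print(f"r = {r:.2f}")
print(f"guess for a 4.6 kg cat: {guess_ear_cm(4.6):.1f} cm")
print(f"guess for a 3.4 kg cat: {guess_ear_cm(3.4):.1f} cm")
```

A coefficient close to 1 means the guess will usually be decent. Note that nothing in the code lets ear length influence mass, and yet the guess works: prediction needs only correlation.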

But when I conclude that body mass of cats is concentrated in their ears, I imply a causation.

Causation is a particular case of correlation in which not only the values are associated, but the underlying traits as well. With correlation we can guess one value from the other; with causation one value has a direct effect on the other. Say, if body mass were indeed concentrated in cats’ ears, we could not only predict the mass from the ear length, but also control the mass by trimming or stretching the ears.

Obviously, that’s impossible. The variables of ear length and body mass are associated, but the mass does not depend on the ear length. The same for water and pregnancy. And for heating and winters.

In all three “scientific” studies, I drew the same pseudo-logical conclusion: I assumed causation from correlation. This is known as a gross blunder in data analysis.

If two values correlate, that by no means implies that one of them causes the other.
Correlation ≠ Causation!!!

Our brain instinctively searches for logical explanations. When we see two correlating values, we rush to explain one value with the other. When we find out that the three pupils with the highest exam scores eat tofu for breakfast, we start to think that tofu is the key to boundless IQ. When we find out that four of the six best actors in the world drive blue cars, we rush to think that blue is the colour of success.

Stop right now. Don’t, don’t and don’t. Don’t rush. You know nothing yet. Not only does correlation not necessarily mean causation, there are also several types of causation in data analysis.

Say, we have two correlating values: the magical power of a wizard, and the height of his/her hat. The plot:

Magical powers vs Hat height

The values correlate, and we can use one to predict the other. Yet the graph could mean one of five(!) types of interaction between the values:

– Direct causation
– Indirect causation
– Bidirectional causation
– Common causation
– Correlation without causation

Now, each of those in detail:

a) Direct causation.

Direct causation is when a change in one value causes a change in the other. That’s the basic type of causation. In our case, a magician’s hat is a magical antenna, and the higher it is, the better it harnesses the magical energy of the universe. The variables are interconnected, with one influencing the other. A direct causation.
Test: replace a wizard’s hat with a taller one, and the power grows.

Nota bene: direct causation can be reversed. Example: hat height doesn’t determine the power; the power is increased by consuming special mushrooms. Yet the order of wizards demands that hat height corresponds to one’s power. In other words, the power determines the hat height. Test: feed mushrooms to a wizard for supper. Next morning, the wizard finds out he has become twice as strong and immediately runs to a hat store for a new, taller hat. It won’t work in the other direction: if we replace his hat at night, his power won’t rocket in the morning (he’ll merely be expelled from the order for misconduct).

It’s not uncommon in data analysis to mistake a direct causation for a reverse one. Or vice versa. Like I did with the heating and winter. Winter causes heating, not the other way around.
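The hat-swap and mushroom tests can be mimicked with a toy simulation. Everything in it is an assumption made up for illustration: the `simulate` function, the dress-code formula (hat height is five times the power, plus noise) and all the numbers. It sketches the reverse case, where power determines the hat, and shows that forcing one value by hand is what reveals the direction.

```python
import random

def simulate(n, seed, force_hat=None, force_power=None):
    """Toy world where power determines hat height (the reverse case).

    force_hat / force_power model an intervention: we set the value
    by hand instead of letting the world produce it.
    """
    rng = random.Random(seed)
    wizards = []
    for _ in range(n):
        power = rng.uniform(1, 10) if force_power is None else force_power
        # Hypothetical dress code: hat height follows power, with some noise.
        natural_hat = 5.0 * power + rng.gauss(0, 2)
        hat = natural_hat if force_hat is None else force_hat
        wizards.append((power, hat))
    return wizards

def mean(values):
    values = list(values)
    return sum(values) / len(values)

observed = simulate(1000, seed=1)
tall_hats = simulate(1000, seed=1, force_hat=100.0)  # swap hats at night
strong = simulate(1000, seed=1, force_power=10.0)    # mushrooms for supper

mean_power_observed = mean(p for p, _ in observed)
mean_power_tall_hats = mean(p for p, _ in tall_hats)
mean_hat_observed = mean(h for _, h in observed)
mean_hat_strong = mean(h for _, h in strong)

print(f"mean power, untouched:      {mean_power_observed:.2f}")
print(f"mean power, hats forced:    {mean_power_tall_hats:.2f}")
print(f"mean hat, untouched:        {mean_hat_observed:.2f}")
print(f"mean hat, power forced up:  {mean_hat_strong:.2f}")
```

Forcing the hats leaves the power exactly where it was, while forcing the power drags the hats up with it. In this toy world the arrow points from power to hat, and only the intervention shows it; the scatter plot alone could not.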

b) Indirect causation.

Indirect causation occurs when one of the values causes some third, hidden trait to change, and that hidden trait then affects the second value. Exempli gratia: a wizard’s power affects the length of her legs, while the order demands that hat height corresponds to leg length. Test: feed mushrooms to a wizard. In the morning she’ll notice that her powers have doubled and that her legs have become longer, and she’ll immediately run to buy a new hat that fits her new leg length.

Q: What’s the difference between direct and indirect causation?
From the outside they look the same. But instead of feeding her mushrooms, we could increase the leg length directly, say with some surgical procedure. Then our wizard wakes up, realizes that her legs are longer and puts on a taller hat, while her power hasn’t changed. The power does not determine the hat height directly.

!!! Indirect causation can be reversed as well.
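The surgical test can be sketched in a few lines of Python. The chain power → leg length → hat height, the coefficients and the noise are all invented for illustration. The idea: if power reaches the hat only through the legs, then setting every wizard’s legs to the same length by surgery should wipe out the correlation between power and hat.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(2)
n = 500
power = [rng.uniform(1, 10) for _ in range(n)]

# Hypothetical chain (assumption): power -> leg length -> hat height.
legs_cm = [80 + 4.0 * p + rng.gauss(0, 2) for p in power]
hat_cm = [0.5 * leg + rng.gauss(0, 2) for leg in legs_cm]
r_observed = pearson(power, hat_cm)

# Surgery: every wizard gets 100 cm legs, whatever her power is.
legs_fixed = [100.0] * n
hat_after_surgery = [0.5 * leg + rng.gauss(0, 2) for leg in legs_fixed]
r_after_surgery = pearson(power, hat_after_surgery)

print(f"power vs hat, untouched:     r = {r_observed:.2f}")
print(f"power vs hat, after surgery: r = {r_after_surgery:.2f}")
```

With a direct causation the power would still show up in the hats after the surgery. Here it doesn’t, because the only route from power to hat went through the legs.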

c) Bidirectional causation.

In bidirectional causation, a change in either of the two values causes a change in the other.

Say, a hat is a magical antenna, thus a taller hat increases magical powers. At the same time, there are also mushrooms that boost powers. But the order demands that the height of a hat corresponds to powers anyway. A wizard who has pumped up her powers with mushrooms would have to tamper with her hat to make it look taller; otherwise it’s expulsion from the order.

Essentially, bidirectional causation is direct and reverse causation at the same time. A affects B, and B affects A. They are like balance scales that always have to be in equilibrium: whichever pan you add weight to, the same weight has to be put in the other.

Test 1: give a wizard a new, taller hat: now the antenna is better and thus the powers increase.
Test 2: increase the powers of a wizard with mushrooms: he’ll notice it and immediately run to elongate his hat. Causation works in both directions.

d) Common causation.

Common causation means that there is a third, hidden value that makes both of our correlating values change.

Example: the power of a wizard depends on the elevation. The higher the wizard ascends in the mountains, the stronger she becomes. At the same time, a hat is a special height-measuring device: it grows when the air pressure drops and shrinks when the pressure rises. It’s designed so that a wizard can touch her hat at any time and know instantly how far above the ground she is. The elevation thus determines both the power and the hat height.

Test: invite our wizard into an elevator. At the top floor, her powers exceed the combined strength of Darth Sidious and Master Yoda, and her hat punctures the roof of the elevator. At floor B4, her power is nil and her hat turns into a tiny pompom.

By the way, failing to recognize common causation is the most frequent misinterpretation of correlation. People see two values correlate and blurt out on the spot: “the left one determines the right one”. In reality the two merely share a common cause and are otherwise unrelated to each other. The opposite happens too: there is a direct causation, but the scientists are skeptical: no, no, it’s just the Sun affecting both of them.
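One way to smoke out a common cause is to hold it still and see whether the correlation survives. Below is a rough sketch with invented numbers: the elevation range, both coefficients and the noise are my assumptions. Power and hat height each depend on elevation only. Across all wizards they correlate strongly; among wizards standing at nearly the same elevation the correlation should all but disappear.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(3)
n = 2000

# Hypothetical world (assumption): elevation drives both values,
# and they have no link to each other besides that.
elevation_m = [rng.uniform(0, 3000) for _ in range(n)]
power = [0.01 * e + rng.gauss(0, 3) for e in elevation_m]
hat_cm = [0.02 * e + rng.gauss(0, 6) for e in elevation_m]

r_all = pearson(power, hat_cm)

# Hold the common cause (almost) constant: keep only the wizards
# standing between 1000 and 1100 metres.
band = [(p, h) for e, p, h in zip(elevation_m, power, hat_cm)
        if 1000 <= e < 1100]
r_band = pearson([p for p, _ in band], [h for _, h in band])

print(f"all wizards:           r = {r_all:.2f}")
print(f"same elevation band:   r = {r_band:.2f} ({len(band)} wizards)")
```

If the hat really were an antenna (direct causation), the correlation would persist inside the band too. That is the difference the elevator test is after.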

e) Correlation without causation.

Sometimes two values correlate but are not really related to each other, neither directly nor in any other way.

It rarely happens. Normally, if two values correlate, they are related at least somehow. Correlation without causation essentially means a coincidence: we accidentally happened to sample weak wizards with small hats and strong wizards with tall hats. There are heaps of other magicians; they just happened not to show up for our survey, god knows why.
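How likely such a coincidence is depends mostly on the sample size. Here is a rough sketch: the function name `chance_of_strong_r`, the 0.6 threshold and the sample sizes are my own choices for illustration. It draws power and hat height completely independently, many times over, and counts how often the sample still shows a strong-looking correlation.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def chance_of_strong_r(rng, sample_size, trials=2000, threshold=0.6):
    """Share of samples where two unrelated values show |r| > threshold."""
    hits = 0
    for _ in range(trials):
        # Power and hat height are drawn independently: no link at all.
        power = [rng.gauss(0, 1) for _ in range(sample_size)]
        hat = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson(power, hat)) > threshold:
            hits += 1
    return hits / trials

rng = random.Random(4)
small = chance_of_strong_r(rng, sample_size=6)
large = chance_of_strong_r(rng, sample_size=100)

print(f"6 wizards per survey:   {small:.1%} of surveys show |r| > 0.6")
print(f"100 wizards per survey: {large:.1%} of surveys show |r| > 0.6")
```

With a handful of wizards, pure chance should produce an impressive correlation fairly often; with a hundred it should almost never happen. So before blaming coincidence, or ruling it out, check how many wizards came to the survey.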

Real correlations misinterpreted:

Correlation: Sleeping with footwear on correlates with morning headaches.
Wrong one: Having footwear on when sleeping somehow causes headaches (direct causation).
Reality: In the morning after booze parties headaches are not uncommon. Also, when falling asleep drunk, the chance of forgetting to take one’s footwear off is much higher than when sober. In other words, booze causes both headaches and sleeping with footwear (common causation).

Correlation: Children sleeping with lights on are more likely to develop myopia.
Wrong one: Light left on at night causes myopia in children (direct causation).
Reality: That’s an interesting one. In fact, myopia can be inherited from parents. Myopic parents, in turn, are more likely to leave the light on in their children’s bedrooms (to see better themselves). Myopia in parents is the common reason for both myopia in their children and the light left on at night (common causation).

Correlation: Ice-cream sales correlate with the frequency of drownings.
Wrong one: Ice-cream makes people drown (direct causation).
Reality: Ice-cream is mostly sold in hot seasons, when people swim a lot. The more they swim, the more, obviously, they drown (common causation, also indirect in a way).

Be careful and thorough whenever you feel that two occurrences have a relationship. Think well: does one really cause the other, and if so, how? And be critical of any facts you read on the Net. Some of them are as good as poor fantasy fiction.


