Causality Versus Correlation

September 7, 2008

The media regularly tell us about the “causes” and “preventions” of various diseases. For example, high-fat diets cause this, and high cholesterol causes that, or as I discussed last week, coffee helps “prevent” heart disease. But do we actually know whether these things cause or prevent these diseases? If you read any of the major health sites, you might think so. But if you read them again more carefully, you’ll notice they don’t actually talk about causes; they talk about risk factors. Take the American Heart Association web site, for example:

Disease           Causes   Major Risk Factors
Heart Disease     ?        Smoking, high cholesterol, high blood pressure, lack of exercise, obesity, poor diet
Type 2 Diabetes   ?        Smoking, high cholesterol, high blood pressure, lack of exercise, obesity, poor diet
Stroke            ?        Smoking, high cholesterol, high blood pressure, lack of exercise, obesity, poor diet

So what’s the difference between a “cause” and a “risk factor”? Risk factors sound like risks, which are bad, right?

Let’s look at an example. What’s the leading cause of car accident mortality?

  • Reckless driving? Nope.
  • Speeding? Nope.
  • Getting distracted? Nope.
  • Bad weather? Nope.

None of these things actually cause death. Car accident mortality is usually caused by bodily harm incurred when we’re introduced to Newton’s First Law of Motion, i.e. “A body in motion will remain in motion unless acted upon by an outside force”, like a tree or another car!

Now you might argue that this is splitting hairs, but my point is that we need to be careful when talking about cause and effect. In order for something to be causative, we need to be able to link the series of events together with a clear relationship between them. For example:

  • Serious bodily harm usually causes death because we lose too much blood and our internal organs fail, or our internal organs become too damaged to function

Straightforward, right? So back to our secondary factors:

  • Speeding can increase the probability of an accident, and thus death, because you and the other drivers have less time to react to avoid a collision
  • Bad weather can increase the probability of an accident, and thus death, because reduced visibility or traction can prevent you from avoiding a collision

Still straightforward, right? If we analyze car accident rates, we’d expect that drivers who speed or drive recklessly would have higher mortality rates. We start with a plausible cause, and we can verify the effect. If necessary, we can test our hypothesis in a controlled environment.

But say we notice that drivers of red cars also have higher mortality rates. Now what? We have an effect, but what’s the cause? Working backwards is dangerous: just because two things are correlated doesn’t mean one causes the other. For example, what happens if we ban red cars? Will we have fewer car accidents? Not likely. In this case, red cars may be correlated with personality types that are more likely to drive recklessly, rather than being causative in any way. Some people are going to drive recklessly regardless of car color. So red cars may not cause increased mortality, but they are a risk factor for increased mortality.
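If you like to see this kind of thing in numbers, here is a rough Python sketch of the red car story. It is a toy model, not real accident data: the function name simulate, the 20% share of reckless drivers, the color preferences, and the fatality rates are all made-up assumptions. The one thing the model deliberately builds in is that the risk of dying depends on recklessness only, never on car color.

```python
import random

def simulate(n_drivers=100_000, ban_red_cars=False, seed=0):
    """Toy model: personality drives both car color and fatality risk."""
    rng = random.Random(seed)
    counts = {"red": 0, "other": 0}
    deaths = {"red": 0, "other": 0}
    for _ in range(n_drivers):
        # Assumption: 20% of drivers are reckless.
        reckless = rng.random() < 0.2
        # Assumption: reckless drivers like red cars more (50% vs 10%).
        wants_red = rng.random() < (0.5 if reckless else 0.1)
        color = "red" if (wants_red and not ban_red_cars) else "other"
        # Assumption: fatality risk depends on recklessness only, not color.
        died = rng.random() < (0.05 if reckless else 0.01)
        counts[color] += 1
        deaths[color] += died
    # Mortality rate per color (colors nobody drives are left out).
    rates = {c: deaths[c] / counts[c] for c in counts if counts[c]}
    total_rate = sum(deaths.values()) / n_drivers
    return rates, total_rate

rates, total = simulate()
rates_ban, total_ban = simulate(ban_red_cars=True)
print("mortality by color:", rates)
print("overall mortality, red cars allowed:", total)
print("overall mortality, red cars banned: ", total_ban)
```

Under these assumptions, red cars show a clearly higher mortality rate than other cars, yet the overall mortality rate is the same with or without the ban, because the reckless drivers simply end up in cars of another color.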

All of this is intuitive for car accidents, but what about diseases? If we look back at the American Heart Association web site information from above, how many causative factors are listed? None. How many correlated factors are listed? All of them. Oops…

Digging around a little, I couldn’t find any major health sites that said XXX causes YYY, besides smoking causing lung cancer. But some certainly imply causation:

“Extensive clinical and statistical studies have identified several factors that increase the risk of coronary heart disease and heart attack.”

Risk factors increase the risk of heart disease? I don’t think so… What they should have said is “they’ve identified several risk factors that are associated with an increased risk of heart disease”. It’s a subtle but important difference.

“People with diabetes are two to four times more likely to develop cardiovascular disease due to a variety of risk factors”

“Due to” implies “because of” – which certainly implies cause and effect. Hmm…

The definition of a risk factor is:

“A risk factor is a variable associated with an increased risk of disease or infection. Risk factors are correlational and not necessarily causal, because correlation does not imply causation.”

So by definition, risk factors don’t cause anything. If diet and obesity were causative, they’d say so – but they don’t. So why are they trying so hard to get us to control risk factors that may or may not be causes? That’s a topic for a future post…

Risk factor != cause

Also, don’t assume that major risk factors are more likely to be causative than minor risk factors. “Major” simply means that they’re more strongly correlated. For example, yellow teeth are a major risk factor for lung cancer. So yellow teeth cause lung cancer?
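To put a number on “major”, here is a small worked example in Python. All the counts are invented, and the idea that smoking is the common factor behind both yellow teeth and lung cancer is an assumption made for illustration, not something taken from a study. The helper names risk and relative_risk are likewise just for this sketch.

```python
# (smoker, yellow_teeth) -> (people, lung_cancer_cases)
# All counts are made up. Assumption: within each smoking group,
# the cancer rate is the same with or without yellow teeth.
counts = {
    (True, True): (1600, 160),    # smokers, yellow teeth
    (True, False): (400, 40),     # smokers, no yellow teeth
    (False, True): (800, 8),      # non-smokers, yellow teeth
    (False, False): (7200, 72),   # non-smokers, no yellow teeth
}

def risk(rows):
    """Fraction of people in the given rows who have lung cancer."""
    people = sum(p for p, _ in rows)
    cases = sum(c for _, c in rows)
    return cases / people

def relative_risk(smoker=None):
    """Risk with yellow teeth divided by risk without yellow teeth.

    smoker=None uses everyone; True or False restricts to that group.
    """
    def select(yellow):
        return [v for (s, y), v in counts.items()
                if y == yellow and (smoker is None or s == smoker)]
    return risk(select(True)) / risk(select(False))

print("everyone:    ", relative_risk())
print("smokers:     ", relative_risk(smoker=True))
print("non-smokers: ", relative_risk(smoker=False))
```

With these invented counts, people with yellow teeth are several times more likely to have lung cancer when everyone is lumped together, which would make yellow teeth a “major” risk factor. Once smokers and non-smokers are looked at separately, yellow teeth make no difference at all. A strong correlation says nothing by itself about cause.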

[Content © 2008 SorryToConfuseYou.com, All Rights Reserved.]
