
I'm tempted to repeat the oft said correlation is not causation. But really I have never thought that captures the problem I have with studies like this. Of course correlation does not necessarily imply causation.

What I really find problematic is that a short time series of data on one facet of human behaviour across a population is probably going to correlate extremely well with many other time series across the same population, or even other populations. How unusual should we find the correlation?

What we might be seeing is nothing more than the average rate at which a population responds to legislative changes, if it even shows that. If you take two time series that rise and then fall and overlay them on their peaks, I suspect most graphs look as convincing as the one presented.
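A rough way to check the suspicion above is to generate unrelated rise-and-fall curves, align them at their peaks, and see how well they correlate. This is only a sketch: the bell-shaped curve, the range of widths, the 5% noise level and the 41-year window are all assumptions picked for illustration, not anything taken from the article or its data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def hump(rng, years=41, peak=20):
    """A noisy rise-and-fall curve that peaks at index `peak`.

    Width and height are drawn at random (assumed ranges), so each
    series stands in for an unrelated quantity that rose and then fell.
    """
    width = rng.uniform(6.0, 14.0)
    height = rng.uniform(1.0, 100.0)
    return [
        height * math.exp(-(((t - peak) / width) ** 2))
        + rng.gauss(0, 0.05 * height)
        for t in range(years)
    ]


def aligned_hump_correlations(pairs=500, seed=0):
    """Correlations between independently generated, peak-aligned humps."""
    rng = random.Random(seed)
    return [pearson(hump(rng), hump(rng)) for _ in range(pairs)]


if __name__ == "__main__":
    rs = sorted(aligned_hump_correlations())
    print("median r:", round(rs[len(rs) // 2], 3))
    print("share with r > 0.8:", sum(r > 0.8 for r in rs) / len(rs))
```

Under these assumptions the curves share nothing except their general shape, yet the typical correlation comes out high. That says nothing about whether the lead data is spurious; it only suggests that a high r between two peak-aligned humps is weak evidence on its own.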



Yeah, it just feels like the entire article is too convenient.

1) Everything seems to work out perfectly, no matter how it is considered. For instance, if lead contamination in the ground is such a big deal, why does lead in gas track so closely to the crime rate? Wouldn't it be the case that it would track on the upside but not the downside?

2) He keeps on emphasizing lead lowers IQ. But the Flynn Effect suggests that IQ started going up in the 1930s and has only (maybe) started to level off very recently. It's the exact opposite of what you'd expect based on lead exposure from leaded gas.

3) Long term crime rates have been mostly going down for centuries: http://marginalrevolution.com/marginalrevolution/2011/06/lon...

Mind you, I'm fully in agreement that lead sucks and I sure as hell don't want my kid exposed to it. It's just that I'm far from convinced that the crime rates change mostly because of lead exposure. It seems far too tidy an explanation.


As far as the Flynn effect is concerned, it's possible that the benefits of general increases in public health (to which the Flynn effect is oft attributed, though somewhat more controversially in recent years) negated the effects of lead on a large scale. However, this does not mean that the Flynn effect would not have had a greater observable magnitude had lead exposure not been so high.

Part of me agrees with your overall conclusion about the correlational nature of the research as a whole, but those are my two cents on that particular point.


Sure, it's possible (even probable!) there are any number of things affecting IQ. I'm just pointing out that if you'd plotted raw IQ levels versus time, the graph would have the opposite sign from what you'd expect from Drum's lead hypothesis. There may be an explanation that makes it all fit together, but it somehow needs to be addressed, not just ignored or hand-waved away.


Good points...just possibilities:

1) Maybe lead in the ground has the same general level of danger as lead paint? Other sources of lead were covered in the article.

2) It is possible (probable even) that the levels of lead have been retarding the growth of IQ, but not reversing it.

3) Ignoring that even your link mentions that the data is skewed due to higher survivability as medical technology has gotten better, this is a separate trend on a separate scale and not really all that relevant. It could possibly be explained by lead from other sources (plumbing maybe?) or other factors outside of the scope of this particular study. If anything, your link reinforces that there was an issue as the homicide rates climbed sharply before suddenly dropping off again.


You're assuming that the causation goes lead -> lower IQ -> more crime. Indeed, the Flynn effect doesn't support the lower IQ -> more crime relationship.

Lead has a wider range of neurological effects than just reducing IQ. The causation may be more likely: lead -> unknown neurological changes -> more criminality & lower IQ.


This article cites research that goes way beyond the usual sort of correlation. How much stronger would the evidence need to get?


I'm not sure. How strong should such a correlation be to be considered evidence? That's the real question. And I think when it comes to annual averages of broad trends, it's a real question how we should interpret correlations. To me this article is data porn; it's a story with data. It may be some part of the truth, but I think it exists to entertain the data literate, so we can follow along with the numbers and pat ourselves on the back for a fact discovered. It's enough analysis to be dangerous.

And in case hackernews has become so infected with the kind of reddit thinking that can only see in black and white - yes I think reducing lead in gasoline is a very good thing.


The strength of the correlation or its time-scale have no bearing on whether the correlation should be considered evidence of causation. The fatal flaw of a correlation is that it can be specious — it can appear to explain reality, but really there's a third variable responsible for driving the phenomenon you're interested in.

In this case, lead exposure and crime rate are correlated, but maybe lead doesn't cause crime at all: maybe something else causes crime that also happens to correlate with lead exposure. Who knows, maybe a certain pesticide was used at the same time that lead gasoline came into vogue, and that's really the true cause of the rise in crime rates.
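The third-variable scenario can be sketched with a toy simulation. Here `z` is a hypothetical hidden driver, and `x` and `y` (standing in for lead exposure and crime) each depend on `z` plus independent noise, with no link between them. The noise levels and sample size are assumptions chosen only to make the effect visible.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def residuals(ys, zs):
    """Residuals of ys after removing a least-squares line fitted on zs."""
    n = len(zs)
    mz, my = sum(zs) / n, sum(ys) / n
    slope = sum((z - mz) * (y - my) for z, y in zip(zs, ys)) / sum(
        (z - mz) ** 2 for z in zs
    )
    return [y - (my + slope * (z - mz)) for z, y in zip(zs, ys)]


def simulate(n=5000, seed=1):
    """Return (raw correlation of x and y, correlation after removing z)."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]    # hidden third variable
    x = [zi + rng.gauss(0, 0.5) for zi in z]   # "lead": driven by z only
    y = [zi + rng.gauss(0, 0.5) for zi in z]   # "crime": driven by z only
    raw = pearson(x, y)
    partial = pearson(residuals(x, z), residuals(y, z))
    return raw, partial


if __name__ == "__main__":
    raw, partial = simulate()
    print("raw correlation:", round(raw, 3))
    print("after removing z:", round(partial, 3))
```

The raw correlation is strong even though `x` never influences `y`, and it largely disappears once `z` is regressed out of both. The catch, of course, is that this only works when you know what `z` is and have measured it.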

In research like this, when you can't do a manually controlled experiment, you have to control for hidden variables by some other means. And that's precisely what the investigators in this article did: they varied the input data to try to "shake out" other variables that might be behind the correlation. They looked at different time scales, different geographies, and different demographics, in an attempt to control for hidden variables that might be related to any one of those things. Every time you vary the input data and keep finding a correlation, your evidence of a causative relationship goes up.

The gold standard, of course, would be to expose two random, double-blind sets of infants to lead and to a control substance and see what happens. But since that would be unethical (as we have reason to think lead is bad for you), we're stuck with animal studies or retrospective studies. Personally, I find the evidence in this article impressive, but it would take quite a lot of looking into their specific methods to come to any real conclusion.


> The fatal flaw of a correlation is that it can be specious — it can appear to explain reality, but really there's a third variable responsible for driving the phenomenon you're interested in.

But in the case of lead, the causal chain from lead exposure to worse behaviour is fairly well understood.


That's the high level view of the problem of correlation and causation. What I'm really asking is how unlikely we should consider these correlations, especially correlations over very broad average trends in human behaviour. Now I'm sure there are some statistical tools that would give some sense of how surprising the correlation of these trends is, but I'd like to know, out of all the things about human society we track, how many of them correlate and how closely.


Presumably one can correlate lead exposure with reduced intelligence independent of time.


Would you at least agree the correlation seems strong enough to warrant further investigation?


Sure, it might even be strong evidence. It's just not clear and we see these time series correlations again and again.


Well, for starters, to prove causation one needs to show more than that X moves in sync with Y. One also needs to show that it is not because both X and Y are caused by some reason Z which influences both even though X and Y are completely independent from each other. E.g. a correlation between good lead data and good crime data may arise because in cities with better law enforcement both criminal laws and environmental laws are enforced strictly, or because richer folks drive less polluting cars and commit fewer crimes. It does not mean less lead pollution causes people to be rich.

The whole thing sounds like a classic case of fitting the data to a preconceived conclusion. But there's also something like this:

>>>> Not only does lead promote apoptosis, or cell death, in the brain, but the element is also chemically similar to calcium.

How is lead chemically similar to calcium? They are on opposite ends of the periodic table and have very different chemical properties. That, of course, does not prevent lead from messing with ion channels, but saying lead is chemically similar to calcium is wrong.


> One also needs to show that it is not because both X and Y are caused by some reason Z which influences both even though X and Y are completely independent from each other

How do you show that something is not caused by a hidden Z? All one can hope for is a preponderance of evidence. If the following passage from the article is true, then the preponderance here is stronger than a generic correlation-is-not-causation objection can just dismiss:

"In fact, use of leaded gasoline varied widely among states, and this gave Reyes the opening she needed. If childhood lead exposure really did produce criminal behavior in adults, you'd expect that in states where consumption of leaded gasoline declined slowly, crime would decline slowly too. Conversely, in states where it declined quickly, crime would decline quickly. And that's exactly what she found."

Thus your hidden Z would have to move in sync not only with X and Y, but also with their widely varying rates in different states. That's a high bar – especially if there are legal and institutional explanations for why leaded gas was phased out at different rates in those places.


Well, a good first step would be to analyse possible causes of Z and show they are not matching it. Quoting Feynman:

For example, if you're doing an experiment, you should report everything that you think might make it invalid — not only what you think is right about it; other causes that could possibly explain your results; and things you thought of that you've eliminated by some other experiment, and how they worked — to make sure the other fellow can tell they have been eliminated.

I did not see that done in this case. Everything described is about what matches the hypothesis, and no effort is made to look for alternative explanations and refute them.

>>>> Thus your hidden Z would have to move in sync not only with X and Y, but also with their widely varying rates in different states.

Use of gasoline is highly correlated with use of cars, which in turn is highly correlated with income, population density, business activity, etc., any of which can also be associated with crime. I.e. if both effects are caused, say, by population density (I do not say they are, I just take it as an example), then gasoline use would rise and crime would rise when population density rises (say, because some economic factors attract people, as is happening now in North Dakota due to the shale gas boom) and both would fall when some factors cause the population to move away, as is happening in Detroit, for example. There might also be other factors.

>>> In fact, use of leaded gasoline varied widely among states, and this gave Reyes the opening she needed. If childhood lead exposure really did produce criminal behavior in adults, you'd expect that in states where consumption of leaded gasoline declined slowly, crime would decline slowly too. Conversely, in states where it declined quickly, crime would decline quickly. And that's exactly what she found.

And this is exactly what she would find if both were strongly related to some third cause Z, but completely causally independent from each other. The fact that the MJ author takes it as evidence confirming the causation means they do not understand how causation needs to be proven. It is not evidence for causation, since it equally applies to both when causation exists and when it does not.
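To make that concrete, here is a toy sketch in which a hypothetical hidden factor Z declines at a different rate in each state, and "lead" and "crime" each depend on Z alone. The number of states, the rate range and the noise level are made-up assumptions; nothing here comes from Reyes' data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def slope(ys):
    """Least-squares slope of ys against time steps 0, 1, 2, ..."""
    n = len(ys)
    mt, my = (n - 1) / 2, sum(ys) / n
    num = sum((t - mt) * (y - my) for t, y in enumerate(ys))
    den = sum((t - mt) ** 2 for t in range(n))
    return num / den


def state_slopes(states=50, years=20, seed=2):
    """Per-state decline rates of 'lead' and 'crime', both driven by Z."""
    rng = random.Random(seed)
    lead_slopes, crime_slopes = [], []
    for _ in range(states):
        rate = rng.uniform(0.5, 3.0)  # how fast hidden Z declines here
        z = [100 - rate * t for t in range(years)]
        lead = [zi + rng.gauss(0, 2) for zi in z]   # depends on Z only
        crime = [zi + rng.gauss(0, 2) for zi in z]  # depends on Z only
        lead_slopes.append(slope(lead))
        crime_slopes.append(slope(crime))
    return lead_slopes, crime_slopes


if __name__ == "__main__":
    lead_slopes, crime_slopes = state_slopes()
    print("r across states:", round(pearson(lead_slopes, crime_slopes), 3))
```

States where "lead" fell slowly also see "crime" fall slowly, with no causal link between the two. This only shows that a state-varying confounder can reproduce the pattern; whether a plausible one exists for leaded gasoline is a separate question.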

Yes, this means causation is hard to prove. It is.


I don't see you addressing what I find to be the most important point, though I realize now that I didn't emphasize it enough. If we know that leaded gas was phased out at different rates because legislators mandated different schedules in different states (I say "if" — I don't know if that's true), then any hidden cause Z would have had to produce not only X (decline of lead at a certain rate) and Y (decline of crime at a similar rate) but also the legislation which caused X in the first place. Many plausible Z's become absurd if that is the case. To use your example, population density might conceivably determine both gas consumption and crime, but surely not the behavior of state legislatures implementing federal mandates. I find it hard to imagine history getting closer to a controlled experiment than that.

Does that mean that causation is proved? Of course not. But it does mean that the evidence here is stronger than garden-variety correlation.


> How do you show that something is not caused by a hidden Z?

By doing a controlled study.

Of course, in this case, we're better not knowing than paying that kind of cost.


> Of course, in this case, we're better not knowing than paying that kind of cost.

How could the entire population ever be better off 'not knowing'? If there was a button to exchange a few babies' or animals' lives for the possible outcomes presented at the end of the article, I would do it in a heartbeat.


> How is lead chemically similar to calcium? They are on opposite...

Then you go on to explain how lead is like calcium in its bioavailability, so in relation to the operations of a human body, lead is chemically similar to calcium (only dangerous).

http://brain.oxfordjournals.org/content/126/1/5.full

For example, lead’s ability to pass through the blood–brain barrier (BBB) is due in large part to its ability to substitute for calcium ions (Ca2+)


"Ability to replace calcium in certain biochemical processes" is not nearly the same as "chemically similar to calcium". It's like saying if I stole your login for facebook I would look like you and nobody would be able to distinguish between us without DNA test. Completely different things. Biochemical processes are known for accepting a lot of substitutes, that's how a lot of drugs work. That does not mean all those are chemically similar in general - they just play similar roles in certain processes.


> I'm tempted to repeat the oft said correlation is not causation

I'd be interested in seeing what you think on that score after reading TFA and possibly even the studies that it talks about.


I haven't read any of the studies it talks about. I think the evidence for lead causing brain damage is very strong. Its effect on the crime rate is a different story. Admittedly it says it holds up over multiple countries, which seems like it should be stronger evidence.

Honestly I don't know, but I do think there is a real problem in our understanding of how to interpret population level trends. The rigorous camp simply wants to throw out everything that isn't a double blind study. The optimistic are convinced of every similar looking graph. I think we should be doing more meta-analysis of these things and building up a Bayesian sort of set of priors to apply to these results.


I have two out of the air guesses on how brain damage might cause crime. First, part of your brain responsible for empathy might not work right. Second, if you're dumb you have a harder time holding down a job, so might be more likely to resort to crime. I throw these out only to play with ideas, not to imply I have any evidence they are correct, or that brain-damage even does cause crime increase.


From the article:

> A second study [23] found that high exposure to lead during childhood was linked to a permanent loss of gray matter in the prefrontal cortex—a part of the brain associated with aggression control as well as what psychologists call "executive functions": emotional regulation, impulse control, attention, verbal reasoning, and mental flexibility.


So many factors that could have contributed to the crime drop.

Hate to drive by this interesting thread, but wasn't this same crime drop attributed to the legalization of abortion?

http://en.wikipedia.org/wiki/The_Impact_of_Legalized_Abortio...


(Duplicated from a comment I made to the original posting of this article at https://news.ycombinator.com/item?id=5002806...)

Via http://election.princeton.edu/2012/12/22/scientific-american, which cited Reyes' work on the lead theory, here's a link to a striking critique of the abortion theory:

http://www.economist.com/node/5246700?story_id=5246700

Someone inspected Donohue and Levitt's code and found a bug that meant they hadn't controlled for what they claimed they had. "Fixing that error reduces the effect of abortion on arrests by about half, using the original data, and two-thirds using updated numbers."


I've always thought that we should be wary of correlation between two series that are positively correlated with time.
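A quick sketch of why: two independent random walks with the same upward drift correlate strongly in levels, but not once you look at year-to-year changes. The drift, noise and series length below are arbitrary assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def trending_series(rng, n=50, drift=1.0):
    """Random walk with upward drift: it rises over time on its own."""
    level, out = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0, 1)
        out.append(level)
    return out


def diff(xs):
    """First differences: the change from each step to the next."""
    return [b - a for a, b in zip(xs, xs[1:])]


def mean(values):
    return sum(values) / len(values)


def compare(trials=500, seed=3):
    """Mean correlation of levels vs. of changes for independent series."""
    rng = random.Random(seed)
    levels, changes = [], []
    for _ in range(trials):
        a, b = trending_series(rng), trending_series(rng)
        levels.append(pearson(a, b))
        changes.append(pearson(diff(a), diff(b)))
    return mean(levels), mean(changes)


if __name__ == "__main__":
    level_r, change_r = compare()
    print("mean r of levels:", round(level_r, 3))
    print("mean r of changes:", round(change_r, 3))
```

Differencing (or otherwise detrending) before correlating is one common guard against this, though it can also throw away real long-run relationships.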



