A few weeks ago, the New England Journal of Medicine published what we’d call the worst example of medical statistical misadventure we’ve seen in years: a paper claiming that “chocolate consumption enhances cognitive function” based on a correlation between a country’s chocolate consumption and its number of Nobel prize winners (no, we’re not joking…it’s a real paper). Before we indulge in chocolate and a few other things over Thanksgiving, we thought it would be a good time to revisit a little lesson known as the ecological fallacy…
The paper in the Journal correlated average chocolate consumption (kg/person/year) with the number of Nobel laureates from each country (per 10 million inhabitants, to correct for population size). The idea behind the paper was this: “Dietary flavonoids, abundant in plant-based foods, have been shown to improve cognitive function…A subclass of flavonoids called flavanols, which are widely present in cocoa, green tea, red wine, and some fruits, seems to be effective in slowing down or even reversing the reductions in cognitive performance that occur with aging…the total number of Nobel laureates per capita could serve as a surrogate end point reflecting the proportion with superior cognitive function and thereby give us some measure of the overall cognitive function of a given country.”
The author then went on to report a strong correlation between the two (r=0.79, p<0.0001).
It was concluded that “The principal finding of this study is a surprisingly powerful correlation between chocolate intake per capita and the number of Nobel laureates in various countries. Of course, a correlation between X and Y does not prove causation but indicates that either X influences Y, Y influences X, or X and Y are influenced by a common underlying mechanism. However, since chocolate consumption has been documented to improve cognitive function, it seems most likely that in a dose-dependent way, chocolate intake provides the abundant fertile ground needed for the sprouting of Nobel laureates.” The author dismissed the idea that there might be third variables involved, as “it is difficult to identify a plausible common denominator that could possibly drive both chocolate consumption and the number of Nobel laureates over many years.”
Actually, it’s not.
There are a couple of key problems with the reasoning in this paper beyond the “correlation is not causation” issue that the author acknowledges. The “ecological fallacy” is perhaps the most important: when studying aggregate data, one cannot determine whether the higher chocolate consumption actually occurs among the individuals with higher cognitive function, let alone those with Nobel prizes. Furthermore, “third variables” are actually plentiful when trying to explain the correlation between chocolate consumption and Nobels. (This doesn’t mean you can’t use ecological data. We use ecological data all the time, but there are whole textbooks and courses on choosing the right kinds of variables and the appropriate models to make credible inferences, none of which this author seems to have acknowledged or attempted.)
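To make the ecological fallacy concrete, here is a minimal simulation sketch. Everything in it is invented for illustration: the 20 countries, the 500 people per country, the "test score" standing in for cognitive function, and all of the numbers. Within each simulated country, an individual's chocolate intake and score are drawn independently, yet the country averages still line up almost perfectly.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

N_COUNTRIES = 20          # hypothetical
PEOPLE_PER_COUNTRY = 500  # hypothetical

country_choc_means, country_score_means, within_rs = [], [], []
for k in range(N_COUNTRIES):
    # A hypothetical country-level driver (say, wealth) shifts BOTH averages.
    base_choc = 4.0 + 0.5 * k
    base_score = 100.0 + 1.0 * k
    # Within a country, intake and score are independent by construction.
    choc = [base_choc + rng.gauss(0, 1) for _ in range(PEOPLE_PER_COUNTRY)]
    score = [base_score + rng.gauss(0, 5) for _ in range(PEOPLE_PER_COUNTRY)]
    country_choc_means.append(mean(choc))
    country_score_means.append(mean(score))
    within_rs.append(pearson(choc, score))

# Correlation across country averages vs. correlation among individuals.
aggregate_r = pearson(country_choc_means, country_score_means)
mean_within_r = mean(within_rs)

print(f"country-level r:          {aggregate_r:.2f}")
print(f"average within-country r: {mean_within_r:.2f}")
```

In this toy setup the country-level correlation is close to 1 while the individual-level correlation hovers around zero. The aggregate trend comes entirely from a country-level driver shared by both averages, which is exactly the kind of third variable at issue here.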
For example, we repeated the author’s analysis, and after controlling for per capita income, which predicts both consumption of luxury goods like chocolate and spending on education and research, the relationship between chocolate consumption and Nobels is no longer statistically significant. Contrary to the flavanol theory, per capita consumption of green tea and red wine (both high-flavanol foods) does not correlate significantly with Nobels per 10 million people (r=0.25; p>0.05).
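For readers who want to see what "controlling for" a third variable can look like, here is a rough sketch using a first-order partial correlation. This is one common approach and not necessarily the exact procedure behind the result above. The data are synthetic: `income`, `chocolate`, and `nobels` are made-up series in which income drives the other two and chocolate has no effect on Nobels at all.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def partial_corr(xs, ys, zs):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / ((1 - r_xz ** 2) * (1 - r_yz ** 2)) ** 0.5


rng = random.Random(1)  # fixed seed so the run is repeatable
N = 2000  # far more "countries" than exist; chosen only to keep noise small

# Synthetic data: income drives both series; chocolate does NOT drive nobels.
income = [rng.uniform(10, 60) for _ in range(N)]            # arbitrary units
chocolate = [0.15 * z + rng.gauss(0, 1) for z in income]    # stylized intake
nobels = [0.4 * z + rng.gauss(0, 3) for z in income]        # stylized index

raw_r = pearson(chocolate, nobels)
adjusted_r = partial_corr(chocolate, nobels, income)

print(f"raw r(chocolate, nobels):        {raw_r:.2f}")
print(f"partial r, controlling for income: {adjusted_r:.2f}")
```

Under these assumptions the raw correlation is strong, while the partial correlation is close to zero. The apparent chocolate effect is entirely a reflection of income.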
Flavanols may or may not be beneficial, but we cannot tell from correlations like these. That’s not just because of the ecological fallacy, or because correlation is not causation, but also because Nobels are not a good metric of country-wide cognition, since they are rare events. It’s a terrible idea to use rare events like Nobels as an index of the overall population’s cognitive status, just like it would be a bad idea to use lightning strikes as an indicator of a country’s overall weather.
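The trouble with rare events can also be sketched with a quick simulation. The assumptions here are ours and purely illustrative: 200 hypothetical countries of 10 million people each, all sharing the same underlying rate of one expected laureate per 10 million, with counts drawn from a Poisson distribution (sampled with Knuth's multiplication method, since the Python standard library has no built-in Poisson sampler).

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) count using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


rng = random.Random(2)  # fixed seed so the run is repeatable

TRUE_RATE = 1.0    # expected laureates per 10 million, identical everywhere
N_COUNTRIES = 200  # hypothetical countries of 10 million people each

# Observed laureates per 10 million in each country.
counts = [poisson(rng, TRUE_RATE) for _ in range(N_COUNTRIES)]
share_zero = counts.count(0) / N_COUNTRIES

print(f"lowest observed count:  {min(counts)}")
print(f"highest observed count: {max(counts)}")
print(f"share of countries with no laureates: {share_zero:.0%}")
```

Every simulated country has identical underlying "cognition," yet a sizable share end up with no laureates at all and a few end up with several. Ranking countries by such counts mostly ranks them by luck.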
The problems with this analysis can be illustrated by the fact that we were able to correlate Nobel prize winners with nearly any variable that rises with income, educational spending, or per capita research spending in a country. For example, we can find stronger correlations between per capita borrowing from commercial banks and Nobels (r=0.92, p<0.05) or luxury car ownership and Nobels (r=0.85, p<0.0001) than between chocolate consumption and Nobels. Does that mean using credit and buying an Audi makes you smarter?