In today’s edition of the journal PLoS One, we published an “open access” study on the relationship between sugars and type 2 diabetes. The study was an international analysis that applied statistical techniques from the field of econometrics to public health data in order to understand the relationship between sugar availability and diabetes prevalence. It was peer-reviewed by five independent statisticians and diabetes experts. The study can be easily misinterpreted. For example, one doctor made the silly comment: “Well this is just like correlating the number of cups someone owns to their risk of diabetes, which is confounded by obesity.” That comment suggests the doctor either did not read the study or did not understand the statistical methods involved; as professors who teach statistics all day, we controlled for obesity and dealt with these kinds of issues up front. The study is not the typical simple “correlation study” that is far too common in the medical literature. There are, however, very important caveats to the findings, and some context that is critical to understand. So we wanted to reiterate the very careful wording in the study and make sure that the actual study findings made it somewhere into the melodramatic discourse on this subject…
Why did we do this study?
Our study sought to address this question: could sugars affect the risk of diabetes, even independent of their role in affecting the risk of obesity? Laboratory studies and experimental data suggest that the relationship between obesity and diabetes pathogenesis is not entirely clear. For example, several studies have estimated that about 20% of obese individuals appear to have normal insulin regulation and no signs of the metabolic syndrome (no indication of diabetes); such individuals also have normal longevity, such that focusing on their obesity may be obsessing about an imperfect marker of metabolic dysfunction. By contrast, up to 40% of normal weight people manifest aspects of the metabolic syndrome, so clearly something else is going on with metabolism besides being fat. There is really no doubt that obesity is a statistical risk factor for diabetes; our study was not designed to rebut that idea at all. Rather, it was designed to investigate an additional possibility that the availability of sugars may also have an independent role in diabetes, even aside from contributing to weight or total calories consumed. This could explain some puzzling findings about why diabetes rates among some populations have escalated independent of changes in obesity rates, as we discuss in the paper.
Besides contributing directly to obesity, sugars appear to contribute to diabetes through other pathways. These include hepatic de novo lipogenesis and reduced fatty acid oxidation, which result from the liver’s metabolism of some sugars in the fed state in a manner that generates lipogenic substrates in an unregulated fashion. This process forms excessive liver fat and inflammation that inactivates the insulin signaling pathway, leading to hepatic insulin resistance. Sugary foods also appear to contribute to the development of insulin resistance in laboratory-based studies. Reactive oxygen species are produced by the Maillard reaction, damaging pancreatic beta cells and leading to a subcellular stress response (the “unfolded protein response” in the endoplasmic reticulum) that drives insulin inadequacy. In concert, insulin resistance and reduced insulin secretion lead to overt diabetes. But none of these laboratory studies can tell us how important these findings are at a population level, that is, whether these relationships actually matter in the real world and could be contributing to epidemic-level diabetes rates.
What was our approach?
In the ideal scientific world, we would take two groups of people and give them very carefully calculated diets, in which one diet had higher sugars but otherwise identical total calories, and the other diet had low sugars and otherwise identical total calories. We would have the two groups exercise the same amount and maintain the same weights, then see if the high-sugar group got diabetes more or less than the low-sugar group.
Of course, such a study would not be very ethical. Hence, we had to do some fancier statistics with the data we have available. The problem is that typical medical correlation studies are fairly weak—they look at just one point in time, and—as we ourselves have pointed out several times—correlation does not imply causation because third variables can explain point-in-time correlations (e.g., Christmas cards do not cause Christmas to happen, but the two are highly correlated in time). Even having good control variables does not always resolve this problem.
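To make the third-variable problem concrete, here is a minimal sketch on simulated data (not data from the study). It assumes a hypothetical confounder `z` that drives both `x` and `y`, which have no direct link to each other. The variable names and noise levels are illustrative choices only.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)
n = 2000
# Hypothetical third variable that influences both measurements.
z = [rng.gauss(0, 1) for _ in range(n)]
# x and y are each z plus independent noise; neither affects the other.
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

raw_r = pearson(x, y)
adjusted_r = partial_corr(x, y, z)
print(f"raw correlation of x and y: {raw_r:.2f}")
print(f"after controlling for z:    {adjusted_r:.2f}")
```

In this toy setup the raw correlation is strong and largely disappears once `z` is controlled for. That only works because `z` was measured; an unmeasured confounder would leave the spurious correlation in place.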
So we used some more advanced statistical methods from the field of econometrics in order to move beyond the typical limitations associated with medical correlation studies. To begin with, we looked at longitudinal trends in sugar availability and diabetes prevalence rates in 175 countries. Looking at long-term trends allows us to conduct time-series analysis, which can filter out a lot of the garbage that plagues common medical correlation studies, and allows us to look at the long-term relationships that are inherent to food consumption and diabetes risks. We statistically controlled for overweight, obesity, and calories (not just sugar, but total calories, as well as calories from fats, proteins, etc.), and for other risk factors for diabetes like tobacco smoking and alcohol consumption. On top of those controls, we applied two further methods that a standard correlation study would not include.
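One way repeated observations per country can filter out stable differences between countries is the "within" (fixed-effects) transformation, sketched below on simulated data. The post does not say which panel estimator the study used, so treating this as country fixed effects is an assumption. The number of countries and years and the value of `TRUE_EFFECT` are all hypothetical.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y on a single regressor x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's own mean from its values (the within transform)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

rng = random.Random(1)
TRUE_EFFECT = 0.5  # assumed effect of sugar on diabetes in this toy example
country, sugar, diabetes = [], [], []
for c in range(20):                    # 20 hypothetical countries
    baseline = rng.gauss(0, 3)         # unobserved, time-invariant country trait
    for _ in range(15):                # 15 yearly observations per country
        s = baseline + rng.gauss(0, 1)
        d = 2 * baseline + TRUE_EFFECT * s + rng.gauss(0, 0.2)
        country.append(c)
        sugar.append(s)
        diabetes.append(d)

# Pooled regression mixes the sugar effect with the unobserved country trait.
pooled = slope(sugar, diabetes)
# The within estimator uses only changes over time inside each country.
within = slope(demean_by_group(sugar, country), demean_by_group(diabetes, country))
print(f"pooled slope: {pooled:.2f}")
print(f"within slope: {within:.2f} (true effect {TRUE_EFFECT})")
```

The pooled slope is inflated by the country-level trait, while the within slope lands near the assumed effect. This removes only confounders that stay constant over time; time-varying factors still need explicit controls.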
First, we looked at a “selection model” to examine whether unobserved variables could be an issue, meaning whether having more sugar around was just an artifact of overall economic development; since economic development is itself correlated with changes in physical activity and diet, that could contribute to diabetes. Of course, we also used statistical controls for economic development, urbanization, and so on, but the “selection models” determine whether there are unobserved variables that would predispose the study to selection bias (the main reason we do randomization in medical trials), and allow us to correct for any such biases we find so that we can get closer to an ideal scientific experiment (one reason its inventor won a Nobel Prize).
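For readers who want the general shape of such a model, the textbook two-step sample-selection correction can be written as follows. This is the standard form associated with James Heckman, offered as an illustration; the post does not give the exact specification used in the study. The symbols are generic: $z_i$ are variables that predict whether a unit is observed, $x_i$ are the outcome regressors, and the errors $u_i$ and $\varepsilon_i$ are assumed jointly normal with correlation $\rho$ and $\operatorname{Var}(u_i) = 1$.

```latex
\begin{align}
s_i &= \mathbf{1}\!\left[\, z_i^{\top}\gamma + u_i > 0 \,\right]
  && \text{(selection: is unit $i$ observed?)} \\
y_i &= x_i^{\top}\beta + \varepsilon_i
  && \text{(outcome, observed only when $s_i = 1$)} \\
\mathbb{E}\!\left[\, y_i \mid x_i, z_i, s_i = 1 \,\right]
  &= x_i^{\top}\beta + \rho\,\sigma_{\varepsilon}\,\lambda\!\left(z_i^{\top}\gamma\right),
  \qquad \lambda(a) = \frac{\phi(a)}{\Phi(a)}
\end{align}
```

Here $\phi$ and $\Phi$ are the standard normal density and distribution function, and $\lambda$ is the inverse Mills ratio. In practice, $\gamma$ is estimated first with a probit model, and $\lambda(z_i^{\top}\hat{\gamma})$ is then added as an extra regressor in the outcome equation. A coefficient on that term that is indistinguishable from zero suggests little selection bias.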
Second, we used something called a Granger causality test, which looks at precedence: if X leads to Y, then when X increases, Y should increase later, and vice versa. That still doesn’t “prove causality”; it’s just a statistical test that is commonly used in economics and sociology but has rarely been applied to medical questions (fyi, its inventor also won the Nobel Prize). We found a so-called “dose-response” relationship between sugar availability and diabetes prevalence, such that increased sugar availability was associated with increased diabetes prevalence in the future, and those countries that reduced their sugar availability also had a proportionate reduction in diabetes prevalence, independent of other changes like changes in economic development, urbanization, physical activity, obesity, and consumption of other foods like meats and fats and total calories. Essentially this is trying to get as close as possible to satisfying an epidemiological equivalent of Koch’s postulates: when you are exposed to an agent, you get the disease; when the agent is removed, the disease goes away; when you are re-exposed, the disease comes back. This is far stronger than a typical point-in-time medical correlation study. The approach makes use of a so-called “natural experiment”: because it would be unethical to watch people get diabetes after giving them more sugar, we instead look at long-term data from different populations that already increased or decreased their sugar consumption (often due to various changes in economic trade laws), statistically controlling for other factors that differed between these populations to isolate the effect of sugars.
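The precedence idea behind a Granger test can be sketched in a few lines. The example below is a simplified one-lag version on simulated series, not the study's data or its exact specification. It compares a forecast of `effect` from its own past against a forecast that also uses the past of `cause`, and summarizes the improvement as an F statistic. The helper names (`solve`, `rss`, `granger_f`) and the simulated coefficients are illustrative assumptions.

```python
import random

def solve(a, b):
    """Solve the linear system a * beta = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    beta = [0.0] * n
    for i in reversed(range(n)):
        beta[i] = (m[i][n] - sum(m[i][k] * beta[k] for k in range(i + 1, n))) / m[i][i]
    return beta

def rss(rows, y):
    """Residual sum of squares from an OLS fit of y on the regressors in rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2 for r, yi in zip(rows, y))

def granger_f(cause, effect):
    """F statistic: does lagged `cause` improve a one-lag forecast of `effect`?"""
    target = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_r, rss_u = rss(restricted, target), rss(unrestricted, target)
    dof = len(target) - 3  # observations minus parameters in the unrestricted model
    return (rss_r - rss_u) / (rss_u / dof)  # one restriction, so numerator df is 1

# Simulated series in which x leads y by one period, and y never feeds back into x.
rng = random.Random(2)
x, y = [0.0], [0.0]
for _ in range(200):
    x_prev, y_prev = x[-1], y[-1]
    x.append(0.5 * x_prev + rng.gauss(0, 1))
    y.append(0.5 * y_prev + 0.8 * x_prev + rng.gauss(0, 0.2))

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
print(f"F (x helps predict y): {f_x_to_y:.1f}")
print(f"F (y helps predict x): {f_y_to_x:.1f}")
```

A large F in one direction and a small F in the other is the pattern the test looks for. A real analysis would choose the number of lags carefully, include control variables, and compare the statistic against the F distribution to obtain a p-value.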
There are, nevertheless, limitations to any statistical study. First, as we teach our students, we can’t “prove causality” through any amount of statistics; we’re simply halfway between the typical weak medical correlation studies and the ideal case of a randomized controlled trial (which often also can’t prove causality for a variety of reasons, despite common misconceptions). Second, our study is “ecological”, meaning that we look at lots of countries, which requires that we use large-scale aggregate data. Like any epidemiological study using aggregate data, ours can suffer from the “ecological fallacy”: when we look at aggregate populations, we can’t be sure that the people eating more sugar were the same people who experienced more diabetes in a given country. This seems extremely unlikely given the massive amount of data and the extremely robust nature of the finding when we tested it against two independent datasets over a long duration, but it is still worth noting. Third, the data themselves are not perfect. In addition to looking for selection bias and doing “robustness checks” by repeating the analysis while excluding outliers or extreme data points (finding, still, consistent results), we have to acknowledge that food availability data from even the best sources are not perfect, and that diabetes surveillance rates (even though we checked them against multiple sources), as well as estimates of overweight, obesity and physical activity in many countries, are far from perfect. We just used the best data available to date, given the urgency of this question. Also, body-mass index may not be the best marker of obesity and may mean different things in different countries (especially Asian ones, where the cut-offs between normal weight, overweight and obesity may be less valid than in the United States due to differences in body type), but we did repeat our analysis using alternative cut-offs and didn’t find any difference in the results.
Similarly, metabolism is not necessarily the same among all populations; some theorists have hypothesized that South Asians may metabolize calories differently in a way that predisposes to diabetes, but this also is confounded by the fact that body mass index is probably not a very good measure of abdominal adiposity which may be more relevant than body mass per se.
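To show what the ecological fallacy mentioned above would look like, here is a toy simulation. It is constructed on purpose so that the country-level and individual-level relationships disagree; it says nothing about whether that happens in the real data. The five countries, their mean intakes, and the within-country slope are all hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

rng = random.Random(3)
country_means = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical average sugar intake
mean_sugar, mean_risk, within_r = [], [], []
for m in country_means:
    sugar = [m + rng.gauss(0, 1) for _ in range(200)]
    # Average risk rises with the country mean m, but inside each country
    # the individual-level relationship is deliberately made negative.
    risk = [m - 0.5 * (s - m) + rng.gauss(0, 0.3) for s in sugar]
    mean_sugar.append(sum(sugar) / len(sugar))
    mean_risk.append(sum(risk) / len(risk))
    within_r.append(pearson(sugar, risk))

between_r = pearson(mean_sugar, mean_risk)
print(f"correlation across country averages: {between_r:.2f}")
print("correlation among individuals, per country:",
      [round(r, 2) for r in within_r])
```

Across country averages the correlation is strongly positive, while inside every country it is negative. Aggregate data alone cannot tell these two situations apart, which is why the limitation is worth stating even when it seems unlikely to apply.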
We checked for a variety of confounding factors like food wastage (since food availability is not the same as food consumption), corrected for secular trends in the independent and dependent variables, and tested various time lags (alternative durations between sugar availability and the diabetes outcomes). Greater exposure to sugars was associated over time with greater diabetes prevalence rates no matter how many of these factors we controlled for. What’s particularly reassuring is that parallel results were obtained by independent scientists at two other universities.
What’s the bottom line?
The bottom line is that this is one of several studies from independent scientific groups that have questioned the old mantra that “a calorie is a calorie”. Some calories may be more metabolically harmful than others, and sugar calories appear to have remarkably potent properties that make us concerned about their long-term metabolic effects. This study also suggests that obesity alone may not be the only issue in diabetes pathogenesis. The study was conducted to understand a statistical theory, using a statistical approach. It doesn’t say anything about any specific person’s diabetes risk or provide any kind of dietary advice. These data cannot distinguish between types of sugars (like high fructose corn syrup versus other types of sugars), nor do they offer more insight into the mechanisms at play, which need to be pieced together in laboratory and experimental research studies. This study also can’t inform any specific policies like the New York City ban on large soft drinks, since the real-world effects of specific policies weren’t evaluated in this study.
However, the precautionary principle in public health suggests that when there is not scientific consensus on the harms of a substance, the burden of proof falls on those who are declaring it to be safe. What we would do next is conduct a randomized controlled trial of low-sugars versus regular American diets (which are already very high in sugars) and follow people over years to identify whether lowering sugar intakes can lower diabetes risks. But this first study tells us that sugars may be important at a population level, not just an individual or molecular level, and conducting an econometric study is one of the few ways to do that.