If smoking causes lung cancer, we should try to stop people smoking; if greenhouse gas emissions are causing climate change, we should reduce them; if genetically modified organisms (GMOs) are hazardous, we should ban them. The problem for policy makers is that it is hard to be sure, or rather it is hard to be sure enough to convince a government to spend money or offend a powerful lobby. The Bradford Hill criteria can help in the difficult task of making decisions when the evidence, while strong, is not conclusive.
Peter Saunders
Sir Austin Bradford Hill was a British medical statistician who had been involved in the study that found the correlation between smoking and lung cancer. That work was done in the 1940s, but it took a long time before it was generally accepted that smoking does cause lung cancer and governments began to take measures to discourage it. In the light of this experience he developed what are now called the Bradford Hill criteria. These are essentially a set of questions we should ask when we are trying to decide whether there really is a cause and effect relation between two phenomena.
The criteria have proved their worth in statistics and epidemiology but few people outside those fields have ever heard of them. Yet they can be useful in any situation where there is an association – statistical or otherwise – between two phenomena, and we need to decide whether one really is a cause of the other. They deserve to be much more widely known and used than they are.
The criteria that Bradford Hill proposed were illustrated by examples. As he was careful to stress, this is not a checklist on which every box has to be ticked. Not all the answers may even point in the same direction. But after we have dealt with each of the items, we will be in a much better position to judge whether the cause and effect relation we have evidence for is real.
(1) Strength of the association: The death-rate from lung cancer was over nine times as high in smokers as in non-smokers; in heavy smokers, it was more than twice that again. This was obviously much stronger supporting evidence than if the rates had only been slightly higher.
(2) Consistency: Are we talking about the result of a single study, or of several, and if there is more than one, were they all done in the same way or were they really different? Bradford Hill pointed out that according to a committee advising the US Surgeon General, 36 different inquiries, not all using the same methodology, had found an association between smoking and lung cancer. That does not rule out the possibility that the same fallacy was at work in all of them, but it strengthens the case.
(3) Specificity: If a disease occurs only in one group of people and if there are no other diseases that occur only in this group, this is strong evidence for cause and effect. In fact, while the death rates for smokers are higher for many causes of death, the increase is much greater for lung cancer than for the others, so this criterion is still satisfied.
(4) Temporality: While cause obviously has to come before effect, it is not always obvious which of two events was really first. If people who smoke are more likely to die from lung cancer, does that mean that smoking causes cancer or is it that the sort of people who are predisposed to lung cancer are also likely to adopt a life style that includes smoking? Here the obvious explanation is correct – smoking does cause lung cancer – but it is a question we should ask.
(5) Dose response: Does increasing the purported cause increase the effect? In the case of smoking and lung cancer, the death rate rises linearly with the number of cigarettes smoked per day, and this is strong supporting evidence. On the other hand, in many cases there are threshold or trigger effects, and then there will be no dose response. Drinking two glasses of poison doesn't make you twice as dead.
(6) Plausibility: Is the cause-effect relationship plausible? Ideally, we would like to be able to find the mechanism that links cause and effect, but often this is not possible; if it were, there would be no problem. We can, however, ask if it is at least plausible that A could be the cause of B. Hill immediately warns that what is considered plausible changes over time. In the nineteenth century, for example, it was thought totally implausible that doctors not washing their hands could be responsible for the deaths of women in maternity wards.
(7) Coherence: Does the claim that A causes B seriously conflict with what we know about B? This is really a companion to the plausibility criterion. If our present knowledge provides no plausible mechanism by which A can cause B, can we actually rule it out? John Snow was not able to suggest how polluted water could be the means by which cholera is spread, but even in 1854, there was no good scientific reason for ruling out the possibility that it might be.
(8) Experiment: If we change A, does B change as well? If people stop smoking, does the death rate from lung cancer fall? We now know that it does. Not only do deaths from lung cancer in a population increase when the proportion of smokers increases, but an individual who gives up smoking also reduces his or her chance of contracting the disease, since that chance depends on the total number of cigarettes smoked. Bradford Hill did include laboratory experiments in his paper, such as the effect of tobacco smoke on dogs, but because he was writing specifically for epidemiologists he considered those to be part of coherence.
(9) Analogy: Are there analogous examples? After it had been established that thalidomide and rubella can produce birth defects, it was easier to make the case that some other birth defect could be caused by a drug or a viral disease.
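Criteria (1) and (5) both come down to simple arithmetic on death rates, and a minimal sketch may make that concrete. The rates below are invented for illustration and are not figures from Bradford Hill's paper; the function names are likewise assumptions made for this example. The relative risk is the ratio of the death rate in an exposed group to that in the unexposed group, and a dose response shows up as a relative risk that keeps rising with exposure.

```python
# Hypothetical lung cancer deaths per 100,000 people per year, keyed by
# cigarettes smoked per day. Illustrative values only, NOT real data.
rates_by_cigarettes_per_day = {
    0: 10,    # non-smokers (reference group)
    10: 60,
    20: 110,
    40: 210,
}


def relative_risk(exposed_rate, reference_rate):
    """Ratio of the death rate in an exposed group to that in the reference group."""
    if reference_rate <= 0:
        raise ValueError("reference rate must be positive")
    return exposed_rate / reference_rate


def is_monotonic_increasing(values):
    """True if each value is strictly greater than the one before it."""
    return all(later > earlier for earlier, later in zip(values, values[1:]))


doses = sorted(rates_by_cigarettes_per_day)
reference = rates_by_cigarettes_per_day[0]
risks = [relative_risk(rates_by_cigarettes_per_day[d], reference) for d in doses]

# Criterion (1): how large is the relative risk at each level of exposure?
for dose, risk in zip(doses, risks):
    print(f"{dose:>2} cigarettes/day: relative risk {risk:.1f}")

# Criterion (5): does the risk keep rising as the dose rises?
print("dose response present:", is_monotonic_increasing(risks))
```

A threshold effect of the kind mentioned under criterion (5) would fail the monotonic check without in any way ruling out causation, which is one reason the criteria cannot be reduced to a mechanical test.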
As Bradford Hill made clear, these are criteria to be used in decision making, not a list on which every box has to be ticked. We may not have enough evidence to deal with all the points, or some of them may not be applicable. There may even be some evidence that seems to point in the opposite direction, and then we will have to decide how much importance to give it.
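One way to keep track of such a mixed picture is to record a finding for each criterion and allow "not applicable" and "not enough evidence" as legitimate answers. The sketch below is only an illustration: the `Finding` labels, the `summarise` helper and the placeholder entries are all assumptions made for this example, not part of Bradford Hill's paper. It deliberately produces a summary and not a verdict.

```python
from collections import Counter
from enum import Enum


class Finding(Enum):
    """Possible outcomes when one criterion is considered (labels are assumed)."""
    SUPPORTS = "supports"
    AGAINST = "points the other way"
    NO_EVIDENCE = "not enough evidence"
    NOT_APPLICABLE = "not applicable"


# Hypothetical assessment of a claim "A causes B"; the entries are placeholders,
# not a judgement about any real case.
assessment = {
    "strength": Finding.SUPPORTS,
    "consistency": Finding.SUPPORTS,
    "specificity": Finding.NOT_APPLICABLE,
    "temporality": Finding.SUPPORTS,
    "dose response": Finding.NO_EVIDENCE,
    "plausibility": Finding.SUPPORTS,
    "coherence": Finding.SUPPORTS,
    "experiment": Finding.AGAINST,
    "analogy": Finding.NO_EVIDENCE,
}


def summarise(findings):
    """Count how many criteria fall under each finding; no verdict is computed."""
    counts = Counter(findings.values())
    return {finding.value: counts[finding] for finding in Finding}


for label, count in summarise(assessment).items():
    print(f"{label}: {count}")
```

The counts are a prompt for discussion only: how much weight to give the one criterion that points the other way remains a matter of judgement.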
When Bradford Hill was writing, in 1965, there was still some controversy about the link between smoking and cancer. That's a point we should always remember. Everyone now accepts that asbestos, smoking, low intensity radiation and lead, for example, are dangerous, but there was a time when many scientists, government regulators and, above all, corporations insisted that they were not, and dismissed those who claimed they were as cranks. The moral is not that we are wiser than our predecessors, but that we are always in danger of making the same mistakes about different things, such as climate change and genetically modified organisms.
The controversy about greenhouse gases and climate change is almost settled, despite the largely politically motivated rise in climate scepticism [5, 6] (Sceptical about Climate Change Sceptics, Getting Sceptical About Global Warming Scepticism, SiS 45). It serves as a good illustration of how the Bradford Hill criteria can be applied outside epidemiology.
The Earth has been getting warmer over the past century, the increase has recently accelerated, and the temperature is well correlated with the atmospheric concentrations of the greenhouse gases (GHGs), especially carbon dioxide [5, 6]. That much has been known for a long time, but many people, including some scientists, insist that the rise in temperature is either illusory, or a natural cycle, or due to some other cause. As they point out, the mean temperature of the Earth has varied considerably in the past, long before there were humans burning fossil fuels.
Let's apply the Bradford Hill criteria and see how they work.
Strength of the association: The correlation over the past two or three centuries is clearly very good, especially as temperature and GHGs have both increased more sharply over the last twenty years (see Global Warming Is Happening, SiS 31).
Consistency: There have been a number of studies, and while they have all used a similar approach to modelling they have been carried out independently. The models differ considerably in the amount of warming they predict over the century, but they all agree there will be a significant increase even if carbon emissions are reduced and a more serious one if they are not. Furthermore, the predictions of at least some of the models are consistent with observations [5, 6, 9] (350ppm CO2 the Target, SiS 44).
Specificity: This criterion does not apply exactly here, because there is only one Earth, but the specific effect of anthropogenic greenhouse gases is evident, and can be distinguished from natural causes [5-7].
Temporality: The recent increase in anthropogenic GHGs came before the increase in temperature [4-7]. This criterion is difficult to apply to non-anthropogenic GHGs because of positive feedbacks and time lags.
Dose response: As more and more fossil fuels have been burned, the concentration of atmospheric CO2 has increased, and the rate of warming has increased [4-7].
Plausibility: The Swedish chemist Svante Arrhenius explained in 1896 how increasing the proportion of carbon dioxide in the atmosphere could lead to global warming through the greenhouse effect. He was building on fundamental physics that has been confirmed by recent satellite observations [5, 6]. A plausible mechanism clearly exists, even if some still deny that this is what is actually happening.
Coherence: The greenhouse effect due to anthropogenic greenhouse gas emissions is coherent with all other findings [4-9], while the hypothesis favoured by sceptics based on solar activity clearly is not.
Analogy: The nearest analogy we have is the improvement in the ozone layer following the Montreal Protocol, the international agreement to phase out CFCs. That demonstrates both that what happens in the upper atmosphere can affect our life on the surface of the Earth and also that it is possible for us to do something about that, provided we set our minds to it.
Whether genetically modified organisms (GMOs) are hazardous is not the sort of issue that Bradford Hill was thinking of. All the same, the criteria do clearly highlight a number of important issues that are all too often ignored. Even where they are not applicable in the form he proposed, it is useful to consider why not, whether this matters, and whether there are other ways to cover the points raised.
Strength of the association: Unlike with cancer and climate change, there are harmful effects of GMOs that happen quickly enough to be observed directly, both in the field and in feeding trials in the laboratory (see GM is Dangerous and Futile, SiS 40); and in that case the association is obviously very strong. We would naturally like to have evidence about long term effects as well, but to measure the strength of a correlation we would need two groups of people who differed only in that one had been eating GM food and the other had not. This is not possible because where GM foods are most available, as in the US, no labelling is required and so people do not know whether they have been eating them. Sir Richard Doll and his team would never have succeeded if they had not been able to distinguish the smokers in their sample from the non-smokers. We do know that food related illnesses have increased sharply in the US over the same period in which the amount of GM crops being grown has increased (US Foodborne Illnesses Up Two to Ten Fold, SiS 13/14), but the data are not adequate for a proper analysis.
Consistency: Harmful effects have been found in many laboratory experiments, including at least two in which the experimenters were clearly hoping not to find them (Transgenic Pea that Made Mice Ill, SiS 29; GM Maize MON 863 Toxic, SiS 34). People and livestock have been harmed on farms (GM Ban Long Overdue, SiS 29; Cows Ate GM Maize & Died, SiS 21; Mass Deaths in Sheep Grazing on Bt Cotton, SiS 30; More Illnesses Linked to Bt Crops, SiS 30). Many of these cases involve immune reactions.
Specificity: The effects reported on cattle and on farm workers are not observed with other new crops. And the transgenic pea example [15, 21] shows that all transgenic proteins are potentially immunogenic because proteins are processed differently in even closely related species.
Temporality: This criterion is satisfied because there is no doubt that the genetic modification occurred before the harm.
Dose response: There is no evidence for a dose response, but as the effects appear to involve the immune system, we would not necessarily expect to observe one.
Plausibility: The molecules created by genetic engineering have never existed before, and because they arise through a process that is different from ordinary breeding and typically combines genetic material from quite different species, it is easy to see how they could trigger immune responses in both humans and animals.
Coherence: There is no a priori reason to doubt that GM crops could provoke unexpected immune responses, for reasons given above under 'plausibility'; and this explanation is coherent with the biochemical mechanisms that generate new immunogenic proteins in GMOs [14, 21] (High Lysine GM Maize Withdrawn, Safety Concerns, SiS 45).
Experiment: Together with the harm suffered by people and animals on farms, laboratory experiments provide the strongest evidence for the damage that GMOs can do. Furthermore, we have seen that when the amount of GM crops being grown increased in the US, so did the incidence of food related illnesses. The next step would be to phase out GM crops and see if food-borne illnesses decrease.
Analogy: We do not need an analogous example of harm, because we have direct evidence both for health hazards and for ecological and social disasters (see most recent reports [23-25]: Farmer Suicides and Bt Cotton Nightmare Unfolding in India, Mealy Bug Plagues Bt Cotton in India and Pakistan, GM crops increase herbicide use in the United States, SiS 45). This is why we have to reject the introduction of GM crops into developing countries (see Beware the New 'Doubly Green Revolution', SiS 37).
Article first published 03/02/10