University of Minnesota

Association-Causation in Epidemiology: Stories of Guidelines to Causality

A profound development in the analysis and interpretation of evidence about CVD risk, and indeed for all of epidemiology, was the evolution of criteria or guidelines for causal inference from statistical associations. These are now commonly attributed to the 1964 USPHS Report of the Advisory Committee to the Surgeon General on Smoking and Health, where they were formalized and first published (PHS 1964).

In fact, the historical evolution of the guidelines derives from Koch’s postulates, which so greatly strengthened medical science and causal inference about the agents and vectors in communicable diseases (Koch 1890). Several other steps, however, preceded the formulation of the 1964 Report of the Advisory Committee (Blackburn and Labarthe 2012).

Yerushalmy and Palmer (1959) were strongly motivated to devise guidelines for causality in reaction to Ancel Keys’s first major published proposal of a diet-heart hypothesis, from a Mt. Sinai Hospital discourse given in early 1953 (Keys 1953). After Yerushalmy and Hilleboe severely criticized that evidence (Yerushalmy and Hilleboe 1957), Yerushalmy introduced his subsequent thinking as a discussion with Palmer:

“The major weakness of observations on humans stems from the fact that they often do not possess the characteristic of group comparability, a basic requirement which in experimentation is accomplished by conscious effort through randomization. The possibility always exists, therefore, that such association as observed may . . . be due to factors other than those under study” (Yerushalmy and Palmer 1959, 28). They then summarize their guidelines:

For purposes of discussion the following statements are suggested as a first approach toward the development of acceptable guideposts for the implication of a characteristic as an etiologic factor in a chronic disease:

  1. The suspected characteristic must be found more frequently in persons with the disease in question than in persons without the disease, or
  2. Persons possessing the characteristic must develop the disease more frequently than do persons not possessing the characteristic.
  3. An observed association between a characteristic and a disease must be tested for validity by investigating the relationship between the characteristic and other diseases and, if possible, the relationship of similar or related characteristics to the disease in question. . . In general, the lower the frequency of these other associations the higher is the specificity of the original observed association and the higher the validity of the causal inference (ibid., 40).
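The first two guideposts describe two ways of reading the same two-by-two table: comparing how often the characteristic occurs among persons with and without the disease, or comparing how often the disease develops among persons with and without the characteristic. The following is a minimal Python sketch of that arithmetic, not anything taken from Yerushalmy and Palmer. The class name `TwoByTwo`, its method names, and the counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts of persons cross-classified by characteristic and disease.

    Hypothetical helper for illustration; assumes every row and column
    total is non-zero (no guard against division by zero).
    """
    exposed_cases: int
    exposed_noncases: int
    unexposed_cases: int
    unexposed_noncases: int

    def exposure_frequency(self):
        """Guidepost 1: frequency of the characteristic among persons
        with the disease and among persons without it."""
        cases = self.exposed_cases + self.unexposed_cases
        noncases = self.exposed_noncases + self.unexposed_noncases
        return self.exposed_cases / cases, self.exposed_noncases / noncases

    def disease_frequency(self):
        """Guidepost 2: frequency of the disease among persons with the
        characteristic and among persons without it."""
        exposed = self.exposed_cases + self.exposed_noncases
        unexposed = self.unexposed_cases + self.unexposed_noncases
        return self.exposed_cases / exposed, self.unexposed_cases / unexposed

    def risk_ratio(self):
        """Ratio of the two disease frequencies from guidepost 2."""
        exposed = self.exposed_cases + self.exposed_noncases
        unexposed = self.unexposed_cases + self.unexposed_noncases
        return (self.exposed_cases * unexposed) / (self.unexposed_cases * exposed)

    def odds_ratio(self):
        """Cross-product ratio, usable with either reading of the table."""
        return (self.exposed_cases * self.unexposed_noncases) / (
            self.exposed_noncases * self.unexposed_cases
        )


# Invented counts, not data from any study cited here.
table = TwoByTwo(exposed_cases=30, exposed_noncases=70,
                 unexposed_cases=10, unexposed_noncases=90)
print("characteristic among diseased vs. not:", table.exposure_frequency())
print("disease among exposed vs. unexposed:", table.disease_frequency())
print("risk ratio:", table.risk_ratio())
print("odds ratio:", round(table.odds_ratio(), 2))
```

With these invented counts both comparisons point the same way. As the quoted passage on group comparability stresses, such a ratio only describes an association; it does not by itself show that the groups are comparable.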

The same year, Lilienfeld responded to Yerushalmy with more specifics on specificity: cases occurring without the characteristic under consideration do not invalidate a hypothesis but weaken it, and the frequency of the characteristic without the disease does not invalidate the hypothesis, because there are accessory factors affecting susceptibility (Lilienfeld 1959).

In his response to the same discussion, Sartwell at Johns Hopkins added these criteria to those strengthening causal inference (Sartwell 1960), which we paraphrase:

  1. Confirmation of the association by replication (different investigators and populations).
  2. Graded effect, that is, a quantitative relationship between the intensity or frequency of the characteristic or exposure and the frequency of the disease.
  3. Chronologic relationship. As in Koch’s postulates, the characteristic must precede the disease and trends in each should be parallel.
  4. Biologic reasonableness of the association is not to be left out but is left suspect because it is judgmental. [Note: For example, Yerushalmy’s specificity criterion above would not allow for a characteristic inducing multiple diseases, as we now know is true for smoking, diet, alcohol, etc., or for John Snow’s hypothesis about foul water causing the cholera epidemic before it was biologically plausible to incriminate the undiscovered cholera vibrio.]

At any rate, after all this dialogue, the criteria evolved further and were proposed by Reuel Stallones, then at the University of California, in his draft report to the Advisory Committee on the association of tobacco and coronary disease (Stallones 1963). He presented them in nearly the exact form in which they were published in the final report of 1964, though with this now familiar introduction added:

Statistical methods cannot establish proof of a causal relationship in an association. The causal significance of an association is a matter of judgment which goes beyond any statement of statistical probability. To judge or evaluate the causal significance of the association between the attribute or agent and the disease, or effect upon health, a number of criteria must be utilized, no one of which is an all-sufficient basis for judgment. These criteria include:

  1. The consistency of the association
  2. The strength of the association
  3. The specificity of the association
  4. The temporal relationship of the association
  5. The coherence of the association

In his Presidential Address to the Section of Occupational Medicine of the Royal Society of Medicine in 1965, Austin Bradford Hill, the head of epidemiology at the London School of Hygiene and Tropical Medicine and professor emeritus of statistics at the University of London, asks “what aspects of [an] association should we especially consider before deciding that the most likely interpretation of it is causation?” He proceeds to elaborate the principles we rely on today, each followed by a thoroughgoing essay justifying it. The clarity and beauty of his exact words eclipse the truncated listing of those criteria in the 1964 Report of the Surgeon General’s Advisory Committee. Hence, we quote them verbatim:

  1. “First upon my list I would put the strength of the association.” [And he goes on to describe his own finding of twenty times greater rate of lung cancer in heavy smokers of cigarettes and John Snow’s finding of fourteen times greater mortality rate from cholera in the customers supplied water from the Southwark and Vauxhall Company, etc. (Bradford Hill 1965, 295)]
  2. “Next on my list of features to be specially considered I would place the consistency of the observed association” (ibid., 296). [Here he cites the U.S. Surgeon General’s Report of 1964 in which the association of smoking with cancer of the lung was found in twenty-nine retrospective and seven prospective inquiries.]
  3. “. . . Specificity of the association [is] the third characteristic which invariably we must consider.” [If the association is limited to specific occupations, for example, and not to others, this “is a strong argument in favor of causation. If specificity exists we may be able to draw conclusions without hesitation; if it is not apparent, we are not thereby necessarily left sitting irresolutely on the fence” (ibid., 297).]
  4. “My fourth characteristic is the temporal relationship [temporality] of the association–which is the cart and which is the horse?” And this is “particularly relevant with diseases of slow development” (ibid., 297).
  5. “Fifthly, if the association is one which can reveal a biological gradient, or dose-response curve, then we should look most carefully for such evidence.” In smoking and lung cancer, “the clear dose-response curve admits of a simple explanation and obviously puts the case in a clearer light” (ibid., 298).
  6. “It will be helpful if the causation we suspect is biologically plausible.” But, “what is biologically plausible depends upon the biological knowledge of the day” (ibid., 298).
  7. “Coherence: . . . the cause-and-effect interpretation of our data should not seriously conflict with the generally known facts of the natural history and biology of the disease” (ibid., 298). [Hill adopted the word “coherence” from the 1964 Surgeon General’s Advisory Committee report (PHS 1964).]
  8. “Experiment: occasionally it is possible to appeal to experimental or semi-experimental evidence.” If people stop smoking cigarettes “is the frequency of the associated events affected? Here the strongest support for the causation hypothesis may be revealed” (ibid., 298).

Hill concludes: “No formal tests of significance can answer those questions” (ibid., 299). The grand sage has this further to say about taking action based on the evidence and causal inference:

“We should need very strong evidence before we made people burn a fuel in the homes that they do not like or stop smoking the cigarettes and eating the fats and sugar that they do like. In asking for very strong evidence I would, however, repeat emphatically that this does not imply crossing every ‘t’, and [crossing] swords with every critic, before we act.

“All scientific work is incomplete–whether it be observational or experimental. All scientific work is liable to be upset or modified by advancing knowledge. That does not confer upon us a freedom to ignore the knowledge we already have, or to postpone the action that it appears to demand at a given time.

“Who knows, asked Robert Browning, but the world may end tonight? True, but on available evidence most of us make ready to commute on the 8:30 next day.” (Henry Blackburn)

[1] Stallones’s daughter, epidemiologist Lorann Stallones, reports: “It is my understanding that Mickey LeMaistre (former President of the University of Texas) has a napkin that Dad wrote these [guidelines for causal inference] down on while they were working on the Surgeon General’s Report” (pers. comm. to H. Blackburn, 2008). We have not confirmed this, since we have Stallones’s original mimeographed draft in our Minnesota archive.


Blackburn, H. and Labarthe, D. 2012. Stories from the evolution of guidelines for causal inference in epidemiologic associations: 1953-1965. American Journal of Epidemiology 176 (12): 1071-1077.

Bradford Hill, A. 1965. President’s Address. The Environment and Disease: Association or Causation? Proceedings of the Royal Society of Medicine 58: 295-300.

Koch, R. 1893. Über den augenblicklichen Stand der bakteriologischen Choleradiagnose (in German). Zeitschrift für Hygiene und Infectionskrankheiten 14: 319-333.

Lilienfeld, A.M. 1959. On the methodology of investigations of etiologic factors in chronic diseases. Some comments. Journal of Chronic Diseases 10: 41-43.

Sartwell, P.E. 1960. On the methodology of investigations of etiologic factors in chronic diseases. Further comments. Journal of Chronic Diseases 11: 61-63.

Stallones, R.A. 1963. The association between tobacco smoking and coronary heart disease. Draft of June 28 to the Surgeon General’s Advisory Committee on Smoking and Health. L. Schuman papers. University of Minnesota Archive.

US Department of Health, Education, and Welfare. Public Health Service. 1964. Smoking and Health. Report of the Advisory Committee to the Surgeon General of the Public Health Service. Publication #1103.

Yerushalmy, J. and Hilleboe, H.E. 1957. Fat in the diet and mortality from heart disease. New York State Journal of Medicine 57: 2343-2354.

Yerushalmy, J. and Palmer, C.E. 1959. On the methodology of investigations of etiologic factors in chronic diseases. Journal of Chronic Diseases 10: 27-40.