On Priority in Multivariable Analysis of Coronary Risk

There is much more to the early story of multivariable models for predicting CVD risk than the most-cited classic article, on the multiple logistic model of Truett, Cornfield, and Kannel (1963). Pioneers need to acknowledge the giant shoulders on which they stand as they show us the light. In our interviews for this archive, for example, we have the following account of a “multivariate risk score” for coronary disease that predated Truett et al. and Framingham by several years.

Gertler, Garn, and White faced the same issues of multiple confounded variables in their early cross-sectional comparison of young coronary cases versus matched controls (Gertler et al. 1951, 1954, 1959) as those that arose in the risk predictions from baseline measures of the Framingham Study cohort (Kannel 2001). Faced for the first time in CVD research with voluminous data on multiple risk variables of potential causal interest, both studies confronted the limitations of single-factor and cross-classification analyses.

We asked Gertler: “How did you get the skills (at age 28) to put a complex study like your early one into some order?” He replied that he was good in math and that Stanley Garn, anthropologist with statistical skills, played a complementary role. Moreover, for their study of young coronary patients begun in the late 1940s, they also tapped Max Woodbury, mathematician from NYU and then consulting with NASA, who had access to the earliest and grandest U.S. computing facilities. Gertler said that Woodbury “was very useful in bringing out the linear regression, where we were the first [to publish]” (Gertler 2007).

Their joint publication with Paul White and Howard Rusk on multivariable CVD risk prediction (Gertler et al. 1959) achieved far less recognition and application than Truett, Cornfield, and Kannel’s model from Framingham data in 1963. This was in part because of Framingham’s prominence, and in part because the logistic led to a model more appropriate for prospective epidemiological studies. When we asked Rick Shekelle for his opinion, he replied (pers. communication): “I think that Cornfield’s approach dominated because it led to a measurement of ‘risk’ suitable for cohort studies, whereas the Gertler-Garn approach was based on Fisher’s discriminant function and led to a probability that an individual belonged to one group or another. The former was much more useful for epidemiological purposes than the latter.”

Similarly, when asked, Paul Leaverton commented: “The fact that the log of the odds ratios are estimates in the logistic coefficients, even for multiple variables, is a great feature that I think Jerry Cornfield was the first to point out. I liked Fisher’s ingenious discriminant analysis methodology but the logistic regression is more appropriate to epidemiology.”

Framingham’s widely perceived or actual priorities in numerous areas of methods and results remain a leitmotif throughout the evolution of CVD epidemiology. (Henry Blackburn)

References

Gertler, M.M., Garn, S.M., and White, P.D., 1951. “Young candidates for coronary heart disease.” JAMA 147: 621-625.

Gertler, M.M. and White, P.D., 1954. Coronary Heart Disease in Young Adults. Commonwealth Fund. Cambridge: Harvard University Press.

Gertler, M.M., Woodbury, M.A., Gottsch, L.G., White, P.D., and Rusk, H.A., 1959. “The candidate for coronary heart disease; discriminating power of biochemical, hereditary and anthropometric measurements.” JAMA 170(2): 149-152.

Menard Gertler-Henry Blackburn interview. New York City. October 11, 2007. CVD History Archive, School of Public Health, Univ. of Minnesota.
