Cohort study

From Citizendium

A cohort study, sometimes called a panel study, is a form of longitudinal, observational study used in medicine and social science.

A cohort is a group of people who share a common characteristic or experience within a defined period (e.g., are born, leave school, lose their job, are exposed to a drug or a vaccine, etc.). Thus a group of people who were born on a particular day or in a particular period, say the year 1948, form a birth cohort. Some cohort studies track groups of children from their birth, and record a wide range of information (exposures) about them. The value of a cohort study depends on the researchers' capacity to stay in touch with all members of the cohort. Some of these studies have continued for decades. Subgroups within the cohort may be compared with each other.


A cohort study is often undertaken to measure the association between a risk factor and a disease. Crucially, the cohort is identified before the appearance of the disease under investigation. The cohort is observed over time to determine the incidence of new cases of the disease under study.

An example of an epidemiological question that can be answered by the use of a cohort study is: does exposure to X (for example, smoking) correlate with outcome Y (for example, lung cancer)? Such a study would recruit a cohort that contains both smokers and non-smokers. The investigators then follow the cohort for a set period of time and note differences in the incidence of lung cancer between the smokers and non-smokers. The groups are matched statistically on many other variables, such as economic status and general health, so that the variable being assessed, the independent variable (in this case, smoking), can be isolated as the cause of the dependent variable (in this case, lung cancer).
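The smoking example can be made concrete with a small calculation of incidence and relative risk. The case counts below are purely illustrative, not from any real study:

```python
# Hypothetical follow-up of a cohort containing smokers and non-smokers
# (illustrative numbers only, not from any real study).
smokers = {"n": 1000, "lung_cancer_cases": 30}
non_smokers = {"n": 2000, "lung_cancer_cases": 10}

def incidence(group):
    """Cumulative incidence: new cases per person over the follow-up period."""
    return group["lung_cancer_cases"] / group["n"]

risk_exposed = incidence(smokers)          # 30/1000 = 0.030
risk_unexposed = incidence(non_smokers)    # 10/2000 = 0.005
relative_risk = risk_exposed / risk_unexposed

print(f"incidence among smokers:     {risk_exposed:.3f}")
print(f"incidence among non-smokers: {risk_unexposed:.3f}")
print(f"relative risk:               {relative_risk:.1f}")
```

Because the cohort is followed forward in time, incidence can be estimated directly in each group, which is what allows the relative risk to be computed.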


Prospective cohort

An example of a cohort study that has been going on for more than 50 years is the Framingham Heart Study.

The largest cohort study in women is the Nurses' Health Study. Started in 1976, it is tracking over 120,000 nurses and has been analyzed for many different conditions and outcomes.

Retrospective cohort

A "prospective cohort" defines the groups before the study is done, while a "retrospective cohort" does the grouping after the data are collected. Thus a retrospective cohort study actually consists of two cohorts that are compared: the cohort with the exposure (independent variable) and the cohort without the exposure. Whereas prospective cohorts should be summarized with the relative risk, retrospective cohorts should be summarized with the odds ratio. Examples of retrospective cohort studies include Long-Term Mortality after Gastric Bypass Surgery[1], 'Alarm symptoms' in patients with dyspepsia: a three-year prospective study from general practice[2], and Atypical Antipsychotic Drugs and the Risk of Sudden Cardiac Death[3].
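The contrast between the relative risk and the odds ratio can be seen by computing both from the same hypothetical 2×2 table (illustrative numbers only); when the outcome is rare the two measures are close, but here the outcome is common and they diverge:

```python
# Hypothetical 2x2 table (illustrative numbers only):
#                outcome   no outcome
# exposed           a=40        b=160
# unexposed         c=10        d=190
a, b, c, d = 40, 160, 10, 190

relative_risk = (a / (a + b)) / (c / (c + d))  # risk ratio: 0.20 / 0.05
odds_ratio = (a * d) / (b * c)                 # cross-product ratio

print(f"relative risk: {relative_risk:.2f}")
print(f"odds ratio:    {odds_ratio:.2f}")
```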

Nested case-control study

An example of a nested case-control study is Inflammatory markers and the risk of coronary heart disease in men and women which was a case control analysis extracted from the Framingham Heart Study cohort.[4]

Interpreting results

The Newcastle-Ottawa Scale (NOS) may help assess the quality of nonrandomized studies.[5]

Statistical analysis

Because of the non-randomized allocation of subjects in a cohort study, several statistical approaches have been developed to reduce confounding from selection bias.

One comparison study evaluated three approaches (multiple regression, the propensity score, and a grouped treatment variable) in their ability to predict treatment outcomes in a cohort of patients who refused randomization in a chemotherapy trial.[6] The comparison examined how well each approach could use the nonrandomized patients to replicate the results from the patients who consented to randomization. It found that the propensity score did not add to traditional multiple regression, while the grouped treatment variable was least successful.[6]

Multiple regression

Multiple regression with the Cox proportional hazards model can be used to adjust for confounding variables. Multiple regression can only correct for confounding by independent variables that have been measured.
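Cox regression itself requires a survival-analysis library; as a stdlib-only sketch of the underlying idea of regression adjustment, the following fits an ordinary least-squares model with both the exposure and a measured confounder as terms, so that the exposure coefficient is estimated with the confounder held fixed. All data are synthetic:

```python
import math
import random

# Synthetic cohort: exposure is more likely when the confounder is high,
# so the crude exposure-outcome association is confounded.
random.seed(0)
rows = []
for _ in range(500):
    confounder = random.gauss(0.0, 1.0)              # e.g. standardized age
    p = 1.0 / (1.0 + math.exp(-confounder))          # P(exposed | confounder)
    exposure = 1.0 if random.random() < p else 0.0
    outcome = 0.5 * exposure + 1.0 * confounder + random.gauss(0.0, 0.5)
    rows.append((exposure, confounder, outcome))

def ols(X, y):
    """Solve the normal equations (X'X) beta = X'y by Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for i in range(k):                                # forward elimination
        for j in range(i + 1, k):
            f = A[j][i] / A[i][i]
            A[j] = [aj - f * ai for aj, ai in zip(A[j], A[i])]
            b[j] -= f * b[i]
    beta = [0.0] * k
    for i in reversed(range(k)):                      # back substitution
        beta[i] = (b[i] - sum(A[i][j] * beta[j] for j in range(i + 1, k))) / A[i][i]
    return beta

X = [(1.0, e, c) for e, c, _ in rows]   # columns: intercept, exposure, confounder
y = [o for _, _, o in rows]
intercept, beta_exposure, beta_confounder = ols(X, y)
print(f"adjusted exposure effect: {beta_exposure:.2f} (simulated true effect 0.5)")
```

As the section notes, this kind of adjustment only works for confounders that were actually measured; the simulated confounder is in the model, so the exposure estimate recovers the true effect.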

Grouped treatment variable

Creating a grouped treatment variable attempts to correct for unmeasured confounding influences.[7] In the grouped treatment approach, the "treatment individually assigned is considered to be confounded by indication, which means that patients may be selected to receive one of the treatments because of known or unknown prognostic factors."[6] For example, in an observational study that included several hospitals, creating a variable for the proportion of patients exposed to the treatment may account for biases in each hospital in deciding which patients get the treatment.[6]
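The hospital example above can be sketched directly: compute the proportion treated at each hospital and attach it to every patient as a grouped-treatment covariate. The patients and hospitals below are hypothetical:

```python
from collections import defaultdict

# Hypothetical patients from two hospitals with different treatment habits.
patients = [
    {"hospital": "A", "treated": 1}, {"hospital": "A", "treated": 1},
    {"hospital": "A", "treated": 0}, {"hospital": "B", "treated": 0},
    {"hospital": "B", "treated": 0}, {"hospital": "B", "treated": 1},
    {"hospital": "B", "treated": 0},
]

# proportion treated per hospital: hospital -> [treated count, total count]
counts = defaultdict(lambda: [0, 0])
for p in patients:
    counts[p["hospital"]][0] += p["treated"]
    counts[p["hospital"]][1] += 1
group_rate = {h: t / n for h, (t, n) in counts.items()}

# attach the group-level rate to each patient as an additional covariate
for p in patients:
    p["grouped_treatment"] = group_rate[p["hospital"]]

print(group_rate)
```

In a subsequent regression, this group-level proportion (rather than, or alongside, each patient's individual treatment) is what stands in for the treatment, which is how the approach sidesteps confounding by indication at the individual level.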

Inverse probability weighting

The inverse probability weighting attempts to correct for unmeasured confounding influences.[8] A cohort study that used this method was the North American AIDS Cohort Collaboration on Research and Design (NA-ACCORD) study of Human Immunodeficiency Virus.[9]
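A minimal sketch of inverse probability weighting with a single binary measured confounder: each subject is weighted by one over the probability of receiving the treatment they actually received given the confounder, creating a pseudo-population in which treatment is independent of that confounder. All data are hypothetical:

```python
# Hypothetical subjects: (confounder, treated, outcome)
subjects = [
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 1),
    (0, 1, 0), (0, 0, 0), (0, 0, 0), (0, 0, 1),
]

def p_treated(c):
    """Estimate P(treated | confounder = c) from the data."""
    grp = [t for conf, t, _ in subjects if conf == c]
    return sum(grp) / len(grp)

def weighted_mean(treated_value):
    """IPW-weighted mean outcome among subjects with the given treatment."""
    num = den = 0.0
    for c, t, y in subjects:
        if t != treated_value:
            continue
        p = p_treated(c) if t == 1 else 1 - p_treated(c)
        w = 1.0 / p          # inverse probability of the treatment received
        num += w * y
        den += w
    return num / den

effect = weighted_mean(1) - weighted_mean(0)
print(f"IPW-adjusted risk difference: {effect:.3f}")
```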

Principal components analysis

For more information, see: Principal components analysis.

Principal components analysis was developed by Pearson in 1901.[10] Like multiple regression, principal components analysis can only correct for confounding by independent variables that have been measured.
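For two measured variables, principal components analysis reduces to centring the data, forming the 2×2 covariance matrix, and taking its eigenvalues and eigenvectors as the principal axes. A stdlib-only sketch on a small synthetic data set:

```python
import math

# Synthetic paired measurements of two correlated variables.
xs = [2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1]
ys = [2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
cx = [x - mx for x in xs]        # centred x
cy = [y - my for y in ys]        # centred y

# sample covariance matrix [[a, b], [b, c]]
a = sum(v * v for v in cx) / (n - 1)
b = sum(u * v for u, v in zip(cx, cy)) / (n - 1)
c = sum(v * v for v in cy) / (n - 1)

# eigenvalues of a symmetric 2x2 matrix, largest first
disc = math.sqrt((a - c) ** 2 + 4 * b * b)
lam1 = (a + c + disc) / 2
lam2 = (a + c - disc) / 2
explained = lam1 / (lam1 + lam2)
print(f"first principal component explains {explained:.1%} of the variance")
```

When the first component captures most of the variance, the measured covariates can be replaced by one or a few component scores in the subsequent regression.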

Prior event rate ratio

The prior event rate ratio has been used with observational data from electronic health records to replicate the results of the Scandinavian Simvastatin Survival Study[11] and the HOPE and EUROPA trials.[12][13] Like the grouped treatment variable, the prior event rate ratio attempts to correct for unmeasured confounding influences. However, whereas the grouped treatment variable controls for the proportion of subjects selected for treatment, the prior event rate ratio uses the "ratio of event rates between the Exposed and Unexposed cohorts prior to study start time to adjust the study hazard ratio".[12]

A limitation of the prior event rate ratio is that it cannot study outcomes that have not occurred prior to the onset of treatment. For example, the prior event rate ratio cannot control for confounding in studies of primary prevention.
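The adjustment quoted above is a simple ratio calculation; a sketch with hypothetical rates (if the exposed cohort already had 1.5 times the event rate before treatment began, that factor is divided out of the hazard ratio observed during the study):

```python
# Hypothetical event rates before study start (events per person-year).
prior_rate_exposed = 0.06
prior_rate_unexposed = 0.04
study_hazard_ratio = 1.20      # unadjusted hazard ratio during the study

# PERR adjustment: divide the study HR by the pre-study rate ratio.
prior_event_rate_ratio = prior_rate_exposed / prior_rate_unexposed
adjusted_hr = study_hazard_ratio / prior_event_rate_ratio

print(f"prior event rate ratio: {prior_event_rate_ratio:.2f}")
print(f"adjusted hazard ratio:  {adjusted_hr:.2f}")
```

This also makes the stated limitation visible: without pre-treatment events in both cohorts, the prior rate ratio in the denominator cannot be computed.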

Propensity score matching

The propensity score was introduced by Rosenbaum and Rubin in 1983.[14][15] The propensity score is the "conditional probability of receiving one of the treatments under comparison ... given the observed covariates."[6] The propensity score can only correct for confounding by independent variables that have been measured.
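A sketch of 1:1 nearest-neighbour matching on the propensity score, one common way the score is used. The scores here are taken as given (in practice they come from a model such as logistic regression on the observed covariates), and all names and values are hypothetical:

```python
# Hypothetical (subject, propensity score) pairs.
treated = [("t1", 0.30), ("t2", 0.55), ("t3", 0.80)]
controls = [("c1", 0.25), ("c2", 0.40), ("c3", 0.60), ("c4", 0.90)]

caliper = 0.10                 # maximum allowed score difference for a match
available = dict(controls)
pairs = []
for name, score in treated:
    # nearest remaining control by propensity score
    best = min(available, key=lambda c: abs(available[c] - score), default=None)
    if best is not None and abs(available[best] - score) <= caliper:
        pairs.append((name, best))
        del available[best]    # match without replacement

print(pairs)
```

Outcomes are then compared within the matched pairs, mimicking the balance on measured covariates that randomization would have produced.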

Sensitivity analysis

Sensitivity analysis can estimate how strong an unmeasured confounder would have to be to account for the observed effect of a factor under study.[16] An example of this analysis was a nonrandomized comparison of when to initiate treatment for asymptomatic Human Immunodeficiency Virus infection in the North American AIDS Cohort Collaboration on Research and Design (NA-ACCORD) study.[9]
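One simple, widely used formulation of this idea (VanderWeele and Ding's E-value, a later development than the cited reference) computes the minimum strength of association an unmeasured confounder would need with both the exposure and the outcome to fully explain away an observed risk ratio:

```python
import math

def e_value(rr):
    """E-value for a point estimate: minimum confounder strength (on the
    risk-ratio scale) needed to explain away an observed risk ratio."""
    rr = max(rr, 1 / rr)           # orient so that RR >= 1
    return rr + math.sqrt(rr * (rr - 1))

observed_rr = 2.0
print(f"E-value for RR={observed_rr}: {e_value(observed_rr):.2f}")
```

An observed risk ratio of 2.0 would require an unmeasured confounder associated with both exposure and outcome by a risk ratio of about 3.4 to be fully explained by confounding; weaker confounding could not do it.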

Determining causality

Bradford Hill criteria

If statistically significant associations are found, the Bradford Hill criteria can help determine whether the associations represent true causality. The Bradford Hill criteria were proposed in 1965:[17]

  • Strength or magnitude of association?
  • Consistency of association across studies?
  • Specificity of association?
  • Temporality of association?
  • Plausibility based on biological knowledge?
  • Biological gradient (dose-response relationship)?
  • Coherence (does the proposed association explain other observations)?
  • Experimental evidence?
  • Analogy?


Immortal time bias

"Immortal time is a span of cohort follow-up during which, because of exposure definition, the outcome under study could not occur."[18]
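A numeric sketch of the bias (hypothetical numbers): if exposure is defined by an event such as filling a prescription, the follow-up time before that event is "immortal" for the exposed group, and crediting it to the exposed group deflates the exposed event rate.

```python
# Hypothetical exposed cohort.
events_exposed = 10
person_years_exposed = 200.0   # person-time counted from cohort entry
immortal_years = 40.0          # pre-exposure time wrongly credited as exposed

# Rate including the immortal time vs. rate after reassigning it.
biased_rate = events_exposed / person_years_exposed
corrected_rate = events_exposed / (person_years_exposed - immortal_years)

print(f"biased rate:    {biased_rate:.4f} events/person-year")
print(f"corrected rate: {corrected_rate:.4f} events/person-year")
```

Here the biased analysis understates the exposed event rate by 20%, making the exposure look more protective than it is.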

Alternative study designs

Case-control study

Rare outcomes, or those that develop slowly over long periods, are generally not studied with a cohort study, but rather with a case-control study. Retrospective studies may exaggerate associations.[19]

Randomized controlled trial

For more information, see: Randomized controlled trial.

Randomized controlled trials (RCTs) are a superior methodology in the hierarchy of evidence, because they limit the potential for bias by randomly assigning one patient pool to an intervention and another patient pool to non-intervention (or placebo). This minimizes the chance that the incidence of confounding variables will differ between the two groups.[20][21]

Empirical comparisons of observational studies and RCTs conflict: some find[22][23][24][25] and others do not find[26][27] evidence of exaggerated results from cohort studies.

Nevertheless, it is sometimes not practical or ethical to perform RCTs to answer a clinical question. To take our example, if we already had reasonable evidence that smoking causes lung cancer then persuading a pool of non-smokers to take up smoking in order to test this hypothesis would generally be considered quite unethical.

Standards for the conduct and reporting

Standards are available for the reporting of observational studies[28][29][30][31] with accompanying explanation and elaboration[32].

Many scales and checklists have been proposed for assessing the quality of cohort studies.[33] The most common items assessed with these tools are:

  • Selecting study participants (92% of tools)
  • Measurement of study variables (exposure, outcome and/or confounding variables) (86% of tools)
  • Sources of bias (including recall bias, interviewer bias and biased loss to follow-up but excluding confounding) (86% of tools)
  • Control of confounding (78% of tools)
  • Statistical methods (78% of tools)
  • Conflict of interest (3% of tools)

Of these tools, only one was designed for use in comparing cohort studies in any clinical setting for the purpose of conducting a systematic review of cohort studies[34]; however, this tool has been described as "extremely complex and require considerable input to calculate raw scores and to convert to final scores, depending on the primary study design and methods".[33]


  1. Adams TD, Gress RE, Smith SC, et al (2007). "Long-term mortality after gastric bypass surgery". N. Engl. J. Med. 357 (8): 753-61. DOI:10.1056/NEJMoa066603. PMID 17715409.
  2. Meineche-Schmidt V, Jørgensen T (2002). "'Alarm symptoms' in patients with dyspepsia: a three-year prospective study from general practice". Scand. J. Gastroenterol. 37 (9): 999–1007. PMID 12374244.
  3. Ray WA, Chung CP, Murray KT, Hall K, Stein CM (2009). "Atypical Antipsychotic Drugs and the Risk of Sudden Cardiac Death". N. Engl. J. Med. 360 (3): 225-235. DOI:10.1056/NEJMoa0806994. PMID 19144938.
  4. Pai JK, Pischon T, Ma J, et al (2004). "Inflammatory markers and the risk of coronary heart disease in men and women". N. Engl. J. Med. 351 (25): 2599-610. DOI:10.1056/NEJMoa040967. PMID 15602020.
  5. Wells GA, Shea B, O'Connell D, Peterson J, Welch V, Losos M, Tugwell P. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses.
  6. Schmoor C, Caputo A, Schumacher M (2008). "Evidence from nonrandomized studies: a case study on the estimation of causal effects". Am. J. Epidemiol. 167 (9): 1120–9. DOI:10.1093/aje/kwn010. PMID 18334500.
  7. Johnston SC, Henneman T, McCulloch CE, van der Laan M (2002). "Modeling treatment effects on binary outcomes with grouped-treatment variables and individual covariates". Am. J. Epidemiol. 156 (8): 753–60. PMID 12370164.
  8. Hernán MA, Lanoy E, Costagliola D, Robins JM (2006). "Comparison of dynamic treatment regimes via inverse probability weighting". Basic Clin. Pharmacol. Toxicol. 98 (3): 237–42. DOI:10.1111/j.1742-7843.2006.pto_329.x. PMID 16611197.
  9. Kitahata MM, Gange SJ, Abraham AG, et al (2009). "Effect of Early versus Deferred Antiretroviral Therapy for HIV on Survival". N. Engl. J. Med. DOI:10.1056/NEJMoa0807252. PMID 19339714.
  10. Pearson K (1901). "On lines and planes of closest fit to systems of points in space". Philosophical Magazine 2: 559–572.
  11. Weiner MG, Xie D, Tannen RL (2008). "Replication of the Scandinavian Simvastatin Survival Study using a primary care medical record database prompted exploration of a new method to address unmeasured confounding". Pharmacoepidemiol Drug Saf. DOI:10.1002/pds.1585. PMID 18327857.
  12. Tannen RL, Weiner MG, Xie D (2008). "Replicated studies of two randomized trials of angiotensin-converting enzyme inhibitors: further empiric validation of the 'prior event rate ratio' to adjust for unmeasured confounding by indication". Pharmacoepidemiol Drug Saf. DOI:10.1002/pds.1584. PMID 18327852.
  13. Tannen RL, Weiner MG, Xie D (2009). "Use of primary care electronic medical record database in drug efficacy research on cardiovascular outcomes: comparison of database and randomised controlled trial findings". BMJ 338: b81. PMID 19174434.
  14. Rosenbaum PR, Rubin DB (1983). "The central role of the propensity score in observational studies for causal effects". Biometrika 70 (1): 41. DOI:10.1093/biomet/70.1.41.
  15. Hill J (2008). "Discussion of research using propensity-score matching: Comments on 'A critical appraisal of propensity-score matching in the medical literature between 1996 and 2003' by Peter Austin, Statistics in Medicine". Stat Med 27 (12): 2055–2061. DOI:10.1002/sim.3245. PMID 18446836.
  16. Schneeweiss S (2006). "Sensitivity analysis and external adjustment for unmeasured confounders in epidemiologic database studies of therapeutics". Pharmacoepidemiol Drug Saf 15 (5): 291–303. DOI:10.1002/pds.1200. PMID 16447304.
  17. Hill AB (1965). "The Environment And Disease: Association Or Causation?". Proc. R. Soc. Med. 58: 295–300. PMID 14283879. PMC 1898525.
  18. Suissa S (2008). "Immortal time bias in pharmaco-epidemiology". Am. J. Epidemiol. 167 (4): 492–9. DOI:10.1093/aje/kwm324. PMID 18056625.
  19. Eikelboom JW, Lonn E, Genest J, Hankey G, Yusuf S (1999). "Homocyst(e)ine and cardiovascular disease: a critical review of the epidemiologic evidence". Ann. Intern. Med. 131 (5): 363–75. PMID 10475890.
  20. Pocock SJ, Elbourne DR (2000). "Randomized trials or observational tribulations?". N. Engl. J. Med. 342 (25): 1907–9. PMID 10861329.
  21. Barton S (2000). "Which clinical studies provide the best evidence? The best RCT still trumps the best observational study". BMJ 321 (7256): 255–6. PMID 10915111. PMC 1118259.
  22. Ioannidis JP, Haidich AB, Pappa M, et al (2001). "Comparison of evidence of treatment effects in randomized and nonrandomized studies". JAMA 286 (7): 821–30. PMID 11497536.
  23. Guyatt GH, DiCenso A, Farewell V, Willan A, Griffith L (2000). "Randomized trials versus observational studies in adolescent pregnancy prevention". J Clin Epidemiol 53 (2): 167–74. PMID 10729689.
  24. Kunz R, Oxman AD (1998). "The unpredictability paradox: review of empirical comparisons of randomised and non-randomised clinical trials". BMJ 317 (7167): 1185–90. PMID 9794851. PMC 28700.
  25. Phillips AN, Grabar S, Tassie JM, Costagliola D, Lundgren JD, Egger M (1999). "Use of observational databases to evaluate the effectiveness of antiretroviral therapy for HIV infection: comparison of cohort studies with randomized trials. EuroSIDA, the French Hospital Database on HIV and the Swiss HIV Cohort Study Groups". AIDS 13 (15): 2075–82. PMID 10546860.
  26. Benson K, Hartz AJ (2000). "A comparison of observational studies and randomized, controlled trials". N. Engl. J. Med. 342 (25): 1878–86. PMID 10861324.
  27. Concato J, Shah N, Horwitz RI (2000). "Randomized, controlled trials, observational studies, and the hierarchy of research designs". N. Engl. J. Med. 342 (25): 1887–92. PMID 10861325. PMC 1557642.
  28. Equator Network. Guidance for reporting observational studies.
  29. National Library of Medicine. Research Reporting Guidelines and Initiatives: By Organization.
  30. STrengthening the Reporting of OBservational studies in Epidemiology (STROBE).
  31. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP, et al (2007). "The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies". Ann. Intern. Med. 147 (8): 573-7. PMID 17938396.
  32. Vandenbroucke JP, von Elm E, Altman DG, Gøtzsche PC, Mulrow CD, Pocock SJ, et al (2007). "Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): explanation and elaboration". Ann. Intern. Med. 147 (8): W163-94. PMID 17938389.
  33. Sanderson S, Tatt ID, Higgins JP (2007). "Tools for assessing quality and susceptibility to bias in observational studies in epidemiology: a systematic review and annotated bibliography". Int. J. Epidemiol. 36 (3): 666-76. DOI:10.1093/ije/dym018. PMID 17470488.
  34. Margetts BM, Thompson RL, Key T, Duffy S, Nelson M, Bingham S, et al (1995). "Development of a scoring system to judge the scientific quality of information from case-control and cohort studies of nutrition and disease". Nutr Cancer 24 (3): 231-9. PMID 8610042.

See also