What Is a Case-Control Study? | Definition & Examples

Published on February 4, 2023 by Tegan George. Revised on June 22, 2023.

A case-control study is an observational study design that compares a group of participants possessing a condition of interest to a very similar group lacking that condition. Here, the participants possessing the attribute of study, such as a disease, are called the “cases,” and those without it are the “controls.”

It’s important to remember that the case group is chosen because they already possess the attribute of interest. The point of the control group is to facilitate investigation, e.g., studying whether the case group systematically exhibits that attribute more than the control group does.

Case-control studies are a type of observational study often used in fields like medical research, environmental health, or epidemiology. While most observational studies are qualitative in nature, case-control studies can also be quantitative, and they often are in healthcare settings. Case-control studies can be used for both exploratory and explanatory research, and they are a good choice for studying research topics like disease exposure and health outcomes.

A case-control study may be a good fit for your research if it meets the following criteria.

  • Data on exposure (e.g., to a chemical or a pesticide) are difficult to obtain or expensive.
  • The disease associated with the exposure you’re studying has a long incubation period or is rare or under-studied (e.g., AIDS in the early 1980s).
  • The population you are studying is difficult to contact for follow-up questions (e.g., asylum seekers).

Retrospective cohort studies use existing secondary research data, such as medical records or databases, to identify a group of people with a common exposure or risk factor and to observe their outcomes over time. Case-control studies conduct primary research, comparing a group of participants possessing a condition of interest to a very similar group lacking that condition in real time.

Case-control studies are common in fields like epidemiology, healthcare, and psychology.

You would then collect data on your participants’ exposure to contaminated drinking water, focusing on variables such as the source of said water and the duration of exposure, for both groups. You could then compare the two to determine if there is a relationship between drinking water contamination and the risk of developing a gastrointestinal illness.

Example: Healthcare case-control study

You are interested in the relationship between the dietary intake of a particular vitamin (e.g., vitamin D) and the risk of developing osteoporosis later in life. Here, the case group would be individuals who have been diagnosed with osteoporosis, while the control group would be individuals without osteoporosis.

You would then collect information on dietary intake of vitamin D for both the cases and controls and compare the two groups to determine if there is a relationship between vitamin D intake and the risk of developing osteoporosis.

Example: Psychology case-control study

You are studying the relationship between early-childhood stress and the likelihood of later developing post-traumatic stress disorder (PTSD). Here, the case group would be individuals who have been diagnosed with PTSD, while the control group would be individuals without PTSD.

Case-control studies are a solid research method choice, but they come with distinct advantages and disadvantages.

Advantages of case-control studies

  • Case-control studies are a great choice if you have any ethical considerations about your participants that could preclude you from using a traditional experimental design.
  • Case-control studies are time efficient and fairly inexpensive to conduct because they require fewer subjects than other research methods.
  • If multiple exposures lead to a single outcome, case-control studies can incorporate them. As such, they truly shine when used to study rare outcomes or outbreaks of a particular disease.

Disadvantages of case-control studies

  • Case-control studies, like other observational studies, run a high risk of research biases. They are particularly susceptible to observer bias, recall bias, and interviewer bias.
  • When the exposure being studied is very rare, conducting a case-control study can be very time-consuming and inefficient.
  • Case-control studies in general have low internal validity and are not always credible.

Case-control studies by design focus on one singular outcome. This makes them very rigid and not generalizable, as no extrapolation can be made about other outcomes like risk recurrence or future exposure threat. This leads to less satisfying results than other methodological choices.

A case-control study differs from a cohort study because cohort studies are more longitudinal in nature and do not necessarily require a control group.

While one may be added if the investigator so chooses, members of the cohort are primarily selected because of a shared characteristic among them. In particular, retrospective cohort studies are designed to follow a group of people with a common exposure or risk factor over time and observe their outcomes.

Case-control studies, in contrast, require both a case group and a control group, as suggested by their name, and usually are used to identify risk factors for a disease by comparing cases and controls.

A case-control study differs from a cross-sectional study because case-control studies are naturally retrospective in nature, looking backward in time to identify exposures that may have occurred before the development of the disease.

On the other hand, cross-sectional studies collect data on a population at a single point in time. The goal here is to describe the characteristics of the population, such as their age, gender identity, or health status, and understand the distribution and relationships of these characteristics.

Cases and controls are selected for a case-control study based on their inherent characteristics. Participants already possessing the condition of interest form the “case,” while those without form the “control.”

Keep in mind that by definition the case group is chosen because they already possess the attribute of interest. The point of the control group is to facilitate investigation, e.g., studying whether the case group systematically exhibits that attribute more than the control group does.

The strength of the association between an exposure and a disease in a case-control study is most often measured with the odds ratio (OR). Relative risk (RR) generally cannot be calculated directly from case-control data, but the OR approximates it when the disease is rare.
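
As a rough illustration of the odds ratio mentioned above, the Python sketch below computes it from a 2x2 table of exposure by case status, together with an approximate 95% confidence interval built on the log odds ratio (often called Woolf's method). The function name and the counts are hypothetical and are not taken from any real study.

```python
import math

def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls):
    """Return the odds ratio and an approximate 95% confidence interval."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)                      # cross-product ratio
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, (lower, upper)

# Hypothetical counts, invented for illustration only.
estimate, (lower, upper) = odds_ratio_with_ci(40, 60, 20, 80)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio above 1 with an interval that excludes 1 would suggest that the exposure is more common among cases, which is an association and not proof of causation.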

No, case-control studies cannot establish causality as a standalone measure.

As observational studies, they can suggest associations between an exposure and a disease, but they cannot prove without a doubt that the exposure causes the disease. In particular, issues arising from timing, research biases like recall bias, and the selection of variables lead to low internal validity and the inability to determine causality.

Sources in this article

George, T. (2023, June 22). What Is a Case-Control Study? | Definition & Examples. Scribbr. Retrieved September 3, 2024, from https://www.scribbr.com/methodology/case-control-study/
Schlesselman, J. J. (1982). Case-Control Studies: Design, Conduct, Analysis (Monographs in Epidemiology and Biostatistics, 2) (Illustrated). Oxford University Press.

Oomph Library Resources: PHW 250/250B Epidemiologic Methods: Epidemiologic Case Study Resources

Epidemiologic Case Studies

  • Epidemiologic Case Studies (US CDC) These case studies are interactive exercises developed to teach epidemiologic principles and practices. They are based on real-life outbreaks and public health problems and were developed in collaboration with the original investigators and experts from the Centers for Disease Control and Prevention (CDC). The case studies require students to apply their epidemiologic knowledge and skills to problems confronted by public health practitioners at the local, state, and national level every day.
  • Case Studies (WHO) From "Strengthening health security by implementing the International Health Regulations," each case has learning objectives and documentation.
  • Case Studies in Social Medicine A series of Perspective articles from the New England Journal of Medicine that highlight the importance of social concepts and social context in clinical medicine. The series uses discussions of real clinical cases to translate theories and methods for understanding social processes into terms that can readily be used in medical education, clinical practice, and health system planning.
  • African Case Studies in Public Health Case study exercises based on real events in African contexts and written by experienced Africa-based public health trainers and practitioners. These case studies represent the most up-to-date and context-appropriate case study exercises for African public health training programs. These exercises are designed to reinforce and instill competencies for addressing health threats in the future leaders of public health in Africa.
  • Case Consortium @ Columbia University: Public Health Cases The case collection includes "teaching" cases. Nearly all the cases are multimedia and based on original research; a few are written from secondary sources. All cases are offered free of charge.
  • Epi Teams Training: Case Studies From the North Carolina Institute for Public Health, this curriculum includes several interactive case studies designed to be used by the Epi Team as a group. These case studies are based on actual outbreaks that have occurred in North Carolina and elsewhere.
  • National Center for Case Study Teaching in Science The mission of the NCCSTS at the University at Buffalo is to promote the development and dissemination of materials and practices for case teaching in the sciences. Our website provides access to an award-winning collection of peer-reviewed case studies. We offer a five-day summer workshop and a two-day fall conference to train faculty in the case method of teaching science. In addition, we are actively engaged in educational research to assess the impact of the case method on student learning. "Case Collection" includes over 100 public health cases.

  • Last Updated: Aug 15, 2024 2:56 PM
  • URL: https://guides.lib.berkeley.edu/publichealth/PHW250

Case Control Studies

Affiliations

  • 1 University of Nebraska Medical Center
  • 2 Spectrum Health/Michigan State University College of Human Medicine
  • PMID: 28846237
  • Bookshelf ID: NBK448143

A case-control study is a type of observational study commonly used to look at factors associated with diseases or outcomes. The case-control study starts with a group of cases, which are the individuals who have the outcome of interest. The researcher then tries to construct a second group of individuals called the controls, who are similar to the case individuals but do not have the outcome of interest. The researcher then looks at historical factors to identify if some exposure(s) is/are found more commonly in the cases than the controls. If the exposure is found more commonly in the cases than in the controls, the researcher can hypothesize that the exposure may be linked to the outcome of interest.

For example, a researcher may want to look at the rare cancer Kaposi's sarcoma. The researcher would find a group of individuals with Kaposi's sarcoma (the cases) and compare them to a group of patients who are similar to the cases in most ways but do not have Kaposi's sarcoma (controls). The researcher could then ask about various exposures to see if any exposure is more common in those with Kaposi's sarcoma (the cases) than those without Kaposi's sarcoma (the controls). The researcher might find that those with Kaposi's sarcoma are more likely to have HIV, and thus conclude that HIV may be a risk factor for the development of Kaposi's sarcoma.

There are many advantages to case-control studies. First, the case-control approach allows for the study of rare diseases. If a disease occurs very infrequently, one would have to follow a large group of people for a long period of time to accrue enough incident cases to study. Such use of resources may be impractical, so a case-control study can be useful for identifying current cases and evaluating historical associated factors. For example, if a disease developed in 1 in 1000 people per year (0.001/year), then in ten years one would expect about 10 cases of the disease to exist in a group of 1000 people. If the disease is much rarer, say 1 in 1,000,000 per year (0.000001/year), accruing ten total cases would require following either 1,000,000 people for ten years or 1000 people for 10,000 years. As it may be impractical to follow 1,000,000 people for ten years or to wait 10,000 years for recruitment, a case-control study allows for a more feasible approach.
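
The arithmetic behind this argument is a person-years calculation: expected cases equal the incidence rate multiplied by the number of people and the years of follow-up. The short Python sketch below works through it; the function names are hypothetical, and the calculation assumes a constant incidence rate and ignores people leaving the at-risk group.

```python
def expected_cases(rate_per_person_year, people, years):
    """Expected number of new cases under a constant incidence rate."""
    return rate_per_person_year * people * years

def years_of_follow_up(target_cases, rate_per_person_year, people):
    """Years of follow-up needed to expect a given number of cases."""
    return target_cases / (rate_per_person_year * people)

# A disease occurring in 1 in 1000 people per year, 1000 people, ten years.
common = expected_cases(0.001, 1000, 10)

# A disease occurring in 1 in 1,000,000 people per year, 1,000,000 people.
rare = years_of_follow_up(10, 0.000001, 1_000_000)

print(f"Expected cases for the more common disease: {common:.1f}")
print(f"Years needed to expect ten cases of the rarer disease: {rare:.1f}")
```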

Second, the case-control study design makes it possible to look at multiple risk factors at once. In the example above about Kaposi's sarcoma, the researcher could ask both the cases and controls about exposures to HIV, asbestos, smoking, lead, sunburns, aniline dye, alcohol, herpes, human papillomavirus, or any number of possible exposures to identify those most likely associated with Kaposi's sarcoma.

Case-control studies can also be very helpful when disease outbreaks occur, and potential links and exposures need to be identified. This study mechanism can be commonly seen in food-related disease outbreaks associated with contaminated products, or when rare diseases start to increase in frequency, as has been seen with measles in recent years.

Because of these advantages, case-control studies are commonly used as one of the first studies to build evidence of an association between exposure and an event or disease.

In a case-control study, the investigator can enroll unequal numbers of cases and controls, for example at a control-to-case ratio of 2:1 or 4:1, to increase the power of the study.
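
One way to see why extra controls help, and why the benefit tapers off, is to look at the approximate standard error of the log odds ratio as the number of controls per case grows. The sketch below uses expected cell counts and Woolf's variance approximation; the number of cases and both exposure prevalences are assumptions chosen only for illustration.

```python
import math

def se_log_or(n_cases, controls_per_case, p_exposed_cases, p_exposed_controls):
    """Approximate SE of log(OR) from expected cell counts (Woolf's formula)."""
    n_controls = n_cases * controls_per_case
    a = n_cases * p_exposed_cases               # expected exposed cases
    b = n_cases * (1 - p_exposed_cases)         # expected unexposed cases
    c = n_controls * p_exposed_controls         # expected exposed controls
    d = n_controls * (1 - p_exposed_controls)   # expected unexposed controls
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Hypothetical study: 100 cases, 40% of cases and 25% of controls exposed.
standard_errors = [se_log_or(100, r, 0.40, 0.25) for r in range(1, 7)]
for r, se in enumerate(standard_errors, start=1):
    print(f"{r} control(s) per case: SE of log(OR) is about {se:.3f}")
```

Under these assumptions the standard error keeps shrinking as controls are added, but each additional control per case buys less precision than the one before, because the fixed number of cases eventually dominates the variance.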

Disadvantages and Limitations

The most commonly cited disadvantage in case-control studies is the potential for recall bias. Recall bias in a case-control study is the increased likelihood that those with the outcome will recall and report exposures compared to those without the outcome. In other words, even if both groups had exactly the same exposures, the participants in the case group may report the exposure more often than the controls do. Recall bias may lead to concluding that there are associations between exposure and disease that do not, in fact, exist. It is due to subjects' imperfect memories of past exposures. If people with Kaposi's sarcoma are asked about exposure and history (e.g., HIV, asbestos, smoking, lead, sunburn, aniline dye, alcohol, herpes, human papillomavirus), the individuals with the disease are more likely than the healthy controls to think harder about these exposures and to recall having had some of them.
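
The effect of recall bias can be sketched numerically. In the hypothetical Python example below, cases and controls have exactly the same true exposure prevalence, but cases are assumed to remember and report a true exposure more reliably than controls. All probabilities are invented for illustration, and the sketch assumes nobody reports an exposure they did not have.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)

def reported_odds_ratio(true_prevalence, recall_cases, recall_controls):
    """Odds ratio based on reported exposure when both groups share the same
    true exposure prevalence but recall it with different probabilities."""
    reported_in_cases = true_prevalence * recall_cases
    reported_in_controls = true_prevalence * recall_controls
    return odds(reported_in_cases) / odds(reported_in_controls)

# Same true exposure (30%) in both groups, so the true odds ratio is 1.
equal_recall = reported_odds_ratio(0.30, 0.90, 0.90)
unequal_recall = reported_odds_ratio(0.30, 0.90, 0.60)

print(f"Equal recall in both groups: OR = {equal_recall:.2f}")
print(f"Cases recall better than controls: OR = {unequal_recall:.2f}")
```

With equal recall the reported odds ratio stays at 1, while better recall among cases pushes it above 1 even though the underlying exposure is identical in both groups.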

Case-control studies, due to their typically retrospective nature, can be used to establish a correlation between exposures and outcomes, but cannot establish causation. These studies simply attempt to find correlations between past events and the current state.

When designing a case-control study, the researcher must find an appropriate control group. Ideally, the case group (those with the outcome) and the control group (those without the outcome) will have almost the same characteristics, such as age, gender, overall health status, and other factors. The two groups should have similar histories and live in similar environments. If, for example, our cases of Kaposi's sarcoma came from across the country but our controls were only chosen from a small community in northern latitudes where people rarely go outside or get sunburns, asking about sunburn may not be a valid exposure to investigate. Similarly, if all of the cases of Kaposi's sarcoma were found to come from a small community outside a battery factory with high levels of lead in the environment, then controls from across the country with minimal lead exposure would not provide an appropriate control group. The investigator must put a great deal of effort into creating a proper control group to bolster the strength of the case-control study as well as enhance their ability to find true and valid potential correlations between exposures and disease states.

Similarly, the researcher must recognize the potential for failing to identify confounding variables or exposures, which introduces the possibility of confounding bias. Confounding bias occurs when a variable that is not accounted for has a relationship with both the exposure and the outcome. As a result, we may accidentally be studying something we are not accounting for but that differs systematically between the groups.
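
A small worked example can make confounding concrete. The Python sketch below compares a crude odds ratio with one adjusted by stratifying on the confounder, using the Mantel-Haenszel estimator. That estimator is a standard technique brought in here for illustration and is not described in the text above; the counts are hypothetical and were constructed so that the exposure has no association with the outcome within either stratum.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for exposed cases, unexposed cases, exposed controls, unexposed controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across 2x2 tables, one per stratum."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Each tuple: (exposed cases, unexposed cases, exposed controls, unexposed controls).
strata = [
    (80, 20, 40, 10),  # confounder present: exposure and disease both common
    (10, 40, 20, 80),  # confounder absent: exposure and disease both uncommon
]

# Collapsing the strata ignores the confounder and gives the crude table.
crude_table = tuple(sum(cells) for cells in zip(*strata))
crude = odds_ratio(*crude_table)
adjusted = mantel_haenszel_or(strata)

print(f"Crude OR: {crude:.2f}")
print(f"Mantel-Haenszel adjusted OR: {adjusted:.2f}")
```

In this constructed data set the crude odds ratio is 2.25, yet the odds ratio is 1 within each stratum and after adjustment, so the apparent association comes entirely from the unaccounted-for variable.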

Copyright © 2024, StatPearls Publishing LLC.

Conflict of interest statement

Disclosure: Steven Tenny declares no relevant financial relationships with ineligible companies.

Disclosure: Connor Kerndt declares no relevant financial relationships with ineligible companies.

Disclosure: Mary Hoffman declares no relevant financial relationships with ineligible companies.

Encyclopedia Britannica

case-control study

case-control study, in epidemiology, observational (nonexperimental) study design used to ascertain information on differences in suspected exposures and outcomes between individuals with a disease of interest (cases) and comparable individuals who do not have the disease (controls). Analysis yields an odds ratio (OR) that reflects the relative probabilities of exposure in the two populations. Case-control studies can be classified as retrospective (dealing with a past exposure) or prospective (dealing with an anticipated exposure), depending on when cases are identified in relation to the measurement of exposures. The case-control study was first used in its modern form in 1926. It grew in popularity in the 1950s following the publication of several seminal case-control studies that established a link between smoking and lung cancer.

Case-control studies are advantageous because they require smaller sample sizes and thus fewer resources and less time than other observational studies. The case-control design also is the most practical option for studying exposure related to rare diseases. That is in part because known cases can be compared with selected controls (as opposed to waiting for cases to emerge, which is required by other observational study designs) and in part because of the rare disease assumption, in which OR mathematically becomes an increasingly better approximation of relative risk as disease incidence declines. Case-control studies also are used for diseases that have long latent periods (long durations between exposure and disease manifestation) and are ideal when multiple potential risk factors are at play.
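
The rare disease assumption can be checked numerically. The Python sketch below fixes a relative risk of 2 and computes the corresponding odds ratio for progressively rarer diseases. The function name and the baseline risks are hypothetical values chosen for illustration.

```python
def odds_ratio_from_risks(baseline_risk, relative_risk):
    """Odds ratio implied by a baseline (unexposed) risk and a relative risk."""
    exposed_risk = baseline_risk * relative_risk
    if not (0 < baseline_risk < 1 and 0 < exposed_risk < 1):
        raise ValueError("both risks must lie strictly between 0 and 1")
    exposed_odds = exposed_risk / (1 - exposed_risk)
    baseline_odds = baseline_risk / (1 - baseline_risk)
    return exposed_odds / baseline_odds

relative_risk = 2.0
baseline_risks = (0.20, 0.05, 0.01, 0.001)
implied = [odds_ratio_from_risks(p, relative_risk) for p in baseline_risks]

for p, value in zip(baseline_risks, implied):
    print(f"baseline risk {p}: OR = {value:.3f} (RR = {relative_risk})")
```

As the baseline risk falls, the odds ratio moves closer to the relative risk of 2, which is why the OR from a case-control study of a rare disease is often read as an estimate of relative risk.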

The primary challenge in designing a case-control study is the appropriate selection of cases and controls. Poor selection can result in confounding, in which correlations that are unrelated to the exposure exist between case and control subjects. Confounding in turn affects estimates of the association between disease and exposure, causing selection bias, which distorts OR figures. To overcome selection bias, controls typically are selected from the same source population as that used for the selection of cases. In addition, cases and controls may be matched by relevant characteristics. During the analysis of study data, multivariate analysis (usually logistic regression) can be used to adjust for the effect of measured confounders.

Bias in a case-control study might also result if exposures cannot be measured or recalled equally in both cases and controls. Healthy controls, for example, may not have been seen by a physician for a particular illness or may not remember the details of their illness. Choosing from a population with a disease different from the one of interest but of similar impact or incidence may minimize recall and measurement bias, since affected individuals may be more likely to recall exposures or to have had their information recorded to a level comparable to cases.

3 Study Designs

3.1 Measurement through Study

There are two primary categories of study designs (figure 3.1), and the primary difference between the two is whether or not we control the study factors.

In observational studies, we do not manipulate any study factors and do not randomize. We observe what happens in a particular group of people—for example, factory workers, children in a preschool, or patients seen in a clinic for primary care. When we say manipulate , we do not mean that we make things up. What we do mean is that we can set the parameters of the study (i.e., control study factors) such as who gets the exposure (e.g., a medication) or who does not (e.g., the placebo or standard of care) in order to see causal effects, if they exist between an exposure and an outcome. When we do this, it is called an experimental study.

In experimental studies, we do control factors and often use randomization to create fairly perfect conditions to see the influence of an exposure on an outcome. For example, we might enroll some cancer patients in a trial to see how a new medication works, or we might test how different the health is in communities with fluoridated water compared to those without fluoridated water. Randomization means that we use some sort of objective criteria to put study participants in whatever groups we establish for our study. For example, we may have one group that gets a sugar pill (i.e., a placebo), one group that gets the standard of care, and one group that gets the drug we are testing. In this scenario, we might assign patients to a group based on the order in which they come to the clinic. We might also choose to assign all patients a number and randomly allocate them to a group using a random number generator. No matter the assignation, we use an objective method to put patients in a study group. This helps us reduce the chance of a biased study result.
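
The random-number-generator approach to allocation described above might look like the following Python sketch. The function name, the arm labels, and the patient identifiers are hypothetical, and real trials typically use more elaborate schemes (for example, blocked or stratified randomization) than this simple shuffle.

```python
import random

def allocate(patient_ids, arms, seed=None):
    """Randomly allocate patients to study arms of near-equal size."""
    rng = random.Random(seed)   # a fixed seed makes the allocation reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)       # random order, independent of arrival order
    groups = {arm: [] for arm in arms}
    for position, patient in enumerate(shuffled):
        # Deal shuffled patients out to the arms in turn.
        groups[arms[position % len(arms)]].append(patient)
    return groups

arms = ["placebo", "standard of care", "new drug"]
groups = allocate(range(1, 31), arms, seed=42)
for arm, members in groups.items():
    print(f"{arm}: {len(members)} patients")
```

Because the assignment depends only on the shuffle and not on any characteristic of the patient or preference of the investigator, it is one objective way to form the study groups.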

As you consider each study design, pay attention to these details:

  • Number of observations made
  • Directionality of exposure
  • Data collection methods
  • Timing of data collection
  • Unit of observation
  • Availability of subjects

Long description available at the end of the chapter.

Not all study designs are created equal, but each has a specific purpose. Each study design helps us move closer to an understanding of causality (section 1.2). As you move from in-vitro studies to meta-analyses (figure 3.2), you can see that the evidence each study design provides becomes stronger. This does not mean the designs at the top are weaker or useless; they just provide a different type of evidence. Though there is a general consensus about how valid or strong the evidence is from any particular type of study, the evidence from each design builds on the others.

Figure 3.2: How much can we rely on the answers from your study when determining the etiology (cause) of disease or conditions? Study types are ordered from least strong information to strongest information. All study types are important, and each level builds on the ones before it.

Example: Types of study designs

What happens when we approach the same topic and question with different study designs? Let’s find out using osteoarthritis as an example.

  • In vitro: In vitro models for the study of osteoarthritis
  • Animal: Animal models of osteoarthritis: classification, update, and measurement of outcomes
  • Opinion: Current opinion: where are we in our understanding and treatment of osteoarthritis?
  • Case report: The effect of knee resizing illusions on pain and swelling in symptomatic knee osteoarthritis: a case report
  • Cross-sectional: Is There an Association Between a History of Running and Symptomatic Knee Osteoarthritis? A Cross-Sectional Study From the Osteoarthritis Initiative
  • Case control: A case-control study to investigate the relation between low and moderate levels of physical activity and osteoarthritis of the knee using data collected as part of the Allied Dunbar National Fitness Survey
  • Cohort: Running does not increase symptoms or structural progression in people with knee osteoarthritis: data from the osteoarthritis initiative
  • RCT: Ultrasound-Guided Injection of Platelet-Rich Plasma and Hyaluronic Acid, Separately and in Combination, for Hip Osteoarthritis: A Randomized Controlled Study
  • Systematic review: Is Participation in Certain Sports Associated With Knee Osteoarthritis? A Systematic Review
  • Meta-analysis: Is Participation in Certain Sports Associated With Knee Osteoarthritis? A Systematic Review

In the real world, study designs are not always clearly distinguishable from each other. There often is overlap such as that seen in the nested case-control, cross-sectional case-control, and case-cohort study designs.

3.2 Study Designs

Beyond the measures presented in chapter 2, epidemiologic studies allow us to create and compare measures across individuals and groups. Because we are really examining the relationship between factors, exposures, and outcomes, we call the majority of these measures of association. Figures 3.3 and 3.4 lay out the different types of studies and some overview details about them. The rest of this chapter is dedicated to explaining and discussing temporality, measures of association, measures of effect, and sampling.

Figure 3.3 describes five types of observational study designs: case series, ecologic studies, cross-sectional studies, case-control studies, and cohort studies. From left to right, the designs are listed in order of the strength of their evidence (weakest to strongest).

Details Observational designs (in order of strength)
Case study Correlational study Prevalence study Case-referent study Follow-up study
Descriptive Primarily descriptive Descriptive Analytic Analytic
No No No No Yes
Individual Group Individual Individual Individual
Describe interesting cases of disease, injury, or other health issues. Test or develop etiologic hypotheses (hypotheses about the population). Create hypotheses about causation or identify methods of prevention. Present the burden of disease, injury, or other health issues (morbidity or mortality). Generate hypotheses. Supports planning health services. Outbreaks, studying diseases of low prevalence, testing hypotheses. Studying etiology, providing direct measures of risk, testing hypotheses, showing temporal relationships, looking at rare exposures.
None Correlation, chi-square Prevalence estimates, prevalence rate ratio (AKA prevalence relative risk) Odds ratio Relative risk (most often), odds ratio (sometimes)
N/A Depends on the design

(Direct Measurement of Risk)

(Indirect Measurement of Risk)

This cross-product ratio is the derivative of:

Able to share information with others to then develop hypotheses or plan studies. Quick and easy to conduct. Inexpensive. Sometimes quick and relatively easy to conduct (if using secondary vs primary data). Inexpensive. Great with rare outcomes. Cheap, efficient. Can be completed rather quickly. Great with rare exposures. Can show temporal relationships between exposure and disease retrospectively, prospectively, or a combination of the two.
Not enough details to make decisions for treatment. Ecologic fallacy. Imprecise measurement. Not good for rare diseases. Shouldn’t be used for etiologic studies. Not good for rare exposures. Cannot provide a direct measure of risk. Recall bias. Can cost a lot of money and take a lot of time to complete. Difficult to execute. Selection bias. Not good for rare diseases.
We had five patients with hallucinations after taking NSAID A that is not known for causing hallucinations. We will describe their clinical presentation here. The rate of premature births decreased in West Virginia when Medicaid was expanded. The prevalence of high adiposity increased in New Mexico during the COVID-19 pandemic amongst people 60 to 65 years of age that were retired. People that worked in grocery stores during the first four months of the COVID-19 pandemic were more likely to be hospitalized with COVID-19 than the general population. Soldiers that entered boot camp in 1980 and stayed in the military for 20 years had a higher risk of osteoarthritis at 60 than soldiers that entered at the same time and stayed in the military less than 7 years.
PRR: Prevalence Rate Ratio
OR: Odds Ratio
RR: Relative Risk

Figure 3.3: Epidemiological study designs.

Figure 3.4 describes two types of experimental study designs: community trials and clinical trials.

Details Experimental designs  
Community intervention study RCT
Analytic Analytic
Yes Yes
Community Individual
Useful for seeing how effective community-level interventions are, evaluating policies, or implementing healthier behaviors in the community. Useful for testing efficacy of new medications, therapies, treatments, or preventative methods (such as vaccines).
If a multiphase trial, the steps are:
Can ?
Phase 0: efficacy work (pharmacodynamics and pharmacokinetics)
Phase I: assessment
Phase II: Does it ?
Phase III: Does it lead to any in the condition?
Phase IV: Are there any issues that require us to pull it off the market?
Depends on the design Depends on the design
Randomization of communities. Researchers can sometimes manipulate the exposure. Can establish causality. Randomization of subjects. Can manipulate exposure. Can control everything else. Can set up as a cross-over trial (same group of participants serves as both the cases and the controls). Can establish causality.
Hard to control everything such as people moving in and out of the study area. Impossible to make everyone in the area participate. The fact everything is controlled means it is uncertain whether it will work the same way in the real world.
Communities with fluoridated water have better oral health outcomes than communities without fluoridated water. Drug B is more efficacious at reducing atrial fibrillation than standard of care during a phase III trial.

Figure 3.4: Community trials and clinical trials.

One tool that is used to calculate a number of epidemiological measures is the 2×2 table (figure 3.5). This table is repeated many times in the following text. The primary columns represent the presence (e.g., outcome +) or absence (e.g., outcome –) of the outcome or event of interest (e.g., ACL injury). The primary rows represent the presence (e.g., exposed +) or absence (e.g., exposed –) of the exposure of interest (e.g., being hit). In this example table we also show the total number of those exposed and the total number of those with the outcome. These totals are sometimes needed for different calculations.


The letters A, B, C, and D represent the number of observations that meet different criteria.

A = The count of observations that have both the outcome and the exposure

B = The count of observations that have the exposure but not the outcome

C = The count of observations that have the outcome but not the exposure

D = The count of observations that have neither the outcome nor the exposure

Using this same logic, the sum of A and B gives us the total number of observations with the exposure and the sum of C and D gives us the total number of observations without the exposure. The sum of A and C gives us the total number of observations with the outcome and the sum of B and D gives us the total number of observations without the outcome.
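The cell and margin definitions above can be sketched in a few lines of Python. This is only an illustration: the class name `TwoByTwo`, its property names, and the counts are assumptions made for this example, not part of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """A 2x2 table with exposure in rows and outcome in columns."""
    a: int  # exposed, has the outcome
    b: int  # exposed, does not have the outcome
    c: int  # not exposed, has the outcome
    d: int  # not exposed, does not have the outcome

    @property
    def total_exposed(self) -> int:
        return self.a + self.b

    @property
    def total_unexposed(self) -> int:
        return self.c + self.d

    @property
    def total_with_outcome(self) -> int:
        return self.a + self.c

    @property
    def total_without_outcome(self) -> int:
        return self.b + self.d


# Hypothetical counts chosen only to show the arithmetic.
table = TwoByTwo(a=12, b=28, c=6, d=54)
print(table.total_exposed)       # 40
print(table.total_with_outcome)  # 18
```

Keeping the four cells as named fields makes it harder to mix up rows and columns when the same table is reused for several measures.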

3.3 Temporality

In order to establish causality, it is important to be able to establish a temporal (time) relationship between factors. As seen in figures 3.3 and 3.4, not all studies are good at measuring temporality, and not all studies are intended to measure it. Cohort studies and RCTs are the designs most often used when trying to answer questions such as “Did the chicken come first or did the egg?” [ Answer to that question! [1] ]. Figure 3.6 displays at what point in time data collection for different studies starts, the directionality of data collection, and the minimum number of time points captured by the study.

For example, in a cross-sectional study all data is captured at the same point in time (the present day) and shows what is happening right now. Cross-sectional studies can be thought of as a snapshot in time, and the time period could vary from something such as a patient’s last visit to patient outcomes over the last year. Because all questions get asked at once and typically involve recalling events from the past, we cannot determine temporality. A cross-sectional study can, however, give us a great perspective about the prevalence of a particular health issue.

A retrospective cohort study, on the other hand, starts in the present day but looks backwards to capture information from the past. Oftentimes, there can be confusion about the difference between a retrospective cohort study and a case-control study. Even in a retrospective cohort study, our goal is to determine if a known exposure leads to a disease, such as when we are trying to determine whether playing football leads to developing chronic traumatic encephalopathy (CTE). We have information about the population both before and after diagnosis, which allows us to observe whether the exposure led to the disease.

In a case-control study, we are looking to find what exposures could have led to a known disease. It is most often used when we need an answer quickly, such as in an outbreak; for example, what caused an outbreak of ringworm in wrestlers. While we start with people we know have the outcome, we have to determine what possible exposures are of interest and then narrow down which one had the higher probability of causing the outcome. We cannot definitively determine temporality.

One of the main differences between a prospective cohort study and a randomized controlled trial is that instead of seeing the natural course of exposure (e.g., choice to smoke or not smoke), we randomly allocate participants into our study groups; we choose for them. This means we may give one group the standard of care for an ankle injury and give the other group a new cryotherapy plus standard of care to see the effect the cryotherapy has on the outcome of the injury.


3.4 Observational Study Designs

Ecological studies use group summary measures for exposure and outcome rather than measures about individual people. We would use this type of study to compare populations, such as the rate of disease in France compared to the United States or the rate of disease in the United States in 1950 compared to 2000. Because this type of study compares groups, we cannot assume that the results from this study should apply to individuals. It also means that studies we do using data on individuals should not be assumed to apply to groups. If we were to do that, we would be committing the ecologic fallacy.

Example: Ecologic fallacy

If we find out that the rate of heat-related illnesses during track and field is high in states in the southern United States, that does not automatically mean that individuals in the southern United States have higher risks of heat-related illness than people living in other places. It just means that on a group basis, their rates are higher. If we find out that 80 of 100 individual people with heat illness at a track meet are from the southern United States, it does not mean that 80 percent of all heat-related illnesses occur in the southern United States. If we do make these incorrect assumptions, we are guilty of the ecological fallacy. We need to do a better job being correct in our inferences, or the meaning we assign to the data that we see. It would be a fallacy to assume that people from the southern United States will experience heat illness based on the presentation of data.


In this example, we see that there is a positive relationship between the number of Professional Australian Football matches played and the number of concussions that were diagnosed. [2] However, we would not want to assume that every player with more matches will have any concussions. As we can see, at least some players with a high number of matches have no concussions. We also can see that some players with few matches have a higher number of concussions than players with more matches. We can only infer what we see, which is the probability (chance) of an increased risk of concussion with more matches.

In a 2006 TED Talk, statistics expert and physician Hans Rosling provided an excellent example of the importance of ecological studies. You can see it in the first 7 minutes of this video . [3]

Cross-sectional studies measure the prevalence of disease and of exposures (i.e., risk factors) at one point in time. Cross-sectional studies are also known as prevalence studies. When we think about what is being measured in a cross-sectional study, we should think about taking a photo or a snapshot: it is a photo of you right now, not what you looked like in the past or what you will look like in the future. We do not know when an exposure happened or when a disease started; we just know they are present right now.

Example: Cross-sectional study

During the COVID-19 pandemic, professional athletes in the United States needed to pass cardiac testing in order to return to play after testing positive for COVID-19. Researchers conducted a study to find out the “prevalence of detectable inflammatory heart disease” among athletes in the National Basketball Association, the Women’s National Basketball Association, National Hockey League, National Football League, Major League Soccer, and Major League Baseball between May and October 2020. [4] They found that 789 athletes tested positive for COVID-19 and, of those, 30 required further screening. [5] Ultimately, 5 athletes had detectable inflammatory heart disease and were held out of play.

Case-control studies are used to find out whether a particular exposure could have been the source or cause of a disease, particularly in urgent health situations. We start by identifying who already has the disease (cases), then we find a set of people who are like the cases in every respect except they do not have the disease. These are called controls. We ask these cases and controls questions about their past exposures. Because we start with people who are diseased, case-control studies are great when you are interested in studying people who have rare diseases. This design is explored more in the next section on Outbreak Investigations.

Cohort studies start with a group of individuals based on their exposure status. They are used to find out whether a particular disease comes after a particular exposure or development of a risk factor. If someone does not have the chance of being exposed, they would not be a good selection for a cohort study. You want everyone to have the potential of getting the outcome because of the exposure. Because of this, cohort studies are great when you’re interested in studying people who have rare exposures. Once the exposure status is identified, researchers then identify whether or not the subjects have the outcome of interest already. If they do, they would be removed from a prospective study because our goal is to see if the outcome happens after the exposure, and if they already have both, how would we know? There are roughly three types of cohort studies: prospective, retrospective, and historical. Every cohort study has at least two data collection points and they do not overlap. Prospective means we are setting up the study today and actively following forward into the future. Retrospective means we are setting up the study today but we are looking at information that was previously gathered. So how are retrospective cohort studies different from case-control studies? (See figure 3.6.)

In our next example, we explore how we might approach hospital-acquired infections after anterior cruciate ligament (ACL) reconstruction surgery compared to ACL repair surgery with a cohort study or a case-control study.

Example: Hospital-acquired infections after ACL reconstruction surgery vs ACL repair surgery

Type of question that can be answered with retrospective cohort study: We are interested in identifying whether there are more hospital-acquired infections (the outcome) after ACL reconstruction surgery compared to ACL repair surgery (the exposure).

In a retrospective cohort study, we would start by identifying everyone in the population under study (e.g., all patients seen at hospital A) who was eligible for ACL surgery using hospital records. We would select from this population people who had either the ACL reconstruction or ACL repair surgery at Hospital A. We then go through their records to identify what happened to them prior to having the surgery and then move forward through their records to see whether they developed a hospital-acquired infection after surgery. Measurement 1: Eligibility for study (exposure status) and determination of whether they already had the outcome before the surgery (which would exclude them). Measurement 2: Determination of whether they had the outcome after the surgery. This provides evidence that the hospital-acquired infection came after the surgery but doesn’t rule out that it could have been caused in full or in part by something else postsurgically.

Type of question that can be answered with case-control study: Hospital A has a number of hospital-acquired infections after surgery. We are interested in identifying whether ACL reconstruction surgery or ACL repair surgery is more common (exposure) in people who have hospital-acquired infections (cases).

In a case-control study, we would start by identifying everyone in the population under study (e.g., all patients seen at hospital A) who had a hospital-acquired infection after surgery (the outcome) using hospital records. We would find patients in Hospital A who did not have the hospital-acquired infection but could have gotten it from surgery (controls). We would then use existing records or talk to patients/providers/environmental services to find out more info about the potential places in the hospital where they could have gotten the infection. This would have helped us identify the type of surgery as a potential exposure. We would compare the cases with the exposure (e.g., ACL reconstruction surgery) to those without the exposure to see if there was a difference in the chance of having a hospital-acquired infection. Whatever exposures have the higher OR would be the ones we’d investigate further as the potential place to intervene. Measurement 1: Eligibility for study, exposure status, disease status. No second measurement.


3.5 Measures of Association

As noted in section 3.2, we often use a 2x2 table to analyze data from an epidemiological study (figure 3.5). This table is repeated many times in the following text.


Beware! While one side of the table above has exposure (or risk factors) and the other side has outcomes (such as disease), not everyone sets up their table the same way (see figure 3.10). Before doing any calculations with data from a 2x2 table, pay attention to how it is set up. All examples in this book use the version showing exposure in rows and outcome in columns.


When we calculate our measures of association, we refer to the needed components by referring to different boxes of our 2x2 table using letters.

  • A – Has the outcome and is exposed
  • B – Does not have the outcome and is exposed
  • C – Has the outcome and is not exposed
  • D – Does not have the outcome and is not exposed

Examples of the measures of association are the odds ratio and the relative risk. A measure used in cross-sectional studies is the prevalence rate ratio.

Study design Measures of disease Measures of risk Temporality
Prevalence (rough estimate) Prevalence ratio Retrospective
• Proportional mortality
• Standardized mortality
• Proportional mortality ratio
• Standardized mortality ratio
Retrospective
None Odds ratio Retrospective
• Point prevalence
• Period prevalence
• Odds ratio
• Prevalence odds ratio
• Prevalence ratio
• Prevalence difference
Retrospective
None Odds ratio Retrospective
• Point prevalence
• Period prevalence
• Incidence
• Odds ratio
• Prevalence odds ratio
• Prevalence ratio
• Prevalence difference
• Attributable risk
• Incidence rate ratio
• Relative risk
• Risk ratio
• Hazard ratio
• Retrospective only
• Both retrospective and prospective
• Prospective only

Figure 3.11: The variety of measures that can be calculated from different study designs.

3.5.1 Odds Ratio

The only measure of association that can be calculated in a case-control study is the odds ratio (OR): the odds of being exposed among cases compared to the odds of being exposed among controls. This particular odds ratio is referred to as the odds ratio of exposure.

\text { OR }(\text { Exposure })={\frac{\frac{\frac{A}{A+C}}{\frac{C}{A+C}}}{\frac{\frac{B}{B+D}}{\frac{D}{B+D}}}}={\frac{\frac{A}{C}}{\frac{B}{D}}}=\frac{AD}{BC}

The result compares the odds of exposure among those with the outcome (A/C) to the odds of exposure among those without the outcome (B/D). If this number is equal to 1 (roughly, 0.9 to 1.1), there is no difference in the odds of having the exposure between the outcome groups. If this number is greater than 1 (roughly, greater than 1.1), the group with the outcome is more likely to have the exposure than the group without the outcome. If this number is less than 1 (roughly, less than 0.9), the group with the outcome is less likely to have the exposure than the group without the outcome.
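As a rough illustration of the cross-product form AD/BC and the rule-of-thumb cutoffs (0.9 and 1.1) described above, here is a minimal Python sketch. The function names `odds_ratio` and `interpret` and the counts are hypothetical, chosen only for this example.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio as the cross-product AD / BC.

    a: exposed with the outcome      b: exposed without the outcome
    c: unexposed with the outcome    d: unexposed without the outcome
    """
    if b == 0 or c == 0:
        raise ValueError("B and C must be nonzero to compute AD / BC")
    return (a * d) / (b * c)


def interpret(or_value: float, low: float = 0.9, high: float = 1.1) -> str:
    """Apply the rough 0.9 to 1.1 'no difference' band from the text."""
    if or_value > high:
        return "cases are more likely to be exposed than controls"
    if or_value < low:
        return "cases are less likely to be exposed than controls"
    return "no meaningful difference between cases and controls"


# Hypothetical case-control counts, not taken from any study.
or_value = odds_ratio(a=30, b=20, c=10, d=40)
print(or_value)             # 6.0
print(interpret(or_value))  # cases are more likely to be exposed than controls
```

The 0.9 and 1.1 bounds are only the informal band used in this chapter; a formal judgment would also consider confidence intervals.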


Always be specific when drawing comparisons. Just saying, for example, “Cases are 3.2 times more likely to have the exposure” is an incomplete interpretation of the OR. “Cases are 3.2 times more likely to have the exposure compared to controls” is clear about what you are comparing the odds of cases to. This applies to relative risk interpretations as well.

We can also calculate an OR (of exposure or disease) in other study designs, including cross-sectional, cohort, and RCTs. How it gets interpreted in these cases is often different than how we interpret it in a case-control based on the nature of the study and the difference in the full calculation.

3.5.2 Relative Risk

The primary measure of association that is calculated in a cohort study is the relative risk (the risk or incidence of the outcome in the exposed compared to the risk or incidence of the outcome in the unexposed).

\text { RR }=\frac{\frac{A}{A+B}}{\frac{C}{C+D}}

If this number is equal to 1 (roughly, 0.9 to 1.1), there is no difference in the risk between exposure groups. If this number is greater than 1 (roughly, greater than 1.1), the group with the exposure is more likely to have the disease than the group without the exposure. If this number is less than 1 (roughly, less than 0.9), the group with the exposure is less likely to have the disease than the group without the exposure.


Calculating the odds ratio in a cohort study means that we are calculating the odds ratio of disease. This is calculated differently than the odds ratio of exposure that we calculate in a case-control study (see above). While both formulas result in the cross-product ratio, because they were calculated differently we interpret them differently. Remember that cohort studies are designed to identify the risk of disease in the exposed compared to the risk of disease in the unexposed.

\text { OR }(\text { Disease })={\frac{\frac{\frac{A}{A+B}}{\frac{B}{A+B}}}{\frac{\frac{C}{C+D}}{\frac{D}{C+D}}}}={\frac{\frac{A}{B}}{\frac{C}{D}}}=\frac{AD}{BC}
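To see numerically that the odds ratio of exposure and the odds ratio of disease both reduce to the cross-product AD/BC, the following sketch computes each from its own starting proportions. The helper names and counts are hypothetical.

```python
import math


def odds(p: float) -> float:
    """Convert a proportion p into odds p / (1 - p)."""
    return p / (1 - p)


def or_exposure(a: int, b: int, c: int, d: int) -> float:
    """Odds of exposure among those with the outcome vs. those without."""
    return odds(a / (a + c)) / odds(b / (b + d))


def or_disease(a: int, b: int, c: int, d: int) -> float:
    """Odds of the outcome among the exposed vs. the unexposed."""
    return odds(a / (a + b)) / odds(c / (c + d))


# Hypothetical counts used only to compare the two calculations.
a, b, c, d = 20, 80, 10, 90
cross_product = (a * d) / (b * c)

print(round(or_exposure(a, b, c, d), 4))  # 2.25
print(round(or_disease(a, b, c, d), 4))   # 2.25
print(cross_product)                      # 2.25
```

The numbers agree, but the interpretation still depends on the study design: one compares exposure between outcome groups, the other compares disease between exposure groups.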

3.5.3 Prevalence Rate Ratio

As noted earlier, prevalence is:

\text { Prevalence }=\frac{\text { All cases right now }}{\text { Whole population }}

In cross-sectional studies, a common measure of association we calculate is the prevalence rate ratio. While the name is a misnomer (prevalence is a proportion, not a rate), it still uses a familiar formula to compare things like the prevalence between either two separate groups (e.g., injury prevalence in Oklahoma compared to injury prevalence in Texas) or the same group at different points in time (e.g., injury prevalence in Virginia in 2015 compared to injury prevalence in Virginia in 2020).

\text { Prevalence rate ratio (PRR) }=\frac{\text { Prevalence in group A }}{\text { Prevalence in group B }}

3.6 Outbreak Investigations

An outbreak is the occurrence of disease in an area at a level exceeding the normally expected number of cases. An outbreak technically differs from an epidemic because an outbreak occurs in a more limited geographic area. Epidemics are declared by country-specific health bodies (e.g., the US Centers for Disease Control and Prevention). A disease is endemic if it is occurring at the expected level; that is, it normally occurs in that place. An epidemic becomes a pandemic when the World Health Organization decides it has become one. A pandemic is an epidemic that is spread over multiple countries or continents. Epidemics and pandemics can have variable time patterns, as seen in section 2.4.

One of the most common ways that outbreaks are identified is through clinicians paying attention to changes in what they are treating and who they are treating. Figure 3.14 displays the 11 steps to solving an outbreak. [6]


Step 1: Establish the existence of an outbreak.

Step 2: Verify the diagnosis.

Before we expend too many resources and too much time, we want to be sure that we are actually observing an outbreak. Things that could look like an outbreak but are not:

  • False positive (specificity)
  • Laboratory error
  • Change in case definition
  • Incorrect time or place
  • False report
  • Record keeping
  • Observation
  • Population composition

Sometimes we improve our surveillance systems or other tracking methods and pick up more cases because we are doing a better job. This does not mean we actually have more cases; we are just doing a better job at seeing them. Other times, we simply make mistakes in identification that could make it appear like we have more cases. Beyond ruling out these possibilities, we calculate prevalence and incidence and consider whether there are reasonable explanations for changes in these numbers, to determine whether to proceed. We should calculate prevalence if we need to know the total burden of the problem. We should calculate incidence if we are trying to find the risk of developing a disease in a given time. Sometimes we need to do both. The most important part of steps 1 and 2 is that we must verify that the diagnosis we think is the problem is in fact the correct diagnosis. For example, if we think that we are having an outbreak of meningitis A, we should confirm that all of the people who are sick actually have meningitis A.

Our goal is to identify all of the following:

  • Individual: Who is affected?
  • Place: Where are they affected?
  • Time: When did this start or change?
  • Connections: What factors are related?

Moving forward in an outbreak investigation is all about what we think, what we know, and what we can prove.

Step 3: Construct a working case definition.

Taking this information, we move into Step 3 and create a working case definition. Many times, this definition stays in flux. Using our case definition, we identify the individual cases, controls, and possible/suspected cases.

Case definitions include a standard set of criteria used to determine if an individual should be classified as a case. Depending on the condition or disease in question, case definitions may already be established. In other situations, this needs to be developed as the investigation progresses. Sometimes the disease or condition in question is required to be reported to the health department or the Centers for Disease Control and Prevention. Nationally notifiable conditions are reported to the National Notifiable Diseases Surveillance System. [7] Each state also has a separate list of notifiable conditions. For example, Virginia’s conditions are reported to the Virginia Department of Health and the State Board of Health. [8]

A case definition usually includes both:

  • Clinical criteria and/or lab test
  • Restrictions by time, place, and/or person

When developing the case definition, we tend to emphasize sensitivity (to identify all possible cases) over specificity (to identify only “true” cases). Part of this is because it is better to err on the side of caution and include too many people than to miss cases, especially in the beginning of the investigation. Sensitivity and specificity are discussed in more detail in chapter 4.

Example: Case definition

In figure 3.15, we see the diagnostic criteria for hemophagocytic lymphohistiocytosis (HLH), a rare syndrome of excessive immune response. In order to be considered someone who has HLH, a person must have most but not all diagnostic criteria. However, sometimes not all patients will have all tests that are required to be considered a case. If they meet several criteria, they are instead what is known as a possible or probable case.


Step 4: Find cases systematically and record information.

Once we have a case definition, we can then work to find all cases (Step 4). We must do this in a systematic fashion and record data on any cases or potential cases we find. We make every effort to find cases that occurred earlier than when we first realized something might be amiss. We use a line listing to organize the data about our cases. In figure 3.16, we see an example of a line listing from an anthrax outbreak. Each row corresponds to a different case, and we include all the possible details relevant to the case status and demographic information.

Case no. | Onset date, 2001 | Date of anthrax diagnosis by lab testing | State | Age (years) | Sex | Race | Occupation | Case status | Anthrax presentation | Outcome | Diagnostic tests
1 | 9/22 | 10/19 | NY | 31 | F | W | employee | Suspect | Cutaneous | Alive | Serum IgG reactive
2 | 9/25 | 10/12 | NY | 38 | F | W | NBC anchor assistant | Confirmed | Cutaneous | Alive | Skin biopsy IHC+/Serum IgG reactive
3 | 9/26 | 10/18 | NJ | 39 | M | W | USPS machine mechanic | Suspect | Cutaneous | Alive | Serum IgG reactive

Figure 3.16: Example line listing.
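One way to hold a line listing electronically is as one record per case, which makes simple tallies straightforward. The sketch below uses a few columns from the three cases in figure 3.16; the field names (`case_no`, `onset`, `status`, and so on) are assumptions for this example, not a standard schema.

```python
from collections import Counter
from datetime import date

# One dictionary per case; values come from figure 3.16, field names are illustrative.
line_listing = [
    {"case_no": 1, "onset": date(2001, 9, 22), "state": "NY", "age": 31,
     "sex": "F", "status": "Suspect", "presentation": "Cutaneous"},
    {"case_no": 2, "onset": date(2001, 9, 25), "state": "NY", "age": 38,
     "sex": "F", "status": "Confirmed", "presentation": "Cutaneous"},
    {"case_no": 3, "onset": date(2001, 9, 26), "state": "NJ", "age": 39,
     "sex": "M", "status": "Suspect", "presentation": "Cutaneous"},
]

# Tally cases by status and by state.
status_counts = Counter(case["status"] for case in line_listing)
state_counts = Counter(case["state"] for case in line_listing)

# The earliest onset helps establish when the outbreak may have begun.
earliest = min(line_listing, key=lambda case: case["onset"])

print(status_counts["Suspect"])  # 2
print(state_counts["NY"])        # 2
print(earliest["case_no"])       # 1
```

In practice a spreadsheet or database would serve the same purpose; the point is that each row is a case and each column is a consistently recorded attribute.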

Step 5: Perform descriptive epidemiology.

In Step 5, we perform descriptive epidemiology on the data we have gathered from clinical records, questionnaires, interviews, and so on. Just as with several other conditions, if it is a suspected foodborne outbreak, we can use tools from CDC [9] to gather all the pertinent details. We are specifically looking for patterns and associations between risk factors and disease. All the information we will compare is in our line listing. Our measures of association and effect are very useful at this step.

If it is a foodborne outbreak, instead of calculating incidence as we learned before, we usually reframe risk as the attack rate (figure 3.17).

\text { Attack rate }=\frac{\text { Number ill that ate a food }}{\text { Number that ate a food }} \times 100

We can use this to find what percentage of those at risk are actually ill.


We interpret the attack rate as the percentage of those with the exposure that are sick. In the above example, 70.6 percent of those that ate salad are sick. We would compare attack rates to determine which exposures deserve more attention as possible causes.
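The attack rate comparison can be sketched as below. The counts are hypothetical: 12 ill out of 17 who ate salad was picked only because it reproduces a 70.6 percent attack rate, and the other foods are invented for contrast.

```python
def attack_rate(ill_who_ate: int, total_who_ate: int) -> float:
    """Percentage of people who ate a food and became ill."""
    if total_who_ate <= 0:
        raise ValueError("Number who ate the food must be positive")
    return ill_who_ate / total_who_ate * 100


# Hypothetical (ill, ate) counts per food item.
foods = {"salad": (12, 17), "chicken": (8, 20), "cake": (5, 15)}

rates = {food: round(attack_rate(ill, ate), 1) for food, (ill, ate) in foods.items()}

# Rank foods from highest to lowest attack rate.
ranked = sorted(rates, key=rates.get, reverse=True)

print(rates["salad"])  # 70.6
print(ranked[0])       # salad
```

The food with the highest attack rate is a candidate for closer investigation, not proof of the source.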

Step 6: Develop hypotheses.

Step 7: Evaluate hypotheses epidemiologically.

In Steps 6 and 7, we form hypotheses based on our existing data and test them. Among other things, our hypotheses may relate to:

  • Cause of the outbreak
  • Risk factors for disease
  • Risk factors for infection
  • Intervention to stop spread: quarantine and vaccinate
  • Treatment of affected individuals

Step 8: As necessary, reconsider, refine, and reevaluate hypotheses.

Step 9: Compare and reconcile with laboratory and/or environmental studies.

Step 10: Implement control and prevention measures.

Step 11: Initiate or maintain surveillance findings.

Our final steps of an outbreak investigation are to continue refining our hypotheses, compiling more data to support or refute our hypotheses, controlling the outbreak, and performing surveillance to keep an eye on the problem. Sometimes we find the source of the problem but cannot just “solve” it. The cost of treating the problem, the cost of the intervention to fix the problem, and the existence of other alternatives all play into our decision about what to do. Controlling the problem might include vaccine development and distribution, stopping access to a dangerous substance, or recalling food products.

In the case of some problems, like COVID-19 or sickle cell disease, we initiate and maintain an ongoing systematic data collection system. This is known as disease surveillance. The US Centers for Disease Control and Prevention reports on the surveillance of notifiable diseases in both the Morbidity and Mortality Weekly Report [10] (MMWR) and CDC WONDER. [11]

3.7 Measures of Effect

When we are comparing results from our study, we compare the measures that we found. Often, we look at:

  • How big is the difference between groups or individuals with and without a particular risk factor? (Magnitude of effect; ratio, difference)
  • Could the difference we found be just due to chance variation? (Significance of effect; p values)
  • How certain are we of the size of the effect? (Precision of, or uncertainty in, estimate; confidence intervals)

We specifically discuss the first of these (magnitude of effect) in this book. More details on the second and third (significance and precision) can be found in many books on biostatistics.

We already looked at whether one factor was associated with (or related to) another factor or whether an outcome was associated with an exposure. But in the grand scheme of things, what does that really mean for the population we are focused on?

Measures of effect (how big is the effect of an exposure or risk factor) include the attributable risk (attributable fraction) and the population attributable risk (population attributable fraction). Sometimes epidemiologists and others will refer to these as additional measures of association rather than separating them into their own category. Because the two groups are closely interrelated, what matters is not whether you call them measures of effect or measures of association but knowing when and how to use them. When we’re focused on population health, looking at relative differences like the odds ratio or relative risk is extremely useful for deciding where we want to make a difference and which factors we should spend our time and energy on. But when we’re trying to figure out how to approach the problem at the individual level (for your patient, for example), absolute measures can be much more useful.

3.7.1 Attributable Risk

Long description available at the end of the chapter.

Of everyone that has the exposure, how much of the occurrence of the disease is due to the exposure in question? That’s the attributable risk. In other words, what’s the difference between how much disease we could already expect without the exposure (the risk in the unexposed) and how much disease we have if the exposure is present (the risk in the exposed)? This could also be called the risk difference. The risk in the unexposed is often referred to as the baseline risk.

Example: Attributable risk

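Figure 3.19: Calculating attributable risk. Of 16 people who play racquet sports, 10 have ankle sprains (risk in the exposed = 10/16 = 0.625). Of 20 people who do not play racquet sports, 10 have ankle sprains (risk in the unexposed = 10/20 = 0.500). Long description available at the end of the chapter.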

AR = 0.625 - 0.500 = 0.125

The number we get—0.125—is called an absolute number that tells us how different the risk is for the exposed than the risk for the unexposed. For improved understanding, we tend to make it relative by turning it into a percentage.

\text{AR percent} = \frac{\text{Risk in the exposed - Risk in the unexposed}}{\text{Risk in the exposed}}\times 100

The AR percent tells us what percent of the risk of disease in the exposed group is attributable to the exposure itself. In this case, 20 percent of the risk of an ankle sprain in those that play racquet sports is due to those people playing racquet sports.
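To make the arithmetic concrete, here is a short Python sketch of the attributable risk and AR percent for the racquet sports example. The counts (10 of 16 exposed and 10 of 20 unexposed with ankle sprains) come from figure 3.19; the function names are illustrative only.

```python
def attributable_risk(risk_exposed: float, risk_unexposed: float) -> float:
    """Risk difference: risk in the exposed minus baseline risk."""
    return risk_exposed - risk_unexposed


def ar_percent(risk_exposed: float, risk_unexposed: float) -> float:
    """Share of the risk in the exposed that is attributable to the exposure."""
    return attributable_risk(risk_exposed, risk_unexposed) / risk_exposed * 100


# Racquet sports example (figure 3.19).
risk_exposed = 10 / 16    # ankle sprains among racquet sports players
risk_unexposed = 10 / 20  # ankle sprains among everyone else (baseline risk)

ar = attributable_risk(risk_exposed, risk_unexposed)
ar_pct = ar_percent(risk_exposed, risk_unexposed)

print(f"AR = {ar:.3f}")              # absolute difference in risk
print(f"AR percent = {ar_pct:.0f}%")  # relative to the risk in the exposed
```

Note that the AR percent divides by the risk in the exposed, not by the baseline risk, which is why 0.125 becomes 20 percent rather than 25 percent.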

When we use attributable risk to see how well a clinical intervention (e.g., a vaccination) performs, we know that the relative risk correlates to how well the intervention will perform. If the relative risk is < 1 (lower risk of the outcome due to the intervention), then the AR will be negative. This is what happens if the intervention works! If the relative risk is > 1 (higher risk of the outcome due to the intervention), then the AR will be positive. This is what happens when the intervention is not that great.
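The link between the relative risk and the sign of the attributable risk can be checked numerically. The sketch below uses the three illustrative pairs of values from figure 3.21 (risk in the exposed versus baseline risk: 4 vs 4, 5 vs 3, and 2 vs 7); the helper name `rr_and_ar` is an assumption made for this example.

```python
def rr_and_ar(risk_exposed: float, baseline_risk: float) -> tuple[float, float]:
    """Return (relative risk, attributable risk) for one comparison."""
    rr = risk_exposed / baseline_risk
    ar = risk_exposed - baseline_risk
    return rr, ar


# (risk in exposed, baseline risk) pairs from figure 3.21.
cases = [(4, 4), (5, 3), (2, 7)]

for risk_exposed, baseline_risk in cases:
    rr, ar = rr_and_ar(risk_exposed, baseline_risk)
    if rr < 1:
        verdict = "RR < 1, AR negative: lower risk with the exposure/intervention"
    elif rr > 1:
        verdict = "RR > 1, AR positive: higher risk with the exposure/intervention"
    else:
        verdict = "RR = 1, AR zero: no difference"
    print(f"RR = {rr:.3f}, AR = {ar:+d} -> {verdict}")
```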

Further reading

Check out this article on the use of the risk difference and the relative risk when comparing the effectiveness of treatment options. [12]

3.7.2 Measures Especially Important in Clinical Medicine

If we can figure out the attributable risk, we can also identify the relative risk reduction, the absolute risk reduction, the number needed to treat, and the number needed to harm.

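Measure | Equation | Which way to round
Relative risk reduction (RRR) | 1 - relative risk | -
Absolute risk reduction (ARR) | absolute value of [C/(C+D) - A/(A+B)] | Neither. Take the absolute value.
Number needed to treat (NNT) | 1 / absolute risk reduction | Up
Number needed to harm (NNH) | 1 / attributable risk | Down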

Figure 3.20: Summary of important clinical medicine measures.

Long description available at the end of the chapter.

The relative risk reduction: If there is a reduction in the risk of the outcome when a particular intervention is used, how much of that is due to the intervention compared to the control?

\text{Relative risk reduction (RRR)} = \text{1 - relative risk}

The absolute risk reduction (also known as the risk difference): While the ARR and the AR can both be referred to as the risk difference, there is a distinct difference between the two. AR refers to the difference in risk for the outcome among the exposed due to the exposure itself. The ARR is broader and refers to the difference between the risk for the outcome in the group that did not have the intervention and the risk for the outcome in the group that did have the intervention.

\text { Absolute risk reduction }(A R R)=\left|\frac{C}{C+D}-\frac{A}{A+B}\right|

Remember that the vertical bars mean that we take the absolute value of anything between them. So mathematically, | -3 | is equal to 3. If the difference was negative before we took the absolute value, we should remember that so we can take it into account later.

The number needed to treat: How many patients have to be treated in order to make a difference for one patient?

\text{Number needed to treat (NNT)} = \frac{1}{\text{absolute risk reduction}}

Always round the result of the NNT formula up.

The number needed to harm: How many patients have to be exposed to a risk factor in order to harm one patient?

\text{Number needed to harm (NNH)} = \frac{1}{\text{attributable risk}}

Always round the result of the NNH formula down.

These four measures (NNH, NNT, ARR, and RRR) are very important in clinical medicine. [13] Figure 3.22 provides an example of how to calculate these statistics.
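The four formulas and their rounding rules can be collected into a small Python sketch. The function names are our own, and the trial risks (30 percent without treatment, 23 percent with treatment) are hypothetical values chosen only to illustrate the arithmetic; the NNH line reuses the racquet sports attributable risk of 0.125 from earlier in the chapter.

```python
import math


def relative_risk_reduction(risk_treated: float, risk_control: float) -> float:
    """RRR = 1 - relative risk, where RR = risk_treated / risk_control."""
    return 1 - risk_treated / risk_control


def absolute_risk_reduction(risk_treated: float, risk_control: float) -> float:
    """ARR = |risk without intervention - risk with intervention|."""
    return abs(risk_control - risk_treated)


def number_needed_to_treat(arr: float) -> int:
    """NNT = 1 / ARR, always rounded up."""
    if arr == 0:
        raise ValueError("ARR is 0: the intervention makes no difference")
    return math.ceil(1 / arr)


def number_needed_to_harm(ar: float) -> int:
    """NNH = 1 / AR, always rounded down."""
    if ar <= 0:
        raise ValueError("NNH only applies when the exposure raises the risk")
    return math.floor(1 / ar)


# Hypothetical trial: 30% risk without treatment, 23% with treatment.
rrr = relative_risk_reduction(0.23, 0.30)
arr = absolute_risk_reduction(0.23, 0.30)
print(f"RRR = {rrr:.2f}, ARR = {arr:.2f}, NNT = {number_needed_to_treat(arr)}")

# Racquet sports example: AR = 0.625 - 0.500 = 0.125.
print(f"NNH = {number_needed_to_harm(0.125)}")
```

Rounding the NNT up and the NNH down are both conservative choices: we would rather overstate how many people must be treated to help one person, and understate how many must be exposed to harm one.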

Example: NNH, NNT, ARR, RRR

Long description available at the end of the chapter.

Female athletes have a greater risk for ACL injury than male athletes for a variety of reasons. Some 70 percent of ACL injuries in female athletes are due to reasons other than coming in contact with an object or a person. Basketball players are at risk for ACL injury due to the movements they make during play. A study was conducted by Omi et al. [14] to identify the effectiveness of an intervention that aimed to alter risk factors like landing mechanics, muscular strength, postural control, and hip joint control.

The graphic shown (figure 3 from the manuscript [15] ) shows the following rates:

  • Incidence rate of noncontact ACL injury for 309 athletes who did not receive an intervention (the initial observation period) [Total of 13 injuries]
  • Incidence rate for 268 athletes who received Intervention I (players used a ball to simulate basketball rebounding motions and worked to have appropriate knee alignment during landing) [Total of five injuries]
  • Incidence rate for 268 athletes who received Intervention II (an upgrade to Intervention I that included [a] application of a flexible band at the thigh level in all jump-landing maneuvers except for contact jump to reduce hip adduction, hip internal rotation, and knee valgus; [b] implementation of hip external rotation strengthening in addition to hip abduction strengthening; and [c] enhancement in quality of balance exercises such as cross-leg hop forward and side hop) [Total of three injuries]
  • Combined incidence rate for Interventions I and II

If you need more numbers to follow along, download the manuscript [14] from PubMed. Remember that rounding differently, or using the rates per 1000 athlete-exposures (i.e., person-time) rather than the incidence per total in the group, changes the numbers you get during calculations.

For the purpose of our example, we’ll refer only to the Observation, Intervention I, and Intervention II parts of the graphic.

  • Risk of noncontact ACL injury during Observation = 0.21
  • Risk of noncontact ACL injury during Intervention I = 0.09
  • Risk of noncontact ACL injury during Intervention II = 0.08

How much of the risk of noncontact ACL injury during Intervention I is due to participating in Intervention I?

If we are comparing Intervention I to the Observation (which can be considered baseline since no intervention has taken place):

\text{Relative risk} = \frac{0.09}{0.21} = 0.43

Athletes who participate in Intervention I have 0.43 times the risk of a noncontact ACL injury compared to athletes at baseline. Intervention I seems to reduce the risk of noncontact ACL injury.

Attributable risk (risk difference) = 0.09 – 0.21 ≈ -0.13 (with the rounded rates shown here the difference is -0.12; see the note on rounding above)

Our risk difference is negative. The risk of a noncontact ACL injury is reduced by 13 percent in those who participate in Intervention I.

Relative Risk Reduction = 1 – 0.43 = 0.57

The intervention reduces the risk of noncontact ACL injuries by 57 percent.

\text{Absolute risk reduction (ARR)} = \left|\frac{13}{309}-\frac{5}{268}\right|

The intervention reduces the risk of noncontact ACL injury 2.5 percent compared to baseline.

\text{Number needed to treat (NNT)} = \frac{1}{0.025} = 40

To prevent a noncontact ACL injury in just 1 athlete, 40 athletes must participate in the intervention.

Number Needed to Harm = N/A [There is a positive NNT, so there is no NNH for Intervention I]

If we are comparing Intervention II to the Observation:

\text{Relative risk} = \frac{0.08}{0.21} = 0.38

Athletes who participate in Intervention II have 0.38 times the risk of a noncontact ACL injury compared to athletes at baseline. Intervention II seems to reduce the risk of noncontact ACL injury.

Attributable risk (risk difference) = 0.08 – 0.21 ≈ -0.14 (with the rounded rates shown here the difference is -0.13; see the note on rounding above)

Our risk difference is negative. The risk of a noncontact ACL injury is reduced by 14 percent in those who participate in Intervention II.

Relative Risk Reduction = 1 – 0.38 = 0.62

The intervention reduces the risk of noncontact ACL injuries by 62 percent.

\text{Absolute risk reduction (ARR)} = \left|\frac{13}{309}-\frac{3}{268}\right| \approx 0.03

The intervention reduces the risk of noncontact ACL injury 3 percent compared to baseline.

Number Needed to Treat = 1/0.03 = 33.3, which rounds up to 34

To prevent a noncontact ACL injury in just 1 athlete, 34 athletes must participate in the intervention.

Number Needed to Harm = N/A [There is a positive NNT, so there is no NNH for Intervention II]
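The Intervention II comparison can be reproduced with a short Python sketch. It assumes the rates read from figure 3.22 (0.21 at baseline, 0.08 for Intervention II) for the relative measures and the counts listed above (13 injuries among 309 athletes, 3 among 268) for the absolute measures; the variable names are our own. It also shows how the rounding choice mentioned earlier changes the NNT.

```python
import math

# Rates as read from figure 3.22.
baseline_rate = 0.21
intervention_rate = 0.08

relative_risk = intervention_rate / baseline_rate
rrr = 1 - relative_risk

# Injury counts per group, as listed in the bullets above.
baseline_risk = 13 / 309
intervention_risk = 3 / 268
arr = abs(baseline_risk - intervention_risk)

# NNT is always rounded up. Rounding the ARR to 0.03 first, as the worked
# example does, gives a slightly different answer than the unrounded ARR.
nnt_from_rounded_arr = math.ceil(1 / round(arr, 2))
nnt_from_unrounded_arr = math.ceil(1 / arr)

print(f"Relative risk = {relative_risk:.2f}")
print(f"RRR           = {rrr:.2f}")
print(f"ARR           = {arr:.4f}")
print(f"NNT (ARR rounded to 0.03) = {nnt_from_rounded_arr}")
print(f"NNT (unrounded ARR)       = {nnt_from_unrounded_arr}")
```

The two NNT values differ by one athlete, which is the kind of discrepancy to expect when intermediate results are rounded before the final step.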

The risk of noncontact ACL injury after Intervention I and after Intervention II is in each case less than half the risk when no intervention was used (relative risks of 0.43 and 0.38). Intervention II showed a slight improvement over Intervention I in how much it reduced the risk of noncontact ACL injury when comparing the absolute risk reductions vs baseline (3 percent vs 2.5 percent).

Attributable risk and its derivatives are important when we are considering a specific population, but often when we develop medications or create other interventions we are considering how much impact they will have on the overall burden of a health problem. Extending our example (figure 3.22), how many noncontact ACL injuries could we have eliminated from the entire population if we eliminated them from women’s basketball? The answer to this question is the population attributable risk. The population attributable risk is the absolute level of risk of the outcome in the whole population due to the exposure. The difference between this and the attributable risk is that this applies to the risk reduction even in those that do not have the exposure. One way to calculate this is:

\text { PAR }=\frac{A R}{\frac{\text { Total exposed }}{\text { Total in the population }}}

Example: Population attributable risk

Say there are 4500 NCAA women’s basketball players. Based on our example data for Intervention II:

A R=-0.14

Just like the AR, it can be easier to understand this as a percentage.

\text{PAR percent} = \left\{\frac{AR}{(A+B) \times \left[\frac{A+C}{N}\right]}\right\} \times 100

By implementing Intervention II among all NCAA women’s basketball players, we would reduce the total burden of noncontact ACL injuries in this population by less than 1 percent. This intervention may work well on an individual level but not as a population level intervention for noncontact ACL injuries.

Want to dive deeper into how the ARR and the RRR should (and shouldn't) be used in real life?

Here’s a great explanation related to how not to confuse the public with the COVID-19 vaccination. [16]

Here's a helpful video on how to calculate the NNT. [17]

Interested in why you need the RR to calculate the AR? [18]

Here's an article on how to use risk difference, risk ratio, and odds ratio in clinical medicine. [19]

3.8 Reporting Results of Epidemiologic and Clinical Studies

There are various standards for the reporting of study results and methods. Figure 3.24 provides an example list of different standards. You can find additional standards for various disciplines and different types of studies at the EQUATOR network website. [20]

Standard name | Acronym
Consolidated standards of reporting trials | CONSORT
Strengthening the reporting of observational studies in epidemiology | STROBE
Standards for reporting studies of diagnostic accuracy | STARD
Quality assessment of diagnostic accuracy studies | QUADAS
Preferred reporting items for systematic reviews and meta-analyses | PRISMA
Consolidated criteria for reporting qualitative research | COREQ
Statistical analyses and methods in the published literature | SAMPL
Consensus-based clinical case reporting guideline development | CARE
Standards for quality improvement reporting excellence | SQUIRE
Consolidated health economic evaluation reporting standards | CHEERS
Enhancing transparency in reporting the synthesis of qualitative research | ENTREQ

Figure 3.24: Standards for study design and reporting.

In addition to reporting study results, it is also normal and helpful to report on how studies were designed and implemented. This reporting of methods helps others better understand all the work that goes into obtaining results as well as potential roadblocks to watch out for when designing a similar study to expand what is known about a topic. The CARE Consortium published a journal article in 2017 about how they built a national study of concussion in service academy students and collegiate athletes with the Department of Defense. [21]

Figure Descriptions

Figure 3.1: Flow chart. Following pathway to left: Controlled assignment of subjects to study conditions arrow to randomized (clinical trials) or non-randomized/quasi-experimental (community trials). Clinical trials and community trials are types of experimental studies. Following pathway to right: Uncontrolled/not randomized assignment of subjects to study conditions, arrow to sampling with regard to exposure, characteristic, or cause (prospective studies). Sampling with regard to disease or effect, arrow to time of exposure/characteristic. Exposure or characteristic at time of study (cross-sectional studies). History of exposure or characteristic prior to time of study (retrospective studies). Prospective studies, retrospective studies, and cross-sectional studies are types of observational studies. Return to figure 3.1 .

Figure 3.5: Headers on top of table are outcome (+) and outcome (-). Headers to left of the table are exposed (+) and exposed (-). If outcome (+) and exposed (+), A. If outcome (-) and exposed (+), B. If outcome (+) and exposed (-), C. If outcome (-) and exposed (-), D. Reading left to right in the table: A, B, C, D. Outside of the table are calculations for finding total exposed and total outcome. Below the table left to right: total exposed, A+C, B+D, A+B+C+D. Right of the table top to bottom: total outcome, A+B, C+D, A+B+C+D. Total population represented by A+B+C+D in bottom right corner. Return to figure 3.5 .

Figure 3.6: Cross-sectional study (natural allocation): in the present, risk factor (+) and risk factor (-) point to compare disease prevalence. Case-control study (natural allocation): in the present, controls without disease and diseased cases both point to past box stating compare risk factor frequency. Retrospective study (natural allocation): in the past, risk factor (+) and risk factor (-) point to present box stating compare disease incidence; another box in present time states review previous records with a dotted arrow pointing back to the past risk factors. Prospective cohort study (natural allocation): in the present, risk factor (+) and risk factor (-) point to future box stating compare disease incidence. Randomized control trial (random allocation): in the present, risk factor (+) and risk factor (-) point to future box stating compare disease incidence. Return to figure 3.6 .

Figure 3.7: X-axis displays number of matches played (ranging from 0 to 350). Y-axis displays number of concussions (ranging from 0 to 12). Roughly 50 data points on the graph with a regression line indicating the average. As number of matches played increases, the number of concussions increases. Return to figure 3.7 .

Figure 3.8: Cohort study: study population is disease-free and at-risk. Half of the study population is labeled cohort 1 (exposed group), the other half is labeled cohort 2 (unexposed group). Of the cohort 1 group, there are some with disease and some with no disease. Of the cohort 2 group, there are some with disease and some with no disease. Diseased status in two cohorts is identified. Case control study: there are separate groups based on outcome status. First group: cases (outcome present). Second group: controls (outcome absent). Each of these groups have subgroups where there is either a present exposure or an absent exposure. Return to figure 3.8 .

Figure 3.9: Headers on top of table are outcome (+) and outcome (-). Headers to left of the table are exposed (+) and exposed (-). If outcome (+) and exposed (+), A. If outcome (-) and exposed (+), B. If outcome (+) and exposed (-), C. If outcome (-) and exposed (-), D. Reading left to right in the table: A, B, C, D. Outside of the table are calculations for finding total exposed and total outcome. Below the table left to right: total exposed, A+C, B+D, A+B+C+D. Right of the table top to bottom: total outcome, A+B, C+D, A+B+C+D. Total population represented by A+B+C+D in bottom right corner. Return to figure 3.9 .

Figure 3.10: Three separate 2x2 tables. First: Outcome (-) and outcome (+) are above the table and exposure (-) and exposure (+) are left of the table. Second: exposure (+) and exposure (-) are above the table and outcome (+) and outcome (-) are left of the table. Third: exposure (-) and exposure (+) are above the table and outcome (-) and outcome (+) are left of the table. Return to figure 3.10 .

Figure 3.12: OR < 1 (e.g., 0.9): exposure less likely in those with outcome compared to those without the outcome. OR = 1: no difference. OR > 1 (e.g., 1.1): exposure more likely in those with outcome compared to those without the outcome. Return to figure 3.12 .

Figure 3.13: RR < 1 (e.g., 0.9): disease less likely in the exposed group compared to those that are unexposed. RR = 1: no difference. RR > 1 (e.g., 1.1): disease more likely in the exposed group compared to those that are unexposed. Return to figure 3.13 .

Figure 3.14: 1: Establish the existence of an outbreak. 2: Verify the diagnosis. 3: Construct a working case definition. 4: Find cases systematically and record information. 5: Perform descriptive epidemiology. 6: Develop hypotheses. 7: Evaluate hypotheses epidemiologically. 8: As necessary, reconsider, refine, and re-evaluate hypotheses. 9: Compare and reconcile with laboratory and/or environmental studies. 10: Implement control and prevention measures. 11: Initiate or maintain surveillance findings. Steps 8-11 often happen simultaneously. Return to figure 3.14 .

Figure 3.15: 1: familial disease/known genetic defect. 2: clinical and laboratory criteria (5/8 criteria should be fulfilled). Criteria: fever, splenomegaly, cytopenia greater than or equal to 2 cell lines (hemoglobin less than 90 g/l or less than 120 g/l if below 4 weeks of age, platelets less than 100 x 10^9/l, neutrophils less than 1 x 10^9/l), hypertriglyceridemia and/or hypofibrinogenemia (fasting triglycerides greater than or equal to 3 mmol/l, fibrinogen less than 1.5 g/l), ferritin greater than or equal to 500 mu g/l, soluble IL-2 receptor 25 greater than or equal to 2400 U/ml, decreased or absent natural killer cell activity, hemophagocytosis in bone marrow, cerebrospinal fluid, or lymph nodes. Supportive evidence is cerebral symptoms with moderate pleocytosis and/or elevated protein, elevated transaminases, bilirubin, lactate dehydrogenase. Return to figure 3.15 .

Figure 3.17: 2x2 table. Above table labels: sick (outcome) and not sick (outcome). Left table labels: ate salad (exposure) and didn't eat salad (exposure). A: 48 (sick and ate salad). B: 20 (not sick and ate salad). C: 2 (sick and didn't eat salad). D: 100 (not sick and didn't eat salad). Return to figure 3.17 .

Figure 3.18: Above the table is outcome (+) and outcome (-). Left of the table is exposed (+) and exposed (-). If outcome (+) and exposed (+), A. If outcome (-) and exposed (+), B. If outcome (+) and exposed (-), C. If outcome (-) and exposed (-), D. Reading left to right in the table: A, B, C, D. Outside of the table are calculations for finding total exposed and total outcome. Below the table left to right: total exposed, A+C, B+D, A+B+C+D. Right of the table top to bottom: total outcome, A+B, C+D, A+B+C+D. Additional rightmost column: risk. A/(A+B) and C/(C+D). Return to figure 3.18 .

Figure 3.19: Attributable risk: Of everyone that has the exposure, how much of the occurrence of the disease is due to the exposure in question? Example: Of everyone that plays racquet sports, how many ankle sprains are due to playing racquet sports? Example follows. Total exposed (play racquet sports): 16 people (A=10 and B=6). A represents people that have ankle sprains (outcome). A (10) divided by total exposed (16) equals 0.625. Total unexposed (don't play racquet sports): 20 people (C=10 and D=10). C represents people that have ankle sprains (outcome). C (10) divided by total unexposed (20) equals 0.5. Return to figure 3.19 .

Figure 3.21: Three boxed columns with steps for calculations of relative risk reduction, number needed to treat, and number needed to harm based on relative risk. If risk in exposed is smaller than baseline, AR is negative. If risk in exposed is larger than baseline, AR is positive. Left column: When relative risk is equal to one, the baseline risk and risk in exposed are equal. Calculating RR: 4 (risk in exposed) divided by 4 (baseline risk) equals an RR of 1. Calculating AR: 4 (risk in exposed) minus 4 (baseline risk) equals an AR of 0. Calculating ARR: absolute value of 4 (baseline risk) minus 4 (risk in exposed) equals an ARR of 0. Middle column: When relative risk is greater than one, the baseline risk is smaller than the risk in exposed. Calculating RR: 5 (risk in exposed) divided by 3 (baseline risk) equals an RR of 1.667. Calculating AR: 5 (risk in exposed) minus 3 (baseline risk) equals an AR of 2. Calculating ARR: absolute value of 3 (baseline risk) minus 5 (risk in exposed) equals an ARR of 2. Right column: When relative risk is less than one, the baseline risk is larger than the risk in exposed. Calculating RR: 2 (risk in exposed) divided by 7 (baseline risk) equals an RR of 0.286. Calculating AR: 2 (risk in exposed) minus 7 (baseline risk) equals an AR of -5. Calculating ARR: absolute value of 7 (baseline risk) minus 2 (risk in exposed) equals an ARR of 5. Return to figure 3.21 .

Figure 3.22: Bar chart showing incidence of noncontact ACL injury. Incidence on x-axis and rates on y-axis. Observation: 0.21. Intervention one: 0.09. Intervention two: 0.08. Intervention one and two: 0.08. Return to figure 3.22 .

Figure 3.23: Exposure = women's basketball. Outcome = Noncontact ACL injuries. Noncontact ACL injuries due to women's basketball is a small subset of all noncontact ACL injuries. If we eliminate the small subset, how much does the all noncontact ACL injuries category shrink? Population attributable risk (PAR) equals (risk in exposed minus risk in unexposed) divided by (number exposed divided by total population). Risk in exposed = A divided by (A+B). Risk in unexposed = C divided by (C+D). Return to figure 3.23 .

Figure References

Figure 3.1: Overview of study designs. Kindred Grey. 2022. Adapted under fair use from Lilienfeld AM. Advances in quantitative methods in epidemiology. Public Health Rep . 1980;95(5):462–469.

Figure 3.2: Bhopal RS. The concept of risk and fundamental measures of disease frequency: Incidence and prevalence. In: Bhopal, RS. Concepts of epidemiology: Integrating the ideas, theories, principles and methods of epidemiology. Oxford University Press; 2008:201–234.

Figure 3.3: Epidemiological study designs. Adapted under fair use from USMLE First Aid, Step 1.

Figure 3.4: Community trials and clinical trials. Adapted under fair use from USMLE First Aid, Step 1.

Figure 3.5: Example 2x2 table. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.6: Temporality. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.7: Ecological relationship between concussion incidence and matches played. Kindred Grey. 2022. CC BY 4.0 . Data from Gibbs N, Watsford M. Concussion incidence and recurrence in professional Australian football match-play: A 14-year analysis. J Sports Med (Hindawi Publ Corp). 2017;2017:2831751. DOI:10.1155/2017/2831751

Figure 3.8: Case control versus cohort studies. Kindred Grey. 2022. Includes person by Gan Khoon Lay from Noun Project ( Noun Project License ). Adapted under fair use from Song JW, Chung KC. Observational studies: Cohort and case-control studies. Plast Reconstr Surg . 2010;126(6):2234–2242. DOI:10.1097/PRS.0b013e3181f44abc

Figure 3.9: Example 2x2 table. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.10: Example of alternative 2x2 tables. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.11: The variety of measures that can be calculated from different study designs. Thiese MS. Observational and interventional study design types: An overview. Biochem Med (Zagreb). 2014;24(2):199–210. DOI:10.11613/BM.2014.022 ( CC BY-NC-ND 3.0 )

Figure 3.12: Interpreting odds ratios. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.13: Interpreting relative risks. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.14: Steps to solving an outbreak. Kindred Grey. 2022. CC BY 4.0 . Adapted from CDC . Public Domain.

Figure 3.15: Example case definition. Kindred Grey. 2022. Adapted under fair use from Janka GE. Familial and acquired hemophagocytic lymphohistiocytosis. Annu Rev Med . 2012;63:233–246. DOI:10.1146/annurev-med-041610-134208 and Henter J-I, Horne A, Aricó M, et al. HLH-2004: Diagnostic and therapeutic guidelines for hemophagocytic lymphohistiocytosis. Pediatr Blood Cancer . 2007;48(2):124–131.

Figure 3.16: Example line listing. Data from table 6.5 of Lesson 6: Investigating an outbreak, from CDC . Public domain.

Figure 3.17: Example attack rate. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.18: Using a 2x2 table to calculate attributable risk. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.19: Calculating attributable risk. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.21: Graphical representation of figure 3.20. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.22: EXAMPLE NNH, NNT, ARR, RRR. Kindred Grey. 2022. CC BY 4.0 . Data from Omi Y, Sugimoto D, Kuriyama S, et al. Effect of hip-focused injury prevention training for anterior cruciate ligament injury reduction in female basketball players: A 12-year prospective intervention study. Am J Sports Med . 2018;46(4):852–861. DOI:10.1177/0363546517749474

Figure 3.23: Calculating the population attributable risk using women’s basketball injuries. Kindred Grey. 2022. CC BY 4.0 .

Figure 3.24: Standards for study design and reporting. Adapted under fair use from Thiese MS. Observational and interventional study design types: An overview. Biochem Med (Zagreb). 2014;24(2):199–210. DOI:10.11613/BM.2014.022 ( CC BY-NC-ND 3.0 )

  • Which Came First - The Chicken or the Egg? https://www.youtube.com/watch?v=1a8pI65emDE . AsapSCIENCE via YouTube; 2013.
  • Gibbs N, Watsford M. Concussion incidence and recurrence in professional Australian football match-play: A 14-year analysis. J Sports Med (Hindawi Publ Corp). 2017;2017:2831751.
  • Rosling H. The best stats you've ever seen. https://www.youtube.com/watch?v=hVimVzgtD6w . TED via YouTube; 2006.
  • Martinez MW, Tucker AM, Bloom OJ, et al. Prevalence of inflammatory heart disease among professional athletes with prior COVID-19 infection who received systematic return-to-play cardiac screening. JAMA Cardiol. 2021;6(7):745–752.
  • Centers for Disease Control and Prevention. Investigating an outbreak. In: Principles of epidemiology. 3rd ed. U.S. Department of Health and Human Services; 2006:6-1–6-78.
  • Centers for Disease Control and Prevention. Surveillance case definitions for current and historical conditions. https://ndc.services.cdc.gov/ . Updated 2023. Accessed September 15, 2023.
  • Virginia Department of Health. Rules and regulations of the Board of Health, Commonwealth of Virginia. https://www.vdh.virginia.gov/surveillance-and-investigation/ . Published 2021. Accessed 2021.
  • Centers for Disease Control and Prevention. Investigating outbreaks: Using data to link foodborne disease outbreaks to a contaminated source. https://www.cdc.gov/foodsafety/outbreaks/basics/data-types-collected.html?CDC_AA_refVal=https%3A%2F%2Fwww.cdc.gov%2Ffoodsafety%2Foutbreaks%2Finvestigating-outbreaks%2Findex.html . Published 2016. Accessed 2021.
  • Centers for Disease Control and Prevention. Morbidity and mortality weekly report. https://www.cdc.gov/mmwr/index.html . Published 2022. Accessed 2022.
  • Centers for Disease Control and Prevention. CDC WONDER. http://wonder.cdc.gov . Reviewed 2023. Accessed 15 September 2023.
  • Newcombe RG, Bender R. Implementing GRADE: Calculating the risk difference from the baseline risk and the relative risk. Evid Based Med. 2014;19(1):6–8.
  • Irwig L, Irwig J, Trevena L, Sweet M. Relative risk, relative and absolute risk reduction, number needed to treat and confidence intervals. In: Smart Health Choices: Making Sense of Health Advice. Hammersmith Press; 2008.
  • Omi Y, Sugimoto D, Kuriyama S, et al. Effect of hip-focused injury prevention training for anterior cruciate ligament injury reduction in female basketball players: A 12-year prospective intervention study. Am J Sports Med. 2018;46(4):852–861.
  • Reuters Fact Check. Fact Check: Why relative risk reduction, not absolute risk reduction, is most often used in calculating vaccine efficacy. 2023. https://www.reuters.com/article/factcheck-thelancet-riskreduction/fact-check-why-relative-risk-reduction-not-absolute-risk-reduction-is-most-often-used-in-calculating-vaccine-efficacy-idUSL2N2NK1XA . Accessed 15 September 2023.
  • The NNT Group. theNNT, explained. https://www.thennt.com/thennt-explained/ . Accessed 15 September 2023.
  • Noordzij M, van Diepen M, Caskey FC, Jager KJ. Relative risk versus absolute risk: One cannot be interpreted without the other. Nephrol Dial Transplant. 2017;32(suppl 2):ii13–ii18.
  • Kim HY. Statistical notes for clinical researchers: Risk difference, risk ratio, and odds ratio. Restor Dent Endod. 2017;42(1):72–76.
  • EQUATOR network. Enhancing the QUAlity and Transparency Of health Research. https://www.equator-network.org/. Published 2023. Accessed 15 September 2023.
  • Broglio SP, McCrea M, McAllister T, et al. A national study on the effects of concussion in collegiate athletes and US Military Service Academy members: The NCAA-DoD Concussion Assessment, Research and Education (CARE) consortium structure and methods. Sports Med. 2017;47(7):1437–1451.

Epidemiology Copyright © 2023 by Charlotte Baker is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


Epidemiologic Case Studies

Classroom case studies.

In these case studies, a group of students works through a public health problem with guidance from a knowledgeable instructor. Each case study begins with the recognition of the problem and proceeds through the resulting investigation in a linear fashion. Information about the problem is slowly revealed to the students. Periodic open-ended questions are used to highlight important aspects of the investigation and provoke discussion and exchange of ideas among participants.

Paper-based student and instructor versions are available for each classroom case study. The instructor's version provides teaching points and basic information about each of the questions. Selected case studies are available in Spanish and German.

  • Page last reviewed: April 7, 2016
  • Page last updated: April 7, 2016
  • Office of Public Health Scientific Services ;
  • Center for Surveillance, Epidemiology, and Laboratory Services ;
  • Division of Scientific Education and Professional Development


Introduction to Epidemiological Studies

  • First Online: 07 June 2018


  • Lazaros Belbasis 3 &
  • Vanesa Bellou 3  

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1793))


The basic epidemiological study designs are cross-sectional, case-control, and cohort studies. Cross-sectional studies provide a snapshot of a population by determining both exposures and outcomes at one time point. Cohort studies identify the study groups based on the exposure and, then, the researchers follow up study participants to measure outcomes. Case-control studies identify the study groups based on the outcome, and the researchers retrospectively collect the exposure of interest. The present chapter discusses the basic concepts, the advantages, and disadvantages of epidemiological study designs and their systematic biases, including selection bias, information bias, and confounding.



Author information

Authors and affiliations.

Department of Hygiene and Epidemiology, University of Ioannina Medical School, Ioannina, Greece

Lazaros Belbasis & Vanesa Bellou


Corresponding author

Correspondence to Lazaros Belbasis .

Editor information

Editors and affiliations.

Evangelos Evangelou


Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this protocol

Belbasis, L., Bellou, V. (2018). Introduction to Epidemiological Studies. In: Evangelou, E. (eds) Genetic Epidemiology. Methods in Molecular Biology, vol 1793. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7868-7_1


DOI : https://doi.org/10.1007/978-1-4939-7868-7_1

Published : 07 June 2018

Publisher Name : Humana Press, New York, NY

Print ISBN : 978-1-4939-7867-0

Online ISBN : 978-1-4939-7868-7

eBook Packages : Springer Protocols

  • Chapter 1. What is epidemiology?
    Diagnosed by clinician and confirmed by pathologist: 53
    Diagnosed by clinician and not confirmed by pathologist: 21
    First diagnosed post mortem: 22

    Farmers (self employed): 82%
    Professionals: 77%
    Skilled manual workers: 69%
    Labourers: 63%
    Armed forces: 42%
  • Chapter 2. Quantifying disease in populations
  • Chapter 3. Comparing disease rates
  • Chapter 4. Measurement error and bias
  • Chapter 5. Planning and conducting a survey
  • Chapter 6. Ecological studies
  • Chapter 7. Longitudinal studies
  • Chapter 8. Case-control and cross sectional studies
  • Chapter 9. Experimental studies
  • Chapter 10. Screening
  • Chapter 11. Outbreaks of disease
  • Chapter 12. Reading epidemiological reports
  • Chapter 13. Further reading

CDC Epidemiology Case Studies
CDC developed case studies in applied epidemiology based on real-life epidemiologic investigations and used them for training new Epidemic Intelligence Service (EIS) officers — CDC’s “disease detectives.” EIS offers these carefully crafted epidemiology case studies for schools of medicine, nursing, and public health to use as a component of an applied epidemiology curriculum.

Students may practice their epidemiologic skills by using these exercises in classroom activities or as homework assignments to reinforce principles and skills previously covered in lectures and reading assignments.

The following case studies use specific examples to teach epidemiology concepts, require active participation, and help strengthen problem-solving skills. These case studies in applied epidemiology:

Case study based on a 1985 outbreak with unknown etiology and mode of transmission in multiple states. Updated in 2003.

Case study based on the classic studies of Doll and Hill in the 1950s. Addresses study design, interpretation of measures of association, and impact of association. 

Case study based on a 1980–1982 multicenter case-control study. Addresses bias and analysis of case-control studies. Updated in 2005.

Case study of a classic, straightforward outbreak investigation in a defined population. Based on a 1940 outbreak of Staphylococcus aureus among church picnic attendees. Additional material: 

Case study based on a community outbreak of Legionnaires’ disease in Bogalusa, Louisiana, in 1989. Addresses the steps of a field investigation and a case-control study. Updated in 2003.

 (2003 Update)

Case study based on surveillance and investigation activities of the Oregon Health Division between 1986 and 1995. 

Case study based on an infectious disease outbreak investigation in Texas.

Instructor guides/Preceptor versions for teachers/faculty can be purchased from the . Instructor guides are available FREE for APTR members and are $20 for non-members.


Case-Crossover Study Design


This page briefly describes case-crossover designs as an approach to investigating acute triggers that are potentially causing disease. An annotated resource list is provided.

Description

A “trigger” can be thought of as the final step in leading from pathophysiology to disease, or the final component cause leading a susceptible person to experience a disease. Triggers thus may be important for our understanding of etiology. In addition, a new understanding of disease triggers can help us to prevent disease through trigger reduction, reduction of baseline risk, or a targeted intervention to reduce risk at a time when disease is more likely to occur.

For a potential hypothesis about a trigger to be tested using a case-crossover design, we would look for the following defining characteristics:

short-term changes in exposure

transient changes in disease risk

acute-onset disease

Related study designs

Study designs used to examine exposure-outcome associations include cohort and case-control studies. Whereas cohort studies can be limited in power for rare disease outcomes, and case-control studies can be biased by retrospective exposure assessment, case-crossover designs compare individuals to themselves at different times. This parallels the randomized crossover trial, which compares individuals to themselves as they go on and off treatment.

Something that the case-crossover design has in common with a case-control approach is the need to find representative controls. However, while case-control designs select control individuals, case-crossover designs select control time windows. This brings our focus to the plausible temporal relationship between exposure to a trigger and disease onset.

Other time-focused designs include ecological time-series and interrupted time-series analyses. Disease counts can be modeled over time in a generalized linear modeling framework, often using Poisson regression. Because Poisson models use a log link, the beta coefficient compares rates on a relative scale.
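As a small illustration of what the log link implies, a coefficient on the log-rate scale converts to a rate ratio by exponentiation. The sketch below is not tied to any particular dataset; the coefficient value and the function name `rate_ratio` are hypothetical.

```python
import math


def rate_ratio(beta: float, delta: float = 1.0) -> float:
    """Rate ratio implied by a log-link coefficient `beta`
    for a `delta`-unit increase in the exposure."""
    return math.exp(beta * delta)


# Hypothetical coefficient from a Poisson time-series model.
beta = 0.05

print(f"Rate ratio per 1-unit increase:  {rate_ratio(beta):.3f}")
print(f"Rate ratio per 10-unit increase: {rate_ratio(beta, 10):.3f}")
```

A coefficient of zero corresponds to a rate ratio of 1 (no difference in rates), and effects multiply rather than add as the exposure increases.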

Setting up a case-crossover analysis

A key decision in setting up a case-crossover study is the length of time before disease onset during which exposure would be compatible with triggering. That is your “case” window. For example, if you think physical activity could trigger a myocardial infarction in the subsequent 2 hours, you could identify a case window starting 2 hours before symptom onset and ending at the time of symptom onset. Sensitivity analyses that alter the length of this time window might be planned.

Selection of one or more control windows is then designed to identify whether exposure during the case window was atypical. By comparing exposures over time within the same person, you automatically condition on all stable characteristics of the individual. You might also want to match on potential time-varying confounders such as time of day. Usually, case and control windows are the same length. It may be efficient to select multiple control time windows for every case time window. Control windows should reflect the exposure distribution while at risk for the outcome, should be close enough in time that the baseline risk is similar, and should be far apart enough in time so that exposures are uncorrelated.
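The window construction described above can be sketched as follows. The 2-hour length comes from the physical-activity example; placing control windows at the same clock time 1, 2, and 3 days earlier is only an illustrative assumption (it matches on time of day), and all function names are hypothetical. Referent selection in a real study needs more care than this.

```python
from datetime import datetime, timedelta


def case_window(onset: datetime, hours: float = 2):
    """Window ending at symptom onset and starting `hours` earlier."""
    return (onset - timedelta(hours=hours), onset)


def control_windows(onset: datetime, hours: float = 2, days_back=(1, 2, 3)):
    """Windows of the same length at the same clock time on earlier days.
    The choice of days is an assumption made for illustration."""
    windows = []
    for d in days_back:
        end = onset - timedelta(days=d)
        windows.append((end - timedelta(hours=hours), end))
    return windows


def exposed(window, exposure_times) -> bool:
    """True if any recorded exposure time falls in [start, end)."""
    start, end = window
    return any(start <= t < end for t in exposure_times)


onset = datetime(2024, 1, 10, 14, 0)          # hypothetical symptom onset
activity = [datetime(2024, 1, 10, 13, 0)]     # hypothetical exposure record

print(exposed(case_window(onset), activity))
print([exposed(w, activity) for w in control_windows(onset)])
```

Changing `hours` gives a direct way to run the sensitivity analyses on window length mentioned earlier.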

Once you have constructed your case and control time windows, compare the probability of exposure during case and control periods. This is usually done using conditional logistic regression (similar to a matched case-control study).
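In the simplest special case, with one control window per case window, a binary exposure, and no covariates, the conditional logistic regression estimate of the odds ratio reduces to the ratio of discordant pairs. The sketch below shows only that special case; the data and the function name are hypothetical, and a full conditional logistic regression would be needed with multiple control windows or time-varying covariates.

```python
def matched_pair_odds_ratio(pairs) -> float:
    """Odds ratio from 1:1 matched windows.

    `pairs` holds (exposed_in_case_window, exposed_in_control_window)
    booleans, one pair per person. Concordant pairs carry no information;
    the estimate is (case-only exposed) / (control-only exposed).
    """
    pairs = list(pairs)
    case_only = sum(1 for c, r in pairs if c and not r)
    control_only = sum(1 for c, r in pairs if r and not c)
    if control_only == 0:
        raise ValueError("No pairs exposed only in the control window; "
                         "the odds ratio is not estimable.")
    return case_only / control_only


# Hypothetical data: 6 exposed only in the case window, 3 only in the
# control window, 4 exposed in both, 7 exposed in neither.
pairs = ([(True, False)] * 6 + [(False, True)] * 3
         + [(True, True)] * 4 + [(False, False)] * 7)

print(matched_pair_odds_ratio(pairs))  # 6 / 3 = 2.0
```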

An opportunity to consider when working with a case-crossover design is that although you do not typically need to control for many individual characteristics, you can evaluate effect modification by individual characteristics. For example, while looking at physical activity and myocardial infarction, you might hypothesize that triggering would be most likely to occur for individuals with hypertension.

Methodological Articles

Maclure M. The case-crossover design: a method for studying transient effects on the risk of acute events. Am J Epidemiol 1991;133(2):144-53.

Maclure M, Mittleman MA. Should we use a case-crossover design? Annu Rev Public Health 2000;21:193-221.

Lu Y, Zeger SL. On the equivalence of case-crossover and time series methods in environmental epidemiology. Biostatistics 2007;8(2):337-344.

Lu Y, Symons JM, Geyh AS, Zeger SL. An approach to checking case-crossover analyses based on equivalence with time-series methods. Epidemiology 2008;19(2):169-75.

Maclure M, Mittleman MA. Case-crossover designs compared with dynamic follow-up designs. Epidemiology 2008;19(2):176-8.

Janes H, Sheppard L, Lumley T. Case-crossover analyses of air pollution exposure data: referent selection strategies and their implications for bias. Epidemiology 2005;16(6):717-26.

Application Articles

Basu R, Dominici F, Samet JM. Temperature and mortality among the elderly in the United States: a comparison of epidemiologic methods. Epidemiology 2005;16(1):58-66.

Hebert C, Delaney JA, Hemmelgarn B, Levesque LE, Suissa S. Benzodiazepines and elderly drivers: a comparison of pharmacoepidemiological study designs. Pharmacoepidemiol Drug Saf 2007;16:845-849.


Open Access

Study Protocol

Investigating the association between household exposure to Anopheles stephensi and malaria in Sudan and Ethiopia: A case-control study protocol

Contributed equally to this work with: Temesgen Ashine, Yehenew Asmamaw Ebstie, Rayyan Ibrahim, Adrienne Epstein

Roles Conceptualization, Investigation, Methodology, Writing – review & editing

Affiliations Malaria and NTD Research Division, Armauer Hansen Research Institute, Addis Ababa, Ethiopia, Department of Biology, College of Natural and Computational Sciences, Arba Minch University, Arba Minch, Ethiopia

Affiliation Malaria and NTD Research Division, Armauer Hansen Research Institute, Addis Ababa, Ethiopia

Affiliation Department of Community Medicine, Faculty of Medicine, University of Khartoum, Khartoum, Sudan

Roles Conceptualization, Investigation, Methodology, Visualization, Writing – original draft

Affiliation Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, United Kingdom

Affiliation Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom

Roles Data curation, Investigation, Software, Writing – review & editing

Roles Data curation, Software, Writing – review & editing

Roles Investigation, Methodology, Writing – review & editing

Roles Investigation, Methodology, Resources, Validation, Writing – review & editing

Affiliation Sennar Malaria Research and Training Centre (SMART Centre), Federal Ministry of Health, Khartoum, Sudan

Roles Investigation, Methodology, Project administration, Writing – review & editing

Affiliation Tropical and Infectious Disease Research Centre, Jimma University, Jimma, Ethiopia

Roles Writing – review & editing

Roles Investigation, Methodology, Supervision, Writing – review & editing

Affiliation Unit of Socio-Ecological Health Research, Department of Public Health, Institute of Tropical Medicine, Antwerpen, Belgium

Affiliation School of Public Health, College of Medicine and Health Sciences, Hawassa University, Hawassa, Ethiopia

ORCID logo

Roles Funding acquisition, Investigation, Methodology, Writing – review & editing

Affiliation Primary Health Care General Directorate, Federal Ministry of Health, Khartoum, Sudan

Roles Methodology, Writing – review & editing

Affiliation Disease Prevention and Control Directorate, Ethiopian Federal Ministry of Health, Addis Ababa, Ethiopia

Affiliation Department of Biology, College of Natural and Computational Sciences, Arba Minch University, Arba Minch, Ethiopia

Roles Funding acquisition, Methodology, Supervision, Writing – review & editing

Roles Funding acquisition, Methodology, Resources, Supervision, Validation, Writing – review & editing

Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Validation, Writing – review & editing

Affiliation Directorate General of Global Health, Federal Ministry of Health, Khartoum, Sudan

Roles Funding acquisition, Methodology, Writing – review & editing

Roles Conceptualization, Funding acquisition, Investigation, Methodology, Resources, Supervision, Validation, Writing – review & editing

Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Writing – review & editing


Roles Conceptualization, Funding acquisition, Investigation, Methodology, Supervision, Writing – original draft, Writing – review & editing

* E-mail: [email protected]

  • Temesgen Ashine, 
  • Yehenew Asmamaw Ebstie, 
  • Rayyan Ibrahim, 
  • Adrienne Epstein, 
  • John Bradley, 
  • Mujahid Nouredayem, 
  • Mikiyas G. Michael, 
  • Amani Sidiahmed, 
  • Nigatu Negash, 

PLOS

  • Published: September 3, 2024
  • https://doi.org/10.1371/journal.pone.0309058

Endemic African malaria vectors are poorly adapted to typical urban ecologies. However, Anopheles stephensi, an urban malaria vector formerly confined to South Asia and the Persian Gulf, was recently detected in Africa and may change the epidemiology of malaria across the continent. Little is known about the public health implications of An. stephensi in Africa. This study is designed to assess the relative importance of household exposure to An. stephensi and endemic malaria vectors for malaria risk in urban Sudan and Ethiopia.

Case-control studies will be conducted in 3 urban settings (2 in Sudan, 1 in Ethiopia) to assess the association between presence of An. stephensi in and around households and malaria. Cases, defined as individuals positive for Plasmodium falciparum and/or P. vivax by microscopy/rapid diagnostic test (RDT), and controls, defined as age-matched individuals negative for P. falciparum and/or P. vivax by microscopy/RDT, will be recruited from public health facilities. Both household surveys and entomological surveillance for adult and immature mosquitoes will be conducted at participant homes within 48 hours of enrolment. Adult and immature mosquitoes will be identified by polymerase chain reaction (PCR). Conditional logistic regression will be used to estimate the association between presence of An. stephensi and malaria status, adjusted for co-occurrence of other malaria vectors and participant gender.

Conclusions

Findings from this study will provide evidence of the relative importance of An. stephensi for malaria burden in urban African settings, shedding light on the need for future intervention planning and policy development.

Citation: Ashine T, Ebstie YA, Ibrahim R, Epstein A, Bradley J, Nouredayem M, et al. (2024) Investigating the association between household exposure to Anopheles stephensi and malaria in Sudan and Ethiopia: A case-control study protocol. PLoS ONE 19(9): e0309058. https://doi.org/10.1371/journal.pone.0309058

Editor: Vivekanandhan Perumal, Chiang Mai University Faculty of Agriculture, THAILAND

Received: March 6, 2024; Accepted: August 6, 2024; Published: September 3, 2024

Copyright: © 2024 Ashine et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: No datasets were generated or analysed during the current study. All relevant data from this study will be made available upon study completion.

Funding: This work was supported by the National Institute for Health Research (NIHR) (using the UK’s Official Development Assistance (ODA) Funding) and Wellcome [220870/Z/20/Z] under the NIHR-Wellcome Partnership for Global Health Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. The views expressed are those of the authors and not necessarily those of Wellcome, the NIHR or the Department of Health and Social Care.

Competing interests: The authors have declared that no competing interests exist.

Abbreviations: CDC LT, CDC light trap; CI, confidence interval; ETH, Ethiopia; IRS, indoor residual spraying; ITN, insecticide-treated net; M, meter; PCR, polymerase chain reaction; SOP, standard operating procedure; SUD, Sudan; RDT, rapid diagnostic test

Introduction

Africa currently has the highest rate of urbanization of any continent. The United Nations estimates that the world’s urban population will increase by 2.5 billion by 2050, with 90 percent of the growth in Asia and Africa [1]. Historically, the malaria burden in Africa has been concentrated in rural areas because African malaria vectors are not well adapted to urban ecologies [2–4]. However, Anopheles stephensi, a species of mosquito formerly confined to South Asia and the Persian Gulf but recently identified in Africa [5, 6], may change the epidemiology of malaria across the African continent. An. stephensi thrives in urban environments and is a highly competent vector for both Plasmodium falciparum and P. vivax [7, 8]; it therefore constitutes a potential new threat to African malaria control and hopes of elimination [5–7].

The WHO Malaria Threats Map highlights the current state of knowledge on An. stephensi detections across Africa, with detections so far in Djibouti, Ethiopia, Sudan, Puntland, Nigeria, Somaliland, Ghana, Eritrea and Kenya [9]. Larval habitats of An. stephensi are typically man-made containers such as household/community water storage containers, construction water storage and overhead tanks, wells and drums, but it has also been identified in stream margins, sewage overflows, and flooded areas [10–12]. An. stephensi is an opportunistic vector, with biting behaviour driven by availability of hosts. Preliminary entomological surveillance in Ethiopia has revealed a propensity for resting in animal shelters [13, 14]. Surveillance indicates both indoor and outdoor biting, indoor and outdoor resting, and a preference for biting at dusk and during the night [15]. Further investigation is needed to determine if these behaviours are observed in African populations of An. stephensi. It is possible that additional vector control strategies will be necessary to control An. stephensi, in addition to current vector control tools that target vectors indoors (insecticide-treated nets [ITNs] and indoor residual spraying [IRS]).

There is a critical need to understand the public health impact of the threat posed by An. stephensi. An efficient urban malaria transmission cycle could turn cities and towns from areas with minimal malaria transmission to large-scale sources of infection, confounding global elimination efforts. An. stephensi was first detected in Djibouti in 2012 [5], where it was associated with a significant rise in malaria cases [7], from 1,684 malaria cases in 2013 to 72,332 confirmed cases reported in 2020 [16]. A recent dry season malaria outbreak in Dire Dawa, Eastern Ethiopia, appears to be associated with An. stephensi [17]. Mathematical modelling also suggests that over 100 million people in cities across Africa are at risk of An. stephensi-mediated malaria transmission [15]. Modelling by Hamlet et al. suggests that annual P. falciparum malaria cases in Ethiopia could increase by 50% (95% CI 14–90) if no additional interventions are implemented [18]. Despite this growing evidence suggesting the potential involvement of An. stephensi in malaria transmission, there is great uncertainty about the malaria epidemiology in towns and cities in Sudan and Ethiopia; in particular, it is not known if, or to what extent, An. stephensi contributes to malaria transmission compared to native malaria vectors.

This manuscript describes a study protocol for a case-control study aimed at assessing the relative importance of An. stephensi and endemic malaria vectors for malaria burden in urban Sudan and Ethiopia. Case-control studies have been underused for malaria [19–23] but are well suited for our purpose since malaria case numbers are currently low in urban settings [24–26]. Results from this study will guide public health strategies for malaria control and elimination.

Materials and methods

Study design

We will conduct a community-based, age-matched case-control study in three study sites: two in Sudan and one in Ethiopia. Within these sites, cases and controls will be selected from health facilities over a 12-month period. Upon identification of cases and controls, entomological surveillance will be conducted at study participant households to assess the association between household entomological exposure and malaria case status.

Study areas

Study sites have been selected using the following criteria: (1) presence of An. stephensi; (2) heterogeneity of An. stephensi density across a site; (3) ongoing malaria transmission; and (4) accessibility by study teams with minimal security threats. The entomological criteria (#1 and #2) were assessed through ongoing entomological surveillance conducted by study teams at 61 sites in Sudan and 28 sites in Ethiopia. The epidemiological criterion (#3) was assessed using Health Management Information Systems (HMIS) data.

In Sudan, the study will be conducted at two sites: Tuti Island in Khartoum State (15.621457, 32.504861) and Almaelig in Gezira State (15.017992, 33.094729) (Fig 1). An. arabiensis is considered the major malaria vector in both sites [27]. Tuti Island is an eight square kilometre island situated where the White Nile and Blue Nile meet in Sudan’s capital city, Khartoum, with a population of approximately 37,702. The island has a settlement, vegetable farms and orchards and is connected to the city via a single suspension bridge. An. stephensi was first detected in Tuti Island in 2018 [27] and a 2022 entomological survey found that 28% of randomly selected households had An. stephensi adults or larvae within 50 m (Kafy et al., unpublished). Malaria transmission is seasonal, with a peak from October to December [28]. There is a single public health facility on Tuti Island from which cases will be recruited. The second site in Sudan, Almaelig, is a small town with a population of approximately 15,370. Almaelig is in the Gezira irrigation scheme, which produces cotton, wheat, and groundnut. Presence of An. stephensi in Almaelig was first identified by our team in 2022, with 24% of randomly selected households positive for An. stephensi adults or larvae within 50 m of the home (Kafy et al., unpublished). As on Tuti Island, malaria transmission is seasonal and malaria is transmitted by An. arabiensis (Kafy et al., unpublished). A total of four health facilities in Almaelig town and neighbouring settlements will be included for case and control selection: Almaelig Hospital, Aldibaiba, Alrayhana and Marakraka.


https://doi.org/10.1371/journal.pone.0309058.g001

In Ethiopia, the study will be conducted in Metehara town, Oromia Region (8.903536, 39.917514) (Fig 1). Metehara town has a population of approximately 47,661. The main commercial activities are services, retail, and farming, including subsistence farming and a large government-owned sugar cane plantation, which supplies the government-owned Metehara Sugar Factory located south of the town. Malaria transmission is seasonal with peaks from September to December. In Ethiopia, An. arabiensis is the main malaria vector while An. pharoensis, An. funestus and An. nili are secondary vectors [29]. Our team and others previously reported presence of An. stephensi in Metehara in 2022, with 18% of randomly selected households positive for either An. stephensi adults or larvae within 50 m [30]. There are several public, faith-based, and private health facilities in the town. The study will recruit cases and controls at two public health centres, one serving Kebele 1 (Dire Gobu) and the other serving Kebele 2 (Haro Adi).

Sample size

Entomological surveillance conducted in 2022 in and around 50 randomly selected households at each site informed the sample size calculations. Sampling for larval mosquitoes was conducted within 50m of each selected household. Similarly, adult collections were performed indoors and outdoors (within 50m of the household in Sudan, and within the compound in Ethiopia) using Prokopack aspirators and Centers for Disease Control miniature light traps (CDC LT). The proportion of households positive for An . stephensi larvae and/or adults was considered in the exposure probability for controls. A ratio of 1 case to 2 controls was adopted to increase statistical power given the identification of cases was thought to be the limiting factor in both settings.

In Sudan, across Tuti Island and Almaelig, 26% of households were exposed to An . stephensi adults or larvae (Kafy et al, unpublished). Assuming this (26%) exposure probability among controls, a 1:2 ratio of cases to controls, an odds ratio of 1.5, 20% correlation of exposure between cases and controls, 80% power, and a significance level of 5%, a total of 407 cases and 814 matched controls will be required. In Ethiopia, 18% of households in Metehara were exposed to An . stephensi adults or larvae (Ashine et al, unpublished). Assuming this (18%) exposure probability among controls, a 1:2 ratio of cases to controls, an odds ratio of 1.5, 20% correlation of exposure between cases and controls, 80% power, and a significance level of 5%, a total of 514 cases and 1028 matched controls will be required. Sample size calculations were run using the power mcc command in Stata (StataCorp. 2015. Stata Statistical Software: Release 14. College Station, TX: StataCorp LP.).
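The inputs to the calculation above can be checked with a little arithmetic. The sketch below does not reproduce Stata's `power mcc` computation (which also uses the 20% exposure correlation, the power and the significance level); it only derives the exposure probability among cases implied by the stated control exposure and odds ratio, and confirms the 1:2 control counts. The function names are hypothetical.

```python
def exposure_among_cases(p0: float, odds_ratio: float) -> float:
    """Exposure probability among cases implied by the exposure
    probability among controls (p0) and a target odds ratio."""
    return odds_ratio * p0 / (1.0 + p0 * (odds_ratio - 1.0))


def controls_needed(n_cases: int, controls_per_case: int = 2) -> int:
    """Number of matched controls for a fixed case-to-control ratio."""
    return n_cases * controls_per_case


# Values taken from the protocol text: control exposure and required cases.
SITES = {
    "Sudan": {"p0": 0.26, "cases": 407},
    "Ethiopia": {"p0": 0.18, "cases": 514},
}

for country, site in SITES.items():
    p1 = exposure_among_cases(site["p0"], odds_ratio=1.5)
    n_controls = controls_needed(site["cases"])
    print(f"{country}: implied case exposure {p1:.3f}, "
          f"{site['cases']} cases, {n_controls} controls")
```

With an odds ratio of 1.5, the implied exposure among cases is roughly 35% in Sudan and 25% in Ethiopia, which is the difference the study is powered to detect.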

Case and control definitions

The criteria for defining cases and controls are presented in Table 1. In both countries, cases and controls, or their caregivers, must provide voluntary written informed consent to participate in the study, reside within the catchment of the health facility (within 30 minutes journey time) and be willing to be visited at home for additional data collection. In Ethiopia, an additional criterion is that the participant must have been living in the study area for at least 4 weeks. This will restrict recruitment of non-locally acquired cases, given that seasonal workers migrate to Metehara to work in the sugarcane plantation.

The age range of cases and controls will differ between Sudan and Ethiopia. In Sudan, malaria cases are still predominantly in children; cases will therefore be aged greater than 6 months and less than 12 years. Controls will be matched to cases on two age groups: 6 months to less than 5 years; and 5 years and above to less than 12 years. In Ethiopia, HMIS data indicate that all ages are at risk for malaria; as such, cases and controls will be above 6 months of age, with controls matched to cases on three age groups: above 6 months to less than 5 years; 5 years or above to less than 18 years; and 18 years and above.

In Sudan, cases must be positive for P. falciparum and/or P. vivax detected using a rapid diagnostic test (RDT) (Bioline™ Malaria Ag P.f/P.v test), while controls must test negative by RDT. In Ethiopia, concerns over pfhrp2/3 gene deletions [31] and the lack of an appropriate and approved RDT mean that cases must be positive for P. falciparum and/or P. vivax detected using microscopy, while controls must test negative by microscopy.

Cases must have fever (axillary temperature ≥37.5°C) at the time of presentation or a history of fever within the previous 48 hours, while controls must have neither fever (axillary temperature <37.5°C) nor a history of fever within the previous 48 hours. Cases and controls should have no signs or symptoms suggesting progression to severe malaria and should not have a history of malaria treatment in the preceding two weeks. Controls must attend the same health centre as their matched case within 72 hours of case identification.
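The age-group matching described above can be sketched as a small lookup. This is an illustrative sketch only: the function names, the `SUD`/`ETH` keys (borrowed from the Table 1 acronyms) and the use of decimal years are assumptions, and the lower bound is treated as strictly above 6 months, following the wording for cases.

```python
from typing import Optional

# Upper bounds (exclusive, in years) and labels for each matching age group.
AGE_BANDS = {
    "SUD": [(5.0, "6 months to <5 years"), (12.0, "5 to <12 years")],
    "ETH": [
        (5.0, "6 months to <5 years"),
        (18.0, "5 to <18 years"),
        (float("inf"), "18 years and above"),
    ],
}


def age_group(country: str, age_years: float) -> Optional[str]:
    """Return the matching age group, or None if the age is out of range."""
    if age_years <= 0.5:  # participants must be older than 6 months
        return None
    for upper, label in AGE_BANDS[country]:
        if age_years < upper:
            return label
    return None  # e.g. 12 years or older in Sudan


def same_age_group(country: str, case_age: float, control_age: float) -> bool:
    """True when a candidate control falls in the same age group as the case."""
    group = age_group(country, case_age)
    return group is not None and group == age_group(country, control_age)
```

For example, under these assumptions a 7-year-old control could be matched to a 10-year-old case in Sudan, but not to a 3-year-old case.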

Country specific criteria are indicated with acronyms, SUD for Sudan and ETH for Ethiopia.

https://doi.org/10.1371/journal.pone.0309058.t001

Study procedures

Identification and enrolment of study participants and blood sample collection.

At each health facility, facility staff trained by the project teams will be responsible for screening and enrolment, in collaboration with the rest of the health facility staff. Potential study participants attending the health centre as outpatients, or their caregivers, will be approached and invited to participate in the study. The individual or their caregiver will be provided with information about the study, have an opportunity to ask questions, and will be asked to consider providing written informed consent to participate in the study. Children aged 8 years and above in Sudan and aged 11–17 years in Ethiopia will be asked to provide verbal assent to participate in the study. If the child does not assent, then they will not be included.

In both Sudan and Ethiopia, recruitment will be integrated with standard care as much as possible, with RDTs and microscopy performed by health facility laboratory staff. A finger-prick blood sample will be taken to perform an RDT (Bioline™ Malaria Ag P.f/P.v test) in Sudan or microscopy in Ethiopia, where thick and thin blood films will be prepared and stained with Giemsa. In addition, dried blood spots will be collected on filter paper and analysed for the presence of the 18S rRNA gene using PCR [32, 33]. Dried blood spots will also be stored for future molecular analyses to determine the proportion of P. vivax cases that are relapses and for whole genome sequencing of all species of Plasmodium [34, 35].

Household survey.

Cases and controls will be visited at home by study personnel within 48 hours of enrolment.

Study fieldworkers will conduct a household survey to collect information on several variables ( Table 2 ), including demographics, house structure, use of malaria preventive methods, and environment. Socio-economic status will be assessed using an asset index from an established questionnaire [ 36 ]. The condition of ITNs will be assessed and a proportionate hole index calculated according to established methods [ 37 ]. Households will also be mapped using a GPS receiver.

https://doi.org/10.1371/journal.pone.0309058.t002

Entomological survey.

During the household visit, fieldworkers will conduct entomological surveillance for both adult and immature mosquitoes inside and outside the participant household (within 50 m). Adult mosquito surveillance will be performed both indoors and outdoors using CDC LTs and Prokopack aspirators (both John W. Hock Company, Florida, USA). CDC LTs will be placed indoors at the foot end of an occupied bed net, 1.5 m off the floor, and collections will be conducted for a 12-hour period overnight. Prokopack aspiration will be conducted inside and outside the house between 5:00 and 6:00 AM, before the CDC LT is retrieved. Collections will be performed for 20 minutes, with an additional 10 minutes for any out-buildings sheltering animals. Prokopack aspiration will also be performed in potential outdoor resting sites such as house exteriors, grain or rice stores, wood stacks, water pipes, drains, wells, trees, and bushes. In addition, Biogents Pro CDC-style traps (Biogents AG, Regensburg, Germany) will be deployed outdoors. BG Pro traps will be placed outside the main dwelling structure close to presumed mosquito resting places, suspended 1.5 m off the floor, and out of direct sunlight, wind, or heavy rain. The traps will be deployed with BG Lures releasing artificial human skin odour (Biogents AG, Regensburg, Germany) and a light, and will run for 12 hours overnight. Immature collections will be done by systematically sampling aquatic habitats within 50 m of the study participant's house using a standard dipper [38]. Both adult and larval collections will be done on one occasion only.

Additional entomological surveillance.

We will attempt to determine whether collections of larvae and adult anophelines in and around the homes of cases and controls shortly after recruitment can be used to infer the presence of larvae or adults, and mosquito density, at the time of malaria transmission, approximately 10 days previously. Here we will adopt a similar approach to a previous case-control study conducted in The Gambia [19]. Entomological surveillance of larvae and adult anophelines will be performed within 48 hours of recruitment and repeated 10 days later for 20% of all cases and controls. The presence or absence of larvae and adults, and the density of An. stephensi and other anopheline species, will be compared between the two catches in the same location for quality assurance.

Due to low catch numbers of adult An . stephensi in Ethiopia, additional entomological surveillance will be conducted every two months. An initial census conducted in December 2023 suggests the presence of permissive and An . stephensi positive aquatic habitats across Metehara. The two Kebeles (Dire Gobu and Haro Adi) will be divided into 8 quadrants (4 quadrants per Kebele) and sampling efforts allocated to each proportional to household and potential aquatic habitat density. Every 2 months, permissive aquatic habitats identified in the census will be surveyed for An . stephensi , along with a subset of habitats selected purposively during each sampling round. Human and animal structures within 20 metres of each aquatic habitat will be surveyed for adult An . stephensi using Prokopack aspiration, CDC LT and BG Pro traps.

Mosquito species identification.

For adult collections, mosquitoes will be sorted into anophelines and culicines. Anophelines will be identified morphologically [39]. Morphological identification of adults and larvae will be confirmed using polymerase chain reaction (PCR) [40]. To reduce the number of PCRs run, larvae will be pooled into groups of 20 samples according to their catch location.
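The pooling step can be sketched as grouping larvae by catch location and then splitting each group into pools of at most 20. This is a minimal sketch under stated assumptions: the identifiers and the `pool_larvae` name are hypothetical, pools never mix locations, and a location with fewer than 20 remaining larvae yields a smaller final pool (the protocol does not say how remainders are handled).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def pool_larvae(
    samples: Iterable[Tuple[str, str]], pool_size: int = 20
) -> List[Tuple[str, List[str]]]:
    """Group (sample_id, location) records into per-location pools."""
    by_location: Dict[str, List[str]] = defaultdict(list)
    for sample_id, location in samples:
        by_location[location].append(sample_id)

    pools: List[Tuple[str, List[str]]] = []
    for location, ids in by_location.items():
        # Slice each location's larvae into consecutive chunks of pool_size.
        for start in range(0, len(ids), pool_size):
            pools.append((location, ids[start:start + pool_size]))
    return pools


# Hypothetical catch: 45 larvae at one habitat and 5 at another.
catch = [(f"A{i}", "habitat-A") for i in range(45)]
catch += [(f"B{i}", "habitat-B") for i in range(5)]
for location, ids in pool_larvae(catch):
    print(location, len(ids))
```

In this example 50 larvae need four PCR pools rather than 50 individual reactions; a positive pool would still need follow-up to identify which larvae it came from.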

Blood meal source and infection rate determination.

Blood meal analysis of adults will be conducted; abdomens of freshly blood-fed An . stephensi and other malaria vectors will be subjected to amplification [ 41 ]. Furthermore, adult An . stephensi and a random sample of other malaria vectors will be screened for P . falciparum and P . vivax DNA. qPCR amplification will target the SSU RNA gene with species-specific primers [ 32 ].

Data management

Study data will be collected and managed using REDCap [ 42 , 43 ] electronic data capture tools hosted at the Liverpool School of Tropical Medicine (for Sudan) and Armauer Hansen Research Institute (for Ethiopia).

Statistical analysis plan

The primary analysis for this study will include three separate exposure variables: 1) presence/absence of adult An . stephensi in and around the household; 2) presence/absence of immature An . stephensi in and around the household; and 3) combined presence/absence of adults and/or immature An . stephensi in and around the household. Should sufficient An . stephensi be caught, additional analyses will assess the impact of adult vector density on malaria status. Association between exposure variables and case status will be assessed using conditional logistic regression. Covariates will include co-occurrence of other anopheline vectors and participant gender. Effect estimates will be expressed as odds ratios with 95% confidence intervals.
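To make the analysis step concrete, the sketch below fits a conditional logistic regression with a single binary exposure and no covariates, using only the standard library. It is a simplified illustration, not the authors' analysis code: the planned models also adjust for covariates, and in practice a statistical package would be used. Each matched set is a tuple of the case's exposure and a list of its controls' exposures; the data and function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

MatchedSet = Tuple[int, Sequence[int]]  # (case exposure, control exposures)


def _moments(beta: float, xs: Sequence[int]) -> Tuple[float, float]:
    """Weighted mean and variance of exposure within one matched set."""
    weights = [math.exp(beta * x) for x in xs]
    denom = sum(weights)
    mean = sum(w * x for w, x in zip(weights, xs)) / denom
    mean_sq = sum(w * x * x for w, x in zip(weights, xs)) / denom
    return mean, mean_sq - mean * mean


def score(beta: float, sets: List[MatchedSet]) -> float:
    """Derivative of the conditional log-likelihood with respect to beta."""
    return sum(cx - _moments(beta, [cx, *ctl])[0] for cx, ctl in sets)


def information(beta: float, sets: List[MatchedSet]) -> float:
    """Observed information (negative second derivative)."""
    return sum(_moments(beta, [cx, *ctl])[1] for cx, ctl in sets)


def fit_conditional_or(sets: List[MatchedSet]) -> Tuple[float, float, float]:
    """Return (odds ratio, lower 95% limit, upper 95% limit)."""
    lo, hi = -10.0, 10.0
    if score(lo, sets) <= 0 or score(hi, sets) >= 0:
        raise ValueError("no finite estimate: too few discordant sets")
    # The score is decreasing in beta, so bisection finds its root.
    while hi - lo > 1e-10:
        mid = (lo + hi) / 2
        if score(mid, sets) > 0:
            lo = mid
        else:
            hi = mid
    beta = (lo + hi) / 2
    se = 1.0 / math.sqrt(information(beta, sets))  # Wald standard error
    return math.exp(beta), math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)


# Hypothetical 1:2 matched sets: case exposure, then two control exposures.
example = [(1, [0, 0]), (1, [1, 0]), (0, [1, 0]), (1, [0, 0]), (0, [0, 0]), (0, [0, 1])]
print(fit_conditional_or(example))
```

Sets in which the case and all controls share the same exposure contribute nothing to the estimate, which is why matched designs depend on discordant sets.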

A similar approach to the above will be used to assess the associations between other key risk factors for malaria and malaria case status. These include (but are not limited to) presence/absence and density of adult and/or immature endemic malaria vectors, ITN use, ITN quality, travel history, and proximity of the sleeping space to animals. The distance between case and control households and An . stephensi positive habitats or structures identified in the overlaid entomological surveillance will be assessed, taking into account the most recent surveillance round only.

Sensitivity analysis.

While the primary analysis will include all cases and controls recruited into the study, a sensitivity analysis will be conducted to adjust for potential misclassification of cases and controls. Cases with negative PCR for Plasmodium species will be excluded, as will controls with positive PCR results.
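The exclusion rule for the sensitivity analysis can be sketched as a simple filter. The record layout (`status`, `pcr_positive`) and the function name are assumptions for illustration; how matched sets that lose their case or a control are then handled is not stated in the protocol and is not addressed here.

```python
from typing import Dict, List


def sensitivity_subset(participants: List[Dict]) -> List[Dict]:
    """Drop PCR-negative cases and PCR-positive controls."""
    kept = []
    for person in participants:
        is_case = person["status"] == "case"
        if is_case and not person["pcr_positive"]:
            continue  # case not confirmed by PCR
        if not is_case and person["pcr_positive"]:
            continue  # control with a PCR-detected infection
        kept.append(person)
    return kept


# Hypothetical records for four participants.
records = [
    {"id": 1, "status": "case", "pcr_positive": True},
    {"id": 2, "status": "case", "pcr_positive": False},
    {"id": 3, "status": "control", "pcr_positive": False},
    {"id": 4, "status": "control", "pcr_positive": True},
]
print([person["id"] for person in sensitivity_subset(records)])
```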

Community sensitisation

Meetings have been held to introduce the study to relevant stakeholders in each country, including the Federal Ministry of Health, Regional Health Bureau, and City Administration Health Office in Ethiopia and Federal and State Ministry of Health, Locality Health Affairs, and the facility director in Sudan. During these meetings we presented the objectives of the research, research activities and potential implications for policy. Furthermore, regional health bureaus from the selected sites have been informed of the research and the methods involved. Prior to the start of the study, we will hold meetings with community leaders in the study sites (e.g. urban dwellers association in Ethiopia, Development & Public Services committees, and Health Facility Development Committee in Sudan) to inform them about the project and give them an opportunity to ask questions.

This study received ethical approval from the Liverpool School of Tropical Medicine Research Ethics Committee (Ref # 22–005, Ethiopia: 27 Jan 2023, Sudan: 18 Aug 2022), the London School of Hygiene and Tropical Medicine Research Ethics Committee (Ref # 28287, 24 Nov 2022), the Republic of Sudan National Health Research Ethics Review Committee (Ref # 8-1-21, 7 Feb 2021), the Armauer Hansen Research Institute Ethics Review Committee (Ref # PO-35-22, 20 Aug 2022), and the National Ethics Review Committee in Ethiopia (Ref # 1724642123, 23 Jan 2023).

We will adhere to guidelines set forth by the Declaration of Helsinki. All staff and investigators will receive training on human subjects protection, safeguarding and study-specific standard operating procedures (SOPs). Data collection activities (both epidemiological and entomological) will require written informed consent from the study participant or their caregiver. Consent forms are translated into local languages and explain in-depth the purpose of the study, procedures, risks and benefits, and the voluntary nature of participation.

Dissemination

Findings will be disseminated in the study health facilities and communities through meetings to which local community members and local stakeholders will be invited. During these meetings, plain language summaries of the key findings of the study will be presented. Findings will be presented through meetings and briefs to stakeholders including the Federal Ministry of Health and National Malaria Control Programme. Internationally, findings from this work will be presented through peer-reviewed publications and presentations at scientific conferences and international policy fora.

This case-control study will provide evidence of the relative impact of An . stephensi on malaria in Ethiopian and Sudanese urban settings as compared to endemic Anopheles species. Recruiting cases and controls from health facilities and conducting entomological surveillance in and around houses will determine whether household-level exposure to An . stephensi is associated with greater risk of malaria. This study employs a unique design, pairing epidemiological principles with entomological surveillance.

To date, little is known about the impact of An. stephensi on malaria burden in urban Africa, but some preliminary evidence points to a potential positive association. A 2019 paper from Djibouti describes a simultaneous rise in An. stephensi occurrence and malaria cases from 2013–2017, including an increase in P. vivax cases [7]. While it cannot support causal conclusions, this paper provides descriptive evidence that the rise in An. stephensi may have contributed to malaria transmission. In support of this finding, the authors detected a 3.1% sporozoite rate among captured adult An. stephensi females. Furthermore, a case-control study conducted during a dry season malaria outbreak in 2022 in Dire Dawa, Eastern Ethiopia found that the presence of An. stephensi adults and/or larvae was associated with 3.30 times the odds of malaria (95% CI 1.65–6.47) [17]. Importantly, abundance of An. stephensi in this setting was markedly high: all Anopheles larvae collected from artificial containers were identified as An. stephensi, and 97% of adult Anopheles mosquitoes were An. stephensi. The case-control study proposed here will further contribute to this body of evidence, expanding both geographically and temporally, and allowing for the assessment of potential seasonal changes in the associations between An. stephensi and malaria risk through the 12-month study period. This study will also include important urban settings, including those that rely heavily on irrigation schemes and plantations.

This study design is not without challenges and limitations. Firstly, the design assumes that entomological exposures around households are relatively stable. To test this assumption, entomological surveillance will be repeated after 10 days for a subset of all cases and controls. If the correlation between entomological collections is low, findings from this study may be attenuated. Secondly, this study will assess household-level exposure to vectors, yet malaria transmission could take place outside of the household, in the workplace or during travel. To better understand this, household surveys will collect information on employment, school attendance, and travel history. Finally, from an operational perspective, harmonizing the study in two countries with distinct settings and malaria profiles poses challenges. Substantial work has been done to tailor the studies to each setting, while maintaining the core design across the two countries. Civil unrest in Sudan beginning in April 2023 presents further challenges to the implementation of the study; for this reason, protocol details, including study locations, are subject to change. Civil unrest in Ethiopia close to Metehara may also hinder study progress.

This study will address the critical need to understand the public health impact of the threat posed by An . stephensi . With this understanding, future work can evaluate how existing control interventions can be used against the vector and develop additional strategies to protect urban populations from malaria.

Acknowledgments

We would like to thank the Sudanese Federal Ministry of Health, the Sudanese National Malaria Control Programme, the Entomological Surveillance teams in the 9 states of Sudan included in the research, the Ethiopian Federal Ministry of Health, the Ethiopian National Malaria Elimination Programme, State and District health offices, health facility staff and local communities in the study sites for their kind cooperation with this study.

  • 1. United Nations, 2018 Revision of World Urbanization Prospects 2018, UN: Geneva.
  • 9. World Health Organization. Malaria Threats Map—Invasive Vector Species 2020 19 Nov 2020].
  • 16. World Health Organization, World Malaria Report 2021. 2021, WHO: Geneva.
  • 17. Tadesse F., et al., Anopheles stephensi is implicated in an outbreak of Plasmodium falciparum parasites that carry markers of drug and diagnostic resistance in Dire Dawa City, Ethiopia, January–July 2022. 2023, Research Square.
  • 29. Ethiopian Federal Ministry of Health, National Malaria Guidelines. 2018.
  • 37. WHO, WHO guidance note for estimating the longevity of long-lasting insecticidal nets in malaria control. 2013, World Health Organization: Geneva.
  • 38. Silver J, Mosquito Ecology: Field Sampling Methods. 2008: Springer.

Case studies in applied epidemiology

Richard C. Dicker

1 Workforce and Institute Development Branch, Division of Global Health Protection, Center for Global Health, U.S. Centers for Disease Control and Prevention, Atlanta, Georgia, US

A hallmark of field epidemiology training is its focus on acquisition of practical epidemiologic knowledge and skills to address priority public health issues. The training must prepare the trainee to conduct the core functions of a field epidemiologist – investigate outbreaks, conduct public health surveillance, collect and analyze data, use epidemiologic judgment, and communicate effectively. While these functions or competencies are best learned through practice in the field under the guidance of experienced mentors, even the classroom component that usually precedes the fieldwork can help prepare the trainee. For example, to supplement a lecture on the steps of an outbreak investigation, the unfolding circumstances of an actual outbreak can be presented in the classroom, and trainees could be asked what decisions they would make, what hypotheses they would consider, what statistics they might calculate (and given the data, calculate them), what conclusions they might draw from the data, and so on.

The first outbreak known to be used in this way to teach epidemiologic field investigation principles and methods is the now legendary outbreak of gastroenteritis following a church supper in Oswego, New York in 1940. The Oswego Problem was used as a teaching example at the nearby Albany Medical College in 1942. Alexander Langmuir brought Oswego to the Communicable Disease Center (CDC, now the Centers for Disease Control and Prevention), where he used it to teach outbreak investigation to the first cohort of Epidemic Intelligence Service (EIS) Officers in 1951 [ 1 ]. Oswego was soon followed by Epidemic Disease in South Carolina and many others.

For many years these teaching examples were called "problems" or "exercises", but neither term seemed entirely satisfactory as a descriptor. In 1988, EIS training staff looked for a new name, and settled on “case study.” However, these applied epidemiology case studies differ in a number of ways from what are called case studies in other disciplines, particularly the case study based on a single patient in clinical medicine or psychology, or the case studies used in business schools. Business-school case studies are in-depth stories, ranging from a few pages to over a hundred pages in length, that present issues for which one or more decisions are needed, often without a right answer; a limited number of questions are posed at the end [ 2 ]. Students read the case study as homework, and come to class prepared to discuss the questions. In contrast, applied epidemiology case studies are usually read by the trainees in class, often out loud, with trainees stopping to answer questions that are interspersed throughout, without looking ahead. The questions can ask for a decision, but often they instruct the trainees to perform calculations, draw graphs, generate lists, interpret data, or consider the pros and cons of different approaches.

Applied epidemiology case studies quickly became a cornerstone of the classroom modules of the EIS Program, and later of Field Epidemiology Training Programs (FETPs) around the world. One reason is that they are consistent with the principles of adult learning. They are job-relevant, focusing on tasks, skills, and knowledge of surveillance, outbreak investigation, and data analysis / interpretation that trainees will use during their field placements. They use actual examples to reinforce concepts. They require active participation and problem solving. They are used in a collaborative environment in which trainees contribute to the learning based on their own experience and insights, so trainees learn from their peers as well as from the instructors. In addition, they are fun, consistently receiving the highest ratings on student evaluations of any component of classroom training.

Trainees always prefer that teaching materials use local examples to which they can readily relate. But writing a new case study requires considerable time, effort, and access to the original data, and few FETPs have taken the initiative. Consequently, until recently, most case studies used by FETPs were borrowed from the EIS program, and were U.S.-based.

To address the dearth of locally developed case studies, particularly in Africa, the Africa Field Epidemiology Network (AFENET) and Emory University’s Rollins School of Public Health designed and conducted a workshop to guide African FETP staff or partners in the development of local case studies. The case studies included in this supplement are the products of that workshop. We are pleased to add these case studies to the library of training materials available to FETP trainers, university faculty, and others who wish to teach field epidemiology in an engaging and interactive way.

Competing interest

The authors declare no competing interest.

Can Algorithms Replace Expert Knowledge for Causal Inference? A Case Study on Novice Use of Causal Discovery

Rajesh Gururaghavendran, Eleanor J Murray, Can Algorithms Replace Expert Knowledge for Causal Inference? A Case Study on Novice Use of Causal Discovery, American Journal of Epidemiology, 2024, kwae338, https://doi.org/10.1093/aje/kwae338

With growing interest in causal inference and machine learning among epidemiologists, there is increasing discussion of causal discovery algorithms for guiding covariate selection. We present a case study of novice application of causal discovery tools and attempt to validate the results against a well-established causal relationship.

As a case study, we attempted causal discovery of relationships relevant to the effect of adherence on mortality in the placebo arm of the Coronary Drug Project (CDP) dataset. We used four algorithms available as existing software implementations and varied several model inputs.

We identified 15 adjustment sets from 17 model parameterizations. When applied to a baseline covariate adjustment analysis, these 15 adjustment sets returned effect estimates with similar magnitude and direction of bias as prior published results. When methods to control for time-varying confounding were used, there was generally more residual bias than with expert-selected adjustment sets.

Although causal discovery algorithms can perform on par with expert knowledge, we do not recommend novice use of causal discovery without the input of experts in causal discovery. Expert support is recommended to aid in choosing the algorithm, selecting input parameters, assessing underlying assumptions, and finalizing selection of the adjustment variables.

  • epidemiologic causality
  • graphical displays
  • machine learning
  • epidemiologists

COMMENTS

  1. Epidemiology in Practice: Case-Control Studies

    Introduction. A case-control study is designed to help determine if an exposure is associated with an outcome (i.e., disease or condition of interest). In theory, the case-control study can be described simply. First, identify the cases (a group known to have the outcome) and the controls (a group known to be free of the outcome).

  2. Case Control Studies

    A case-control study is a type of observational study commonly used to look at factors associated with diseases or outcomes.[1] The case-control study starts with a group of cases, which are the individuals who have the outcome of interest. The researcher then tries to construct a second group of individuals called the controls, who are similar to the case individuals but do not have the ...

  3. Case Reports and Case Series

    Case Series. A case series is a report on the characteristics of a group of subjects who all have a particular disease or condition. Common features among the group may suggest hypotheses about disease causation. Note that the "series" may be small (as in the example below) or it may be large (hundreds or thousands of "cases").

  4. What Is a Case-Control Study?

    Case-control studies are a type of observational study often used in fields like medical research, environmental health, or epidemiology. While most observational studies are qualitative in nature, case-control studies can also be quantitative, and they often are in healthcare settings. Case-control studies can be used for both exploratory and ...

  5. Epidemiologic Case Study Resources

    Case Studies in Public Health by Theodore H. Tulchinsky The cases chosen for this collection include those on traditional public health, such as sanitation, hygiene and infectious disease control, and also the organization, financing and quality of health care services. Each case study is presented in a systematic fashion to facilitate learning, with the case, background, current relevance ...

  6. Introduction to Epidemiological Studies

    The basic epidemiological study designs are cross-sectional, case-control, and cohort studies. Cross-sectional studies provide a snapshot of a population by determining both exposures and outcomes at one time point. Cohort studies identify the study groups based on the exposure and, then, the researchers follow up study participants to measure ...

  7. Case Control Studies

    A case-control study is a type of observational study commonly used to look at factors associated with diseases or outcomes. The case-control study starts with a group of cases, which are the individuals who have the outcome of interest. The researcher then tries to construct a second group of individuals called the controls, who are similar to ...

  8. Case-control study

    A case-control study (also known as case-referent study) is a type of observational study in which two existing groups differing in outcome are identified and compared on the basis of some supposed causal attribute. ... Porta's Dictionary of Epidemiology defines the case-control study as: "an observational epidemiological study of persons ...

  9. Case-control study

    case-control study, in epidemiology, observational (nonexperimental) study design used to ascertain information on differences in suspected exposures and outcomes between individuals with a disease of interest (cases) and comparable individuals who do not have the disease (controls). Analysis yields an odds ratio (OR) that reflects the relative ...

  10. Descriptive and Analytical Epidemiology

    The most commonly used types of analytical studies in epidemiology are ecological studies (correlational studies), case-control studies, cohort studies, and experimental studies. ... 4.2 Case-Control Studies. A case-control study is a type of analytical study that can be conducted relatively quickly and with smaller sample sizes. Such ...

  11. Study Designs

    Figure 3.3 describes five types of observational study designs: case series, ecologic studies, cross-sectional studies, case-control studies, and cohort studies. From left to right, the designs are listed in order of the strength of their evidence (weakest to strongest).

  12. Classroom Case Studies

    The epidemiologic case studies for the classroom are based on real-life outbreaks and public health problems. They were developed in collaboration with the original investigators and experts from the Centers for Disease Control and Prevention (CDC). In these case studies, a group of students works through a public health problem with guidance ...

  13. Module 4

    Describe the design features and the advantages and weaknesses of each of the following study designs: Cross-sectional studies, ecological studies, retrospective and prospective cohort studies, case control studies, and intervention studies. Identify the study design when reading an article or abstract. Explain how different study designs can ...

  14. Epidemiology Of Study Design

    In epidemiology, researchers are interested in measuring or assessing the relationship of exposure with a disease or an outcome. As a first step, they define the hypothesis based on the research question and then decide which study design will be best suited to answer that question. How the researcher conducts the investigation is directed by the chosen study design. The study designs can be ...

  15. Introduction to Epidemiological Studies

    Epidemiology is defined as "the study of the occurrence and distribution of health-related events, states, and processes in specified populations, including the study of the determinants influencing such processes, and the application of this knowledge to control relevant health problems." It is apparent that the scope of Epidemiology is very wide and mainly includes the study of ...

  16. Designing and Conducting Analytic Studies in the Field

    Case-control studies are commonly performed in field epidemiology when a cohort study is impractical (e.g., no defined cohort or too many non-ill persons in the group to interview). Whereas a cohort study proceeds conceptually from exposure to disease or condition, a case-control study begins conceptually with the disease or condition and ...

  17. An Overview of Analytic Epidemiology

    Food contaminated by an infected food worker; produce irrigated/processed with contaminated water; shellfish from contaminated water; drinking feces-contaminated water; sexual (e.g., oral-anal contact). Incubation period: 15-50 days (avg. = 28-30). Most infectious from 1-2 weeks before symptoms until 1 week after.

  18. Chapter 1. What is epidemiology?

    Epidemiology is the study of how often diseases occur in different groups of people and why. Epidemiological information is used to plan and evaluate strategies to prevent illness and as a guide to the management of patients in whom disease has already developed. Like the clinical findings and pathology, the epidemiology of a disease is an ...

  19. Epidemiology Training & Resources|Epidemic Intelligence Service|CDC

    The CDC Field Epidemiology Manual — This manual serves as an essential resource for epidemiologists and other health professionals working in local, state, national, and international settings for effective outbreak response to acute and emerging threats. CDC EIS Case Studies in Applied Epidemiology — Collection of student versions of nine case studies used to train new officers in the ...

  20. CDC Epidemiology Case Studies

    CDC developed case studies in applied epidemiology based on real-life epidemiologic investigations and used them for training new Epidemic Intelligence Service (EIS) officers, CDC's "disease detectives." EIS offers these carefully crafted epidemiology case studies for schools of medicine, nursing, and public health to use as a ...

  21. Observational Studies: Cohort and Case-Control Studies

    Cohort studies and case-control studies are two primary types of observational studies that aid in evaluating associations between diseases and exposures. In this review article, we describe these study designs, methodological issues, and provide examples from the plastic surgery literature. Keywords: observational studies, case-control study ...

  22. Case-Crossover Study Design

    Case-crossover study designs are an approach to investigating acute triggers that are potentially causing diseases. Learn more about how to perform one. ... A Comparison of Epidemiologic Methods. Epidemiology 2005;16(1):58-66. Hebert C, Delaney JA, Hemmelgarn B, Levesque LE, Suissa S (2007) Benzodiazepines and elderly drivers: a comparison of ...

  23. Investigating the association between household exposure to Anopheles

    Background: Endemic African malaria vectors are poorly adapted to typical urban ecologies. However, Anopheles stephensi, an urban malaria vector formerly confined to South Asia and the Persian Gulf, was recently detected in Africa and may change the epidemiology of malaria across the continent. Little is known about the public health implications of An. stephensi in Africa. This study is ...

  24. 7.2: Epidemiology basics

    Introduction. To introduce conditional probability, I have elected to push you into epidemiology and risk analysis. We introduced epidemiology definitions and here, we build on basic terminology of epidemiology. Epidemiology is the study of the causes and distribution of health-related events in a population. It has been called the basic science of public health (p. 16, Decker 2008).

  25. Case studies in applied epidemiology

    In 1988, EIS training staff looked for a new name, and settled on "case study." However, these applied epidemiology case studies differ in a number of ways from what are called case studies in other disciplines, particularly the case study based on a single patient in clinical medicine or psychology or the case studies used in business ...

  26. Can Algorithms Replace Expert Knowledge for Causal Inference? A Case

    With growing interest in causal inference and machine learning among epidemiologists, there is increasing discussion of causal discovery algorithms for guiding covariate selection. We present a case study of novice application of causal discovery tools and attempt to validate the results against a well-established causal relationship.