Hypothesis Generation During Foodborne-Illness Outbreak Investigations
Alice E White, Kirk E Smith, Hillary Booth, Carlota Medus, Robert V Tauxe, Laura Gieraltowski, Elaine Scallan Walter, Hypothesis Generation During Foodborne-Illness Outbreak Investigations, American Journal of Epidemiology, Volume 190, Issue 10, October 2021, Pages 2188–2197, https://doi.org/10.1093/aje/kwab118
Hypothesis generation is a critical, but challenging, step in a foodborne outbreak investigation. The pathogens that contaminate food have many diverse reservoirs, resulting in seemingly limitless potential vehicles. Identifying a vehicle is particularly challenging for clusters detected through national pathogen-specific surveillance, because cases can be geographically dispersed and lack an obvious epidemiologic link. Moreover, state and local health departments could have limited resources to dedicate to cluster and outbreak investigations. These challenges underscore the importance of hypothesis generation during an outbreak investigation. In this review, we present a framework for hypothesis generation focusing on 3 primary sources of information, typically used in combination: 1) known sources of the pathogen causing illness; 2) person, place, and time characteristics of cases associated with the outbreak (descriptive data); and 3) case exposure assessment. Hypothesis generation can narrow the list of potential food vehicles and focus subsequent epidemiologic, laboratory, environmental, and traceback efforts, ensuring that time and resources are used more efficiently and increasing the likelihood of rapidly and conclusively implicating the contaminated food vehicle.
STEC: Shiga toxin-producing Escherichia coli
PFGE: pulsed-field gel electrophoresis
WGS: whole-genome sequencing
HGQ: hypothesis-generating questionnaire
Foodborne diseases are a continuing public health problem in the United States, where they cause an estimated 48 million illnesses, 128,000 hospitalizations, and 3,000 deaths annually ( 1 ). Public health and regulatory agencies rely on data from foodborne disease surveillance and outbreak investigations to prioritize food safety regulations, policies, and practices aimed at reducing the burden of disease ( 2 ). In particular, foodborne illness outbreaks provide critical information on the foods causing illness, common food-pathogen pairs, and high-risk production technologies and practices. However, only half of the foodborne outbreaks reported each year identify a pathogen, and fewer than half implicate a food vehicle, decreasing the utility of these data ( 3 ).
Figure 1. A model framework for hypothesis generation during a foodborne-illness outbreak investigation.
Foodborne disease outbreaks require rapid public health response to quickly identify potential sources and prevent future exposures; however, implicating a food vehicle in an outbreak can be challenging. The pathogens that contaminate food have many diverse reservoirs and can be transmitted in other ways (e.g., from one person to another or through contact with animals or contaminated water), resulting in seemingly limitless potential vehicles ( 2 ). Identifying a food vehicle is particularly challenging for clusters detected through national pathogen-specific surveillance: Cases can be geographically dispersed and lack an obvious epidemiologic link ( 4 ). Moreover, state and local health departments might have limited resources to dedicate to cluster and outbreak investigations ( 5 ). These challenges underscore the importance of hypothesis generation during an outbreak investigation. Hypothesis generation can narrow the list of potential food vehicles and focus subsequent epidemiologic, laboratory, environmental, and traceback efforts, ensuring that time and resources are used more efficiently and increasing the likelihood of timely identification of the vehicle. Timely investigations can prevent additional illnesses and increase the likelihood of identifying factors contributing to the outbreak.
The Integrated Food Safety Centers of Excellence were established in 2012 under the Food Safety Modernization Act to serve as resources for federal, state, and local public health professionals who detect and respond to foodborne illness outbreaks. The Integrated Food Safety Centers of Excellence aim to improve the quality of foodborne-illness outbreak investigations by providing public health professionals with training, tools, and model practices. In this paper, we provide a framework for generating hypotheses early during investigation of an outbreak or cluster detected through pathogen-specific surveillance; highlight tools to support rapid and effective hypothesis generation; and illustrate the practice of hypothesis generation using example outbreak case studies.
A hypothesis is “a supposition, arrived at from observation or reflection, that leads to refutable predictions; (or) any conjecture cast in a form that will allow it to be tested and refuted” ( 6 ). In a foodborne outbreak, the hypothesis states which food vehicle(s) could be the source of the outbreak and warrant further investigation. In practice, hypothesis generation is dynamic and iterative. It begins in the earliest stages of an investigation as investigators review available information and look for a pattern or “signal” that might emerge. As more information becomes available, hypotheses are frequently evaluated and refined.
The framework presented here focuses on 3 primary sources of information for generating hypotheses, typically used in combination: 1) known sources of the pathogen causing illness; 2) person, place, and time characteristics of cases associated with the outbreak (descriptive data); and 3) case exposure assessment ( Figure 1 ). We discuss the approach for collecting, summarizing, and interpreting each of these sources of information and provide example outbreak case studies ( Table 1 ). We focus primarily on food exposures. However, at the onset of an investigation the transmission route is often unknown, and many pathogens commonly transmitted through food can also be transmitted through other routes (e.g., animal contact, person-to-person, waterborne). Thus, hypothesis generation should consider all potential transmission routes early in the investigation. Moreover, hypothesis generation should involve a multidisciplinary outbreak investigation team, including experienced colleagues who can provide information about past outbreaks and known sources of the pathogen causing illness.
Table 1. Foodborne-Illness Outbreak Case Studies Highlighting Hypothesis-Generation Methods, United States, 2006–2018
Abbreviations: STEC: Shiga toxin-producing Escherichia coli, HG: hypothesis generation, HGQ: hypothesis-generating questionnaires, PFGE: pulsed-field gel electrophoresis.
Known pathogen sources
When generating a hypothesis, investigators should consider historical information about the causative pathogen, including known reservoirs; foods (and animals) implicated in past outbreaks; findings from case-control studies of sporadic illnesses (i.e., diagnosed cases investigated during routine surveillance not linked to other cases); and molecular subtyping information of the pathogen, including information about nonhuman isolates (i.e., food, animal, or environmental sources).
The reservoir of the infectious agent can indicate potential sources and contributing factors. Pathogens with a human reservoir (e.g., norovirus, hepatitis A virus, and Shigella) are commonly associated with infected food handlers or ready-to-eat foods that have been contaminated with human feces. In contrast, pathogens with animal reservoirs (e.g., Shiga toxin-producing Escherichia coli (STEC), nontyphoidal Salmonella, and Campylobacter) are often associated with food sources of animal origin or foods that have been contaminated by animal feces during production (e.g., fresh produce). Pathogens with environmental reservoirs (e.g., Vibrio spp., Listeria monocytogenes, Clostridium botulinum) are commonly associated with foods that can become contaminated by soil or water. Tools that help identify known pathogen sources include the National Outbreak Reporting System Dashboard ( 7 ), the Food and Drug Administration Bad Bug Book ( 8 ), and An Atlas of Salmonella in the United States ( 9 ).
Food-pathogen pairs identified in past outbreaks and case-control studies of sporadic illnesses provide information on common food vehicles associated with a pathogen. Using data on reported outbreaks from 1998–2016, the Interagency Food Safety Analytics Collaboration estimated the proportion of illnesses attributable to 17 major food categories ( 10 ). The foods most commonly associated with Salmonella illnesses were seeded vegetables (e.g., tomatoes and cucumbers), chicken, pork, and fruit, whereas most STEC illnesses were attributed to leafy greens or beef, and most Listeria illnesses to dairy products or fruits. Similarly, case-control studies of sporadic illnesses have found associations between pathogens and specific foods; for example, Campylobacter and poultry ( 11 ) and Listeria monocytogenes and melons and hummus ( 12 ).
For pathogens with multiple reservoirs, information that distinguishes isolates of the same species by phenotypic or genotypic characteristics can provide increased specificity. For example, there are over 2,600 serotypes of Salmonella; however, some serotypes have been associated with specific food vehicles, such as Salmonella enterica serotype Enteritidis (SE) and eggs and chicken; serotypes Uganda and Infantis and pork; and serotypes Litchfield, Poona, Oranienburg, and Javiana and fruit ( 13 ). Antimicrobial resistance has also proven useful in differentiating major sources of Salmonella serotypes found in both animal- and plant-derived food commodities. For example, antimicrobial-resistant Salmonella outbreaks were more likely to be associated with meat and poultry (e.g., beef, chicken, and turkey), whereas foods commonly associated with susceptible Salmonella outbreaks were eggs, tomatoes, and melons ( 14 ).
Molecular subtyping with pulsed-field gel electrophoresis (PFGE) has been an essential subtyping tool for outbreak detection, and PFGE patterns have been associated with specific foods. For example, SE isolates with PFGE PulseNet pattern JEGX01.0004 have commonly been associated with eggs (and more recently, chicken), pattern JEGX01.0005 with chicken, and pattern JEGX01.0002 with travel or exposure to the US Pacific Northwest region and Mexico. Similarly, the same PFGE pattern of STEC O157:H7 has been associated with recurrent romaine lettuce outbreaks ( 15 , 16 ). In July 2019, whole-genome sequencing (WGS) replaced PFGE as the standard molecular subtyping method for the national PulseNet network, providing greater discrimination and more reliable indication of genetically related groupings than PFGE. This change in molecular method might limit historical comparisons temporarily, particularly to isolates from before the transition, as PFGE patterns and WGS results are not readily comparable. However, WGS allele codes have been applied to sequenced historical isolates in PulseNet, and although this represents a small proportion of all isolates in PulseNet, the representativeness of the WGS database will increase with time. As historical isolates and regulatory isolates from the Food and Drug Administration and US Department of Agriculture Food Safety and Inspection Service are sequenced, information about recent findings in foods and animals will fill the national database maintained at the National Center for Biotechnology Information ( 17 ) and be readily comparable to sequenced human clinical isolates.
Subtyping of nonhuman isolates collected by regulatory agencies from foods and food chain environments through routine testing or special studies can lead to the identification of outbreaks of human illness by searching the PulseNet database for the same molecular subtypes in human infections, sometimes referred to as “backward” outbreaks. For example, in 2007 public health authorities were investigating a multistate outbreak of Salmonella serotype Wandsworth in which patients reported consuming a puffed vegetable-coated snack food. Food testing yielded the outbreak strain of Salmonella serotype Wandsworth, but it also yielded Salmonella serotype Typhimurium; a search in the PulseNet database identified matching isolates from human cases of Salmonella serotype Typhimurium infection, and these cases confirmed consumption of the same snack food upon re-interview ( 18 ). Importantly, identifying a close genetic match between strains from a product and an illness does not alone establish causation; epidemiologic investigation and traceback are needed to connect the product and patient.
Descriptive data
Descriptive epidemiology of cases, including person, place, or time characteristics, remains a powerful tool for hypothesis generation. Person characteristics can suggest foods that are more likely to be eaten by certain groups, whereas place and time characteristics can provide clues about the geographic distribution and shelf life of the food.
Person characteristics suggestive of certain foods include, but are not limited to, sex, age, race, and ethnicity. For example, the median percentage of female cases in vegetable-associated STEC outbreaks was 64%, compared with 50% in beef STEC outbreaks ( 19 ). Likewise, there are differences in food consumption patterns by age, with the lowest median percentage of children and adolescents in vegetable-associated STEC outbreaks and the highest in STEC dairy outbreaks ( 19 ). Similar trends are evident in the Centers for Disease Control and Prevention FoodNet Population Survey, a population-based survey to estimate the prevalence of risk factors for foodborne illness, which found that women reported consuming more fruits and vegetables than men, and men reported consuming more meat and poultry ( 20 ).
Time characteristics, displayed by the shape and pattern of an epidemic curve, can indicate the shelf life of a product or the harvest duration of a contaminated field. For example, cases spread over a longer time period might suggest a shelf-stable or frozen food item, ongoing harborage of the contaminating pathogen in a food processing plant, or other sustained mechanism of contamination. Conversely, cases with illness onset dates spread over a limited duration of time might suggest a perishable item, such as fresh produce. However, some fresh produce items have longer shelf lives than others and can cause more protracted outbreaks. Additionally, there are “special case” produce types. For example, outbreaks associated with sprouted seeds or beans, which have a short shelf life, are typically driven by a single contaminated seed lot, and un-sprouted seeds and beans can have a shelf life of months to years. Thus, single batches might be sprouted from the same contaminated lot of seeds at different times and in different places leading to a more sustained outbreak, or resulting in temporally and geographically distinct outbreaks ( 21 ). If an outbreak is detected early and exposure is ongoing, the temporal distribution of cases might be less clear early in an investigation. Thus, epidemic curves can provide supporting evidence that adds to the plausibility of a suspected food vehicle; however, depending on the outbreak, epidemic curves might provide more relevant information as the outbreak progresses.
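As a rough illustration of how illness onset dates can be tabulated into the counts behind an epidemic curve, the sketch below groups onset dates by ISO week. The function name `weekly_counts` and the example dates are hypothetical and are not drawn from any outbreak described here.

```python
from collections import Counter
from datetime import date

def weekly_counts(onset_dates):
    """Count cases per (ISO year, ISO week) of illness onset, in time order."""
    counts = Counter(d.isocalendar()[:2] for d in onset_dates)
    return dict(sorted(counts.items()))

# Hypothetical onset dates for three cases (illustrative only).
onsets = [date(2021, 3, 1), date(2021, 3, 3), date(2021, 3, 9)]
curve = weekly_counts(onsets)

# Simple text rendering: one "#" per case in each week.
for (year, week), n in curve.items():
    print(f"{year}-W{week:02d} {'#' * n}")
```

Weeks with no cases are omitted from the result, so a plotting step would need to fill those gaps before the overall shape (brief versus protracted) is interpreted.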
Geographical mapping of cases can also help assess the plausibility of a suspected vehicle by comparing the distribution of cases with the distribution pattern of that food item, in consultation with regulatory and industry partners. For example, widespread outbreaks are caused by widely distributed commercial products, and some foods are more likely to be distributed nationally (e.g., bagged leafy greens, packaged cereal, national meat brands), whereas others are more likely to be distributed regionally (e.g., popular brands of ice cream) or locally (e.g., raw milk) ( 22 ). Likewise, if some outbreak-associated illnesses are clearly related to travel to a specific country, and others are in nontravelers, it suggests the latter might be associated with a product imported from that country. For example, a 2018 outbreak of Salmonella serotype Typhimurium infections in Canada occurred among persons traveling to Thailand, and among others who shopped at particular stores in Western Canada; the outbreak was ultimately traced to contaminated frozen profiteroles imported from Thailand ( 23 ). Similarly, in a 2011 multistate outbreak in the United States, a subset of cases traveled to Mexico and ate papaya there, and nontravel-associated cases ate papaya imported from Mexico ( 24 ).
Outbreak size and distribution can suggest certain food-pathogen pairs. For example, seafood toxins like ciguatoxin are typically produced or concentrated in an individual fish and therefore cause illness in a limited number of people in a single jurisdiction, whereas Salmonella and other bacterial pathogens can contaminate large amounts of a widely distributed product ( 22 ). The distribution of cases can be misleading or incomplete early in an outbreak, so investigators must use caution when using these parameters to rule out hypotheses and revisit as additional cases are identified. Moreover, an apparently local outbreak can be an early indicator of a larger problem. For example, in 2018, a large multistate outbreak of E. coli O157:H7 infections linked to romaine lettuce was initially detected in New Jersey in association with a single restaurant chain; within 8 days of detecting the cluster it had expanded to include many more cases with a variety of different exposure locations as far away as Nome, Alaska ( 15 ).
Case exposure assessment
Rapidly collecting detailed food histories from cases in an outbreak is the most critical step in identifying commonalities between these cases. Before a cluster is detected, local or state public health agencies typically attempt to interview each individual, reportable enteric-pathogen case using a standard pathogen-specific questionnaire. If a cluster is detected, a review of these routine interviews can provide information on obvious high-risk exposures. In most jurisdictions, detailed hypothesis-generating questionnaires (HGQs) historically have been used only if commonalities are not identified from the initial routine interviews or if the hypotheses identified from routine interviews collapse under further investigation. However, a growing number of state health jurisdictions are conducting hypothesis-generating interviews with all cases of laboratory-confirmed Salmonella and STEC infection, opting to gather this information during the initial interview. This method is considered a best practice to maximize exposure recall ( 25 ), shaving days or weeks off the delay between case exposure and hypothesis-generating interview.
There are 3 major types of HGQs used in the United States ( 26 ):
Oregon “shotgun” questionnaire: This questionnaire uses a “shotgun,” or “trawling” approach of asking mostly close-ended questions for a long list of individual food items. The section order is designed to prompt recall of specific food exposures through review of places where food was purchased or eaten out, and specific repetitive questions for high-risk exposures such as raw foods or sprouts.
Minnesota “long form” hypothesis-generating questionnaire: This questionnaire combines close-ended questions about fewer food items with open-ended questions that seek details on dining/purchase location and brand-variety details for all foods.
National Hypothesis Generating Questionnaire: This questionnaire is a hybridized approach developed by Centers for Disease Control and Prevention that contains elements of both the Oregon and Minnesota models. Close-ended questions are asked about an intermediate number of food items, and brand/variety details are obtained only for commonly eaten types of foods. During national cluster investigations, the National Hypothesis Generating Questionnaire is deployed across state and local health departments to improve standardization across jurisdictions.
In addition to these questionnaires, there are many modified state-specific versions and national pathogen-specific HGQs (e.g., Listeria Initiative questionnaire, Cyclospora). The use of HGQs can be enhanced by adopting a dynamic or iterative cluster investigation approach. In this approach, if a suspected food item or branded product emerges during interviews, that food item can be added to questionnaires administered to subsequent cases, and individuals who have already been interviewed can be re-interviewed to systematically collect information about that exposure ( 27 ). Decisions about which exposures should be pursued through re-interviews can be informed by descriptive data, as well as incubation periods, which can help define the most likely exposure period ( 28 ).
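The role of the incubation period in defining the exposure period can be shown with simple date arithmetic. The sketch below is illustrative only: `exposure_window` is a hypothetical helper, and the 1-day and 7-day incubation bounds are placeholder values rather than figures for any particular pathogen.

```python
from datetime import date, timedelta
from typing import Tuple

def exposure_window(onset: date, min_incubation_days: int,
                    max_incubation_days: int) -> Tuple[date, date]:
    """Return (earliest, latest) likely exposure dates for one case."""
    if not 0 <= min_incubation_days <= max_incubation_days:
        raise ValueError("require 0 <= min_incubation_days <= max_incubation_days")
    earliest = onset - timedelta(days=max_incubation_days)
    latest = onset - timedelta(days=min_incubation_days)
    return earliest, latest

# Placeholder incubation range of 1 to 7 days (an assumption for illustration).
window = exposure_window(date(2021, 6, 15), 1, 7)
print(window[0].isoformat(), "to", window[1].isoformat())
```

Re-interview questions can then be limited to exposures that fall inside this window for each case.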
The number of interviewers participating in hypothesis-generating interviews can depend on resources and the specifics of the outbreak. A single interviewer approach can be advantageous in that a single interviewer might more clearly remember what previously interviewed persons mentioned and pursue clues as they arise during a live interview. However, this approach could slow investigations, particularly in sizable multistate clusters. An alternative is the “lead investigator model,” in which a single person directs the interviewing team with a limited number of interviewers, reviews completed interviews, and decides which exposures to pursue. This approach can be faster and more efficient than the single interviewer approach. When interviews are done by multiple agencies, it is important that the completed interviews be forwarded to the lead investigator promptly and that the group meet regularly and review results of interviews as the investigation proceeds.
If interviews with HGQs do not yield an actionable hypothesis, investigators should consider alternative approaches, such as questionnaire modification or open-ended interviews. Deciding when to attempt an alternative approach depends on cluster size, velocity of incident cases, and investigation effort expended and time elapsed without identification of a solid hypothesis. Questionnaire modification could include adding questions, such as open-ended questions or supplemental questions about exposures that came up on previous interviews, or pruning questions. For example, after 8–10 interviews, items that no case reported “yes” or “maybe” to eating may be removed. Removal of questions should be done cautiously because certain foods (e.g., stealth ingredients such as cilantro and sprouts) might be reported by a low proportion of cases who ate them. Another approach is open-ended interviews of recent cases, which could be considered after 20–25 initial cases in a large multistate investigation have been interviewed without yielding solid hypotheses. Conducted by a single interviewer, if possible, open-ended interviews should cover everything that a case ate or drank in the exposure period of interest, as well as other exposures including animals, grocery stores, restaurants, travel, parties or events, and details about how they prepare their food at home, including recipes. After the first person is interviewed, objective questions about specific exposures can be added to the open-ended interviews of subsequent cases, creating a hybrid open-ended/iterative model. This requires cooperative patients and a persistent investigative approach but has yielded correct hypotheses with as few as 2 interviews ( 29 ).
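The pruning rule described above (dropping items that no case answered “yes” or “maybe” to after 8–10 interviews, while treating stealth ingredients cautiously) might be sketched as follows. The data layout, the function name `split_items`, and the `always_keep` parameter are assumptions made for illustration; they do not describe any health department's actual tooling.

```python
def split_items(interviews, min_interviews=8, always_keep=frozenset()):
    """Split questionnaire items into (keep, drop) lists.

    interviews: one dict per case, mapping food item -> "yes", "maybe", or "no".
    Nothing is dropped until at least min_interviews have been completed.
    Items in always_keep (e.g., stealth ingredients) are never dropped.
    """
    items = sorted({item for answers in interviews for item in answers})
    if len(interviews) < min_interviews:
        return items, []
    keep, drop = [], []
    for item in items:
        reported = any(a.get(item) in ("yes", "maybe") for a in interviews)
        (keep if reported or item in always_keep else drop).append(item)
    return keep, drop

# Hypothetical responses from 8 interviews (illustrative only).
interviews = [{"romaine": "yes", "cilantro": "no", "tofu": "no"}] * 7
interviews.append({"romaine": "maybe", "cilantro": "no", "tofu": "no"})
keep, drop = split_items(interviews, always_keep={"cilantro"})
print(keep, drop)
```

Here cilantro is retained even though no case reported it, reflecting the caution about foods that cases often do not recall eating.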
Additional methods to ascertain exposures, such as obtaining consumer food purchase data, can be appropriate, particularly for outbreaks where obtaining a food history is challenging ( 30 ). For example, during a multistate Salmonella serotype Montevideo outbreak, initial hypothesis-generating interviews did not identify a clear signal beyond shopping at the same warehouse store. Investigators used shopper membership card purchase information to generate hypotheses, which ultimately helped identify red and black peppercorns coating a ready-to-eat salami as the vehicle ( 31 ). In addition, information from services for grocery home delivery, restaurant take-out delivery, and meal kits might help to clarify specific exposures. Other potential methods include focus-group interviews and household inspections, although these are used more rarely and in specific scenarios, with mixed results ( 32 ).
Binomial probability comparisons can further refine hypotheses by comparing the proportion of cases in an outbreak reporting a food exposure with the expected background proportion of the population reporting the food exposure ( 33 , 34 ). Binomial probability calculations in foodborne-disease outbreak investigations emerged in Oregon in 2003 as a complement to the “shotgun” questionnaire pioneered there and use independent data sources on food exposure frequency from sporadic cases, past outbreak cases, or well persons sampled from the population. Such data sources include data from healthy people surveyed as part of the FoodNet Population Survey, standardized data collected in previous outbreaks, or sporadic cases as is done with the Listeria Initiative and Project Hg ( 33 , 35 , 36 ).
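A minimal sketch of such a comparison is given below. It computes the upper-tail binomial probability of seeing at least the observed number of exposed cases if each case had reported the food at the background rate. The function name and the example figures (8 of 10 cases, 20% background) are hypothetical and are not taken from the cited tools or data sources.

```python
from math import comb

def binomial_tail(n_cases: int, n_exposed: int, background: float) -> float:
    """P(X >= n_exposed) for X ~ Binomial(n_cases, background)."""
    if not 0.0 <= background <= 1.0:
        raise ValueError("background must be a proportion between 0 and 1")
    if not 0 <= n_exposed <= n_cases:
        raise ValueError("require 0 <= n_exposed <= n_cases")
    return sum(
        comb(n_cases, k) * background**k * (1.0 - background) ** (n_cases - k)
        for k in range(n_exposed, n_cases + 1)
    )

# Hypothetical example: 8 of 10 cases report a food that 20% of the
# comparison population reports eating.
p = binomial_tail(10, 8, 0.20)
print(f"{p:.2e}")
```

A small probability flags an exposure as reported more often than expected. Because many food items are usually screened at once, such a flag is a prompt for follow-up rather than confirmation of a vehicle.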
Hypothesis generation is a critical, but challenging, step in a foodborne outbreak investigation. A well-informed hypothesis can increase the likelihood of rapidly and conclusively implicating the contaminated food vehicle; conversely, the chances of implicating a food item are small if that item is not considered as part of the outbreak investigation. Inadequate hypothesis generation can delay investigation progress and limit investigators’ ability to rapidly identify the outbreak source, potentially leading to prolonged exposure and more illnesses. The 3 primary sources of information presented as part of this framework—known sources of the pathogen causing illness, descriptive data, and case exposure assessment—provide vital information for hypothesis generation, particularly when used in combination and revisited throughout the outbreak investigation.
Despite these sources of information, there are certain types of outbreaks for which hypothesis generation is inherently more challenging. These include outbreaks for which the vehicle has a high background rate of consumption (e.g., chicken) or outbreaks associated with a “stealth” food (e.g., garnishes, spices, chili peppers, or sprouts) that many cases could have consumed, but few remember eating. These challenges can sometimes be overcome by obtaining details on food exposures such as brand/variety and point of purchase. Obtaining this information is also critical to rapidly initiating a traceback investigation. An outbreak might also be caused by multiple contaminated food products when, for example, multiple foods have a single common ingredient or when poor sanitation or contaminated equipment leads to cross-contamination. Furthermore, the key exposure might not be a food at all, but rather an environmental or animal exposure, emphasizing that food should not be the default hypothesis.
There might be specific clues or “toe-holds” that help identify a hypothesis and accelerate an investigation. For example, cases with restricted diets, food diaries, or highly unusual or specific exposures can narrow the list of potential foods. This could include cases who traveled briefly to the outbreak location, and thus had a limited number of exposures. Smaller, localized clusters within a larger outbreak associated with restaurants, events, stores, or institutions, or “subclusters,” are often crucial to hypothesis generation, providing a finite list of foods. For example, in a multistate outbreak of Salmonella serotype Typhimurium infections associated with consumption of tomatoes, comparison of 4 restaurant-associated subclusters was instrumental in rapidly identifying a small set of potential vehicles ( 4 ). Subcluster investigations are precisely focused and as such can lead to much more rapid and efficient hypothesis generation and testing than attempts to assess all exposures among all cases in a large outbreak. Because of the immense value of subclusters, every effort should be made to quickly identify them through initial interviews and the iterative interviewing approach ( 25 ).
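The idea of comparing subclusters to arrive at a short list of candidate foods can be illustrated as below. The subcluster names and food lists are invented for the example, and `common_items` is a hypothetical helper, not a method taken from the cited investigation.

```python
from collections import Counter

def common_items(foods_by_subcluster, min_share=1.0):
    """Return foods reported in at least min_share of the subclusters."""
    n = len(foods_by_subcluster)
    counts = Counter(
        item for foods in foods_by_subcluster.values() for item in set(foods)
    )
    return sorted(item for item, c in counts.items() if c / n >= min_share)

# Invented foods reported by cases in four restaurant subclusters.
subclusters = {
    "A": ["tomatoes", "lettuce", "onions"],
    "B": ["tomatoes", "lettuce", "chicken"],
    "C": ["tomatoes", "lettuce", "cheese"],
    "D": ["tomatoes", "avocado"],
}
print(common_items(subclusters))        # foods shared by every subcluster
print(common_items(subclusters, 0.75))  # foods shared by at least 3 of 4
```

Relaxing `min_share` below 1.0 allows for incomplete recall or incomplete ingredient lists in some subclusters.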
The majority of outbreaks are associated with common foods previously associated with that pathogen. In an investigation, it is important to both rule in and rule out common vehicles, while keeping an open mind about potential novel vehicles. If investigators suspect a novel vehicle, they should still rule out the most common vehicles when designing epidemiologic studies. For example, if an STEC outbreak investigation implicates cucumbers, regulatory partners will want to confirm that investigators have eliminated common STEC vehicles such as ground beef, leafy greens, and sprouts. That said, food vehicles change over time, reflecting changing food preferences and trends in food safety measures, and new vehicles continue to emerge (e.g., in recent years: SoyNut butter, raw flour, caramel apples, kratom, and chia seed powder). HGQs are biased toward previously implicated foods and a finite list of foods. If cases continue without a clear hypothesis emerging, it might be necessary to try open-ended hypothesis-generating interviews.
Hypothesis generation during foodborne outbreak investigation will evolve as laboratory techniques advance. Molecular sequencing techniques based on WGS might give investigators more conviction in devoting resources to following leads because there is more confidence that the cases have a common source for their illnesses ( 17 , 37 ). Concurrent or recent nonhuman isolates (e.g., food isolates) that match human case isolates by sequencing will be considered even more likely to be related to the human cases and become a priori hypotheses during investigations.
Foodborne-outbreak investigation methods are constantly evolving. Food production, processing, and distribution are changing to meet consumer demands. Outbreak investigations are more complex, given that laboratory methods for subtyping, strategies for epidemiologic investigation, and environmental assessments are also changing. Rapid investigation is essential, because with mass production and distribution, food safety errors can cause large and widespread outbreaks. Outbreak investigations balance the need for expediency to implement control measures with the need for accuracy. If hastily developed hypotheses are incorrect or insufficiently refined, analytical studies are unlikely to succeed and can waste time and resources. Alternatively, a refined hypothesis can lead directly to effective public health interventions, sometimes bypassing the need for an analytical study, if accompanied by other compelling evidence, such as laboratory evidence or traceback information.
Effectively and swiftly sharing data across jurisdictions increases an investigation team’s ability to quickly develop hypotheses and implicate food vehicles. Successful investigations depend on including the correct hypothesis, the result of a systematic approach to hypothesis generation. The exact path to identifying a hypothesis is rarely the same between outbreaks. Therefore, investigators should be familiar with different hypothesis-generating strategies and be flexible in deciding which strategies to employ.
Author affiliations: Department of Epidemiology, Colorado School of Public Health, Aurora, Colorado, United States (Alice E. White, Elaine Scallan Walter); Minnesota Department of Health, St. Paul, Minnesota, United States (Kirk E. Smith, Carlota Medus); Washington State Department of Health, Tumwater, Washington, United States (Hillary Booth); and Division of Foodborne, Waterborne, and Environmental Diseases, National Center for Emerging Zoonotic and Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia, United States (Robert V. Tauxe, Laura Gieraltowski).
This work was funded in part by the Colorado and Minnesota Integrated Food Safety Centers of Excellence, which are supported by the Epidemiology and Laboratory Capacity for Infectious Disease Cooperative Agreement through the Centers for Disease Control and Prevention.
Conflict of interest: none declared.
1. Scallan E, Hoekstra RM, Angulo FJ, et al. Foodborne illness acquired in the United States—major pathogens. Emerg Infect Dis. 2011;17(1):7–15.
2. Tauxe RV. Surveillance and investigation of foodborne diseases; roles for public health in meeting objectives for food safety. Food Control. 2002;13(6-7):363–369.
3. Dewey-Mattia D, Manikonda K, Hall AJ, et al. Surveillance for foodborne disease outbreaks—United States, 2009–2015. MMWR Morb Mortal Wkly Rep. 2018;67(10):1–11.
4. Behravesh CB, Blaney D, Medus C, et al. Multistate outbreak of Salmonella serotype typhimurium infections associated with consumption of restaurant tomatoes, USA, 2006: hypothesis generation through case exposures in multiple restaurant clusters. Epidemiol Infect. 2012;140(11):2053–2061.
5. Boulton ML, Rosenberg LD. Food safety epidemiology capacity in state health departments—United States, 2010. MMWR Morb Mortal Wkly Rep. 2011;60(50):1701–1704.
6. Porta MA. A Dictionary of Epidemiology. 5th ed. New York, NY: Oxford University Press; 2008(4):82.
7. Centers for Disease Control and Prevention. National Outbreak Reporting System Dashboard. https://wwwn.cdc.gov/norsdashboard/. Updated December 7, 2018. Accessed April 9, 2021.
8. Lampel KA, Al-Khaldi S, Cahill SM, eds. Bad Bug Book, Foodborne Pathogenic Microorganisms and Natural Toxins. 2nd ed. Washington, DC: Food and Drug Administration; 2012.
9. Centers for Disease Control and Prevention. An Atlas of Salmonella in the United States, 1968–2011: Laboratory-Based Enteric Disease Surveillance. Atlanta, GA: US Department of Health and Human Services, CDC; 2013. https://www.cdc.gov/salmonella/pdf/salmonella-atlas-508c.pdf. Accessed April 9, 2021.
10. Interagency Food Safety Analytics Collaboration. Foodborne Illness Source Attribution Estimates for 2017 for Salmonella, Escherichia coli O157, Listeria monocytogenes, and Campylobacter Using Multi-Year Outbreak Surveillance Data, United States. Atlanta, GA and Washington DC: US Department of Health and Human Services; 2019. https://www.cdc.gov/foodsafety/ifsac/pdf/P19-2017-report-TriAgency-508-archived.pdf. Accessed April 9, 2021.
11. Friedman CR, Hoekstra RM, Samuel M, et al. Risk factors for sporadic Campylobacter infection in the United States: a case‐control study in FoodNet sites. Clin Infect Dis. 2004;38(suppl 3):S285–S296.
12. Varma J, Samuel M, Marcus R, et al. Listeria monocytogenes infection from foods prepared in a commercial establishment: a case-control study of potential sources of sporadic illness in the United States. Clin Infect Dis. 2007;44(4):521–528.
13. Jackson BR, Griffin PM, Cole D, et al. Outbreak-associated Salmonella enterica serotypes and food commodities, United States, 1998–2008. Emerg Infect Dis. 2013;19(8):1239–1244.
14. Brown AC, Grass JE, Richardson LC, et al. Antimicrobial resistance in Salmonella that caused foodborne disease outbreaks: United States, 2003–2012. Epidemiol Infect. 2017;145(4):766–774.
15. Centers for Disease Control and Prevention. Multistate outbreak of E. coli O157:H7 infections linked to romaine lettuce. https://www.cdc.gov/ecoli/2018/o157h7-04-18/index.html. Published June 28, 2018. Accessed August 6, 2020.
16. Centers for Disease Control and Prevention. Outbreak of E. coli infections linked to romaine lettuce. https://www.cdc.gov/ecoli/2019/o157h7-11-19/index.html. Published January 15, 2020. Accessed August 6, 2020.
17. Besser JM, Carleton HA, Trees E, et al. Interpretation of whole-genome sequencing for enteric disease surveillance and outbreak investigation. Foodborne Pathog Dis. 2019;16(7):504–512.
18. Sotir MJ, Ewald G, Kimura AC, et al. Outbreak of Salmonella Wandsworth and Typhimurium infections in infants and toddlers traced to a commercial vegetable-coated snack food. Pediatr Infect Dis J. 2009;28(12):1041–1046.
19. White A, Cronquist A, Bedrick E, et al. Food source prediction of Shiga toxin-producing Escherichia coli outbreaks using demographic and outbreak characteristics, United States, 1998–2014. Foodborne Pathog Dis. 2016;13(10):527–534.
20. Shiferaw B, Verrill L, Booth H, et al. Sex-based differences in food consumption: Foodborne Diseases Active Surveillance Network (FoodNet) Population Survey, 2006–2007. Clin Infect Dis. 2012;54(suppl 5):S453–S457.
21. Ferguson DD, Scheftel J, Cronquist A, et al. Temporally distinct Escherichia coli O157 outbreaks associated with alfalfa sprouts linked to a common seed source—Colorado and Minnesota, 2003. Epidemiol Infect. 2005;133(3):439–447.
22. Tauxe RV. Emerging foodborne diseases: an evolving public health challenge. Emerg Infect Dis. 1997;3(4):425–434.
23. Public Health Agency of Canada. Public Health Notice—outbreak of Salmonella infections linked to Celebrate brand frozen classic/classical and egg nog flavoured profiteroles (cream puffs) and mini chocolate eclairs. https://www.canada.ca/en/public-health/services/public-health-notices/2019/outbreak-salmonella.html. Published June 27, 2019. Accessed August 6, 2020.
24. Mba-Jonas A, Culpepper W, Hill T, et al. A multistate outbreak of human Salmonella Agona infections associated with consumption of fresh, whole papayas imported from Mexico—United States, 2011. Clin Infect Dis. 2018;66(11):1756–1761.
25. Hedberg C. Guidelines for Foodborne Disease Outbreak Response. 3rd ed. Atlanta, GA: Council to Improve Foodborne Outbreak Response (CIFOR); 2020.
26. Centers for Disease Control and Prevention. Foodborne disease outbreak investigation and surveillance tools. https://www.cdc.gov/foodsafety/outbreaks/surveillance-reporting/investigation-toolkit.html. Reviewed June 10, 2021. Accessed July 2, 2021.
27. Meyer SD, Kirk SE, Hedberg CH. Chapter 7.2—Surveillance for foodborne diseases, part 2: investigation of foodborne disease outbreaks. In: M'ikanatha NM, Lynfield R, Van Beneden CA, et al., eds. Infectious Disease Surveillance. 5th ed. West Sussex, UK: Wiley-Blackwell; 2013:120–128.
28. Chai S, Gu W, O'Connor KA, et al. Incubation periods of enteric illnesses in foodborne outbreaks, United States, 1998–2013. Epidemiol Infect. 2019;147:e285.
29. Angelo KM, Conrad AR, Saupe A, et al. Multistate outbreak of Listeria monocytogenes infections linked to whole apples used in commercially produced, prepackaged caramel apples: United States, 2014–2015. Epidemiol Infect. 2017;145(5):848–856.
30. Møller FT, Mølbak K, Ethelberg S. Analysis of consumer food purchase data used for outbreak investigations, a review. Euro Surveill. 2018;23(24):1700503.
31. Gieraltowski L, Julian E, Pringle J, et al. Nationwide outbreak of Salmonella Montevideo infections associated with contaminated imported black and red pepper: warehouse membership cards provide critical clues to identify the source. Epidemiol Infect. 2013;141(6):1244–1252.
32. Ickert C, Cheng J, Reimer D, et al. Methods for generating hypotheses in human enteric illness outbreak investigations: a scoping review of the evidence. Epidemiol Infect. 2019;147:e280.
33. Jervis RH, Booth H, Cronquist AB, et al. Moving away from population-based case-control studies during outbreak investigations. J Food Prot. 2019;82(8):1412–1416.
34. Keene W. The use of binomial probabilities in outbreak investigations (abstract). Presented at the Annual OutbreakNet Conference, Long Beach, California; September 22, 2011.
35. McCollum JT, Cronquist AB, Silk BJ, et al. Multistate outbreak of listeriosis associated with cantaloupe. N Engl J Med. 2013;369(10):944–953.
36. Centers for Disease Control and Prevention. National Listeria Surveillance: Listeria initiative. https://www.cdc.gov/nationalsurveillance/listeria-surveillance.html. Published September 13, 2018. Accessed August 6, 2020.
37. Jackson BR, Tarr C, Strain E, et al. Implementation of nationwide real-time whole-genome sequencing to enhance listeriosis outbreak detection and investigation. Clin Infect Dis. 2016;63(3):380–386.
38. Sharapov UM, Wendel AM, Davis JP, et al. Multistate outbreak of Escherichia coli O157:H7 infections associated with consumption of fresh spinach: United States, 2006. J Food Prot. 2016;79(12):2024–2030.
39. Neil KP, Biggerstaff G, MacDonald JK, et al. A novel vehicle for transmission of Escherichia coli O157:H7 to humans: multistate outbreak of E. coli O157:H7 infections associated with consumption of ready-to-bake commercial prepackaged cookie dough—United States, 2009. Clin Infect Dis. 2012;54(4):511–518.
40. Miller BD, Rigdon CE, Ball J, et al. Use of traceback methods to confirm the source of a multistate Escherichia coli O157:H7 outbreak due to in-shell hazelnuts. J Food Prot. 2012;75(2):320–327.
41. Medus C, Meyer S, Smith K, et al. Multistate outbreak of Salmonella infections associated with peanut butter and peanut butter-containing products—United States, 2008–2009. MMWR Morb Mortal Wkly Rep. 2009;58(4):85–90.
42. Gambino-Shirley KJ, Tesfai A, Schwensohn CA, et al. Multistate outbreak of Salmonella Virchow infections linked to a powdered meal replacement product—United States, 2015–2016. Clin Infect Dis. 2018;67(6):890–896.
43. Centers for Disease Control and Prevention. Multistate outbreak of Salmonella infections linked to kratom. https://www.cdc.gov/salmonella/kratom-02-18/index.html. Published February 20, 2018. Accessed September 14, 2020.
44. Centers for Disease Control and Prevention. Multistate outbreak of Salmonella infections linked to kratom. https://www.cdc.gov/nationalsurveillance/listeria-surveillance.html. Last reviewed September 13, 2018. Accessed July 2, 2021.
Outbreak Investigations
Step 6: Develop Hypotheses
Step 7: Evaluate Hypotheses
As noted previously, these steps are not undertaken in a rigid serial order. In fact, the order may vary depending on the circumstances, and some steps will be undertaken simultaneously. As soon as an outbreak is suspected, one automatically considers what the cause might be and the factors that are fueling it. One of the most important steps in generating hypotheses when investigating an outbreak is to consider what is known about the biology of the disease, including its possible modes of transmission, whether there are animal reservoirs of disease, and the length of its incubation and infectious periods. Consider this Fact Sheet for Hepatitis A:
This succinct fact sheet provides excellent clues about what to look for when investigating an outbreak of hepatitis A.
Nevertheless, once descriptive epidemiology has been conducted and information about person, place, and time is available, it is useful to reflect on the collected information in order to re-evaluate and rank hypotheses about the causes. Hypotheses are generated by consciously or subconsciously looking for differences, similarities, and correlations.
- Differences: If the frequency of disease differs in two locations or circumstances, it may be due to a factor that differs in the two circumstances.
- Similarities: If there are similarities among the cases (e.g., many reported eating at a particular restaurant), then that common factor may be the cause.
- Correlations: If the frequency of disease varies in relation to some factor, then that factor may be a cause of the disease. For example, communities with low rates of measles immunization may have high rates of measles cases.
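The "similarities" heuristic can be made slightly more quantitative. The following is a minimal sketch, not a method prescribed by this module: it tallies how many cases report each exposure and then computes a binomial tail probability, that is, how likely it would be to see at least that many exposed cases if each case had only a background chance of exposure. The interview records, the function names, and the single background rate of 0.3 applied to every item are all hypothetical assumptions for illustration; in practice a background rate would come from population survey data and would differ by item.

```python
from collections import Counter
from math import comb


def exposure_counts(case_exposures):
    """Count how many cases reported each exposure (one collection of items per case)."""
    counts = Counter()
    for items in case_exposures:
        counts.update(set(items))  # count each item at most once per case
    return counts


def binomial_tail(n_cases, n_exposed, background_rate):
    """P(at least n_exposed of n_cases report the item), assuming each case
    independently has the background probability of exposure."""
    return sum(
        comb(n_cases, k) * background_rate**k * (1 - background_rate) ** (n_cases - k)
        for k in range(n_exposed, n_cases + 1)
    )


# Hypothetical hypothesis-generating interview results, one set per case.
cases = [
    {"restaurant A", "leafy greens", "chicken"},
    {"restaurant A", "leafy greens"},
    {"leafy greens", "ground beef"},
    {"restaurant A", "leafy greens", "eggs"},
    {"chicken", "eggs"},
]

counts = exposure_counts(cases)
for item, n in counts.most_common():
    # 0.3 is an assumed background rate, used here for every item for simplicity.
    p = binomial_tail(len(cases), n, background_rate=0.3)
    print(f"{item}: {n}/{len(cases)} cases, tail probability {p:.3f}")
```

A small tail probability only flags an exposure as worth ranking higher among the hypotheses; it does not by itself establish the source.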
Consider the information obtained during hypothesis-generating interviews, and also consider the location of cases (spot map) and the time course of the epidemic in relation to the incubation period of the disease (the epidemic curve).
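As a rough illustration of relating onset dates to the incubation period, the sketch below estimates a plausible exposure window under the assumption of a single point-source exposure shared by all cases. Each case must have been exposed between its onset minus the maximum incubation period and its onset minus the minimum incubation period, so the shared window is the overlap of those intervals. The onset dates and the function name are hypothetical; the 15 to 50 day range is the commonly cited incubation range for hepatitis A.

```python
from datetime import date, timedelta


def exposure_window(onsets, min_incubation_days, max_incubation_days):
    """Return (earliest, latest) exposure dates consistent with every onset date,
    assuming one shared point-source exposure.

    If earliest > latest, no single date fits all cases, which suggests the
    point-source assumption (or the incubation range) does not hold.
    """
    if not onsets:
        raise ValueError("at least one onset date is required")
    earliest = max(onsets) - timedelta(days=max_incubation_days)
    latest = min(onsets) - timedelta(days=min_incubation_days)
    return earliest, latest


# Hypothetical onset dates for a hepatitis A cluster.
onsets = [date(2016, 5, 20), date(2016, 5, 28), date(2016, 6, 5), date(2016, 6, 10)]
earliest, latest = exposure_window(onsets, min_incubation_days=15, max_incubation_days=50)

if earliest <= latest:
    print(f"Plausible shared exposure between {earliest} and {latest}")
else:
    print("Onset dates are not consistent with a single point-source exposure")
```

Interview questions can then concentrate on meals, events, and purchases inside that window.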
The next step is to evaluate the hypotheses. In some outbreaks the descriptive epidemiology rapidly points convincingly to a particular source, and further analysis is unnecessary. For example, in 1991 Massachusetts had an outbreak of vitamin D intoxication in which all of the affected cases reported drinking milk delivered to their homes by a local dairy. Inspection of the dairy revealed that excessive quantities of vitamin D were being added to the milk. However, in other situations the source is unclear, and analytic epidemiology must be utilized to more formally test the hypotheses.
There are two general study designs that can be used in analytical epidemiology: a cohort study or a case-control study. Both of these evaluate specific hypotheses by comparing groups of people, but the strategies for sampling subjects for the study are very different. The following illustration summarizes the key differences between these two study designs.
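A small numeric sketch may also help show what each design measures. In a cohort study, subjects are grouped by exposure, so the risk of illness in each group can be computed directly and compared as a risk ratio. In a case-control study, subjects are sampled by illness status, so risks cannot be computed and the odds ratio is used instead. The function names and all counts below are hypothetical and are not taken from any outbreak described here.

```python
def risk_ratio(exposed_ill, exposed_well, unexposed_ill, unexposed_well):
    """Cohort design: risk of illness among the exposed divided by risk among the unexposed."""
    risk_exposed = exposed_ill / (exposed_ill + exposed_well)
    risk_unexposed = unexposed_ill / (unexposed_ill + unexposed_well)
    return risk_exposed / risk_unexposed


def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Case-control design: odds of exposure among cases divided by odds among controls."""
    return (case_exposed * control_unexposed) / (case_unexposed * control_exposed)


# Hypothetical cohort (e.g., everyone who attended one event):
# 40 of 50 who ate the suspect food became ill, versus 10 of 50 who did not eat it.
rr = risk_ratio(exposed_ill=40, exposed_well=10, unexposed_ill=10, unexposed_well=40)
print(f"Risk ratio: {rr:.2f}")

# Hypothetical case-control study:
# 40 of 50 cases ate the suspect food, versus 20 of 50 controls.
or_ = odds_ratio(case_exposed=40, case_unexposed=10, control_exposed=20, control_unexposed=30)
print(f"Odds ratio: {or_:.2f}")
```

Values well above 1 support the hypothesis that the exposure is associated with illness; a full analysis would also report confidence intervals, which this sketch omits.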
Content ©2016. All Rights Reserved. Date last modified: May 3, 2016. Wayne W. LaMorte, MD, PhD, MPH