What is a systematic review?

Volume 14, Issue 3

  • Jane Clarke
  • Correspondence to Jane Clarke 4 Prime Road, Grey Lynn, Auckland, New Zealand; janeclarkehome{at}gmail.com

https://doi.org/10.1136/ebn.2011.0049


A high-quality systematic review is described as the most reliable source of evidence to guide clinical practice. The purpose of a systematic review is to deliver a meticulous summary of all the available primary research in response to a research question. A systematic review uses all the existing research and is sometimes called ‘secondary research’ (research on research). Systematic reviews are often required by research funders to establish the state of existing knowledge and are frequently used in guideline development. Systematic review findings are often used within the …

Competing interests: None.



Systematic Review | Definition, Examples & Guide

Published on 15 June 2022 by Shaun Turney. Revised on 18 July 2024.

A systematic review is a type of review that uses repeatable methods to find, select, and synthesise all available evidence. It answers a clearly formulated research question and explicitly states the methods used to arrive at the answer.

In the example used throughout this guide, Boyle and colleagues answered the question ‘What is the effectiveness of probiotics in reducing eczema symptoms and improving quality of life in patients with eczema?’

In this context, a probiotic is a health product that contains live microorganisms and is taken by mouth. Eczema is a common skin condition that causes red, itchy skin.

Table of contents

  • What is a systematic review?
  • Systematic review vs meta-analysis
  • Systematic review vs literature review
  • Systematic review vs scoping review
  • When to conduct a systematic review
  • Pros and cons of systematic reviews
  • Step-by-step example of a systematic review
  • Frequently asked questions about systematic reviews

A review is an overview of the research that’s already been completed on a topic.

What makes a systematic review different from other types of reviews is that the research methods are designed to reduce research bias. The methods are repeatable, and the approach is formal and systematic:

  • Formulate a research question
  • Develop a protocol
  • Search for all relevant studies
  • Apply the selection criteria
  • Extract the data
  • Synthesise the data
  • Write and publish a report

Although multiple sets of guidelines exist, the Cochrane Handbook for Systematic Reviews is among the most widely used. It provides detailed guidelines on how to complete each step of the systematic review process.

Systematic reviews are most commonly used in medical and public health research, but they can also be found in other disciplines.

Systematic reviews typically answer their research question by synthesising all available evidence and evaluating the quality of the evidence. Synthesising means bringing together different information to tell a single, cohesive story. The synthesis can be narrative (qualitative), quantitative, or both.


Systematic reviews often quantitatively synthesise the evidence using a meta-analysis. A meta-analysis is a statistical analysis, not a type of review.

A meta-analysis is a technique to synthesise results from multiple studies. It’s a statistical analysis that combines the results of two or more studies, usually to estimate an effect size.
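As a rough illustration (not drawn from the sources above), the most common fixed-effect approach pools the individual study estimates with inverse-variance weights:

$$
\hat{\theta}_{\text{pooled}} = \frac{\sum_{i=1}^{k} w_i\,\hat{\theta}_i}{\sum_{i=1}^{k} w_i},
\qquad w_i = \frac{1}{\operatorname{SE}(\hat{\theta}_i)^2}
$$

where $\hat{\theta}_i$ is the effect estimate from study $i$ and $\operatorname{SE}(\hat{\theta}_i)$ is its standard error; random-effects models additionally include a between-study variance term in each weight.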

A literature review is a type of review that uses a less systematic and formal approach than a systematic review. Typically, an expert in a topic will qualitatively summarise and evaluate previous work, without using a formal, explicit method.

Although literature reviews are often less time-consuming and can be insightful or helpful, they have a higher risk of bias and are less transparent than systematic reviews.

Similar to a systematic review, a scoping review is a type of review that tries to minimise bias by using transparent and repeatable methods.

However, a scoping review isn’t a type of systematic review. The most important difference is the goal: rather than answering a specific question, a scoping review explores a topic. The researcher tries to identify the main concepts, theories, and evidence, as well as gaps in the current research.

Sometimes scoping reviews are an exploratory preparation step for a systematic review, and sometimes they are a standalone project.

A systematic review is a good choice of review if you want to answer a question about the effectiveness of an intervention, such as a medical treatment.

To conduct a systematic review, you’ll need the following:

  • A precise question, usually about the effectiveness of an intervention. The question needs to be about a topic that’s previously been studied by multiple researchers. If there’s no previous research, there’s nothing to review.
  • If you’re doing a systematic review on your own (e.g., for a research paper or thesis), you should take appropriate measures to ensure the validity and reliability of your research.
  • Access to databases and journal archives. Often, your educational institution provides you with access.
  • Time. A professional systematic review is a time-consuming process: it will take the lead author about six months of full-time work. If you’re a student, you should narrow the scope of your systematic review and stick to a tight schedule.
  • Bibliographic, word-processing, spreadsheet, and statistical software. For example, you could use EndNote, Microsoft Word, Excel, and SPSS.

A systematic review has many pros.

  • They minimise research bias by considering all available evidence and evaluating each study for bias.
  • Their methods are transparent, so they can be scrutinised by others.
  • They’re thorough: they summarise all available evidence.
  • They can be replicated and updated by others.

Systematic reviews also have a few cons.

  • They’re time-consuming.
  • They’re narrow in scope: they only answer the precise research question.

The 7 steps for conducting a systematic review are explained with an example.

Step 1: Formulate a research question

Formulating the research question is probably the most important step of a systematic review. A clear research question will:

  • Allow you to more effectively communicate your research to other researchers and practitioners
  • Guide your decisions as you plan and conduct your systematic review

A good research question for a systematic review has four components, which you can remember with the acronym PICO:

  • Population(s) or problem(s)
  • Intervention(s)
  • Comparison(s)
  • Outcome(s)

You can rearrange these four components to write your research question:

  • What is the effectiveness of I versus C for O in P?

Sometimes, you may want to include a fifth component, the type of study design. In this case, the acronym is PICOT.

  • Type of study design(s)

In the example review, Boyle and colleagues’ question included:

  • The population of patients with eczema
  • The intervention of probiotics
  • In comparison to no treatment, placebo, or non-probiotic treatment
  • The outcome of changes in participant-, parent-, and doctor-rated symptoms of eczema and quality of life
  • Randomised controlled trials, a type of study design

Their research question was:

  • What is the effectiveness of probiotics versus no treatment, a placebo, or a non-probiotic treatment for reducing eczema symptoms and improving quality of life in patients with eczema?

Step 2: Develop a protocol

A protocol is a document that contains your research plan for the systematic review. This is an important step because having a plan allows you to work more efficiently and reduces bias.

Your protocol should include the following components:

  • Background information: Provide the context of the research question, including why it’s important.
  • Research objective(s): Rephrase your research question as an objective.
  • Selection criteria: State how you’ll decide which studies to include or exclude from your review.
  • Search strategy: Discuss your plan for finding studies.
  • Analysis: Explain what information you’ll collect from the studies and how you’ll synthesise the data.

If you’re a professional seeking to publish your review, it’s a good idea to bring together an advisory committee. This is a group of about six people who have experience in the topic you’re researching. They can help you make decisions about your protocol.

It’s highly recommended to register your protocol. Registering your protocol means submitting it to a database such as PROSPERO or ClinicalTrials.gov.

Step 3: Search for all relevant studies

Searching for relevant studies is the most time-consuming step of a systematic review.

To reduce bias, it’s important to search for relevant studies very thoroughly. Your strategy will depend on your field and your research question, but sources generally fall into these four categories:

  • Databases: Search multiple databases of peer-reviewed literature, such as PubMed or Scopus. Think carefully about how to phrase your search terms and include multiple synonyms of each word. Use Boolean operators if relevant.
  • Handsearching: In addition to searching the primary sources using databases, you’ll also need to search manually. One strategy is to scan relevant journals or conference proceedings. Another strategy is to scan the reference lists of relevant studies.
  • Grey literature: Grey literature includes documents produced by governments, universities, and other institutions that aren’t published by traditional publishers. Graduate student theses are an important type of grey literature, which you can search using the Networked Digital Library of Theses and Dissertations (NDLTD). In medicine, clinical trial registries are another important type of grey literature.
  • Experts: Contact experts in the field to ask if they have unpublished studies that should be included in your review.

At this stage of your review, you won’t read the articles yet. Simply save any potentially relevant citations using bibliographic software, such as Scribbr’s APA or MLA Generator.

  • Databases: EMBASE, PsycINFO, AMED, LILACS, and ISI Web of Science
  • Handsearch: Conference proceedings and reference lists of articles
  • Grey literature: The Cochrane Library, the metaRegister of Controlled Trials, and the Ongoing Skin Trials Register
  • Experts: Authors of unpublished registered trials, pharmaceutical companies, and manufacturers of probiotics
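To make the idea of combining synonyms with Boolean operators (Step 3 above) concrete, here is a minimal sketch in Python; the concept terms are illustrative assumptions, not the actual search strategy used in the example review:

```python
# Minimal sketch: assembling a Boolean search string for the
# probiotics-for-eczema question. Terms are illustrative assumptions only.
population = ["eczema", "atopic dermatitis", "atopic eczema"]
intervention = ["probiotic*", "lactobacillus", "bifidobacterium"]

def or_block(terms):
    """Join synonyms with OR, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Concept blocks are combined with AND; the same string can then be adapted
# to each database's own syntax and field tags.
query = " AND ".join([or_block(population), or_block(intervention)])
print(query)
# (eczema OR "atopic dermatitis" OR "atopic eczema") AND (probiotic* OR lactobacillus OR bifidobacterium)
```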

Step 4: Apply the selection criteria

Applying the selection criteria is a three-person job. Two of you will independently read the studies and decide which to include in your review based on the selection criteria you established in your protocol. The third person’s job is to break any ties.

To increase inter-rater reliability, ensure that everyone thoroughly understands the selection criteria before you begin.

If you’re writing a systematic review as a student for an assignment, you might not have a team. In this case, you’ll have to apply the selection criteria on your own; you can mention this as a limitation in your paper’s discussion.

You should apply the selection criteria in two phases:

  • Based on the titles and abstracts: Decide whether each article potentially meets the selection criteria based on the information provided in the abstracts.
  • Based on the full texts: Download the articles that weren’t excluded during the first phase. If an article isn’t available online or through your library, you may need to contact the authors to ask for a copy. Read the articles and decide which articles meet the selection criteria.

It’s very important to keep a meticulous record of why you included or excluded each article. When the selection process is complete, you can summarise what you did using a PRISMA flow diagram.

In the example review, after screening titles and abstracts, Boyle and colleagues obtained the full texts of the remaining studies. Boyle and Tang then read through the articles to decide if any more studies needed to be excluded based on the selection criteria.

When Boyle and Tang disagreed about whether a study should be excluded, they discussed it with Varigos until the three researchers came to an agreement.
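As a small, hypothetical sketch of the record-keeping described above (the studies and exclusion reasons are invented), screening decisions can be logged so that the PRISMA flow numbers fall out automatically:

```python
from collections import Counter

# Hypothetical screening log; study IDs and exclusion reasons are invented.
screening_log = [
    {"id": "study_001", "include": False, "reason": "wrong population"},
    {"id": "study_002", "include": True,  "reason": None},
    {"id": "study_003", "include": False, "reason": "no control group"},
]

included = sum(1 for record in screening_log if record["include"])
excluded = Counter(record["reason"] for record in screening_log if not record["include"])

print(f"Included after full-text screening: {included}")
for reason, count in excluded.items():
    print(f"Excluded ({reason}): {count}")
```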

Step 5: Extract the data

Extracting the data means collecting information from the selected studies in a systematic way. There are two types of information you need to collect from each study:

  • Information about the study’s methods and results. The exact information will depend on your research question, but it might include the year, study design, sample size, context, research findings, and conclusions. If any data are missing, you’ll need to contact the study’s authors.
  • Your judgement of the quality of the evidence, including risk of bias.

You should collect this information using forms. You can find sample forms in The Registry of Methods and Tools for Evidence-Informed Decision Making and the Grading of Recommendations, Assessment, Development and Evaluations Working Group.

Extracting the data is also a three-person job. Two people should do this step independently, and the third person will resolve any disagreements.

They also collected data about possible sources of bias, such as how the study participants were randomised into the control and treatment groups.
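A data extraction form can be as simple as one structured record per study. The sketch below is hypothetical: the field names and values are illustrative assumptions, not data from the example review.

```python
# Hypothetical data-extraction record for one study; every value is invented.
extraction_record = {
    "study_id": "trial_04",
    "year": 2016,
    "design": "randomised controlled trial",
    "sample_size": 120,
    "population": "children with moderate eczema",
    "intervention": "probiotic supplement for 12 weeks",
    "comparison": "placebo",
    "outcome_measure": "parent-rated symptom score",   # assumed outcome scale
    "effect_estimate": -1.5,                           # invented value
    "standard_error": 0.8,                             # invented value
    "risk_of_bias": {"randomisation": "low", "blinding": "unclear"},
}
```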

Step 6: Synthesise the data

Synthesising the data means bringing together the information you collected into a single, cohesive story. There are two main approaches to synthesising the data:

  • Narrative (qualitative): Summarise the information in words. You’ll need to discuss the studies and assess their overall quality.
  • Quantitative: Use statistical methods to summarise and compare data from different studies. The most common quantitative approach is a meta-analysis, which allows you to combine results from multiple studies into a summary result.

Generally, you should use both approaches together whenever possible. If you don’t have enough data, or the data from different studies aren’t comparable, then you can take just a narrative approach. However, you should justify why a quantitative approach wasn’t possible.

Boyle and colleagues also divided the studies into subgroups, such as studies about babies, children, and adults, and analysed the effect sizes within each group.
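To illustrate the quantitative side of this step, here is a minimal sketch of pooling effect estimates within subgroups using fixed-effect, inverse-variance weights (the same formula shown earlier). All numbers are invented; a real meta-analysis would normally use dedicated software and assess heterogeneity.

```python
from collections import defaultdict

# Invented effect estimates and standard errors, grouped by subgroup.
studies = [
    {"subgroup": "children", "effect": -1.5, "se": 0.8},
    {"subgroup": "children", "effect": -0.9, "se": 0.6},
    {"subgroup": "adults",   "effect": -0.2, "se": 0.5},
]

by_group = defaultdict(list)
for study in studies:
    by_group[study["subgroup"]].append(study)

for group, members in by_group.items():
    weights = [1 / s["se"] ** 2 for s in members]
    pooled = sum(w * s["effect"] for w, s in zip(weights, members)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    print(f"{group}: pooled effect {pooled:.2f} (standard error {pooled_se:.2f})")
```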

Step 7: Write and publish a report

The purpose of writing a systematic review article is to share the answer to your research question and explain how you arrived at this answer.

Your article should include the following sections:

  • Abstract: A summary of the review
  • Introduction: Including the rationale and objectives
  • Methods: Including the selection criteria, search method, data extraction method, and synthesis method
  • Results: Including results of the search and selection process, study characteristics, risk of bias in the studies, and synthesis results
  • Discussion: Including interpretation of the results and limitations of the review
  • Conclusion: The answer to your research question and implications for practice, policy, or research

To verify that your report includes everything it needs, you can use the PRISMA checklist.

Once your report is written, you can publish it in a systematic review database, such as the Cochrane Database of Systematic Reviews, and/or in a peer-reviewed journal.

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

A literature review is a survey of scholarly sources (such as books, journal articles, and theses) related to a specific topic or research question.

It is often written as part of a dissertation, thesis, research paper, or proposal.

There are several reasons to conduct a literature review at the beginning of a research project:

  • To familiarise yourself with the current state of knowledge on your topic
  • To ensure that you’re not just repeating what others have already done
  • To identify gaps in knowledge and unresolved problems that your research can address
  • To develop your theoretical framework and methodology
  • To provide an overview of the key findings and debates on the topic

Writing the literature review shows your reader how your work relates to existing research and what new insights it will contribute.

Cite this Scribbr article


Turney, S. (2024, July 17). Systematic Review | Definition, Examples & Guide. Scribbr. Retrieved 21 August 2024, from https://www.scribbr.co.uk/research-methods/systematic-reviews/


Dahlgren Memorial Library

The Graduate Health & Life Sciences Research Library at Georgetown University Medical Center

Systematic Reviews


Guides and Standards

  • The Cochrane Handbook The Cochrane Handbook has become the de facto standard for planning and carrying out a systematic review. Chapter 6, Searching for Studies, is most helpful in planning your review.
  • Finding What Works in Health Care: Standards for Systematic Reviews The IOM standards promote objective, transparent, and scientifically valid systematic reviews. They address the entire systematic review process, from locating, screening, and selecting studies for the review, to synthesizing the findings (including meta-analysis) and assessing the overall quality of the body of evidence, to producing the final review report.
  • PRISMA Standards The Preferred Reporting Items for Systematic Reviews and Meta-Analyses is an evidence-based minimum set of items for reporting in systematic reviews and meta-analyses. A 27-item checklist, PRISMA focuses on randomized trials but can also be used as a basis for reporting systematic reviews of other types of research, particularly evaluations of interventions.

What is a systematic review?

A systematic literature review is a research methodology designed to answer a focused research question. Authors conduct a methodical and comprehensive literature synthesis focused on a well-formulated research question. Its aim is to identify and synthesize all of the scholarly research on a particular topic, including both published and unpublished studies. Systematic reviews are conducted in an unbiased, reproducible way to provide evidence for practice and policy-making and identify gaps in research.  Every step of the review, including the search, must be documented for reproducibility. 

Researchers in medicine may be most familiar with Cochrane Reviews, which synthesize randomized controlled trials to evaluate specific medical interventions. Systematic reviews are conducted in many other fields, though the type of evidence analyzed varies with the research question. 

When to use systematic review methodology

Systematic reviews require more time and manpower than traditional literature reviews. Before beginning a systematic review, researchers should address these questions:

Is there enough literature published on the topic to warrant a review?

Systematic reviews are designed to distill the evidence from many studies into actionable insights. Is there a body of evidence available to analyze, or does more primary research need to be done?

Can your research question be answered by a systematic review?

Systematic review questions should be specific and clearly defined. Questions that fit the PICO (problem/patient, intervention, comparison, outcome) format are usually well-suited for the systematic review methodology. The research question determines the search strategy, inclusion criteria, and data that you extract from the selected studies, so it should be clearly defined at the start of the review process.

Do you have a protocol outlining the review plan?

The protocol is the roadmap for the review project. A good protocol outlines study methodology, includes the rationale for the systematic review, and describes the key question broken into PICO components. It is also a good place to plan out inclusion/exclusion criteria, databases that will be searched, data abstraction and management methods, and how the studies will be assessed for methodological quality.

Do you have a team of experts?

A systematic review is a team effort. Having multiple reviewers minimizes bias and strengthens analysis. Teams are often composed of subject experts, two or more literature screeners, a librarian to conduct the search, and a statistician to analyze the data.

Do you have the time that it takes to properly conduct a systematic review?  

Systematic reviews typically take 12-18 months. 

Do you have a method for discerning bias?  

There are many types of bias, including selection, performance, & reporting bias, and assessing the risk of bias of individual studies is an important part of your study design.

Can you afford to have articles in languages other than English translated?  

You should include all relevant studies in your systematic review, regardless of the language they were published in, so as to avoid language bias. 

Which review is right for you?

If your project does not meet the above criteria, there are many more options for conducting a synthesis of the literature. The chart below highlights several review methodologies. Reproduced from: Grant MJ, Booth A. A typology of reviews: an analysis of 14 review types and associated methodologies. Health Info Libr J. 2009 Jun;26(2):91-108. doi: 10.1111/j.1471-1842.2009.00848.x. PMID: 19490148.

Each entry below gives the review type (label) followed by its description and its typical approach to searching, appraisal, synthesis, and analysis.

Critical review Aims to demonstrate writer has extensively researched literature and critically evaluated its quality. Goes beyond mere description to include degree of analysis and conceptual innovation. Typically results in hypothesis or model. Seeks to identify significant items in the field. No formal quality assessment. Attempts to evaluate according to contribution. Typically narrative, perhaps conceptual or chronological. Significant component: seeks to identify conceptual contribution to embody existing or derive new theory.
Literature review Generic term: a search for published materials that provide examination of recent or current literature. Can cover wide range of subjects at various levels of completeness and comprehensiveness. May include research findings. May or may not include comprehensive searching. May or may not include quality assessment. Typically narrative. Analysis may be chronological, conceptual, thematic, etc.
Mapping review/systematic map Maps out and categorizes existing literature from which to commission further reviews and/or primary research by identifying gaps in research literature. Completeness of searching determined by time/scope constraints. No formal quality assessment. May be graphical and tabular. Characterizes quantity and quality of literature, perhaps by study design and other key features. May identify need for primary or secondary research.
Meta-analysis Technique that statistically combines the results of quantitative studies to provide a more precise effect of the results. Aims for exhaustive searching. May use funnel plot to assess completeness. Quality assessment may determine inclusion/exclusion and/or sensitivity analyses. Graphical and tabular with narrative commentary. Numerical analysis of measures of effect assuming absence of heterogeneity.
Mixed studies review/mixed methods review Refers to any combination of methods where one significant component is a literature review (usually systematic). Within a review context it refers to a combination of review approaches for example combining quantitative with qualitative research or outcome with process studies. Requires either very sensitive search to retrieve all studies or separately conceived quantitative and qualitative strategies. Requires either a generic appraisal instrument or separate appraisal processes with corresponding checklists. Typically both components will be presented as narrative and in tables. May also employ graphical means of integrating quantitative and qualitative studies. Analysis may characterize both quantitative and qualitative studies and look for correlations between their characteristics or use gap analysis to identify aspects present in one type of study but missing in the other.
Overview Generic term: summary of the [medical] literature that attempts to survey the literature and describe its characteristics. May or may not include comprehensive searching (depends whether systematic overview or not). May or may not include quality assessment (depends whether systematic overview or not). Synthesis depends on whether systematic overview or not. Typically narrative but may include tabular features. Analysis may be chronological, conceptual, thematic, etc.
Qualitative systematic review/qualitative evidence synthesis Method for integrating or comparing the findings from qualitative studies. It looks for ‘themes’ or ‘constructs’ that lie in or across individual qualitative studies. May employ selective or purposive sampling. Quality assessment typically used to mediate messages not for inclusion/exclusion. Qualitative, narrative synthesis. Thematic analysis, may include conceptual models.
Rapid review Assessment of what is already known about a policy or practice issue, by using systematic review methods to search and critically appraise existing research. Completeness of searching determined by time constraints. Time-limited formal quality assessment. Typically narrative and tabular. Quantities of literature and overall quality/direction of effect of literature.
Scoping review Preliminary assessment of potential size and scope of available research literature. Aims to identify nature and extent of research evidence (usually including ongoing research). Completeness of searching determined by time/scope constraints. May include research in progress. No formal quality assessment. Typically tabular with some narrative commentary. Characterizes quantity and quality of literature, perhaps by study design and other key features. Attempts to specify a viable review.
State-of-the-art review Tend to address more current matters in contrast to other combined retrospective and current approaches. May offer new perspectives on issue or point out area for further research. Aims for comprehensive searching of current literature. No formal quality assessment. Typically narrative, may have tabular accompaniment. Current state of knowledge and priorities for future investigation and research.
Systematic review Seeks to systematically search for, appraise and synthesize research evidence, often adhering to guidelines on the conduct of a review. Aims for exhaustive, comprehensive searching. Quality assessment may determine inclusion/exclusion. Typically narrative with tabular accompaniment. What is known; recommendations for practice. What remains unknown; uncertainty around findings, recommendations for future research.
Systematic search and review Combines strengths of critical review with a comprehensive search process. Typically addresses broad questions to produce ‘best evidence synthesis.' Aims for exhaustive, comprehensive searching. May or may not include quality assessment. Minimal narrative, tabular summary of studies. What is known; recommendations for practice. Limitations.
Systematized review Attempt to include elements of systematic review process while stopping short of systematic review. Typically conducted as postgraduate student assignment. May or may not include comprehensive searching. May or may not include quality assessment. Typically narrative with tabular accompaniment. What is known; uncertainty around findings; limitations of methodology.
Umbrella review Specifically refers to review compiling evidence from multiple reviews into one accessible and usable document. Focuses on broad condition or problem for which there are competing interventions and highlights reviews that address these interventions and their results. Identification of component reviews, but no search for primary studies. Quality assessment of studies within component reviews and/or of reviews themselves. Graphical and tabular with narrative commentary. What is known; recommendations for practice. What remains unknown; recommendations for future research.

Is a systematic review primary or secondary research?

Understanding Nursing Research


Secondary Research and Systematic Reviews


Secondary Research is when researchers collect lots of research that has already been published on a certain subject. They conduct searches in databases, go through lots of primary research articles, and analyze the findings in those pieces of primary research. The goal of secondary research is to pull together lots of diverse primary research (like studies and trials), with the end goal of making a generalized statement. Primary research can only make statements about the specific context in which the research was conducted (for example, this specific intervention worked in this hospital with these participants), but secondary research can make broader statements because it compiles lots of primary research together. So rather than saying, "this specific intervention worked at this specific hospital with these specific participants," a piece of secondary research can say, "This intervention works at hospitals that serve this population."

Systematic Reviews are a kind of secondary research. The creators of systematic reviews are very intentional about their inclusion/exclusion criteria (which articles they'll include in their review), and the goal is to make a generalized statement so other researchers can build upon the practices or interventions they recommend. Use the chart below to understand the differences between a systematic review and a literature review.


[Comparative table: Literature Review vs Systematic Review vs Meta-Analysis]

  • "Literature Reviews and Systematic Reviews: What Is the Difference?" This article explains in depth the differences between Literature Reviews and Systematic Reviews. It is from the journal RADIOLOGIC TECHNOLOGY, Nov/Dec 2013, v. 85, #2. It is one to which Bell Library subscribes and meets copyright clearance requirements through our subscription to CCC.

Duke University Medical Center Library

Systematic Reviews: Types of Reviews

Review Typologies

There are many types of evidence synthesis projects; a systematic review is only one of them. The selection of review type is wholly dependent on the research question. Not all research questions are well-suited for systematic reviews.

  • Review Typologies (from LITR-EX) This site explores different review methodologies, such as systematic, scoping, realist, narrative, state-of-the-art, meta-ethnography, critical, and integrative reviews. The LITR-EX site has a health professions education focus, but the advice and information are widely applicable.

The review types and their associated methodologies are summarised in the Grant and Booth typology table reproduced in the previous section (Grant, M. J. and Booth, A. (2009), A typology of reviews: an analysis of 14 review types and associated methodologies. Health Information & Libraries Journal, 26: 91-108. doi:10.1111/j.1471-1842.2009.00848.x). Librarians can also help your team determine which review type might be appropriate for your project.



Frequently asked questions

Is a systematic review primary research?

A systematic review is secondary research because it uses existing research. You don’t collect new data yourself.

Frequently asked questions: Methodology

Attrition refers to participants leaving a study. It always happens to some extent—for example, in randomized controlled trials for medical research.

Differential attrition occurs when attrition or dropout rates differ systematically between the intervention and the control group. As a result, the characteristics of the participants who drop out differ from the characteristics of those who stay in the study. Because of this, study results may be biased.

Action research is conducted in order to solve a particular issue immediately, while case studies are often conducted over a longer period of time and focus more on observing and analyzing a particular ongoing phenomenon.

Action research is focused on solving a problem or informing individual and community-based knowledge in a way that impacts teaching, learning, and other related processes. It is less focused on contributing theoretical input, instead producing actionable input.

Action research is particularly popular with educators as a form of systematic inquiry because it prioritizes reflection and bridges the gap between theory and practice. Educators are able to simultaneously investigate an issue as they solve it, and the method is very iterative and flexible.

A cycle of inquiry is another name for action research. It is usually visualized in a spiral shape following a series of steps, such as “planning → acting → observing → reflecting.”

To make quantitative observations, you need to use instruments that are capable of measuring the quantity you want to observe. For example, you might use a ruler to measure the length of an object or a thermometer to measure its temperature.

Criterion validity and construct validity are both types of measurement validity. In other words, they both show you how accurately a method measures something.

While construct validity is the degree to which a test or other measurement method measures what it claims to measure, criterion validity is the degree to which a test can predictively (in the future) or concurrently (in the present) measure something.

Construct validity is often considered the overarching type of measurement validity. You need to have face validity, content validity, and criterion validity in order to achieve construct validity.

Convergent validity and discriminant validity are both subtypes of construct validity. Together, they help you evaluate whether a test measures the concept it was designed to measure.

  • Convergent validity indicates whether a test that is designed to measure a particular construct correlates with other tests that assess the same or similar construct.
  • Discriminant validity indicates whether two tests that should not be highly related to each other are indeed not related. This type of validity is also called divergent validity.

You need to assess both in order to demonstrate construct validity. Neither one alone is sufficient for establishing construct validity.


Content validity shows you how accurately a test or other measurement method taps into the various aspects of the specific construct you are researching.

In other words, it helps you answer the question: “does the test measure all aspects of the construct I want to measure?” If it does, then the test has high content validity.

The higher the content validity, the more accurate the measurement of the construct.

If the test fails to include parts of the construct, or irrelevant parts are included, the validity of the instrument is threatened, which brings your results into question.

Face validity and content validity are similar in that they both evaluate how suitable the content of a test is. The difference is that face validity is subjective, and assesses content at surface level.

When a test has strong face validity, anyone would agree that the test’s questions appear to measure what they are intended to measure.

For example, looking at a 4th grade math test consisting of problems in which students have to add and multiply, most people would agree that it has strong face validity (i.e., it looks like a math test).

On the other hand, content validity evaluates how well a test represents all the aspects of a topic. Assessing content validity is more systematic and relies on expert evaluation of each question, analyzing whether each one covers the aspects that the test was designed to cover.

A 4th grade math test would have high content validity if it covered all the skills taught in that grade. Experts (in this case, math teachers) would have to evaluate the content validity by comparing the test to the learning objectives.

Snowball sampling is a non-probability sampling method. Unlike probability sampling (which involves some form of random selection), the initial individuals selected to be studied are the ones who recruit new participants.

Because not every member of the target population has an equal chance of being recruited into the sample, selection in snowball sampling is non-random.

Snowball sampling is a non-probability sampling method, where there is not an equal chance for every member of the population to be included in the sample.

This means that you cannot use inferential statistics and make generalizations, often the goal of quantitative research. As such, a snowball sample is not representative of the target population and is usually a better fit for qualitative research.

Snowball sampling relies on the use of referrals. Here, the researcher recruits one or more initial participants, who then recruit the next ones.

Participants share similar characteristics and/or know each other. Because of this, not every member of the population has an equal chance of being included in the sample, giving rise to sampling bias.

Snowball sampling is best used in the following cases:

  • If there is no sampling frame available (e.g., people with a rare disease)
  • If the population of interest is hard to access or locate (e.g., people experiencing homelessness)
  • If the research focuses on a sensitive topic (e.g., extramarital affairs)

The reproducibility and replicability of a study can be ensured by writing a transparent, detailed method section and using clear, unambiguous language.

Reproducibility and replicability are related terms.

  • Reproducing research entails reanalyzing the existing data in the same manner.
  • Replicating (or repeating) the research entails reconducting the entire analysis, including the collection of new data.
  • A successful reproduction shows that the data analyses were conducted in a fair and honest manner.
  • A successful replication shows that the reliability of the results is high.

Stratified sampling and quota sampling both involve dividing the population into subgroups and selecting units from each subgroup. The purpose in both cases is to select a representative sample and/or to allow comparisons between subgroups.

The main difference is that in stratified sampling, you draw a random sample from each subgroup (probability sampling). In quota sampling you select a predetermined number or proportion of units, in a non-random manner (non-probability sampling).

Purposive and convenience sampling are both sampling methods that are typically used in qualitative data collection.

A convenience sample is drawn from a source that is conveniently accessible to the researcher. Convenience sampling does not distinguish characteristics among the participants. On the other hand, purposive sampling focuses on selecting participants possessing characteristics associated with the research study.

The findings of studies based on either convenience or purposive sampling can only be generalized to the (sub)population from which the sample is drawn, and not to the entire population.

Random sampling or probability sampling is based on random selection. This means that each unit has an equal chance (i.e., equal probability) of being included in the sample.

On the other hand, convenience sampling involves stopping people at random, which means that not everyone has an equal chance of being selected depending on the place, time, or day you are collecting your data.

Convenience sampling and quota sampling are both non-probability sampling methods. They both use non-random criteria like availability, geographical proximity, or expert knowledge to recruit study participants.

However, in convenience sampling, you continue to sample units or cases until you reach the required sample size.

In quota sampling, you first need to divide your population of interest into subgroups (strata) and estimate their proportions (quota) in the population. Then you can start your data collection, using convenience sampling to recruit participants, until the proportions in each subgroup coincide with the estimated proportions in the population.

A sampling frame is a list of every member in the entire population. It is important that the sampling frame is as complete as possible, so that your sample accurately reflects your population.

Stratified and cluster sampling may look similar, but bear in mind that groups created in cluster sampling are heterogeneous, so the individual characteristics in the cluster vary. In contrast, groups created in stratified sampling are homogeneous, as units share characteristics.

Relatedly, in cluster sampling you randomly select entire groups and include all units of each group in your sample. However, in stratified sampling, you select some units of all groups and include them in your sample. In this way, both methods can ensure that your sample is representative of the target population.

The key difference between observational studies and experimental designs is that a well-done observational study does not influence the responses of participants, while experiments do have some sort of treatment condition applied to at least some participants by random assignment.

An observational study is a great choice for you if your research question is based purely on observations. If there are ethical, logistical, or practical concerns that prevent you from conducting a traditional experiment, an observational study may be a good choice. In an observational study, there is no interference or manipulation of the research subjects, as well as no control or treatment groups.

It’s often best to ask a variety of people to review your measurements. You can ask experts, such as other researchers, or laypeople, such as potential participants, to judge the face validity of tests.

While experts have a deep understanding of research methods, the people you’re studying can provide you with valuable insights you may have missed otherwise.

Face validity is important because it’s a simple first step to measuring the overall validity of a test or technique. It’s a relatively intuitive, quick, and easy way to start checking whether a new measure seems useful at first glance.

Good face validity means that anyone who reviews your measure says that it seems to be measuring what it’s supposed to. With poor face validity, someone reviewing your measure may be left confused about what you’re measuring and why you’re using this method.

Face validity is about whether a test appears to measure what it’s supposed to measure. This type of validity is concerned with whether a measure seems relevant and appropriate for what it’s assessing only on the surface.

Statistical analyses are often applied to test validity with data from your measures. You test convergent validity and discriminant validity with correlations to see if results from your test are positively or negatively related to those of other established tests.

You can also use regression analyses to assess whether your measure is actually predictive of outcomes that you expect it to predict theoretically. A regression analysis that supports your expectations strengthens your claim of construct validity.

When designing or evaluating a measure, construct validity helps you ensure you’re actually measuring the construct you’re interested in. If you don’t have construct validity, you may inadvertently measure unrelated or distinct constructs and lose precision in your research.

Construct validity is often considered the overarching type of measurement validity, because it covers all of the other types. You need to have face validity, content validity, and criterion validity to achieve construct validity.

Construct validity is about how well a test measures the concept it was designed to evaluate. It’s one of four types of measurement validity; the others are content validity, face validity, and criterion validity.

There are two subtypes of construct validity.

  • Convergent validity: The extent to which your measure corresponds to measures of related constructs
  • Discriminant validity: The extent to which your measure is unrelated or negatively related to measures of distinct constructs

Naturalistic observation is a valuable tool because of its flexibility, external validity, and suitability for topics that can’t be studied in a lab setting.

The downsides of naturalistic observation include its lack of scientific control, ethical considerations, and potential for bias from observers and subjects.

Naturalistic observation is a qualitative research method where you record the behaviors of your research subjects in real world settings. You avoid interfering or influencing anything in a naturalistic observation.

You can think of naturalistic observation as “people watching” with a purpose.

A dependent variable is what changes as a result of the independent variable manipulation in experiments . It’s what you’re interested in measuring, and it “depends” on your independent variable.

In statistics, dependent variables are also called:

  • Response variables (they respond to a change in another variable)
  • Outcome variables (they represent the outcome you want to measure)
  • Left-hand-side variables (they appear on the left-hand side of a regression equation)

An independent variable is the variable you manipulate, control, or vary in an experimental study to explore its effects. It’s called “independent” because it’s not influenced by any other variables in the study.

Independent variables are also called:

  • Explanatory variables (they explain an event or outcome)
  • Predictor variables (they can be used to predict the value of a dependent variable)
  • Right-hand-side variables (they appear on the right-hand side of a regression equation).

As a rule of thumb, questions related to thoughts, beliefs, and feelings work well in focus groups. Take your time formulating strong questions, paying special attention to phrasing. Be careful to avoid leading questions , which can bias your responses.

Overall, your focus group questions should be:

  • Open-ended and flexible
  • Impossible to answer with “yes” or “no” (questions that start with “why” or “how” are often best)
  • Unambiguous, getting straight to the point while still stimulating discussion
  • Unbiased and neutral

A structured interview is a data collection method that relies on asking questions in a set order to collect data on a topic; it is often quantitative in nature. Structured interviews are best used when:

  • You already have a very clear understanding of your topic. Perhaps significant research has already been conducted, or you have done some prior research yourself, so you already possess a baseline for designing strong structured questions.
  • You are constrained in terms of time or resources and need to analyze your data quickly and efficiently.
  • Your research question depends on strong parity between participants, with environmental conditions held constant.

More flexible interview options include semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias is the tendency for interview participants to give responses that will be viewed favorably by the interviewer or other participants. It occurs in all types of interviews and surveys , but is most common in semi-structured interviews , unstructured interviews , and focus groups .

Social desirability bias can be mitigated by ensuring participants feel at ease and comfortable sharing their views. Make sure to pay attention to your own body language and any physical or verbal cues, such as nodding or widening your eyes.

This type of bias can also occur in observations if the participants know they’re being observed. They might alter their behavior accordingly.

The interviewer effect is a type of bias that emerges when a characteristic of an interviewer (race, age, gender identity, etc.) influences the responses given by the interviewee.

There is a risk of an interviewer effect in all types of interviews , but it can be mitigated by writing really high-quality interview questions.

A semi-structured interview is a blend of structured and unstructured types of interviews. Semi-structured interviews are best used when:

  • You have prior interview experience. Spontaneous questions are deceptively challenging, and it’s easy to accidentally ask a leading question or make a participant uncomfortable.
  • Your research question is exploratory in nature. Participant answers can guide future research questions and help you develop a more robust knowledge base for future research.

An unstructured interview is the most flexible type of interview, but it is not always the best fit for your research topic.

Unstructured interviews are best used when:

  • You are an experienced interviewer and have a very strong background in your research topic, since it is challenging to ask spontaneous, colloquial questions.
  • Your research question is exploratory in nature. While you may have developed hypotheses, you are open to discovering new or shifting viewpoints through the interview process.
  • You are seeking descriptive data, and are ready to ask questions that will deepen and contextualize your initial thoughts and hypotheses.
  • Your research depends on forming connections with your participants and making them feel comfortable revealing deeper emotions, lived experiences, or thoughts.

The four most common types of interviews are:

  • Structured interviews : The questions are predetermined in both topic and order. 
  • Semi-structured interviews : A few questions are predetermined, but other questions aren’t planned.
  • Unstructured interviews : None of the questions are predetermined.
  • Focus group interviews : The questions are presented to a group instead of one individual.

Deductive reasoning is commonly used in scientific research, and it’s especially associated with quantitative research .

In research, you might have come across something called the hypothetico-deductive method . It’s the scientific method of testing hypotheses to check whether your predictions are substantiated by real-world data.

Deductive reasoning is a logical approach where you progress from general ideas to specific conclusions. It’s often contrasted with inductive reasoning , where you start with specific observations and form general conclusions.

Deductive reasoning is also called deductive logic.

There are many different types of inductive reasoning that people use formally or informally.

Here are a few common types:

  • Inductive generalization : You use observations about a sample to come to a conclusion about the population it came from.
  • Statistical generalization: You use specific numbers about samples to make statements about populations.
  • Causal reasoning: You make cause-and-effect links between different things.
  • Sign reasoning: You make a conclusion about a correlational relationship between different things.
  • Analogical reasoning: You make a conclusion about something based on its similarities to something else.

Inductive reasoning is a bottom-up approach, while deductive reasoning is top-down.

Inductive reasoning takes you from the specific to the general, while in deductive reasoning, you make inferences by going from general premises to specific conclusions.

In inductive research , you start by making observations or gathering data. Then, you take a broad scan of your data and search for patterns. Finally, you make general conclusions that you might incorporate into theories.

Inductive reasoning is a method of drawing conclusions by going from the specific to the general. It’s usually contrasted with deductive reasoning, where you proceed from general information to specific conclusions.

Inductive reasoning is also called inductive logic or bottom-up reasoning.

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess — it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations and statistical analysis of data).

Triangulation can help:

  • Reduce research bias that comes from using a single method, theory, or investigator
  • Enhance validity by approaching the same topic with different tools
  • Establish credibility by giving you a complete picture of the research problem

But triangulation can also pose problems:

  • It’s time-consuming and labor-intensive, often involving an interdisciplinary team.
  • Your results may be inconsistent or even contradictory.

There are four main types of triangulation :

  • Data triangulation : Using data from different times, spaces, and people
  • Investigator triangulation : Involving multiple researchers in collecting or analyzing data
  • Theory triangulation : Using varying theoretical perspectives in your research
  • Methodological triangulation : Using different methodologies to approach the same topic

Many academic fields use peer review , largely to determine whether a manuscript is suitable for publication. Peer review enhances the credibility of the published manuscript.

However, peer review is also common in non-academic settings. The United Nations, the European Union, and many individual nations use peer review to evaluate grant applications. It is also widely used in medical and health-related fields as a teaching or quality-of-care measure. 

Peer assessment is often used in the classroom as a pedagogical tool. Both receiving feedback and providing it are thought to enhance the learning process, helping students think critically and collaboratively.

Peer review can stop obviously problematic, falsified, or otherwise untrustworthy research from being published. It also represents an excellent opportunity to get feedback from renowned experts in your field. It acts as a first defense, helping you ensure your argument is clear and that there are no gaps, vague terms, or unanswered questions for readers who weren’t involved in the research process.

Peer-reviewed articles are considered highly credible sources due to the stringent process they go through before publication.

In general, the peer review process follows these steps: 

  • First, the author submits the manuscript to the editor.
  • Next, the editor assesses the manuscript and decides whether to reject it and send it back to the author, or to send it onward to the selected peer reviewer(s).
  • Then the peer review itself occurs: the reviewer provides feedback, addressing any major or minor issues with the manuscript, and advises on what edits should be made.
  • Lastly, the edited manuscript is sent back to the author, who incorporates the edits and resubmits it to the editor for publication.

Exploratory research is often used when the issue you’re studying is new or when the data collection process is challenging for some reason.

You can use exploratory research if you have a general idea or a specific question that you want to study but there is no preexisting knowledge or paradigm with which to study it.

Exploratory research is a methodology approach that explores research questions that have not previously been studied in depth. It is often used when the issue you’re studying is new, or the data collection process is challenging in some way.

Explanatory research is used to investigate how or why a phenomenon occurs. It is often one of the first stages in the research process , serving as a jumping-off point for future research.

Exploratory research aims to explore the main aspects of an under-researched problem, while explanatory research aims to explain the causes and consequences of a well-defined problem.

Explanatory research is a research method used to investigate how or why something occurs when only a small amount of information is available pertaining to that topic. It can help you increase your understanding of a given topic.

Clean data are valid, accurate, complete, consistent, unique, and uniform. Dirty data include inconsistencies and errors.

Dirty data can come from any part of the research process, including poor research design , inappropriate measurement materials, or flawed data entry.

Data cleaning takes place between data collection and data analyses. But you can use some methods even before collecting data.

For clean data, you should start by designing measures that collect valid data. Data validation at the time of data entry or collection helps you minimize the amount of data cleaning you’ll need to do.

After data collection, you can use data standardization and data transformation to clean your data. You’ll also deal with any missing values, outliers, and duplicate values.

Every dataset requires different techniques to clean dirty data , but you need to address these issues in a systematic way. You focus on finding and resolving data points that don’t agree or fit with the rest of your dataset.

These data might be missing values, outliers, duplicate values, incorrectly formatted, or irrelevant. You’ll start with screening and diagnosing your data. Then, you’ll often standardize and accept or remove data to make your dataset consistent and valid.
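
As a small illustration, the pandas sketch below (hypothetical column names and values) walks through the screening steps just described: removing duplicates, checking and handling missing values, and flagging out-of-range outliers.

```python
import pandas as pd
import numpy as np

# Hypothetical survey data (illustration only)
df = pd.DataFrame({
    "participant_id": [1, 2, 2, 3, 4, 5],
    "age": [25, 31, 31, np.nan, 29, 230],   # one missing value, one implausible outlier
    "score": [4, 5, 5, 3, np.nan, 4],
})

df = df.drop_duplicates()          # 1. remove exact duplicate rows
print(df.isna().sum())             # 2. screen for missing values per column
df = df.dropna()                   # 3. here: drop rows with any missing data

# 4. flag values outside a plausible range instead of silently deleting them
df["age_valid"] = df["age"].between(18, 100)
print(df)
```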

Data cleaning is necessary for valid and appropriate analyses. Dirty data contain inconsistencies or errors , but cleaning your data helps you minimize or resolve these.

Without data cleaning, you could end up with a Type I or Type II error in your conclusions. These erroneous conclusions can have serious practical consequences, because they lead to misplaced investments or missed opportunities.

Data cleaning involves spotting and resolving potential data inconsistencies or errors to improve your data quality. An error is any value (e.g., recorded weight) that doesn’t reflect the true value (e.g., actual weight) of something that’s being measured.

In this process, you review, analyze, detect, modify, or remove “dirty” data to make your dataset “clean.” Data cleaning is also called data cleansing or data scrubbing.

Research misconduct means making up or falsifying data, manipulating data analyses, or misrepresenting results in research reports. It’s a form of academic fraud.

These actions are committed intentionally and can have serious consequences; research misconduct is not a simple mistake or a point of disagreement but a serious ethical failure.

Anonymity means you don’t know who the participants are, while confidentiality means you know who they are but remove identifying information from your research report. Both are important ethical considerations .

You can only guarantee anonymity by not collecting any personally identifying information—for example, names, phone numbers, email addresses, IP addresses, physical characteristics, photos, or videos.

You can keep data confidential by using aggregate information in your research report, so that you only refer to groups of participants rather than individuals.

Research ethics matter for scientific integrity, human rights and dignity, and collaboration between science and society. These principles make sure that participation in studies is voluntary, informed, and safe.

Ethical considerations in research are a set of principles that guide your research designs and practices. These principles include voluntary participation, informed consent, anonymity, confidentiality, potential for harm, and results communication.

Scientists and researchers must always adhere to a certain code of conduct when collecting data from others .

These considerations protect the rights of research participants, enhance research validity , and maintain scientific integrity.

In multistage sampling , you can use probability or non-probability sampling methods .

For a probability sample, you have to conduct probability sampling at every stage.

You can mix it up by using simple random sampling , systematic sampling , or stratified sampling to select units at different stages, depending on what is applicable and relevant to your study.

Multistage sampling can simplify data collection when you have large, geographically spread samples, and you can obtain a probability sample without a complete sampling frame.

But multistage sampling may not lead to a representative sample, and larger samples are needed for multistage samples to achieve the statistical properties of simple random samples .

These are four of the most common mixed methods designs :

  • Convergent parallel: Quantitative and qualitative data are collected at the same time and analyzed separately. After both analyses are complete, compare your results to draw overall conclusions. 
  • Embedded: Quantitative and qualitative data are collected at the same time, but within a larger quantitative or qualitative design. One type of data is secondary to the other.
  • Explanatory sequential: Quantitative data is collected and analyzed first, followed by qualitative data. You can use this design if you think your qualitative data will explain and contextualize your quantitative findings.
  • Exploratory sequential: Qualitative data is collected and analyzed first, followed by quantitative data. You can use this design if you think the quantitative data will confirm or validate your qualitative findings.

Triangulation in research means using multiple datasets, methods, theories and/or investigators to address a research question. It’s a research strategy that can help you enhance the validity and credibility of your findings.

Triangulation is mainly used in qualitative research , but it’s also commonly applied in quantitative research . Mixed methods research always uses triangulation.

In multistage sampling , or multistage cluster sampling, you draw a sample from a population using smaller and smaller groups at each stage.

This method is often used to collect data from a large, geographically spread group of people in national surveys, for example. You take advantage of hierarchical groupings (e.g., from state to city to neighborhood) to create a sample that’s less expensive and time-consuming to collect data from.

No, the steepness or slope of the line isn’t related to the correlation coefficient value. The correlation coefficient only tells you how closely your data fit on a line, so two datasets with the same correlation coefficient can have very different slopes.

To find the slope of the line, you’ll need to perform a regression analysis .
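
A short NumPy sketch (made-up numbers) illustrates the difference: two datasets can have identical correlation coefficients but very different regression slopes.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_steep = 2.0 * x      # points lie exactly on a steep line
y_shallow = 0.5 * x    # points lie exactly on a shallow line

# Both correlations are exactly 1 because each dataset fits a straight line perfectly
print(np.corrcoef(x, y_steep)[0, 1])    # 1.0
print(np.corrcoef(x, y_shallow)[0, 1])  # 1.0

# The slope comes from a regression fit, not from the correlation coefficient
print(np.polyfit(x, y_steep, 1)[0])     # 2.0
print(np.polyfit(x, y_shallow, 1)[0])   # 0.5
```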

Correlation coefficients always range between -1 and 1.

The sign of the coefficient tells you the direction of the relationship: a positive value means the variables change together in the same direction, while a negative value means they change together in opposite directions.

The absolute value of a number is equal to the number without its sign. The absolute value of a correlation coefficient tells you the magnitude of the correlation: the greater the absolute value, the stronger the correlation.

These are the assumptions your data must meet if you want to use Pearson’s r :

  • Both variables are on an interval or ratio level of measurement
  • Data from both variables follow normal distributions
  • Your data have no outliers
  • Your data come from a random or representative sample
  • You expect a linear relationship between the two variables

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions. Together with credible sources , this allows you to draw valid , trustworthy conclusions.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A research design is a strategy for answering your   research question . It defines your overall approach and determines how you will collect and analyze data.

Questionnaires can be self-administered or researcher-administered.

Self-administered questionnaires can be delivered online or in paper-and-pen formats, in person or through mail. All questions are standardized so that all respondents receive the same questions with identical wording.

Researcher-administered questionnaires are interviews that take place by phone, in-person, or online between researchers and respondents. You can gain deeper insights by clarifying questions for respondents or asking follow-up questions.

You can organize the questions logically, with a clear progression from simple to complex, or randomly between respondents. A logical flow helps respondents process the questionnaire more easily and quickly, but it may lead to bias; randomizing the question order can minimize order effects.
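
For instance, randomizing the question order for each respondent can be done with a simple shuffle. The Python sketch below uses hypothetical question labels; seeding by respondent ID keeps each ordering reproducible.

```python
import random

questions = [
    "Q1: How often do you exercise?",
    "Q2: How many hours do you sleep?",
    "Q3: How would you rate your diet?",
    "Q4: How stressed do you feel day to day?",
]

def questionnaire_for(respondent_id):
    """Return the questions in a random (but reproducible) order for one respondent."""
    order = questions.copy()
    random.Random(respondent_id).shuffle(order)
    return order

print(questionnaire_for(1))
print(questionnaire_for(2))
```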

Closed-ended, or restricted-choice, questions offer respondents a fixed set of choices to select from. These questions are easier to answer quickly.

Open-ended or long-form questions allow respondents to answer in their own words. Because there are no restrictions on their choices, respondents can answer in ways that researchers may not have otherwise considered.

A questionnaire is a data collection tool or instrument, while a survey is an overarching research method that involves collecting and analyzing data from people using questionnaires.

The third variable and directionality problems are two main reasons why correlation isn’t causation .

The third variable problem means that a confounding variable affects both variables to make them seem causally related when they are not.

The directionality problem is when two variables correlate and might actually have a causal relationship, but it’s impossible to conclude which variable causes changes in the other.

Correlation describes an association between variables : when one variable changes, so does the other. A correlation is a statistical indicator of the relationship between variables.

Causation means that changes in one variable bring about changes in the other (i.e., there is a cause-and-effect relationship between variables). The two variables are correlated with each other, and there’s also a causal link between them.

While causation and correlation can exist simultaneously, correlation does not imply causation. In other words, correlation is simply a relationship in which A relates to B, but A doesn’t necessarily cause B to happen (or vice versa). Mistaking correlation for causation is a common error and can lead to the false cause fallacy .

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design , you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design , you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity .

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions . The Pearson product-moment correlation coefficient (Pearson’s r ) is commonly used to assess a linear relationship between two quantitative variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research .

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no relationship between the variables.

Random error  is almost always present in scientific studies, even in highly controlled settings. While you can’t eradicate it completely, you can reduce random error by taking repeated measurements, using a large sample, and controlling extraneous variables .

You can avoid systematic error through careful design of your sampling , data collection , and analysis procedures. For example, use triangulation to measure your variables using multiple methods; regularly calibrate instruments or procedures; use random sampling and random assignment ; and apply masking (blinding) where possible.

Systematic error is generally a bigger problem in research.

With random error, multiple measurements will tend to cluster around the true value. When you’re collecting data from a large sample , the errors in different directions will cancel each other out.

Systematic errors are much more problematic because they can skew your data away from the true value. This can lead you to false conclusions ( Type I and II errors ) about the relationship between the variables you’re studying.
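
A quick NumPy simulation (made-up values) shows the difference: random error averages out as measurements accumulate, while a systematic offset does not.

```python
import numpy as np

rng = np.random.default_rng(42)
true_weight = 70.0                                      # the value we are trying to measure
n = 10_000                                              # number of repeated measurements

random_error = rng.normal(loc=0.0, scale=2.0, size=n)   # noise centred on zero
systematic_error = 1.5                                  # a scale that always reads 1.5 kg high

noisy = true_weight + random_error
biased = true_weight + random_error + systematic_error

print(noisy.mean())    # close to 70.0 -- random error cancels out
print(biased.mean())   # close to 71.5 -- systematic error remains
```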

Random and systematic error are two types of measurement error.

Random error is a chance difference between the observed and true values of something (e.g., a researcher misreading a weighing scale records an incorrect measurement).

Systematic error is a consistent or proportional difference between the observed and true values of something (e.g., a miscalibrated scale consistently records weights as higher than they actually are).

On graphs, the explanatory variable is conventionally placed on the x-axis, while the response variable is placed on the y-axis.

  • If you have quantitative variables , use a scatterplot or a line graph.
  • If your response variable is categorical, use a scatterplot or a line graph.
  • If your explanatory variable is categorical, use a bar graph.

The term “ explanatory variable ” is sometimes preferred over “ independent variable ” because, in real world contexts, independent variables are often influenced by other variables. This means they aren’t totally independent.

Multiple independent variables may also be correlated with each other, so “explanatory variables” is a more appropriate term.

The difference between explanatory and response variables is simple:

  • An explanatory variable is the expected cause, and it explains the results.
  • A response variable is the expected effect, and it responds to other variables.

In a controlled experiment , all extraneous variables are held constant so that they can’t influence the results. Controlled experiments require:

  • A control group that receives a standard treatment, a fake treatment, or no treatment.
  • Random assignment of participants to ensure the groups are equivalent.

Depending on your study topic, there are various other methods of controlling variables .

There are 4 main types of extraneous variables :

  • Demand characteristics : environmental cues that encourage participants to conform to researchers’ expectations.
  • Experimenter effects : unintentional actions by researchers that influence study outcomes.
  • Situational variables : environmental variables that alter participants’ behaviors.
  • Participant variables : any characteristic or aspect of a participant’s background that could affect study results.

An extraneous variable is any variable that you’re not investigating that can potentially affect the dependent variable of your research study.

A confounding variable is a type of extraneous variable that not only affects the dependent variable, but is also related to the independent variable.

In a factorial design, multiple independent variables are tested.

If you test two variables, each level of one independent variable is combined with each level of the other independent variable to create different conditions.

Within-subjects designs have many potential threats to internal validity , but they are also very statistically powerful .

Advantages:

  • Only requires small samples
  • Statistically powerful
  • Removes the effects of individual differences on the outcomes

Disadvantages:

  • Internal validity threats reduce the likelihood of establishing a direct relationship between variables
  • Time-related effects, such as growth, can influence the outcomes
  • Carryover effects mean that the specific order of different treatments affects the outcomes

While a between-subjects design has fewer threats to internal validity , it also requires more participants for high statistical power than a within-subjects design .

Advantages:

  • Prevents carryover effects of learning and fatigue
  • Shorter study duration

Disadvantages:

  • Needs larger samples for high power
  • Uses more resources to recruit participants, administer sessions, cover costs, etc.
  • Individual differences may be an alternative explanation for results

Yes. Between-subjects and within-subjects designs can be combined in a single study when you have two or more independent variables (a factorial design). In a mixed factorial design, one variable is altered between subjects and another is altered within subjects.

In a between-subjects design , every participant experiences only one condition, and researchers assess group differences between participants in various conditions.

In a within-subjects design , each participant experiences all conditions, and researchers test the same participants repeatedly for differences between conditions.

The word “between” means that you’re comparing different conditions between groups, while the word “within” means you’re comparing different conditions within the same group.

Random assignment is used in experiments with a between-groups or independent measures design. In this research design, there’s usually a control group and one or more experimental groups. Random assignment helps ensure that the groups are comparable.

In general, you should always use random assignment in this type of experimental design when it is ethically possible and makes sense for your study topic.

To implement random assignment , assign a unique number to every member of your study’s sample .

Then, you can use a random number generator or a lottery method to randomly assign each number to a control or experimental group. You can also do so manually, by flipping a coin or rolling a die to randomly assign participants to groups.
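
Here is a minimal Python sketch of the lottery approach described above, using hypothetical participant numbers: a seeded random number generator shuffles the numbered participants and splits them into two groups.

```python
import random

participants = list(range(1, 21))   # unique numbers 1-20 for a hypothetical sample

rng = random.Random(2024)           # fixed seed so the assignment can be documented
rng.shuffle(participants)

midpoint = len(participants) // 2
control_group = sorted(participants[:midpoint])
experimental_group = sorted(participants[midpoint:])

print("Control:", control_group)
print("Experimental:", experimental_group)
```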

Random selection, or random sampling , is a way of selecting members of a population for your study’s sample.

In contrast, random assignment is a way of sorting the sample into control and experimental groups.

Random sampling enhances the external validity or generalizability of your results, while random assignment improves the internal validity of your study.

In experimental research, random assignment is a way of placing participants from your sample into different groups using randomization. With this method, every member of the sample has a known or equal chance of being placed in a control group or an experimental group.

“Controlling for a variable” means measuring extraneous variables and accounting for them statistically to remove their effects on other variables.

Researchers often model control variable data along with independent and dependent variable data in regression analyses and ANCOVAs . That way, you can isolate the control variable’s effects from the relationship between the variables of interest.

Control variables help you establish a correlational or causal relationship between variables by enhancing internal validity .

If you don’t control relevant extraneous variables , they may influence the outcomes of your study, and you may not be able to demonstrate that your results are really an effect of your independent variable .

A control variable is any variable that’s held constant in a research study. It’s not a variable of interest in the study, but it’s controlled because it could influence the outcomes.

Including mediators and moderators in your research helps you go beyond studying a simple relationship between two variables for a fuller picture of the real world. They are important to consider when studying complex correlational or causal relationships.

Mediators are part of the causal pathway of an effect, and they tell you how or why an effect takes place. Moderators usually help you judge the external validity of your study by identifying the limitations of when the relationship between variables holds.

If something is a mediating variable :

  • It’s caused by the independent variable .
  • It influences the dependent variable.
  • When it’s taken into account, the statistical correlation between the independent and dependent variables is lower than when it isn’t considered.

A confounder is a third variable that affects variables of interest and makes them seem related when they are not. In contrast, a mediator is the mechanism of a relationship between two variables: it explains the process by which they are related.

A mediator variable explains the process through which two variables are related, while a moderator variable affects the strength and direction of that relationship.

There are three key steps in systematic sampling :

  • Define and list your population , ensuring that it is not arranged in a cyclical or periodic pattern.
  • Decide on your sample size and calculate your sampling interval, k , by dividing the population size by your target sample size.
  • Choose every k th member of the population as your sample.
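
These three steps translate directly into code. The rough Python sketch below uses a hypothetical population list: the interval k is the population size divided by the target sample size, and every kth member is chosen from a random starting point.

```python
import random

population = [f"person_{i}" for i in range(1, 1001)]   # hypothetical list of 1,000 people
target_sample_size = 100

k = len(population) // target_sample_size   # sampling interval (here, 10)
start = random.randrange(k)                 # random starting point within the first interval

sample = population[start::k]               # every kth member from the starting point
print(k, len(sample), sample[:3])
```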

Systematic sampling is a probability sampling method where researchers select members of the population at a regular interval – for example, by selecting every 15th person on a list of the population. If the population is in a random order, this can imitate the benefits of simple random sampling .

Yes, you can create a stratified sample using multiple characteristics, but you must ensure that every participant in your study belongs to one and only one subgroup. In this case, you multiply the numbers of subgroups for each characteristic to get the total number of groups.

For example, if you were stratifying by location with three subgroups (urban, rural, or suburban) and marital status with five subgroups (single, divorced, widowed, married, or partnered), you would have 3 x 5 = 15 subgroups.

You should use stratified sampling when your sample can be divided into mutually exclusive and exhaustive subgroups that you believe will take on different mean values for the variable that you’re studying.

Using stratified sampling will allow you to obtain more precise (with lower variance ) statistical estimates of whatever you are trying to measure.

For example, say you want to investigate how income differs based on educational attainment, but you know that this relationship can vary based on race. Using stratified sampling, you can ensure you obtain a large enough sample from each racial group, allowing you to draw more precise conclusions.

In stratified sampling , researchers divide subjects into subgroups called strata based on characteristics that they share (e.g., race, gender, educational attainment).

Once divided, each subgroup is randomly sampled using another probability sampling method.
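
As an illustration, the pandas sketch below (hypothetical data, one stratification variable) divides a sampling frame into strata and then draws a simple random sample within each stratum. Note that sampling on a grouped DataFrame requires pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical population frame with one stratification variable
frame = pd.DataFrame({
    "person_id": range(1, 13),
    "education": ["high school", "bachelor", "master"] * 4,
})

# Draw a simple random sample of 2 people from each stratum
sample = frame.groupby("education").sample(n=2, random_state=1)
print(sample)
```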

Cluster sampling is more time- and cost-efficient than other probability sampling methods , particularly when it comes to large samples spread across a wide geographical area.

However, it provides less statistical certainty than other methods, such as simple random sampling , because it is difficult to ensure that your clusters properly represent the population as a whole.

There are three types of cluster sampling : single-stage, double-stage and multi-stage clustering. In all three types, you first divide the population into clusters, then randomly select clusters for use in your sample.

  • In single-stage sampling , you collect data from every unit within the selected clusters.
  • In double-stage sampling , you select a random sample of units from within the clusters.
  • In multi-stage sampling , you repeat the procedure of randomly sampling elements from within the clusters until you have reached a manageable sample.

Cluster sampling is a probability sampling method in which you divide a population into clusters, such as districts or schools, and then randomly select some of these clusters as your sample.

The clusters should ideally each be mini-representations of the population as a whole.

If properly implemented, simple random sampling is usually the best sampling method for ensuring both internal and external validity . However, it can sometimes be impractical and expensive to implement, depending on the size of the population to be studied.

If you have a list of every member of the population and the ability to reach whichever members are selected, you can use simple random sampling.

The American Community Survey  is an example of simple random sampling . In order to collect detailed data on the population of the US, the Census Bureau randomly selects 3.5 million households per year and uses a variety of methods to convince them to fill out the survey.

Simple random sampling is a type of probability sampling in which the researcher randomly selects a subset of participants from a population . Each member of the population has an equal chance of being selected. Data is then collected from as large a percentage as possible of this random subset.
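
In code, simple random sampling is a one-liner once you have a complete sampling frame. This Python sketch uses a hypothetical list of student IDs.

```python
import random

sampling_frame = [f"student_{i}" for i in range(1, 5001)]   # every member of a hypothetical population

rng = random.Random(7)
sample = rng.sample(sampling_frame, k=100)   # each member has an equal chance of selection

print(len(sample), sample[:5])
```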

Quasi-experimental design is most useful in situations where it would be unethical or impractical to run a true experiment .

Quasi-experiments have lower internal validity than true experiments, but they often have higher external validity  as they can use real-world interventions instead of artificial laboratory settings.

A quasi-experiment is a type of research design that attempts to establish a cause-and-effect relationship. The main difference with a true experiment is that the groups are not randomly assigned.

Blinding is important to reduce research bias (e.g., observer bias , demand characteristics ) and ensure a study’s internal validity .

If participants know whether they are in a control or treatment group , they may adjust their behavior in ways that affect the outcome that researchers are trying to measure. If the people administering the treatment are aware of group assignment, they may treat participants differently and thus directly or indirectly influence the final results.

  • In a single-blind study , only the participants are blinded.
  • In a double-blind study , both participants and experimenters are blinded.
  • In a triple-blind study , the assignment is hidden not only from participants and experimenters, but also from the researchers analyzing the data.

Blinding means hiding who is assigned to the treatment group and who is assigned to the control group in an experiment .

A true experiment (a.k.a. a controlled experiment) always includes at least one control group that doesn’t receive the experimental treatment.

However, some experiments use a within-subjects design to test treatments without a control group. In these designs, you usually compare one group’s outcomes before and after a treatment (instead of comparing outcomes between different groups).

For strong internal validity , it’s usually best to include a control group if possible. Without a control group, it’s harder to be certain that the outcome was caused by the experimental treatment and not by other variables.

An experimental group, also known as a treatment group, receives the treatment whose effect researchers wish to study, whereas a control group does not. They should be identical in all other ways.

Individual Likert-type questions are generally considered ordinal data , because the items have clear rank order, but don’t have an even distribution.

Overall Likert scale scores are sometimes treated as interval data. These scores are considered to have directionality and even spacing between them.

The type of data determines what statistical tests you should use to analyze your data.

A Likert scale is a rating scale that quantitatively assesses opinions, attitudes, or behaviors. It is made up of 4 or more questions that measure a single attitude or trait when response scores are combined.

To use a Likert scale in a survey , you present participants with Likert-type questions or statements, and a continuum of items, usually with 5 or 7 possible responses, to capture their degree of agreement.
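
Combining the item responses into an overall scale score is usually a simple sum or mean per respondent. The pandas sketch below uses hypothetical responses to four items on a 5-point agreement scale.

```python
import pandas as pd

# Hypothetical responses (1 = strongly disagree ... 5 = strongly agree)
responses = pd.DataFrame({
    "item_1": [4, 2, 5],
    "item_2": [5, 1, 4],
    "item_3": [3, 2, 5],
    "item_4": [4, 3, 4],
})

responses["scale_score"] = responses.sum(axis=1)   # overall score per respondent
print(responses)
```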

In scientific research, concepts are the abstract ideas or phenomena that are being studied (e.g., educational achievement). Variables are properties or characteristics of the concept (e.g., performance at school), while indicators are ways of measuring or quantifying variables (e.g., yearly grade reports).

The process of turning abstract concepts into measurable variables and indicators is called operationalization .

There are various approaches to qualitative data analysis , but they all share five steps in common:

  • Prepare and organize your data.
  • Review and explore your data.
  • Develop a data coding system.
  • Assign codes to the data.
  • Identify recurring themes.

The specifics of each step depend on the focus of the analysis. Some common approaches include textual analysis , thematic analysis , and discourse analysis .

There are five common approaches to qualitative research :

  • Grounded theory involves collecting data in order to develop new theories.
  • Ethnography involves immersing yourself in a group or organization to understand its culture.
  • Narrative research involves interpreting stories to understand how people make sense of their experiences and perceptions.
  • Phenomenological research involves investigating phenomena through people’s lived experiences.
  • Action research links theory and practice in several cycles to drive innovative changes.

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses , by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.
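
As a concrete (and entirely hypothetical) example, the SciPy sketch below tests whether two groups differ on an outcome using an independent-samples t test; the p value estimates how likely a difference at least this large would be if the null hypothesis of no real difference were true.

```python
from scipy import stats

# Hypothetical outcome scores for a control and a treatment group (illustration only)
control = [23, 19, 25, 22, 20, 24, 21, 23]
treatment = [27, 25, 29, 26, 24, 28, 25, 27]

t_stat, p_value = stats.ttest_ind(control, treatment)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

# A small p value (commonly below 0.05) suggests the observed difference is
# unlikely to have arisen by chance alone under the null hypothesis.
```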

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

When conducting research, collecting original data has significant advantages:

  • You can tailor data collection to your specific research aims (e.g. understanding the needs of your consumers or user testing your website)
  • You can control and standardize the process for high reliability and validity (e.g. choosing appropriate measurements and sampling methods )

However, there are also some drawbacks: data collection can be time-consuming, labor-intensive and expensive. In some cases, it’s more efficient to use secondary data that has already been collected by someone else, but the data might be less reliable.

Data collection is the systematic process by which observations or measurements are gathered in research. It is used in many different contexts by academics, governments, businesses, and other organizations.

There are several methods you can use to decrease the impact of confounding variables on your research: restriction, matching, statistical control and randomization.

In restriction , you restrict your sample by only including certain subjects that have the same values of potential confounding variables.

In matching , you match each of the subjects in your treatment group with a counterpart in the comparison group. The matched subjects have the same values on any potential confounding variables, and only differ in the independent variable .

In statistical control , you include potential confounders as variables in your regression .

In randomization , you randomly assign the treatment (or independent variable) in your study to a sufficiently large number of subjects, which allows you to control for all potential confounding variables.
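
To illustrate statistical control in particular, the sketch below (hypothetical variable names, using the statsmodels formula interface) includes a potential confounder as an extra predictor in a regression, so the coefficient on the independent variable is estimated while holding the confounder constant.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data: does weekly exercise predict weight, controlling for age?
data = pd.DataFrame({
    "weight":   [82, 79, 75, 88, 70, 68, 90, 73],
    "exercise": [1, 2, 4, 0, 5, 6, 1, 3],          # hours per week (independent variable)
    "age":      [45, 40, 35, 50, 30, 28, 55, 33],  # potential confounder
})

# "age" enters the model as a statistical control alongside "exercise"
model = smf.ols("weight ~ exercise + age", data=data).fit()
print(model.params)   # coefficient on exercise, adjusted for age
```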

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause , while the dependent variable is the supposed effect . A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

To ensure the internal validity of your research, you must consider the impact of confounding variables. If you fail to account for them, you might over- or underestimate the causal relationship between your independent and dependent variables , or even find a causal relationship where none exists.

Yes, but including more than one of either type requires multiple research questions .

For example, if you are interested in the effect of a diet on health, you can use multiple measures of health: blood sugar, blood pressure, weight, pulse, and many more. Each of these is its own dependent variable with its own research question.

You could also choose to look at the effect of exercise levels as well as diet, or even the additional effect of the two combined. Each of these is a separate independent variable .

To ensure the internal validity of an experiment , you should only change one independent variable at a time.

No. The value of a dependent variable depends on an independent variable, so a variable cannot be both independent and dependent at the same time. It must be either the cause or the effect, not both!

You want to find out how blood sugar levels are affected by drinking diet soda and regular soda, so you conduct an experiment .

  • The type of soda – diet or regular – is the independent variable .
  • The level of blood sugar that you measure is the dependent variable – it changes depending on the type of soda.

Determining cause and effect is one of the most important parts of scientific research. It’s essential to know which is the cause – the independent variable – and which is the effect – the dependent variable.

In non-probability sampling , the sample is selected based on non-random criteria, and not every member of the population has a chance of being included.

Common non-probability sampling methods include convenience sampling , voluntary response sampling, purposive sampling , snowball sampling, and quota sampling .

Probability sampling means that every member of the target population has a known chance of being included in the sample.

Probability sampling methods include simple random sampling , systematic sampling , stratified sampling , and cluster sampling .

Using careful research design and sampling procedures can help you avoid sampling bias . Oversampling can be used to correct undercoverage bias .

Some common types of sampling bias include self-selection bias , nonresponse bias , undercoverage bias , survivorship bias , pre-screening or advertising bias, and healthy user bias.

Sampling bias is a threat to external validity – it limits the generalizability of your findings to a broader group of people.

A sampling error is the difference between a population parameter and a sample statistic .

A statistic refers to measures about the sample , while a parameter refers to measures about the population .

Populations are used when a research question requires data from every member of the population. This is usually only feasible when the population is small and easily accessible.

Samples are used to make inferences about populations . Samples are easier to collect data from because they are practical, cost-effective, convenient, and manageable.

There are seven threats to external validity : selection bias , history, experimenter effect, Hawthorne effect , testing effect, aptitude-treatment interaction, and situation effect.

The two types of external validity are population validity (whether you can generalize to other groups of people) and ecological validity (whether you can generalize to other situations and settings).

The external validity of a study is the extent to which you can generalize your findings to different groups of people, situations, and measures.

Cross-sectional studies cannot establish a cause-and-effect relationship or analyze behavior over a period of time. To investigate cause and effect, you need to do a longitudinal study or an experimental study .

Cross-sectional studies are less expensive and time-consuming than many other types of study. They can provide useful insights into a population’s characteristics and identify correlations for further research.

Sometimes only cross-sectional data is available for analysis; other times your research question may only require a cross-sectional study to answer it.

Longitudinal studies can last anywhere from weeks to decades, although they tend to be at least a year long.

The 1970 British Cohort Study , which has collected data on the lives of 17,000 Brits since their births in 1970, is one well-known example of a longitudinal study .

Longitudinal studies are better to establish the correct sequence of events, identify changes over time, and provide insight into cause-and-effect relationships, but they also tend to be more expensive and time-consuming than other types of studies.

Longitudinal studies and cross-sectional studies are two different types of research design . In a cross-sectional study you collect data from a population at a specific point in time; in a longitudinal study you repeatedly collect data from the same sample over an extended period of time.

| Longitudinal study | Cross-sectional study |
|---|---|
| Repeated observations of the same sample | Observations at a single point in time |
| Observes the same group multiple times | Observes different groups (a "cross-section") in the population |
| Follows changes in participants over time | Provides a snapshot of society at a given point |

There are eight threats to internal validity : history, maturation, instrumentation, testing, selection bias , regression to the mean, social interaction and attrition .

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a study cannot be explained by other factors.

In mixed methods research , you use both qualitative and quantitative data collection and analysis methods to answer your research question .

The research methods you use depend on the type of data you need to answer your research question .

  • If you want to measure something or test a hypothesis , use quantitative methods . If you want to explore ideas, thoughts and meanings, use qualitative methods .
  • If you want to analyze a large amount of readily-available data, use secondary data. If you want data specific to your purposes with control over how it is generated, collect primary data.
  • If you want to establish cause-and-effect relationships between variables , use experimental methods. If you want to understand the characteristics of a research subject, use descriptive methods.

A confounding variable , also called a confounder or confounding factor, is a third variable in a study examining a potential cause-and-effect relationship.

A confounding variable is related to both the supposed cause and the supposed effect of the study. It can be difficult to separate the true effect of the independent variable from the effect of the confounding variable.

In your research design , it’s important to identify potential confounding variables and plan how you will reduce their impact.

Discrete and continuous variables are two types of quantitative variables :

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results .

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause , while a dependent variable is the effect .

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The  independent variable  is the amount of nutrients added to the crop field.
  • The  dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design .

Experimental design means planning a set of procedures to investigate a relationship between variables . To design a controlled experiment, you need:

  • A testable hypothesis
  • At least one independent variable that can be precisely manipulated
  • At least one dependent variable that can be precisely measured

When designing the experiment, you decide:

  • How you will manipulate the variable(s)
  • How you will control for any potential confounding variables
  • How many subjects or samples will be included in the study
  • How subjects will be assigned to treatment levels

Experimental design is essential to the internal and external validity of your experiment.

Internal validity is the degree of confidence that the causal relationship you are testing is not influenced by other factors or variables .

External validity is the extent to which your results can be generalized to other contexts.

The validity of your experiment depends on your experimental design .

Reliability and validity are both about how well a method measures something:

  • Reliability refers to the  consistency of a measure (whether the results can be reproduced under the same conditions).
  • Validity   refers to the  accuracy of a measure (whether the results really do represent what they are supposed to measure).

If you are doing experimental research, you also have to consider the internal and external validity of your experiment.

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Quantitative research deals with numbers and statistics, while qualitative research deals with words and meanings.

Quantitative methods allow you to systematically measure variables and test hypotheses . Qualitative methods allow you to explore concepts and experiences in more detail.

Methodology refers to the overarching strategy and rationale of your research project . It involves studying the methods used in your field and the theories or principles behind them, in order to develop an approach that matches your objectives.

Methods are the specific tools and procedures you use to collect and analyze data (for example, experiments, surveys , and statistical tests ).

In shorter scientific papers, where the aim is to report the findings of a specific study, you might simply describe what you did in a methods section .

In a longer or more complex research project, such as a thesis or dissertation , you will probably include a methodology section , where you explain your approach to answering the research questions and cite relevant sources to support your choice of methods.


NAU Cline Library

Evidence Based Practice

  • 1. Ask: PICO(T) Question
  • 2. Align: Levels of Evidence
  • 3a. Acquire: Resource Types
  • 3b. Acquire: Searching
  • 4. Appraise

Primary vs. Secondary Sources

  • Qualitative and Quantitative Sources
  • Managing References

Sources are considered primary, secondary, or tertiary depending on the originality of the information presented and their proximity to the original source of information. This distinction can differ between subjects and disciplines.

In the sciences, research findings may be communicated informally between researchers through email, presented at conferences (primary source), and then, possibly, published as a journal article or technical report (primary source). Once published, the information may be commented on by other researchers (secondary sources) and/or professionally indexed in a database (secondary sources). Later, the information may be summarized into an encyclopedic or reference book format (tertiary sources).

Primary Sources

A primary source in science is a document or record that reports on a study, experiment, trial, or research project. Primary sources are usually written by the person(s) who did the research, conducted the study, or ran the experiment, and include the hypothesis, methodology, and results.

Primary Sources include:

  • Pilot/prospective studies
  • Cohort studies
  • Survey research
  • Case studies
  • Lab notebooks
  • Clinical trials and randomized clinical trials/RCTs
  • Dissertations

Secondary Sources

Secondary sources list, summarize, compare, and evaluate primary information and studies in order to draw conclusions about, or present the current state of knowledge in, a discipline or subject. Sources may include a bibliography that can direct you back to the primary research reported in the article.

Secondary Sources include:

  • reviews, systematic reviews, and meta-analyses
  • newsletters and professional news sources
  • practice guidelines & standards
  • clinical care notes
  • patient education information
  • government & legal information
  • entries in nursing or medical encyclopedias

More on Systematic Reviews and Meta-Analysis

Systematic reviews – Systematic reviews are best for answering single questions (eg, the effectiveness of tight glucose control on microvascular complications of diabetes). They are more scientifically structured than traditional reviews, being explicit about how the authors attempted to find all relevant articles, judge the scientific quality of each study, and weigh evidence from multiple studies with conflicting results. These reviews pay particular attention to including all strong research, whether or not it has been published, to avoid publication bias (positive studies are preferentially published).

Meta-analysis – Meta-analysis, which is commonly included in systematic reviews, is a statistical method that quantitatively combines the results from different studies. It can be used to provide an overall estimate of the net benefit or harm of an intervention, even when these effects may not have been apparent in the individual studies. Meta-analysis can also provide an overall quantitative estimate of other parameters such as diagnostic accuracy, incidence, or prevalence.
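To make the idea of quantitative combination concrete, the sketch below shows fixed-effect (inverse-variance) pooling of made-up study estimates; it illustrates the general principle only, not the method of any particular review.

```python
# Fixed-effect (inverse-variance) pooling of hypothetical study results.
# Studies with smaller standard errors receive more weight.
import math

studies = [  # (effect estimate, standard error); made-up numbers
    (0.30, 0.15),
    (0.10, 0.10),
    (0.25, 0.20),
]

weights = [1 / se ** 2 for _, se in studies]
pooled = sum(w * est for (est, _), w in zip(studies, weights)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"Pooled effect = {pooled:.2f} "
      f"(95% CI {pooled - 1.96 * pooled_se:.2f} to {pooled + 1.96 * pooled_se:.2f})")
```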



Peer-Reviewed Literature: Peer-Reviewed Research: Primary vs. Secondary

  • Peer-Reviewed Research: Primary vs. Secondary
  • Types of Peer Review
  • Identifying Peer-Reviewed Research

Peer Reviewed Research

Published literature can be either peer-reviewed or non-peer-reviewed. Official research reports are almost always peer reviewed while a journal's other content is usually not. In the health sciences, official research can be primary, secondary, or even tertiary. It can be an original experiment or investigation (primary), an analysis or evaluation of primary research (secondary), or findings that compile secondary research (tertiary). If you are doing research yourself, then primary or secondary sources can reveal more in-depth information.

Primary Research

Primary research is information presented in its original form without interpretation by other researchers. While it may acknowledge previous studies or sources, it always presents original thinking, reports on discoveries, or new information about a topic.

Health sciences research that is primary includes both experimental trials and observational studies, in which subjects may be tested for outcomes or investigated to gain relevant insight. Randomized controlled trials are the most prominent experimental design because randomized subjects offer the most compelling evidence for the effectiveness of an intervention.


Secondary Research

Secondary research is a retrospective account or analysis of the original findings from an experiment or trial. These studies may be appraised summaries, reviews, or interpretations of primary sources, and are often written by someone other than the original researcher(s). In the health sciences, meta-analyses and systematic reviews are the most frequent types of secondary research.

  • A meta-analysis is a quantitative method of combining the results of primary research. In analyzing the relevant data and statistical findings from experimental trials or observational studies, it can more accurately calculate effective resolutions regarding certain health topics.
  • A systematic review is a summary of research that addresses a focused clinical question in a systematic, reproducible manner. In order to provide the single best estimate of effect for clinical decision making, primary research studies are identified, filtered through an inclusion/exclusion process, and then pooled together. The relevant data and findings are then compiled and synthesized to arrive at a more accurate conclusion about a specific health topic. Only peer-reviewed publications are used and analyzed, in a methodology which may or may not include a meta-analysis.


Understanding and Evaluating Resources

  • Evaluating Journal Articles
  • Evaluating News Resources
  • Evaluating Web Resources
  • Primary vs. Secondary Sources
  • Different Types of Sources
  • Primary Sources
  • Secondary Sources
  • Tertiary Sources

What is a primary source?


Examples of primary sources by discipline:

  • Anthropology, Archeology: Articles describing research, ethnographies, surveys, cultural and historical artifacts
  • Communications, Journalism: News (printed, radio, TV, online), photographs, blogs, social media sites
  • Education, Political Science, Public Policy: Government publications, laws, court cases, speeches, test results, interviews, polls, surveys
  • Fine Arts: Original artwork, photographs, recordings of performances and music, scripts (film, theater, television), music scores, interviews, memoirs, diaries, letters
  • History: Government publications, newspapers, photographs, diaries, letters, manuscripts, business records, court cases, videos, polls, census data, speeches
  • Language and Literature: Novels, plays, short stories, poems, dictionaries, language manuals
  • Psychology, Sociology, Economics: Articles describing research, experiment results, ethnographies, interviews, surveys, data sets
  • Sciences: Articles describing research and methodologies, documentation of lab research, research studies


What is a secondary source?


Examples of secondary sources by discipline:

  • Anthropology, Archeology: Reviews of the literature, critical interpretations of scholarly studies
  • Communications, Journalism: Interpretive journal articles, books, and blogs about the communications industry
  • Education, Political Science, Public Policy: Reviews of the literature, critical interpretations of scholarly studies
  • Fine Arts: Critical interpretations of art and artists—biographies, reviews, recordings of live performances
  • History: Interpretive journal articles and books
  • Language and Literature: Literary criticism, biographies, reviews, textbooks
  • Psychology, Sociology, Economics: Reviews of the literature, critical interpretations of scholarly studies
  • Sciences: Publications about the significance of research or experiments

What is a tertiary source?


  • Encyclopedias, like Wikipedia, Encyclopedia Brittanica, etc.
  • Dictionaries, like Oxford English Dictionary, Etymology Online, etc.
  • Almanacs, like World Almanac, Book of Facts, etc.
  • Factbooks, like CIA World Factbook
  • Chronologies, like Chronicle of the 20th Century
  • Some Textbooks

5.4.2  Prioritizing outcomes: main, primary and secondary outcomes

Main outcomes

Once a full list of relevant outcomes has been compiled for the review, authors should prioritize the outcomes and select the main outcomes of relevance to the review question. The main outcomes are the essential outcomes for decision-making, and are those that would form the basis of a ‘Summary of findings’ table. ‘Summary of findings’ tables provide key information about the amount of evidence for important comparisons and outcomes, the quality of the evidence and the magnitude of effect (see Chapter 11, Section 11.5 ). There should be no more than seven main outcomes, which should generally not include surrogate or interim outcomes. They should not be chosen on the basis of any anticipated or observed magnitude of effect, or because they are likely to have been addressed in the studies to be reviewed.

Primary outcomes

Primary outcomes for the review should be identified from among the main outcomes. Primary outcomes are the outcomes that would be expected to be analysed should the review identify relevant studies, and conclusions about the effects of the interventions under review will be based largely on these outcomes. There should in general be no more than three primary outcomes and they should include at least one desirable and at least one undesirable outcome (to assess beneficial and adverse effects respectively).

Secondary outcomes

Main outcomes not selected as primary outcomes would be expected to be listed as secondary outcomes. In addition, secondary outcomes may include a limited number of additional outcomes the review intends to address. These may be specific to only some comparisons in the review. For example, laboratory tests and other surrogate measures may not be considered as main outcomes as they are less important than clinical endpoints in informing decisions, but they may be helpful in explaining effect or determining intervention integrity (see Chapter 7, Section 7.3.4 ).

Box 5.4.a summarizes the principal factors to consider when developing criteria for the ‘Types of outcomes’.

What is secondary analysis? A comprehensive overview

Last updated 16 August 2024. Reviewed by Miroslav Damyanov.

Within your projects and initiatives, you can leverage secondary analysis, like case studies, census data, or past clinical trials, to accelerate growth and innovation. But there’s more to understand when peeling back this onion.

Building on the idea of leveraging existing work, we’ll provide a comprehensive overview of what secondary analysis really is, including its applications in various fields and its advantages and disadvantages. You’ll also discover the most effective ways to incorporate it into your projects.

  • What is secondary data analysis?

Secondary analysis is any form of research that relies on or uses previously conducted research for the purposes of a new study. If existing data is cited or previously conducted studies help to achieve a new outcome, it’s secondary analysis.

It happens quite often, especially when researchers use quantitative or qualitative data that has been gathered previously and analyze it in a new way. The secondary data in these instances is often published or made available publicly, with permission to cite and use it.

  • Why is secondary data analysis important?

Secondary data analysis is important for innovation. It can serve as a historical reference, showing what earlier studies have already established.

This kind of previously summarized data, usually produced by other parties, is useful for corroborating a similar result or finding of your own. It is also where you can find commentary on, or analysis of, the work that was done before you arrived on the scene.

Secondary analysis is important because it can accelerate your work by providing reliable springboards from previous research, allowing you to pick up where someone else left off in the field.

Imagine you’re studying a local population or community demographic. In this case, turning to the latest census data will be more useful than conducting your own headcount. Even simpler: if you’re making a cake from scratch, it’s a good thing you don’t have to churn your own butter to complete the recipe. You get the idea. But there’s more to secondary analysis than just census and cake.

  • Types of secondary research

Imagine all the libraries of studies and the internet’s access to peer reviews, case studies, statistics, and data. Think about how much data is out there. A lot! However, it won’t all be useful in your studies or projects. This is why it’s best to understand the categories and types of secondary research. From there, you can narrow your focus to the data and research that best applies to your work.

Statistical analysis

In statistical analysis, thousands of existing data points are combined with the intention of discovering new insights, and the resulting statistics are applied to every corner of business, government, health, and science.

This type of secondary analysis is great for drawing new insights, testing new hypotheses, and validating new findings. We rely on statistics to help us identify areas of improvement and avoid mistakes every day.

Literature reviews

Think of these as published pieces of information within a particular segment or subject area.

The primary purpose of any literature review is to provide a multifaceted and comprehensive overview of current knowledge, identify gaps, and establish a theoretical framework for further research.

Sometimes a literature review covers a topic within a specific time period. Researchers can present them as an in-depth analysis or just a simple source summary.

Case studies

Case studies are more in-depth analyses of a person, group, or specific event.

In secondary research, you’ll look for existing case study reports, published papers, and documented instances to gather and analyze data. These studies can provide comprehensive insights into specific phenomena, processes, or practices.

Nearly every aspect of the topic is explored, highlighting challenges, solutions, how those solutions are applied, and final outcomes. They can prove or disprove a theory but typically serve as a demonstrative piece of evidence or analysis.

Content analysis

Content analysis is, as it sounds, the study of certain phrases, words, or themes within a body of qualitative data.

This type of secondary analysis provides context so you can analyze particular relationships between words, meanings, or concepts. It’s a type of secondary research that involves examining and interpreting pre-existing material to uncover patterns, trends, and insights.
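As a toy illustration of this kind of coding, counting how often predefined themes appear in a set of existing free-text responses might look like the following (the responses and theme keywords are invented):

```python
# Toy content analysis: count how many existing responses mention each
# predefined theme (all text and keywords are made up).
from collections import Counter

responses = [
    "Support was quick to reply and solved my issue",
    "The price is too high for what you get",
    "Great support team, but pricing could be clearer",
]
themes = {"support": ["support", "reply"], "pricing": ["price", "pricing"]}

counts = Counter()
for text in responses:
    lowered = text.lower()
    for theme, keywords in themes.items():
        if any(word in lowered for word in keywords):
            counts[theme] += 1

print(counts)  # Counter({'support': 2, 'pricing': 2})
```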

  • Advantages of conducting secondary data analysis

Starting your research from an advanced position of knowledge gives you an advantage. Secondary data analysis presents many benefits, regardless of the topic or information you’re exploring.

Cost-effectiveness

Secondary analysis is cost-effective. Since studies and data already exist, you don’t have to repeat certain tests or steps, reducing costs without compromising your findings. You’ll gain access to high-level data that might be too cost-prohibitive to generate independently.

It can also save money in exploratory work: preliminary findings from existing sources can help you refine your hypothesis before committing to primary data collection.

Time-saving

Someone else may have already invested in the study, producing findings and data you can readily use within your research and saving you time.

Being able to access large datasets, which are typically extensive and robust, provides researchers with a wealth of information that would otherwise be too time-consuming or difficult to gather independently.

Ability to answer additional research questions

Within the scope of your project or research, there will be questions you can’t answer first-hand. Using secondary analysis allows you to answer those additional research questions using pre-existing findings or results.

You can also analyze existing longitudinal data and still find new trends, theories, or applications. Re-using such data lets you track changes and trends over time without having to run a long study yourself.

  • Disadvantages of secondary data analysis

You can’t run your business and projects by piggybacking on secondary data alone. There are some disadvantages to relying solely on analysis that has already been published.

Data quality concerns

You weren’t “there.” You weren’t part of the study or model behind the secondary analysis, so there’s a risk that mistakes were present or findings weren’t entirely accurate. You’ll need to verify that the sources aren’t citing outdated information or presenting original data with bias.

Relevance is another potential issue. Some data might be outdated or not reflective of current conditions, especially in rapidly changing fields or industries. These concerns about data quality could be problematic for your new research project or objective.

Data accessibility

Access to certain secondary data sources may be restricted, require payment, or come with usage limitations. Statista and Deloitte, for example, provide high-quality reports, but some are only accessible after payment.

Also, some academic articles may require a subscription or an account affiliated with a higher education institution.

Need to de-identify information

When you incorporate secondary analysis into your project, you’ll need to remove identifiers from the original source. For example, de-identified patient data won’t have specifics about the patient’s personal information.

  • How to carry out secondary data analysis

Follow this simple roadmap for carrying out your own secondary analysis study.

1. Identify and define the research topic

The first step is recognizing the goal of the project or question you’re attempting to answer. Define your research topic in a way that provides clarity for the datasets you’ll need to collect.

2. Find research and existing data sources

Consider which data sources exist that might present the findings you need to help answer your question.

Start digging through reputable sources within relevant timelines to explore what data already exists. These might include academic databases, government records, organizational reports, and online repositories.

3. Begin searching and collecting the existing data

With an idea of which types of secondary analysis will offer the best data for your project, start searching for relevant studies and collecting those that might help. Gather several sets of metrics and analyses to inform your own work.

4. Combine the data and compare the results

When you feel you’ve collected the right studies, you can begin combining your findings and comparing results.

Before you get started, you might have to “clean” the data, handle missing values, or merge datasets from different sources.

Look for trends and common themes or results. Verify if outlier statistics are anomalies or valuable to your study.
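A minimal sketch of the cleaning and merging described in this step, using pandas with entirely hypothetical datasets and column names, could look like this:

```python
# Combine two hypothetical secondary datasets: harmonise column names,
# handle a missing value, and merge on a shared key before analysis.
import pandas as pd

survey_a = pd.DataFrame({"region": ["North", "South"], "weekly_use_pct": [12.5, 9.8]})
survey_b = pd.DataFrame({"Region": ["North", "South"], "avg_age": [14.2, None]})

survey_b = survey_b.rename(columns={"Region": "region"})                      # align names
survey_b["avg_age"] = survey_b["avg_age"].fillna(survey_b["avg_age"].mean())  # fill missing

combined = survey_a.merge(survey_b, on="region", how="inner")
print(combined)
```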

5. Analyze your data and explore further

Analyze your collected data through the lens of market research. Look to determine what the data shows, ensure it makes sense, and connect the dots to reach your original goal.

Use appropriate statistical or qualitative analysis methods to examine your data. For you, that might involve descriptive statistics, inferential statistics, or thematic analysis, depending on the nature of the data and your research questions.

  • Sources of secondary research

With literally thousands of websites and research at your fingertips, you can find relevant secondary research sources practically anywhere. You might also locate research from both internal and external sources.

Internal data

As a company, you can look for internal secondary data to help you reach the answers you’re looking for. Here are some examples:

  • Historical sales reporting
  • Website analytics from previous years
  • Past employee training and testing results
  • News articles
  • Internal conversations
  • Customer databases
  • Internal communications records
  • Financial reports

External data

Look outside your company for similar reports other companies may have already executed. Trust industry-specific sources on the web as well as municipal or non-profit studies that may also lend credibility to your work.

Academic journals, public databases, industry reports, trade publications, and government agencies can all be valuable resources for secondary data. Here are some examples:

  • Springer Nature
  • Census.gov/data
  • Demographics Now

Which is better, primary or secondary analysis?

Most of the best innovations and research initiatives are served by both primary and secondary research analysis.

Don’t look at one or the other. Instead, harness both for the most impactful research. Springboard from the work of others and use secondary research to bridge gaps in your efforts. Then, conduct your primary research to deep dive into those arenas, combining both for the most thorough investigation.

What are some use-case examples of secondary analysis?

Secondary analysis is more common than you might realize. You may have used it before without realizing it.

Here are some use case examples of secondary analysis used across business applications, research and development, learning, healthcare, and more.

  • A grad student expands on an advisor’s research to contribute to a thesis.
  • A data analyst uses their own data to run additional reports.
  • A researcher uses new software to further explore historical reporting.
  • An entrepreneur studies demographic information to create more effective marketing personas.
  • A school principal uses nationwide studies to inform curriculum development.
  • A digital marketing specialist uses site metrics to outline areas of improvement for user experiences.

How can you be sure to remove bias from secondary analysis?

As a researcher, be sure to evaluate the data you source to make sure it’s accurate, timely, and reputable. Before applying any secondary analysis, remove bias by answering the following questions about the source:

  • What was the secondary study’s original purpose?
  • Who collected the original data? (credentials)
  • What data was collected and when?
  • What were the methods used and dataset limitations at the time?




  • Volume 12, Issue 2
  • Consumption and effects of caffeinated energy drinks in young people: an overview of systematic reviews and secondary analysis of UK data to inform policy

  • http://orcid.org/0000-0002-9571-3147 Claire Khouja 1 ,
  • http://orcid.org/0000-0002-7016-978X Dylan Kneale 2 ,
  • Ginny Brunton 3 ,
  • Gary Raine 1 ,
  • Claire Stansfield 2 ,
  • Amanda Sowden 1 ,
  • Katy Sutcliffe 2 ,
  • James Thomas 2
  • 1 Centre for Reviews and Dissemination , University of York , York , UK
  • 2 EPPI-Centre, Social Science Research Unit , UCL Institute of Education, University College London , London , UK
  • 3 Faculty of Health Sciences , Ontario Tech University , Oshawa , Ontario , Canada
  • Correspondence to Claire Khouja; claire.khouja{at}york.ac.uk

Background This overview and analysis of UK datasets was commissioned by the UK government to address concerns about children’s consumption of caffeinated energy drinks and their effects on health and behaviour.

Methods We searched nine databases for systematic reviews, published between 2013 and July 2021, in English, assessing caffeinated energy drink consumption by people under 18 years old (children). Two reviewers rated or checked risk of bias using AMSTAR2, and extracted and synthesised findings. We searched the UK Data Service for country-representative datasets, reporting children’s energy-drink consumption, and conducted bivariate or latent class analyses.

Results For the overview, we included 15 systematic reviews; six reported drinking prevalence and 14 reported associations between drinking and health or behaviour. AMSTAR2 ratings were low or critically low. Worldwide, across reviews, from 13% to 67% of children had consumed energy drinks in the past year. Only two of the 74 studies in the reviews were UK-based. For the dataset analysis, we identified and included five UK cross-sectional datasets, and found that 3% to 32% of children, across UK countries, consumed energy drinks weekly, with no difference by ethnicity. Frequent drinking (5 or more days per week) was associated with low psychological, physical, educational and overall well-being. Evidence from reviews and datasets suggested that boys drank more than girls, and drinking was associated with more headaches, sleep problems, alcohol use, smoking, irritability, and school exclusion. GRADE (Grading of Recommendations, Assessment, Development and Evaluation) assessment suggests that the evidence is weak.

Conclusions Weak evidence suggests that up to a third of children in the UK consume caffeinated energy drinks weekly; and drinking 5 or more days per week is associated with some health and behaviour problems. Most of the evidence is from surveys, making it impossible to distinguish cause from effect. Randomised controlled trials are unlikely to be ethical; longitudinal studies could provide stronger evidence.

PROSPERO registrations CRD42018096292 – no deviations. CRD42018110498 – one deviation: a latent class analysis was conducted.

  • nutrition & dietetics
  • epidemiology
  • public health
  • community child health

Data availability statement

Data are available upon reasonable request. All the data in the overview are publicly available, but not necessarily without charge. Those for the dataset analysis are available from the UK Data Service.

This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ .

https://doi.org/10.1136/bmjopen-2020-047746


Strengths and limitations of this study

The main strength of this study was the novel use of a secondary data analysis to fill a gap in the evidence that was identified by the overview.

A strength of the overview was its robust methods, and that it only included reviews that used systematic methods.

A limitation of the overview was the strength of evidence of the primary research, most of which was from cross-sectional surveys.

The main limitations of the dataset analysis were that longitudinal data were not available, and the survey data could not be combined due to differences between surveys in their designs and measures reported.

Introduction

Caffeinated energy drinks (CEDs) are drinks containing caffeine, among other ingredients, that are marketed as boosting energy, reducing tiredness, and improving concentration. They include brands such as Red Bull, Monster Energy, and Rockstar. There is widespread concern about their consumption and effects in children and adolescents (under 18 years old). 1–4 Some professional organisations have suggested banning sales to children. 2 In the UK, warnings, aimed at children and pregnant women, are required on the packaging for drinks that contain over 150 mg/L of caffeine. 5 An average 250 mL energy drink contains a similar amount of caffeine to a 60 mL espresso, and the European Food Safety Authority proposes a safe level of 3 mg of caffeine per kg of body weight per day for children and adolescents. 6 Many drinks also contain other potentially active ingredients, such as guarana and taurine, and more sugar than other soft drinks, although there are sugar-free options. 7–9 Children may be more at risk of ill effects than adults. 10 11 Effects could be physical (eg, headaches), psychological (eg, anxiety) or behavioural (eg, school attendance or alcohol consumption). 12 Available systematic reviews report a wide range of findings, including positive effects on sports performance.
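Purely as an illustration of the figures above (not a recommendation), the arithmetic for a hypothetical 40 kg child and a 250 mL drink containing caffeine at exactly the 150 mg/L labelling threshold works out as follows:

```latex
% Worked example using the figures quoted above (illustrative only)
40\ \text{kg} \times 3\ \text{mg/kg/day} = 120\ \text{mg/day}
\qquad
0.25\ \text{L} \times 150\ \text{mg/L} = 37.5\ \text{mg per drink}
```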

In 2018, the UK government ran a consultation on implementing a ban on sales to children, 13–15 and in March 2019 they published a policy paper. 16 The research reported here was commissioned by the Department of Health and Social Care (DHSC), England, in 2018, to identify and assess the evidence on the use of CEDs by children. As the deadline was short, and as initial searches identified several systematic reviews, a systematic review of systematic reviews (referred to as overview, from this point onwards) was conducted. As only two UK studies were identified within the reviews included in the overview, UK datasets were sought, and a secondary analysis of relevant data was carried out to supplement the international literature and ensure relevance to UK policy. Full reports are available. 17 18

The research questions (RQ) were:

RQ1. What is the nature and extent of CED consumption among people aged 17 years or under in the UK?

RQ2. What impact do CEDs have on young people’s physical and mental health, and behaviour?

This paper summarises the overview and dataset analysis. 17 18 For the overview, a literature search was conducted during May 2018 and updated on 2 July 2021. EPPI-Reviewer software 19 was used to manage the data. The gaps, identified by the overview and a search for primary studies, guided the search, conducted during August 2018, for UK datasets and their subsequent analysis. STATA v13 20 was used to analyse the datasets. Ethical approval was granted by UCL’s Ethics Committee. Protocols were registered on PROSPERO (CRD42018096292 and CRD42018110498).

Search strategies

For the overview, we searched nine databases, focusing on research in health, psychology, science or social science, or general research. We completed forward citation searching in Google Scholar for 13 included reviews. The databases searched and the MEDLINE search strategy are in the online supplemental file (section 1). The search terms were based on three concepts: caffeine, energy drink, and systematic review. The searches were limited to the publication year of 2013 onwards, to identify the most recent systematic reviews. For the dataset analysis, search terms were based on caffeine and energy drink. We searched the UK Data Service 21 (accessing over 6000 UK nation population datasets), with no restrictions.
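The full published search strategies are in the online supplemental file; purely as a hypothetical illustration of combining the three overview concepts named above, a Boolean search along these lines could be constructed (this is not the authors’ actual strategy):

```
(caffeine OR caffeinated)
AND ("energy drink" OR "energy drinks")
AND ("systematic review" OR "meta-analysis")
```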

Supplemental material

Inclusion criteria

For the overview:

  • Systematic review published since 2013
  • Extractable data on children under 18 years of age
  • Available in English
  • Patterns of CED use or associations with physical, mental, social or behavioural effects.

Four reviewers (GB, CK, GR and CS) screened references based on their titles and abstracts, and then screened potential includes on their full texts. The four reviewers double-screened batches of 10 references until their decisions to include or exclude each paper were the same on at least nine of the 10 (90%), then they screened individually. Disagreements and indecisions were resolved by another of the four reviewers, where necessary.

For the dataset analysis:

  • Downloadable datasets, representative of the UK or a constituent country
  • Information on the levels and patterns of CED consumption
  • Data on children under 18 years of age (adults could provide the data on their behalf)
  • Reporting primary (frequency, amount, or occurrence of drinking/not drinking (comparator)) or secondary (sugar consumption, cardiovascular health, mental health, neurological conditions, educational outcomes, substance misuse, sports performance or sleep characteristics) measures.

After a pilot batch, for which two reviewers (GB and DK) assessed datasets independently and discussed their decisions to include or exclude, the remaining datasets were screened independently.

Data extraction

From the systematic review reports that met the overview inclusion criteria, we extracted details on/for: systematic review methods; included studies; CED consumption; associations with physical, mental, social or behavioural effects; and risk of bias assessment. One reviewer (GB, CK, GR or CS) extracted these data, which were checked by another reviewer. For the dataset analysis, one reviewer (GB or DK) extracted dataset characteristics (sample size, etc); details on participants (age, gender, etc) and consumption (how it was measured, etc); well-being and health outcomes, including potential confounders; and information on missing data and for risk of bias assessment.

The data extracted from the systematic reviews were synthesised in a narrative format due to variation between reviews. Prevalence was synthesised by the measure used, where possible. Associations were synthesised by whether they were physical, mental, behavioural, or social/educational, and summary tables were produced. One reviewer (GB, CK, GR or CS) synthesised the data and another checked each synthesis.

Each dataset was analysed for prevalence and frequency of CED consumption, and any variations by children’s characteristics. Most of the cross-sectional analyses were bivariate (exploring associations between two features), with binary and multinomial logistic regression used to control for confounders. A latent class analysis (LCA) was conducted, 22 for one dataset. The latent profiles were based on children’s health experiences, such as headaches, anxiety, or dizziness. The observed variables (11 indicators of child well-being) and latent variables (five classes of well-being) were identified from the data. Class membership was used as the dependent variable in multinomial logistic regressions. Descriptive associations were explored in bivariate analyses of the 11 indicators, separately. The results from individual datasets were synthesised in a narrative because meta-analysis was not deemed to be appropriate. Missing data were not imputed, as it was not possible to determine if they were missing at random. One reviewer (DK) analysed the data.
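The analyses themselves were run in STATA v13; as a rough sketch of the kind of multinomial logistic regression described above, written in Python with simulated data and hypothetical variable names, one could do something like the following:

```python
# Sketch of a multinomial logistic regression in the spirit of the analysis
# described above; data, effect sizes and variable names are simulated and
# purely illustrative (the original analyses were run in STATA).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 300
frequent_ced = rng.integers(0, 2, n)   # drinks CEDs 5+ days/week (0/1)
age = rng.integers(13, 17, n)

# Simulated well-being class: 0 = high, 1 = low psychological, 2 = low overall,
# with membership loosely related to frequent consumption.
probs = np.column_stack([
    0.5 - 0.2 * frequent_ced,
    0.3 + 0.1 * frequent_ced,
    0.2 + 0.1 * frequent_ced,
])
wellbeing_class = np.array([rng.choice(3, p=p) for p in probs])

X = sm.add_constant(np.column_stack([frequent_ced, age]))
result = sm.MNLogit(wellbeing_class, X).fit(disp=False)

# Exponentiated coefficients are relative risk ratios versus the reference class.
print(np.exp(result.params))
```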

Risk of bias

AMSTAR2 23 was used to assess the risk of bias in the included systematic reviews, because some reviews included randomised controlled trials (RCTs) as well as non-RCTs. AMSTAR2 has questions on the protocol, inclusion criteria, search, selection, data extraction, risk of bias assessment, reporting, synthesis (RCTs and non-RCTs), and conflicts of interest; a question on relevance was added. The strength of the evidence was assessed using GRADE (Grading of Recommendations, Assessment, Development and Evaluation) criteria, 24 which can be used to determine whether the evidence is strong or weak, based on any risk of bias, including in study design and size, consistency of the results, relevance to the population, and potential publication bias. Overlap, where the same primary studies appear in more than one review, was assessed. 25 Overlap can lead to double counting of the results of a study, giving these more influence than those of other studies. 26 Two reviewers (CK and GR) assessed risk of bias; random samples were checked by a third reviewer (GB). Datasets were not formally assessed, but all datasets met the quality assurance criteria of the UK Data Service. 27 Data on exposure (quantity, frequency and type of drink), sample frame (characteristics of participants), and level of participation (response rate) were extracted, by one reviewer (DK), to determine their parameters. 17 In line with National Institutes of Health guidance, 28 no overall risk of bias score was produced for each dataset because overall scores can be misleading where the risk of bias on each criterion has a different impact on the reliability of the conclusions.

Patient and public involvement

We did not include young people in the research process.

The overview searches identified 1102 references, after deduplication (see figure 1); 126 were screened on full texts. We included 15 reviews; six reported information on prevalence, 12 29–33 and 14 reported associations. 12 29–32 34–42 The reasons for exclusion, based on assessment of the full text, are reported in the online supplemental file (section 2). Most were excluded because they did not use systematic review methods or did not report information on children.


Flow diagram for the overview. CED, caffeinated energy drinks; T&A, title and abstract.

Three reviews focused on CEDs in children. 12 30 41 One 35 focused on children, with a section on CEDs alongside other drinks. The other 11 reported information on children alongside data for adults; one 29 with CEDs alongside other drinks, and two 31 32 focusing on alcohol mixed with CEDs. For summary and full characteristics, see the online supplemental file (section 3) and the full report. 18

For the dataset analysis, as there was no facility to export results, it was not possible to record the flow of datasets through screening. Five datasets met the inclusion criteria; analyses were not possible for one dataset 43 (see table 1 ). For full descriptions, see the full report. 17


Description of the five datasets included in the secondary data analysis

There was a high risk of bias in all but three of the reviews—Visram et al, 12 Bull et al, 37 and Yasuma et al 41 (details in the online supplemental file, section 4)—meaning that some relevant evidence may have been missed. Overlap between studies in the reviews was slight (corrected covered area 3.2%; see the online supplemental file, section 5). The reviews did not include any analyses of the UK datasets that we analysed. Within the reviews, there were four small randomised controlled trials, while most studies were surveys with a high risk of bias; the application of GRADE criteria, which are used to assess the overall strength of the evidence found, suggests that the evidence is weak. Exposure, sample frame and level of participation for the datasets are reported in appendix 1 of the full report. 17
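For readers unfamiliar with the overlap measure quoted above, the corrected covered area (CCA) is commonly calculated from the citation matrix of reviews and their included primary studies as

```latex
\mathrm{CCA} = \frac{N - r}{rc - r}
```

where N is the total number of times primary studies appear across the reviews, r is the number of unique primary studies, and c is the number of reviews; values of only a few per cent, as here, indicate slight overlap.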

UK studies in the overview

Of the 74 studies identified by the reviews that are summarised in the overview, two were UK surveys. One 44–46 was a longitudinal (two time-points) cross-sectional survey of 11- to 17-year-olds in the south-west of England. The other 47 was a survey of 13- to 18-year-olds across 22 European countries, one of which was the UK (2.6% of respondents).

Below and in tables 2–4 , the overview results are summarised by research question, followed by highlights of the dataset analysis within each topic. The full results of the overview 18 and dataset analysis 17 are available online.

Table 2: Characteristics and main findings of reviews reporting prevalence of consumption

Table 3: Prevalence of CED consumption across datasets by school year (approximately weekly consumption with weighted percentages and unweighted sample sizes – see notes below)

Table 4: Characteristics and main findings of the reviews reporting associations with consumption

RQ1. Nature and extent of CED consumption

The overview included six reviews with data on prevalence of children’s CED consumption, and these are summarised in table 2 .

Across reviews, prevalence varied by study location, population age range, and definition of drinking (ever drunk, in the past year, regularly, with alcohol, etc) from 13% to 67% of children having a CED in the past year. 30 32 One meta-analysis 29 of four studies in the Gulf states suggested that about two thirds of children consumed CEDs (not further defined; 65.3%, 95% CI 41.6 to 102.3 (as reported in the paper)). Across reviews, weekly or monthly drinking ranged from 13% to 54% 48 of children. In one study, across Europe, UK children had the highest proportion of caffeine intake from CEDs, at 11%, 47 but this might reflect a lower intake from coffee or tea. Across reviews, 10% 49 to 46% 50 of children had tried CEDs with alcohol.

In the UK dataset analysis, self-reported prevalence was relatively consistent across UK countries (see table 3 ), although there were differences in the questions asked. About a quarter of children aged 13 to 14 years consumed one drink or more per week (Smoking and Drinking Survey of Young People (SDSYP) data). 51 Prevalence ranged from 3% to 32% of children—slightly lower than found in the overview.

Characteristics of drinkers

In the overview, more boys reported drinking CEDs than girls. 12 29–32 Prevalence by age was inconsistent: for example, within the reviews, one study 48 found that girls started drinking CEDs when they were younger; while one 52 suggested that drinking prevalence peaked at 14 to 15 years; and another 53 suggested that more older boys drank CEDs than younger boys, but more younger girls drank them than older girls. Prevalence by ethnicity was also inconsistent. Children with minority ethnicity drank more than white children, 12 32 but white children drank more than black or Hispanic children, when drinks were mixed with alcohol. 12 In the UK, drinking was associated with being male, older and lower socioeconomic status. 45

In the dataset analysis, the SDSYP reported the most detailed information on sociodemographic characteristics. As in most of the overview evidence, prevalence increased with age, so that between a quarter and a third of children aged 15 to 16 years reported consuming one or more CED per week. More boys (29.3%) than girls (18.1%), and more children living in the North of England than in the South (for example, 33.1% in the North-East vs 16.5% in the South-East), consumed at least one can a week. More children who were eligible for free school meals (29.5%), than those who were not eligible (22.6%), drank CEDs weekly. These differences were robust to the impact of potential confounders (see the online supplemental file , section 6). Unlike the evidence from the overview, which suggested differences in consumption by ethnicity, the proportion of weekly CED consumers was within 3 percentage points of the average across all ethnic groups.

Motives and context

Three reviews reported on motives or context for consumption. 12 29 32 The context was parties and socialising with friends or family 12 32 35 or exams. 29 Children’s motives included taste (particularly with alcohol), energy, curiosity, friends drinking them, and parental approval or disapproval. Across the reviews, single studies suggested that more girls than boys drank CEDs to suppress appetite, 54 while more boys than girls drank them for performance in sport. 55 About half of children knew that the drinks contained caffeine, 56 while those who knew that the content might be harmful drank less. 57

Motives and context were not measured in the UK datasets.

RQ2. Associations with drinking CEDs

Fourteen reviews reported associations and are summarised in table 4 . Most reviews included cross-sectional evidence (surveys) or individual case studies. Three reviews 12 40 42 reported prospective trials (four small RCTs in total), which assessed physical performance, cardiovascular response, or the effects of sleep education; one review reported prospective cohort studies.

As most of the evidence was from surveys, measured at a single time-point, cause cannot be distinguished from effect.

Physical health associations

Associations between drinking CEDs and physical symptoms were reported in all but one 40 of the 14 reviews. CEDs improved sports performance. 58 59 There was consistent evidence of associations with headaches, stomach aches and low appetite, 12 35 42 and with sleep problems. 12 30 35 42 Within the reviews, a trial of boys randomised to receive different doses of CED reported dose-dependent increases in diastolic blood pressure and decreases in heart rate. 60 Across reviews, 34 36–39 nine cases of adverse events were reported; eight children had cardiovascular events, and one had renal failure, following a single drink, moderate drinking, or excessive drinking (in a day or for weeks).

Analysis of the Health Behaviour in School Children (HBSC) 2013/14 data found that children drinking CEDs once a week or more, compared with those drinking less often, were statistically significantly more likely to report physical symptoms occurring more than once a week, such as headaches (22.2% vs 16.8%), sleep problems (13.6% vs 8.5%) and stomach problems (31.2% vs 23.1%).

Mental health associations

Associations between drinking CEDs and mental health were inconsistent. 12 29 30 32 35 40 42 One review reported that improvements in mental health and hyperactivity were found in children who were randomised to receive an intervention to lower their intake of CEDs. 61 Associations were found with stress, anxiety or depression, 12 30 35 40 42 but two reviews 12 40 also found studies that did not find an association. Some reviews included evidence of associations with self-harm or suicidal behaviour, 30 35 40 42 and with irritation and anger. 12 30 35 40 42

Secondary analyses of the HBSC 2013/14 data found that children who consumed CEDs at least once a week were statistically significantly more likely, than those who did not, to report low mood (20.3% vs 14.9%) and irritability (30.8% vs 18.0%) on a weekly basis.

Behavioural associations

Some evidence of associations between drinking CEDs and behaviour was reported. 12 30–32 35 42 Drinking CEDs was associated with alcohol, smoking and substance misuse at a single time point, 12 30 35 and at follow-up. 41 CED consumption at baseline predicted alcohol consumption at follow-up. 12 Consumption was associated with increased hyperactivity and inattention, and with sensation seeking. 12 30 35 Injuries were associated with drinking CEDs with alcohol 12 31 and without alcohol. 12 30

Analysis of the SDSYP data found that higher proportions of children who consumed one or more cans per week had tried alcohol (59.1%) and smoking (39.7%), compared with non-CED consumers (alcohol 28.9%, smoking 10.4%).

Social or educational associations

Consistent associations between drinking CEDs and social or educational outcomes were reported. 12 32 Within reviews, one UK study 45 found an association between drinking CEDs once a week or more and poor school attendance. CEDs mixed with alcohol were associated with lower grades and more absence from school. 32

Analysis of the SDSYP data found that almost half of children who had been truant or excluded reported drinking a can of CED on a weekly basis (49.5%), compared with less than a fifth of those who had not been truant or excluded (18.5%).

Well-being profiles

Using the HBSC 2013/14 dataset, we identified 11 indicators of well-being: weekly experience of irritability, sleep difficulties, nervousness, dizziness, headaches, stomach aches, and low mood; as well as low life satisfaction, feeling pressured by schoolwork some or a lot of the time, dislike of school, and low self-rated academic achievement. From these, using LCA, we identified five profiles: low psychological well-being (18.2% of children), high overall well-being (48.6%), low educational well-being (6.7% of children), low physical well-being (13.0%), and low overall well-being (13.5%). See the online supplemental file (section 6) for details.

After controlling for age, gender, rurality, smoking status, alcohol status and Family Affluence Scale (a measure of socioeconomic status; for more information see Hartley et al 62 ), the relative risk of having a low well-being profile, compared with a high well-being profile, was substantially higher for children who consumed CEDs at least 5 days a week (frequent), compared with those who rarely or never did. Relative to a high well-being profile, frequent consumers had a higher risk of low psychological well-being (RR 2.11, 95% CI 1.56 to 2.85) and low physical well-being (RR 2.52, 95% CI 1.76 to 3.61), and were over four times more likely to have low educational well-being (RR 4.81, 95% CI 3.59 to 6.44) and low overall well-being (RR 4.15, 95% CI 2.85 to 6.00). These data suggest that CED consumption is a marker of low well-being, but the analyses also showed that consumption was one of a cluster of factors (eg, smoking and drinking alcohol) in children with low well-being.
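The relative risks above come from adjusted multinomial models; purely for intuition about the measure itself, an unadjusted relative risk from a simple two-group comparison is the ratio of two proportions, as in this sketch with invented counts:

```python
# Unadjusted relative risk from hypothetical counts, for intuition only;
# the estimates reported above come from adjusted multinomial models.
import math

exposed_events, exposed_total = 40, 100        # frequent consumers with low well-being
unexposed_events, unexposed_total = 20, 100    # infrequent/non-consumers with low well-being

rr = (exposed_events / exposed_total) / (unexposed_events / unexposed_total)
se_log_rr = math.sqrt(1 / exposed_events - 1 / exposed_total
                      + 1 / unexposed_events - 1 / unexposed_total)
ci_low = math.exp(math.log(rr) - 1.96 * se_log_rr)
ci_high = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```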

Summary of the evidence

Prevalence varied according to the measures used and the ages of children. In the overview, CED consumption prevalence was up to 67% of children in the past year and, in the dataset analyses, up to 32% of children were consuming a CED at least 1 day a week, meaning that up to a third of UK children are regularly consuming caffeine. Evidence from the overview and the dataset analyses consistently suggests that boys drink more than girls, and that drinking tends to increase with age. Some evidence from the overview suggested higher prevalence in children from ethnic minority backgrounds, but no such association was detected in the UK data analysis. This could be due to factors such as area of residence or social class affecting well-being in children from ethnic minorities, where well-being is driving the differences in prevalence of CED consumption, rather than minority background. Reviews included in the overview found that most drinking of CEDs occurred at parties, around exams, with friends, or with family, and motives included taste, energy, curiosity, appetite suppression, and sports performance, which was reported to be improved. There was some evidence that knowledge of content was low, and that children who knew that the content might be harmful drank less, suggesting that education could reduce drinking.

Evidence from the overview suggests worse sleep, and raised blood pressure, with CED consumption, compared with reduced or no consumption. Both the overview and the dataset analysis found that children who consumed CEDs reported headaches, stomach aches and sleep issues more frequently than those who did not; although most studies were cross-sectional, some in the overview were longitudinal, showing changes over time. 18 The overview identified consistent evidence of associations with self-harm, suicide behaviour, alcohol use*, smoking*, substance misuse*, hyperactivity, irritation*, anger, and school performance, attendance, and exclusion (*also found in the UK dataset analysis). This was consistent with findings reported in non-systematic reviews. 10 63 64

The UK dataset analysis suggested that children who consumed CEDs 5 or more days a week had lower psychological, physical, educational and overall well-being than non-drinkers. It remains unclear whether drinking CEDs contributes to low well-being, or low well-being leads to CED consumption, or both. Alternatively, there may be a common cause, such as social inequality.

Strengths and limitations

The overview was limited by the amount of information reported in the included systematic reviews, and by their method limitations; all had a high risk of bias. They mainly included cross-sectional surveys or case reports, which means that cause or effect cannot be determined where an association is found. However, some prospective studies, including four small RCTs, were included in the reviews and where there were common measures, the evidence from these RCTs and from most of the cross-sectional studies within the reviews was consistent. This suggests that the associations found could be reliable. A strength of our work is that the UK evidence in the overview (two studies within the reviews) was supplemented by the analysis of UK data, which was mostly consistent with the non-UK evidence. These data support the idea that there is a link between drinking CEDs and poorer health and behaviour in children, although the cause is unclear. Overlap between reviews in the overview was slight (unsurprisingly, given the different foci of the reviews). There was no overlap between the reviews and the dataset analysis, meaning that the latter added new information. The wide range of tools used to measure prevalence made it difficult to summarise the overview evidence, and meta-analysis of the individual participant UK data was not possible, meaning that the conclusions are based on weaker evidence from single sources.

Recommendations for research

Standardisation is needed in the measurement of the prevalence of drinking, defining the dosage (in drinks and/or caffeine), timing (daily, weekly, etc) and population (age, ethnicity, etc). There was little evidence on children under 12 years old, and both the overview and the dataset analysis found little evidence from the UK. Longitudinal data from the UK datasets should be collected to better understand the impact of consumption. RCTs may not be ethical, even where benefits are predicted, for example randomising children who consume CEDs to interventions to reduce or stop their drinking to see whether this improves their well-being.

Based on a comprehensive overview of available systematic reviews, we conclude that up to half of children worldwide drink CEDs weekly or monthly and, based on the dataset analysis, up to a third of UK children do so. There is weak but consistent evidence, from reviews and UK datasets, that children who drink CEDs have poorer health and well-being. In the absence of RCTs, which are unlikely to be ethical, longitudinal studies could provide stronger evidence.

Ethics statements

Patient consent for publication.

Not applicable.

Ethics approval

This study does not involve human participants.

Acknowledgments

Thank you to Irene Kwan for assisting with data extraction for the review.


Supplementary materials

Supplementary data.

This web only file has been produced by the BMJ Publishing Group from an electronic file supplied by the author(s) and has not been edited for content.

  • Data supplement 1
  • Press_Release.pdf

Twitter @katysutcliffe

Contributors GB, CK, GR and CS worked on all stages of the overview. GB, CK, DK and GR worked on the overview update. GB and DK completed all stages of the secondary data analysis. GB, KS, AS and JT supervised the work. All authors discussed the results and contributed to the final manuscript. JT is the guarantor of this work.

Funding This overview and secondary data analysis was funded by the National Institute for Health Research (NIHR) Policy Research Programme (PRP) for the Department of Health and Social Care (DHSC). It was funded through the NIHR PRP contract with the EPPI Centre at UCL (Reviews facility to support national policy development and implementation, PR-R6-0113-11003). Any views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the NIHR or the DHSC.

Competing interests None declared.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.


  • Systematic Review
  • Open access
  • Published: 14 August 2024

Experiences of informal caregivers supporting individuals with upper gastrointestinal cancers: a systematic review

  • Melinda Furtado   ORCID: orcid.org/0000-0001-5472-4707 1 ,
  • Dawn Davis 1 ,
  • Jenny M. Groarke 1 , 2 &
  • Lisa Graham-Wisener 1  

BMC Health Services Research volume 24, Article number: 932 (2024)


Upper gastrointestinal cancers (UGICs) are increasingly prevalent. With a poor prognosis and substantial longer-term effects, UGICs present significant adjustment challenges for individuals with cancer and their informal caregivers. However, the supportive care needs of these informal caregivers are largely unknown. This systematic review of qualitative studies synthesises and critically evaluates the current evidence base on the experiences of informal caregivers of individuals with UGIC.

A Joanna Briggs Institute systematic review was conducted. Searches were performed in four databases (MEDLINE, PsycINFO, Embase, CINAHL) from database inception to February 2021. Included studies explored experiences of informal caregivers of individuals diagnosed with primary cancer of the oesophagus, stomach, pancreas, bile duct, gallbladder, or liver. Studies were independently screened for eligibility and included studies were appraised for quality by two reviewers. Data were extracted and synthesised using meta-aggregation.

19 papers were included in this review, and 328 findings were extracted. These were aggregated into 16 categories across three synthesised findings: (1) UGIC caregiver burden: UGIC caregivers undertake extensive responsibilities, especially around patient diet, as digestion is severely affected by UGICs. (2) Mediators of caregiver burden: the nature of UGICs, characterised by disruptive life changes for caregivers, was identified as a mediator of caregiver burden. (3) Consequences of caregiver burden: UGIC caregivers’ experiences were shaped by unmet needs, a lack of information and a general decline in social interaction.

Conclusions

The findings of this review suggest the need for a cultural shift within health services. Caregiving for UGIC patients appears to adversely affect caregivers’ quality of life, as in other cancer caregiving populations. Caregivers should therefore be better incorporated as co-clients in care planning and delivery, including in discussions about the patient’s diagnosis, treatment options and potential side effects.

Peer Review reports

The National Institute for Health and Care Excellence (NICE) [ 1 ] defines upper gastrointestinal cancers (UGICs) as cancers of the oesophagus, stomach, pancreas, bile duct/gallbladder, or liver. Of all new cancer diagnoses in 2020 globally, 16.6% were UGICs [ 2 ]. Incidence of UGICs is increasing in countries under economic transition, and in Western countries due to heightened exposure to certain risk factors [ 3 ]. Overall prevalence of UGICs is also expected to rise annually with growing life expectancy and improved diagnostics [ 4 ]. Despite this, UGICs still have a uniquely poor prognosis in comparison to other cancer populations [ 5 ]. UGICs do not typically benefit from screening programmes, and individuals are more likely to present at diagnosis with advanced disease [ 6 ]. This is compounded by a high rate of recurrence for individuals able to receive curative treatment [ 7 , 8 , 9 ]. As a result, UGICs persistently account for a significant proportion of global cancer deaths; 27.1% in 2020 [ 2 ]. Poor prognosis contributes significantly to the heightened disease burden of UGIC, alongside increased utilisation of health services due to the complexity of the treatment trajectory and symptom management [ 10 , 11 ]. In comparison to other cancer populations, having UGIC is associated with late consultation with palliative care services [ 12 ], meaning that patients and their families have delayed access, if any, to supportive interventions such as counselling, psycho-education, financial advice and structured family meetings [ 13 ].

The supportive care needs of the sizeable population of individuals with UGIC are considerable, with sustained late and longer-term effects. In addition to the common sequelae of cancer diagnosis and treatment, disruption to the digestive system presents problems with swallowing, nausea and keeping food down, a modified diet, extreme changes in weight, chronic pain and living with a stoma [ 14 , 15 ]. The poor prognosis and longer-term effects present a challenge in adjustment both for the individual with UGIC and for their informal caregiver, defined as “close persons” who may be related to the diagnosed individual (siblings, relatives, or spouses) or not (friends, neighbours). A caregiver is anyone identified as such by the patient to provide unpaid ongoing care and support [ 16 ]. Examples of challenges for caregivers include learning new practical skills such as managing negative responses to foods, providing a new diet, monitoring weight changes, chronic pain management and stoma management [ 17 , 18 ]. With biomedical advances leading to a reduction in the length of hospital stays [ 19 ], increasing emphasis is placed on the role of the UGIC caregiver in supporting the individual with cancer in the community.

This unique caregiver population faces distinct challenges which contribute to caregiver burden, reflecting the need for further research into their experiences. For example, because of changes in the diet of the individual with UGIC, the social aspect of dining is compromised for both members of the dyad, which can lead to feelings of loneliness, anxiety, and shame [ 20 , 21 ]. Evidence of caregiver burden is suggested by high levels of anxiety and depression: among caregivers of post-treatment oesophageal cancer patients, 30% reported moderate to high levels of anxiety and 10% reported moderate to high levels of depression, alongside a significant fear of recurrence [ 22 ]. Research suggests that UGIC caregivers may experience higher levels of psychological distress than the individual with UGIC, and that clinical levels of anxiety and depression may be sustained in the longer term [ 22 , 23 ]. However, it is worth noting that many of the effects of UGIC caregiving acknowledged in the literature are consistent with the general experience of providing informal care, and as such there is scope to apply beneficial practices from other settings (both non-gastrointestinal cancer and non-cancer).

It is crucial that we recognise the role of caregivers as co-clients and understand the experiences of this significant caregiver population. Caregivers’ personal experiences are inherently subjective, making a qualitative research approach optimal [ 24 ]. A synthesis of existing qualitative studies will help to establish a knowledge base on the experience of informal caregivers of individuals with UGIC and will help to inform the provision of supportive care. An initial search of PROSPERO, MEDLINE, the Cochrane Database of Systematic Reviews and the Joanna Briggs Institute (JBI) Database of Systematic Reviews and Implementation Reports was conducted, and no current or underway systematic reviews on the topic were identified.

This qualitative systematic review aims to synthesise the best available evidence on the experiences of informal caregivers supporting individuals diagnosed with UGIC.

This systematic review was conducted following the JBI approach to qualitative systematic reviews [ 25 ]. A protocol was pre-registered in PROSPERO (registration number CRD42021235354). The systematic review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [ 26 ].

Search strategy

An initial limited search of MEDLINE (Ovid) and PsycINFO (Ovid) was undertaken using the following keywords: Oesophageal cancer OR Stomach cancer OR Gastrointestinal cancer OR pancreas cancer OR gallbladder cancer OR liver cancer AND caregiver AND Qualitative. The text words contained in the titles and abstracts of relevant articles, and the index terms used to describe them, were used to develop a full search strategy for MEDLINE, which was then adapted for the other databases.

The final search strategy (Additional information 1 ) was then employed against four databases: MEDLINE (Ovid), PsycINFO (Ovid), Embase (Elsevier) and CINAHL (EBSCOhost). Each database was searched on 12th February 2021.
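Search strategies of this kind follow a block structure: synonyms for each concept are combined with OR, and the concept blocks are combined with AND. The short Python sketch below illustrates that structure only; it is not the authors’ actual search syntax, and any terms beyond the keywords quoted above (for example “carer” and “interview”) are assumed, illustrative synonyms.

# Illustrative sketch only: assembling a block-structured Boolean query.
# Terms beyond the keywords quoted above are hypothetical additions showing
# how a full strategy might expand each concept block.

cancer_terms = ["Oesophageal cancer", "Stomach cancer", "Gastrointestinal cancer",
                "pancreas cancer", "gallbladder cancer", "liver cancer"]
caregiver_terms = ["caregiver", "carer"]        # "carer" is an assumed synonym
design_terms = ["Qualitative", "interview"]     # "interview" is an assumed synonym

def or_block(terms):
    # OR the synonyms inside one parenthesised concept block
    return "(" + " OR ".join(terms) + ")"

# AND the three concept blocks together, as in the limited search described above
query = " AND ".join(or_block(block) for block in (cancer_terms, caregiver_terms, design_terms))
print(query)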

Study selection

Following the formal searches, all identified citations were collated and uploaded into EndNote [ 27 ] to identify and remove duplicates. Rayyan reference management software [ 28 ] was then used by two independent reviewers (DD, MF) to screen titles and abstracts against the eligibility criteria. Potentially relevant articles were retrieved in full and screened against the eligibility criteria by the same two independent reviewers (DD, MF). Reasons for exclusion of papers at full-text review were recorded. Any disagreements that arose between the reviewers at each stage of the selection process were resolved through discussion (DD, MF) or with an additional reviewer (LGW). The reference lists and citation lists of all eligible articles were searched for additional studies.
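As a minimal sketch of the de-duplication and dual-screening bookkeeping described above (the actual work was done in EndNote and Rayyan), the Python fragment below shows one common way such steps are represented; the record fields and decision labels are assumptions for illustration, not the tools’ interfaces.

records = [
    {"doi": "10.1000/xyz1", "title": "Caring for oesophageal cancer patients"},
    {"doi": "10.1000/xyz1", "title": "Caring for oesophageal cancer patients"},  # duplicate
    {"doi": None, "title": "Spouses of people with pancreatic cancer"},
]

def deduplicate(records):
    # Remove duplicates using the DOI where present, otherwise a normalised title
    seen, unique = set(), []
    for record in records:
        key = record["doi"] or record["title"].casefold().strip()
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

def reconcile(decision_a, decision_b):
    # Two reviewers screen independently; disagreements go to discussion or a third reviewer
    return decision_a if decision_a == decision_b else "resolve by discussion / third reviewer"

print(len(deduplicate(records)))        # 2 unique records
print(reconcile("include", "exclude"))  # flagged for discussion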

Inclusion and exclusion criteria

This review included studies exploring experiences of adults (≥ 18 years of age) who are informal caregivers of individuals diagnosed with UGIC at any stage within the disease process. This included those diagnosed with cancer of the oesophagus, stomach, pancreas, bile duct, gallbladder, or liver [ 1 ]. This diagnosis must be the primary cancer site. Studies involving informal caregivers of individuals who had secondary gastrointestinal system metastases were not included.

A caregiver is anyone identified as such by the patient to provide unpaid ongoing care and support [ 16 ]. Paid professional caregivers were not included. The caregivers included provided various services, such as practical (providing transport, overseeing meals) or emotional support roles in caring for the patient. Caregivers with any gender or ethnicity were considered for inclusion. Both active and bereaved caregivers were eligible, if discussing their pre-bereavement experience.

Studies which reviewed experiences of multiple groups (e.g., patients, caregivers, healthcare professionals) or multiple cancers beyond the remit of UGIC were included, provided the data pertaining to informal caregivers and UGICs was clearly delineated and could be extracted separately. Where data was hard to distinguish regarding participant-type or cancer-type, the study was only included if at least 50% of the sample size was drawn from the target population.

Phenomena of interest

The review included qualitative studies that looked at caregivers’ experiences of caring for an individual with UGIC.

Studies for inclusion were based in any geographic location or setting. All care contexts were considered relevant (e.g., primary care, secondary, tertiary, community, or home settings).

Types of studies

Research studies considered for inclusion were those focused on qualitative data, including, but not limited to, designs such as phenomenology, grounded theory, ethnography, action research and feminist research. Mixed-methods studies were considered relevant if data from the qualitative component could be clearly extracted. Only English-language studies were included.

Only empirical studies published in peer-reviewed journals were included. There was no restriction on publication year. Systematic reviews were not included; however, relevant primary studies were harvested from them. Editorials, opinion papers, case studies and any articles without relevant, original data were excluded, alongside grey literature.

Quality appraisal

Two independent reviewers (DD, MF) then critically appraised the included studies for methodological quality using the JBI Critical Appraisal Checklist for Qualitative Research [ 29 ]. All studies underwent data extraction and synthesis regardless of their methodological quality. One of the included studies used free-text questionnaires [ 30 ], the robustness of which has been questioned by qualitative researchers because the data generated from such responses are rarely rich enough to provide strong insights [ 31 ]. However, the reviewers judged that the robustness of this study was upheld because the researchers conducted a comprehensive search of the existing literature prior to data collection, allowing the questionnaire findings to be scaffolded onto existing conceptual frameworks.

Data extraction

Data were extracted by two independent reviewers (DD, MF) using the standardised JBI data extraction tool [ 32 ]. Each undertook data extraction for half of the articles and then checked the other reviewer’s extraction. The extracted data included specific details about the population, context, study methods and the phenomena of interest relevant to the review objective. Disagreements between the reviewers were resolved through discussion. Four authors of included papers were contacted to request missing or additional data, mainly regarding the breakdown of participant populations by cancer type; no new information arose.

A finding is defined by the JBI as “a verbatim extract of the author’s analytic interpretation accompanied by either a participant voice, or fieldwork observations or other data.” [ 33 , p40]. Findings were identified through repeated reading of the text, and extraction of findings included any distinct analytic observation reported by authors with an accompanying illustration (Additional information 2 ).

Data synthesis

Each finding was identified by an alphanumeric code (e.g., A1, A2, B1, etc.). Each letter corresponded to a study and each number to a unique finding. The progressive numbers indicate the order of the findings within the original article. Each finding was rated with one of three levels of credibility as per the ConQual system [ 34 ]:

Unequivocal - findings accompanied by an illustration that is beyond reasonable doubt and therefore not open to challenge.

Credible - findings accompanied by an illustration lacking clear association with it and therefore open to challenge.

Not Supported - findings are not supported by the data.

Qualitative research findings were pooled using the meta-aggregation approach and captured in a Microsoft Excel spreadsheet [ 33 ]. Findings were aggregated by grouping them on the basis of similarity in meaning and labelling the resulting categories accordingly. Categories were then synthesised to produce a comprehensive set of synthesised findings. Two reviewers (DD, MF) repeatedly read the findings and developed the set of categories. To assess the quality and confidence of each qualitative finding synthesised within this review, the authors used the ConQual system (Additional information 3), a tool for assigning ratings of confidence to synthesised qualitative research findings [ 34 ]. Only unequivocal and credible findings were included in the synthesis.
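A minimal Python sketch of this bookkeeping follows, using invented example codes, ratings and category labels; it simply shows how findings rated below “credible” are dropped and the remainder grouped by category, and is not the reviewers’ actual spreadsheet.

from collections import defaultdict

# Invented example findings: alphanumeric code, ConQual-style credibility rating, category label
findings = [
    {"code": "A1", "rating": "unequivocal",   "category": "Breadth of the caregiver role"},
    {"code": "A2", "rating": "credible",      "category": "Challenges around patients' meals"},
    {"code": "B1", "rating": "not supported", "category": "Life disruption"},
]

# Only unequivocal and credible findings enter the synthesis
eligible = [f for f in findings if f["rating"] in ("unequivocal", "credible")]

# Group eligible findings by category; categories are later synthesised into overarching findings
by_category = defaultdict(list)
for finding in eligible:
    by_category[finding["category"]].append(finding["code"])

for category, codes in by_category.items():
    print(category, codes)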

The combined database searches yielded 5465 records. After removing duplicates and screening studies against eligibility criteria (Fig.  1 ), the review included 19 studies [ 18 , 30 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 ]. Additional information 4 displays the characteristics of the 19 included studies.

Figure 1: PRISMA flowchart of study selection process

Description of included studies

All included studies were published between 2004 and 2021. Most commonly, studies focused on caregivers of individuals with oesophageal cancer (N = 7) or pancreatic cancer (N = 7), including one study of pancreatic and bile duct cancer. Other studies included caregivers of individuals with liver cancer (N = 2), gastric cancer (N = 1) and cancers of the gastrointestinal tract generally (N = 2). Geographically, studies were conducted in eight regions. The largest group (N = 6) were conducted in the US [ 35 , 37 , 38 , 39 , 40 , 41 ], followed by Denmark (N = 3) [ 42 , 43 , 44 ]. Most samples included a variety of within-family caregivers (N = 13), generally spouses/partners, children and siblings. Others (N = 3) looked specifically at spouses, and three did not specify the caregiver-patient relationship. Most studies used a semi-structured interview format (N = 12); others used focus groups (N = 4), secondary analysis of existing data (N = 2) or questionnaires (N = 1).

Quality of included studies

The JBI Critical Appraisal Checklist [ 25 ] was used to establish the quality of the research. The included studies were generally of good quality, with all 19 papers meeting at least 60% of the ten JBI quality assessment criteria (Additional information 5). Within the JBI checklist there are five questions assessing study dependability, on which the studies performed less well. Of the included papers, two achieved 5/5 on the dependability questions, seven achieved 4/5, nine scored 3/5 and one scored 2/5. Only 26% of studies adequately located the researcher(s) culturally or theoretically, and only 37% of papers addressed the influence of the researcher on the research and vice versa. Conversely, nearly all papers adequately addressed the congruity between the research methodology and the objectives, data collection, data representation and analysis.
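As an illustration of the scoring described above, the sketch below computes the percentage of the ten JBI checklist criteria met by each study and a dependability subscore out of five. The item responses, and the choice of which five items count towards dependability, are assumptions for illustration only, not the review’s appraisal data.

JBI_ITEMS = 10                          # ten criteria in the JBI qualitative checklist
DEPENDABILITY_ITEMS = [1, 2, 3, 4, 6]   # assumed positions of the five dependability questions

# Invented example appraisals: 1 = criterion met, 0 = not met
appraisals = {
    "Study A": [1, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "Study B": [1, 0, 1, 1, 0, 1, 1, 1, 0, 1],
}

for study, answers in appraisals.items():
    overall = 100 * sum(answers) / JBI_ITEMS
    dependability = sum(answers[i] for i in DEPENDABILITY_ITEMS)
    threshold = "meets" if overall >= 60 else "below"
    print(f"{study}: {overall:.0f}% of criteria met, dependability {dependability}/5, {threshold} the 60% threshold")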

Meta-aggregation findings

Across the 19 studies, 328 supported findings were extracted, of which 239 were unequivocal and 89 were credible. Findings were aligned into 23 categories with unique core meanings, which were then synthesised into three findings: (1) UGIC caregiver burden; (2) mediators of caregiver burden; (3) consequences of caregiver burden (Additional information 6). Figure 2 outlines how the categories relate to the overarching synthesised findings. To remain grounded in the data, participants’ actual words are used throughout the narrative, and double quotation marks indicate a direct caregiver quote. The reference given after a quotation links the quote to the study, as outlined in Additional information 2.

Figure 2: Structural arrangement of categories and synthesised findings

Synthesised finding 1: UGIC caregiver burden

As caregivers began supporting those with UGIC, they faced numerous challenges to adjustment. This largely stemmed from efforts to integrate a broad and complex caregiving role within their existing routine. Difficulties such as disruption to daily routines and meals impacted caregivers’ psychological wellbeing. Caregivers were often unprepared for this life disruption, leading them to seek out information from which to learn and distribute to others.

1. Breadth of the caregiver role

The extent of responsibilities on UGIC caregivers was perceived as broad and complex, with an ‘all-encompassing’ focus on patient outcomes. UGIC caregivers ‘assume different roles’ [ 42 ].

“The food thing is omnipresent. We have been told that he is not allowed to have further weight loss” (K23).

Specific responsibilities included working around reduced appetite and oral intake; monitoring physical signs, for example patient weight; and perioperative management, such as care of surgical wounds and organising medical appointments and treatments.

“We’d have to keep. . .going with all the medical appointments and surgery and treatment” (B11).

2. Challenges around patients’ meals

Treatment for, and progression of, UGIC severely impacts the patient’s relationship with food, with diet quantity and content at times significantly altered. Adaptation for the caregiver involved learning about dietary modifications and the management of digestive symptoms such as dysphagia. Several studies found that the new dietary restrictions were a source of worry for caregivers regarding the patient’s weight [ 44 , 45 , 46 , 47 ]. The social importance of food was a common theme throughout the included studies, with interruption to established social norms perceived as distressing. Mealtimes are considered a ‘unifying family ritual’ [ 49 ], but when they constantly served to remind caregivers of their responsibility for monitoring, they became a potential source of distress.

“I can’t get Bernard out of the small meals. . I have to ring him every day from work to tell him to eat” (A7).

3. Life disruption

UGIC was experienced as coming unexpectedly into caregivers’ lives, intruding on their existing routines, for instance as working professionals or parents. Caregivers described their responsibilities as time- and energy-consuming. This conflict caused caregivers to feel a loss of control [ 44 ]. Caregiving responsibilities for UGICs demanded commitment over a long time frame, affecting caregivers’ employability and their ‘own social life’ [ 35 ]:

“It’s changed my daily routine. It totally disrupted my life. I have to rearrange a lot of things such as my kids, my work, and getting help for my house cleaning” (J4).

4. Unpreparedness

Caregivers expressed being ill-equipped and unqualified to manage the needs of the UGIC patient. They reported feeling out of their depth, which they partly attributed to the lack of available support relating to patients’ medical requirements:

“I went, ‘You’re not supposed to call 911? What am I supposed to do? What if he just dies right here?’ I mean, it seems they should have somebody say, ‘OK, if he’s with you, then here’s the procedure’…[The nurse] gave me really no support about what to do” (R21).

Caregivers sometimes felt misled about the extent of their new responsibilities, as while the patient was cared for in hospital by medical staff, they could not gauge what caregiving at home would involve.

“I wish they would have talked to me about it as well… it was a bit of a shock. …but the next morning it all dawned on me that I had just replaced a whole team” (E10).

5. Information manager

Caregivers perceived that a key responsibility was to make executive decisions about the dissemination of information, for instance on symptomatology, treatment plans and prognosis. Caregivers felt they were the ‘conduit’ [ 18 ] through which medical details were communicated to members of the extended social circle, a time-consuming role in which they spent “hours on the phone telling everyone what is happening” (I32).

The caregivers also viewed their role as giving healthcare providers (HCPs) valuable insight into how the patient was coping outside of the medical setting:

“[describing a discussion during a clinical consultation, contradicting the patient] It is not correct that you almost eat as usual. You are eating food of more liquid substance than you usually do and your drinks are high-protein” (C1).

Synthesised finding 2: mediators of caregiver burden

While supporting patients with UGIC, caregivers are exposed to mediators which could increase or reduce caregiver burden, including their use of coping strategies, financial and social resources, and their caregiving context. For instance, higher levels of social support helped alleviate some caregiver burden. Similarly, how excluded a caregiver felt in the medical setting influenced the burden experienced.

1. Degree of inclusion in medical settings

Many studies reported that caregivers perceived they were often kept at a distance in medical settings, increasing caregiver burden. Although some caregivers felt this was fitting and chose to take a ‘subordinate position’ [ 44 ], others struggled with a sense of exclusion, which commonly left questions unresolved:

“…my husband could ask questions, but I didn’t have the space to ask questions, not unless my husband allowed it” (K39).

In such cases, caregivers relied on HCPs’ judgement. Caregivers described only being ‘seen’ if they actively called attention to themselves [ 41 ]. Caregivers experienced being left out of important decisions.

Caregivers expressed wanting to ask questions without the patient present but felt they had no opportunity to directly communicate with HCPs. This pervasive, default invisibility was perceived as disempowering:

“No health professionals involved me in this decision” (K38).

2. Social resources

The degree and quality of support received by caregivers varied and shaped their overall caring experience. Support networks were especially beneficial for normalising caregivers’ experiences, providing hope and reducing feelings of isolation.

“it was only when I came here that I started talking to people … it was just like a breath of fresh air… this dumping syndrome, he [the patient] wasn’t the only one” (A10).

Support could come from spiritual groups (“I have a lot of people that stand behind me…” (B19)), empathetic HCPs (“It’s easier to talk with a nurse when it concerns important questions. You may receive quite good and reassuring answers” (H22)), or peers who had undergone a similar caregiving experience and therefore could reliably address and empathise with caregivers’ challenges.

3. Financial resources

Caregivers reported financial pressure, as they had to consider the dyad’s financial situation while one or both members might be unable to work. Providing full-time care was a drain on caregivers’ resources, time and money. Caregivers struggled with financial planning for the future in the face of prognostic ambiguity.

“We talked about if we should stay on at the house or sell it” (K6).

There were additional pressures on dyads living in countries where utilisation of private health services is the norm.

“Now my grandmother is sick and I can understand how high is the cost of the disease” (D5).

4. Patient-caregiver relationship

The caregiving experience was shaped by the relationship within the dyad. Some caregivers reported having had an emotionally distant relationship with the patient before the diagnosis, which led to poor attachment during the cancer trajectory. Others reported a decline in relationship quality due to cancer-related pressures.

“When I got upset, I would say to my husband, ‘You got cancer because you didn’t listen to me! You deserve it!’” (F35).

Others noted a shift within the relationship, transitioning from ‘caregiver’ to ‘curer’ or from a spousal role to a parental one [ 45 ], especially where the caregiver was actively involved in delivering treatment:

“Sometimes I felt like a mother talking to a child: ‘Remember to do this and that’ ” (K29).

Caregivers experienced reciprocal suffering when seeing the patient suffer, especially if an established close relationship existed:

“up when the patient is up and down when the patient is down” (I21).

5. Emotion-focused coping

The cancer experience was perceived to result in significant distress for caregivers. To address this challenge, caregivers engaged in positive emotion-focused coping strategies to directly regulate distress. Many caregivers reported trying to maintain positive thinking. One participant recalled using humour:

“Sometimes you can’t believe what happens and the only thing you can do is laugh” (I41).

Maintaining a positive outlook was perceived to involve “looking for the good in every situation” and being selective about what news caregivers received, through ‘denial’ and “choosing what to hear” (I44). Conversely, another study described positivity as an open-minded reflection on the conflict between current suffering and spiritual beliefs [ 38 ]. Caregivers described how formally addressing their feelings through therapy was also helpful.

Individuals were limited in their opportunity for emotional expression. Caregivers described hiding their own negative thoughts from the patient and took practical measures to divert the patient’s attention by doing “normal things like [going] for a drive and [having] visits from our children and grandchildren” (C15).

6. Information seeking

Caregivers perceived challenges around a lack of information from HCPs regarding the pathology of UGIC and related management options, and described difficulties in accessing information.

“We have little information in these areas. When we go to the physician’s office for treatment, the doctor is too busy to give us information in this regard and he merely visits the patients. When we see that nobody could survive from such diseases, we get worried more” (D9).

Caregivers addressed the information challenge by persistently seeking information relating to the disease itself, namely cancer-related symptomatology, prognosis, and treatment options (including alternative therapies). Caregivers drew on sources such as medically knowledgeable peers, the internet and print media (e.g., encyclopaedias). HCPs were trusted for honest information, with their word choices and body language carefully analysed:

“When my husband and I visit the doctor together, you see when he opens the door that there is no good news today” (H6).

Caregivers were especially empowered when they could differentiate between symptoms due to disease progression and treatment-related adverse effects.

Synthesised finding 3: consequences of caregiver burden

There were consequences of caregiver burden such as feelings of helplessness, distress, anger, guilt and a strong fear of losing the patient. Conversely, there was potential for positive outcomes as caregivers experienced growth and feelings of hope.

1. Distress and helplessness

When recounting the most involved phase of providing care, active treatment, many caregivers reported experiencing heightened distress. One caregiver perceived gastric cancer as a ‘death sentence’ [ 49 ], and seeing the patient struggle with the effects of disease and treatment as an unbearably ‘challenging experience’ [ 40 ]. This distress also affected children, with one spouse noting that their child’s “grades dropped disastrously during his first term” (H14).

Helplessness originated from a lack of control over disease progression. A particular source of distress was delay along the cancer trajectory, especially at diagnosis due to the ambiguous presentation of UGICs, together with a lack of control over symptom management.

“It is distressing seeing him in pain all the time” (E6).

2. Anger and guilt

Caregivers experienced a sense of guilt and anger because they perceived stigma from society towards certain cancers: others might assume that the diagnosis was caused by the patient’s behaviours, and that the caregiver was therefore indirectly implicated. A few studies described this societal judgement towards the patient, with caregivers fearing that others would see the diagnosis as a justified fate:

“You know, when you say cirrhosis of the liver, they think, ‘Oh, you drank yourself’” (R7).

Caregivers also harboured anger at being forced to take on caregiving responsibilities, describing they had “been dealt a bad hand” (I39); however, they felt guilty for feeling this way.

3. Fear of cancer progression and recurrence

Due to the unpredictability of UGICs, caregivers described living in constant dread of the patient’s health declining, and the potential for disease progression or recurrence:

“I am not sure I am going to like the answers I get. Maybe it is better not to know so very much but to do like the ostrich, to bury your head in the sand and hope for the best and keep your fingers crossed” (H41).

Caregivers were fearful of any new physical or psychological symptoms in patients, especially weight loss, which they saw as a marker of recurrence. Further, caregivers feared that the cancer would progress to a terminal stage, and were afraid of how the bereavement would occur and of their own subsequent reaction.

“the fear of not being sure of how it’s going to happen and how I’m going to react…I’m afraid of losing him” (L1).

The high mortality associated with most UGICs led several caregivers to a sense of acceptance, as they realised the long-term impact of their loved one’s cancer and the possibility of bereavement.

“The possibility is there for one of us dying quickly” (K5).

4. Isolation and loneliness

Caregivers commonly reported experiencing isolation within their unique role, feeling unable to share their anxieties. As patients were burdened already, caregivers did not want to unload their own worries on to the patient.

“And I had nobody to talk to…There was just nobody. I couldn’t let myself down, my guard down and I found the isolation terrible” (A3).

Loneliness was not only an ongoing concern but also a future threat, as spousal caregivers relayed their fear of life post-bereavement.

5. Personal growth

Caregivers reflected that they saw the experience of caregiving as a catalyst for personal change, resulting in positive outcomes such as personal growth and appreciation for life, both individually and within the relationship. Caregivers recounted that this unexpected, immense challenge had given them ‘new perspectives about life’ [ 35 ]. Couples got to spend time together that they would not otherwise have had, which led to improved relationship quality.

“We’ll talk three or four times a month. Where 10 years ago it might be 6 months or 10 months you know between phone calls” (B14).

This review presents the first comprehensive synthesis of qualitative research on the experiences of informal caregivers of individuals with UGIC, systematically identifying and synthesising the current evidence base. Given the emergence of this prominent caregiver population, the review contributes to advancing the cancer caregiver literature as a whole, an important area of study recognised by individuals with cancer, their families and healthcare professionals [ 52 ]. The review included 19 studies, presented synthesised findings, and identified aspects of caregiving experience that UGIC caregivers have in common with other cancer caregivers, as well as aspects more distinct to UGICs. UGIC caregivers experience significant challenges contributing to high levels of burden, which are mediated by social, psychological, and practical resources, as well as aspects of health service delivery. The consequences of caregiver burden are primarily negative, including distress, anger, fear, and loneliness.

Caregivers of UGIC patients experienced burden due to the breadth and complexity of their role, for which they felt unprepared. Caring involved incorporating novel skills into existing responsibilities, causing significant life disruption. Caregivers perceived burden in providing multifaceted care with demands that shift along the illness trajectory. For example, at the beginning caregivers felt it necessary to partake in the provision of care, and as UGIC treatment and disease progressed, many responsibilities evolved towards monitoring and maximising physical health, such as diligent weight monitoring and meal preparation [ 45 , 46 , 47 ]. These findings align with the general cancer caregiver literature [ 53 ], with caregivers recognised as having steep initial learning curves to rapidly acquire the skills needed to provide care. Only one of the 19 studies evaluated data over an extended period [ 45 ]. An extended review is needed to map the supportive care resources available across the disease path, and longitudinal studies tracking UGIC caregiver support needs across the illness trajectory are warranted.

One of the most reported findings in this review was informal caregivers’ continuous search for information related to their role. Many struggle to satisfy their informational needs at different stages of the disease trajectory, contributing to caregiver burden. This corresponds with the systematic review findings of Wang et al. [ 54 ] that informational needs were the most common unmet need of informal caregivers. To begin addressing this need, caregivers could be signposted to existing sources of general caregiver support information and interventions, such as Cancer Caring Coping [ 55 , 56 ]. These supports could be used to develop informational resources tailored for UGIC caregivers. A core information set has been developed to aid HCPs in consultations with UGIC patients, to ensure key information is delivered [ 57 , 58 ]; the focus of improving patient-carer education should now be on raising awareness of this key information toolkit among HCPs who commonly interact with this population. A similar approach could be taken by identifying the informational needs of UGIC caregivers at consultations and developing standardised information points for HCPs to deliver to caregivers within those consultations. There is also potential to expand the pool of reliable sources of information beyond HCPs, for example to peer networks or psychologists, who could provide longitudinal support without necessarily adding to the cost of developing additional personnel and resources.

This review found caregivers experienced exclusion in the medical setting, suggesting that enhanced communication between HCPs and caregivers could improve caregivers’ experience. Indeed, a qualitative study by Reblin et al. [ 59 ] identified communication within health services as a key driver for improving cancer caregiver support. One potential avenue for bridging the gap between HCPs and the patient-caregiver dyad is to better incorporate the clinical nurse specialist (CNS) [ 60 ], as these professionals can be a key contact for bi-directional communication between HCPs and caregivers. That is, caregivers support and help the clinical team to understand the patient’s progress, and through this process HCPs acknowledge and include caregivers in the patient’s care. However, the current under-resourcing of cancer nursing would need to be addressed, as it presently limits the amount of CNS time available to support caregivers [ 61 ].

One review finding specific to UGIC caregiver burden was the challenge around preparing meals. Taleghani and colleagues [ 62 ] mirror this, highlighting that gastric cancer caregivers experienced inadequate education in appropriately managing patients’ dietary requirements, leaving them feeling inefficient, uncomfortable, and fearful. Dietician-led interventions are typically patient-focused [ 63 , 64 , 65 ]. However, this review highlights an opportunity for HCPs to include caregivers in dietician-led interventions, as many caregivers assume responsibility for meal preparation and grocery shopping. The challenge around meals also has social consequences, as meals are important social settings. Changes in eating behaviours can lead to both dyad members feeling isolated and lonely [ 18 , 66 ]. Loneliness is prevalent among people living with cancer and is influenced by cancer-specific and non-cancer-specific risk factors, such as lack of social support [ 67 ]. Loneliness among UGIC caregivers is less well understood than among general cancer caregivers [ 68 ]. This is of concern, as the negative physical and mental health impacts of loneliness are well established [ 69 , 70 ]. Peer support is the most used intervention to reduce caregivers’ loneliness, with strategies of psychoeducation and emotional support featuring prominently [ 71 ]. Research is needed to identify risk and protective factors for loneliness among UGIC caregivers.

In addition to loneliness, distress and negative affect were identified as consequences of UGIC caregiver burden. There is evidence of heightened distress and reduced physical and mental health among UGIC caregivers relative to UGIC patients [ 72 , 73 ]. This review also found that caregivers engage in emotion-focused strategies to cope with their caregiving role. A review by Teixeira et al. [ 74 ] found that among cancer caregivers, emotion-focused coping was related to higher distress, whereas problem-focused coping was related to better adjustment and reduced burden. There is a need to develop targeted theory-based psychosocial interventions for this caregiver group. The Transactional Theory of Stress and Coping (TTSC) framework could be utilised to understand how mediating processes specific to coping strategies influence distress and negative affect among UGIC caregivers [ 75 , 76 , 77 ], similar to how Bowan et al. [ 78 ] used a Baltes and Baltes [ 79 ] coping framework to develop interventions for cancer patients’ families. Candidate interventions could involve problem-solving and coping skills training [ 80 , 81 ], which could in turn ameliorate the negative consequences of caregiver burden. If effective with UGIC caregivers, such interventions could be extended to all caregivers as part of a standard care pathway. This review recommends further research to develop an understanding of adjustment in UGIC caregivers.

In contrast to the many negative consequences described by informal caregivers, a small group of findings indicated some positive outcomes. These findings align with a review of the positive aspects of caregiving, which reported improved relationship quality, reward, fulfilment, and personal growth [ 82 ]. That review concluded that positive aspects of caregiving are interconnected and suggested that, in addition to interventions reducing negative burden, interventions could be developed to enhance positive outcomes such as personal growth. Tedeschi and Calhoun’s Transformational Model (TM) [ 83 ] proposes that potentially traumatic stressors, such as caring for an individual diagnosed with cancer, cause a disruption in one’s worldview, triggering attempts to make meaning in response to the stressor. Cognitive disruptions also lead to distress, which in turn can act as a catalyst for post-traumatic growth (PTG). Studies have found that caregivers of people with advanced cancer and early-stage breast cancer experience PTG in relation to their caregiving role [ 84 , 85 ], and that PTG was positively associated with greater social support and perceived hope [ 86 ]. Additional research is needed to understand how the challenging UGIC caregiver role may facilitate growth and help the caregiver adjust to their role.

Strengths and limitations

The current systematic review has several strengths. Firstly, it followed an internationally recognised methodology (JBI) for the conduct of qualitative systematic reviews. This helped to ensure methodological rigour and, subsequently, confidence in the findings should they be used to inform policy and practice. There are, however, several limitations. Although the studies in the review are generally of good quality, only 19 studies were identified. Indeed, the UK Less Survivable Cancers Taskforce [ 87 ] advocates for more research focused on cancers with low life expectancy, two-thirds of which are UGICs. This lack of research into UGICs extends to the evidence on caregivers. Synthesised findings are therefore based on a small number of studies, largely conducted in the US and Denmark. Within the studies, caregivers of individuals with oesophageal and pancreatic cancer were well represented. However, there was a dearth of studies focused on caregivers’ experiences of gallbladder or stomach cancer, and while multiple studies explored caregivers’ experiences related to dysphagia and malabsorption, fewer explored jaundice. Therefore, more primary qualitative research is necessary to understand the experiences of all UGIC caregiver populations.

Clinical implications

Of relevance for clinical practice was the finding that caregivers often felt excluded in medical settings, increasing caregiver burden. Caregivers should be seen as co-clients along with patients in the medical setting. This is very much in line with the priorities of care within palliative healthcare settings. Since the palliative care approach seeks to address the physical, psychological, cultural, social, and spiritual needs [ 88 ] of both individuals with life-limiting and chronic illnesses such as cancer and their support networks, early referral to palliative care services could be particularly beneficial for caregivers: their needs would be formally and expertly acknowledged, helping to alleviate the burden identified for informal caregivers in this study.

HCPs have an opportunity to give caregivers reliable, specific, and up-to-date information, pitched at the right level to reassure but not overwhelm. Morris and Thomas [ 89 ] mirror this suggestion and highlight its importance, as there is potential for tension in information exchange due to HCPs’ lack of formal acknowledgement of caregivers. Clinical guidance and policy could be updated to recognise caregivers as co-clients who, with caregiver training, could formally be part of the patient support team. This could help meet caregivers’ needs, especially post-diagnosis. At an institutional level, caregivers may be better recognised within their role if acknowledged formally, for example in NICE [ 1 ] guidelines for UGICs. Recognising the considerable role caregivers undertake in supporting the care of UGIC patients outside of the healthcare system, policymakers and HCPs need to improve support for caregivers, which will in turn reduce the burden on health services.

The aim of this qualitative systematic review was to synthesise evidence about the experiences of UGIC caregivers. It found that caregivers face significant challenges leading to caregiver burden, which negatively impacts adjustment. Due to the nature of UGICs, caregivers experienced unique challenges, such as how best to manage disruptions to mealtimes and how to monitor surrogate markers of patient health, such as weight. UGICs are medically complex and evolving conditions, and caregivers struggle to gain information. This review found that caregiver burden was worsened by feeling excluded in medical settings, which could be improved with better communication between HCPs, patients, and their caregivers. There is a lack of data relating to the experiences of certain UGIC caregivers (e.g., gallbladder, stomach) in comparison to others (e.g., oesophageal), as well as a lack of understanding of how to manage the impact of caregiving for these types of cancer, providing directions for future research.

Data availability

No datasets were generated or analysed during the current study.

NICE. Gastrointestinal tract (upper) cancers - recognition and referral. Health topics A to Z, Clinical Knowledge Summaries (CKS). https://cks.nice.org.uk/topics/gastrointestinal-tract-upper-cancers-recognition-referral/2022 . Accessed 25 Aug 2022.

Ferlay J, Ervik M, Lam F, Colombet M, Mery L, Piñeros M, Znaor A, Soerjomataram I, Bray F. Global cancer observatory: cancer today. Lyon, France: International Agency for Research on Cancer (2020). https://gco.iarc.fr/today . Accessed 1 Sep 2021.

Xie Y, Shi L, He X, Luo Y. Gastrointestinal cancers in China, the USA, and Europe. Gastroenterol Rep. 2021;9(2):91–104.


Gastrointestinal Cancers Global Burden Expected to Rise. National Cancer Institute. 2020. https://dceg.cancer.gov/news-events/news/2020/global-burden-gastro . Accessed 25 Aug 2022.

Veitch AM, Uedo N, Yao K, East JE. Optimizing early upper gastrointestinal cancer detection at endoscopy. Nat Reviews Gastroenterol Hepatol. 2015;12(11):660–7.

Vedeld HM, Goel A, Lind GE. Epigenetic biomarkers in gastrointestinal cancers: the current state and clinical perspectives. Semin Cancer Biol. 2018;51:36–49.

Hong L, Huang YX, Zhuang QY, Zhang XQ, Tang LR, Du KX, Lin XY, Zheng BH, Cai SL, Wu JX, Li JL. Survival benefit of re-irradiation in esophageal cancer patients with locoregional recurrence: a propensity score-matched analysis. Radiat Oncol. 2018;13(1):1–9.


Lee S, Kang TW, Song KD, Lee MW, Rhim H, Lim HK, Kim SY, Sinn DH, Kim JM, Kim K, Ha SY. Effect of microvascular invasion risk on early recurrence of hepatocellular carcinoma after surgery and radiofrequency ablation. Ann Surg. 2021;273(3):564–71.


Sakamoto H, Attiyeh MA, Gerold JM, Makohon-Moore AP, Hayashi A, Hong J, Kappagantula R, Zhang L, Melchor JP, Reiter JG, Heyde A. The evolutionary origins of recurrent pancreatic cancer. Cancer Discov. 2020;10(6):792–805.


Peery AF, Crockett SD, Murphy CC, Lund JL, Dellon ES, Williams JL… Sandler, R. S. Burden and cost of gastrointestinal, liver, and pancreatic diseases in the United States: update 2018. Gastroenterology. 2019;156(1):254–72. Study 2.

Peery AF, Crockett SD, Murphy CC, Lund JL, Dellon ES, Williams JL, Jensen ET, Shaheen NJ, Barritt AS, Lieber SR, Kochar B. Burden and cost of gastrointestinal, liver, and pancreatic diseases in the United States: update 2018. Gastroenterology. 2019;156(1):254–72.

Valentino T, Paiva B, Oliveira M, Hui D, Paiva C. Factors associated with palliative care referral among patients with advanced cancers: a retrospective analysis of a large Brazilian cohort. Support Care Cancer. 2018;26:1933–41.

Hudson P, Remedios C, Thomas K. A systematic review of psychosocial interventions for family carers of palliative care patients. BMC Palliat Care. 2010;9:17–17.


Khan NN, Maharaj A, Evans S, Pilgrim C, Zalcberg J, Brown W, Cashin P, Croagh D, Michael N, Shapiro J, White K. A qualitative investigation of the supportive care experiences of people living with pancreatic and oesophagogastric cancer. BMC Health Serv Res. 2022;22(1):1–1.

Cencioni C, Trestini I, Piro G, Bria E, Tortora G, Carbone C, Spallotta F. Gastrointestinal cancer patient nutritional management: from specific needs to novel epigenetic dietary approaches. Nutrients. 2022;14(8):1542.

Given BA, Given CW, Sherwood PR. Family and caregiver needs over the course of the cancer trajectory. J Support Oncol. 2012;10:57–64. https://doi.org/10.1016/j.suponc.2011.10.003

Kim Y, Baek W. Caring experiences of family caregivers of patients with pancreatic cancer: an integrative literature review. Support Care Cancer. 2022;7:1–0.

McCorry NK, Dempster M, Clarke C, Doyle R. Adjusting to life after esophagectomy: the experience of survivors and carers. Qual Health Res. 2009;19:1485–94. https://doi.org/10.1177/1049732309348366

Bozzetti F, Braga M, Gianotti L, Gavazzi C, Mariani L. Postoperative enteral versus parenteral nutrition in malnourished patients with gastrointestinal cancer: a randomised multicentre trial. Lancet. 2001;358(9292):1487–92.


Laursen L, Schønau MN, Bergenholtz HM, Siemsen M, Christensen M, Missel M. Table in the corner: a qualitative study of life situation and perspectives of the everyday lives of oesophageal cancer patients in palliative care. BMC Palliat care. 2019;18(1):1–0.

Cooper C, Burden ST, Molassiotis A. An explorative study of the views and experiences of food and weight loss in patients with operable pancreatic cancer perioperatively and following surgical intervention. Support Care Cancer. 2015;23(4):1025–33.

Dempster M, McCorry NK, Brennan E, Donnelly M, Murray LJ, Johnston BT. Psychological distress among family carers of oesophageal cancer survivors: the role of illness cognitions and coping. Psycho-oncology. 2011;20(7):698–705.

Graham L, Dempster M, McCorry NK, Donnelly M, Johnston BT. Change in psychological distress in longer-term oesophageal cancer carers: are clusters of illness perception change a useful determinant? Psycho‐Oncology. 2016;25(6):663–9.

Penner JL, McClement SE. Using phenomenology to examine the experiences of family caregivers of patients with advanced head and neck cancer: reflections of a novice researcher. Int J Qualitative Methods. 2008;7:92–101. https://doi.org/10.1177/160940690800700206

Munn Z, Moola S, Riitano D, Lisy K. The development of a critical appraisal tool for use in systematic reviews addressing questions of prevalence. Int J Health Policy Manage. 2014;3(3):123.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. https://doi.org/10.1136/bmj.n71

Clarivate analytics. endnote (version 7.7.1). Philadelphia, USA. https://endnote.com Accessed 25 Aug 2022.

Mourad Ouzzani H, Hammady Z, Fedorowicz, Ahmed Elmagarmid. Rayyan — a web and mobile app for systematic reviews. Syst Reviews. 2016;5:210. https://doi.org/10.1186/s13643-016-0384-4

Joanna Briggs Institute. Joanna Briggs Institute critical appraisal checklist for studies reporting prevalence data. Adelaide: Joanna Briggs Institute; 2011.

Google Scholar  

Hodgson T. Oesophageal cancer: experiences of patients and their partners. Br J Nurs. 2006;15(21):1157–60.

LaDonna K, Taylor T, Lingard L. Why open-ended survey questions are unlikely to support rigorous qualitative insights. Acad Medicine: J Association Am Med Colleges. 2017;93(3):347–9.

Joanna Briggs Institute. Joanna Briggs Institute reviewers’ manual: 2014 edition. Australia: The Joanna Briggs Institute. 2014:88–91.

Lockwood C, Munn Z, Porritt K. Qualitative research synthesis: methodological guidance for systematic reviewers utilizing meta-aggregation. JBI Evid Implement. 2015;13(3):179–87.

Munn Z, Porritt K, Lockwood C, Aromataris E, Pearson A. Establishing confidence in the output of qualitative research synthesis: the ConQual approach. BMC Med Res Methodol. 2014;14(1):1–7.

Sherman DW, McGuire DB, Free D, Cheon JY. A pilot study of the experience of family caregivers of patients with advanced pancreatic cancer using a mixed methods approach. J Pain Symptom Manag. 2014;48(3):385–99.

Winterling J, Wasteson E, Glimelius B, Sjödén PO, Nordin K. Substantial changes in life: perceptions in patients with newly diagnosed advanced cancer and their spouses. Cancer Nurs. 2004;27:381–8.

Hansen L, Rosenkranz SJ, Wherity K, Sasaki A. Living with hepatocellular carcinoma near the end of life: family caregivers’ perspectives. In Oncology nursing forum 2017 Sep 1 (Vol. 44, No. 5).

Nolan MT, Hodgin MB, Olsen SJ, Coleman J. Spiritual issues of family members in a pancreatic cancer chat room. In Oncology Nursing Forum 2006 Mar 1 (Vol. 33, No. 2, p. 239). Oncology Nursing Society.

Padron A. Dyadic relationships among sense of mastery, cancer-related distress, and cortisol in patients and caregivers impacted by abdominal and pelvic malignancies. In University of Florida. 2018.

Petrin K, Bowen DJ, Alfano CM, Bennett R. Adjusting to pancreatic cancer: perspectives from first-degree relatives. Palliat Support Care. 2009;7(3):281–8.

Wong SS, George TJ, Godfrey M, Le J, Pereira DB. Using photography to explore psychological distress in patients with pancreatic cancer and their caregivers: a qualitative study. Support Care Cancer. 2019;27(1):321–8.

Larsen LS, Blix BH, Hamran T. Family caregivers’ involvement in decision-making processes regarding admission of persons with dementia to nursing homes. Dementia. 2020;19(6):2038–55.

Gerhardt S, Dengsø KE, Herling S, Thomsen T. From bystander to enlisted carer–a qualitative study of the experiences of caregivers of patients attending follow-up after curative treatment for cancers in the pancreas, duodenum and bile duct. Eur J Oncol Nurs. 2020;44:101717.

Larsen MK, Birkelund R, Mortensen MB, Schultz H. Being a relative on the sideline to the patient with oesophageal cancer: a qualitative study across the treatment course. Scand J Caring Sci. 2021;35(1):277–86.

Shaw J, Harrison J, Young J, Butow P, Sandroussi C, Martin D, Solomon M. Coping with newly diagnosed upper gastrointestinal cancer: a longitudinal qualitative study of family caregivers’ role perception and supportive care needs. Support Care Cancer. 2013;21(3):749–56.

Gooden HM, White KJ. Pancreatic cancer and supportive care—pancreatic exocrine insufficiency negatively impacts on quality of life. Support Care Cancer. 2013;21(7):1835–41.

Andreassen S, Randers I, Näslund E, Stockeld D, Mattiasson AC. Family members’ experiences, information needs and information seeking in relation to living with a patient with oesophageal cancer. Eur J Cancer Care. 2005;14(5):426–34.

Morowatisharifabad MA, Gerayllo S, Jouybari L, Amirbeigy MK, Fallahzadeh H. Concerns and fear of esophageal cancer in relatives of patients with cancer: a qualitative study. J Gastrointest cancer. 2020;51(3):957–64.

Yi M, Kahn D. Experience of gastric cancer survivors and their spouses in Korea: secondary analysis. J Korean Acad Nurs. 2004;34(4):625–35.

Morowatisharifabad MA, Gerayllo S, Jouybari L, Amirbeigy MK, Fallahzadeh H. Perceived threats toward esophageal cancer among immediate relatives of sufferers: a qualitative study. J Gastrointest Cancer. 2021;52:643–50. https://doi.org/10.1007/s12029-020-00422-y

Shih WM, Hsiao PJ, Chen ML, Lin MH. Experiences of family of patient with newly diagnosed advanced terminal stage hepatocellular cancer. Asian Pac J Cancer Prev. 2013;14(8):4655–60.

James Lind, Alliance. 2021. https://www.jla.nihr.ac.uk . Accessed 1 Sep 2021.

Girgis A, Lambert S. Caregivers of cancer survivors: the state of the field. In Cancer Forum 2009 Nov (Vol. 33, No. 3, pp. 168–171).

Wang T, Molassiotis A, Chung BP, Tan JY. Unmet care needs of advanced cancer patients and their informal caregivers: a systematic review. BMC Palliat care. 2018;17(1):1–29.

Cancer Caring Coping. 2021. https://www.qub.ac.uk/sites/CancerCaringCoping . Accessed 1 Sep 2021.

Santin O, McShane T, Hudson P, Prue G. Using a six-step co‐design model to develop and test a peer‐led web‐based resource (PLWR) to support informal carers of cancer patients. Psycho‐oncology. 2019;28(3):518–24.

Blazeby JM, Macefield R, Blencowe NS, Jacobs M, McNair AG, Sprangers M, Brookes ST, Avery KN, Blazeby JM, Blencowe NS, Brookes ST. Core information set for oesophageal cancer surgery. J Br Surg. 2015;102(8):936–43.

Jacobs M, Henselmans I, Macefield RC, Blencowe NS, Smets EM, de Haes JC, Sprangers MA, Blazeby JM, van Berge Henegouwen MI. Delphi survey to identify topics to be addressed at the initial follow-up consultation after oesophageal cancer surgery. J Br Surg. 2014;101(13):1692–701.

Reblin M, Ketcher D, Vadaparampil ST. Care for the cancer caregiver: a qualitative study of facilitators and barriers to caregiver integration and support. J Cancer Educ. 2021;30:1–7.

Pape E, Decoene E, Debrauwere M, Van Nieuwenhove Y, Pattyn P, Feryn T, Pattyn PR, Verhaeghe S, Van Hecke A, Vandecandelaere P, Desnouck S. Experiences and needs of partners as informal caregivers of patients with major low anterior resection syndrome: a qualitative study. Eur J Oncol Nurs. 2022;58:102143.

Dacre J, Francis R, Appleby J, Charlesworth A, Peckham S, Ahmed S, Brown J, Dickson J, Morris N, Patel M. Expert panel: evaluation of the Government’s commitments in the area of cancer services in England.

Taleghani F, Ehsani M, Farzi S, Farzi S, Adibi P, Moladoost A, Shahriari M, Tabakhan M. Nutritional challenges of gastric cancer patients from the perspectives of patients, family caregivers, and health professionals: a qualitative study. Support Care Cancer. 2021;29(7):3943–50.

Reece L, Hogan S, Allman-Farinelli M, Carey S. Oral nutrition interventions in patients undergoing gastrointestinal surgery for cancer: a systematic literature review. Support Care Cancer. 2020;28(12):5673–91.

Hanna L, Huggins CE, Furness K, Silvers MA, Savva J, Frawley H, Croagh D, Cashin P, Low L, Bauer J, Truby H. Effect of early and intensive nutrition care, delivered via telephone or mobile application, on quality of life in people with upper gastrointestinal cancer: study protocol of a randomised controlled trial. BMC Cancer. 2018;18(1):1–3.

Missel M, Hansen M, Jackson R, Siemsen M, Schønau MN. Re-embodying eating after surgery for oesophageal cancer: patients’ lived experiences of participating in an education and counselling nutritional intervention. J Clin Nurs. 2018;27(7–8):1420–30.

Dempster M, McCorry NK, Brennan E, Donnelly M, Murray LJ, Johnston BT. Illness perceptions among carer–survivor dyads are related to psychological distress among oesophageal cancer survivors. J Psychosom Res. 2011;70(5):432–9.

Deckx L, van den Akker M, Buntinx F. Risk factors for loneliness in patients with cancer: a systematic literature review and meta-analysis. Eur J Oncol Nurs. 2014;18(5):466–77.

Gray TF, Azizoddin DR, Nersesian PV. Loneliness among cancer caregivers: a narrative review. Palliat Support Care. 2020;18(3):359–67.

Beutel ME, Klein EM, Brähler E, Reiner I, Jünger C, Michal M, et al. Loneliness in the general population: prevalence, determinants and relations to mental health. BMC Psychiatry. 2017;17(1):97. pmid:28320380.

Holt-Lunstad J, Smith TB, Baker M, Harris T, Stephenson D. Loneliness and social isolation as risk factors for mortality: a meta-analytic review. Perspect Psychol Sci. 2015;10(2):227–37. pmid:25910392.

Velloze IG, Jester DJ, Jeste DV, Mausbach BT. Interventions to reduce loneliness in caregivers: an integrative review of the literature. Psychiatry Res. 2022;12:114508.

Haj Mohammad N, Walter AW, van Oijen MGH, et al. Burden of spousal caregivers of stage II and III esophageal cancer survivors 3 years after treatment with curative intent. Support Care Cancer. 2015;23:3589–98. https://doi.org/10.1007/s00520-015-2727-4

Langenberg SM, Reyners AK, Wymenga AN, Sieling GC, Veldhoven CM, van Herpen CM, Prins JB, van der Graaf WT. Caregivers of patients receiving long-term treatment with a tyrosine kinase inhibitor (TKI) for gastrointestinal stromal tumour (GIST): a cross-sectional assessment of their distress and burden. Acta Oncol. 2019;58(2):191–9.

Teixeira RJ, Applebaum AJ, Bhatia S, Brandão T. The impact of coping strategies of cancer caregivers on psychophysiological outcomes: an integrative review. Psychol Res Behav Manage. 2018;11:207.

Folkman S, Lazarus RS. Stress, appraisal, and coping. New York: Springer Publishing Company; 1984. p. 460.

Han Y, Hu D, Liu Y, Lu C, Luo Z, Zhao J, Lopez V, Mao J. Coping styles and social support among depressed Chinese family caregivers of patients with esophageal cancer. Eur J Oncol Nurs. 2014;18(6):571–7.

Rankin SR. Influence of coping styles on social support seeking among cancer patient family caregivers (Doctoral dissertation, Walden University).

Bowman KF, Rose JH, Radziewicz RM, O’Toole EE, Berila RA. Family caregiver engagement in a coping and communication support intervention tailored to advanced cancer patients and families. Cancer Nurs. 2009;32(1):73–81.

Baltes PB, Baltes MM, editors. Successful aging: perspectives from the behavioral sciences. Cambridge University Press; 1993 May. p. 28.

Toseland RW, Blanchard CG, McCallion P. A problem solving intervention for caregivers of cancer patients. Soc Sci Med. 1995;40(4):517–28.

McMillan SC, Small BJ, Weitzner M, Schonwetter R, Tittle M, Moody L, Haley WE. Impact of coping skills intervention with family caregivers of hospice patients with cancer: a randomized clinical trial. Cancer. 2006;106(1):214–22.

Li Q, Loke AY. The positive aspects of caregiving for cancer patients: a critical review of the literature and directions for future research. Psycho-oncology. 2013;22(11):2399–407.

Tedeschi RG, Calhoun LG. Trauma and transformation. Sage; 1995 Jun. p. 20.

Liu Y, Li Y, Chen L, Li Y, Qi W, Yu L. Relationships between family resilience and posttraumatic growth in breast cancer survivors and caregiver burden. Psycho-oncology. 2018;27(4):1284–90.

Moore AM, Gamblin TC, Geller DA, Youssef MN, Hoffman KE, Gemmell L, Likumahuwa SM, Bovbjerg DH, Marsland A, Steel JL. A prospective study of posttraumatic growth as assessed by self-report and family caregiver in the context of advanced cancer. Psycho‐oncology. 2011;20(5):479–87.

Nouzari R, Najafi SS, Momennasab M. Post-traumatic growth among family caregivers of cancer patients and its association with social support and hope. Int J Community Based Nurs Midwifery. 2019;7(4):319.

PubMed   Google Scholar  

Less Survivable Cancers Taskforce.org.uk. London: Principle Consulting. https://lesssurvivablecancers.org.uk . Accessed 30 December 2021.

Dy S, Isenberg S, Hamayel N. Palliative care for cancer survivors. Med Clin N Am. 2017;101(6):1181–96.

Morris SM, Thomas C. The need to know: informal carers and information 1. Eur J Cancer Care. 2002;11(3):183–7.

Download references

Acknowledgements

Not applicable.

Author information

Authors and Affiliations

School of Psychology, Queen’s University Belfast, University Road, Belfast, BT7 1NN, UK

Melinda Furtado, Dawn Davis, Jenny M. Groarke & Lisa Graham-Wisener

School of Psychology, University of Galway, University Road, Galway, Ireland

Jenny M. Groarke


Contributions

M.F. and D.D. wrote the main manuscript text with supervision/assistance from J.M.G. and L.G-W. All authors reviewed the manuscript.

Corresponding author

Correspondence to Melinda Furtado.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1: Additional file 1

Search Strategy

Supplementary Material 2: Additional file 2

Findings illustrations table

Supplementary Material 3: Additional file 3

ConQual table

Supplementary Material 4: Additional file 4

Characteristics of studies

Supplementary Material 5: Additional file 5

Methodological assessment

Supplementary Material 6: Additional file 6

Meta-synthesis

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Furtado, M., Davis, D., Groarke, J.M. et al. Experiences of informal caregivers supporting individuals with upper gastrointestinal cancers: a systematic review. BMC Health Serv Res 24 , 932 (2024). https://doi.org/10.1186/s12913-024-11306-3


Received : 20 June 2024

Accepted : 10 July 2024

Published : 14 August 2024

DOI : https://doi.org/10.1186/s12913-024-11306-3


  • Caregiver burden
  • Esophageal neoplasms
  • Gastrointestinal neoplasms
  • Gallbladder
  • Quality of life
  • Delivery of health care


Is a systematic review primary or secondary research?


Perspectives in Clinical Research, Volume 11, Issue 2 (April–June 2020)

Study designs: Part 7 – Systematic reviews

Priya Ranganathan

Department of Anaesthesiology, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, Maharashtra, India

Rakesh Aggarwal

Director, Jawaharlal Institute of Postgraduate Medical Education and Research, Puducherry, India

In this series on research study designs, we have so far looked at different types of primary research designs which attempt to answer a specific question. In this segment, we discuss systematic review, which is a study design used to summarize the results of several primary research studies. Systematic reviews often also use meta-analysis, which is a statistical tool to mathematically collate the results of various research studies to obtain a pooled estimate of treatment effect; this will be discussed in the next article.

In the previous six articles in this series on study designs, we have looked at different types of primary research study designs which are used to answer research questions. In this article, we describe the systematic review, a type of secondary research design that is used to summarize the results of prior primary research studies. Systematic reviews are considered the highest level of evidence for a particular research question.[ 1 ]

SYSTEMATIC REVIEWS

As defined in the Cochrane Handbook for Systematic Reviews of Interventions , “Systematic reviews seek to collate evidence that fits pre-specified eligibility criteria in order to answer a specific research question. They aim to minimize bias by using explicit, systematic methods documented in advance with a protocol.”[ 2 ]

NARRATIVE VERSUS SYSTEMATIC REVIEWS

Reviews of the available data have been conducted since time immemorial. However, the traditional narrative review (“expert review”) does not involve a systematic search of the literature. Instead, the author of the review, usually an expert on the subject, uses informal methods to identify (what he or she considers) the key studies on the topic. The final review is thus a summary of these “selected” studies. Since studies are chosen at will (haphazardly!) and without clearly defined criteria, such reviews preferentially include those studies that favor the author's views, leading to a potential for subjectivity or selection bias.

In contrast, systematic reviews involve a formal prespecified protocol with explicit, transparent criteria for the inclusion and exclusion of studies, thereby ensuring completeness of coverage of the available evidence and providing a more objective, replicable, and comprehensive overview of it.

META-ANALYSIS

Many systematic reviews use an additional tool, known as meta-analysis, which is a statistical technique for combining the results of multiple studies in a systematic review in a mathematically appropriate way, to create a single (pooled) and more precise estimate of treatment effect. The feasibility of performing a meta-analysis in a systematic review depends on the number of studies included in the final review and the degree of heterogeneity in the inclusion criteria as well as the results between the included studies. Meta-analysis will be discussed in detail in the next article in this series.
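
To make the pooling arithmetic concrete, here is a minimal sketch in Python of fixed-effect, inverse-variance pooling of odds ratios on the log scale; the three trials and their numbers are invented for illustration and are not taken from any review discussed here.

```python
import math

# Hypothetical (made-up) per-study odds ratios and 95% confidence intervals.
studies = [
    {"name": "Trial A", "or": 0.80, "ci": (0.60, 1.07)},
    {"name": "Trial B", "or": 0.72, "ci": (0.55, 0.94)},
    {"name": "Trial C", "or": 0.95, "ci": (0.70, 1.29)},
]

weights, weighted_effects = [], []
for s in studies:
    log_or = math.log(s["or"])
    # Standard error recovered from the 95% CI width on the log scale.
    se = (math.log(s["ci"][1]) - math.log(s["ci"][0])) / (2 * 1.96)
    w = 1 / se**2                     # inverse-variance weight
    weights.append(w)
    weighted_effects.append(w * log_or)

pooled_log_or = sum(weighted_effects) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))
lower = math.exp(pooled_log_or - 1.96 * pooled_se)
upper = math.exp(pooled_log_or + 1.96 * pooled_se)

print(f"Pooled OR = {math.exp(pooled_log_or):.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the individual weights add up, the pooled confidence interval is narrower than that of any single study, which is the "more precise estimate" referred to above.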

THE PROCESS OF A SYSTEMATIC REVIEW

The conduct of a systematic review involves several sequential key steps.[ 3 , 4 ] As in other research study designs, a clearly stated research question and a well-written research protocol are essential before commencing a systematic review.

Step 1: Stating the review question

Systematic reviews can be carried out in any field of medical research, e.g. efficacy or safety of interventions, diagnostics, screening or health economics. In this article, we focus on systematic reviews of studies looking at the efficacy of interventions. As for the other study designs, for a systematic review too, the question is best framed using the Population, Intervention, Comparator, and Outcome (PICO) format.

For example, Safi et al . carried out a systematic review on the effect of beta-blockers on the outcomes of patients with myocardial infarction.[ 5 ] In this review, the Population was patients with suspected or confirmed myocardial infarction, the Intervention was beta-blocker therapy, the Comparator was either placebo or no intervention, and the Outcomes were all-cause mortality and major adverse cardiovascular events. The review question was “ In patients with suspected or confirmed myocardial infarction, does the use of beta-blockers affect mortality or major adverse cardiovascular outcomes? ”
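
One low-tech way to keep the question explicit is to record the PICO components as a structured object before drafting the protocol. The sketch below does this for the beta-blocker example; the class name and field values are illustrative paraphrases, not part of the published protocol.

```python
from dataclasses import dataclass

@dataclass
class PICOQuestion:
    population: str
    intervention: str
    comparator: str
    outcomes: list[str]

    def as_question(self) -> str:
        return (f"In {self.population}, does {self.intervention} compared with "
                f"{self.comparator} affect {' or '.join(self.outcomes)}?")

beta_blocker_review = PICOQuestion(
    population="patients with suspected or confirmed myocardial infarction",
    intervention="beta-blocker therapy",
    comparator="placebo or no intervention",
    outcomes=["all-cause mortality", "major adverse cardiovascular events"],
)
print(beta_blocker_review.as_question())
```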

Step 2: Listing the eligibility criteria for studies to be included

It is essential to explicitly define a priori the criteria for selection of studies which will be included in the review. Besides the PICO components, some additional criteria used frequently for this purpose include language of publication (English versus non-English), publication status (published as full paper versus unpublished), study design (randomized versus quasi-experimental), age group (adults versus children), and publication year (e.g. in the last 5 years, or since a particular date). The PICO criteria used may not be very specific, e.g. it is possible to include studies that use one or the other drug belonging to the same group. For instance, the systematic review by Safi et al . included all randomized clinical trials, irrespective of setting, blinding, publication status, publication year, or language, and reported outcomes, that had used any beta-blocker and in a broad range of doses.[ 5 ]

Step 3: Comprehensive search for studies that meet the eligibility criteria

A thorough literature search is essential to identify all articles related to the research question and to ensure that no relevant article is left out. The search may include one or more electronic databases and trial registries; in addition, it is common to hand-search the cross-references in the articles identified through such searches. One could also plan to reach out to experts in the field to identify unpublished data, and to search the grey literature (theses, conference abstracts, and non-peer-reviewed journals). These sources are particularly helpful when the intervention is relatively new, since data on these may not yet have been published as full papers and hence are unlikely to be found in literature databases. In the review by Safi et al., the search strategy included not only several electronic databases (Cochrane, MEDLINE, EMBASE, LILACS, etc.) but also other resources (e.g. Google Scholar, WHO International Clinical Trial Registry Platform, and reference lists of identified studies).[ 5 ] It is not essential to include all the above databases in one's search. However, it is mandatory to define in advance which of these will be searched.
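
Search strategies are typically built by combining synonyms for each PICO concept with OR and then linking the concept blocks with AND. The sketch below assembles such a string from illustrative term lists; the terms and syntax are simplified examples, not the actual strategy used by Safi et al., and a real search would add database-specific field tags and MeSH/Emtree headings.

```python
# Illustrative term blocks; a real strategy would also use database-specific
# syntax, subject headings, and truncation rules.
concept_blocks = {
    "population": ["myocardial infarction", "heart attack", "MI"],
    "intervention": ["beta blocker*", "metoprolol", "atenolol", "propranolol"],
}

def or_block(terms):
    # Quote multi-word phrases and join synonyms with OR.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

query = " AND ".join(or_block(terms) for terms in concept_blocks.values())
print(query)
# ("myocardial infarction" OR "heart attack" OR MI) AND ("beta blocker*" OR ...)
```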

Step 4: Identifying and selecting relevant studies

Once the search strategy defined in the previous step has been run to identify potentially relevant studies, a two-step process is followed. First, the titles and abstracts of the identified studies are processed to exclude any duplicates and to discard obviously irrelevant studies. In the next step, full-text papers of the remaining articles are retrieved and closely reviewed to identify studies that meet the eligibility criteria. To minimize bias, these selection steps are usually performed independently by at least two reviewers, who also assign a reason for non-selection to each discarded study. Any discrepancies are then resolved either by an independent reviewer or by mutual consensus of the original reviewers. In the Cochrane review on beta-blockers referred to above, two review authors independently screened the titles for inclusion, and then, four review authors independently reviewed the screen-positive studies to identify the trials to be included in the final review.[ 5 ] Disagreements were resolved by discussion or by taking the opinion of a separate reviewer. A summary of this selection process, showing the degree of agreement between reviewers, and a flow diagram that depicts the numbers of screened, included and excluded (with reason for exclusion) studies are often included in the final review.
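
Agreement between the two screening reviewers is commonly summarized with Cohen's kappa. A minimal sketch, assuming made-up include/exclude decisions for ten abstracts, is shown below.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters making the same categorical judgements."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal proportions.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

# Hypothetical screening decisions for ten abstracts.
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "include", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")
```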

Step 5: Data extraction

In this step, from each selected study, relevant data are extracted. This should be done by at least two reviewers independently, and the data then compared to identify any errors in extraction. Standard data extraction forms help in objective data extraction. The data extracted usually contain the name of the author, the year of publication, details of intervention and control treatments, and the number of participants and outcome data in each group. In the review by Safi et al ., four review authors independently extracted data and resolved any differences by discussion.[ 5 ]

Handling missing data

Some of the studies included in the review may not report outcomes in accordance with the review methodology. Such missing data can be handled in two ways – by contacting authors of the original study to obtain the necessary data and by using data imputation techniques. Safi et al . used both these approaches – they tried to get data from the trial authors; however, where that failed, they analyzed the primary outcome (mortality) using the best case (i.e. presuming that all the participants in the experimental arm with missing data had survived and those in the control arm with missing mortality data had died – representing the maximum beneficial effect of the intervention) and the worst case (all the participants with missing data in the experimental arm assumed to have died and those in the control arm to have survived – representing the least beneficial effect of the intervention) scenarios.
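
The best-case/worst-case approach can be made concrete with a small calculation. The counts below are invented purely to show the arithmetic and do not come from the review by Safi et al.

```python
# Hypothetical trial arms: deaths observed, plus participants lost to follow-up.
experimental = {"n": 100, "deaths": 10, "missing": 5}
control      = {"n": 100, "deaths": 15, "missing": 5}

def odds_ratio(exp_deaths, exp_n, ctl_deaths, ctl_n):
    exp_alive = exp_n - exp_deaths
    ctl_alive = ctl_n - ctl_deaths
    return (exp_deaths / exp_alive) / (ctl_deaths / ctl_alive)

# Best case for the intervention: all missing experimental participants survived,
# all missing control participants died.
best = odds_ratio(experimental["deaths"], experimental["n"],
                  control["deaths"] + control["missing"], control["n"])

# Worst case: all missing experimental participants died,
# all missing control participants survived.
worst = odds_ratio(experimental["deaths"] + experimental["missing"], experimental["n"],
                   control["deaths"], control["n"])

print(f"Best-case OR = {best:.2f}, worst-case OR = {worst:.2f}")
```

If the conclusion holds under both extremes, missing outcome data are unlikely to be driving the result.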

Evaluating the quality (or risk of bias) in the included studies

The overall quality of a systematic review depends on the quality of each of the included studies. Quality of a study is inversely proportional to the potential for bias in its design. In our previous articles on interventional study design in this series, we discussed various methods to reduce bias – such as randomization, allocation concealment, participant and assessor blinding, using objective endpoints, minimizing missing data, the use of intention-to-treat analysis, and complete reporting of all outcomes.[ 6 , 7 ] These features form the basis of the Cochrane Risk of Bias Tool (RoB 2), which is a commonly used instrument to assess the risk of bias in the studies included in a systematic review.[ 8 ] Based on this tool, one can classify each study in a review as having low risk of bias, having some concerns regarding bias, or at high risk of bias. Safi et al . used this tool to classify the included studies as having low or high risk of bias and presented these data in both tabular and graphical formats.[ 5 ]

In some reviews, the authors decide to summarize only studies with a low risk of bias and to exclude those with a high risk of bias. Alternatively, some authors undertake a separate analysis of studies with low risk of bias, besides an analysis of all the studies taken together. The conclusions from such analyses of only high-quality studies may be more robust.

Step 6: Synthesis of results

The data extracted from various studies are pooled quantitatively (known as a meta-analysis) or qualitatively (if pooling of results is not considered feasible). For qualitative reviews, data are usually presented in the tabular format, showing the characteristics of each included study, to allow for easier interpretation.

Sensitivity analyses

Sensitivity analyses are used to test the robustness of the results of a systematic review by examining the impact of excluding or including studies with certain characteristics. As referred to above, this can be based on the risk of bias (methodological quality), studies with a specific study design, studies with a certain dosage or schedule, or sample size. If results of these different analyses are more-or-less the same, one can be more certain of the validity of the findings of the review. Furthermore, such analyses can help identify whether the effect of the intervention could vary across different levels of another factor. In the beta-blocker review, sensitivity analysis was performed depending on the risk of bias of included studies.[ 5 ]
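
In practice, a sensitivity analysis often amounts to re-running the same pooling routine on a restricted subset of studies. The sketch below, using invented effect sizes and risk-of-bias labels, compares a pooled odds ratio from all studies with one restricted to low risk-of-bias studies.

```python
import math

# Hypothetical studies: log odds ratio, standard error, and risk-of-bias rating.
studies = [
    {"log_or": math.log(0.70), "se": 0.15, "rob": "low"},
    {"log_or": math.log(0.85), "se": 0.20, "rob": "low"},
    {"log_or": math.log(0.60), "se": 0.25, "rob": "high"},
    {"log_or": math.log(0.90), "se": 0.18, "rob": "high"},
]

def pooled_or(subset):
    """Fixed-effect inverse-variance pooled odds ratio."""
    weights = [1 / s["se"] ** 2 for s in subset]
    pooled_log = sum(w * s["log_or"] for w, s in zip(weights, subset)) / sum(weights)
    return math.exp(pooled_log)

print(f"All studies:           OR = {pooled_or(studies):.2f}")
print(f"Low risk of bias only: OR = {pooled_or([s for s in studies if s['rob'] == 'low']):.2f}")
```

Similar estimates from the two runs would suggest the findings are robust to the inclusion of higher-risk studies.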

IMPORTANT RESOURCES FOR CARRYING OUT SYSTEMATIC REVIEWS AND META-ANALYSES

Cochrane is an organization that works to produce good-quality, updated systematic reviews related to human healthcare and policy, which are accessible to people across the world.[ 9 ] There are more than 7000 Cochrane reviews on various topics. One of its main resources is the Cochrane Library (available at https://www.cochranelibrary.com/ ), which incorporates several databases with different types of high-quality evidence to inform healthcare decisions, including the Cochrane Database of Systematic Reviews, Cochrane Central Register of Controlled Trials (CENTRAL), and Cochrane Clinical Answers.

The Cochrane Handbook for Systematic Reviews of Interventions

The Cochrane handbook is an official guide, prepared by the Cochrane Collaboration, to the process of preparing and maintaining Cochrane systematic reviews.[ 10 ]

Review Manager software

Review Manager (RevMan) is software developed by Cochrane to support the preparation and maintenance of systematic reviews, including tools for performing meta-analysis.[ 11 ] It is freely available in both online (RevMan Web) and offline (RevMan 5.3) versions.

Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement

The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement is an evidence-based minimum set of items for reporting of systematic reviews and meta-analyses of randomized trials.[ 12 ] It can be used both by authors of such studies to improve the completeness of reporting and by reviewers and readers to critically appraise a systematic review. There are several extensions to the PRISMA statement for specific types of reviews. An update is currently underway.

Meta-analysis of Observational Studies in Epidemiology statement

The Meta-analysis of Observational Studies in Epidemiology statement summarizes the recommendations for reporting of meta-analyses in epidemiology.[ 13 ]

PROSPERO is an international database for prospective registration of protocols for systematic reviews in healthcare.[ 14 ] It aims to avoid duplication of reviews and to improve transparency in the reporting of their results.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.


Open Access

Peer-reviewed

Research Article

Is educational attainment associated with the onset and outcomes of low back pain? a systematic review and meta-analysis

Roles Data curation, Formal analysis, Investigation, Writing – original draft, Writing – review & editing

Affiliation Faculty of Health Sciences, School of Physical Therapy, Western University, London, Ontario, Canada


Roles Data curation, Formal analysis, Visualization, Writing – original draft, Writing – review & editing

Affiliation School of Health Sciences, Nursing and Emergency Services, Cambrian College, Sudbury, Ontario, Canada

Roles Data curation, Formal analysis, Methodology, Visualization, Writing – original draft, Writing – review & editing

Affiliation Palliative Care, Fraser Health Authority, New Westminster, British Columbia, Canada

Affiliation Department of Physical Therapy, University of British Columbia, Vancouver, British Columbia, Canada

Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review & editing

* E-mail: [email protected]

  • Aliyu Lawan, 
  • Alex Aubertin, 
  • Jane Mical, 
  • Joanne Hum, 
  • Michelle L. Graf, 
  • Peter Marley, 
  • Zachary Bolton, 
  • David M. Walton


  • Published: August 13, 2024
  • https://doi.org/10.1371/journal.pone.0308625


Low back pain (LBP) is the leading global cause of years lived with disability. Of the biopsychosocial domains of health, social determinants of LBP remain under-researched. Socioeconomic status (SES) may be associated with the onset of new LBP or outcomes of acute LBP, with educational attainment (EA) being a key component of SES. The association between EA and LBP has yet to be the subject of a dedicated review and meta-analysis.

The objectives were to review evidence of the association between EA and a) onset or b) outcomes of acute and subacute LBP in the adult general population, and to conduct statistical pooling of data where possible.

An electronic search was conducted in MEDLINE, Embase, CINAHL, and ProQuest from inception to 2nd November 2023, including reference lists, to identify relevant prospective studies. Risk of bias (RoB) was assessed using the Quality in Prognostic Studies (QUIPS) tool. Where adequate data were available, estimates were pooled using a random-effects meta-analysis. Overall evidence for each outcome was graded using an adapted GRADE approach.

After screening 8498 studies, 29 were included in the review. Study confounding and attrition were common biases. Data from 19 studies were statistically pooled to explore EA as a predictor of new LBP onset or as prognostic for outcomes of acute or subacute LBP. Pooled results showed no association between EA and the onset of new LBP (OR: 0.927, 95%CI: 0.747 to 1.150; I² = 0%). For predicting outcomes of acute LBP, compared to those with no more than secondary-level education, post-secondary education or higher was associated with better outcomes of pain (OR: 0.538, 95%CI: 0.432 to 0.671; I² = 35%) or disability (OR: 0.565, 95%CI: 0.420 to 0.759; I² = 44%). High heterogeneity (I² > 80%) prevented meaningful pooling of estimates for subacute LBP outcomes.

We found no consistent evidence that lower EA increases the risk of LBP onset. Lower EA shows a consistent association with worse LBP outcomes measured at least 3 months after acute onset, with inconclusive findings in subacute LBP. Causation cannot be supported owing to study designs. High-quality research is needed on potential mechanisms to explain these effects.

Citation: Lawan A, Aubertin A, Mical J, Hum J, Graf ML, Marley P, et al. (2024) Is educational attainment associated with the onset and outcomes of low back pain? a systematic review and meta-analysis. PLoS ONE 19(8): e0308625. https://doi.org/10.1371/journal.pone.0308625

Editor: Mohammad Ali, Uttara Adhunik Medical College, BANGLADESH

Received: March 13, 2024; Accepted: July 26, 2024; Published: August 13, 2024

Copyright: © 2024 Lawan et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The data underlying the results presented in the study are available within the manuscript.

Funding: The author(s) received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Low back pain (LBP) is a leading global cause of years lived with disability [ 1 ]. In North America, chronic LBP is amongst the top ten reasons for seeking medical attention [ 2 ], with a prevalence of 18–23% among Canadian adults [ 3 ]. While many LBP cases resolve within the first three months, it has been estimated that as many as 60% to 80% progress to chronicity or recurrence within one year, including loss of productivity in 40% [ 4 , 5 ]. The acute-to-chronic transition of LBP is a complex process, with multiple biological, psychological, and social mechanisms likely influencing the pathway [ 6 – 8 ]. Prevention of new LBP or of the acute-to-chronic transition stands to have a major impact on the global health burden [ 2 ]. Existing guidelines recommend early identification of psychosocial factors that could impede or enhance recovery from LBP [ 9 ].

While the biological and psychological sciences have provided considerable evidence to explain onset of and recovery from LBP, much less attention has been paid to the social influences. Social determinants of health (SDOH) are increasingly recognized as potent influences on the genesis of several health states [ 10 ], with some prior authors indicating that neighborhood characteristics may have at least as large an influence on the experience of chronic diseases as do personal genetics [ 11 ]. While SDOHs represent a large and complicated field of research, there are some social variables unique to the person experiencing pain that are worthy of dedicated inquiry. One such variable is educational attainment (EA), defined as the highest level of education completed by a person. As a prognostic variable, EA represents a blend of person-level (e.g., literacy) and society-level (e.g., access) influences and could potentially hold value as a variable through which intervention strategies could be tailored. EA holds value for research on SDOHs as “years of education” is one of few such variables that can be readily quantified.

There is some evidence that chronic pain is more prevalent amongst people with low EA [ 12 ] and lower EA may predict the acute-to-chronic transition [ 13 ]. However, there are limited studies that focus solely on the social predictors of LBP specifically [ 6 ]. EA has been included in some prior systematic reviews in LBP [ 14 , 15 ] though differences in case definitions, variable definitions, or study design have precluded clear findings. Even rarer are reviews or evidence syntheses on the association between EA and the onset of new LBP in population-level cohort studies that start with pain-free participants [ 16 ]. If lower EA is a risk for new onset LBP or for poor recovery following onset of acute LBP, mechanisms could then be explored and if causation is supported EA could be integrated into either public health prevention strategies or tailored treatment planning to prevent the acute-to-chronic transition.

The purpose of this systematic review was to qualitatively and/or quantitatively synthesize published estimates on the risk and prognostic value of EA on the onset or outcomes of LBP.

This review was designed and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework [ 17 ]. The review was limited to observational prospective cohort or population-level studies (not clinical trials) of patients aged ≥18 years with either no LBP at inception (followed to determine onset), or acute (<8 weeks) and/or sub-acute (8–12 weeks) LBP. We focused on ‘non-specific’ LBP and therefore excluded LBP related to underlying systemic medical conditions (such as cancer, infection, or cauda equina syndrome), vertebral compression fracture or osteoporosis, inflammatory conditions (e.g., ankylosing spondylitis), or neurological conditions (e.g., stroke). Beyond that, we accepted the case definitions of LBP as reported by the authors of the primary sources.

Search strategy

The search strategy (Appendix A) was developed with a research librarian using MeSH terms specific to MEDLINE and was adapted for the other databases. No specific restrictions on publication date were set. The search strategies were applied to MEDLINE (OVID), The Cumulative Index of Nursing and Allied Health Literature (CINAHL), EMBASE, and ProQuest from inception to 30th March 2023 and updated to 2nd November 2023, corresponding to the dates of the respective searches, without restriction to language of publication. A grey literature search of unpublished studies was conducted in Researchsquare. Hand searches of reference lists of all included articles were conducted to identify additional primary sources.

Study selection

Yields from each database were imported into Covidence systematic review software (Veritas Health Innovation, Melbourne, Australia) and screened by two independent reviewers against the inclusion criteria, with disagreements being resolved by a third reviewer. Titles and abstracts were screened to remove irrelevant sources, followed by full-text screening against the inclusion/exclusion criteria. Kappa was calculated as an indicator of agreement between raters. The reasons for exclusion are included in Fig 1.

Fig 1. PRISMA flow diagram of study identification and selection. https://doi.org/10.1371/journal.pone.0308625.g001

Risk of bias in individual studies

The Quality in Prognostic Studies (QUIPS) tool was used to assess RoB of all included studies. QUIPS consists of six category-domains of potential biases: i) study participation, ii) attrition, iii) prognostic factor measurement, iv) outcome measurement, v) confounding, and vi) statistical analysis/reporting. All included studies were assessed by 2 independent reviewers. We used a worst-score approach, where each paper was assigned a RoB based on the worst (highest risk) rating of any of the 6 categories [ 18 ] classed as low, moderate, or high risk of bias (RoB). RoB agreement was calculated through Cohen’s kappa with disagreements resolved through discussion with a third experienced reviewer.
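
The worst-score rule is straightforward to express in code. The sketch below, with hypothetical domain ratings for a single study, assigns the overall rating from the highest-risk rating across the six QUIPS domains.

```python
# Order encodes severity: a study's overall rating is its worst (highest-risk) domain.
SEVERITY = {"low": 0, "moderate": 1, "high": 2}

def overall_rob(domain_ratings: dict) -> str:
    """Worst-score-counts rule across QUIPS domains."""
    return max(domain_ratings.values(), key=lambda r: SEVERITY[r])

# Hypothetical ratings for one study across the six QUIPS domains.
example_study = {
    "study participation": "low",
    "study attrition": "moderate",
    "prognostic factor measurement": "low",
    "outcome measurement": "low",
    "confounding": "high",
    "statistical analysis and reporting": "low",
}
print(overall_rob(example_study))  # -> "high"
```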

Data extraction

Data were extracted using a study-specific extraction table that included key study descriptors, sample characteristics, operationalizations of EA and LBP, outcomes assessed, and relevant findings. Educational attainment was extracted with as much detail as reported in the publication. Where possible, the minimum data extracted were related to a 12-year cut-point for EA as representing the threshold between secondary (up to year 12) and post-secondary (beyond year 12) EA in most countries. Where data were not presented with adequate detail, EA was sorted into meaningful order based on the manner reported in the studies (e.g., low vs high). We did not restrict studies based on the length of follow-up but extracted that information for subsequent interpretation as a potential effect modifier.
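
Harmonizing the differently reported EA measures onto the 12-year cut-point can be thought of as a small mapping step. The sketch below is illustrative only; the category labels are invented stand-ins for what individual studies reported, not the review's actual extraction rules.

```python
def dichotomize_ea(reported) -> str:
    """Map a study's reported EA (years or a label) onto the 12-year cut-point."""
    if isinstance(reported, (int, float)):
        return "post-secondary" if reported > 12 else "secondary or less"
    # Fallback for studies reporting ordered labels rather than years (hypothetical labels).
    label_map = {
        "primary": "secondary or less",
        "high school": "secondary or less",
        "college": "post-secondary",
        "university": "post-secondary",
    }
    return label_map.get(reported.lower(), "unclassifiable")

print(dichotomize_ea(10))            # secondary or less
print(dichotomize_ea(16))            # post-secondary
print(dichotomize_ea("University"))  # post-secondary
```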

Outcomes were limited to those broadly categorized as either pain (e.g., presence/absence of LBP or pain intensity) or disability (e.g., return to work or score on a standardized patient-reported outcome). Where studies reported “recovery” as an outcome those operationalizations were reviewed for relevance and if aligned with our purpose the verbatim definition was extracted and assigned to the most relevant outcome category (e.g., pain, disability, or both). The study protocol was prospectively registered in PROSPERO (registration no. CRD 42023402135) as part of a series of reviews on SDOHs and LBP.

Data analysis and synthesis

Where possible, data were pooled and presented as odds ratios through random-effects meta-analysis using Comprehensive Meta-analysis software, version 2.2.04 (Biostat, Inc.©, Englewood, New Jersey). Syntheses were conducted for each of: i) onset of LBP (inception cohorts that start with no LBP and are followed over time to identify those who later report LBP); ii) pain intensity outcomes in acute LBP (inception starting within 8 weeks of LBP onset and followed over time to evaluate recovery); iii) pain-related disability outcomes in acute LBP; and iv) pain or disability outcomes in those entering the study with subacute (8 to 12 weeks) LBP. Heterogeneity in effects was assessed using both the I² statistic and its p-value: I² < 30% was deemed low heterogeneity, 30–60% moderate, 61–75% substantial, and 76–100% considerable, with statistical significance judged at an alpha of 0.05 [ 19 ]. First, one estimate for pain and/or disability was calculated from unadjusted estimates as reported in each study. Where unadjusted (bivariate) estimates were not available, the adjusted estimates were pooled, and where enough data from primary sources were available, sensitivity analyses were conducted to determine whether adjustment for other covariates affected effect size estimates.
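
As an illustration of the statistics reported in this section, the sketch below computes a DerSimonian–Laird random-effects pooled odds ratio and I² from invented study estimates; it is a simplified stand-in for the Comprehensive Meta-analysis software actually used.

```python
import math

# Hypothetical per-study log odds ratios and standard errors.
log_ors = [math.log(0.55), math.log(0.70), math.log(0.48), math.log(0.80)]
ses     = [0.20, 0.25, 0.30, 0.22]

# Fixed-effect (inverse-variance) quantities needed for Q, tau^2, and I^2.
w_fixed = [1 / se**2 for se in ses]
fixed_mean = sum(w * y for w, y in zip(w_fixed, log_ors)) / sum(w_fixed)
q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(w_fixed, log_ors))
df = len(log_ors) - 1

# DerSimonian-Laird between-study variance (tau^2) and the I-squared statistic.
c = sum(w_fixed) - sum(w**2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - df) / c)
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

# Random-effects weights and pooled estimate.
w_random = [1 / (se**2 + tau2) for se in ses]
re_mean = sum(w * y for w, y in zip(w_random, log_ors)) / sum(w_random)
re_se = math.sqrt(1 / sum(w_random))

print(f"Pooled OR = {math.exp(re_mean):.2f} "
      f"(95% CI {math.exp(re_mean - 1.96*re_se):.2f} to {math.exp(re_mean + 1.96*re_se):.2f}), "
      f"I² = {i_squared:.0f}%")
```

Under the thresholds above, an I² below 30% from such a run would be read as low heterogeneity.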

When heterogeneity could not be explained, or where there were too few primary sources to permit moderator analysis in otherwise highly heterogeneous effects, a narrative summary of the results is presented.

GRADE assessment

Results across studies were synthesized using a modified Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach that considered the strength of the effect (none, small, medium, or high) and confidence in the results (inconclusive, low, moderate, or high) based on RoB, precision, homogeneity and consistency of effects. Where effects could be statistically pooled, those results were used to determine effect size, where they could not, we used a qualitative synthesis approach focused on overall consistency across papers. For this review, we did not attempt to find study registration through online registries to identify publication bias as observational studies are not consistently registered and many studies were published prior to protocol registration becoming standard practice.

Fig 1 shows the PRISMA flow diagram. The search identified 8498 articles (including 1058 duplicates), of which 163 full texts were screened resulting in the inclusion of 23 articles. An additional 4 from reference lists and 2 from the update search were identified for a total of 29 manuscripts describing 27 prospective observational cohorts. The reliability between raters was Kappa = 0.97. Characteristics of the included studies are summarized in Table 1 . The included studies were grouped into: onset of new LBP (n = 3), outcome of acute LBP (n = 18) and outcome of subacute LBP (n = 8). The publication date of the included studies spanned 1991 to 2022 and were from 13 countries. Sample sizes ranged from 53 [ 20 ] to 12,500 [ 21 ] and follow-up periods from 3 months (n = 4 [ 22 – 25 ]) to 3 years (n = 2 [ 21 , 26 ]).

Table 1. Characteristics of the included studies. https://doi.org/10.1371/journal.pone.0308625.t001

Risk of bias

Details of the RoB are reported in Table 1 and RoB for the overall body of literature is presented in Fig 2 . The majority (n = 14) of the included studies were rated as high RoB with 13 rated moderate and 2 low. For the individual domains, low RoB was common in the domains of study participation (77%) and statistical analysis/reporting (74%). High RoB was common for the domains of confounding (45%) and study attrition (39%).

Fig 2. Risk of bias across the included studies. https://doi.org/10.1371/journal.pone.0308625.g002

Prognostic factors

Operational definitions and categories of EA were defined differently across studies. Thirteen studies [ 21 , 22 , 24 , 26 , 28 , 29 , 31 , 32 , 34 , 37 , 38 , 40 , 43 ] categorized EA around a 12-year education cut-point (e.g., 12 years or less versus more than 12 years). Three did not present adequate data for extracting years of education. Silva et al [ 36 ] categorized education as ‘low, medium, or high’ with no specific details, and Valencia et al [ 42 ] combined income and education into a single index of socio-economic status. Turner et al [ 41 ] reported EA as the highest grade of education completed, without detailing the years.

Outcomes were broadly categorized into pain, disability, or a combination of pain and disability. Pain was evaluated using 21 outcomes across 14 studies, with the Numeric Pain Rating Scale (NPRS) [ 29 , 36 , 40 , 42 , 43 ] and Visual Analog Scale (VAS) [ 45 ] being the most frequent. Disability was evaluated using 21 outcomes in 9 studies, most commonly using the Oswestry Disability Index (ODI) [ 20 , 42 , 48 ] and Roland Morris Disability Questionnaire (RMDQ) [ 23 , 29 , 36 ]. A combination of both pain and disability was evaluated in two studies [ 23 , 32 ].

Meta-analysis

Of the 29 articles from 27 studies, 9 articles [ 25 , 33 , 35 , 39 , 44 – 48 ] did not report adequate data (e.g., proportions, estimates) to permit statistical pooling and we were unsuccessful in contacting the original authors of those papers. Accordingly, data from 19 studies were available for meta-analysis.

EA as a predictor for the onset of new LBP.

Fig 3 shows the forest plot for the pooled estimate of the association between EA and the onset of LBP. Three longitudinal inception cohort studies [ 26 – 28 ] (total N = 3,110) presented adequate data for pooling. The pooled effect of the three studies shows consistent evidence of no association between EA and new onset of LBP (OR = 0.93; 95%CI 0.75 to 1.15) with homogeneity (I² < 0.1%).

Fig 3. Forest plot of prognostic accuracy (odds ratio, OR) of educational attainment for predicting a) new onset, b) acute outcomes, and c) subacute outcomes in LBP. https://doi.org/10.1371/journal.pone.0308625.g003

EA as a prognostic variable for predicting outcome of acute LBP.

Eight studies (total N = 15,079) reporting pain as an outcome were pooled, with low-to-moderate heterogeneity in effect sizes (I² = 35%). Results supported a significant effect in which higher EA predicts lower LBP symptoms 3 months to 3 years after onset of acute LBP (OR = 0.54; 95%CI 0.44 to 0.67, Fig 3). Three of those studies reported adjusted estimates only; excluding them resulted in an equivocal shift in the pooled effect when using only the unadjusted estimates (OR: 0.51, 95%CI: 0.37 to 0.70, I² = 36%, Fig 4). Nine articles (8 cohorts, total N = 4,672 subjects) reported a pain-related disability outcome. Pooling similarly indicated that higher EA measured in the acute phase of LBP predicts lower pain-related disability 3 months to 12 months later (OR = 0.57; 95%CI 0.42 to 0.76, Fig 3) with moderate heterogeneity (I² = 44%). Excluding the adjusted estimates from two of those studies again resulted in an equivocal shift in the pooled effect (OR: 0.62, 95%CI: 0.51 to 0.77, Fig 4) but without heterogeneity (I² = 0%). Of the three studies that could not be pooled, two [ 25 , 39 ] (moderate RoB, total N = 1,369) also reported significant associations between outcomes of acute LBP and EA. The third [ 33 ] (moderate RoB, N = 55 subjects) reported no association between EA and disabling LBP/time to return to work.

Fig 4. https://doi.org/10.1371/journal.pone.0308625.g004

EA as a prognostic variable for predicting outcome of subacute LBP.

Two (pain severity) [ 24 , 42 ] and three (pain-related disability) [ 20 , 42 , 43 ] studies predicting outcomes in subacute LBP could not be meaningfully pooled owing to high heterogeneity (I² > 79%), inconsistent outcomes, and too few sources to permit moderator analysis. Accordingly, we proceeded with qualitative synthesis. For pain intensity as an outcome, 4 of 6 studies (2 low [ 44 , 45 ] and 2 moderate [ 24 , 46 ] RoB, total N = 2,880) reported unadjusted (bivariate) estimates and indicated no association between EA and follow-up outcome. The remaining two studies (1 low [ 48 ] and 1 moderate [ 42 ] RoB, total N = 272) reported no significant association after adjusting for pain catastrophizing [ 42 ] or age, gender, occupation, and health status [ 48 ]. For pain-related disability, 7 studies (1 low, 3 moderate and 2 high RoB studies, total N = 1,960) reported inconsistent evidence. Two studies [ 43 , 47 ] (total N = 1,504) found a significant negative association between EA and pain-related disability as measured with the RDQ [ 43 ] and ability to return to work [ 47 ]. Four other studies [ 42 , 44 ] (2 low, [ 20 , 48 ] 1 moderate [ 20 ] and 1 high [ 42 ] RoB, total N = 403) reported no association between EA and ODI [ 20 , 42 , 48 ] or sickness profile [ 44 ].

Sex based analysis of educational attainment and low back pain outcomes.

Two studies analyzed data for potential differential effects of EA on LBP when disaggregated by sex. The two studies could not be pooled due to differences in case definitions. Zadro et al [ 27 ] studied new onset LBP and reported lower EA to be associated with increased proportion of new onset LBP in females only, with no significant effects in males. Sterud et al [ 21 ] evaluated outcomes in acute LBP and found no differential effect on outcomes between sexes.

GRADE statement

Evidence profile of all included studies applied using GRADE is presented in Table 2 .

thumbnail

https://doi.org/10.1371/journal.pone.0308625.t002

EA and onset of LBP : On the basis of 3 studies, 1 moderate and 2 high RoB, with consistency in magnitude of effect, we find low-to-moderate confidence that EA has no association with the onset of new LBP in adults when followed for at least 2 years.

EA and outcomes of acute LBP : On the basis of 9 of 11 studies, 6 moderate and 3 high RoB, we have moderate confidence that EA when collected at inception shows a significant association with LBP symptom severity measured at least 3 months later. We find low confidence based on 5 of 11 studies, 2 moderate and 3 high RoB, of a similar association when the outcome is pain-related disability.

EA and outcomes of subacute LBP : On the basis of 7 of 9 studies, 2 low, 3 moderate, and 2 high RoB, we find inconsistent evidence and very low confidence in any association between EA and subsequent outcomes of pain severity when participants are incepted at the subacute (8–12 weeks from onset) stage. We find inconsistent evidence based on 3 of 5 studies (1 low, 1 moderate, and 1 high RoB) and very low confidence for no association between EA and disability at least 3 months later. Significant heterogeneity in case definitions and effect sizes precludes more definitive findings.

We have conducted a rigorous and systematic search, extraction, and pooling of effects to explore the associations between a key indicator of socioeconomic status (highest level of education completed) and each of onset of new LBP, outcomes of acute LBP, or outcomes of subacute LBP. This represents part of an ongoing set of reviews exploring the social determinants of health and their associations with LBP, conceptualized herein as evaluating whether lower EA (high school or less) functions as either a potential risk or prognostic factor for onset or outcome of LBP, respectively. As observational studies, these are inherently vulnerable to confounding bias from several other potential variables meaning causation should not be assumed. Based on the strength and effects of available evidence, we have moderate confidence in a significant negative association between EA and pain severity or disability outcomes of acute LBP in which higher EA may offer some protection against poor outcomes, low confidence that EA has no association with onset of new LBP, and very low confidence of any association between EA and outcome when starting from the subacute LBP stage.

While, to our knowledge, the quantitative synthesis of evidence related to new-onset LBP is novel, our results are largely consistent with those of other reviews in acute or subacute LBP, each of which included EA as part of a larger set of potential prognostic variables and few of which conducted meta-analyses. Previous LBP studies have failed to establish an association between EA and LBP outcomes for various reasons, including sample sizes too small to detect effects [ 43 ], lack of uniform study design [ 49 ], and heterogeneous populations. For example, a review by Batista et al [ 49 ] reported that people with higher EA are less often affected by the occurrence of LBP; however, that review included multiple study designs that might have added statistical noise to the estimates of effect. Cancelliere and colleagues [ 50 ], in a best-evidence synthesis of systematic reviews on factors affecting return to work after injury or illness, identified higher EA and socioeconomic status among the factors associated with positive return-to-work outcomes. A similar review by Dionne and colleagues [ 51 ] included multiple study designs that did not allow prediction or causation to be established, though it concluded that people with lower EA are more likely to be affected by disabling LBP.

While it is tempting to ascribe mechanisms to our results, any such attempt is necessarily speculative given the design of the included studies and the infeasibility of conducting a randomized trial in which one arm remains uneducated. Accordingly, the criteria to support cause and effect, most famously described by Bradford Hill [52, 53], may never be fully satisfied. However, the impact of this work would be limited if no potential mechanisms were explored. EA is commonly included in the indices used to assign people to socioeconomic strata [54, 55], which also include variables such as annual household income and median neighborhood income. From a Bourdieusian perspective, each of these may be interpreted as conferring capital that can be converted to power across different social fields [56]. In the context of LBP outcomes, the social capital enabled by higher EA may permit easier access to effective care or alternative employment options, meaning that research using outcomes such as work status may find that those with economic or educational privilege have better outcomes. However, EA may also function more as a proxy for other influences on experiences of health and wellness. For example, higher EA may signal higher health literacy, residence in more affluent areas with easier access to schools, or family wealth. Lower EA may be associated with, amongst other things, experiences of school bullying, early parentification, poor mental health, or neighborhood poverty [57]. Each may also play a moderating or mediating role on health outcomes [58], suggesting that these effects are very likely complex interactions between person- and society-level influences.

That EA showed no significant association with the onset of new LBP also demands further interpretation. Importantly, because this finding rests on only 3 studies of moderate-to-high RoB, we cannot have more than low confidence in it, though the consistency across more than 3,000 participants is meaningful. Intuitively, we might expect that those with lower EA are more likely to work in jobs demanding heavier physical labour or more repetitive tasks, which might increase the risk of musculoskeletal disorders such as LBP. However, prior evidence appears to support the lack of association identified herein. For example, in a large population-level study of more than 74,000 U.S. adults aged 30–49, Zajacova and colleagues found a non-linear association between EA and pain, in which adults who started but did not finish a post-secondary educational program reported a higher prevalence of painful conditions than either those who completed post-secondary education or those with secondary education only [59]. There is also an abundance of evidence identifying sedentariness or prolonged sitting, which may be more commonly experienced by those in higher-level or managerial roles, as risk factors for low back pain. Further, amongst blue-collar workers, Lagersted-Olsen and colleagues found no association between daily time spent in a forward-bending posture and the onset or aggravation of LBP over one year [60]. Accordingly, and similar to our commentary on EA and LBP outcomes, any association between EA and LBP onset is likely complex, and it seems overly simplistic to suggest that lower education does or does not lead to LBP.

Limitations of this study include the inability to establish causation, as described above, though this is a limitation of the overall field rather than of this particular review. With very few exceptions, we were largely unable to retrieve missing or under-reported data by contacting authors, meaning that some potentially relevant data were not included in the meta-analyses and might otherwise have changed the results. We did not include studies published in a language other than English or without a formal English translation available, raising the risk that we have missed data from work published in non-English journals that could influence our results. Additionally, because fewer than 10 studies were included in the meta-analysis, formal analysis of publication bias may be inappropriate [61]. Finally, not all studies reported EA in a way that permitted easy dichotomization into the two categories defined by 12 years of education. We made our best estimates based on the reporting in each manuscript when grouping results into one of these two categories, though we acknowledge that some errors may have been made.
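
To make the publication-bias point concrete, the sketch below shows one widely used approach, Egger's regression test for funnel-plot asymmetry, which is generally considered underpowered when fewer than about 10 studies are available, as was the case here. The study effects and standard errors are hypothetical, and the use of Egger's test is our illustrative assumption rather than a method reported in this review.

```python
import numpy as np
from scipy import stats

# Hypothetical per-study log odds ratios and standard errors (illustrative only;
# not the data extracted for this review).
effects = np.array([0.30, 0.45, 0.12, 0.60, 0.25, 0.50])
se = np.array([0.10, 0.22, 0.15, 0.30, 0.12, 0.28])

# Egger's regression test: regress the standardized effect (effect / SE)
# on precision (1 / SE); an intercept that differs from zero suggests
# funnel-plot asymmetry consistent with small-study effects.
standardized = effects / se
precision = 1.0 / se
result = stats.linregress(precision, standardized)

# Two-sided t-test on the intercept (n - 2 degrees of freedom).
t_stat = result.intercept / result.intercept_stderr
p_value = 2 * stats.t.sf(abs(t_stat), df=len(effects) - 2)
print(f"Egger intercept = {result.intercept:.2f}, p = {p_value:.3f}")
```

With so few studies, a non-significant intercept cannot rule out small-study effects, which is why the limitation above is framed as an inability to assess publication bias rather than evidence of its absence.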

While this review suggests that EA is not associated with the onset of new LBP or the outcomes of subacute LBP, it does suggest a consistent association with the outcomes of acute LBP. We have proposed potential mechanisms to explain these findings, though clearly more theoretical and empirical work is needed in this field. Future high-quality longitudinal studies with adequate sample sizes, clear and consistent definitions of EA, and adjustment for meaningful confounders in the study design and/or analysis will improve understanding of the relative contribution of EA to the onset and outcomes of acute and subacute LBP.

Supporting information

S1 Checklist. PRISMA 2020 checklist.

https://doi.org/10.1371/journal.pone.0308625.s001

S1 Appendix.

https://doi.org/10.1371/journal.pone.0308625.s002

  • 19. Deeks J, Higgins J, Altman D. Analyzing data and undertaking meta-analyses. In: Higgins J, Green S, editors. Cochrane Handbook for Systematic Reviews of Interventions, Version 5.0.0. Wiley; 2008.
  • 54. Lynch J, Kaplan G. Socioeconomic position. In: Berkman LF, Kawachi I, editors. Social Epidemiology. 1st ed. Oxford University Press; 2000. p. 13–35.