2.3: Case Selection (Or, How to Use Cases in Your Comparative Analysis)


  • Dino Bozonelos, Julia Wendt, Charlotte Lee, Jessica Scarffe, Masahiro Omae, Josh Franco, Byran Martin, & Stefan Veldhuis
  • Victor Valley College, Berkeley City College, Allan Hancock College, San Diego City College, Cuyamaca College, Houston Community College, and Long Beach City College via ASCCC Open Educational Resources Initiative (OERI)

Learning Objectives

By the end of this section, you will be able to:

  • Discuss the importance of case selection in case studies.
  • Consider the implications of poor case selection.

Introduction

Case selection is an important part of any research design. Deciding how many cases, and which cases, to include will clearly shape our results. If we select a large number of cases, we say that we are conducting large-N research: research in which the number of observations or cases is large enough that we need mathematical, usually statistical, techniques to discover and interpret any correlations or causations. For a large-N analysis to yield relevant findings, a number of conventions need to be observed. First, the sample needs to be representative of the studied population. Thus, if we wanted to understand the long-term effects of COVID, we would need to know the approximate characteristics of those who contracted the virus. Once we know the parameters of the population, we can then draw a sample that represents that larger population. For example, if women make up 55% of all long-term COVID survivors, then any sample we generate should be roughly 55% women.

Second, some kind of randomization technique needs to be involved in large-N research. Not only must the sample be representative; the people within it must also be selected randomly. In other words, we assemble a large pool of people who fit the population criteria and then select randomly from that pool. Randomization helps to reduce bias in the study, and when cases (people with long-term COVID) are randomly chosen, they tend to provide a fairer representation of the studied population. Third, the sample needs to be large enough, hence the large-N designation, for any conclusions to have external validity. Generally speaking, the larger the number of observations or cases in the sample, the more validity the study can claim. There is no magic number, but in the example above, our sample of long-term COVID patients should be at least 750 people, with an aim of around 1,200 to 1,500.
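To make the ideas of representativeness and randomization concrete, here is a minimal sketch in Python (not from the chapter; the sampling frame, proportions and target size are hypothetical) that stratifies a pool of long-COVID cases by sex and then selects randomly within each stratum so the sample mirrors the population.

```python
# Minimal sketch: stratified random sampling from a hypothetical sampling frame
# so the sample matches a known population proportion (e.g., 55% women).
import random

def stratified_sample(frame, strata_key, proportions, n, seed=42):
    """Randomly draw n cases so each stratum matches its target proportion."""
    random.seed(seed)
    sample = []
    for stratum, share in proportions.items():
        pool = [case for case in frame if case[strata_key] == stratum]
        k = round(n * share)
        sample.extend(random.sample(pool, k))  # random selection within the stratum
    return sample

# Hypothetical sampling frame of 10,000 long-COVID patients, 55% women.
frame = [{"id": i, "sex": "female" if i < 5500 else "male"} for i in range(10_000)]
sample = stratified_sample(frame, "sex", {"female": 0.55, "male": 0.45}, n=1200)
print(len(sample), sum(case["sex"] == "female" for case in sample))  # 1200, 660
```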

When it comes to comparative politics, we rarely reach the numbers typically used in large-N research. There are about 200 fully recognized countries, about a dozen partially recognized countries, and even fewer areas or regions of study, such as Europe or Latin America. Given this, what is the strategy when one case, or only a few cases, are being studied? What happens if we only want to know the COVID-19 response in the United States, and not the rest of the world? How do we randomize this to ensure our results are unbiased and representative? These and other questions are legitimate issues that many comparativist scholars face when completing research. Does randomization work with case studies? Gerring suggests that it does not, as “any given sample may be widely unrepresentative” (pg. 87). Thus, random sampling is not a reliable approach when it comes to case studies. And even if a randomized sample happens to be representative, there is no guarantee that the gathered evidence would be reliable.

One can make the argument that case selection is not as important in large-N studies as it is in small-N studies. In large-N research, potential errors and/or biases may be ameliorated, especially if the sample is large enough. This is not to say that errors and biases cannot exist in large-N research; they most certainly can. However, incorrect or biased inferences are less of a worry when we have 1,500 cases than when we have 15. In small-N research, case selection simply matters much more.

This is why Blatter and Haverland (2012) write that “case studies are ‘case-centered’, whereas large-N studies are ‘variable-centered’”. In large-N studies we are more concerned with the conceptualization and operationalization of variables. Thus, we want to focus on which data to include in the analysis of long-term COVID patients. If we wanted to survey them, we would want to make sure we construct the questions in appropriate ways. For almost all survey-based large-N research, the question responses themselves become the coded variables used in the statistical analysis.

Case selection can be driven by a number of factors in comparative politics, with the first two approaches below being the more traditional. First, it can derive from the interests of the researcher(s). For example, a researcher who lives in Germany may want to research the spread of COVID-19 within the country, possibly using a subnational approach that compares infection rates among German states. Second, case selection may be driven by area studies. This is still based on the interests of the researcher, as scholars generally pick areas of study because of their personal interests. For example, the same researcher might study COVID-19 infection rates among European Union member-states. Finally, case selection may be driven by the type of case study being utilized. In this approach, cases are selected because they allow researchers to compare their similarities or their differences. Or a case might be selected because it is typical of most cases, or, in contrast, because it deviates from the norm. We discuss types of case studies and their impact on case selection below.

Types of Case Studies: Descriptive vs. Causal

There are a number of different ways to categorize case studies. One of the most recent comes from John Gerring, whose book on case study research (2nd edition, 2017) posits that the central question posed by the researcher will dictate the aim of the case study. Is the study meant to be descriptive? If so, what is the researcher looking to describe? How many cases (countries, incidents, events) are there? Or is the study meant to be causal, where the researcher is looking for a cause and effect? Given this, Gerring categorizes case studies into two types: descriptive and causal.

Descriptive case studies are “not organized around a central, overarching causal hypothesis or theory” (pg. 56). Most case studies are descriptive in nature; the researchers simply seek to describe what they observe. They are useful for transmitting information about the studied political phenomenon. For a descriptive case study, a scholar might choose a case that is considered typical of the population. An example could involve researching the effects of the pandemic on a typical medium-sized city in the US, one that exhibits the tendencies of medium-sized cities throughout the entire country. First, we would have to conceptualize what we mean by ‘a medium-sized city’. Second, we would have to establish the characteristics of medium-sized US cities so that our case selection is appropriate. Alternatively, cases could be chosen for their diversity. In keeping with our example, maybe we want to look at the effects of the pandemic on a range of US cities, from small rural towns, to medium-sized suburban cities, to large urban areas.

Causal case studies are “organized around a central hypothesis about how X affects Y” (pg. 63). In causal case studies, the context around a specific political phenomenon or phenomena is important, as it allows researchers to identify the aspects that set up the conditions, the mechanisms, for that outcome to occur. Scholars refer to this as the causal mechanism, which Falleti & Lynch (2009) define as “portable concepts that explain how and why a hypothesized cause, in a given context, contributes to a particular outcome”. Remember, causality is when a change in one variable verifiably causes an effect or change in another variable. For causal case studies that employ causal mechanisms, Gerring divides them into exploratory, estimating, and diagnostic case selection. The differences revolve around how the central hypothesis is utilized in the study.

Exploratory case studies are used to identify a potential causal hypothesis. Researchers will single out the independent variables that seem to affect the outcome, or dependent variable, the most. The goal is to build up to what the causal mechanism might be by providing the context. This is also referred to as hypothesis generating as opposed to hypothesis testing. Case selection can vary widely depending on the goal of the researcher. For example, if the scholar is looking to develop an ‘ideal-type’, they might seek out an extreme case. An ideal-type is defined as a “conception or a standard of something in its highest perfection” (New Webster Dictionary). Thus, if we want to understand the ideal-type capitalist system, we want to investigate a country that practices a pure or ‘extreme’ form of the economic system.

Estimating case studies start with a hypothesis already in place. The goal is to test the hypothesis through collected data/evidence, with researchers seeking to estimate the ‘causal effect’. This involves determining whether the relationship between the independent and dependent variables is positive, negative, or whether no relationship exists at all. Finally, diagnostic case studies are important as they help to “confirm, disconfirm, or refine a hypothesis” (Gerring 2017). Case selection can also vary in diagnostic case studies. For example, scholars can choose a least-likely case, a case where the hypothesis is confirmed even though the context would suggest otherwise. A good example is Indian democracy, which has existed for over 70 years. India has a high level of ethnolinguistic diversity, is relatively underdeveloped economically, and has a low level of modernization throughout large swaths of the country. All of these factors strongly suggest that India should not have democratized, should have failed to stay a democracy in the long term, or should have disintegrated as a country.

Most Similar/Most Different Systems Approach

The discussion in the previous subsection tends to focus on case selection when it comes to a single case. Single case studies are valuable as they provide an opportunity for in-depth research on a topic that requires it. However, in comparative politics, our approach is to compare. Given this, we are required to select more than one case. This presents a different set of challenges. First, how many cases do we pick? This is a tricky question we addressed earlier. Second, how do we apply the previously mentioned case selection techniques, descriptive vs. causal? Do we pick two extreme cases if we used an exploratory approach, or two least-likely cases if choosing a diagnostic case approach?

Thankfully, an English scholar by the name of John Stuart Mill provided some insight on how we should proceed. He developed several approaches to comparison with the explicit goal of isolating a cause within a complex environment. Two of these methods, the ‘method of agreement’ and the ‘method of difference’, have influenced comparative politics. In the ‘method of agreement’, two or more cases are compared for their commonalities. The scholar looks to isolate the characteristic, or variable, they have in common, which is then established as the cause for their similarities. In the ‘method of difference’, two or more cases are compared for their differences. The scholar looks to isolate the characteristic, or variable, they do not have in common, which is then identified as the cause for their differences. From these two methods, comparativists have developed two approaches.

Book cover of John Stuart Mill's A System of Logic, Ratiocinative and Inductive, 1843

What Is the Most Similar Systems Design (MSSD)?

This approach is derived from Mill’s ‘method of difference’. In a Most Similar Systems Design, the cases selected for comparison are similar to each other, but their outcomes differ. In this approach we are interested in keeping as many of the variables as possible the same across the selected cases, which for comparative politics often means countries. Remember, the independent variable is the factor that doesn’t depend on changes in other variables. It is potentially the ‘cause’ in the cause-and-effect model. The dependent variable is the variable that is affected by, or dependent on, the presence of the independent variable. It is the ‘effect’. In a most similar systems approach, the background variables are held as constant as possible, so that the factor that does differ across the cases can be identified as the likely cause of the differing outcomes.

A good example involves the lack of a national healthcare system in the US. Other countries, such as New Zealand, Australia, Ireland, UK and Canada, all have robust, publicly accessible national health systems. However, the US does not. These countries all have similar systems: English heritage and language use, liberal market economies, strong democratic institutions, and high levels of wealth and education. Yet, despite these similarities, the end results vary. The US does not look like its peer countries. In other words, why do we have similar systems producing different outcomes?

What Is the Most Different Systems Design (MDSD)?

This approach is derived from Mill’s ‘method of agreement’. In a Most Different Systems Design, the cases selected are different from each other, but result in the same outcome. In this approach, we are interested in selecting cases that are quite different from one another, yet arrive at the same outcome. Thus, the dependent variable is the same. Different independent variables exist between the cases, such as democratic vs. authoritarian regime, or liberal market economy vs. non-liberal market economy. It could also include other variables such as societal homogeneity (uniformity) vs. societal heterogeneity (diversity), where a country may find itself unified ethnically/religiously/racially, or fragmented along those same lines.

A good example involves the countries that are classified as economically liberal. The Heritage Foundation lists countries such as Singapore, Taiwan, Estonia, Australia, New Zealand, Switzerland, Chile and Malaysia as either free or mostly free. These countries differ greatly from one another. Singapore and Malaysia are considered flawed or illiberal democracies (see chapter 5 for more discussion), whereas Estonia is still classified as a developing country. Australia and New Zealand are wealthy; Malaysia is not. Chile and Taiwan became economically free countries under authoritarian military regimes, which is not the case for Switzerland. In other words, why do we have different systems producing the same outcome?


Lau F, Kuziemsky C, editors. Handbook of eHealth Evaluation: An Evidence-based Approach [Internet]. Victoria (BC): University of Victoria; 2017 Feb 27.


Chapter 10. Methods for Comparative Studies

Francis Lau and Anne Holbrook.

10.1. Introduction

In eHealth evaluation, comparative studies aim to find out whether group differences in eHealth system adoption make a difference in important outcomes. These groups may differ in their composition, the type of system in use, and the setting where they work over a given time duration. The comparisons are to determine whether significant differences exist for some predefined measures between these groups, while controlling for as many of the conditions as possible such as the composition, system, setting and duration.

According to the typology by Friedman and Wyatt (2006), comparative studies take on an objective view where events such as the use and effect of an eHealth system can be defined, measured and compared through a set of variables to prove or disprove a hypothesis. For comparative studies, the design options are experimental versus observational and prospective versus retrospective. The quality of eHealth comparative studies depends on such aspects of methodological design as the choice of variables, sample size, sources of bias, confounders, and adherence to quality and reporting guidelines.

In this chapter we focus on experimental studies as one type of comparative study and their methodological considerations that have been reported in the eHealth literature. Also included are three case examples to show how these studies are done.

10.2. Types of Comparative Studies

Experimental studies are one type of comparative study where a sample of participants is identified and assigned to different conditions for a given time duration, then compared for differences. An example is a hospital with two care units where one is assigned a CPOE system to process medication orders electronically while the other continues its usual practice without a CPOE. The participants in the unit assigned to the CPOE are called the intervention group and those assigned to usual practice are the control group. The comparison can be performance or outcome focused, such as the ratio of correct orders processed or the occurrence of adverse drug events in the two groups during the given time period. Experimental studies can take on a randomized or non-randomized design. These are described below.

10.2.1. Randomized Experiments

In a randomized design, the participants are randomly assigned to two or more groups using a known randomization technique such as a random number table. The design is prospective in nature since the groups are assigned concurrently, after which the intervention is applied then measured and compared. Three types of experimental designs seen in eHealth evaluation are described below (Friedman & Wyatt, 2006; Zwarenstein & Treweek, 2009).

  • Randomized controlled trials (RCTs) – In RCTs, participants are randomly assigned to an intervention or a control group. The randomization can occur at the patient, provider or organization level, which is known as the unit of allocation. For instance, at the patient level one can randomly assign half of the patients to receive EMR reminders while the other half do not. At the provider level, one can assign half of the providers to receive the reminders while the other half continues with their usual practice. At the organization level, such as a multisite hospital, one can randomly assign EMR reminders to some of the sites but not others (a simple allocation sketch follows this list).
  • Cluster randomized controlled trials (cRCTs) – In cRCTs, clusters of participants are randomized rather than individual participants, since they are found in naturally occurring groups such as living in the same communities. For instance, clinics in one city may be randomized as a cluster to receive EMR reminders while clinics in another city continue their usual practice.
  • Pragmatic trials – Unlike RCTs that seek to find out if an intervention such as a CPOE system works under ideal conditions, pragmatic trials are designed to find out if the intervention works under usual conditions. The goal is to make the design and findings relevant to and practical for decision-makers to apply in usual settings. As such, pragmatic trials have few criteria for selecting study participants, flexibility in implementing the intervention, usual practice as the comparator, the same compliance and follow-up intensity as usual practice, and outcomes that are relevant to decision-makers.
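The allocation step itself is mechanically simple. Below is a minimal sketch (hypothetical patients and clinics, not from the chapter) contrasting individual-level assignment, as in an RCT, with cluster-level assignment, as in a cRCT.

```python
# Minimal sketch: individual vs. cluster randomization with hypothetical units.
import random

random.seed(7)

# Individual-level randomization (RCT): shuffle patients, assign half to each arm.
patients = [f"patient_{i}" for i in range(10)]
random.shuffle(patients)
half = len(patients) // 2
individual_allocation = {p: ("intervention" if i < half else "control")
                         for i, p in enumerate(patients)}

# Cluster-level randomization (cRCT): whole clinics are assigned, so every
# patient within a clinic receives the same assignment.
clinics = ["clinic_A", "clinic_B", "clinic_C", "clinic_D"]
random.shuffle(clinics)
cluster_allocation = {c: ("intervention" if i < len(clinics) // 2 else "control")
                      for i, c in enumerate(clinics)}

print(individual_allocation)
print(cluster_allocation)
```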

10.2.2. Non-randomized Experiments

A non-randomized design is used when it is not feasible or ethical to randomize participants into groups for comparison. It is sometimes referred to as a quasi-experimental design. The design can involve the use of prospective or retrospective data from the same or different participants as the control group. Three types of non-randomized designs are described below (Harris et al., 2006).

  • Intervention group only, with pretest and post-test design – This design involves only one group where a pretest or baseline measure is taken as the control period, the intervention is implemented, and a post-test measure is taken as the intervention period for comparison. For example, one can compare the rates of medication errors before and after the implementation of a CPOE system in a hospital. To increase study quality, one can add a second pretest period to decrease the probability that the pretest and post-test difference is due to chance, such as an unusually low medication error rate in the first pretest period. Other ways to increase study quality include adding an unrelated outcome such as patient case-mix that should not be affected, removing the intervention to see if the difference remains, and removing then re-implementing the intervention to see if the differences vary accordingly.
  • Intervention and control groups, with post-test design – This design involves two groups where the intervention is implemented in one group and compared with a second group without the intervention, based on a post-test measure from both groups. For example, one can implement a CPOE system in one care unit as the intervention group with a second unit as the control group and compare the post-test medication error rates in both units over six months. To increase study quality, one can add one or more pretest periods to both groups, or implement the intervention in the control group at a later time to measure for similar but delayed effects.
  • Interrupted time series (ITS) design – In an ITS design, multiple measures are taken from one group at equal time intervals, interrupted by the implementation of the intervention. The multiple pretest and post-test measures decrease the probability that the differences detected are due to chance or unrelated effects. An example is to take six consecutive monthly medication error rates as the pretest measures, implement the CPOE system, then take another six consecutive monthly medication error rates as the post-test measures for comparison of error rate differences over 12 months. To increase study quality, one may add a concurrent control group for comparison to be more convinced that the intervention produced the change. (A minimal segmented-regression sketch of an ITS analysis follows this list.)
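For the ITS design above, a common analysis is a segmented regression that estimates the pre-existing trend plus any level and slope change at the point of interruption. The sketch below uses synthetic monthly error rates and ordinary least squares via NumPy; it is an illustration of the general approach, not the method prescribed by the chapter.

```python
# Minimal sketch: segmented regression for an interrupted time series
# (synthetic monthly medication-error rates, six pre- and six post-CPOE).
import numpy as np

rates = np.array([5.1, 5.3, 4.9, 5.2, 5.0, 5.1,   # pre-implementation
                  4.2, 4.0, 4.1, 3.9, 3.8, 3.7])  # post-implementation
months = np.arange(len(rates))                     # 0..11, secular time trend
post = (months >= 6).astype(float)                 # 1 after the interruption
time_since = np.where(post == 1, months - 6, 0.0)  # allows a slope change after go-live

# Design matrix: intercept, underlying trend, level change, slope change.
X = np.column_stack([np.ones_like(months, dtype=float), months, post, time_since])
coef, *_ = np.linalg.lstsq(X, rates, rcond=None)
print(dict(zip(["intercept", "trend", "level_change", "slope_change"], coef.round(3))))
```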

10.3. Methodological Considerations

The quality of comparative studies is dependent on their internal and external validity. Internal validity refers to the extent to which conclusions can be drawn correctly from the study setting, participants, intervention, measures, analysis and interpretations. External validity refers to the extent to which the conclusions can be generalized to other settings. The major factors that influence validity are described below.

10.3.1. Choice of Variables

Variables are specific measurable features that can influence validity. In comparative studies, the choice of dependent and independent variables, and whether they are categorical and/or continuous in values, can affect the type of questions, study design and analysis to be considered. These are described below (Friedman & Wyatt, 2006).

  • Dependent variables – These are the outcomes of interest, also known as outcome variables. An example is the rate of medication errors as an outcome in determining whether CPOE can improve patient safety.
  • Independent variables – These are variables that can explain the measured values of the dependent variables. For instance, the characteristics of the setting, participants and intervention can influence the effects of CPOE.
  • Categorical variables – These are variables with measured values in discrete categories or levels. Examples are the type of providers (e.g., nurses, physicians and pharmacists), the presence or absence of a disease, and a pain scale (e.g., 0 to 10 in increments of 1). Categorical variables are analyzed using non-parametric methods such as the chi-square test and odds ratio.
  • Continuous variables – These are variables that can take on infinite values within an interval, limited only by the desired precision. Examples are blood pressure, heart rate and body temperature. Continuous variables are analyzed using parametric methods such as the t-test, analysis of variance or multiple regression (see the sketch after this list).
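As a brief illustration of the parametric/non-parametric distinction above, the sketch below (fabricated numbers, assuming SciPy is available) runs an independent-samples t-test on a continuous outcome and a chi-square test on a 2×2 table of categorical counts.

```python
# Minimal sketch: parametric test for a continuous outcome vs. non-parametric
# test for a categorical outcome (all values are fabricated for illustration).
from scipy import stats

# Continuous outcome: systolic blood pressure (mmHg) in two groups.
bp_intervention = [128, 131, 125, 122, 135, 127, 124, 130]
bp_control      = [138, 136, 141, 133, 139, 137, 142, 135]
t_stat, t_p = stats.ttest_ind(bp_intervention, bp_control)

# Categorical outcome: counts of medication errors vs. no errors in two groups.
#                errors  no errors
contingency = [[  12,     488],   # intervention
               [  25,     475]]   # control
chi2, chi_p, dof, expected = stats.chi2_contingency(contingency)

print(f"t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"chi2 = {chi2:.2f}, p = {chi_p:.4f}")
```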

10.3.2. Sample Size

Sample size is the number of participants to include in a study. It can refer to patients, providers or organizations depending on how the unit of allocation is defined. There are four parts to calculating sample size. They are described below (Noordzij et al., 2010).

  • Significance level – The probability that a positive finding is due to chance alone. It is usually set at 0.05, which means having a less than 5% chance of drawing a false positive conclusion.
  • Power – The ability to detect the true effect based on a sample from the population. It is usually set at 0.8, which means having at least an 80% chance of drawing a correct conclusion.
  • Effect size – The minimal clinically relevant difference that can be detected between comparison groups. For continuous variables, the effect is a numerical value such as a 10-kilogram weight difference between two groups. For categorical variables, it is a percentage such as a 10% difference in medication error rates.
  • Variability – The population variance of the outcome of interest, which is often unknown and is estimated by way of the standard deviation (SD) from pilot or previous studies for continuous outcomes. (A small calculator sketch combining these four ingredients follows this list.)
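The four ingredients combine into the standard two-group formula for a continuous outcome used in the Appendix. A minimal sketch, with the z multipliers for alpha = 0.05 (two-tailed) and power = 0.80 hard-coded as defaults:

```python
# Minimal sketch: per-group sample size for a continuous outcome,
# n = 2 * (z_alpha + z_beta)^2 * sd^2 / effect^2 (see the Appendix).
from math import ceil

def sample_size_continuous(effect, sd, z_alpha=1.96, z_beta=0.842):
    """Participants needed per group to detect `effect` given outcome SD `sd`."""
    n = 2 * ((z_alpha + z_beta) ** 2) * (sd ** 2) / (effect ** 2)
    return ceil(n)

# Appendix example: detect a 15 mmHg difference when the SD is 20 mmHg.
print(sample_size_continuous(effect=15, sd=20))  # 28 per group
```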

Table 10.1. Sample Size Equations for Comparing Two Groups with Continuous and Categorical Outcome Variables.


An example of a sample size calculation for an RCT to examine the effect of CDS (clinical decision support) on improving systolic blood pressure of hypertensive patients is provided in the Appendix. Refer to the Biomath website from Columbia University (n.d.) for a simple Web-based sample size / power calculator.

10.3.3. Sources of Bias

There are five common sources of bias in comparative studies. They are selection, performance, detection, attrition and reporting biases (Higgins & Green, 2011). These biases, and the ways to minimize them, are described below (Vervloet et al., 2012).

  • Selection or allocation bias – Differences between the composition of comparison groups in terms of their response to the intervention. An example is having sicker or older patients in the control group than in the intervention group when evaluating the effect of EMR reminders. To reduce selection bias, one can apply randomization and concealment when assigning participants to groups and ensure their compositions are comparable at baseline.
  • Performance bias – Differences between groups in the care they received, aside from the intervention being evaluated. An example is the different ways by which reminders are triggered and used within and across groups, such as electronic, paper and phone reminders for patients and providers. To reduce performance bias, one may standardize the intervention and blind participants from knowing whether an intervention was received and which intervention was received.
  • Detection or measurement bias – Differences between groups in how outcomes are determined. An example is where outcome assessors pay more attention to outcomes of patients known to be in the intervention group. To reduce detection bias, one may blind assessors from participants when measuring outcomes and ensure the same timing for assessment across groups.
  • Attrition bias – Differences between groups in the ways that participants are withdrawn from the study. An example is a low rate of participant response in the intervention group despite having received reminders for follow-up care. To reduce attrition bias, one needs to acknowledge the dropout rate and analyze data according to an intent-to-treat principle (i.e., include data from those who dropped out in the analysis).
  • Reporting bias – Differences between reported and unreported findings. Examples include biases in publication, time lag, citation, language and outcome reporting depending on the nature and direction of the results. To reduce reporting bias, one may make the study protocol available with all pre-specified outcomes and report all expected outcomes in published results.

10.3.4. Confounders

Confounders are factors other than the intervention of interest that can distort the effect because they are associated with both the intervention and the outcome. For instance, in a study to demonstrate whether the adoption of a medication order entry system led to lower medication costs, there can be a number of potential confounders that affect the outcome. These may include the severity of illness of the patients, provider knowledge and experience with the system, and hospital policy on prescribing medications (Harris et al., 2006). Another example is the evaluation of the effect of an antibiotic reminder system on the rate of post-operative deep venous thromboses (DVTs). The confounders can be general improvements in clinical practice during the study, such as prescribing patterns and post-operative care, that are not related to the reminders (Friedman & Wyatt, 2006).

To control for confounding effects, one may consider the use of matching, stratification and modelling. Matching involves the selection of similar groups with respect to their composition and behaviours. Stratification involves the division of participants into subgroups by selected variables, such as a comorbidity index to control for severity of illness. Modelling involves the use of statistical techniques such as multiple regression to adjust for the effects of specific variables such as age, sex and/or severity of illness (Higgins & Green, 2011).
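As a small illustration of the modelling strategy, the sketch below (synthetic data, not from any of the cited studies) regresses a cost outcome on an intervention indicator plus a severity confounder, and contrasts the adjusted coefficient with a naive difference in group means.

```python
# Minimal sketch: adjusting for a confounder by including it in a regression model.
import numpy as np

rng = np.random.default_rng(0)
n = 200
severity = rng.normal(0, 1, n)                                       # confounder
intervention = (severity + rng.normal(0, 1, n) > 0).astype(float)    # sicker units adopt more
cost = 100 - 5 * intervention + 8 * severity + rng.normal(0, 3, n)   # true effect: -5

# Regression of cost on intervention and severity (ordinary least squares).
X = np.column_stack([np.ones(n), intervention, severity])
coef, *_ = np.linalg.lstsq(X, cost, rcond=None)
print(f"adjusted intervention effect ≈ {coef[1]:.2f}")               # close to -5

# The unadjusted comparison of group means is distorted by severity.
unadjusted = cost[intervention == 1].mean() - cost[intervention == 0].mean()
print(f"unadjusted difference ≈ {unadjusted:.2f}")
```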

10.3.5. Guidelines on Quality and Reporting

There are guidelines on the quality and reporting of comparative studies. The GRADE (Grading of Recommendations Assessment, Development and Evaluation) guidelines provide explicit criteria for rating the quality of studies in randomized trials and observational studies (Guyatt et al., 2011). The extended CONSORT (Consolidated Standards of Reporting Trials) Statements for non-pharmacologic trials (Boutron, Moher, Altman, Schulz, & Ravaud, 2008), pragmatic trials (Zwarenstein et al., 2008), and eHealth interventions (Baker et al., 2010) provide reporting guidelines for randomized trials.

The GRADE guidelines offer a system for rating the quality of evidence in systematic reviews and guidelines. In this approach, to support estimates of intervention effects, RCTs start as high-quality evidence and observational studies as low-quality evidence. For each outcome in a study, five factors may rate down the quality of evidence. The final quality of evidence for each outcome falls into one of high, moderate, low, or very low quality. These factors are listed below (for more details on the rating system, refer to Guyatt et al., 2011).

  • Design limitations – For RCTs these cover the lack of allocation concealment, lack of blinding, large loss to follow-up, a trial stopped early, or selective outcome reporting.
  • Inconsistency of results – Variations in outcomes due to unexplained heterogeneity. An example is the unexpected variation of effects across subgroups of patients by severity of illness in the use of preventive care reminders.
  • Indirectness of evidence – Reliance on indirect comparisons due to restrictions in study populations, intervention, comparator or outcomes. An example is the 30-day readmission rate as a surrogate outcome for the quality of computer-supported emergency care in hospitals.
  • Imprecision of results – Studies with small sample sizes and few events typically have wide confidence intervals and are considered of low quality.
  • Publication bias – The selective reporting of results at the individual study level is already covered under design limitations, but is included here for completeness as it is relevant when rating quality of evidence across studies in systematic reviews.

The original CONSORT Statement has 22 checklist items for reporting RCTs. For non-pharmacologic trials, extensions have been made to 11 items. For pragmatic trials, extensions have been made to eight items. These items are listed below. For further details, readers can refer to Boutron and colleagues (2008) and the CONSORT website (CONSORT, n.d.).

  • Title and abstract – one item on the means of randomization used.
  • Introduction – one item on background, rationale, and the problem addressed by the intervention.
  • Methods – 10 items on participants, interventions, objectives, outcomes, sample size, randomization (sequence generation, allocation concealment, implementation), blinding (masking), and statistical methods.
  • Results – seven items on participant flow, recruitment, baseline data, numbers analyzed, outcomes and estimation, ancillary analyses, and adverse events.
  • Discussion – three items on interpretation, generalizability, and overall evidence.

The CONSORT Statement for eHealth interventions describes the relevance of the CONSORT recommendations to the design and reporting of eHealth studies, with an emphasis on Internet-based interventions for direct use by patients, such as online health information resources, decision aids and PHRs (personal health records). Of particular importance is the need to clearly define the intervention components, their role in the overall care process, target population, implementation process, primary and secondary outcomes, denominators for outcome analyses, and real-world potential (for details refer to Baker et al., 2010).

10.4. Case Examples

10.4.1. Pragmatic RCT in Vascular Risk Decision Support

Holbrook and colleagues (2011) conducted a pragmatic RCT to examine the effects of a CDS intervention on vascular care and outcomes for older adults. The study is summarized below.

  • Setting – Community-based primary care practices with EMRs in one Canadian province.
  • Participants – English-speaking patients 55 years of age or older with diagnosed vascular disease, no cognitive impairment and not living in a nursing home, who had a provider visit in the past 12 months.
  • Intervention – A Web-based individualized vascular tracking and advice CDS system covering eight top vascular risk factors and two diabetic risk factors, for use by both providers and patients and their families. Providers and staff could update the patient’s profile at any time, and the CDS algorithm ran nightly to update the recommendations and the colour highlighting used in the tracker interface. Intervention patients had Web access to the tracker, a print version mailed to them prior to the visit, and telephone support on advice.
  • Design – Pragmatic, one-year, two-arm, multicentre RCT, with randomization upon patient consent by phone, using an allocation-concealed online program. Randomization was by patient with stratification by provider using a block size of six. Trained reviewers examined EMR data and conducted patient telephone interviews to collect risk factors, vascular history, and vascular events. Providers completed questionnaires on the intervention at study end. Patients had final 12-month lab checks on urine albumin, low-density lipoprotein cholesterol, and A1c levels.
  • Outcomes – The primary outcome was based on change in a process composite score (PCS) computed as the sum of frequency-weighted process scores for each of the eight main risk factors, with a maximum score of 27. The process was considered met if a risk factor had been checked. PCS was measured at baseline and study end, with the difference as the individual primary outcome score. The main secondary outcome was a clinical composite score (CCS) based on the same eight risk factors, compared in two ways: a comparison of the mean number of clinical variables on target, and the percentage of patients with improvement between the two groups. Other secondary outcomes were actual vascular event rates, individual PCS and CCS components, ratings of usability, continuity of care, patient ability to manage vascular risk, and quality of life using the EuroQol five dimensions questionnaire (EQ-5D).
  • Analysis – 1,100 patients were needed to achieve 90% power in detecting a one-point PCS difference between groups, with a standard deviation of five points, a two-tailed t-test for mean difference at the 5% significance level, and a withdrawal rate of 10%. The PCS, CCS and EQ-5D scores were analyzed using a generalized estimating equation accounting for clustering within providers (a small sketch of this kind of model follows this list). Descriptive statistics and χ2 tests or exact tests were used for other outcomes.
  • Findings – 1,102 patients and 49 providers enrolled in the study. The intervention group, with 545 patients, had significant PCS improvement with a difference of 4.70 (p < .001) on a 27-point scale. The intervention group also had significantly higher odds of rating improvements in their continuity of care (4.178, p < .001) and ability to improve their vascular health (3.07, p < .001). There was no significant change in vascular events, clinical variables or quality of life. Overall the CDS intervention led to reduced vascular risks but not to improved clinical outcomes in a one-year follow-up.
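The primary analysis above used a generalized estimating equation (GEE) with clustering by provider. The sketch below shows what such a model can look like with the statsmodels library on synthetic data; the variable names and effect sizes are hypothetical and only loosely echo the trial.

```python
# Minimal sketch: a GEE for a patient-level change score with clustering by provider
# (synthetic data; not the trial's actual dataset or code).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_providers, patients_per = 49, 22
provider = np.repeat(np.arange(n_providers), patients_per)     # cluster labels
group = rng.integers(0, 2, provider.size)                      # 1 = intervention, 0 = control
provider_effect = rng.normal(0, 1, n_providers)[provider]      # within-provider correlation
pcs_change = 1.5 + 4.7 * group + provider_effect + rng.normal(0, 5, provider.size)

X = sm.add_constant(pd.DataFrame({"group": group}))
model = sm.GEE(pcs_change, X, groups=provider,
               family=sm.families.Gaussian(),
               cov_struct=sm.cov_struct.Exchangeable())
print(model.fit().summary())
```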

10.4.2. Non-randomized Experiment in Antibiotic Prescribing in Primary Care

Mainous, Lambourne, and Nietert (2013) conducted a prospective non-randomized trial to examine the impact of a CDS system on antibiotic prescribing for acute respiratory infections (ARIs) in primary care. The study is summarized below.

  • Setting – A primary care research network in the United States whose members use a common EMR and pool data quarterly for quality improvement and research studies.
  • Participants – An intervention group with nine practices across nine states, and a control group with 61 practices.
  • Intervention – A point-of-care CDS tool implemented as customizable progress note templates based on existing EMR features. The CDS recommendations reflect Centers for Disease Control and Prevention (CDC) guidelines based on a patient’s predominant presenting symptoms and age. The CDS was used to assist in ARI diagnosis, prompt antibiotic use, record diagnosis and treatment decisions, and access printable patient and provider education resources from the CDC.
  • Design – The intervention group received a multi-method intervention to facilitate provider CDS adoption that included quarterly audit and feedback, best-practice dissemination meetings, academic detailing site visits, performance review and CDS training. The control group did not receive information on the intervention, the CDS or the education. Baseline data collection lasted three months, with follow-up of 15 months after CDS implementation.
  • Outcomes – The outcomes were the frequency of inappropriate prescribing during an ARI episode, broad-spectrum antibiotic use, and diagnostic shift. Inappropriate prescribing was computed by dividing the number of ARI episodes with diagnoses in the inappropriate category that had an antibiotic prescription by the total number of ARI episodes with diagnoses for which antibiotics are inappropriate. Broad-spectrum antibiotic use was computed by dividing all ARI episodes with a broad-spectrum antibiotic prescription by the total number of ARI episodes with an antibiotic prescription. Antibiotic drift was computed in two ways: dividing the number of ARI episodes with diagnoses where antibiotics are appropriate by the total number of ARI episodes with an antibiotic prescription; and dividing the number of ARI episodes where antibiotics were inappropriate by the total number of ARI episodes. Process measures included the frequency of CDS template use and whether the outcome measures differed by CDS usage.
  • Analysis – Outcomes were measured quarterly for each practice, weighted by the number of ARI episodes during the quarter to assign greater weight to practices and periods with greater numbers of relevant episodes. Weighted means and 95% CIs were computed separately for adult and pediatric (less than 18 years of age) patients for each time period for both groups. Baseline means in outcome measures were compared between the two groups using weighted independent-sample t-tests. Linear mixed models were used to compare changes over the 18-month period. The models included time and intervention status, and were adjusted for practice characteristics such as specialty, size, region and baseline ARIs. Random practice effects were included to account for clustering of repeated measures on practices over time. P-values of less than 0.05 were considered significant.
  • Findings – For adult patients, inappropriate prescribing in ARI episodes declined more in the intervention group (-0.6%) than in the control group (4.2%) (p = 0.03), and prescribing of broad-spectrum antibiotics declined by 16.6% in the intervention group versus an increase of 1.1% in the control group (p < 0.0001). For pediatric patients, there was a similar decline of 19.7% in the intervention group versus an increase of 0.9% in the control group (p < 0.0001). In summary, the CDS had a modest effect in reducing inappropriate prescribing for adults, but had a substantial effect in reducing the prescribing of broad-spectrum antibiotics in adult and pediatric patients.

10.4.3. Interrupted Time Series on EHR Impact in Nursing Care

Dowding, Turley, and Garrido (2012) conducted a prospective ITS study to examine the impact of EHR implementation on nursing care processes and outcomes. The study is summarized below.

  • Setting – Kaiser Permanente (KP), a large not-for-profit integrated healthcare organization in the United States.
  • Participants – 29 KP hospitals in the northern and southern regions of California.
  • Intervention – An integrated EHR system implemented at all hospitals with CPOE, nursing documentation and risk assessment tools. The nursing component for risk assessment documentation of pressure ulcers and falls was consistent across hospitals and was developed by clinical nurses and informaticists by consensus.
  • Design – ITS design with monthly data on pressure ulcers and quarterly data on fall rates and risk collected over seven years between 2003 and 2009. All data were collected at the unit level for each hospital.
  • Outcomes – Process measures were the proportion of patients with a fall risk assessment done and the proportion with a hospital-acquired pressure ulcer (HAPU) risk assessment done within 24 hours of admission. Outcome measures were fall and HAPU rates as part of the unit-level nursing care process and nursing-sensitive outcome data collected routinely for all California hospitals. Fall rate was defined as the number of unplanned descents to the floor per 1,000 patient days, and HAPU rate was the percentage of patients with a stage I-IV or unstageable ulcer on the day of data collection.
  • Analysis – Fall and HAPU risk data were synchronized using the month in which the EHR was implemented at each hospital as time zero, and aggregated across hospitals for each time period. Multivariate regression analysis was used to examine the effect of time, region and EHR.
  • Findings – The EHR was associated with a significant increase in documentation rates for HAPU risk (2.21; 95% CI 0.67 to 3.75) and a non-significant increase for fall risk (0.36; -3.58 to 4.30). The EHR was associated with a 13% decrease in HAPU rates (-0.76; -1.37 to -0.16) but no change in fall rates (-0.091; -0.29 to 0.11). Hospital region was a significant predictor of variation for HAPU (0.72; 0.30 to 1.14) and fall rates (0.57; 0.41 to 0.72). During the study period, HAPU rates decreased significantly (-0.16; -0.20 to -0.13) but fall rates did not (0.0052; -0.01 to 0.02). In summary, EHR implementation was associated with a reduction in the number of HAPUs but not patient falls, and changes over time and hospital region also affected outcomes.

10.5. Summary

In this chapter we introduced randomized and non-randomized experimental designs as two types of comparative studies used in eHealth evaluation. Randomization is the highest quality design as it reduces bias, but it is not always feasible. The methodological issues addressed include choice of variables, sample size, sources of biases, confounders, and adherence to reporting guidelines. Three case examples were included to show how eHealth comparative studies are done.

  • Baker T. B., Gustafson D. H., Shaw B., Hawkins R., Pingree S., Roberts L., Strecher V. Relevance of CONSORT reporting criteria for research on eHealth interventions. Patient Education and Counselling. 2010;81(suppl. 7):77–86.
  • Boutron I., Moher D., Altman D. G., Schulz K. F., Ravaud P., CONSORT Group. Extending the CONSORT statement to randomized trials of nonpharmacologic treatment: Explanation and elaboration. Annals of Internal Medicine. 2008;148(4):295–309.
  • Columbia University. (n.d.). Statistics: sample size / power calculation. Biomath (Division of Biomathematics/Biostatistics), Department of Pediatrics. New York: Columbia University Medical Centre. Retrieved from http://www.biomath.info/power/index.htm
  • CONSORT Group. (n.d.). The CONSORT statement. Retrieved from http://www.consort-statement.org/
  • Dowding D. W., Turley M., Garrido T. The impact of an electronic health record on nurse sensitive patient outcomes: an interrupted time series analysis. Journal of the American Medical Informatics Association. 2012;19(4):615–620.
  • Friedman C. P., Wyatt J. C. Evaluation methods in biomedical informatics. 2nd ed. New York: Springer Science + Business Media, Inc.; 2006.
  • Guyatt G., Oxman A. D., Akl E. A., Kunz R., Vist G., Brozek J., et al., Schunemann H. J. GRADE guidelines: 1. Introduction – GRADE evidence profiles and summary of findings tables. Journal of Clinical Epidemiology. 2011;64(4):383–394.
  • Harris A. D., McGregor J. C., Perencevich E. N., Furuno J. P., Zhu J., Peterson D. E., Finkelstein J. The use and interpretation of quasi-experimental studies in medical informatics. Journal of the American Medical Informatics Association. 2006;13(1):16–23.
  • Higgins J. P. T., Green S., editors. Cochrane handbook for systematic reviews of interventions (Version 5.1.0, updated March 2011). London: The Cochrane Collaboration; 2011. Retrieved from http://handbook.cochrane.org/
  • Holbrook A., Pullenayegum E., Thabane L., Troyan S., Foster G., Keshavjee K., et al., Curnew G. Shared electronic vascular risk decision support in primary care: Computerization of medical practices for the enhancement of therapeutic effectiveness (COMPETE III) randomized trial. Archives of Internal Medicine. 2011;171(19):1736–1744.
  • Mainous III A. G., Lambourne C. A., Nietert P. J. Impact of a clinical decision support system on antibiotic prescribing for acute respiratory infections in primary care: quasi-experimental trial. Journal of the American Medical Informatics Association. 2013;20(2):317–324.
  • Noordzij M., Tripepi G., Dekker F. W., Zoccali C., Tanck M. W., Jager K. J. Sample size calculations: basic principles and common pitfalls. Nephrology Dialysis Transplantation. 2010;25(5):1388–1393.
  • Vervloet M., Linn A. J., van Weert J. C. M., de Bakker D. H., Bouvy M. L., van Dijk L. The effectiveness of interventions using electronic reminders to improve adherence to chronic medication: A systematic review of the literature. Journal of the American Medical Informatics Association. 2012;19(5):696–704.
  • Zwarenstein M., Treweek S., Gagnier J. J., Altman D. G., Tunis S., Haynes B., Oxman A. D., Moher D., for the CONSORT and Pragmatic Trials in Healthcare (Practihc) groups. Improving the reporting of pragmatic trials: an extension of the CONSORT statement. British Medical Journal. 2008;337:a2390.
  • Zwarenstein M., Treweek S. What kind of randomized trials do we need? Canadian Medical Association Journal. 2009;180(10):998–1000.

Appendix. Example of Sample Size Calculation

This is an example of a sample size calculation for an RCT that examines the effect of a CDS system on reducing systolic blood pressure in hypertensive patients. The case is adapted from the example described in the publication by Noordzij et al. (2010).

(a) Systolic blood pressure as a continuous outcome measured in mmHg

Based on similar studies in the literature with similar patients, the systolic blood pressure values from the comparison groups are expected to be normally distributed with a standard deviation of 20 mmHg. The evaluator wishes to detect a clinically relevant difference of 15 mmHg in systolic blood pressure between the intervention group with CDS and the control group without CDS. Assuming a significance level (alpha) of 0.05 for a two-tailed t-test and a power of 0.80, the corresponding multipliers are 1.96 and 0.842, respectively. Using the sample size equation for a continuous outcome below, we can calculate the sample size needed for the above study.

n = 2[(a + b)²σ²] / (μ₁ − μ₂)²

where:

n = sample size for each group

μ₁ = population mean systolic blood pressure in the intervention group

μ₂ = population mean systolic blood pressure in the control group

μ₁ − μ₂ = desired difference in mean systolic blood pressure between the groups

σ = population standard deviation (σ² is the variance)

a = multiplier for the significance level (alpha)

b = multiplier for the power (1 − beta)

Substituting the values into the equation gives a sample size of 28 per group:

n = 2[(1.96 + 0.842)² × 20²] / 15² ≈ 28 samples per group

(b) Systolic blood pressure as a categorical outcome measured as below or above 140 mmHg (i.e., hypertension yes/no)

In this example, a systolic blood pressure above 140 mmHg is considered an event, i.e., the patient is classified as hypertensive. Based on the published literature, the proportion of patients in the general population with hypertension is 30%. The evaluator wishes to detect a clinically relevant difference of 10% in the proportion of hypertensive patients between the intervention group with CDS and the control group without CDS. This means the expected proportion of patients with hypertension is 20% (p₁ = 0.2) in the intervention group and 30% (p₂ = 0.3) in the control group. Assuming a significance level (alpha) of 0.05 for a two-tailed test and a power of 0.80, the corresponding multipliers are again 1.96 and 0.842. Using the sample size equation for a categorical outcome below, we can calculate the sample size needed for the above study.

n = [(a + b)²(p₁q₁ + p₂q₂)] / (p₁ − p₂)²

where:

p₁ = proportion of patients with hypertension in the intervention group

q₁ = proportion of patients without hypertension in the intervention group (1 − p₁)

p₂ = proportion of patients with hypertension in the control group

q₂ = proportion of patients without hypertension in the control group (1 − p₂)

p₁ − p₂ = desired difference in the proportion of hypertensive patients between the two groups

a and b are the same multipliers as in part (a).

Substituting the values into the equation gives a sample size of 291 per group:

n = [(1.96 + 0.842)² × ((0.2)(0.8) + (0.3)(0.7))] / (0.1)² ≈ 291 samples per group

Note: the multipliers for common significance levels and power values are taken from Table 3 on p. 1392 of Noordzij et al. (2010).
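A quick numerical check of both Appendix calculations (a minimal sketch using the same multipliers):

```python
# Minimal sketch: verify the two Appendix sample size calculations.
from math import ceil

a, b = 1.96, 0.842  # multipliers for alpha = 0.05 (two-tailed) and power = 0.80

# (a) Continuous outcome: SD = 20 mmHg, detectable difference = 15 mmHg.
n_continuous = 2 * ((a + b) ** 2) * (20 ** 2) / (15 ** 2)
print(ceil(n_continuous))   # 28 per group

# (b) Categorical outcome: p1 = 0.2, p2 = 0.3, detectable difference = 0.1.
p1, p2 = 0.2, 0.3
n_categorical = ((a + b) ** 2) * (p1 * (1 - p1) + p2 * (1 - p2)) / ((p1 - p2) ** 2)
print(ceil(n_categorical))  # 291 per group
```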



  • Open access
  • Published: 25 February 2020

Writing impact case studies: a comparative study of high-scoring and low-scoring case studies from REF2014

  • Bella Reichard, Mark S. Reed, Jenn Chubb, Ged Hall, Lucy Jowett, Alisha Peart & Andrea Whittle

Palgrave Communications, volume 6, Article number: 31 (2020)


This paper reports on two studies that used qualitative thematic and quantitative linguistic analysis, respectively, to assess the content and language of the largest ever sample of graded research impact case studies, from the UK Research Excellence Framework 2014 (REF). The paper provides the first empirical evidence across disciplinary main panels of statistically significant linguistic differences between high- versus low-scoring case studies, suggesting that implicit rules linked to written style may have contributed to scores alongside the published criteria on the significance, reach and attribution of impact. High-scoring case studies were more likely to provide specific and high-magnitude articulations of significance and reach than low-scoring cases. High-scoring case studies contained attributional phrases which were more likely to attribute research and/or pathways to impact, and they were written more coherently (containing more explicit causal connections between ideas and more logical connectives) than low-scoring cases. High-scoring case studies appear to have conformed to a distinctive new genre of writing, which was clear and direct, and often simplified in its representation of causality between research and impact, and less likely to contain expressions of uncertainty than typically associated with academic writing. High-scoring case studies in two Main Panels were significantly easier to read than low-scoring cases on the Flesch Reading Ease measure, although both high-scoring and low-scoring cases tended to be of “graduate” reading difficulty. The findings of our work enable impact case study authors to better understand the genre and make content and language choices that communicate their impact as effectively as possible. While directly relevant to the assessment of impact in the UK’s Research Excellence Framework, the work also provides insights of relevance to institutions internationally who are designing evaluation frameworks for research impact.
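One of the measures referred to in the abstract, the Flesch Reading Ease score, is computed from average sentence length and average syllables per word. The sketch below is a rough, self-contained approximation (it is not the authors' implementation, and the syllable count is a crude vowel-group heuristic):

```python
# Minimal sketch: Flesch Reading Ease,
# 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
import re

def flesch_reading_ease(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    # Approximate syllables by counting vowel groups in each word (rough heuristic).
    syllables = sum(max(1, len(re.findall(r"[aeiouyAEIOUY]+", w))) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

sample = ("This research changed national policy. "
          "The findings were adopted by three agencies and improved patient care.")
print(round(flesch_reading_ease(sample), 1))
```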


Introduction

Academics are under increasing pressure to engage with non-academic actors to generate “usable” knowledge that benefits society and addresses global challenges (Clark et al., 2016 ; Lemos, 2015 ; Rau et al., 2018 ). This is largely driven by funders and governments that seek to justify the societal value of public funding for research (Reed et al., 2020 ; Smith et al., 2011 ) often characterised as ‘impact’. While this has sometimes been defined narrowly as reflective of the need to demonstrate a return on public investment in research (Mårtensson et al., 2016 ; Tsey et al., 2016 ; Warry, 2006 ), there is also a growing interest in the evaluation of “broader impacts” from research (cf. Bozeman and Youtie, 2017 ; National Science Foundation, 2014 ), including less tangible but arguably equally relevant benefits for society and culture. This shift is exemplified by the assessment of impact in the UK’s Research Excellence Framework (REF) in 2014 and 2021, the system for assessing the quality of research in UK higher education institutions, and in the rise of similar policies and evaluation systems in Australia, Hong Kong, the United States, Horizon Europe, The Netherlands, Sweden, Italy, Spain and elsewhere (Reed et al., 2020 ).

The evaluation of research impact in the UK has been criticised by scholars largely for its association with a ‘market logic’ (Olssen and Peters, 2005; Rhoads and Torres, 2005). Critics argue that a focus on academic performativity can be seen to “destabilise” professional identities (Chubb and Watermeyer, 2017), which in the context of research impact evaluation can further “dehumanise and deprofessionalise” academic performance (Watermeyer, 2019), whilst leading to negative unintended consequences (which Derrick et al., 2018, called “grimpact”). MacDonald (2017), Chubb and Reed (2018) and Weinstein et al. (2019) reported concerns from researchers that the impact agenda may be distorting research priorities, “encourag[ing] less discovery-led research” (Weinstein et al., 2019, p. 94), though these concerns were questioned by university managers in the same study, who were reported to “not have enough evidence to support that REF was driving specific research agendas in either direction” (p. 94), and further questioned by Hill (2016).

Responses to this critique have been varied. Some have called for civil disobedience (Watermeyer, 2019 ) and organised resistance (Back, 2015 ; MacDonald, 2017 ) against the impact agenda. In a review of Watermeyer ( 2019 ), Reed ( 2019 ) suggested that attitudes towards the neoliberal political roots of the impact agenda may vary according to the (political) values and beliefs of researchers, leading them to pursue impacts that either support or oppose neoliberal political and corporate interests. Some have defended the benefits of research impact evaluation. For example, Weinstein et al. ( 2019 ) found that “a focus on changing the culture outside of academia is broadly valued” by academics and managers. The impact agenda might enhance stakeholder engagement (Hill, 2016 ) and give “new currency” to applied research (Chubb, 2017 ; Watermeyer, 2019 ). Others have highlighted the long-term benefits for society of incentivising research impact, including increased public support and funding for a more accountable, outward-facing research system (Chubb and Reed, 2017 ; Hill, 2016 ; Nesta, 2018 ; Oancea, 2010 , 2014 ; Wilsdon et al., 2015 ).

In the UK REF, research outputs and impact are peer reviewed at disciplinary level in ‘Units of Assessment’ (36 in 2014, 34 in 2021), grouped into four ‘Main Panels’. Impact is assessed through case studies that describe the effects of academic research and are given a score between 1* (“recognised but modest”) and 4* (“outstanding”). The case studies follow a set structure of five sections: 1—Summary of the impact; 2—Underpinning research; 3—References to the research; 4—Details of the impact; 5—Sources to corroborate the impact (HEFCE, 2011 ). The publication of over 6000 impact case studies in 2014 Footnote 1 by Research England (formerly Higher Education Funding Council for England, HEFCE) was unique in terms of its size, and unlike the recent selective publication of high-scoring case studies from Australia’s 2018 Engagement and Impact Assessment, both high-scoring and low-scoring case studies were published. This provides a unique opportunity to evaluate the construction of case studies that were perceived by evaluation panels to have successfully demonstrated impact, as evidenced by a 4* rating, and to compare these to case studies that were judged as less successful.

The analysis of case studies included in this research is based on the definition of impact used in REF2014, as “an effect on, change or benefit to the economy, society, culture, public policy or services, health, the environment or quality of life, beyond academia” (HEFCE, 2011 , p. 26). According to REF2014 guidance, the primary functions of an impact case study were to articulate and evidence the significance and reach of impacts arising from research beyond academia, clearly demonstrating the contribution that research from a given institution contributed to those impacts (HEFCE, 2011 ).

In addition to these explicit criteria driving the evaluation of impact in REF2014, a number of analyses have emphasised the role of implicit criteria and subjectivity in shaping the evaluation of impact. For example, Pidd and Broadbent ( 2015 ) emphasised the implicit role a “strong narrative” plays in high-scoring case studies (p. 575). This was echoed by the fears of one REF2014 panellist interviewed by Watermeyer and Chubb ( 2018 ) who said, “I think with impact it is literally so many words of persuasive narrative” as opposed to “giving any kind of substance” (p. 9). Similarly, Watermeyer and Hedgecoe ( 2016 ), reporting on an internal exercise at Cardiff University to evaluate case studies prior to submission, emphasised that “style and structure” were essential to “sell impact”, and that “case studies that best sold impact were those rewarded with the highest evaluative scores” (p. 651).

Recent research based on interviews with REF2014 panellists has also emphasised the subjectivity of the peer-review process used to evaluate impact. Derrick’s ( 2018 ) research findings based on panellist interviews and participant observation of REF2014 sub-panels argued that scores were strongly influenced by who the evaluators were and how the group assessed impact together. Indeed, a panellist interviewed by Watermeyer and Chubb ( 2018 ) concurred that “the panel had quite an influence on the criteria” (p. 7), including an admission that some types of (more intangible) evidence were more likely to be overlooked than other (more concrete) forms of evidence, “privileg[ing] certain kinds of impact”. Other panellists interviewed spoke of their emotional and intellectual vulnerability in making judgements about an impact criterion that they had little prior experience of assessing (Watermeyer and Chubb, 2018 ). Derrick ( 2018 ) argued that this led many evaluators to base their assessments on more familiar proxies for excellence linked to scientific excellence, which led to biased interpretations and shortcuts that mimicked “groupthink” (p. 193).

This paper will for the first time empirically assess the content and language of the largest possible sample of research impact case studies that received high versus low scores from assessment panels in REF2014. Combining qualitative thematic and quantitative linguistic analysis, we ask:

1. How do high-scoring versus low-scoring case studies articulate and evidence impacts linked to underpinning research?

2. Do high-scoring and low-scoring case studies have differences in their linguistic features or styles?

3. Do high-scoring and low-scoring case studies have lexical differences (words and phrases that are statistically more likely to occur in high- or low-scoring cases) or text-level differences (including reading ease, narrative clarity, use of cohesive devices)?

By answering these questions, our goal is to provide evidence for impact case study authors and their institutions to reflect on in order to optimally balance the content and to use language that communicates their impact as effectively as possible. While directly relevant to the assessment of impact in the UK’s REF, the work also provides insights of relevance to institutions internationally who are designing evaluation frameworks for research impact.

Research design and sample

The datasets were generated by using published institutional REF2014 impact scores to deduce the scores of some impact case studies themselves. Although scores for individual case studies were not made public, we were able to identify case studies that received the top mark of 4* based on the distribution of scores received by some institutions, where the whole submission by an institution in a given Unit of Assessment was awarded the same score. In those 20 Units of Assessment (henceforth UoA) where high-scoring case studies could be identified in this way, we also accessed all case studies known to have scored either 1* or 2* in order to compare the features of high-scoring case studies to those of low-scoring case studies.
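
To make the deduction step concrete, the sketch below shows one way it could be implemented in Python. The data layout, threshold logic and function names are illustrative assumptions, not the procedure actually used in the study.

```python
# Minimal sketch (not the authors' code): deduce individual case-study scores
# from published institutional impact sub-profiles. A case study's score is
# only knowable when every case study in that institution/UoA submission
# received the same star rating (i.e. 100% of the profile sits in one band).

def deducible_scores(impact_profiles):
    """impact_profiles maps (institution, uoa) -> {star_band: percentage}.

    Returns the submissions whose individual case-study scores can be deduced,
    together with the single star band shared by all of their case studies.
    """
    deduced = {}
    for submission, profile in impact_profiles.items():
        bands_present = [band for band, pct in profile.items() if pct > 0]
        if len(bands_present) == 1:          # whole submission in one band
            deduced[submission] = bands_present[0]
    return deduced

# Hypothetical example: the first submission is entirely 4*, so all of its
# case studies are known to be high-scoring; the second cannot be deduced.
profiles = {
    ("University A", "UoA 3"): {"4*": 100.0, "3*": 0.0, "2*": 0.0, "1*": 0.0},
    ("University B", "UoA 3"): {"4*": 50.0, "3*": 25.0, "2*": 25.0, "1*": 0.0},
}
print(deducible_scores(profiles))   # {('University A', 'UoA 3'): '4*'}
```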

We approached our research questions with two separate studies, using quantitative linguistic and qualitative thematic analysis respectively. The thematic analysis, explained in more detail in the section “Qualitative thematic analysis” below, allowed us to find answers to research question 1 (see above). The quantitative linguistic analysis was used to extract and compare typical word combinations for high-scoring and low-scoring case studies, as well as assessing their readability. It mainly addressed research questions 2 and 3.

The quantitative linguistic analysis was based on a sample of all identifiable high-scoring case studies in any UoA ( n  = 124) and all identifiable low-scoring impact case studies in those UoAs where high-scoring case studies could be identified ( n  = 93). As the linguistic analysis focused on identifying characteristic language choices in running text, only those sections designed to contain predominantly text were included (1—Summary of the impact; 2—Underpinning research; 4—Details of the impact). Figure 1 shows the distribution of case studies across Main Panels in the quantitative analysis. Table 1 summarises the number of words included in the analysis.

Figure 1: Distribution of case studies across Main Panels used for the linguistic analysis sample.

In order to detect patterns of content in high-scoring and low-scoring case studies across all four Main Panels, a sub-sample of case studies was selected for a qualitative thematic analysis. This included 60% of high-scoring case studies and 97% of low-scoring case studies from the quantitative analysis, such that only UoAs were included where both high-scoring and low-scoring case studies were available (as opposed to the quantitative sample, which includes all available high-scoring case studies). Further selection criteria were then designed to create a greater balance in the number of high-scoring and low-scoring case studies across Main Panels. Main Panels A (high) and C (low) were particularly over-represented, so a lower proportion of those case studies were selected and 10 additional high-scoring case studies were considered in Panel B, including institutions where at least 85% of the case studies scored 4* and the remaining scores were 3*. As this added a further UoA, we could also include 14 more low-scoring case studies in Main Panel B. This resulted in a total of 85 high-scoring and 90 low-scoring case studies. Figure 2 shows the distribution of case studies across Main Panels in the thematic analysis, illustrating the greater balance compared to the sample used in the quantitative analysis. The majority (75%) of the case studies analysed are included in both samples (Table 2).

Figure 2: Distribution of case studies across Main Panels used for the thematic analysis sample.

Quantitative linguistic analysis

Quantitative linguistic analysis can be used to make recurring patterns in language use visible and to assess their significance. We treated the dataset of impact case studies as a text collection (the ‘corpus’) divided into two sections, namely high-scoring and low-scoring case studies (the two ‘sub-corpora’), in order to explore the lexical profile and the readability of the case studies.

One way to explore the lexical profile of groups of texts is to generate frequency-based word lists and compare these to word lists from a reference corpus to determine which words are characteristic of the corpus of interest (“keywords”, cf. Scott, 1997 ). Another way is to extract word combinations that are particularly frequent. Such word combinations, called “lexical bundles”, are “extended collocations” (Hyland, 2008 , p. 41) that appear across a set range of texts (Esfandiari and Barbary, 2017 ). We merged these two approaches in order to uncover meanings that could not be made visible through the analysis of single-word frequencies, comparing lexical bundles from each sub-corpus to the other. Lexical bundles of 2–4 words were extracted with AntConc (specialist software developed by Anthony, 2014 ) firstly from the corpus of all high-scoring case studies and then separately from the sub-corpora of high-scoring case studies in Main Panel A, C and D. Footnote 2 The corresponding lists were extracted from low-scoring case studies overall and separated by panel. The lists of lexical bundles for each of the high-scoring corpus parts were then compared to the corresponding low-scoring parts (High-Overall vs. Low-Overall, High-Main Panel A vs. Low-Main Panel A, etc.) to detect statistically significant over-use and under-use in one set of texts relative to another.
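
As a rough illustration of the bundle-extraction step (the study itself used AntConc), the following Python sketch counts 2–4-word sequences in each sub-corpus; dispersion thresholds and other AntConc settings are omitted, and the example texts are invented.

```python
# Minimal sketch (the authors used AntConc): count 2-4 word lexical bundles
# in each sub-corpus so their frequencies can later be compared for keyness.
import re
from collections import Counter

def bundles(texts, n_min=2, n_max=4):
    """Count n-gram 'bundles' of length n_min..n_max across a list of texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        for n in range(n_min, n_max + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts

# Hypothetical fragments standing in for the high- and low-scoring sub-corpora.
high_counts = bundles(["The findings were cited in national guidance."])
low_counts = bundles(["The research was disseminated at the event."])
print(high_counts.most_common(5))
```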

Two statistical measures were used in the analysis of lexical bundles. Log Likelihood was used as a measure of the statistical significance of frequency differences (Rayson and Garside, 2000 ), with a value of >3.84 corresponding to p  < 0.05. This measure had the advantage, compared to the more frequently used chi-square test, of not assuming a normal distribution of data (McEnery et al., 2006 ). The Log Ratio (Hardie, 2014 ) was used as a measure of effect size, which quantifies the scale, rather than the statistical significance, of frequency differences between two datasets. The Log Ratio is technically the binary log of the relative risk, and a value of >0.5 or <−0.5 is considered meaningful in corpus linguistics (Hardie, 2014 ), with values further removed from 0 reflecting a bigger difference in the relative frequencies found in each corpus. There is currently no agreed standard effect size measure for keywords (Brezina, 2018 , p. 85) and the Log Ratio was chosen because it is straightforward to interpret. Each lexical bundle that met the ‘keyness’ threshold (Log Likelihood > 3.84 in the case of expected values > 12, with higher significance levels needed for expected values < 13—see Rayson et al., 2004 , p. 8) was then assigned a code according to its predominant meaning in the texts, as reflected in the contexts captured in the concordance lines extracted from the corpus.
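
Both statistics can be computed directly from raw frequencies and corpus sizes. The sketch below follows the standard two-corpus formulations cited above; the frequencies and corpus sizes in the example are invented purely for illustration.

```python
# Minimal sketch of the two keyness statistics described above
# (Rayson and Garside, 2000; Hardie, 2014). Example numbers are made up.
import math

def log_likelihood(freq1, size1, freq2, size2):
    """G2 keyness statistic for an item seen freq1 times in size1 words vs freq2 in size2."""
    expected1 = size1 * (freq1 + freq2) / (size1 + size2)
    expected2 = size2 * (freq1 + freq2) / (size1 + size2)
    ll = 0.0
    if freq1 > 0:
        ll += freq1 * math.log(freq1 / expected1)
    if freq2 > 0:
        ll += freq2 * math.log(freq2 / expected2)
    return 2 * ll

def log_ratio(freq1, size1, freq2, size2):
    """Binary log of the ratio of relative frequencies (effect size)."""
    return math.log2((freq1 / size1) / (freq2 / size2))

# Illustrative only: a bundle seen 40 times in 200,000 words of high-scoring
# text and 12 times in 150,000 words of low-scoring text.
print(round(log_likelihood(40, 200_000, 12, 150_000), 2))  # ~8.9, > 3.84 -> p < 0.05
print(round(log_ratio(40, 200_000, 12, 150_000), 2))       # ~1.32, > 0.5 -> meaningful
```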

In the thematic analysis, it appeared that high-scoring case studies were easier to read. In order to quantify the readability of the texts, we therefore analysed them using the Coh-Metrix online tool (www.cohmetrix.com, v3.0) developed by McNamara et al. ( 2014 ). This tool provides 106 descriptive indices of language features, including 8 principal component scores developed from combinations of the other indices (Graesser et al., 2011 ). We selected these principal component scores as comprehensive measures of “reading ease” because they assess multiple characteristics of the text, up to whole-text discourse level (McNamara et al., 2014 , p. 78). This was supplemented by the traditional and more wide-spread Flesch Reading Ease score of readability measuring the lengths of words and sentences, which are highly correlated with reading speed (Haberlandt and Graesser, 1985 ). The selected measures were compared across corpus sections using t -tests to evaluate significance. The effect size was measured using Cohen’s D , following Brezina ( 2018 , p. 190), where D  > 0.3 indicates a small, D  > 0.5 a medium, and D  > 0.8 a high effect size. As with the analysis of lexical bundles, comparisons were made between high- and low-scoring case studies in each of Main Panels A, C and D, as well as between all high-scoring and all low-scoring case studies across Main Panels.
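
For readers unfamiliar with these measures, the sketch below approximates the Flesch Reading Ease formula and computes Cohen's D for two groups of scores. The syllable heuristic is a simplification, the numbers are invented, and the study itself used Coh-Metrix rather than this code; a t-test (e.g. scipy.stats.ttest_ind) would additionally be used to assess significance.

```python
# Minimal sketch (assumed implementation, not Coh-Metrix): approximate Flesch
# Reading Ease plus Cohen's d for comparing two lists of per-case-study scores.
import re
import statistics

def flesch_reading_ease(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    # Rough syllable count: number of vowel groups per word (a simplification).
    syllables = sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

def cohens_d(sample_a, sample_b):
    """Effect size: difference of means over the pooled standard deviation."""
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled_sd = (((len(sample_a) - 1) * var_a + (len(sample_b) - 1) * var_b)
                 / (len(sample_a) + len(sample_b) - 2)) ** 0.5
    return (statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd

# Hypothetical per-case-study reading ease scores for the two groups.
high = [34.1, 30.5, 37.2, 29.8]
low = [26.0, 24.3, 28.9, 22.7]
print(round(flesch_reading_ease("The research changed national guidance. Clinics adopted it."), 1))
print(round(cohens_d(high, low), 2))   # large effect for these made-up numbers
```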

Qualitative thematic analysis

While a quantitative analysis as described above can make differences in the use of certain words visible, it does not capture the narrative or content of the texts under investigation. In order to identify common features of high-scoring and low-scoring case studies, thematic analysis was chosen to complement the quantitative analysis by identifying patterns and inferring meaning from qualitative data (Auerbach and Silverstein, 2003 ; Braun and Clarke, 2006 ; Saldana, 2009 ). To familiarise themselves with the data and for inter-coder reliability, two research team members read a selection of REF2014 impact case studies from different Main Panels, before generating initial codes for each of the five sections of the impact case study template. These were discussed with the full research team, comprising three academic and three professional services staff who had all read multiple case studies themselves. They were piloted prior to defining a final set of themes and questions against which the data was coded (based on the six-step process outlined by Braun and Clarke, 2006 ) (Table 3 ). An additional category was used to code stylistic features, to triangulate elements of the quantitative analysis (e.g. readability) and to include additional stylistic features difficult to assess in quantitative terms (e.g. effective use of testimonials). In addition to this, 10 different types of impact were coded for, based on Reed’s ( 2018 ) typology: capacity and preparedness, awareness and understanding, policy, attitudinal change, behaviour change and other forms of decision-making, other social, economic, environmental, health and wellbeing, and cultural impacts. There was room for coders to include additional insights arising in each section of the case study that had not been captured in the coding system; and there was room to summarise other key factors they thought might account for high or low scores.
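
Purely as an illustration of how such a coding frame might be organised (the field names below are assumptions, not the project's actual coding template), a coding record for one case-study section could look as follows. The impact types follow Reed's (2018) typology as listed above.

```python
# Illustrative sketch only: a possible data layout for the coding frame.
IMPACT_TYPES = [
    "capacity and preparedness", "awareness and understanding", "policy",
    "attitudinal change", "behaviour change and other decision-making",
    "other social", "economic", "environmental", "health and wellbeing",
    "cultural",
]

# One record per case-study section: a summary against each code plus the
# quoted original material so interpretations can be checked in later analysis.
example_record = {
    "case_study_id": "UoA3_example_01",          # hypothetical identifier
    "section": "Details of the impact",
    "codes": {
        "significance_and_reach": {
            "summary": "Named beneficiary groups; national reach evidenced.",
            "quote": "…",                         # verbatim extract would go here
        },
    },
    "impact_types_claimed": ["policy", "health and wellbeing"],
    "stylistic_features": ["clear subheadings", "testimonial used"],
}
print(example_record["impact_types_claimed"])
```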

Coders summarised case study content pertaining to each code, for example by listing examples of effective or poor use of structure and formatting as they arose in each case study. Coders also quoted the original material next to their summaries so that their interpretation could be assessed during subsequent analysis. This initial coding of case study text was conducted by six coders, with intercoder reliability (based on 10% of the sample) assessed at over 90%. Subsequent thematic analysis within the codes was conducted by two of the co-authors. This involved categorising coded material into themes as a way of assigning meaning to features that occurred across multiple case studies (e.g. categorising types of corroborating evidence typically used in high-scoring versus low-scoring case studies).
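
A simple way to compute an agreement figure of this kind is sketched below. The exact reliability measure used in the study is not specified here, so straightforward percentage agreement is shown only as an assumed example.

```python
# Minimal sketch (assumed method): percentage agreement between two coders
# over the double-coded 10% sub-sample.
def percent_agreement(coder_a, coder_b):
    """coder_a and coder_b are equal-length lists of code labels per item."""
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return 100.0 * matches / len(coder_a)

# Hypothetical double-coded items.
coder_a = ["policy", "policy", "economic", "health", "cultural"]
coder_b = ["policy", "economic", "economic", "health", "cultural"]
print(percent_agreement(coder_a, coder_b))  # 80.0
```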

Results and discussion

In this section, we integrate findings from the quantitative linguistic study and the qualitative analysis of low-scoring versus high-scoring case studies. The results are discussed under four headings based on the key findings that emerged from both analyses. Taken together, these findings provide the most comprehensive evidence to date of the characteristics of a top-rated (4*) impact case study in REF2014.

Highly-rated case studies provided specific, high-magnitude and well-evidenced articulations of significance and reach

One finding from our qualitative thematic analysis was that 84% of high-scoring cases articulated benefits to specific groups and provided evidence of their significance and reach, compared to 32% of low-scoring cases which typically focused instead on the pathway to impact, for example describing dissemination of research findings and engagement with stakeholders and publics without citing the benefits arising from dissemination or engagement. One way of conceptualising this difference is using the content/process distinction: whereas low-scoring cases tended to focus on the process through which impact was sought (i.e. the pathway used), the high-scoring cases tended to focus on the content of the impact itself (i.e. what change or improvement occurred as a result of the research).

Examples of global reach were evidenced across high-scoring case studies from all panels (including Panel D for Arts and Humanities research), but were less often claimed or evidenced in low-scoring case studies. Where reach was more limited geographically, many high-scoring case studies used context to create robust arguments that their reach was impressive in that context, describing reach for example in social or cultural terms or arguing for the importance of reaching a narrow but hard-to-reach or otherwise important target group.

Table 4 provides examples of evidence from high-scoring cases and low-scoring cases that were used to show significance and reach of impacts in REF2014.

Findings from the quantitative linguistic analysis in Table 5 show how high-scoring impact case studies contained more phrases that specified reach (e.g. “in England and”, “in the US”), compared to low-scoring case studies that used the more generic term “international”, leaving the reader in doubt about the actual reach. They also include more phrases that implicitly specified the significance of the impact (e.g. “the government’s” or “to the House of Commons”), compared to low-scoring cases which provided more generic phrases, such as “policy and practice”, rather than detailing specific policies or practices that had been changed.

The quantitative linguistic analysis also identified a number of words and phrases pertaining to engagement and pathways, which were intended to deliver impact but did not actually specify impact (Table 6). A number of phrases contained the word “dissemination”, and there were several words and phrases specifying types of engagement that could be considered more one-way dissemination than consultative or co-productive (cf. Reed et al.’s (2018) engagement typology), e.g. “the book” and “the event”. The focus on dissemination supports the finding from the qualitative thematic analysis that low-scoring cases tended to focus more on pathways or routes than on impact. Although it is not possible to infer this directly from the data, it is possible that this may represent a deeper epistemological position underpinning some case studies, where impact generation was seen as one-way knowledge or technology transfer, and research findings were perceived as something that could be given unchanged to publics and stakeholders through dissemination activities, with the assumption that this would be understood as intended and lead to impact.

It is worth noting that none of the four UK countries appear significantly more often in either high-scoring or low-scoring case studies (outside of the phrase “in England and”). Wales ( n  = 50), Scotland ( n  = 71) and Northern Ireland ( n  = 32) appear slightly more often in high-scoring case studies, but the difference is not significant (England: n  = 162). An additional factor to take into account is that our dataset includes only submissions that are either high-scoring or low-scoring, and the geographical spread of the submitting institutions was not a factor in selecting texts. There was a balanced number of high-scoring and low-scoring case studies in the sample from English, Scottish and Welsh universities, but no guaranteed low-scoring submissions from Northern Irish institutions. The REF2014 guidance made it clear that impacts in each UK country would be evaluated equally in comparison to each other, the UK and other countries. While the quantitative analysis of case studies from our sample only found a statistically significant difference for the phrase “in England and”, this, combined with the slightly higher number of phrases containing the other countries of the UK in high-scoring case studies, might indicate that this panel guidance was implemented as instructed.

Figures 3–5 show which types of impact could be identified in high-scoring and low-scoring case studies, respectively, in the qualitative thematic analysis (based on Reed’s (2018) typology of impacts). Note that percentages do not add up to 100% because it was possible for each case study to claim more than one type of impact (high-scoring impact case studies described on average 2.8 impacts, compared to an average of 1.8 impacts described by low-scoring case studies) (Footnote 3). Figure 3 shows the number of impacts per type as a percentage of the total number of impacts claimed in high-scoring versus low-scoring case studies. This shows that high-scoring case studies were more likely to claim health/wellbeing and policy impacts, whereas low-scoring case studies were more likely to claim understanding/awareness impacts. Looking at this by Main Panel, over 50% of high-scoring case studies in Main Panel A claimed health/wellbeing, policy and understanding/awareness impacts (Fig. 4), whereas over 50% of low-scoring case studies in Main Panel A claimed capacity building impacts (Fig. 5). There were relatively high numbers of economic and policy impacts claimed in both high-scoring and low-scoring case studies in Main Panels B and C, respectively, with no impact type dominating strongly in Main Panel D (Figs. 4 and 5).

Figure 3: Number of impacts claimed in high- versus low-scoring case studies by impact type.

Figure 4: Percentage of high-scoring case studies that claimed different types of impact.

Figure 5: Percentage of low-scoring case studies that claimed different types of impact.

Highly-rated case studies used distinct features to establish links between research (cause) and impact (effect)

Findings from the quantitative linguistic analysis show that high-scoring case studies were significantly more likely to include attributional phrases like “cited in”, “used to” and “resulting in”, compared to low-scoring case studies (Table 7 provides examples for some of the 12 phrases more frequent in high-scoring case studies). However, there were some attributional phrases that were more likely to be found in low-scoring case studies (e.g. “from the”, “of the research” and “this work has”—total of 9 different phrases).

To investigate this further, all 564 and 601 instances Footnote 4 of attributional phrases in high-scoring and low-scoring case studies, respectively, were analysed to categorise the context in which they were used, to establish the extent to which these phrases in each corpus were being used to establish attribution to impacts. The first word or phrase preceding or succeeding the attributional content was coded. For example, if the attributional content was “used the”, followed by “research to generate impact”, the first word succeeding the attributional content (in this case “research”) was coded rather than the phrase it subsequently led to (“generate impact”). According to a Pearson Chi Square test, high-scoring case studies were significantly more likely to establish attribution to impact than low-scoring cases ( p  < 0.0001, but with a small effect size based on Cramer’s V  = 0.22; bold in Table 8 ). 18% ( n  = 106) of phrases in the low-scoring corpus established attribution to impact, compared to 37% ( n  = 210) in the high-scoring corpus, for example, stating that research, pathway or something else led to impact. Instead, low-scoring case studies were more likely to establish attribution to research (40%; n  = 241) compared to high-scoring cases (28%; n  = 156; p  < 0.0001, but with a small effect size based on Cramer’s V  = 0.135). Both high- and low-scoring case studies were similarly likely to establish attribution to pathways (low: 32%; n  = 194; high: 31% n  = 176).
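
The reported test can be reconstructed approximately from the counts given above. Assuming a 2×2 table of attribution-to-impact versus all other attribution, by high-/low-scoring corpus (an assumption about the table layout, which is not stated explicitly in the text), the sketch below reproduces a Cramér's V of roughly 0.22.

```python
# Sketch reconstructing the attribution test from the counts reported above,
# under an assumed 2x2 layout. Pure-Python chi-square; scipy is not required.
import math

def chi_square_and_cramers_v(table):
    """table is a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    v = math.sqrt(chi2 / (n * (min(len(table), len(table[0])) - 1)))
    return chi2, v

# High-scoring: 210 of 564 attributional phrases established attribution to
# impact; low-scoring: 106 of 601 did (remaining instances fall in other cells).
observed = [[210, 564 - 210],
            [106, 601 - 106]]
chi2, v = chi_square_and_cramers_v(observed)
print(round(chi2, 1), round(v, 2))  # approximately 56.5 and 0.22
```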

Moreover, low-scoring case studies were more likely to include ambiguous or uncertain phrases. For example, the phrase “a number of” can be read to imply that it is not known how many instances there were. This occurred in all sections of the impact case studies, for example in the underpinning research section as “The research explores a number of themes” or in the summary or details of the impact section as “The work has also resulted in a number of other national and international impacts”, or “has influenced approaches and practices of a number of partner organisations”. Similarly, “an impact on” could give the impression that the nature of the impact is not known. This phrase occurred only in summary and details of the impact sections, for example, “These activities have had an impact on the professional development”, “the research has had an impact on the legal arguments”, or “there has also been an impact on the work of regional agency”.

In the qualitative thematic analysis, we found that only 50% of low-scoring case studies clearly linked the underpinning research to claimed impacts (compared to 97% of high-scoring cases). This gave the impression of over-claimed impacts in some low-scoring submissions. For example, one case study claimed “significant impacts on [a country’s] society” based on enhancing the security of a new IT system in the department responsible for publishing and archiving legislation. Another claimed “economic impact on a worldwide scale” based on billions of pounds of benefits, calculated using an undisclosed method by an undisclosed evaluator in an unpublished final report by the research team. One case study claimed attribution for impact based on similarities between a prototype developed by the researchers and a product subsequently launched by a major corporation, without any evidence that the product as launched was based on the prototype. Similar assumptions were made in a number of other case studies that appeared to conflate correlation with causation in their attempts to infer attribution between research and impact. Table 9 provides examples of different ways in which links between research and impact were evidenced in the details of the research section.

Table 10 shows how corroborating sources were used to support these claims. 82% of high-scoring case studies compared to 7% of low-scoring cases were identified in the qualitative thematic analysis as having generally high-quality corroborating evidence. In contrast, 11% of high-scoring case studies, compared to 71% of low-scoring cases, were identified as having corroborating evidence that was vague and/or poorly linked to claimed impacts. Looking at only case studies that claim policy impact, 11 out of 26 high-scoring case studies in the sample described both policy and implementation (42%), compared to just 5 out of 29 low-scoring case studies that included both policy and implementation (17%; the remainder described policy impacts only with no evidence of benefits arising from implementation). High-scoring case studies were more likely to cite evidence of impacts rather than just citing evidence pertaining to the pathway (which was more common in low-scoring cases). High-scoring policy case studies also provided evidence pertaining to the pathway, but because they typically also included evidence of policy change, this evidence helped attribute policy impacts to research.

Highly-rated case studies were easy to understand and well written

In preparation for the REF, many universities invested heavily in writing assistance (Coleman, 2019 ) to ensure that impact case studies were “easy to understand and evaluation-friendly” (Watermeyer and Chubb, 2018 ) for the assessment panels, which comprised academics and experts from other sectors (HEFCE, 2011 , p. 6). With this in mind, we investigated readability and style, both in the quantitative linguistic and in the qualitative thematic analysis.

High-scoring impact case studies scored more highly on the Flesch Reading Ease score, a readability measure based on the length of words and sentences. The scores in Table 11 are reported out of 100, with a higher score indicating that a text is easier to read. While the scores reveal a significant difference between 4* and 1*/2* impact case studies, they also indicate that impact case studies are generally on the verge of “graduate” difficulty (Hartley, 2016, p. 1524). As such, our analysis should not be understood as suggesting that these technical documents should be adjusted to the readability of a newspaper article, but rather that they should be pitched at the level of an interested and educated non-specialist.

Interestingly, there were differences between the main panels. Footnote 5 In Social Science and Humanities case studies (Main Panels C and D), high-scoring impact case studies scored significantly higher on reading ease than low-scoring ones. There was no significant difference in Main Panel A between 4* and 1*/2* cases. However, all Main Panel A case studies showed, on average, lower reading ease scores than the low-scoring cases in Main Panels C and D. This means that their authors used longer words and sentences, which may be explained in part by more and longer technical terms needed in Main Panel A disciplines; the difference between high- and low-scoring case studies in Main Panels C and D may be explained by the use of more technical jargon (confirmed in the qualitative analysis).

The Flesch Reading Ease measure assesses the sentence- and word-level, rather than capturing higher-level text-processing difficulty. While this is recognised as a reliable indicator of comparative reading ease, and the underlying measures of sentence-length and word-length are highly correlated with reading speed (Haberlandt and Graesser, 1985 ), Hartley ( 2016 ) is right in his criticism that the tool takes neither the meaning of the words nor the wider text into account. The Coh-Metrix tool (McNamara et al., 2014 ) provides further measures for reading ease based on textual cohesion in these texts compared to a set of general English texts. Of the eight principal component scores computed by the tool, most did not reveal a significant difference between high- and low-scoring case studies or between different Main Panels. Moreover, in most measures, impact case studies overall were fairly homogenous compared to the baseline of general English texts. However, there were significant differences between high- and low-scoring impact case studies in two of the measures: “deep cohesion” and “connectivity” (Table 12 ).

“Deep cohesion” shows whether a text makes causal connections between ideas explicit (e.g. “because”, “so”) or leaves them for the reader to infer. High-scoring case studies had a higher level of deep cohesion compared to general English texts (Graesser et al., 2011 ), while low-scoring case studies tended to sit below the general English average. In addition, Main Panel A case studies (Life Sciences), which received the lowest scores in Flesch Reading Ease, on average scored higher on deep cohesion than case studies in more discursive disciplines (Main Panel C—Social Sciences and Main Panel D—Arts and Humanities). “Connectivity” measures the level of explicit logical connectives (e.g. “and”, “or” and “but”) to show relations in the text. Impact case studies were low in connectivity compared to general English texts, but within each of the Main Panels, high-scoring case studies had more explicit connectivity than low-scoring case studies. This means that Main Panel A case studies, while using on average longer words and sentences as indicated by the Flesch Reading Ease scores, compensated for this by making causal and logical relationships more explicit in the texts. In Main Panels C and D, which on average scored lower on these measures, there was a clearer difference between high- and low-scoring case studies than in Main Panel A, with high-scoring case studies being easier to read.

Linked to this, low-scoring case studies across panels were more likely than high-scoring case studies to contain phrases linked to the research process (suggesting an over-emphasis on the research rather than the impact, and a focus on process over findings or quality; Table 18 ) and filler-phrases (Table 13 ).

High-scoring case studies were more likely to clearly identify individual impacts via subheadings and paragraph headings ( p  < 0.0001, with effect size measure Log Ratio 0.54). The difference is especially pronounced in Main Panel D (Log Ratio 1.53), with a small difference in Main Panel C and no significant difference in Main Panel A. In Units of Assessment combined in Main Panel D, a more discursive academic writing style is prevalent (see e.g. Hyland, 2002 ) using fewer visual/typographical distinctions such as headings. The difference in the number of headings used in case studies from those disciplines suggests that high-scoring case studies showed greater divergence from disciplinary norms than low-scoring case studies. This may have allowed them to adapt the presentation of their research impact to the audience of panel members to a greater extent than low-scoring case studies.

The qualitative thematic analysis of Impact Case Studies indicates that it is not simply the number of subheadings that matters, although this comparison is interesting especially in the context of the larger discrepancy in Main Panel D. Table 14 summarises formatting that was considered helpful and unhelpful from the qualitative analysis.

The observations in Tables 11 – 13 stem from quantitative linguistic analysis, which, while enabling statistical testing, does not show directly the effect of a text on the reader. When conducting the qualitative thematic analysis, we collected examples of formatting and stylistic features from the writing and presentation of high and low-scoring case studies that might have affected clarity of the texts (Tables 14 and 15 ). Specifically, 38% of low-scoring case studies made inappropriate use of adjectives to describe impacts (compared to 20% of high-scoring; Table 16 ). Inappropriate use of adjectives may have given an impression of over-claiming or created a less factual impression than case studies that used adjectives more sparingly to describe impacts. Some included adjectives to describe impacts in testimonial quotes, giving third-party endorsement to the claims rather than using these adjectives directly in the case study text.

Highly-rated case studies were more likely to describe underpinning research findings, rather than research processes

To be eligible, case studies in REF2014 had to be based on underpinning research that was “recognised internationally in terms of originality, significance and rigour” (denoted by a 2* quality profile, HEFCE, 2011 , p. 29). Ineligible case studies were excluded from our sample (i.e. those in the “unclassifiable” quality profile), so all the case studies should have been based on strong research. Once this research quality threshold had been passed, scores were based on the significance and reach of impact, so case studies with higher-rated research should not, in theory, get better scores on the basis of their underpinning research. However, there is evidence that units whose research outputs scored well in REF2014 also performed well on impact (unpublished Research England analysis cited in Hill, 2016 ). This observation only shows that high-quality research and impact were co-located, rather than demonstrating a causal relationship between high-quality research and highly rated impacts. However, our qualitative thematic analysis suggests that weaker descriptions of research (underpinning research was not evaluated directly) may have been more likely to be co-located with lower-rated impacts at the level of individual case studies. We know that the majority of underpinning research in the sample was graded 2* or above (because we excluded unclassifiable case studies from the analysis) but individual ratings for outputs in the underpinning research section are not provided in REF2014. Therefore, the qualitative analysis looked for a range of indicators of strong or weak research in four categories: (i) indicators of publication quality; (ii) quality of funding sources; (iii) narrative descriptions of research quality; and (iv) the extent to which the submitting unit (versus collaborators outside the institution) had contributed to the underpinning research. As would be expected (given that all cases had passed the 2* threshold), only a small minority of cases in the sample gave grounds to doubt the quality of the underpinning research. However, both our qualitative and quantitative analyses identified research-related differences between high- and low-scoring impact case studies.

Based on our qualitative thematic analysis of indicators of research quality, a number of low-scoring cases contained indications that underpinning research may have been weak. This was very rare in high-scoring cases. In the most extreme case, one case study was not able to submit any published research to underpin the impact, relying instead on having secured grant funding and having a manuscript under review. Table 17 describes indicators that underpinning research may have been weaker (presumably closer to the 2* quality threshold for eligibility). It also describes the indications of higher quality research (which were likely to have exceeded the 2* threshold) that were found in the rest of the sample. High-scoring case studies demonstrated the quality of the research using a range of direct and indirect approaches. Direct approaches included the construction of arguments that articulated the originality, significance and rigour of the research in the “underpinning research” section of the case study (sometimes with reference to outputs that were being assessed elsewhere in the exercise to provide a quick and robust check on quality ratings). In addition to this, a wide range of indirect proxies were used to infer quality, including publication venue, funding sources, reviews and awards.

These indicators are of particular interest given the stipulation in REF2021 that case studies must provide evidence of research quality, with the only official guidance suggesting that this is done via the use of indicators. The indicators identified in Table 17 overlap significantly with example indicators proposed by panels in the REF2021 guidance. However, there are also a number of additional indicators, which may be of use for demonstrating the quality of research in REF2021 case studies. In common with proposed REF2021 research quality indicators, many of the indicators in Table 17 are highly context dependent, based on subjective disciplinary norms that are used as short-cuts to assessments of quality by peers within a given context. Funding sources, publication venues and reviews that are considered prestigious in one disciplinary context are often perceived very differently in other disciplinary contexts. While REF2021 does not allow the use of certain indicators (e.g. journal impact factors), no comment is given on the appropriateness of the suggested indicators. While this may be problematic, given that an indicator by definition sign-posts, suggests or indicates by proxy rather than representing the outcome of any rigorous assessment, we make no comment on whether it is appropriate to judge research quality via such proxies. Instead, Table 17 presents a subjective, qualitative identification of indicators of high or low research quality, which were as far as possible considered within the context of disciplinary norms in the Units of Assessments to which the case studies belonged.

The quantitative linguistic analysis also found differences between the high-scoring and low-scoring case studies relating to underpinning research. There were significantly more words and phrases in low-scoring case studies compared to high-scoring cases relating to research outputs (e.g. “the paper”, “peer-reviewed”, “journal of”, “et al”), the research process (e.g. “research project”, “the research”, “his work”, “research team”) and descriptions of research (“relationship between”, “research into”, “the research”) (Table 18 ). The word “research” itself appears frequently in both (high: 91× per 10,000 words; low: 110× per 10,000 words), which is nevertheless a small but significant over-use in the low-scoring case studies (effect size measure log ratio = 0.27, p  < 0.0001).
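
As a worked check, the quoted Log Ratio follows directly from the two relative frequencies reported above:

```python
# The Log Ratio is the binary log of the ratio of relative frequencies
# (here, occurrences of "research" per 10,000 words in each sub-corpus).
import math
print(round(math.log2(110 / 91), 2))  # 0.27 -> over-use in low-scoring cases
```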

There are two alternative ways to interpret these findings. First, the qualitative research appears to suggest a link between higher-quality underpinning research and higher impact scores. However, the causal mechanism is not clear. An independent review of REF2014 commissioned by the UK Government (Stern, 2016 ) proposed that underpinning research should only have to meet the 2* threshold for rigour, as the academic significance and novelty of the research is not in theory a necessary precursor to significant and far-reaching impact. However, a number of the indications of weaker research in Table 17 relate to academic significance and originality, and many of the indicators that suggested research exceeded the 2* threshold imply academic significance and originality (e.g. more prestigious publication venues often demand stronger evidence of academic significance and originality in addition to rigour). As such, it may be possible to posit two potential causal mechanisms related to the originality and/or significance of research. First, it may be argued that major new academic breakthroughs may be more likely to lead to impacts, whether directly in the case of applied research that addresses societal challenges in new and important ways leading to breakthrough impacts, or indirectly in the case of major new methodological or theoretical breakthroughs that make new work possible that addresses previously intractable challenges. Second, the highest quality research may have sub-consciously biased reviewers to view associated impacts more favourably. Further research would be necessary to test either mechanism.

However, these mechanisms do not explain the higher frequency of words and phrases relating to research outputs and process in low-scoring case studies. Both high-scoring and low-scoring cases described the underpinning research, and none of the phrases that emerged from the analysis imply higher or lower quality of research. We hypothesised that this may be explained by low-scoring case studies devoting more space to underpinning research at the expense of other sections that may have been more likely to contribute towards scores. Word limits were “indicative”, and the real limit of “four pages” in REF2014 (extended to five pages in REF2021) was operationalised in various ways. However, a t-test found no significant difference between the underpinning research word counts (mean of 579 and 537 words in high- and low-scoring case studies, respectively; p = 0.11). Instead, we note that words and phrases relating to research in the low-scoring case studies focused more on descriptions of research outputs and processes than on descriptions of research findings or the quality of research, as requested in REF2014 guidelines. Given that eligibility evidenced in this section is based on whether the research findings underpin the impacts and on the quality of the research (HEFCE, 2011), we hypothesise that the focus of low-scoring case studies on research outputs and processes was unnecessary (at best) or replaced or obscured research findings (at worst). This could be conceptualised as another instance of the content/process distinction, whereby high-scoring case studies focused on what the research found and low-scoring case studies focused on the process through which the research was conducted and disseminated. It could be concluded that this tendency may have contributed towards lower scores if unnecessary descriptions of research outputs and process, which would not have contributed towards scores, used up space that could otherwise have been used for material that may have contributed towards scores.

Limitations

These findings may be useful in guiding the construction and writing of case studies for REF2021, but it is important to recognise that our analyses are retrospective, showing examples of what was judged to be ‘good’ and ‘poor’ practice in the authorship of case studies for REF2014. Importantly, the findings of this study should not be used to infer a causal relationship between the linguistic features we have identified and the judgements of the REF evaluation panel. Our quantitative analysis has identified similarities and differences in their linguistic features, but there are undoubtedly a range of considerations taken into account by evaluation panels. It is also not possible to anticipate how REF2021 panels will interpret guidance and evaluate case studies, and there is already evidence that practice is changing significantly across the sector. This shift in expectations regarding impact is especially likely in research concerned with public policy, where requirements increasingly include policy implementation as well as design, and in research involving public engagement, which is increasingly expected to provide longitudinal evidence of benefits and evidence of cause and effect. We are unable to say anything conclusive from our sample about case studies that focused primarily on public engagement and pedagogy because neither of these types of impact was common enough in either the high-scoring or low-scoring sample to infer reliable findings. While this is the largest sample of known high-scoring versus low-scoring case studies ever analysed, it is important to note that this represents <3% of the total case studies submitted to REF2014. Although the number of case studies was fairly evenly balanced between Main Panels in the thematic analysis, the sample only included a selection of Units of Assessment from each Main Panel, where sufficient numbers of high- and low-scoring cases could be identified (14 and 20 out of 36 Units of Assessment in the qualitative and quantitative studies, respectively). As such, caution should be taken when generalising from these findings.

Conclusions

This paper provides empirical insights into the linguistic differences in high-scoring and low-scoring impact case studies in REF2014. Higher-scoring case studies were more likely to have articulated evidence of significant and far-reaching impacts (rather than just presenting the activities used to reach intended future impacts), and they articulated clear evidence of causal links between the underpinning research and claimed impacts. While a cause and effect relationship between linguistic features, styles and the panel’s evaluation cannot be claimed, we have provided a granularity of analysis that shows how high-scoring versus low-scoring case studies attempted to meet REF criteria. Knowledge of these features may provide useful lessons for future case study authors, submitting institutions and others developing impact assessments internationally. Specifically, we show that high-scoring case studies were more likely to provide specific and high-magnitude articulations of significance and reach, compared to low-scoring cases, which were more likely to provide less specific and lower-magnitude articulations of significance and reach. Lower-scoring case studies were more likely to focus on pathways to impact rather than articulating clear impact claims, with a particular focus on one-way modes of knowledge transfer. High-scoring case studies were more likely to provide clear links between underpinning research and impacts, supported by high-quality corroborating evidence, compared to low-scoring cases that often had missing links between research and impact and were more likely to be underpinned by corroborating evidence that was vague and/or not clearly linked to impact claims. Linked to this, high-scoring case studies were more likely to contain attributional phrases, and these phrases were more likely to attribute research and/or pathways to impact, compared to low-scoring cases, which contained fewer attributional phrases, which were more likely to provide attribution to pathways rather than impact. Furthermore, there is evidence that high-scoring case studies had more explicit causal connections between ideas and more logical connective words (and, or, but) than low-scoring cases.

However, in addition to the explicit REF2014 rules, which appear to have been enacted effectively by sub-panels, there is evidence that implicit rules, particularly linked to written style, may also have played a role. High-scoring case studies appear to have conformed to a distinctive new genre of writing, which was clear and direct, often simplified in its representation of causality between research and impact, and less likely to contain expressions of uncertainty than might be normally expected in academic writing (cf. e.g. Vold, 2006 ; Yang et al., 2015 ). Low-scoring case studies were more likely to contain filler phrases that could be described as “academese” (Biber and Gray, 2019 , p. 1), more likely to use unsubstantiated or vague adjectives to describe impacts, and were less likely to signpost readers to key points using sub-headings and paragraph headings. High-scoring case studies in two Main Panels (out of the three that could be analysed in this way) were significantly easier to read, although both high- and low-scoring case studies tended to be of “graduate” (Hartley, 2016 ) difficulty.

These findings suggest that aspects of written style may have contributed towards or compromised the scores of some case studies in REF2014, in line with previous research emphasising the role of implicit and subjective factors in determining the outcomes of impact evaluation (Derrick, 2018 ; Watermeyer and Chubb, 2018 ). If this were the case, it may raise questions about whether case studies are an appropriate way to evaluate impact. However, metric-based approaches have many other limitations and are widely regarded as inappropriate for evaluating societal impact (Bornmann et al., 2018 ; Pollitt et al., 2016 ; Ravenscroft et al., 2017 ; Wilsdon et al., 2015 ). Comparing research output evaluation systems across different countries, Sivertsen ( 2017 ) presents the peer-review-based UK REF as “best practice” compared to the metrics-based systems elsewhere. Comparing the evaluation of impact in the UK to impact evaluations in USA, the Netherlands, Italy and Finland, Derrick ( 2019 ) describes REF2014 and REF2021 as “the world’s most developed agenda for evaluating the wider benefits of research and its success has influenced the way many other countries define and approach the assessment of impact”.

We cannot be certain about the extent to which linguistic features or style shaped the judgement of REF evaluators, nor can such influences easily be identified or even consciously recognised when they are at work (cf. research on sub-conscious bias and tacit knowledge; the idea that “we know more than we can say”—Polanyi, 1958 cited in Goodman, 2003 , p. 142). Nonetheless, we hope that the granularity of our findings proves useful in informing decisions about presenting case studies, both for case study authors (in REF2021 and other research impact evaluations around the world) and those designing such evaluation processes. In publishing this evidence, we hope to create a more “level playing field” between institutions with and without significant resources available to hire dedicated staff or consultants to help write their impact case studies.

Data availability

The dataset analysed during the current study corresponds to the publicly available impact case studies defined through the method explained in Section “Research design and sample” and Table 2 . A full list of case studies included can be obtained from the corresponding author upon request.

Notes

https://impact.ref.ac.uk/casestudies/search1.aspx

For Main Panel B, only six high-scoring and two low-scoring case studies are clearly identifiable and available to the public (cf. Fig. 1). The Main Panel B dataset is therefore too small for separate statistical analysis, and no generalisations should be made on the basis of only one high-scoring and one low-scoring submission.

However, in the qualitative analysis, there were a similar number of high-scoring case studies that were considered to have reached this score due to a clear focus on one single, highly impressive impact, compared to those that were singled out for their impressive range of different impacts.

Note that there were more instances of the smaller number of attributional phrases in the low-scoring corpus.

References

Anthony L (2014) AntConc, version 3.4.4. Waseda University, Tokyo

Auerbach CF, Silverstein LB (2003) Qualitative data: an introduction to coding and analyzing data in qualitative research. New York University Press, New York, NY

Back L (2015) On the side of the powerful: the ‘impact agenda’ and sociology in public. https://www.thesociologicalreview.com/on-the-side-of-the-powerful-the-impact-agenda-sociology-in-public/ . Last Accessed 24 Jan 2020

Biber D, Gray B (2019) Grammatical complexity in academic English: linguistic change in writing. Cambridge University Press, Cambridge

Bornmann L, Haunschild R, Adams J (2018) Do altmetrics assess societal impact in the same way as case studies? An empirical analysis testing the convergent validity of altmetrics based on data from the UK Research Excellence Framework (REF). J Informetr 13(1):325–340

Bozeman B, Youtie J (2017) Socio-economic impacts and public value of government-funded research: lessons from four US National Science Foundation initiatives. Res Policy 46(8):1387–1398

Braun V, Clarke V (2006) Using thematic analysis in psychology. Qual Res Psychol 3(2):77–101

Brezina V (2018) Statistics in corpus linguistics: a practical guide. Cambridge University Press, Cambridge

Chubb J (2017) Instrumentalism and epistemic responsibility: researchers and the impact agenda in the UK and Australia. University of York

Chubb J, Watermeyer R (2017) Artifice or integrity in the marketization of research impact? Investigating the moral economy of (pathways to) impact statements within research funding proposals in the UK and Australia. Stud High Educ 42(2):2360–2372

Chubb J, Reed MS (2017) Epistemic responsibility as an edifying force in academic research: investigating the moral challenges and opportunities of an impact agenda in the UK and Australia. Palgrave Commun 3:20

Chubb J, Reed MS (2018) The politics of research impact: academic perceptions of the implications for research funding, motivation and quality. Br Politics 13(3):295–311

Clark WC et al. (2016) Crafting usable knowledge for sustainable development. Proc Natl Acad Sci USA 113(17):4570–4578

Coleman I (2019) The evolution of impact support in UK universities. Cactus Communications Pvt. Ltd

Derrick G (2018) The evaluators’ eye: impact assessment and academic peer review. Palgrave Macmillan

Derrick G (2019) Cultural impact of the impact agenda: implications for social sciences and humanities (SSH) research. In: Bueno D et al. (eds.), Higher education in the world, vol. 7. Humanities and higher education: synergies between science, technology and humanities. Global University Network for Innovation (GUNi)

Derrick G et al. (2018) Towards characterising negative impact: introducing Grimpact. In: Proceedings of the 23rd international conference on Science and Technology Indicators (STI 2018). Centre for Science and Technology Studies (CWTS), Leiden, The Netherlands

Esfandiari R, Barbary F (2017) A contrastive corpus-driven study of lexical bundles between English writers and Persian writers in psychology research articles. J Engl Academic Purp 29:21–42

Goodman CP (2003) The tacit dimension. Polanyiana 2(1):133–157

Graesser AC, McNamara DS, Kulikowich J (2011) Coh-Metrix: providing multi-level analyses of text characteristics. Educ Res 40:223–234

Haberlandt KF, Graesser AC (1985) Component processes in text comprehension and some of their interactions. J Exp Psychol: Gen 114(3):357–374

Hardie A (2014) Statistical identification of keywords, lockwords and collocations as a two-step procedure. ICAME 35, Nottingham

Hartley J (2016) Is time up for the Flesch measure of reading ease? Scientometrics 107(3):1523–1526

HEFCE (2011) Assessment framework and guidance on submissions. Ref. 02.2011

Hill S (2016) Assessing (for) impact: future assessment of the societal impact of research. Palgrave Commun 2:16073

Hyland K (2002) Directives: argument and engagement in academic writing. Appl Linguist 23(2):215–238

Hyland K (2008) As can be seen: lexical bundles and disciplinary variation. Engl Specif Purp 27(1):4–21

Lemos MC (2015) Usable climate knowledge for adaptive and co-managed water governance. Curr Opin Environ Sustain 12:48–52

MacDonald R (2017) “Impact”, research and slaying Zombies: the pressures and possibilities of the REF. Int J Sociol Soc Policy 37(11–12):696–710

Mårtensson P et al. (2016) Evaluating research: a multidisciplinary approach to assessing research practice and quality. Res Policy 45(3):593–603

McEnery T, Xiao R, Tono Y (2006) Corpus-based language studies: an advanced resource book. Routledge, Abingdon

McNamara DS et al. (2014) Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press, New York, NY

National Science Foundation (2014) Perspectives on broader impacts

Nesta (2018) Seven principles for public engagement in research and innovation policymaking. https://www.nesta.org.uk/documents/955/Seven_principles_HlLwdow.pdf . Last Accessed 12 Dec 2019

Oancea A (2010) The BERA/UCET review of the impacts of RAE 2008 on education research in UK higher education institutions. ERA/UCET, Macclesfield

Oancea A (2014) Research assessment as governance technology in the United Kingdom: findings from a survey of RAE 2008 impacts. Z Erziehungswiss 17(S6):83–110

Olssen M, Peters MA (2005) Neoliberalism, higher education and the knowledge economy: from the free market to knowledge capitalism. J Educ Policy 20(3):313–345

Pidd M, Broadbent J (2015) Business and management studies in the 2014 Research Excellence Framework. Br J Manag 26:569–581

Pollitt A et al. (2016) Understanding the relative valuation of research impact: a best–worst scaling experiment of the general public and biomedical and health researchers. BMJ Open 6(8):e010916

Rau H, Goggins G, Fahy F (2018) From invisibility to impact: recognising the scientific and societal relevance of interdisciplinary sustainability research. Res Policy 47(1):266–276

Ravenscroft J et al. (2017) Measuring scientific impact beyond academia: an assessment of existing impact metrics and proposed improvements. PLoS ONE 12(3):e0173152

Rayson P, Garside R (2000) Comparing corpora using frequency profiling, Workshop on Comparing Corpora, held in conjunction with the 38th annual meeting of the Association for Computational Linguistics (ACL 2000), Hong Kong, pp. 1–6

Rayson P, Berridge D, Francis B (2004) Extending the Cochran rule for the comparison of word frequencies between corpora. In: Purnelle G, Fairon C, Dister A (eds.), Le poids des mots: Proceedings of the 7th international conference on statistical analysis of textual data (JADT 2004) (II). Presses universitaires de Louvain, Louvain-la-Neuve, Belgium, pp. 926–936

Reed MS (2018) The research impact handbook, 2nd edn. Fast Track Impact, Huntly, Aberdeenshire

Reed MS (2019) Book review: new book calls for civil disobedience to fight “dehumanising” impact agenda. Fast Track Impact

Reed MS et al. (under review) Evaluating research impact: a methodological framework. Res Policy

Rhoads R, Torres CA (2005) The university, state, and market: the political economy of globalization in the Americas. Stanford University Press, Stanford

Saldana J (2009) The coding manual for qualitative researchers. Sage, Thousand Oaks

Scott M (1997) PC analysis of key words—and key key words. System 25(2):233–245

Sivertsen G (2017) Unique, but still best practice? The Research Excellence Framework (REF) from an international perspective. Palgrave Commun 3:17078

Smith S, Ward V, House A (2011) ‘Impact’ in the proposals for the UK’s Research Excellence Framework: shifting the boundaries of academic autonomy. Res Policy 40(10):1369–1379

Stern LN (2016) Building on success and learning from experience: an independent review of the Research Excellence Framework

Tsey K et al. (2016) Evaluating research impact: the development of a research for impact tool. Front Public Health 4:160

Vold ET (2006) Epistemic modality markers in research articles: a cross-linguistic and cross-disciplinary study. Int J Appl Linguist 16(1):61–87

Warry P (2006) Increasing the economic impact of the Research Councils (the Warry report). Research Council UK, Swindon

Watermeyer R (2019) Competitive accountability in academic life: the struggle for social impact and public legitimacy. Edward Elgar, Cheltenham

Watermeyer R, Hedgecoe A (2016) Selling ‘impact’: peer reviewer projections of what is needed and what counts in REF impact case studies. A retrospective analysis. J Educ Policy 31:651–665

Watermeyer R, Chubb J (2018) Evaluating ‘impact’ in the UK’s Research Excellence Framework (REF): liminality, looseness and new modalities of scholarly distinction. Stud Higher Educ 44(9):1–13

Weinstein N et al. (2019) The real-time REF review: a pilot study to examine the feasibility of a longitudinal evaluation of perceptions and attitudes towards REF 2021

Wilsdon J et al. (2015) Metric tide: report of the independent review of the role of metrics in research assessment and management

Yang A, Zheng S, Ge G (2015) Epistemic modality in English-medium medical research articles: a systemic functional perspective. Engl Specif Purp 38:1–10

Acknowledgements

Thanks to Dr. Adam Mearns, School of English Literature, Language & Linguistics at Newcastle University for help with statistics and wider input to research design as a co-supervisor on the Ph.D. research upon which this article is based.

Author information

Authors and affiliations

Newcastle University, Newcastle, UK

Bella Reichard, Mark S Reed & Andrea Whittle

University of York, York, UK

University of Leeds, Leeds, UK

Northumbria University, Newcastle, UK

Lucy Jowett & Alisha Peart

Corresponding author

Correspondence to Mark S Reed.

Ethics declarations

Competing interests

MR is CEO of Fast Track Impact Ltd, providing impact training to researchers internationally. JC worked with Research England as part of the Real-Time REF Review in parallel with the writing of this article. BR offers consultancy services reviewing REF impact case studies.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article

Reichard, B., Reed, M.S., Chubb, J. et al. Writing impact case studies: a comparative study of high-scoring and low-scoring case studies from REF2014. Palgrave Commun 6, 31 (2020). https://doi.org/10.1057/s41599-020-0394-7

Received: 10 July 2019

Accepted: 09 January 2020

Published: 25 February 2020

DOI: https://doi.org/10.1057/s41599-020-0394-7

Comparative case studies

Comparative case studies can be useful for checking variation in program implementation.

Comparative case studies are another way of checking whether results match the program theory. Each context and environment is different, and a comparative case study can help the evaluator check whether the program theory holds in each one. If implementation differs, the reasons and results can be recorded. The opposite is also true: similar patterns across sites can increase confidence in the results.

Evaluators used a comparative case study method for the National Cancer Institute’s (NCI’s) Community Cancer Centers Program (NCCCP). The program aimed to expand cancer research and deliver the latest, most advanced cancer care to a greater number of Americans in the communities where they live, via community hospitals. The evaluation examined each of the program components (listed below) at each program site. The six program components were:

  • increasing capacity to collect biospecimens per NCI’s best practices;
  • enhancing clinical trials (CT) research;
  • reducing disparities across the cancer continuum;
  • improving the use of information technology (IT) and electronic medical records (EMRs) to support improvements in research and care delivery;
  • improving quality of cancer care and related areas, such as the development of integrated, multidisciplinary care teams; and
  • placing greater emphasis on survivorship and palliative care.

The evaluators’ use of this method helped provide recommendations at the program level as well as for each specific program site.

Advice for choosing this method

  • Compare cases with the same outcome but differences in an intervention (known as MDD, most different design)
  • Compare cases with the same intervention but differences in outcomes (known as MSD, most similar design)

Advice for using this method

  • Consider the variables of each case, and which cases can be matched for comparison.
  • Provide the evaluator with as much detail and background on each case as possible. Provide advice on possible criteria for matching.

National Cancer Institute (2007). NCI Community Cancer Centers Program Evaluation (NCCCP). Retrieved from https://digitalscholarship.unlv.edu/jhdrp/vol8/iss1/4/

What is Comparative Analysis and How to Conduct It? (+ Examples)

Appinio Research · 30.10.2023 · 36 min read

Have you ever faced a complex decision, wondering how to make the best choice among multiple options? In a world filled with data and possibilities, the art of comparative analysis holds the key to unlocking clarity amidst the chaos.

In this guide, we'll demystify the power of comparative analysis, revealing its practical applications, methodologies, and best practices. Whether you're a business leader, researcher, or simply someone seeking to make more informed decisions, join us as we explore the intricacies of comparative analysis and equip you with the tools to chart your course with confidence.

What is Comparative Analysis?

Comparative analysis is a systematic approach used to evaluate and compare two or more entities, variables, or options to identify similarities, differences, and patterns. It involves assessing the strengths, weaknesses, opportunities, and threats associated with each entity or option to make informed decisions.

The primary purpose of comparative analysis is to provide a structured framework for decision-making by:

  • Facilitating Informed Choices: Comparative analysis equips decision-makers with data-driven insights, enabling them to make well-informed choices among multiple options.
  • Identifying Trends and Patterns: It helps identify recurring trends, patterns, and relationships among entities or variables, shedding light on underlying factors influencing outcomes.
  • Supporting Problem Solving: Comparative analysis aids in solving complex problems by systematically breaking them down into manageable components and evaluating potential solutions.
  • Enhancing Transparency: By comparing multiple options, comparative analysis promotes transparency in decision-making processes, allowing stakeholders to understand the rationale behind choices.
  • Mitigating Risks: It helps assess the risks associated with each option, allowing organizations to develop risk mitigation strategies and make risk-aware decisions.
  • Optimizing Resource Allocation: Comparative analysis assists in allocating resources efficiently by identifying areas where resources can be optimized for maximum impact.
  • Driving Continuous Improvement: By comparing current performance with historical data or benchmarks, organizations can identify improvement areas and implement growth strategies.

Importance of Comparative Analysis in Decision-Making

  • Data-Driven Decision-Making: Comparative analysis relies on empirical data and objective evaluation, reducing the influence of biases and subjective judgments in decision-making. It ensures decisions are based on facts and evidence.
  • Objective Assessment: It provides an objective and structured framework for evaluating options, allowing decision-makers to focus on key criteria and avoid making decisions solely based on intuition or preferences.
  • Risk Assessment: Comparative analysis helps assess and quantify risks associated with different options. This risk awareness enables organizations to make proactive risk management decisions.
  • Prioritization: By ranking options based on predefined criteria, comparative analysis enables decision-makers to prioritize actions or investments, directing resources to areas with the most significant impact.
  • Strategic Planning: It is integral to strategic planning, helping organizations align their decisions with overarching goals and objectives. Comparative analysis ensures decisions are consistent with long-term strategies.
  • Resource Allocation: Organizations often have limited resources. Comparative analysis assists in allocating these resources effectively, ensuring they are directed toward initiatives with the highest potential returns.
  • Continuous Improvement: Comparative analysis supports a culture of continuous improvement by identifying areas for enhancement and guiding iterative decision-making processes.
  • Stakeholder Communication: It enhances transparency in decision-making, making it easier to communicate decisions to stakeholders. Stakeholders can better understand the rationale behind choices when supported by comparative analysis.
  • Competitive Advantage: In business and competitive environments, comparative analysis can provide a competitive edge by identifying opportunities to outperform competitors or address weaknesses.
  • Informed Innovation: When evaluating new products, technologies, or strategies, comparative analysis guides the selection of the most promising options, reducing the risk of investing in unsuccessful ventures.

In summary, comparative analysis is a valuable tool that empowers decision-makers across various domains to make informed, data-driven choices, manage risks, allocate resources effectively, and drive continuous improvement. Its structured approach enhances decision quality and transparency, contributing to the success and competitiveness of organizations and research endeavors.

How to Prepare for Comparative Analysis?

1. Define Objectives and Scope

Before you begin your comparative analysis, clearly defining your objectives and the scope of your analysis is essential. This step lays the foundation for the entire process. Here's how to approach it:

  • Identify Your Goals: Start by asking yourself what you aim to achieve with your comparative analysis. Are you trying to choose between two products for your business? Are you evaluating potential investment opportunities? Knowing your objectives will help you stay focused throughout the analysis.
  • Define Scope: Determine the boundaries of your comparison. What will you include, and what will you exclude? For example, if you're analyzing market entry strategies for a new product, specify whether you're looking at a specific geographic region or a particular target audience.
  • Stakeholder Alignment: Ensure that all stakeholders involved in the analysis understand and agree on the objectives and scope. This alignment will prevent misunderstandings and ensure the analysis meets everyone's expectations.

2. Gather Relevant Data and Information

The quality of your comparative analysis heavily depends on the data and information you gather. Here's how to approach this crucial step:

  • Data Sources: Identify where you'll obtain the necessary data. Will you rely on primary sources, such as surveys and interviews, to collect original data? Or will you use secondary sources, like published research and industry reports, to access existing data? Consider the advantages and disadvantages of each source.
  • Data Collection Plan: Develop a plan for collecting data. This should include details about the methods you'll use, the timeline for data collection, and who will be responsible for gathering the data.
  • Data Relevance: Ensure that the data you collect is directly relevant to your objectives. Irrelevant or extraneous data can lead to confusion and distract from the core analysis.

3. Select Appropriate Criteria for Comparison

Choosing the right criteria for comparison is critical to a successful comparative analysis. Here's how to go about it:

  • Relevance to Objectives: Your chosen criteria should align closely with your analysis objectives. For example, if you're comparing job candidates, your criteria might include skills, experience, and cultural fit.
  • Measurability: Consider whether you can quantify the criteria. Measurable criteria are easier to analyze. If you're comparing marketing campaigns, you might measure criteria like click-through rates, conversion rates, and return on investment.
  • Weighting Criteria: Not all criteria are equally important. You'll need to assign weights to each criterion based on its relative importance. Weighting helps ensure that the most critical factors have a more significant impact on the final decision.

4. Establish a Clear Framework

Once you have your objectives, data, and criteria in place, it's time to establish a clear framework for your comparative analysis. This framework will guide your process and ensure consistency. Here's how to do it:

  • Comparative Matrix: Consider using a comparative matrix or spreadsheet to organize your data. Each row in the matrix represents an option or entity you're comparing, and each column corresponds to a criterion. This visual representation makes it easy to compare and contrast data; a minimal sketch follows after this list.
  • Timeline: Determine the time frame for your analysis. Is it a one-time comparison, or will you conduct ongoing analyses? Having a defined timeline helps you manage the analysis process efficiently.
  • Define Metrics: Specify the metrics or scoring system you'll use to evaluate each criterion. For example, if you're comparing potential office locations, you might use a scoring system from 1 to 5 for factors like cost, accessibility, and amenities.
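
To make this framework concrete, here is a minimal sketch of a comparative matrix in Python using pandas. The options ("Office A" to "Office C"), the criteria, and the 1-to-5 scores are invented for illustration; substitute your own options, criteria, and scoring rules.

```python
import pandas as pd

# Each row is an option being compared; each column is a criterion scored
# on a 1-5 scale (5 = best). All values are invented for illustration.
matrix = pd.DataFrame(
    {
        "cost": [4, 2, 3],
        "accessibility": [3, 5, 4],
        "amenities": [2, 4, 5],
    },
    index=["Office A", "Office B", "Office C"],
)

# A simple unweighted total gives a first-pass ranking of the options;
# weighting the criteria (covered later) refines this.
matrix["total_score"] = matrix.sum(axis=1)
print(matrix.sort_values("total_score", ascending=False))
```

Keeping everything in one table makes it easy to add criteria, re-score options, or feed the matrix into the weighting and scoring steps described later.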

With your objectives, data, criteria, and framework established, you're ready to move on to the next phase of comparative analysis: data collection and organization.

Comparative Analysis Data Collection

Data collection and organization are critical steps in the comparative analysis process. We'll explore how to gather and structure the data you need for a successful analysis.

1. Utilize Primary Data Sources

Primary data sources involve gathering original data directly from the source. This approach offers unique advantages, allowing you to tailor your data collection to your specific research needs.

Some popular primary data sources include:

  • Surveys and Questionnaires: Design surveys or questionnaires and distribute them to collect specific information from individuals or groups. This method is ideal for obtaining firsthand insights, such as customer preferences or employee feedback.
  • Interviews: Conduct structured interviews with relevant stakeholders or experts. Interviews provide an opportunity to delve deeper into subjects and gather qualitative data, making them valuable for in-depth analysis.
  • Observations: Directly observe and record data from real-world events or settings. Observational data can be instrumental in fields like anthropology, ethnography, and environmental studies.
  • Experiments: In controlled environments, experiments allow you to manipulate variables and measure their effects. This method is common in scientific research and product testing.

When using primary data sources, consider factors like sample size, survey design, and data collection methods to ensure the reliability and validity of your data.

2. Harness Secondary Data Sources

Secondary data sources involve using existing data collected by others. These sources can provide a wealth of information and save time and resources compared to primary data collection.

Here are common types of secondary data sources:

  • Public Records: Government publications, census data, and official reports offer valuable information on demographics, economic trends, and public policies. They are often free and readily accessible.
  • Academic Journals: Scholarly articles provide in-depth research findings across various disciplines. They are helpful for accessing peer-reviewed studies and staying current with academic discourse.
  • Industry Reports: Industry-specific reports and market research publications offer insights into market trends, consumer behavior, and competitive landscapes. They are essential for businesses making strategic decisions.
  • Online Databases: Online platforms like Statista, PubMed, and Google Scholar provide a vast repository of data and research articles. They offer search capabilities and access to a wide range of data sets.

When using secondary data sources, critically assess the credibility, relevance, and timeliness of the data. Ensure that it aligns with your research objectives.

3. Ensure and Validate Data Quality

Data quality is paramount in comparative analysis. Poor-quality data can lead to inaccurate conclusions and flawed decision-making. Here's how to ensure data validation and reliability:

  • Cross-Verification: Whenever possible, cross-verify data from multiple sources. Consistency among different sources enhances the reliability of the data.
  • Sample Size: Ensure that your sample is large enough to support statistically meaningful analysis. A small sample may not accurately represent the population.
  • Data Integrity: Check for data integrity issues, such as missing values, outliers, or duplicate entries. Address these issues before analysis to maintain data quality.
  • Data Source Reliability: Assess the reliability and credibility of the data sources themselves. Consider factors like the reputation of the institution or organization providing the data.

4. Organize Data Effectively

Structuring your data for comparison is a critical step in the analysis process. Organized data makes it easier to draw insights and make informed decisions. Here's how to structure data effectively:

  • Data Cleaning: Before analysis, clean your data to remove inconsistencies, errors, and irrelevant information. Data cleaning may involve data transformation, imputation of missing values, and removing outliers (a minimal sketch follows after this list).
  • Normalization: Standardize data to ensure fair comparisons. Normalization adjusts data to a standard scale, making it possible to compare variables with different units or ranges.
  • Variable Labeling: Clearly label variables and data points for easy identification. Proper labeling enhances the transparency and understandability of your analysis.
  • Data Organization: Organize data into a format that suits your analysis methods. For quantitative analysis, this might mean creating a matrix, while qualitative analysis may involve categorizing data into themes.
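
As a minimal sketch of the cleaning and normalization steps above, the snippet below deduplicates a small invented dataset, imputes a missing value, and rescales the numeric columns to a 0 to 1 range with pandas. The column names and figures are hypothetical.

```python
import pandas as pd

# Illustrative raw data for three options; the column names are hypothetical.
raw = pd.DataFrame(
    {
        "option": ["A", "B", "B", "C"],
        "annual_cost": [120_000, 95_000, 95_000, None],
        "satisfaction": [4.1, 3.6, 3.6, 4.8],
    }
)

# Cleaning: drop exact duplicate rows and impute the missing cost with the median.
clean = raw.drop_duplicates().reset_index(drop=True)
clean["annual_cost"] = clean["annual_cost"].fillna(clean["annual_cost"].median())

# Normalization: rescale each numeric column to the 0-1 range so criteria
# measured in different units (currency vs. a 1-5 rating) can be compared.
numeric = clean[["annual_cost", "satisfaction"]]
normalized = (numeric - numeric.min()) / (numeric.max() - numeric.min())

# For "lower is better" criteria such as cost, you may also want to invert
# the normalized value (1 - value) before scoring.
print(clean[["option"]].join(normalized.add_suffix("_norm")))
```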

By paying careful attention to data collection, validation, and organization, you'll set the stage for a robust and insightful comparative analysis. Next, we'll explore various methodologies you can employ in your analysis, ranging from qualitative approaches to quantitative methods and examples.

Comparative Analysis Methods

When it comes to comparative analysis, various methodologies are available, each suited to different research goals and data types. In this section, we'll explore five prominent methodologies in detail.

Qualitative Comparative Analysis (QCA)

Qualitative Comparative Analysis (QCA) is a methodology often used when dealing with complex, non-linear relationships among variables. It seeks to identify patterns and configurations among factors that lead to specific outcomes.

  • Case-by-Case Analysis: QCA involves evaluating individual cases (e.g., organizations, regions, or events) rather than analyzing aggregate data. Each case's unique characteristics are considered.
  • Boolean Logic: QCA employs Boolean algebra to analyze data. Variables are categorized as either present or absent, allowing for the examination of different combinations and logical relationships.
  • Necessary and Sufficient Conditions: QCA aims to identify necessary and sufficient conditions for a specific outcome to occur. It helps answer questions like, "What conditions are necessary for a successful product launch?"
  • Fuzzy Set Theory: In some cases, QCA may use fuzzy set theory to account for degrees of membership in a category, allowing for more nuanced analysis.

QCA is particularly useful in fields such as sociology, political science, and organizational studies, where understanding complex interactions is essential.
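
As a rough illustration of the crisp-set logic behind QCA, the sketch below builds a small truth table and computes simple consistency measures for necessity and sufficiency using pandas. The cases, conditions, and outcome are entirely invented, and dedicated QCA software would add the Boolean minimization step this sketch omits.

```python
import pandas as pd

# Crisp-set data: 1 = condition or outcome present, 0 = absent.
# The cases, conditions, and outcome are entirely invented.
cases = pd.DataFrame(
    {
        "strong_leadership": [1, 1, 0, 1, 0, 1],
        "external_funding":  [1, 0, 1, 1, 0, 0],
        "success":           [1, 0, 1, 1, 0, 1],
    },
    index=[f"case_{i}" for i in range(1, 7)],
)

# Truth table: how many cases show each combination of conditions, and how
# consistently that combination is associated with the outcome.
truth_table = (
    cases.groupby(["strong_leadership", "external_funding"])["success"]
    .agg(n_cases="count", outcome_consistency="mean")
    .reset_index()
)
print(truth_table)

# A condition is (roughly) sufficient if the outcome occurs whenever the
# condition is present, and necessary if the condition is present whenever
# the outcome occurs.
cond, outcome = cases["external_funding"], cases["success"]
sufficiency = (outcome[cond == 1] == 1).mean()  # outcome rate among condition-present cases
necessity = (cond[outcome == 1] == 1).mean()    # condition rate among outcome-present cases
print(f"sufficiency consistency: {sufficiency:.2f}, necessity consistency: {necessity:.2f}")
```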

Quantitative Comparative Analysis

Quantitative Comparative Analysis involves the use of numerical data and statistical techniques to compare and analyze variables. It's suitable for situations where data is quantitative, and relationships can be expressed numerically.

  • Statistical Tools: Quantitative comparative analysis relies on statistical methods like regression analysis, correlation, and hypothesis testing. These tools help identify relationships, dependencies, and trends within datasets.
  • Data Measurement: Ensure that variables are measured consistently using appropriate scales (e.g., ordinal, interval, ratio) for meaningful analysis. Variables may include numerical values like revenue, customer satisfaction scores, or product performance metrics.
  • Data Visualization: Create visual representations of data using charts, graphs, and plots. Visualization aids in understanding complex relationships and presenting findings effectively.
  • Statistical Significance: Assess the statistical significance of relationships. Statistical significance indicates whether observed differences or relationships are likely to be real rather than due to chance.

Quantitative comparative analysis is commonly applied in economics, social sciences, and market research to draw empirical conclusions from numerical data.
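
For instance, a two-sample t-test is one simple way to check whether an observed difference between two options is likely to be real rather than noise. The sketch below uses SciPy with invented conversion-rate data; the campaign names, figures, and the 5% significance threshold are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Invented daily conversion rates (%) for two marketing campaigns.
campaign_a = np.array([2.1, 2.4, 2.2, 2.8, 2.5, 2.3, 2.6])
campaign_b = np.array([2.9, 3.1, 2.7, 3.3, 3.0, 2.8, 3.2])

# Welch's t-test (no equal-variance assumption) compares the two means.
t_stat, p_value = stats.ttest_ind(campaign_a, campaign_b, equal_var=False)

print(f"mean A = {campaign_a.mean():.2f}, mean B = {campaign_b.mean():.2f}")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is statistically significant at the 5% level.")
else:
    print("The difference could plausibly be due to chance.")
```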

Case Studies

Case studies involve in-depth examinations of specific instances or cases to gain insights into real-world scenarios. Comparative case studies allow researchers to compare and contrast multiple cases to identify patterns, differences, and lessons.

  • Narrative Analysis: Case studies often involve narrative analysis, where researchers construct detailed narratives of each case, including context, events, and outcomes.
  • Contextual Understanding: In comparative case studies, it's crucial to consider the context within which each case operates. Understanding the context helps interpret findings accurately.
  • Cross-Case Analysis: Researchers conduct cross-case analysis to identify commonalities and differences across cases. This process can lead to the discovery of factors that influence outcomes.
  • Triangulation: To enhance the validity of findings, researchers may use multiple data sources and methods to triangulate information and ensure reliability.

Case studies are prevalent in fields like psychology, business, and sociology, where deep insights into specific situations are valuable.

SWOT Analysis

SWOT Analysis is a strategic tool used to assess the Strengths, Weaknesses, Opportunities, and Threats associated with a particular entity or situation. While it's commonly used in business, it can be adapted for various comparative analyses.

  • Internal and External Factors: SWOT Analysis examines both internal factors (Strengths and Weaknesses), such as organizational capabilities, and external factors (Opportunities and Threats), such as market conditions and competition.
  • Strategic Planning: The insights from SWOT Analysis inform strategic decision-making. By identifying strengths and opportunities, organizations can leverage their advantages. Likewise, addressing weaknesses and threats helps mitigate risks.
  • Visual Representation: SWOT Analysis is often presented as a matrix or a 2x2 grid, making it visually accessible and easy to communicate to stakeholders.
  • Continuous Monitoring: SWOT Analysis is not a one-time exercise. Organizations use it periodically to adapt to changing circumstances and make informed decisions.

SWOT Analysis is versatile and can be applied in business, healthcare, education, and any context where a structured assessment of factors is needed.

Benchmarking

Benchmarking involves comparing an entity's performance, processes, or practices to those of industry leaders or best-in-class organizations. It's a powerful tool for continuous improvement and competitive analysis.

  • Identify Performance Gaps: Benchmarking helps identify areas where an entity lags behind its peers or industry standards. These performance gaps highlight opportunities for improvement.
  • Data Collection: Gather data on key performance metrics from both internal and external sources. This data collection phase is crucial for meaningful comparisons.
  • Comparative Analysis: Compare your organization's performance data with that of benchmark organizations. This analysis can reveal where you excel and where adjustments are needed.
  • Continuous Improvement: Benchmarking is a dynamic process that encourages continuous improvement. Organizations use benchmarking findings to set performance goals and refine their strategies.

Benchmarking is widely used in business, manufacturing, healthcare, and customer service to drive excellence and competitiveness.
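
A minimal sketch of a benchmarking gap analysis is shown below, assuming you already have your own metric values and those of a benchmark organization; all metric names and figures are invented placeholders.

```python
import pandas as pd

# Key performance metrics for your organization vs. a best-in-class benchmark.
# All figures are invented placeholders.
data = pd.DataFrame(
    {
        "ours":      [72.0, 4.1, 18.0],
        "benchmark": [90.0, 4.6, 15.0],
    },
    index=["on_time_delivery_pct", "customer_satisfaction", "cost_per_unit"],
)

# Absolute and relative gaps against the benchmark. Note that for metrics
# where lower is better (e.g. cost per unit), a positive gap is a weakness.
data["gap"] = data["ours"] - data["benchmark"]
data["gap_pct"] = 100 * data["gap"] / data["benchmark"]
print(data.sort_values("gap_pct"))
```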

Each of these methodologies brings a unique perspective to comparative analysis, allowing you to choose the one that best aligns with your research objectives and the nature of your data. The choice between qualitative and quantitative methods, or a combination of both, depends on the complexity of the analysis and the questions you seek to answer.

How to Conduct Comparative Analysis?

Once you've prepared your data and chosen an appropriate methodology, it's time to dive into the process of conducting a comparative analysis. We will guide you through the essential steps to extract meaningful insights from your data.

1. Identify Key Variables and Metrics

Identifying key variables and metrics is the first crucial step in conducting a comparative analysis. These are the factors or indicators you'll use to assess and compare your options.

  • Relevance to Objectives: Ensure the chosen variables and metrics align closely with your analysis objectives. When comparing marketing strategies, relevant metrics might include customer acquisition cost, conversion rate, and retention.
  • Quantitative vs. Qualitative: Decide whether your analysis will focus on quantitative data (numbers) or qualitative data (descriptive information). In some cases, a combination of both may be appropriate.
  • Data Availability: Consider the availability of data. Ensure you can access reliable and up-to-date data for all selected variables and metrics.
  • KPIs: Key Performance Indicators (KPIs) are often used as the primary metrics in comparative analysis. These are metrics that directly relate to your goals and objectives.

2. Visualize Data for Clarity

Data visualization techniques play a vital role in making complex information more accessible and understandable. Effective data visualization allows you to convey insights and patterns to stakeholders. Consider the following approaches:

  • Charts and Graphs: Use various types of charts, such as bar charts, line graphs, and pie charts, to represent data. For example, a line graph can illustrate trends over time, while a bar chart can compare values across categories.
  • Heatmaps: Heatmaps are particularly useful for visualizing large datasets and identifying patterns through color-coding. They can reveal correlations, concentrations, and outliers.
  • Scatter Plots: Scatter plots help visualize relationships between two variables. They are especially useful for identifying trends, clusters, or outliers.
  • Dashboards: Create interactive dashboards that allow users to explore data and customize views. Dashboards are valuable for ongoing analysis and reporting.
  • Infographics: For presentations and reports, consider using infographics to summarize key findings in a visually engaging format.

Effective data visualization not only enhances understanding but also aids in decision-making by providing clear insights at a glance.
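
As a small illustration, the sketch below draws a grouped bar chart comparing two options across three criteria with matplotlib; the options, criteria, and scores are invented.

```python
import numpy as np
import matplotlib.pyplot as plt

# Illustrative 1-5 scores for two options across three criteria.
criteria = ["Cost", "Quality", "Support"]
option_a = [3, 4, 2]
option_b = [4, 3, 5]

x = np.arange(len(criteria))
width = 0.35  # width of each bar

fig, ax = plt.subplots()
ax.bar(x - width / 2, option_a, width, label="Option A")
ax.bar(x + width / 2, option_b, width, label="Option B")
ax.set_xticks(x)
ax.set_xticklabels(criteria)
ax.set_ylabel("Score (1-5)")
ax.set_title("Comparative scores by criterion")
ax.legend()
plt.tight_layout()
plt.show()
```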

3. Establish Clear Comparative Frameworks

A well-structured comparative framework provides a systematic approach to your analysis. It ensures consistency and enables you to make meaningful comparisons. Here's how to create one:

  • Comparison Matrices: Consider using matrices or spreadsheets to organize your data. Each row represents an option or entity, and each column corresponds to a variable or metric. This matrix format allows for side-by-side comparisons.
  • Decision Trees: In complex decision-making scenarios, decision trees help map out possible outcomes based on different criteria and variables. They visualize the decision-making process.
  • Scenario Analysis: Explore different scenarios by altering variables or criteria to understand how changes impact outcomes. Scenario analysis is valuable for risk assessment and planning.
  • Checklists: Develop checklists or scoring sheets to systematically evaluate each option against predefined criteria. Checklists ensure that no essential factors are overlooked.

A well-structured comparative framework simplifies the analysis process, making it easier to draw meaningful conclusions and make informed decisions.

4. Evaluate and Score Criteria

Evaluating and scoring criteria is a critical step in comparative analysis, as it quantifies the performance of each option against the chosen criteria.

  • Scoring System: Define a scoring system that assigns values to each criterion for every option. Common scoring systems include numerical scales, percentage scores, or qualitative ratings (e.g., high, medium, low).
  • Consistency: Ensure consistency in scoring by defining clear guidelines for each score. Provide examples or descriptions to help evaluators understand what each score represents.
  • Data Collection: Collect data or information relevant to each criterion for all options. This may involve quantitative data (e.g., sales figures) or qualitative data (e.g., customer feedback).
  • Aggregation: Aggregate the scores for each option to obtain an overall evaluation. This can be done by summing the individual criterion scores or applying weighted averages.
  • Normalization: If your criteria have different measurement scales or units, consider normalizing the scores to create a level playing field for comparison.

5. Assign Importance to Criteria

Not all criteria are equally important in a comparative analysis. Weighting criteria allows you to reflect their relative significance in the final decision-making process.

  • Relative Importance: Assess the importance of each criterion in achieving your objectives. Criteria directly aligned with your goals may receive higher weights.
  • Weighting Methods: Choose a weighting method that suits your analysis. Common methods include expert judgment, analytic hierarchy process (AHP), or data-driven approaches based on historical performance.
  • Impact Analysis: Consider how changes in the weights assigned to criteria would affect the final outcome. This sensitivity analysis helps you understand the robustness of your decisions.
  • Stakeholder Input: Involve relevant stakeholders or decision-makers in the weighting process. Their input can provide valuable insights and ensure alignment with organizational goals.
  • Transparency: Clearly document the rationale behind the assigned weights to maintain transparency in your analysis.

By weighting criteria, you ensure that the most critical factors have a more significant influence on the final evaluation, aligning the analysis more closely with your objectives and priorities.
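
Putting the scoring and weighting steps together, the sketch below computes weighted totals for a set of options and then re-runs the calculation with a different set of weights as a crude sensitivity check. The vendors, criteria, scores, and weights are all invented.

```python
import pandas as pd

# Criterion scores on a 1-5 scale for each option (all values invented).
scores = pd.DataFrame(
    {
        "cost": [4, 2, 3],
        "quality": [3, 5, 4],
        "support": [2, 4, 5],
    },
    index=["Vendor A", "Vendor B", "Vendor C"],
)

def weighted_totals(scores: pd.DataFrame, weights: dict) -> pd.Series:
    """Multiply each criterion by its weight and sum per option."""
    w = pd.Series(weights)
    w = w / w.sum()  # normalize so the weights add up to 1
    return (scores[w.index] * w).sum(axis=1).sort_values(ascending=False)

baseline = {"cost": 0.5, "quality": 0.3, "support": 0.2}
alternative = {"cost": 0.2, "quality": 0.4, "support": 0.4}  # crude sensitivity check

print("Baseline weights:\n", weighted_totals(scores, baseline))
print("\nAlternative weights:\n", weighted_totals(scores, alternative))
```

If the ranking changes between the two weightings, the decision is sensitive to how the criteria are weighted and deserves closer scrutiny before being finalized.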

With these steps in place, you're well-prepared to conduct a comprehensive comparative analysis. The next phase involves interpreting your findings, drawing conclusions, and making informed decisions based on the insights you've gained.

Comparative Analysis Interpretation

Interpreting the results of your comparative analysis is a crucial phase that transforms data into actionable insights. We'll delve into various aspects of interpretation and how to make sense of your findings.

  • Contextual Understanding: Before diving into the data, consider the broader context of your analysis. Understand the industry trends, market conditions, and any external factors that may have influenced your results.
  • Drawing Conclusions: Summarize your findings clearly and concisely. Identify trends, patterns, and significant differences among the options or variables you've compared.
  • Quantitative vs. Qualitative Analysis: Depending on the nature of your data and analysis, you may need to balance both quantitative and qualitative interpretations. Qualitative insights can provide context and nuance to quantitative findings.
  • Comparative Visualization: Visual aids such as charts, graphs, and tables can help convey your conclusions effectively. Choose visual representations that align with the nature of your data and the key points you want to emphasize.
  • Outliers and Anomalies: Identify and explain any outliers or anomalies in your data. Understanding these exceptions can provide valuable insights into unusual cases or factors affecting your analysis.
  • Cross-Validation: Validate your conclusions by comparing them with external benchmarks, industry standards, or expert opinions. Cross-validation helps ensure the reliability of your findings.
  • Implications for Decision-Making: Discuss how your analysis informs decision-making. Clearly articulate the practical implications of your findings and their relevance to your initial objectives.
  • Actionable Insights: Emphasize actionable insights that can guide future strategies, policies, or actions. Make recommendations based on your analysis, highlighting the steps needed to capitalize on strengths or address weaknesses.
  • Continuous Improvement: Encourage a culture of continuous improvement by using your analysis as a feedback mechanism. Suggest ways to monitor and adapt strategies over time based on evolving circumstances.

Comparative Analysis Applications

Comparative analysis is a versatile methodology that finds application in various fields and scenarios. Let's explore some of the most common and impactful applications.

Business Decision-Making

Comparative analysis is widely employed in business to inform strategic decisions and drive success. Key applications include:

Market Research and Competitive Analysis

  • Objective: To assess market opportunities and evaluate competitors.
  • Methods: Analyzing market trends, customer preferences, competitor strengths and weaknesses, and market share.
  • Outcome: Informed product development, pricing strategies, and market entry decisions.

Product Comparison and Benchmarking

  • Objective: To compare the performance and features of products or services.
  • Methods: Evaluating product specifications, customer reviews, and pricing.
  • Outcome: Identifying strengths and weaknesses, improving product quality, and setting competitive pricing.

Financial Analysis

  • Objective: To evaluate financial performance and make investment decisions.
  • Methods: Comparing financial statements, ratios, and performance indicators of companies.
  • Outcome: Informed investment choices, risk assessment, and portfolio management.

Healthcare and Medical Research

In the healthcare and medical research fields, comparative analysis is instrumental in understanding diseases, treatment options, and healthcare systems.

Clinical Trials and Drug Development

  • Objective: To compare the effectiveness of different treatments or drugs.
  • Methods: Analyzing clinical trial data, patient outcomes, and side effects.
  • Outcome: Informed decisions about drug approvals, treatment protocols, and patient care.

Health Outcomes Research

  • Objective: To assess the impact of healthcare interventions.
  • Methods: Comparing patient health outcomes before and after treatment or between different treatment approaches.
  • Outcome: Improved healthcare guidelines, cost-effectiveness analysis, and patient care plans.

Healthcare Systems Evaluation

  • Objective: To assess the performance of healthcare systems.
  • Methods: Comparing healthcare delivery models, patient satisfaction, and healthcare costs.
  • Outcome: Informed healthcare policy decisions, resource allocation, and system improvements.

Social Sciences and Policy Analysis

Comparative analysis is a fundamental tool in social sciences and policy analysis, aiding in understanding complex societal issues.

Educational Research

  • Objective: To compare educational systems and practices.
  • Methods: Analyzing student performance, curriculum effectiveness, and teaching methods.
  • Outcome: Informed educational policies, curriculum development, and school improvement strategies.

Political Science

  • Objective: To study political systems, elections, and governance.
  • Methods: Comparing election outcomes, policy impacts, and government structures.
  • Outcome: Insights into political behavior, policy effectiveness, and governance reforms.

Social Welfare and Poverty Analysis

  • Objective: To evaluate the impact of social programs and policies.
  • Methods: Comparing the well-being of individuals or communities with and without access to social assistance.
  • Outcome: Informed policymaking, poverty reduction strategies, and social program improvements.

Environmental Science and Sustainability

Comparative analysis plays a pivotal role in understanding environmental issues and promoting sustainability.

Environmental Impact Assessment

  • Objective: To assess the environmental consequences of projects or policies.
  • Methods: Comparing ecological data, resource use, and pollution levels.
  • Outcome: Informed environmental mitigation strategies, sustainable development plans, and regulatory decisions.

Climate Change Analysis

  • Objective: To study climate patterns and their impacts.
  • Methods: Comparing historical climate data, temperature trends, and greenhouse gas emissions.
  • Outcome: Insights into climate change causes, adaptation strategies, and policy recommendations.

Ecosystem Health Assessment

  • Objective: To evaluate the health and resilience of ecosystems.
  • Methods: Comparing biodiversity, habitat conditions, and ecosystem services.
  • Outcome: Conservation efforts, restoration plans, and ecological sustainability measures.

Technology and Innovation

Comparative analysis is crucial in the fast-paced world of technology and innovation.

Product Development and Innovation

  • Objective: To assess the competitiveness and innovation potential of products or technologies.
  • Methods: Comparing research and development investments, technology features, and market demand.
  • Outcome: Informed innovation strategies, product roadmaps, and patent decisions.

User Experience and Usability Testing

  • Objective: To evaluate the user-friendliness of software applications or digital products.
  • Methods: Comparing user feedback, usability metrics, and user interface designs.
  • Outcome: Improved user experiences, interface redesigns, and product enhancements.

Technology Adoption and Market Entry

  • Objective: To analyze market readiness and risks for new technologies.
  • Methods: Comparing market conditions, regulatory landscapes, and potential barriers.
  • Outcome: Informed market entry strategies, risk assessments, and investment decisions.

These diverse applications of comparative analysis highlight its flexibility and importance in decision-making across various domains. Whether in business, healthcare, social sciences, environmental studies, or technology, comparative analysis empowers researchers and decision-makers to make informed choices and drive positive outcomes.

Comparative Analysis Best Practices

Successful comparative analysis relies on following best practices and avoiding common pitfalls. Implementing these practices enhances the effectiveness and reliability of your analysis.

  • Clearly Defined Objectives: Start with well-defined objectives that outline what you aim to achieve through the analysis. Clear objectives provide focus and direction.
  • Data Quality Assurance: Ensure data quality by validating, cleaning, and normalizing your data. Poor-quality data can lead to inaccurate conclusions.
  • Transparent Methodologies: Clearly explain the methodologies and techniques you've used for analysis. Transparency builds trust and allows others to assess the validity of your approach.
  • Consistent Criteria: Maintain consistency in your criteria and metrics across all options or variables. Inconsistent criteria can lead to biased results.
  • Sensitivity Analysis: Conduct sensitivity analysis by varying key parameters, such as weights or assumptions, to assess the robustness of your conclusions.
  • Stakeholder Involvement: Involve relevant stakeholders throughout the analysis process. Their input can provide valuable perspectives and ensure alignment with organizational goals.
  • Critical Evaluation of Assumptions: Identify and critically evaluate any assumptions made during the analysis. Assumptions should be explicit and justifiable.
  • Holistic View: Take a holistic view of the analysis by considering both short-term and long-term implications. Avoid focusing solely on immediate outcomes.
  • Documentation: Maintain thorough documentation of your analysis, including data sources, calculations, and decision criteria. Documentation supports transparency and facilitates reproducibility.
  • Continuous Learning: Stay updated with the latest analytical techniques, tools, and industry trends. Continuous learning helps you adapt your analysis to changing circumstances.
  • Peer Review: Seek peer review or expert feedback on your analysis. External perspectives can identify blind spots and enhance the quality of your work.
  • Ethical Considerations: Address ethical considerations, such as privacy and data protection, especially when dealing with sensitive or personal data.

By adhering to these best practices, you'll not only improve the rigor of your comparative analysis but also ensure that your findings are reliable, actionable, and aligned with your objectives.

Comparative Analysis Examples

To illustrate the practical application and benefits of comparative analysis, let's explore several real-world examples across different domains. These examples showcase how organizations and researchers leverage comparative analysis to make informed decisions, solve complex problems, and drive improvements:

Retail Industry - Price Competitiveness Analysis

Objective: A retail chain aims to assess its price competitiveness against competitors in the same market.

Methodology:

  • Collect pricing data for a range of products offered by the retail chain and its competitors.
  • Organize the data into a comparative framework, categorizing products by type and price range.
  • Calculate price differentials, averages, and percentiles for each product category.
  • Analyze the findings to identify areas where the retail chain's prices are higher or lower than competitors.

Outcome: The analysis reveals that the retail chain's prices are consistently lower in certain product categories but higher in others. This insight informs pricing strategies, allowing the retailer to adjust prices to remain competitive in the market.
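
A minimal sketch of the price-differential calculation described in this example, using pandas with invented prices and categories:

```python
import pandas as pd

# Invented prices for matched products at our chain and one competitor.
prices = pd.DataFrame(
    {
        "category": ["dairy", "dairy", "bakery", "bakery", "household", "household"],
        "our_price": [1.20, 2.50, 1.80, 3.10, 4.50, 6.20],
        "competitor_price": [1.25, 2.40, 1.95, 3.00, 4.20, 6.50],
    }
)

# Per-product differential: negative means we are cheaper than the competitor.
prices["differential"] = prices["our_price"] - prices["competitor_price"]

# Aggregate by category to see where we are systematically cheaper or dearer.
summary = prices.groupby("category")["differential"].agg(["mean", "median", "count"])
print(summary.sort_values("mean"))
```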

Healthcare - Comparative Effectiveness Research

Objective: Researchers aim to compare the effectiveness of two different treatment methods for a specific medical condition.

Methodology:

  • Recruit patients with the medical condition and randomly assign them to two treatment groups.
  • Collect data on treatment outcomes, including symptom relief, side effects, and recovery times.
  • Analyze the data using statistical methods to compare the treatment groups.
  • Consider factors like patient demographics and baseline health status as potential confounding variables.

Outcome: The comparative analysis reveals that one treatment method is statistically more effective than the other in relieving symptoms and has fewer side effects. This information guides medical professionals in recommending the more effective treatment to patients.

Environmental Science - Carbon Emission Analysis

Objective: An environmental organization seeks to compare carbon emissions from various transportation modes in a metropolitan area.

Methodology:

  • Collect data on the number of vehicles, their types (e.g., cars, buses, bicycles), and fuel consumption for each mode of transportation.
  • Calculate the total carbon emissions for each mode based on fuel consumption and emission factors.
  • Create visualizations such as bar charts and pie charts to represent the emissions from each transportation mode.
  • Consider factors like travel distance, occupancy rates, and the availability of alternative fuels.

Outcome: The comparative analysis reveals that public transportation generates significantly lower carbon emissions per passenger mile compared to individual car travel. This information supports advocacy for increased public transit usage to reduce carbon footprint.
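
The core arithmetic of such an emissions comparison can be sketched as follows; the fuel-use, emission-factor, and occupancy figures are invented placeholders rather than real measurements.

```python
# Illustrative inputs per transport mode (placeholder values, not real data):
# fuel burned per vehicle-mile (litres), CO2 emitted per litre of fuel (kg),
# and average passengers per vehicle.
modes = {
    "car":     {"fuel_per_mile": 0.10, "kg_co2_per_litre": 2.3, "avg_occupancy": 1.4},
    "bus":     {"fuel_per_mile": 0.45, "kg_co2_per_litre": 2.7, "avg_occupancy": 18.0},
    "bicycle": {"fuel_per_mile": 0.00, "kg_co2_per_litre": 0.0, "avg_occupancy": 1.0},
}

for name, m in modes.items():
    per_vehicle_mile = m["fuel_per_mile"] * m["kg_co2_per_litre"]  # kg CO2 per vehicle-mile
    per_passenger_mile = per_vehicle_mile / m["avg_occupancy"]     # kg CO2 per passenger-mile
    print(f"{name}: {per_passenger_mile:.3f} kg CO2 per passenger-mile")
```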

Technology Industry - Feature Comparison for Software Development Tools

Objective: A software development team needs to choose the most suitable development tool for an upcoming project.

Methodology:

  • Create a list of essential features and capabilities required for the project.
  • Research and compile information on available development tools in the market.
  • Develop a comparative matrix or scoring system to evaluate each tool's features against the project requirements.
  • Assign weights to features based on their importance to the project.

Outcome: The comparative analysis highlights that Tool A excels in essential features critical to the project, such as version control integration and debugging capabilities. The development team selects Tool A as the preferred choice for the project.
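
A weighted scoring matrix like the one described fits in a few lines of Python. The weights, feature names, and 0–5 scores below are hypothetical placeholders, not an evaluation of any real tools:

```python
# Hypothetical feature weights (importance to the project) and per-tool scores (0-5).
weights = {"version_control": 0.30, "debugging": 0.25, "ci_integration": 0.25, "docs": 0.20}

tools = {
    "Tool A": {"version_control": 5, "debugging": 5, "ci_integration": 4, "docs": 3},
    "Tool B": {"version_control": 4, "debugging": 3, "ci_integration": 5, "docs": 4},
}

# Weighted score = sum over features of (weight * score).
for name, scores in tools.items():
    total = sum(weights[f] * scores[f] for f in weights)
    print(f"{name}: weighted score = {total:.2f}")
```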

Educational Research - Comparative Study of Teaching Methods

Objective: A school district aims to improve student performance by comparing the effectiveness of traditional classroom teaching with online learning.

  • Randomly assign students to two groups: one taught using traditional methods and the other through online courses.
  • Administer pre- and post-course assessments to measure knowledge gain.
  • Collect feedback from students and teachers on the learning experiences.
  • Analyze assessment scores and feedback to compare the effectiveness and satisfaction levels of both teaching methods.

Outcome: The comparative analysis reveals that online learning leads to similar knowledge gains as traditional classroom teaching. However, students report higher satisfaction and flexibility with the online approach. The school district considers incorporating online elements into its curriculum.
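
As a rough sketch of the pre/post comparison, the snippet below simulates assessment scores for the two groups and applies a Welch t-test to the per-student gains; the score distributions are invented, and a real study would likely also model covariates such as prior achievement.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical pre/post assessment scores (0-100) for two randomly assigned groups.
pre_classroom, post_classroom = rng.normal(62, 8, 80), rng.normal(74, 8, 80)
pre_online,    post_online    = rng.normal(61, 8, 82), rng.normal(73, 9, 82)

# Knowledge gain per student, then a between-group comparison of those gains.
gain_classroom = post_classroom - pre_classroom
gain_online = post_online - pre_online

t_stat, p_value = stats.ttest_ind(gain_classroom, gain_online, equal_var=False)
print(f"mean gain (classroom) = {gain_classroom.mean():.1f}")
print(f"mean gain (online)    = {gain_online.mean():.1f}")
print(f"difference in gains: t = {t_stat:.2f}, p = {p_value:.3f}")
```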

These examples illustrate the diverse applications of comparative analysis across industries and research domains. Whether optimizing pricing strategies in retail, evaluating treatment effectiveness in healthcare, assessing environmental impacts, choosing the right software tool, or improving educational methods, comparative analysis empowers decision-makers with valuable insights for informed choices and positive outcomes.

Conclusion for Comparative Analysis

Comparative analysis is your compass in the world of decision-making. It helps you see the bigger picture, spot opportunities, and navigate challenges. By defining your objectives, gathering data, applying methodologies, and following best practices, you can harness the power of comparative analysis to make informed choices and drive positive outcomes.

Remember, comparative analysis is not just a tool; it's a mindset that empowers you to transform data into insights and uncertainty into clarity. So, whether you're steering a business, conducting research, or facing life's choices, embrace comparative analysis as your trusted guide on the journey to better decisions. With it, you can chart your course, make impactful choices, and set sail toward success.

How to Conduct Comparative Analysis in Minutes?

Are you ready to revolutionize your approach to market research and comparative analysis? Appinio, a real-time market research platform, empowers you to harness the power of real-time consumer insights for swift, data-driven decisions. Here's why you should choose Appinio:

  • Speedy Insights: Get from questions to insights in minutes, enabling you to conduct comparative analysis without delay.
  • User-Friendly: No need for a PhD in research – our intuitive platform is designed for everyone, making it easy to collect and analyze data.
  • Global Reach: With access to over 90 countries and the ability to define your target group from 1200+ characteristics, Appinio provides a worldwide perspective for your comparative analysis.


Case Interview: Complete Prep Guide


Welcome to our preparation tips for case interviews!  Whether you are just curious about case interviews or are planning to apply for consulting internships or full-time jobs, these tips and resources will help you feel more prepared and confident.


A case interview is a role-playing exercise in which an employer assesses how logically and persuasively you can present a case. Rather than seeing if you get the “correct” answer, the objective is to evaluate your thought process. (Adapted with permission from Case In Point: Complete Case Interview Preparation by Marc Cosentino.)

Case interviews are very commonly used in the interview process for consulting firms and companies in similar industries. In the case interview, you will typically be given a business problem and then asked to solve it in a structured way. Learning this structure takes preparation and practice. You can learn more and practice using the resources listed below.  

Why are Case Interviews Used?

Case interviews allow employers to test and evaluate the following skills:

  • Analytical skills and logical ability to solve problems
  • Structure and thought process
  • Ability to ask for relevant data/information
  • Tolerance for ambiguity and data overload
  • Poise and communication skills under pressure and in front of a client

How can I prepare for Case Interviews?

1.) Read Management Consulted’s “Case Interview: Complete Prep Guide (2024)”

Management Consulted is a FREE resource for Tufts students: it offers case and consulting resources such as 500 sample cases, a Case Interview Bootcamp, Market Sizing Drills, Math Drills, case videos, a consulting firm directory, and more.

2.) Review additional resources:

  • Case in Point – This book, by Marc Cosentino, is a comprehensive guide that walks you through the case interview process from beginning to end. It has helped many students over the years and can serve as an excellent foundation for how to approach business problems.
  • Casequestions.com – The companion website to Marc Cosentino’s book listed above offers preparation for case interviews, along with links to the top 50 consulting firms.
  • Management Consulting Case Interviews: Cracking The Case – Tips for case interviews from the other side of the table, from Argopoint, a Boston management consulting firm specializing in legal department consulting for Fortune 500 companies.
  • Preplounge.com – Free access to up to 6 practice interviews with peers, selected cases, and video case solutions.
  • RocketBlocks – Features consulting preparation such as drills and coaching.
  • Practice sample online cases on consulting firm websites such as McKinsey, BCG, Bain, Deloitte, and more.

3.) Schedule a mock case interview appointment with Karen Dankers or Kathy Spillane, our advisors for the Finance, Consulting, Entrepreneurship, and Business Career Community.

4.) PRACTICE, PRACTICE, PRACTICE cases out loud on your own (yes, that can feel odd) or, preferably, with another person. See #2 and #3 above for resources and ideas for finding partners to practice live cases.

5.) Enjoy and have fun solving business problems!


  • Open access
  • Published: 05 May 2024

The learning curve in endoscopic transsphenoidal skull-base surgery: a systematic review

  • Abdulraheem Alomari 1,
  • Mazin Alsarraj 2 &
  • Sarah Alqarni 3

BMC Surgery, volume 24, Article number: 135 (2024)


The endoscopic endonasal transsphenoidal approach (EETA) has revolutionized skull-base surgery; however, it is associated with a steep learning curve (LC), necessitating additional attention from surgeons to ensure patient safety and surgical efficacy. The current literature is constrained by the small sample sizes of studies and their observational nature. This systematic review aims to evaluate the literature and identify strengths and weaknesses related to the assessment of EETA-LC.

A systematic review was conducted following the PRISMA guidelines. PubMed and Google Scholar were searched for clinical studies on EETA-LC using detailed search strategies, including pertinent keywords and Medical Subject Headings. The selection criteria included studies comparing the outcomes of skull-base surgeries involving pure EETA in the early and late stages of surgeons’ experience, studies that assessed the learning curve of at least one surgical parameter, and articles published in English.

The systematic review identified 34 studies, encompassing 5,648 patients, published between 2002 and 2022 and focusing on the EETA learning curve. Most studies were retrospective cohort designs (88%). Various patient assortment methods were noted, including group-based and case-based analyses. Statistical analyses included descriptive and comparative methods, along with regression analyses and curve modeling techniques. Pituitary adenoma (PA) was the most studied pathology (82%). Among the evaluated variables, improvements were observed in outcomes such as endocrinological cure (EC), operative time (OT), postoperative CSF leak, and gross total resection (GTR). Overcoming the initial EETA learning curve was associated with sustained outcome improvements, with a median estimated case requirement of 32 (range, 9 to 120 cases). These findings underscore the complexity of EETA-LC assessment and the importance of sustained outcome improvement as a marker of proficiency.

Conclusions

The review highlights the complexity of assessing the learning curve in EETA and underscores the need for standardized reporting and prospective studies to enhance the reliability of findings and guide clinical practice effectively.


With the advent of endoscopic techniques, skull-base surgery has significantly advanced. The modern history of neuro-endoscopy began in the early 1900s with an innovation by Lespinasse and Dandy, involving intraventricular endoscopy to coagulate the choroid plexus for treating communicating hydrocephalus [ 1 ]. In 1963, Guiot first reported an endoscopic approach via the transsphenoidal route as an adjunct to procedures performed under microscopy [ 2 , 3 ]. In 1992, Jankowski et al. described a purely endoscopic approach for pituitary adenoma resection [ 1 ].

The advantages of endoscopy have encouraged skull-base surgeons to adopt this technique, which provides a panoramic view of critical anatomical landmarks and improved access to the corners and deep surgical areas while inducing only minor trauma to the nasal structures, thereby enhancing postoperative patient comfort [ 4 ]. Compared with procedures involving microscopy, the endoscopic approach results in a shorter operating time (OT), a reduced hospitalization period, a lower rate of complications, and a higher endocrinological cure rate [ 5 , 6 ]. Despite these benefits, the endoscopic approach is hindered by a two-dimensional view, instrument interference, difficulties in achieving hemostasis, and a steep learning curve (LC) [ 4 ].

Since its inception, pioneers in the field have recognized the steep LC associated with the endoscopic technique [ 7 ]. The safety and efficacy of the endoscopic endonasal transsphenoidal approach (EETA), as an alternative to the gold-standard microscopic technique, have been established. However, the steep LC associated with the endoscopic approach may affect short-term outcomes post-procedure [ 5 , 6 ]. Additionally, as the skull-base endoscopic technique constantly evolves and expands, a thorough understanding of the associated LC is critical.

The results of existing publications on the EETA-LC are challenging to interpret due to small sample sizes, observational study designs, and a lack of standardization in assessment methodologies. This systematic review aims to elucidate the EETA-LC from the literature by addressing the following questions: How was the EETA-LC evaluated? Which sets of variables were used to assess the LC? What is the influence of the LC on the examined variables?

A systematic review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines [ 8 ]. The review was registered on PROSPERO (CRD42023494731). We searched PubMed and Google Scholar, without date restriction, for articles that assessed the learning curve of EETA. We used a tailored search string for each database, combining the following keywords and Medical Subject Headings: (Endoscopy OR endoscopic skull base OR endoscopic endonasal transsphenoidal approach) AND (Skull Base Neoplasms OR Pituitary OR pituitary adenoma) AND (Learning Curve OR endoscopic learning curve OR surgical learning curve).

First, two authors (AA, MA) independently screened the titles and abstracts of articles in the databases for learning curve analysis of EETA, either for a single surgeon or a team, by directly comparing outcomes between early and late cases performed. The full texts of the relevant articles were reviewed. When there was a disagreement, the articles were thoroughly discussed before their inclusion in the review. The bibliographies of the selected studies were also screened for relevant citations, which turned up studies that were already selected from the database search.

Studies were included according to the following inclusion criteria: 1) Comparison of outcomes between initial and advanced experiences with the endonasal endoscopic transsphenoidal approach to treat skull-base pathology, defined as "early experience" and "late experience," respectively; 2) Assessment of at least one parameter based on early and late experiences; 3) Randomized controlled trials, prospective cohort studies, retrospective cohort studies, case–control studies, and case series studies were included; and 4) English-language publications.

The study’s exclusion criteria included the following: 1) Studies not performing learning curve analysis; 2) Studies comparing the outcomes of microscopic and endoscopic transsphenoidal approaches without providing separate data for the endoscopic approach; 3) Studies comparing the learning curve between two EETA techniques, using simulated models or questionnaire-based analysis; 4) Studies comparing the microscopic vs. endoscopic approach without separate data available specifically for the endoscopic arm. Additionally, case reports, reviews, animal studies, technical notes, comments, and correspondence were excluded.

Data collection and analysis

The following data were extracted directly from the articles: 1) author names; 2) the year of publication; 3) Time interval of performed procedures; 4) study design; 5) the sample size; 6) techniques used for learning curve analysis (methods used to assort the patients for the analysis); (conducting statistical analysis vs. simple comparison of outcomes); 7) the sample size in each study arm when group splitting performed (early experience vs. late experience); 8) detailed information about surgeon experience at the time of LC assessment (including or omitting the first few EETA cases); 9) single vs. multiple pathologies; 10) team vs. single-surgeon experiences; 11) evaluated set of variables; 12) Variables that improved with experience; and 13) the number of cases required to overcome the initial LC or other methods to identify overcoming the learning curve.

Study quality assessment and risk of bias

Two reviewers conducted a quality assessment and evaluated the risk of bias in the included articles. We utilized the Newcastle–Ottawa Scale (NOS) [ 9 ] and the GRADE system [ 10 ].

Heterogeneity Analysis: Due to substantial heterogeneity observed among the included studies, which encompassed variations in study design, included pathologies, and outcome measures, a formal meta-analysis was not feasible. Therefore, we opted for a qualitative synthesis instead of a formal meta-analysis. Heterogeneity analysis and sensitivity analyses were not explicitly conducted.

Based on the inclusion and exclusion criteria, a total of 34 studies were identified (6 articles were excluded after reviewing the full texts), including 5,648 patients [ 7 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 ] (Fig. 1). The included studies were published between 2002 and 2022, and the evaluated procedures were performed between 1990 and 2018. The majority of the included articles were retrospective cohort studies (88%), with two being prospective studies and two articles presenting data from both prospective and retrospective study designs. Assessing a surgical learning curve involves various methods and techniques documented within the included articles. We observed various methods of patient assortment for conducting learning curve analyses across the literature, with group-based learning curve analysis used in a significant proportion of articles (68%). Within these studies, the rationale behind patient grouping was often unclear. Nonetheless, patients were categorized into equal groups, segmented based on arbitrary time periods, or separated based on improvements in outcomes observed retrospectively after data analysis. Eleven articles (32%) utilized case-based analysis, where individual surgical cases serve as distinct data points and their outcomes are monitored over time.

Figure 1. PRISMA flow diagram. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses. *The bibliographies of the selected studies were also screened for relevant citations, which turned up studies already included from the database search.

Our systematic review encompasses a wide range of statistical tests employed in the included studies to analyze various data types and address multifaceted research inquiries. The primary statistical methodologies utilized include descriptive statistical analysis, which covers metrics such as mean, median, frequency, and standard deviation, along with comparative statistical analysis, which includes techniques such as chi-square analysis, analysis of variance (ANOVA), and t-tests. Descriptive statistical analysis alone was evident in 10 articles (29%), whereas comparative statistical analysis was present in 24 articles (71%). Noteworthy examples include Leach et al. [ 16 ], who conducted analysis of variance (ANOVA) with post hoc Bonferroni tests for parametric data, chi-square or Mann–Whitney tests for nonparametric data, and regression analysis to explore the relationship between surgical duration and relevant factors. Smith et al. [ 17 ] undertook analyses using chi-square, Fisher exact, Student t-test, Mann–Whitney U test, and analysis of variance, aligning with their examination of categorical and continuous variables across distinct groups. Similarly, Sonnenburg et al. [ 12 ] applied a one-way ANOVA to discern variations between groups, highlighting the importance of understanding differences in means across categorical variables or treatment cohorts.

Regression analyses, scatterplots, McNemar tests, ROC curve analysis, and logistic regression models were integral across various studies, serving multiple purposes. Regression analyses, such as linear regression models, facilitated the exploration of intricate relationships among variables like age, tumor size, and surgical duration, identifying potential risk factors in surgical contexts [ 22 ]. Scatterplots visually depicted these relationships, offering intuitive insights into temporal variations, notably in the examination of surgery date versus duration [ 22 ]. McNemar tests were instrumental in evaluating changes in hormone levels, crucial for understanding postoperative outcomes and hormonal dynamics [ 37 ]. Additionally, ROC curve analysis provided a robust method for determining the level of surgical experience necessary to achieve gross total resection (GTR), offering actionable insights into surgical proficiency and patient outcomes [ 37 ]. Binary logistic regression models were utilized to identify prognostic factors contributing to the attainment of Gross Total Resection (GTR), hormonal recuperation, and visual restoration. For instance, variables such as surgical experience (≤ 100 vs. > 100 cases) were examined within this analytical framework [ 37 ].
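
As a concrete illustration of the group-based comparisons these studies report, the sketch below (Python with SciPy, using invented numbers) contrasts an early and a late cohort on a categorical outcome (postoperative CSF leak, chi-square test) and a continuous one (operative time, Mann–Whitney U test). It is a generic template, not a reproduction of any included study's analysis.

```python
import numpy as np
from scipy import stats

# Illustrative (invented) data for a group-based learning-curve comparison.

# 2x2 table: rows = early/late group, columns = (CSF leak, no leak).
leak_table = np.array([[9, 41],
                       [3, 47]])
chi2, p_leak, _, _ = stats.chi2_contingency(leak_table)

# Operative times in minutes for each group, compared with a Mann-Whitney U test.
ot_early = np.array([210, 195, 240, 188, 230, 205, 220, 199])
ot_late = np.array([160, 172, 150, 180, 165, 158, 170, 162])
u_stat, p_ot = stats.mannwhitneyu(ot_early, ot_late, alternative="two-sided")

print(f"CSF leak:       chi2 = {chi2:.2f}, p = {p_leak:.3f}")
print(f"Operative time: U = {u_stat:.1f}, p = {p_ot:.4f}")
```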

In our examination of the included articles, we noted a lack of thorough description regarding the experience of surgeons or surgical teams with the endoscopic endonasal transsphenoidal approach (EETA), the extent of the approach undertaken, and the level of involvement of individual surgeons or surgical teams during procedures. Thirteen articles (38%) reported including the initial cases of EETA, which may indicate a lack of prior experience with the approach. Additionally, seven articles (21%) detailed the experience of a single surgeon, while the majority (79%) evaluated team experiences. There was a wide range of pathologies included in all the studies. Twenty articles (59%) focused on a single pathology, while fourteen studies (41%) examined multiple pathologies. Pituitary adenoma (PA) was the most frequently reported pathology (82%), followed by craniopharyngioma (CP) (44%). Three studies assessed the learning curve of cerebrospinal fluid (CSF) leak repair following treatment of multiple pathologies. Descriptions of the surgical approach, particularly distinguishing between simple and extended techniques, were notably lacking across all articles. However, seventeen articles (50%) did mention pathologies that often require an extended approach, such as meningioma, chordoma, and CP. A number of studies have investigated the variations in tumor type and size among the examined groups, particularly between early and late groups. Notably, findings from studies such as [ 7 , 16 , 17 , 22 , 23 , 26 , 38 ] indicated that no statistical differences were observed between these groups. The characteristics of the included studies [ 7 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 ] are summarized in Table  1 .

The EETA-LC was evaluated based on a diverse set of variables. The most frequently analyzed variables were postoperative cerebrospinal fluid (CSF) leak in 28 articles (82%) [ 7 , 12 , 13 , 15 , 16 , 17 , 19 , 20 , 21 , 22 , 23 , 25 , 27 , 28 , 29 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 ], gross total resection (GTR) in 21 articles (62%) [ 7 , 13 , 14 , 16 , 19 , 21 , 22 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 36 , 37 , 38 , 39 , 40 ], post operative diabetes insipidus (DI) in 15 articles (44%) [ 12 , 13 , 16 , 17 , 19 , 21 , 22 , 29 , 30 , 31 , 32 , 34 , 36 , 37 , 41 ], operative time (OT) in 12 articles (35%) [ 7 , 13 , 14 , 16 , 17 , 22 , 29 , 32 , 34 , 35 , 36 , 38 ] and visual improvement in 12 articles (35%) [ 13 , 14 , 16 , 21 , 22 , 28 , 31 , 32 , 34 , 36 , 37 , 41 ]. (Fig.  2 ).

Figure 2. Frequency at which certain variables were evaluated in the literature to assess the EETA learning curve. EETA, endoscopic endonasal transsphenoidal approach; post-op, postoperative; CSF, cerebrospinal fluid; GTR, gross total resection; DI, diabetes insipidus; LOS, length of stay; IOP, intraoperative; ICA, internal carotid artery; SIADH, syndrome of inappropriate antidiuretic hormone secretion; LD, lumbar drain; CNS, central nervous system; CN, cranial nerve; EBL, estimated blood loss; DVT, deep vein thrombosis.

In all the studies included, improvements were observed between early and late-experience stages [ 7 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 ]. Among the evaluated variables, the following improvements were noted: the endocrinological cure rate (EC) showed improvement in all 7 articles out of 7 evaluated [ 13 , 16 , 18 , 21 , 24 , 30 , 33 ], operative time (OT) improved in 11 out of 12 articles (91%) [ 13 , 14 , 16 , 17 , 22 , 29 , 32 , 34 , 35 , 36 , 38 ], postoperative cerebrospinal fluid leak (CSF) improved in 23 out of 28 articles (82%) [ 12 , 15 , 17 , 19 , 20 , 22 , 23 , 25 , 27 , 28 , 29 , 31 , 32 , 33 , 34 , 35 , 37 , 38 , 39 , 40 , 41 , 42 , 43 ], visual improvement was observed in 9 out of 12 articles (75%) [ 13 , 14 , 16 , 22 , 28 , 31 , 34 , 37 , 41 ], gross total resection (GTR) improved in 14 out of 21 articles (67%) [ 7 , 13 , 14 , 19 , 21 , 22 , 26 , 27 , 28 , 29 , 30 , 38 , 39 , 40 ], hospital length of stay (LOS) decreased in five out of 10 studies (50%) [ 11 , 12 , 16 , 17 , 22 ], and postoperative diabetes insipidus (DI) decreased in 7 out of 15 articles (47%) [ 3 , 14 , 16 , 17 , 21 , 22 , 33 ] (Fig.  3 ).

Figure 3. Proportion of main variables improved with experience. EC, endocrinological cure; OT, operative time; post-op, postoperative; CSF, cerebrospinal fluid; GTR, gross total resection; hLOS, hospital length of stay; DI, diabetes insipidus.

Moreover, 12 articles (35%) reported both significant and non-significant improvements in outcomes [ 7 , 13 , 14 , 16 , 17 , 21 , 22 , 31 , 32 , 34 , 38 , 41 ]. In 10 studies (29%), solely a trend of improvement was observed [ 11 , 15 , 19 , 20 , 23 , 26 , 27 , 29 , 30 , 40 ], while 8 articles (23%) reported solely significant improvements [ 18 , 24 , 25 , 35 , 36 , 37 , 42 , 43 ]. However, in four studies, despite observing a tendency towards better outcomes, no statistical disparities were identified among all assessed variables [ 12 , 28 , 33 , 39 ]. None of the included studies reported a deterioration in any of the assessed outcomes over time, except for one study where a significant decline in GTR was observed in the late group [ 33 ]. This decline was attributed to the inclusion of more invasive and complex tumors in the late group. Nevertheless, Younus et al. documented ongoing improvement in GTR even after surpassing the initial learning curve [ 7 ].

In this systematic review, the primary technique employed to determine the transition point indicating the overcoming of the initial learning curve involved observing sustained and consistent improvement in outcomes over time. In almost half of the included articles, overcoming the initial learning curve (observing improvement of outcomes) was linked to the number of cases performed. Out of the 34 analyzed studies, 16 (47%) estimated the number of cases needed to overcome the initial learning curve of EETA. Reported cases ranged widely from 9 to 120, with a mode of 50. Considering both the median and the Interquartile Range (IQR) provides a comprehensive understanding of the reported case distribution and central tendency for overcoming the initial EETA learning curve. The median number of cases needed is 32, with an IQR of 20. These numbers are estimates and require careful interpretation [ 16 , 17 , 20 , 21 , 22 , 23 , 24 , 25 , 29 , 31 , 32 , 33 , 35 , 36 , 37 , 38 , 42 ].
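
For readers who want to see the computation, the short sketch below shows how a median and interquartile range would be derived from a set of per-study case counts; the list of numbers is invented for illustration and will not reproduce the review's exact values.

```python
import numpy as np

# Hypothetical per-study estimates of "cases needed to overcome the initial
# learning curve" (NOT the review's actual extracted data).
reported_cases = [9, 20, 25, 30, 32, 40, 50, 120]

median = np.median(reported_cases)
q1, q3 = np.percentile(reported_cases, [25, 75])
print(f"median = {median:.0f}, IQR = {q3 - q1:.0f} (Q1 = {q1:.0f}, Q3 = {q3:.0f})")
```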

Regarding the quality of the included studies, the NOS quality assessment scale was used: 21 studies were graded as fair quality, while the remaining 13 were rated as poor quality [ 9 ]. The risk of bias was evaluated according to the GRADE system. All included studies are observational cohort studies and were graded as either low or very low quality [ 10 ]. This reflects the considerable heterogeneity and high risk of bias arising from the study designs in the current EETA-LC literature.

Endoscopic techniques have drastically improved skull-base surgery. Unlike microscopic techniques, endoscopic techniques were often acquired by neurosurgeons later in their careers, and the level of exposure to these techniques during training has varied among surgeons. The LC is a critical factor in the acquisition of new surgical skills. Understanding the link between the EETA-LC and surgical outcomes will enable surgeons to better understand what to expect and what measures to apply as those surgical skills develop. Many studies in other surgical domains have reported on the LC during the acquisition of new surgical techniques [ 44 , 45 , 46 , 47 ]. Most minimally invasive surgeries are associated with a challenging LC, and EETA is no exception [ 7 , 46 ].

The concept of the LC was first established in the field of aircraft manufacturing and refers to an improvement in performance over time [ 48 ]. Smith et al. [ 17 ] have defined it as the number of procedures that must be performed for the outcomes to approach a long-term mean rate. Typically, an LC is characterized by an S-shaped curve with three stages: an early phase, during which new skill sets are acquired; a middle phase, in which the speed of learning rapidly increases; and an expert phase in which the performance reaches a plateau [ 49 ]. However, other curves have been proposed that involve a dip in the LC following the initial acceleration of the learning rate; this occurs especially with handling more challenging cases. Another potential decline may emerge after a long period of experience. Despite having reached a plateau in the learning curve after an extended period, declines in manual dexterity, eyesight, memory, and cognition may overshadow the benefits of accumulated experience, leading to diminished performance levels [ 50 ].
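
Because the LC concept traces back to Wright's power-law model of manufacturing cost [ 48 ], one simple way to quantify an operative-time learning curve is to fit such a curve to consecutive cases. The sketch below fits y = a·x^(−b) with SciPy's curve_fit on simulated data; the functional form and all numbers are illustrative assumptions, not values from the reviewed studies.

```python
import numpy as np
from scipy.optimize import curve_fit

# Wright-style power-law learning curve: performance (e.g. operative time)
# improves as the cumulative case number x grows:  y = a * x**(-b).
def power_curve(x, a, b):
    return a * np.power(x, -b)

# Illustrative operative times (minutes) over consecutive case numbers.
case_no = np.arange(1, 41)
rng = np.random.default_rng(0)
op_time = power_curve(case_no, 250, 0.15) + rng.normal(0, 8, case_no.size)

(a_hat, b_hat), _ = curve_fit(power_curve, case_no, op_time, p0=(250, 0.1))
print(f"fitted curve: OT ≈ {a_hat:.0f} * case**(-{b_hat:.2f})")
```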

The absence of consensus on the best applicable methods to describe and assess the learning curve may explain the diversity of analysis methods observed in this systematic review. In their large systematic review regarding learning curve assessment in healthcare technologies, Ramsay et al. [ 51 ] reported that group splitting was the most frequent method. They defined group splitting as dividing the data by experience levels and conducting testing on discrete groups, often halves or thirds. The statistical methods applied included t-tests, chi-squared tests, Mann–Whitney U tests, and simple ANOVA.

In our review, we reached a similar conclusion. We observed that a substantial portion of articles (68%) utilized group-based learning curve analysis [ 7 , 11 , 12 , 13 , 16 , 17 , 19 , 21 , 22 , 23 , 26 , 27 , 28 , 30 , 31 , 32 , 33 , 34 , 36 , 37 , 38 , 42 , 43 ]. Additionally, we similarly noted that papers frequently lacked explanations for the selection of cut points, raising concerns about potential bias resulting from data-dependent splitting. It is important to acknowledge that this method of group categorization has inherent drawbacks, including challenges related to small sample sizes, the use of arbitrary cutoff points, and the inability to eliminate all potential confounding variables [ 52 ].

Descriptive analysis was found in 10 articles (29%) within this review [ 11 , 15 , 19 , 20 , 23 , 26 , 27 , 29 , 30 , 40 ]. While providing an initial grasp of data distribution and characteristics, descriptive analysis may fall short in capturing the intricate dynamics of the learning curve over time or the factors affecting its impact [ 51 ]. Alternatively, conducting rigorous statistical analyses afterward offers better insight and interpretation of the results. This approach aims to mitigate the influence of confounding factors on outcome assessments over time [ 51 , 52 ].

In our review, 24 articles (71%) conducted a wide variety of statistical analyses [ 7 , 12 , 13 , 14 , 16 , 17 , 18 , 21 , 22 , 24 , 25 , 28 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 41 , 42 , 43 ], including but not limited to the following tests: the chi-square test, Fisher exact test, Student's t-test, analysis of variance (ANOVA), Mann–Whitney U test, McNemar tests, multivariate linear regression models, cumulative sum (CUSUM) analysis, and ROC curve analysis [ 13 , 16 , 22 , 32 , 37 , 38 , 39 ]. Four studies indicated that there was no statistically significant difference observed among the variables under evaluation. The lack of significance was attributed to several factors including small sample sizes, meticulous case selection, involvement of an otolaryngology team throughout the procedure, an increase in the number of invasive tumors in the late-experience study group, previous surgical experience, intensive training, level of supervision, and gradual inclusion of residents [ 12 , 28 , 33 , 39 ]. These efforts should be regarded as beneficial strategies aimed at reducing the steepness of the EETA learning curve.
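
Of the methods listed above, cumulative sum (CUSUM) charting is perhaps the most learning-curve-specific. A minimal sketch, assuming a binary complication (postoperative CSF leak) and an arbitrarily chosen acceptable rate of 10%, is shown below; a real application would add formal control limits.

```python
import numpy as np

# Simple CUSUM sketch for a binary complication (1 = postoperative CSF leak),
# charting cumulative observed failures minus an acceptable target rate.
# The outcome sequence and target rate are illustrative only.
outcomes = np.array([1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0])
target_rate = 0.10  # acceptable leak rate assumed for the chart

cusum = np.cumsum(outcomes - target_rate)
for i, value in enumerate(cusum, start=1):
    print(f"case {i:2d}: CUSUM = {value:+.2f}")
# A persistently falling curve suggests performance better than the target;
# a rising curve flags a run of complications worth reviewing.
```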

To obtain more accurate results, it is crucial to eliminate confounding factors, such as the level of supervision, prior experience, the heterogeneity of cases being treated, and their complexity when evaluating the LC. Thus, it is essential to incorporate multivariate logistic regression analysis to mitigate the impact of these potential confounding factors [ 51 ]. Chi et al. [ 22 ] divided their patients into equal groups of 40 cases each. They then compared potential confounding variables to minimize their influence on learning curve assessment. This comparison includes demographic and clinical factors between the two groups, such as sex distribution, mean age, tumor size (microadenomas vs. macroadenomas), visual field defects, and tumor types (non-functioning, functioning adenomas, etc.). By conducting these comparisons, the researchers sought to identify discrepancies in demographic and clinical features between the groups.

The description of a surgeon’s prior experience is crucial for accurately quantifying the learning curve, a point reported to be neglected in many learning curve assessments of healthcare procedures [ 49 ]. In our review, we observed the same shortcoming in all included studies. However, the inclusion of the initial first few cases was mentioned in 13 articles (38%), which might serve as a surrogate for no prior experience with EETA. Furthermore, five articles did not include the initial few cases. Among these, four studies examined the learning curve of more complex cases such as meningioma, craniopharyngioma, and growth hormone pituitary adenoma, employing an extended approach. Conversely, Younus et al. [ 7 ] deliberately excluded these cases to assess various stages of the learning curve.

Assessing multiple pathologies with varying complexities could significantly impact learning curve assessments. In our review, 59% of articles focused on a single pathology, while 41% explored multiple pathologies. Pituitary adenoma (PA) was the most evaluated (82%), followed by craniopharyngioma (CP) (44%). Controlling confounding variables like tumor type and size may yield more reliable results. Some studies used statistical analyses to compare early and late cases, while others relied on descriptive analyses. Shou et al. noted a drop in GTR over time due to late involvement of complex cases [ 33 ]. Conversely, studies analyzing tumor size and type found GTR improvement with experience [ 7 , 23 ]. Thorough multivariable analysis of confounding factors is crucial for representative LC analysis.

The LC is often assessed based on two main categories of variables: those related to the surgical procedure (OT, estimated blood loss, and extent of resection) and those related to patient outcomes (duration of hospitalization, the incidence of complications, and the mortality rate) [ 50 ]. In this systematic review, OT was one of the most frequent parameters that significantly reduced as one gained experience. Although OT is commonly utilized as an outcome measure, it is only a surrogate means of evaluating the LC and may not always accurately represent patient outcomes [ 52 ]. Another point to consider is the lack of standardized variables for assessing the LC, and the included studies evaluated more than 45 distinct variables. Khan et al. highlighted the importance of using consistent variable definitions across studies to derive accurate conclusions from aggregated LC data [ 52 ].

A dynamic relationship exists between surgical outcomes and the LC, and each phase of the LC influences a distinct set of variables differently. One study, which included data from 1,000 EETA cases after purposely eliminating the first 200 cases, showed that variables such as GTR and the endocrinological cure rate continued to improve after the first 200 cases, whereas other parameters remained unchanged. Authors concluded that some variables will continue to improve after passing the initial LC phase [ 7 ]. Determining the precise number of cases needed to surpass the initial learning curve (LC) has proven challenging. Shikary et al. observed a notable decrease in postoperative CSF leaks after 100 surgeries, while a reduction in operative time was evident after 120 cases [ 35 ]. However, specifying a definitive number to overcome the learning curve of the Endoscopic Endonasal Transsphenoidal Approach (EETA) remains challenging due to individual variability, diverse pathologies, and evolving surgical techniques.

Assessing the learning curve of the Endoscopic Endonasal Transsphenoidal Approach (EETA-LC) faces notable challenges due to its intricate techniques and the wide array of pathologies it addresses. The diversity across specialties makes standardizing studies difficult. To understand the dynamic learning process in EETA-LC, influenced by individual surgeon skill, patient nuances, and procedural complexities, longitudinal studies and advanced analytical methods are essential. Moreover, the complexity of statistical analysis adds another layer of challenge, highlighting the necessity for interdisciplinary collaboration and innovative methodologies.

To address the current limitations in the literature regarding the EETA LC, we propose several key strategies for future studies. Firstly, we advocate for multicenter collaboration, coupled with standardized processes, to comprehensively assess the EETA LC. This collaborative approach will facilitate the aggregation of data from diverse surgical settings, enhancing the generalizability of findings and minimizing bias. Furthermore, rigorous documentation of the previous and current experience of involved surgeons is paramount. We suggest categorizing surgeons based on their levels of experience to accurately elucidate the impact of proficiency on surgical outcomes. Secondly, given the wide variety of complexities of skull base pathologies encountered, we recommend further categorization of cases based on their levels of complexity. This stratification will enable a more nuanced analysis of the learning curve across different levels of surgical challenge. Thirdly, standardization of outcome measures used to assess the learning curve is imperative, with specific definitions provided for each outcome. This ensures consistency and comparability across studies, facilitating meaningful interpretation of results. Finally, conducting prospective study designs with sufficient follow-up periods, along with rigorous multivariate statistical analyses among these categorized groups, is essential to mitigate the influence of confounding variables and strengthen the validity of findings. Implementing these strategies will help future studies to overcome the current limitations in the literature, leading to a deeper understanding of the EETA learning curve and ultimately improving patient outcomes.

This systematic review identified 34 studies that reported a relationship between improvements in surgical outcomes and a surgeon’s level of experience with EETA. There is notable heterogeneity in the current literature on the EETA-LC regarding the techniques used to assess the LC, the variables assessed, the types of pathology included, and insufficient reporting of the surgeon's or team's current and previous experience with EETA. The main variables that improved with experience were EC, postoperative CSF leak, OT, GTR, visual improvement, and hospital LOS. Future studies with multicenter collaboration and standardized processes for assessing the EETA-LC will enhance generalizability and minimize bias. Rigorous documentation of surgeons' experience levels, categorization of cases by complexity, and standardized outcome measures are essential. Additionally, rigorous statistical analyses will strengthen validity and mitigate confounding variables. Implementing these strategies will deepen our understanding of the EETA learning curve, ultimately leading to improved patient outcomes.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

EETA: Endoscopic endonasal transsphenoidal approach

LC: Learning curve

CSF: Cerebrospinal fluid

DI: Diabetes insipidus

GTR: Gross total resection

Cappabianca P, de Divitiis E. Endoscopy and transsphenoidal surgery. Neurosurgery. 2004;54:1043–50.


Gandhi CD, Post KD. Historical movements in transsphenoidal surgery. Neurosurg Focus. 2001;11:1–4.


Gandhi CD, Christiano LD, Eloy JA, Prestigiacomo CJ, Post KD. The historical evolution of transsphenoidal surgery: facilitation by technological advances. Neurosurg Focus. 2009;27:E8.

de Divitiis E. Endoscopic transsphenoidal surgery: stone-in-the-pond effect. Neurosurgery. 2006;59:512–20.

Rotenberg B, Tam S, Ryu WHA, Duggal N. Microscopic versus endoscopic pituitary surgery: a systematic review. Laryngoscope. 2010;120:1292–7.

Tabaee A, Anand VK, Barrón Y, Hiltzik DH, Brown SM, Kacker A, et al. Endoscopic pituitary surgery: a systematic review and meta-analysis. J Neurosurg. 2009;111:545–54.

Younus I, Gerges MM, Uribe-Cardenas R, Morgenstern PF, Eljalby M, Tabaee A, et al. How long is the tail end of the learning curve? Results from 1000 consecutive endoscopic endonasal skull base cases following the initial 200 cases. J Neurosurg. 2020;134:750–60.

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Int J Surg. 2021;88:105906.

Wells G, Shea B, O'Connell D, Peterson J, Welch V, Losos M, Tugwell P. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses. 2013. Retrieved from http://www.ohri.ca/programs/clinical_epidemiology/oxford.asp .

Balshem H, Helfand M, Schünemann HJ, Oxman AD, Kunz R, Brozek J, Vist GE, Falck-Ytter Y, Meerpohl J, Norris S, Guyatt GH. GRADE guidelines: 3. Rating the quality of evidence. J Clin Epidemiol. 2011;64(4):401–6. https://doi.org/10.1016/j.jclinepi.2010.07.015 .

Cappabianca P, Cavallo L, Colao A, Del Basso De Caro M, Esposito F, Cirillo S, et al. Endoscopic endonasal transsphenoidal approach: outcome analysis of 100 consecutive procedures. Minim Invasive Neurosurg. 2002;45:193–200.


Sonnenburg RE, White D, Ewend MG, Senior B. The learning curve in minimally invasive pituitary surgery. Am J Rhinol. 2004;18:259–63.

Kenan K, İhsan A, Dilek O, Burak C, Gurkan K, Savas C. The learning curve in endoscopic pituitary surgery and our experience. Neurosurg Rev. 2006;29:298–305.

Yano S, Kawano T, Kudo M, Makino K, Nakamura H, Kai Y, et al. Endoscopic endonasal transsphenoidal approach through the bilateral nostrils for pituitary adenomas. Neurol Med Chir (Tokyo). 2009;49:1–7.

Gondim JA, Schops M, de Almeida JPC, de Albuquerque LAF, Gomes E, Ferraz T, et al. Endoscopic endonasal transsphenoidal surgery: surgical results of 228 pituitary adenomas treated in a pituitary center. Pituitary. 2010;13:68–77.

Leach P, Abou-Zeid AH, Kearney T, Davis J, Trainer PJ, Gnanalingham KK. Endoscopic transsphenoidal pituitary surgery: evidence of an operative learning curve. Neurosurgery. 2010;67:1205–12.

Smith SJ, Eralil G, Woon K, Sama A, Dow G, Robertson I. Light at the end of the tunnel: the learning curve associated with endoscopic transsphenoidal skull base surgery. Skull Base. 2010;20:69–74.


Wagenmakers MAE, Netea-Maier RT, van Lindert EJ, Pieters GF, Grotenhuis AJ, Hermus AR. Results of endoscopic transsphenoidal pituitary surgery in 40 patients with a growth hormone-secreting macroadenoma. Acta Neurochir (Wien). 2011;153:1391–9.

Kumar S, Darr A, Hobbs C, Carlin W. Endoscopic, endonasal, trans-sphenoidal hypophysectomy: retrospective analysis of 171 procedures. J Laryngol Otol. 2012;126:1033–40.

Snyderman CH, Pant H, Kassam AB, Carrau RL, Prevedello DM, Gardner PA. The learning curve for endonasal surgery of the cranial base: A systematic approach to training. In: Kassam AB, Gardner PA, editors. Endoscopic approaches to the skull base. Ettlingen: Karger Publishers; 2012. p. 222–31.


Bokhari AR, Davies MA, Diamond T. Endoscopic transsphenoidal pituitary surgery: a single surgeon experience and the learning curve. Br J Neurosurg. 2013;27:449.

Chi F, Wang Y, Lin Y, Ge J, Qiu Y, Guo L. A learning curve of endoscopic transsphenoidal surgery for pituitary adenoma. J Craniofac Surg. 2013;24:2064–7.

de los Santos G, Fragola C, Del Castillo R, Rodríguez V, D’oleo C, Reyes P. Endoscopic approaches to pituitary lesions: difficulties and challenges. Acta Otorrinolaringol Esp. 2013;64(258):64.


Hazer DB, Işık S, Berker D, Güler S, Gürlek A, Yücel T, et al. Treatment of acromegaly by endoscopic transsphenoidal surgery: surgical experience in 214 cases and cure rates according to current consensus criteria. J Neurosurg. 2013;119:1467–77.

Jakimovski D, Bonci G, Attia M, Shao H, Hofstetter C, Tsiouris AJ, et al. Incidence and significance of intraoperative cerebrospinal fluid leak in endoscopic pituitary surgery using intrathecal fluorescein. World Neurosurg. 2014;82:e513–23.

Koutourousiou M, Fernandez-Miranda JC, Wang EW, Snyderman CH, Gardner PA. Endoscopic endonasal surgery for olfactory groove meningiomas: outcomes and limitations in 50 patients. Neurosurg Focus. 2014;37:E8.

Mascarenhas L, Moshel YA, Bayad F, Szentirmai O, Salek AA, Leng LZ, et al. The transplanum transtuberculum approaches for suprasellar and sellar-suprasellar lesions: avoidance of cerebrospinal fluid leak and lessons learned. World Neurosurg. 2014;82:186–95.

Ottenhausen M, Banu MA, Placantonakis DG, Tsiouris AJ, Khan OH, Anand VK, et al. Endoscopic endonasal resection of suprasellar meningiomas: the importance of case selection and experience in determining extent of resection, visual improvement, and complications. World Neurosurg. 2014;82:442–9.

Ananth G, Hosmath AV, Varadaraju DN, Patil SR, Usman MM, Patil RP, et al. Learning curve in endoscopic transnasal sellar region surgery. J Evid Based Med Healthc. 2016;3:3166–72.

Jang JH, Kim KH, Lee YM, Kim JS, Kim YZ. Surgical results of pure endoscopic endonasal transsphenoidal surgery for 331 pituitary adenomas: a 15-year experience from a single institution. World Neurosurg. 2016;96:545–55.

Kshettry VR, Do H, Elshazly K, Farrell CJ, Nyquist G, Rosen M, et al. The learning curve in endoscopic endonasal resection of craniopharyngiomas. Neurosurg Focus. 2016;41:E9.

Qureshi T, Chaus F, Fogg L, Dasgupta M, Straus D, Byrne RW. Learning curve for the transsphenoidal endoscopic endonasal approach to pituitary tumors. Br J Neurosurg. 2016;30:637–42.

Shou X, Shen M, Zhang Q, Zhang Y, He W, Ma Z, et al. Endoscopic endonasal pituitary adenomas surgery: the surgical experience of 178 consecutive patients and learning curve of two neurosurgeons. BMC Neurol. 2016;16:1–8.

Ding H, Gu Y, Zhang X, Xie T, Liu T, Hu F, et al. Learning curve for the endoscopic endonasal approach for suprasellar craniopharyngiomas. J Clin Neurosci. 2017;42:209–16.

Shikary T, Andaluz N, Meinzen-Derr J, Edwards C, Theodosopoulos P, Zimmer LA. Operative learning curve after transition to endoscopic transsphenoidal pituitary surgery. World Neurosurg. 2017;102:608–12.

Eseonu CI, ReFaey K, Pamias-Portalatin E, Asensio J, Garcia O, Boahene KD, et al. Three-hand endoscopic endonasal transsphenoidal surgery: experience with an anatomy-preserving mononostril approach technique. Oper Neurosurg (Hagerstown). 2018;14:158–65.

Kim JH, Lee JH, Lee JH, Hong AR, Kim YJ, Kim YH. Endoscopic transsphenoidal surgery outcomes in 331 nonfunctioning pituitary adenoma cases after a single surgeon learning curve. World Neurosurg. 2018;109:e409–16.

Lofrese G, Vigo V, Rigante M, Grieco DL, Maresca M, Anile C, et al. Learning curve of endoscopic pituitary surgery: experience of a neurosurgery/ENT collaboration. J Clin Neurosci. 2018;47:299–303.

Robins JM, Alavi SA, Tyagi AK, Nix PA, Wilson TM, Phillips NI. The learning curve for endoscopic trans-sphenoidal resection of pituitary macroadenomas. A single institution experience, Leeds, UK. Acta Neurochir (Wien). 2018;160:39–47.

Algattas H, Setty P, Goldschmidt E, Wang EW, Tyler-Kabara EC, Snyderman CH, et al. Endoscopic endonasal approach for craniopharyngiomas with intraventricular extension: case series, long-term outcomes, and review. World Neurosurg. 2020;144:e447–59.

Soliman MA, Eaton S, Quint E, Alkhamees AF, Shahab S, O’Connor A, et al. Challenges, learning curve, and safety of endoscopic endonasal surgery of sellar-suprasellar lesions in a community hospital. World Neurosurg. 2020;138:e940–54.

Nix P, Alavi SA, Tyagi A, Phillips N. Endoscopic repair of the anterior skull base-is there a learning curve? Br J Neurosurg. 2018;32:407–11.

Park W, Nam D-H, Kong D-S, Lee KE, Park SI, Kim HY, et al. Learning curve and technical nuances of endoscopic skull base reconstruction with nasoseptal flap to control high-flow cerebrospinal fluid leakage: reconstruction after endoscopic skull base surgery other than pituitary surgery. Eur Arch Otorhinolaryngol. 2022;279:1335–40.

Lubowitz JH, Sahasrabudhe A, Appleby D. Minimally invasive surgery in total knee arthroplasty: the learning curve. Orthopedics. 2007;30:80.


Hoppe DJ, Simunovic N, Bhandari M, Safran MR, Larson CM, Ayeni OR. The learning curve for hip arthroscopy: a systematic review. Arthroscopy. 2014;30:389–97.

Sclafani JA, Kim CW. Complications associated with the initial learning curve of minimally invasive spine surgery: a systematic review. Clin Orthop Relat Res. 2014;472:1711–7.

Pernar LI, Robertson FC, Tavakkoli A, Sheu EG, Brooks DC, Smink DS. An appraisal of the learning curve in robotic general surgery. Surg Endosc. 2017;31:4583–96.

Wright TP. Factors affecting the cost of airplanes. J Aeronaut Sci. 1936;3:122–8.

Cook JA, Ramsay CR, Fayers P. Using the literature to quantify the learning curve: a case study. Int J Technol Assess Health Care. 2007;23:255–60.

Hopper A, Jamison M, Lewis W. Learning curves in surgical practice. Postgrad Med J. 2007;83:777–9.


Ramsay CR, Grant AM, Wallace SA, Garthwaite PH, Monk AF, Russell IT. Assessment of the learning curve in health technologies: a systematic review. Int J Technol Assess Health Care. 2000;16:1095–108.

Khan N, Abboudi H, Khan MS, Dasgupta P, Ahmed K. Measuring the surgical ‘learning curve’: methods, variables and competency. BJU Int. 2014;113:504–8.

Download references

Acknowledgements

Not applicable.

Funding

No funding was received for this study.

Author information

Authors and affiliations.

Neurosurgery Department, East Jeddah Hospital, 2277 King Abdullah Rd, Al Sulaymaniyah, 22253, Jeddah, Saudi Arabia

Abdulraheem Alomari

Otolaryngology and Head and Neck Surgery Department, King Abdullah Medical Complex, Prince Nayef Street, Northern Abhor, 23816, Jeddah, Saudi Arabia

Mazin Alsarraj

Neurosurgery Department, King Abdulaziz Medical City, 21423, Jeddah, Saudi Arabia

Sarah Alqarni


Contributions

AA: Acquisition of Data, Analysis and Interpretation of Data, Drafting the Article, Critically Revising the Article, Reviewed submitted version of manuscript, Approved the final version of the manuscript on behalf of all authors, Study supervision; MA: Conception and Design, Analysis and Interpretation of Data, Reviewed submitted version of manuscript; SA: Acquisition of Data, Conception and Design, Critically Revising the Article, Analysis and Interpretation of Data, Administrative/technical/material support.

Corresponding author

Correspondence to Abdulraheem Alomari .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Alomari, A., Alsarraj, M. & Alqarni, S. The learning curve in endoscopic transsphenoidal skull-base surgery: a systematic review. BMC Surg 24 , 135 (2024). https://doi.org/10.1186/s12893-024-02418-y

Received: 21 September 2023 | Accepted: 20 April 2024 | Published: 05 May 2024


Keywords: Endoscopic skull base; Transsphenoidal surgery



An efficient computational approach for basic feasible solution of fuzzy transportation problems

  • ORIGINAL ARTICLE
  • Published: 06 May 2024


  • Anshika Agrawal 1 &
  • Neha Singhal   ORCID: orcid.org/0000-0002-7043-6024 1  


In this paper, an improved algorithm is proposed for solving fully fuzzy transportation problems. The algorithm finds a starting basic feasible solution to a transportation problem whose parameters are given in fuzzy form. It is an amalgamation of two existing approaches and can be applied to a balanced fuzzy transportation problem in which uncertainties are represented by trapezoidal fuzzy numbers. Instead of transforming these uncertainties into crisp values, the algorithm directly handles the fuzzy nature of the problem. To illustrate its effectiveness, the article presents several numerical examples in which parameter uncertainties are characterized using trapezoidal fuzzy numbers, and a comparative analysis is performed between the algorithm’s outcomes and existing results. A case study is also discussed to underline the significance of the algorithm.
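
The abstract does not spell out the proposed algorithm, so the sketch below only illustrates the generic ingredients such methods build on: trapezoidal fuzzy numbers, a simple mean-of-parameters ranking, and a least-cost rule for an initial basic feasible solution on a small balanced problem (with crisp supplies and demands for brevity). All names and figures are illustrative; this is not the authors' method.

```python
from dataclasses import dataclass

@dataclass
class TrapFN:
    """Trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def rank(self) -> float:
        # Mean-of-parameters ranking, a common generic defuzzification choice.
        return (self.a + self.b + self.c + self.d) / 4.0

def least_cost_bfs(costs, supply, demand):
    """Initial basic feasible solution via the least-cost rule, ordering the
    fuzzy unit costs by their ranking values (supplies/demands kept crisp here)."""
    supply, demand = supply[:], demand[:]
    alloc = [[0] * len(demand) for _ in supply]
    cells = sorted((costs[i][j].rank(), i, j)
                   for i in range(len(supply)) for j in range(len(demand)))
    for _, i, j in cells:
        qty = min(supply[i], demand[j])
        alloc[i][j] = qty
        supply[i] -= qty
        demand[j] -= qty
    return alloc

# Illustrative 2x3 balanced problem with made-up trapezoidal fuzzy unit costs.
costs = [[TrapFN(1, 2, 3, 4), TrapFN(2, 4, 6, 8), TrapFN(3, 5, 7, 9)],
         [TrapFN(2, 3, 4, 5), TrapFN(1, 1, 2, 2), TrapFN(4, 6, 8, 10)]]
print(least_cost_bfs(costs, supply=[30, 40], demand=[20, 25, 25]))
# -> [[20, 0, 10], [0, 25, 15]]
```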




Acknowledgements

Not applicable.

Funding

No funding received.

Author information

Authors and affiliations

Department of Mathematics, Jaypee Institute of Information Technology, Noida, Uttar Pradesh, 201309, India

Anshika Agrawal & Neha Singhal


Contributions

Both authors discussed and contributed to the paper. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to Neha Singhal.

Ethics declarations

Ethics approval and consent to participate, consent for publication, conflict of interest

The authors declare that they have no competing interests.

Availability of data and material



About this article

Agrawal, A., Singhal, N. An efficient computational approach for basic feasible solution of fuzzy transportation problems. Int J Syst Assur Eng Manag (2024). https://doi.org/10.1007/s13198-024-02340-9


Received: 29 May 2023

Revised: 29 February 2024

Accepted: 07 April 2024

Published: 06 May 2024



  • Fuzzy transportation problem
  • Trapezoidal fuzzy number
  • Basic feasible solution


