Purdue Online Writing Lab Purdue OWL® College of Liberal Arts

Descriptive Statistics

This page is brought to you by the OWL at Purdue University. When printing this page, you must include the entire legal notice.

Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.

The mean, the mode, the median, the range, and the standard deviation are all examples of descriptive statistics. Descriptive statistics are used because in most cases, it isn't possible to present all of your data in any form that your reader will be able to quickly interpret.

Generally, when writing descriptive statistics, you want to present at least one form of central tendency (or average), that is, either the mean, median, or mode. In addition, you should present one form of variability, usually the standard deviation.

Measures of Central Tendency and Other Commonly Used Descriptive Statistics

The mean, median, and the mode are all measures of central tendency. They attempt to describe what the typical data point might look like. In essence, they are all different forms of 'the average.' When writing statistics, you never want to say 'average' because it is difficult, if not impossible, for your reader to understand if you are referring to the mean, the median, or the mode.

The mean is the most common form of central tendency, and is what most people are usually referring to when they say 'average.' It is simply the sum of all the numbers in a data set, divided by the total number of data points. For example, the following data set has a mean of 4: {-1, 0, 1, 16}. That is, the sum of the values (16) divided by the number of data points (4) is 4. If there isn't a good reason to use one of the other forms of central tendency, then you should use the mean to describe the central tendency.
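The calculation described above can be written out in a few lines. This is only a minimal sketch using the data set from the paragraph; the variable names are illustrative.

```python
# Data set taken from the paragraph above.
data = [-1, 0, 1, 16]

# Mean = sum of all values divided by the number of data points.
mean = sum(data) / len(data)
print(mean)  # 4.0
```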

The median is simply the middle value of a data set. In order to calculate the median, all values in the data set need to be ordered, from either highest to lowest, or vice versa. If there is an odd number of values in a data set, then the median is easy to calculate: it is the single middle value. If there is an even number of values in a data set, then the calculation becomes more difficult. Statisticians still debate how to properly calculate a median when there is an even number of values, but for most purposes, it is appropriate to simply take the mean of the two middle values. The median is useful when describing data sets that are skewed or have extreme values. Incomes of baseball players, for example, are commonly reported using a median because a small minority of baseball players makes a lot of money, while most players make more modest amounts. The median is less influenced by extreme scores than the mean.
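As a rough sketch of the procedure just described, the function below orders the values and then handles the odd and even cases separately, using the "mean of the two middle values" convention for an even count. The function name and the odd-length data set are made up for illustration.

```python
def median(values):
    """Middle value of the ordered data; for an even count,
    the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_set = [3, 1, 7]          # hypothetical values
even_set = [-1, 0, 1, 16]    # the data set used for the mean example

print(median(odd_set))   # 3
print(median(even_set))  # 0.5
```

Note that the extreme value 16 pulls the mean of the second data set up to 4, while the median stays at 0.5.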

The mode is the most commonly occurring number in the data set. The mode is best used when you want to indicate the most common response or item in a data set. For example, if you wanted to predict the score of the next football game, you may want to know what the most common score is for the visiting team, but having an average score of 15.3 won't help you if it is impossible to score 15.3 points. Likewise, a median score may not be very informative either, if you are interested in what score is most likely.
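A small sketch can make the contrast with the mean concrete. The scores below are invented for illustration; the code reports every value that ties for the highest count, since a data set can have more than one mode.

```python
from collections import Counter

# Hypothetical scores for a visiting team (not real data).
scores = [14, 21, 14, 7, 28, 14, 20]

counts = Counter(scores)
highest = max(counts.values())
# Keep every score that occurs the maximum number of times.
modes = sorted(value for value, count in counts.items() if count == highest)

mean_score = sum(scores) / len(scores)

print(modes)                 # [14]
print(round(mean_score, 1))  # 16.9
```

Here the mode (14) is a score that can actually occur, whereas the mean (about 16.9) is not.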

Standard Deviation

The standard deviation is a measure of variability (it is not a measure of central tendency). Conceptually it is best viewed as the 'average distance that individual data points are from the mean.' Data sets that are highly clustered around the mean have lower standard deviations than data sets that are spread out.

For example, the first data set would have a higher standard deviation than the second data set:

Notice that both groups have the same mean (5) and median (also 5), but the two groups contain different numbers and are organized much differently. This organization of a data set is often referred to as a distribution. Because the two data sets above have the same mean and median, but different standard deviations, we know that they also have different distributions. Understanding the distribution of a data set helps us understand how the data behave.
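The same point can be sketched with two made-up data sets that share a mean and median of 5 but differ in spread. These numbers are assumptions chosen for illustration, not the data sets from the original example. Python's `statistics.stdev` computes the sample standard deviation (dividing by n - 1).

```python
from statistics import mean, median, stdev

# Hypothetical data sets; both are centred on 5.
spread_out = [1, 3, 5, 7, 9]
clustered = [4, 5, 5, 5, 6]

for name, data in (("spread out", spread_out), ("clustered", clustered)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(name, mean(data), median(data), round(stdev(data), 2))
```

The spread-out set has a standard deviation of roughly 3.16, the clustered set roughly 0.71, even though their means and medians are identical.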


Descriptive Statistics for Summarising Data

Ray W. Cooksey

UNE Business School, University of New England, Armidale, NSW Australia

This chapter discusses and illustrates descriptive statistics. The purpose of the procedures and fundamental concepts reviewed in this chapter is quite straightforward: to facilitate the description and summarisation of data. By ‘describe’ we generally mean either the use of some pictorial or graphical representation of the data (e.g. a histogram, box plot, radar plot, stem-and-leaf display, icon plot or line graph) or the computation of an index or number designed to summarise a specific characteristic of a variable or measurement (e.g. frequency counts, measures of central tendency, variability, standard scores). Along the way, we explore the fundamental concepts of probability and the normal distribution. We seldom interpret individual data points or observations primarily because it is too difficult for the human brain to extract or identify the essential nature, patterns, or trends evident in the data, particularly if the sample is large. Rather we utilise procedures and measures which provide a general depiction of how the data are behaving. These statistical procedures are designed to identify or display specific patterns or trends in the data. What remains after their application is simply for us to interpret and tell the story.

The first broad category of statistics we discuss concerns descriptive statistics. The purpose of the procedures and fundamental concepts in this category is quite straightforward: to facilitate the description and summarisation of data. By ‘describe’ we generally mean either the use of some pictorial or graphical representation of the data or the computation of an index or number designed to summarise a specific characteristic of a variable or measurement.

We seldom interpret individual data points or observations primarily because it is too difficult for the human brain to extract or identify the essential nature, patterns, or trends evident in the data, particularly if the sample is large. Rather we utilise procedures and measures which provide a general depiction of how the data are behaving. These statistical procedures are designed to identify or display specific patterns or trends in the data. What remains after their application is simply for us to interpret and tell the story.

Reflect on the QCI research scenario and the associated data set discussed in Chap. 4. Consider the following questions that Maree might wish to address with respect to decision accuracy and speed scores:

  • What was the typical level of accuracy and decision speed for inspectors in the sample? [see Procedure 5.4 – Assessing central tendency.]
  • What was the most common accuracy and speed score amongst the inspectors? [see Procedure 5.4 – Assessing central tendency.]
  • What was the range of accuracy and speed scores; the lowest and the highest scores? [see Procedure 5.5 – Assessing variability.]
  • How frequently were different levels of inspection accuracy and speed observed? What was the shape of the distribution of inspection accuracy and speed scores? [see Procedure 5.1 – Frequency tabulation, distributions & crosstabulation.]
  • What percentage of inspectors would have ‘failed’ to ‘make the cut’ assuming the industry standard for acceptable inspection accuracy and speed combined was set at 95%? [see Procedure 5.7 – Standard (z) scores.]
  • How variable were the inspectors in their accuracy and speed scores? Were all the accuracy and speed levels relatively close to each other in magnitude or were the scores widely spread out over the range of possible test outcomes? [see Procedure 5.5 – Assessing variability.]
  • What patterns might be visually detected when looking at various QCI variables singly and together as a set? [see Procedure 5.2 – Graphical methods for displaying data, Procedure 5.3 – Multivariate graphs & displays, and Procedure 5.6 – Exploratory data analysis.]

This chapter includes discussions and illustrations of a number of procedures available for answering questions about data like those posed above. In addition, you will find discussions of two fundamental concepts, namely probability and the normal distribution; these concepts provide building blocks for Chaps. 6 and 7.

Procedure 5.1: Frequency Tabulation, Distributions & Crosstabulation

Frequency tabulation and distributions.

Frequency tabulation serves to provide a convenient counting summary for a set of data that facilitates interpretation of various aspects of those data. Basically, frequency tabulation occurs in two stages:

  • First, the scores in a set of data are rank ordered from the lowest value to the highest value.
  • Second, the number of times each specific score occurs in the sample is counted. This count records the frequency of occurrence for that specific data value.

Consider the overall job satisfaction variable, jobsat, from the QCI data scenario. Performing frequency tabulation across the 112 Quality Control Inspectors on this variable using the SPSS Frequencies procedure (Allen et al. 2019, ch. 3; George and Mallery 2019, ch. 6) produces the frequency tabulation shown in Table 5.1. Note that three of the inspectors in the sample did not provide a rating for jobsat, thereby producing three missing values (= 2.7% of the sample of 112) and leaving 109 inspectors with valid data for the analysis.

Table 5.1 Frequency tabulation of overall job satisfaction scores

The display of frequency tabulation is often referred to as the frequency distribution for the sample of scores. For each value of a variable, the frequency of its occurrence in the sample of data is reported. It is possible to compute various percentages and percentile values from a frequency distribution.

Table 5.1 shows the ‘Percent’ or relative frequency of each score (the percentage of the 112 inspectors obtaining each score, including those inspectors who were missing scores, which SPSS labels as ‘System’ missing). Table 5.1 also shows the ‘Valid Percent’ which is computed only for those inspectors in the sample who gave a valid or non-missing response.

Finally, it is possible to add up the ‘Valid Percent’ values, starting at the low score end of the distribution, to form the cumulative distribution or ‘Cumulative Percent’. A cumulative distribution is useful for finding percentiles which reflect what percentage of the sample scored at a specific value or below.

We can see in Table 5.1 that 4 of the 109 valid inspectors (a ‘Valid Percent’ of 3.7%) indicated the lowest possible level of job satisfaction, a value of 1 (Very Low), whereas 18 of the 109 valid inspectors (a ‘Valid Percent’ of 16.5%) indicated the highest possible level of job satisfaction, a value of 7 (Very High). The ‘Cumulative Percent’ number of 18.3 in the row for the job satisfaction score of 3 can be interpreted as “roughly 18% of the sample of inspectors reported a job satisfaction score of 3 or less”; that is, nearly a fifth of the sample expressed some degree of negative satisfaction with their job as a quality control inspector in their particular company.
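The two stages of frequency tabulation, together with the ‘Percent’, ‘Valid Percent’ and ‘Cumulative Percent’ columns, can be sketched in a few lines of Python. This is an illustrative sketch only: the ratings are hypothetical (they are not the QCI data), `None` stands in for a missing response, and the column layout merely imitates the kind of output described above.

```python
from collections import Counter

# Hypothetical 1-7 ratings; None marks a missing response.
ratings = [1, 3, 4, 4, 5, 5, 5, 6, 7, None]

valid = [r for r in ratings if r is not None]
counts = Counter(valid)

rows = []
cumulative = 0.0
for score in sorted(counts):                 # stage 1: rank order the scores
    freq = counts[score]                     # stage 2: count each score
    percent = 100 * freq / len(ratings)      # base: whole sample, missing included
    valid_percent = 100 * freq / len(valid)  # base: valid responses only
    cumulative += valid_percent              # running total of valid percent
    rows.append((score, freq, percent, valid_percent, cumulative))

print("score  freq  percent  valid%  cum%")
for score, freq, pct, vpct, cum in rows:
    print(f"{score:>5}  {freq:>4}  {pct:7.1f}  {vpct:6.1f}  {cum:5.1f}")
```

The cumulative column reaches 100 at the highest score, and each entry can be read as "the percentage of valid respondents scoring at this value or below".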

If you have a large data set having many different scores for a particular variable, it may be more useful to tabulate frequencies on the basis of intervals of scores.

For the accuracy scores in the QCI database, you could count scores occurring in intervals such as ‘less than 75% accuracy’, ‘between 75% but less than 85% accuracy’, ‘between 85% but less than 95% accuracy’, and ‘95% accuracy or greater’, rather than counting the individual scores themselves. This would yield what is termed a ‘grouped’ frequency distribution since the data have been grouped into intervals or score classes. Producing such an analysis using SPSS would involve extra steps to create the new category or ‘grouping’ system for scores prior to conducting the frequency tabulation.
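A grouped frequency distribution of this kind might be sketched as follows. The interval boundaries follow the paragraph above; the accuracy scores and the function name are hypothetical.

```python
from collections import Counter

def accuracy_band(score):
    """Assign an accuracy percentage to one of the four intervals in the text."""
    if score < 75:
        return "less than 75%"
    if score < 85:
        return "75% to less than 85%"
    if score < 95:
        return "85% to less than 95%"
    return "95% or greater"

# Hypothetical accuracy scores (not the QCI data).
accuracy = [62, 74, 75, 80, 84.9, 85, 90, 95, 99]

# Count interval labels rather than the individual scores themselves.
grouped = Counter(accuracy_band(score) for score in accuracy)
for band, freq in grouped.items():
    print(f"{band}: {freq}")
```

Changing the boundaries changes the resulting counts, which is why the choice of interval matters for how the distribution appears.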

Crosstabulation

In a frequency crosstabulation , we count frequencies on the basis of two variables simultaneously rather than one; thus we have a bivariate situation.

For example, Maree might be interested in the number of male and female inspectors in the sample of 112 who obtained each jobsat score. Here there are two variables to consider: inspector’s gender and inspector’s jobsat score. Table 5.2 shows such a crosstabulation as compiled by the SPSS Crosstabs procedure (George and Mallery 2019, ch. 8). Note that inspectors who did not report a score for jobsat and/or gender have been omitted as missing values, leaving 106 valid inspectors for the analysis.

Table 5.2 Frequency crosstabulation of jobsat scores by gender category for the QCI data

The crosstabulation shown in Table 5.2 gives a composite picture of the distribution of satisfaction levels for male inspectors and for female inspectors. If frequencies or ‘Counts’ are added across the gender categories, we obtain the numbers in the ‘Total’ column (the percentages or relative frequencies are also shown immediately below each count) for each discrete value of jobsat (note this column of statistics differs from that in Table 5.1 because the gender variable was missing for certain inspectors). By adding down each gender column, we obtain, in the bottom row labelled ‘Total’, the number of males and the number of females that comprised the sample of 106 valid inspectors.

The totals, either across the rows or down the columns of the crosstabulation, are termed the marginal distributions of the table. These marginal distributions are equivalent to frequency tabulations for each of the variables jobsat and gender. As with frequency tabulation, various percentage measures can be computed in a crosstabulation, including the percentage of the sample associated with a specific count within either a row (‘% within jobsat’) or a column (‘% within gender’). You can see in Table 5.2 that 18 inspectors indicated a job satisfaction level of 7 (Very High); of these 18 inspectors reported in the ‘Total’ column, 8 (44.4%) were male and 10 (55.6%) were female. The marginal distribution for gender in the ‘Total’ row shows that 57 inspectors (53.8% of the 106 valid inspectors) were male and 49 inspectors (46.2%) were female. Of the 57 male inspectors in the sample, 8 (14.0%) indicated a job satisfaction level of 7 (Very High). Furthermore, we could generate some additional interpretive information of value by adding the ‘% within gender’ values for job satisfaction levels of 5, 6 and 7 (i.e. differing degrees of positive job satisfaction). Here we would find that 68.4% (= 24.6% + 29.8% + 14.0%) of male inspectors indicated some degree of positive job satisfaction compared to 61.2% (= 10.2% + 30.6% + 20.4%) of female inspectors.
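The row and column percentages quoted above can be reproduced from the counts given in the text. The sketch below uses only those quoted counts (57 males, 49 females, and the 8 males and 10 females at satisfaction level 7); the variable and function names are illustrative.

```python
# Counts quoted in the text for Table 5.2.
male_total, female_total = 57, 49
male_level7, female_level7 = 8, 10

valid_n = male_total + female_total          # 106 valid inspectors
level7_total = male_level7 + female_level7   # 18 inspectors at level 7

def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)

# '% within jobsat': the base is the row total for level 7.
print(pct(male_level7, level7_total), pct(female_level7, level7_total))  # 44.4 55.6

# Marginal distribution for gender: the base is all valid inspectors.
print(pct(male_total, valid_n), pct(female_total, valid_n))              # 53.8 46.2

# '% within gender': the base is the column total for each gender.
print(pct(male_level7, male_total), pct(female_level7, female_total))    # 14.0 20.4
```

The same count (for example, the 8 males at level 7) yields different percentages depending on whether the row total or the column total is used as the base.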

This helps to build a picture of the possible relationship between an inspector’s gender and their level of job satisfaction (a relationship that, as we will see later, can be quantified and tested using procedures presented in Chaps. 6 and 7).

It should be noted that a crosstabulation table such as that shown in Table 5.2 is often referred to as a contingency table, about which more will be said later (see Chap. 7).

Advantages

Frequency tabulation is useful for providing convenient data summaries which can aid in interpreting trends in a sample, particularly where the number of discrete values for a variable is relatively small. A cumulative percent distribution provides additional interpretive information about the relative positioning of specific scores within the overall distribution for the sample.

Crosstabulation permits the simultaneous examination of the distributions of values for two variables obtained from the same sample of observations. This examination can yield some useful information about the possible relationship between the two variables. More complex crosstabulations can also be done where the values of three or more variables are tracked in a single systematic summary. The use of frequency tabulation or crosstabulation in conjunction with various other statistical measures, such as measures of central tendency (see Procedure 5.4) and measures of variability (see Procedure 5.5), can provide a relatively complete descriptive summary of any data set.

Disadvantages

Frequency tabulations can get messy if interval or ratio-level measures are tabulated simply because of the large number of possible data values. Grouped frequency distributions really should be used in such cases. However, certain choices, such as the size of the score interval (group size), must be made, often arbitrarily, and such choices can affect the nature of the final frequency distribution.

Additionally, percentage measures have certain problems associated with them, most notably, the potential for their misinterpretation in small samples. One should be sure to know the sample size on which percentage measures are based in order to obtain an interpretive reference point for the actual percentage values.

For example, in a sample of 10 individuals, 20% represents only two individuals whereas in a sample of 300 individuals, 20% represents 60 individuals. If all that is reported is the 20%, then the mental inference drawn by readers is likely to be that a sizeable number of individuals had a score or scores of a particular value, but what is ‘sizeable’ depends upon the total number of observations on which the percentage is based.
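The arithmetic behind this caution is simple enough to sketch directly. The helper name below is hypothetical; the percentages and sample sizes are those used in the example.

```python
def count_from_percent(percent, sample_size):
    """Convert a reported percentage back into a head count."""
    return round(sample_size * percent / 100)

for n in (10, 300):
    print(f"20% of {n} = {count_from_percent(20, n)} individuals")
```

Reporting the count alongside the percentage (for instance "20%, n = 2 of 10") gives readers the reference point they need.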

Where Is This Procedure Useful?

Frequency tabulation and crosstabulation are very commonly applied procedures used to summarise information from questionnaires, both in terms of tabulating various demographic characteristics (e.g. gender, age, education level, occupation) and in terms of actual responses to questions (e.g. numbers responding ‘yes’ or ‘no’ to a particular question). They can be particularly useful in helping to build up the data screening and demographic stories discussed in Chap. 4. Categorical data from observational studies can also be analysed with this technique (e.g. the number of times Suzy talks to Frank, to Billy, and to John in a study of children’s social interactions).

Certain types of experimental research designs may also be amenable to analysis by crosstabulation with a view to drawing inferences about distribution differences across the sets of categories for the two variables being tracked.

You could employ crosstabulation in conjunction with the tests described in Chap. 7 to see if two different styles of advertising campaign differentially affect the product purchasing patterns of male and female consumers.

In the QCI database, Maree could employ crosstabulation to help her answer the question “do different types of electronic manufacturing firms (company) differ in terms of their tendency to employ male versus female quality control inspectors (gender)?”

Software Procedures

Application: Procedures
SPSS: […] or […] and select the variable(s) you wish to analyse; for the […] procedure, hitting the ‘[…]’ button will allow you to choose various types of statistics and percentages to show in each cell of the table.
NCSS: […] or […] and select the variable(s) you wish to analyse.
SYSTAT: […] or […] ➔ […] and select the variable(s) you wish to analyse and choose the optional statistics you wish to see.
STATGRAPHICS: […] or […] and select the variable(s) you wish to analyse; hit ‘[…]’ and when the ‘Tables and Graphs’ window opens, choose the Tables and Graphs you wish to see.
R Commander: […] or […] and select the variable(s) you wish to analyse and choose the optional statistics you wish to see.

Procedure 5.2: Graphical Methods for Displaying Data

Graphical methods for displaying data include bar and pie charts, histograms and frequency polygons, line graphs and scatterplots. It is important to note that what is presented here is a small but representative sampling of the types of simple graphs one can produce to summarise and display trends in data. Generally speaking, SPSS offers the easiest facility for producing and editing graphs, but with a rather limited range of styles and types. SYSTAT, STATGRAPHICS and NCSS offer a much wider range of graphs (including graphs unique to each package), but with the drawback that it takes somewhat more effort to get the graphs in exactly the form you want.

Bar and Pie Charts

These two types of graphs are useful for summarising the frequency of occurrence of various values (or ranges of values) where the data are categorical (nominal or ordinal level of measurement).

  • A bar chart uses vertical and horizontal axes to summarise the data. The vertical axis is used to represent frequency (number) of occurrence or the relative frequency (percentage) of occurrence; the horizontal axis is used to indicate the data categories of interest.
  • A pie chart gives a simpler visual representation of category frequencies by cutting a circular plot into wedges or slices whose sizes are proportional to the relative frequency (percentage) of occurrence of specific data categories. Some pie charts can have one or more slices emphasised by ‘exploding’ them out from the rest of the pie.

Consider the company variable from the QCI database. This variable depicts the types of manufacturing firms that the quality control inspectors worked for. Figure 5.1 illustrates a bar chart summarising the percentage of female inspectors in the sample coming from each type of firm. Figure 5.2 shows a pie chart representation of the same data, with an ‘exploded slice’ highlighting the percentage of female inspectors in the sample who worked for large business computer manufacturers – the lowest percentage of the five types of companies. Both graphs were produced using SPSS.

Fig. 5.1 Bar chart: Percentage of female inspectors

Fig. 5.2 Pie chart: Percentage of female inspectors

The pie chart was modified with an option to show the actual percentage along with the label for each category. The bar chart shows that computer manufacturing firms have relatively fewer female inspectors compared to the automotive and electrical appliance (large and small) firms. This trend is less clear from the pie chart which suggests that pie charts may be less visually interpretable when the data categories occur with rather similar frequencies. However, the ‘exploded slice’ option can help interpretation in some circumstances.

Certain software programs, such as SPSS, STATGRAPHICS, NCSS and Microsoft Excel, offer the option of generating 3-dimensional bar charts and pie charts and incorporating other ‘bells and whistles’ that can potentially add visual richness to the graphic representation of the data. However, you should generally be careful with these fancier options as they can produce distortions and create ambiguities in interpretation (e.g. see discussions in Jacoby 1997; Smithson 2000; Wilkinson 2009). Such distortions and ambiguities could ultimately end up providing misinformation to researchers as well as to those who read their research.

Histograms and Frequency Polygons

These two types of graphs are useful for summarising the frequency of occurrence of various values (or ranges of values) where the data are essentially continuous (interval or ratio level of measurement) in nature. Both histograms and frequency polygons use vertical and horizontal axes to summarise the data. The vertical axis is used to represent the frequency (number) of occurrence or the relative frequency (percentage) of occurrences; the horizontal axis is used for the data values or ranges of values of interest. The histogram uses bars of varying heights to depict frequency; the frequency polygon uses lines and points.

There is a visual difference between a histogram and a bar chart: the bar chart uses bars that do not physically touch, signifying the discrete and categorical nature of the data, whereas the bars in a histogram physically touch to signal the potentially continuous nature of the data.

Suppose Maree wanted to graphically summarise the distribution of speed scores for the 112 inspectors in the QCI database. Figure 5.3 (produced using NCSS) illustrates a histogram representation of this variable. Figure 5.3 also illustrates another representational device called the ‘density plot’ (the solid tracing line overlaying the histogram) which gives a smoothed impression of the overall shape of the distribution of speed scores. Figure 5.4 (produced using STATGRAPHICS) illustrates the frequency polygon representation for the same data.

Fig. 5.3 Histogram of the speed variable (with density plot overlaid)

Fig. 5.4 Frequency polygon plot of the speed variable

These graphs employ a grouped format where speed scores which fall within specific intervals are counted as being essentially the same score. The shape of the data distribution is reflected in these plots. Each graph tells us that the inspection speed scores are positively skewed with only a few inspectors taking very long times to make their inspection judgments and the majority of inspectors taking rather shorter amounts of time to make their decisions.

Both representations tell a similar story; the choice between them is largely a matter of personal preference. However, if the number of bars to be plotted in a histogram is potentially very large (and this is usually directly controllable in most statistical software packages), then a frequency polygon would be the preferred representation simply because the amount of visual clutter in the graph will be much reduced.

It is somewhat of an art to choose an appropriate definition for the width of the score grouping intervals (or ‘bins’ as they are often termed) to be used in the plot: choose too many and the plot may look too lumpy and the overall distributional trend may not be obvious; choose too few and the plot will be too coarse to give a useful depiction. Programs like SPSS, SYSTAT, STATGRAPHICS and NCSS are designed to choose an ‘appropriate’ number of bins to be used, but the analyst’s eye is often a better judge than any statistical rule that a software package would use.
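The effect of the number of bins can be explored with a small sketch. The function below is an assumption made for illustration (equal-width bins spanning the minimum to the maximum value); it is not the rule used by any of the packages named above, and the speed values are made up.

```python
def bin_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values).
    Assumes the values are not all identical (the bin width must be non-zero)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical, positively skewed decision times in seconds.
speeds = [1.0, 1.2, 1.5, 1.8, 2.0, 2.1, 2.4, 3.0, 3.5, 4.2, 6.0, 9.0]

for n_bins in (2, 4, 8):
    print(n_bins, bin_counts(speeds, n_bins))
```

With two bins the long right tail is barely visible, while eight bins begin to show empty intervals; judging which picture is most useful is the 'art' referred to above.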

There are several interesting variations of the histogram which can highlight key data features or facilitate interpretation of certain trends in the data. One such variation is a graph called a dual histogram (available in SYSTAT; a variation called a ‘comparative histogram’ can be created in NCSS), which facilitates visual comparison of the frequency distributions for a specific variable for participants from two distinct groups.

Suppose Maree wanted to graphically compare the distributions of speed scores for inspectors in the two categories of education level (educlev) in the QCI database. Figure 5.5 shows a dual histogram (produced using SYSTAT) that accomplishes this goal. This graph still employs the grouped format where speed scores falling within particular intervals are counted as being essentially the same score. The shape of the data distribution within each group is also clearly reflected in this plot. However, the story conveyed by the dual histogram is that, while the inspection speed scores are positively skewed for inspectors in both categories of educlev, the comparison suggests that inspectors with a high school level of education (= 1) tend to take slightly longer to make their inspection decisions than do their colleagues who have a tertiary qualification (= 2).

Fig. 5.5 Dual histogram of speed for the two categories of educlev

Line Graphs

The line graph is similar in style to the frequency polygon but is much more general in its potential for summarising data. In a line graph, we seldom deal with percentage or frequency data. Instead we can summarise other types of information about data such as averages or means (see Procedure 5.4 for a discussion of this measure), often for different groups of participants. Thus, one important use of the line graph is to break down scores on a specific variable according to membership in the categories of a second variable.
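The breakdown that a line graph of this kind displays, the mean of one variable within each category of a second variable, can be sketched as below. The records and category labels are hypothetical and are not taken from the QCI database.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (category, score) records.
records = [
    ("PC", 88), ("PC", 92),
    ("Automotive", 78), ("Automotive", 82), ("Automotive", 80),
]

# Collect the scores belonging to each category of the second variable.
by_category = defaultdict(list)
for category, score in records:
    by_category[category].append(score)

# One mean per category: these are the points a line graph would connect.
group_means = {category: mean(scores) for category, scores in by_category.items()}
print(group_means)
```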

In the context of the QCI database, Maree might wish to summarise the average inspection accuracy scores for the inspectors from different types of manufacturing companies. Figure 5.6 was produced using SPSS and shows such a line graph.

Fig. 5.6 Line graph comparison of companies in terms of average inspection accuracy

Note how the trend in performance across the different companies becomes clearer with such a visual representation. It appears that the inspectors from the Large Business Computer and PC manufacturing companies have better average inspection accuracy compared to the inspectors from the remaining three industries.

With many software packages, it is possible to further elaborate a line graph by including error or confidence interval bars (see Chap. 8). These give some indication of the precision with which the average level for each category in the population has been estimated (narrow bars signal a more precise estimate; wide bars signal a less precise estimate).

Figure 5.7 shows such an elaborated line graph, using 95% confidence interval bars, which can be used to help make more defensible judgments (compared to Fig. 5.6) about whether the companies are substantively different from each other in average inspection performance. Companies whose confidence interval bars do not overlap each other can be inferred to be substantively different in performance characteristics.

Fig. 5.7 Line graph using confidence interval bars to compare accuracy across companies

The accuracy confidence interval bars for participants from the Large Business Computer manufacturing firms do not overlap those from the Large or Small Electrical Appliance manufacturers or the Automobile manufacturers.

We might conclude that quality control inspection accuracy is substantially better in the Large Business Computer manufacturing companies than in these other industries but is not substantially better than the PC manufacturing companies. We might also conclude that inspection accuracy in PC manufacturing companies is not substantially different from Small Electrical Appliance manufacturers.
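The overlap judgment described above might be sketched as follows. This is a rough illustration under stated assumptions: the interval uses a normal critical value (about 1.96 for 95%), whereas statistical packages typically use a t-based value that gives somewhat wider bars in small samples; the groups, scores and function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(values, level=0.95):
    """Approximate confidence interval for the mean.
    Assumption: a normal critical value is used instead of a t value."""
    z = NormalDist().inv_cdf(0.5 + level / 2)             # about 1.96 for 95%
    half_width = z * stdev(values) / len(values) ** 0.5   # z times standard error
    centre = mean(values)
    return centre - half_width, centre + half_width

def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical accuracy scores for three groups of inspectors.
group_a = [90, 92, 94, 91, 93]
group_b = [80, 82, 84, 81, 83]
group_c = [89, 91, 93, 90, 92]

ci_a, ci_b, ci_c = (confidence_interval(g) for g in (group_a, group_b, group_c))
print(intervals_overlap(ci_a, ci_b))  # False
print(intervals_overlap(ci_a, ci_c))  # True
```

Under the rule of thumb in the text, groups A and B would be judged substantively different, while A and C would not.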

Scatterplots

Scatterplots are useful in displaying the relationship between two interval- or ratio-scaled variables or measures of interest obtained on the same individuals, particularly in correlational research (see Fundamental Concept 10.1007/978-981-15-2537-7_6#Sec1 and Procedure 10.1007/978-981-15-2537-7_6#Sec4).

In a scatterplot, one variable is chosen to be represented on the horizontal axis; the second variable is represented on the vertical axis. In this type of plot, all data point pairs in the sample are graphed. The shape and tilt of the cloud of points in a scatterplot provide visual information about the strength and direction of the relationship between the two variables. A very compact elliptical cloud of points signals a strong relationship; a very loose or nearly circular cloud signals a weak or non-existent relationship. A cloud of points generally tilted upward toward the right side of the graph signals a positive relationship (higher scores on one variable associated with higher scores on the other and vice-versa). A cloud of points generally tilted downward toward the right side of the graph signals a negative relationship (higher scores on one variable associated with lower scores on the other and vice-versa).
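The link between the look of the cloud and the relationship it signals can be illustrated numerically. The sketch below computes a Pearson correlation for three invented sets of points: a compact upward-tilted cloud, a loose upward-tilted cloud and a perfectly straight downward-tilted line. The data and the helper name `pearson_r` are assumptions made for illustration; the correlation coefficient itself is covered elsewhere in the book.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6, 7, 8]
tight_up = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1, 14.0, 15.9]  # compact cloud, tilted upward
loose_up = [5, 1, 7, 3, 9, 4, 10, 6]                     # loose cloud, tilted upward
tight_down = [16, 14, 12, 10, 8, 6, 4, 2]                # straight line, tilted downward

for label, y in [("tight up", tight_up), ("loose up", loose_up), ("tight down", tight_down)]:
    print(label, round(pearson_r(x, y), 2))
```

The sign of the coefficient follows the tilt of the cloud, and its size follows how compact the cloud is.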

Maree might be interested in displaying the relationship between inspection accuracy and inspection speed in the QCI database. Figure 5.8, produced using SPSS, shows what such a scatterplot might look like. Several characteristics of the data for these two variables can be noted in Fig. 5.8. The shape of the distribution of data points is evident. The plot has a fan-shaped characteristic to it which indicates that accuracy scores are highly variable (exhibit a very wide range of possible scores) at very fast inspection speeds but get much less variable and tend to be somewhat higher as inspection speed increases (where inspectors take longer to make their quality control decisions). Thus, there does appear to be some relationship between inspection accuracy and inspection speed (a weak positive relationship, since the cloud of points tends to be very loose but tilted generally upward toward the right side of the graph – slower speeds tend to be slightly associated with higher accuracy).

Scatterplot relating inspection accuracy to inspection speed

However, it is not the case that the inspection decisions which take longest to make are necessarily the most accurate (see the labelled points for inspectors 7 and 62 in Fig. 5.8 ). Thus, Fig. 5.8 does not show a simple relationship that can be unambiguously summarised by a statement like “the longer an inspector takes to make a quality control decision, the more accurate that decision is likely to be”. The story is more complicated.

Some software packages, such as SPSS, STATGRAPHICS and SYSTAT, offer the option of using different plotting symbols or markers to represent the members of different groups so that the relationship between the two focal variables (the ones anchoring the X and Y axes) can be clarified with reference to a third categorical measure.

Maree might want to see if the relationship depicted in Fig. 5.8 changes depending upon whether the inspector was tertiary-qualified or not (this information is represented in the educlev variable of the QCI database).

Figure 5.9 shows what such a modified scatterplot might look like; the legend in the upper corner of the figure defines the marker symbols for each category of the educlev variable. Note that for both High School only-educated inspectors and Tertiary-qualified inspectors, the general fan-shaped relationship between accuracy and speed is the same. However, it appears that the distribution of points for the High School only-educated inspectors is shifted somewhat upward and toward the right of the plot suggesting that these inspectors tend to be somewhat more accurate as well as slower in their decision processes.
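A numerical counterpart to the visual impression of one group's cloud being "shifted upward and toward the right" is to compare the centre of each group's points. The sketch below does this for a handful of invented records; the values, the group labels and the helper name `centroid` are all hypothetical and are not taken from the QCI database.

```python
from statistics import mean

# Hypothetical (speed in seconds, accuracy in %, educlev) records, not QCI data
records = [
    (2.0, 70, "Tertiary"), (3.0, 85, "Tertiary"), (4.0, 80, "Tertiary"),
    (3.5, 78, "High School"), (5.0, 88, "High School"), (6.5, 86, "High School"),
]

def centroid(rows, group):
    """Mean speed and mean accuracy for one educlev group."""
    points = [(speed, acc) for speed, acc, g in rows if g == group]
    return (mean(s for s, _ in points), mean(a for _, a in points))

tertiary = centroid(records, "Tertiary")
high_school = centroid(records, "High School")

# A group whose centroid has larger values on both axes sits up and to the right
print("Tertiary centroid:", tertiary)
print("High School centroid:", high_school)
```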

Scatterplot displaying accuracy vs speed conditional on educlev group

There are many other styles of graphs available, often dependent upon the specific statistical package you are using. Interestingly, NCSS and, particularly, SYSTAT and STATGRAPHICS, appear to offer the most variety in terms of types of graphs available for visually representing data. A reading of the user’s manuals for these programs (see the Useful additional readings) would expose you to the great diversity of plotting techniques available to researchers. Many of these techniques go by rather interesting names such as: Chernoff’s faces, radar plots, sunflower plots, violin plots, star plots, Fourier blobs, and dot plots.

These graphical methods provide summary techniques for visually presenting certain characteristics of a set of data. Visual representations are generally easier to understand than a tabular representation and when these plots are combined with available numerical statistics, they can give a very complete picture of a sample of data. Newer methods have become available which permit more complex representations to be depicted, opening possibilities for creatively visually representing more aspects and features of the data (leading to a style of visual data storytelling called infographics ; see, for example, McCandless 2014 ; Toseland and Toseland 2012 ). Many of these newer methods can display data patterns from multiple variables in the same graph (several of these newer graphical methods are illustrated and discussed in Procedure 5.3 ).

Graphs tend to be cumbersome and space consuming if a great many variables need to be summarised. In such cases, using numerical summary statistics (such as means or correlations) in tabular form alone will provide a more economical and efficient summary. Also, it can be very easy to give a misleading picture of data trends using graphical methods by simply choosing the ‘correct’ scaling for maximum effect or choosing a display option (such as a 3-D effect) that ‘looks’ presentable but which actually obscures a clear interpretation (see Smithson 2000 ; Wilkinson 2009 ).

Thus, you must be careful in creating and interpreting visual representations so that aesthetic choices made for the sake of appearance do not become more important than obtaining a faithful and valid representation of the data, a very real danger with many of today’s statistical packages where ‘default’ drawing options have been pre-programmed in. No single plot can completely summarise all possible characteristics of a sample of data. Thus, choosing a specific method of graphical display may, of necessity, force a behavioural researcher to represent certain data characteristics (such as frequency) at the expense of others (such as averages).

Virtually any research design which produces quantitative data and statistics (even to the extent of just counting the number of occurrences of several events) provides opportunities for graphical data display which may help to clarify or illustrate important data characteristics or relationships. Remember, graphical displays are communication tools just like numbers—which tool to choose depends upon the message to be conveyed. Visual representations of data are generally more useful in communicating to lay persons who are unfamiliar with statistics. Care must be taken though as these same lay people are precisely the people most likely to misinterpret a graph if it has been incorrectly drawn or scaled.

Application: Procedures
SPSS: Choose from a range of gallery chart types; drag the chart type into the working area and customise the chart with desired variables, labels, etc. Many elements of a chart, including error bars, can be controlled.
NCSS: Whichever type of chart you choose, you can control many features of the chart from the dialog box that pops open upon selection.
STATGRAPHICS: Whichever type of chart you choose, you can control a number of features of the chart from the series of dialog boxes that pops open upon selection.
SYSTAT: The available choices include one that offers a range of other more novel graphical displays, including the dual histogram. For each choice, a dialog box opens which allows you to control almost every characteristic of the graph you want.
R Commander: For most graph types, R Commander offers minimal control over the appearance of the graph (you need to use full commands to control more aspects; e.g. see Chang).

Procedure 5.3: Multivariate Graphs & Displays

Graphical methods for displaying multivariate data (i.e. many variables at once) include scatterplot matrices, radar (or spider) plots, multiplots, parallel coordinate displays, and icon plots. Multivariate graphs are useful for visualising broad trends and patterns across many variables (Cleveland 1995 ; Jacoby 1998 ). Such graphs typically sacrifice precision in representation in favour of a snapshot pictorial summary that can help you form general impressions of data patterns.

It is important to note that what is presented here is a small but reasonably representative sampling of the types of graphs one can produce to summarise and display trends in multivariate data. Generally speaking, SYSTAT offers the best facilities for producing multivariate graphs, followed by STATGRAPHICS, but with the drawback that it is somewhat tricky to get the graphs in exactly the form you want. SYSTAT also has excellent facilities for creating new forms and combinations of graphs – essentially allowing graphs to be tailor-made for a specific communication purpose. Both SPSS and NCSS offer a more limited range of multivariate graphs, generally restricted to scatterplot matrices and variations of multiplots. Microsoft Excel or STATGRAPHICS are the packages to use if radar or spider plots are desired.

Scatterplot Matrices

A scatterplot matrix is a useful multivariate graph designed to show relationships between pairs of many variables in the same display.

Figure 5.10 illustrates a scatterplot matrix, produced using SYSTAT, for the mentabil, accuracy, speed, jobsat and workcond variables in the QCI database. It is easy to see that all the scatterplot matrix does is stack all pairs of scatterplots into a format where it is easy to pick out the graph for any ‘row’ variable that intersects a ‘column’ variable.

Scatterplot matrix relating mentabil , accuracy , speed , jobsat & workcond

In those plots where a ‘row’ variable intersects itself in a column of the matrix (along the so-called ‘diagonal’), SYSTAT permits a range of univariate displays to be shown. Figure 5.10 shows univariate histograms for each variable (recall Procedure 5.2). One obvious drawback of the scatterplot matrix is that, if many variables are to be displayed (say ten or more), the graph gets very crowded and becomes very hard to visually appreciate.

Looking at the first column of graphs in Fig. 5.10 , we can see the scatterplot relationships between mentabil and each of the other variables. We can get a visual impression that mentabil seems to be slightly negatively related to accuracy (the cloud of scatter points tends to angle downward to the right, suggesting, very slightly, that higher mentabil scores are associated with lower levels of accuracy ).

Conversely, the visual impression of the relationship between mentabil and speed is that the relationship is slightly positive (higher mentabil scores tend to be associated with higher speed scores = longer inspection times). Similar types of visual impressions can be formed for other parts of Fig. 5.10 . Notice that the histogram plots along the diagonal give a clear impression of the shape of the distribution for each variable.
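The way a scatterplot matrix is assembled, and why it becomes crowded so quickly, can be sketched as follows. The code only builds a text description of each cell; it draws nothing. The helper name `matrix_layout` is hypothetical, and labelling each off-diagonal cell as "row variable vs column variable" is an assumption about axis placement that individual packages may handle differently.

```python
from itertools import combinations

variables = ["mentabil", "accuracy", "speed", "jobsat", "workcond"]

def matrix_layout(names):
    """Return a k x k grid describing what each cell of a scatterplot matrix shows."""
    grid = []
    for row_var in names:
        row = []
        for col_var in names:
            if row_var == col_var:
                # Diagonal cell: a variable meets itself, so show a univariate display
                row.append(f"histogram of {row_var}")
            else:
                row.append(f"{row_var} vs {col_var}")
        grid.append(row)
    return grid

layout = matrix_layout(variables)
total_cells = len(layout) * len(layout[0])
distinct_pairs = list(combinations(variables, 2))

print(total_cells)          # cells in the full grid
print(len(distinct_pairs))  # distinct variable pairs (each appears twice, mirrored)
print(layout[1][0])         # second row, first column
```

With five variables the grid has 25 cells covering 10 distinct pairs; with ten variables it would have 100 cells covering 45 pairs, which is why the display becomes hard to read.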

Radar Plots

The radar plot (also known as a spider graph for obvious reasons) is a simple and effective device for displaying scores on many variables. Microsoft Excel offers a range of options and capabilities for producing radar plots, such as the plot shown in Fig. 5.11 . Radar plots are generally easy to interpret and provide a good visual basis for comparing plots from different individuals or groups, even if a fairly large number of variables (say, up to about 25) are being displayed. Like a clock face, variables are evenly spaced around the centre of the plot in clockwise order starting at the 12 o’clock position. Visual interpretation of a radar plot primarily relies on shape comparisons, i.e. the rise and fall of peaks and valleys along the spokes around the plot. Valleys near the centre display low scores on specific variables, peaks near the outside of the plot display high scores on specific variables. [Note that, technically, radar plots employ polar coordinates.] SYSTAT can draw graphs using polar coordinates but not as easily as Excel can, from the user’s perspective. Radar plots work best if all the variables represented are measured on the same scale (e.g. a 1 to 7 Likert-type scale or 0% to 100% scale). Individuals who are missing any scores on the variables being plotted are typically omitted.
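The clock-face layout and the use of polar coordinates can be made concrete with a short sketch. It converts a set of scores into (x, y) positions, placing the first variable at 12 o’clock and moving clockwise. The helper name `radar_points` is hypothetical, and scaling each score by the scale maximum (so that a score of 0 would sit at the centre) is an assumption; a given package may start its radial axis at a different value.

```python
from math import cos, sin, radians

def radar_points(scores, max_score):
    """Map scores to (x, y) points on a radar plot of radius 1.

    Spokes are evenly spaced, in clockwise order, starting at 12 o'clock.
    """
    n = len(scores)
    points = []
    for k, score in enumerate(scores):
        # 90 degrees is 12 o'clock; subtracting moves clockwise around the plot
        angle = radians(90 - k * 360 / n)
        r = score / max_score  # distance from the centre, between 0 and 1
        points.append((r * cos(angle), r * sin(angle)))
    return points

# Four hypothetical ratings on a 1 to 7 scale
pts = radar_points([7, 7, 1, 4], max_score=7)
for x, y in pts:
    print(round(x, 3), round(y, 3))
```

High scores land near the outer ring (peaks) and low scores near the centre (valleys), which is what the shape comparison relies on.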

Radar plot comparing attitude ratings for inspectors 66 and 104

The radar plot in Fig. 5.11 , produced using Excel, compares two specific inspectors, 66 and 104, on the nine attitude rating scales. Inspector 66 gave the highest rating (= 7) on the cultqual variable and inspector 104 gave the lowest rating (= 1). The plot shows that inspector 104 tended to provide very low ratings on all nine attitude variables, whereas inspector 66 tended to give very high ratings on all variables except acctrain and trainapp , where the scores were similar to those for inspector 104. Thus, in general, inspector 66 tended to show much more positive attitudes toward their workplace compared to inspector 104.

While Fig. 5.11 was generated to compare the scores for two individuals in the QCI database, it would be just as easy to produce a radar plot that compared the five types of companies in terms of their average ratings on the nine variables, as shown in Fig. 5.12 .

Radar plot comparing average attitude ratings for five types of company

Here we can form the visual impression that the five types of companies differ most in their average ratings of mgmtcomm and least in the average ratings of polsatis . Overall, the average ratings from inspectors from PC manufacturers (black diamonds with solid lines) seem to be generally the most positive as their scores lie on or near the outer ring of scores and those from Automobile manufacturers tend to be least positive on many variables (except the training-related variables).

Extrapolating from Fig. 5.12 , you may rightly conclude that including too many groups and/or too many variables in a radar plot comparison can lead to so much clutter that any visual comparison would be severely degraded. You may have to experiment with using colour-coded lines to represent different groups versus line and marker shape variations (as used in Fig. 5.12 ), because choice of coding method for groups can influence the interpretability of a radar plot.

Multiplots

A multiplot is simply a hybrid style of graph that can display group comparisons across a number of variables. There is a wide variety of possible multiplots one could potentially design (SYSTAT offers great capabilities with respect to multiplots). Figure 5.13 shows a multiplot comprising a side-by-side series of profile-based line graphs – one graph for each type of company in the QCI database.

Multiplot comparing profiles of average attitude ratings for five company types

The multiplot in Fig. 5.13 , produced using SYSTAT, graphs the profile of average attitude ratings for all inspectors within a specific type of company. This multiplot shows the same story as the radar plot in Fig. 5.12 , but in a different graphical format. It is still fairly clear that the average ratings from inspectors from PC manufacturers tend to be higher than for the other types of companies and the profile for inspectors from automobile manufacturers tends to be lower than for the other types of companies.

The profile for inspectors from large electrical appliance manufacturers is the flattest, meaning that their average attitude ratings were less variable than for other types of companies. Comparing the ease with which you can glean the visual impressions from Figs. 5.12 and 5.13 may lead you to prefer one style of graph over another. If you have such preferences, chances are others will also, which may mean you need to carefully consider your options when deciding how best to display data for effect.

Frequently, choice of graph is less a matter of which style is right or wrong, but more a matter of which style will suit specific purposes or convey a specific story, i.e. the choice is often strategic.

Parallel Coordinate Displays

A parallel coordinate display is useful for displaying individual scores on a range of variables, all measured using the same scale. Furthermore, such graphs can be combined side-by-side to facilitate very broad visual comparisons among groups, while retaining individual profile variability in scores. Each line in a parallel coordinate display represents one individual, e.g. an inspector.

The interpretation of a parallel coordinate display, such as the two shown in Fig. 5.14 , depends on visual impressions of the peaks and valleys (highs and lows) in the profiles as well as on the density of similar profile lines. The graph is called ‘parallel coordinate’ simply because it assumes that all variables are measured on the same scale and that scores for each variable can therefore be located along vertical axes that are parallel to each other (imagine vertical lines on Fig. 5.14 running from bottom to top for each variable on the X-axis). The main drawback of this method of data display is that only those individuals in the sample who provided legitimate scores on all of the variables being plotted (i.e. who have no missing scores) can be displayed.

Parallel coordinate displays comparing profiles of individual attitude ratings for two company types

The parallel coordinate display in Fig. 5.14, produced using SYSTAT, graphs the individual attitude rating profiles of inspectors within two specific types of company: the left graph for inspectors from PC manufacturers and the right graph for automobile manufacturers.

There are fewer lines in each display than the number of inspectors from each type of company simply because several inspectors from each type of company were missing a rating on at least one of the nine attitude variables. The graphs show great variability in scores amongst inspectors within a company type, but there are some overall patterns evident.
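The reason some inspectors drop out of the display can be sketched as a simple filter that keeps only complete cases. The records below are invented, `None` is used here as the missing-value marker, and the helper name `complete_cases` is hypothetical; statistical packages apply an equivalent rule internally.

```python
def complete_cases(rows, variables):
    """Keep only rows with a non-missing score on every listed variable."""
    return [row for row in rows if all(row.get(v) is not None for v in variables)]

# Hypothetical inspectors; None marks a missing rating
inspectors = [
    {"id": 1, "mgmtcomm": 2, "acctrain": 5, "trainapp": 6},
    {"id": 2, "mgmtcomm": None, "acctrain": 4, "trainapp": 5},
    {"id": 3, "mgmtcomm": 1, "acctrain": 6, "trainapp": None},
    {"id": 4, "mgmtcomm": 3, "acctrain": 5, "trainapp": 4},
]

plotted = complete_cases(inspectors, ["mgmtcomm", "acctrain", "trainapp"])
print([row["id"] for row in plotted])  # [1, 4]
```

Only inspectors 1 and 4 would contribute a profile line; a single missing rating is enough to exclude an inspector.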

For example, inspectors from automobile companies clearly and fairly uniformly rated mgmtcomm toward the low end of the scale, whereas the reverse was generally true for that variable for inspectors from PC manufacturers. Conversely, inspectors from automobile companies tend to rate acctrain and trainapp more toward the middle to high end of the scale, whereas the reverse is generally true for those variables for inspectors from PC manufacturers.

Icon Plots

Perhaps the most creative types of multivariate displays are the so-called icon plots. SYSTAT and STATGRAPHICS offer an impressive array of different types of icon plots, including, amongst others, Chernoff’s faces, profile plots, histogram plots, star glyphs and sunray plots (Jacoby 1998 provides a detailed discussion of icon plots).

Icon plots generally use a specific visual construction to represent variable scores obtained by each individual within a sample or group. All icon plots are thus methods for displaying the response patterns for individual members of a sample, as long as those individuals are not missing any scores on the variables to be displayed (note that this is the same limitation as for radar plots and parallel coordinate displays). To illustrate icon plots, without generating too many icons to focus on, Figs. 5.15, 5.16, 5.17 and 5.18 present four different icon plots for QCI inspectors classified, using a new variable called BEST_WORST, as either the worst performers (= 1 where their accuracy scores were less than 70%) or the best performers (= 2 where their accuracy scores were 90% or greater).
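The classification rule for BEST_WORST can be written out directly. The sketch below follows the cut-offs given in the text (below 70% is coded 1, 90% or above is coded 2); returning `None` for inspectors in between, who are therefore left out of the icon plots, and the function name `best_worst` are assumptions made for illustration.

```python
def best_worst(accuracy):
    """Code an accuracy score (%) as 1 (worst), 2 (best) or None (neither)."""
    if accuracy is None:
        return None   # missing accuracy score: cannot be classified
    if accuracy < 70:
        return 1      # worst performers
    if accuracy >= 90:
        return 2      # best performers
    return None       # middle group, not included in the icon plots

# Hypothetical accuracy scores
for score in [65, 70, 85, 90, 98]:
    print(score, best_worst(score))
```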

Chernoff’s faces icon plot comparing individual attitude ratings for best and worst performing inspectors

Profile plot comparing individual attitude ratings for best and worst performing inspectors

Histogram plot comparing individual attitude ratings for best and worst performing inspectors

Sunray plot comparing individual attitude ratings for best and worst performing inspectors

The Chernoff’s faces plot gets its name from the visual icon used to represent variable scores – a cartoon-type face. This icon tries to capitalise on our natural human ability to recognise and differentiate faces. Each feature of the face is controlled by the scores on a single variable. In SYSTAT, up to 20 facial features are controllable; the first five being curvature of mouth, angle of brow, width of nose, length of nose and length of mouth (SYSTAT Software Inc., 2009 , p. 259). The theory behind Chernoff’s faces is that similar patterns of variable scores will produce similar looking faces, thereby making similarities and differences between individuals more apparent.

The profile plot and histogram plot are actually two variants of the same type of icon plot. A profile plot represents individuals’ scores for a set of variables using simplified line graphs, one per individual. The profile is scaled so that the vertical height of the peaks and valleys corresponds to actual values for variables where the variables anchor the X-axis in a fashion similar to the parallel coordinate display. So, as you examine a profile from left to right across the X-axis of each graph, you are looking across the set of variables. A histogram plot represents the same information in the same way as for the profile plot but using histogram bars instead.

Figure 5.15 , produced using SYSTAT, shows a Chernoff’s faces plot for the best and worst performing inspectors using their ratings of job satisfaction, working conditions and the nine general attitude statements.

Each face is labelled with the inspector number it represents. The gaps indicate where an inspector had missing data on at least one of the variables, meaning a face could not be generated for them. The worst performers are drawn using red lines; the best using blue lines. The first variable is jobsat and this variable controls mouth curvature; the second variable is workcond and this controls angle of brow, and so on. It seems clear that there are differences in the faces between the best and worst performers with, for example, best performers tending to be more satisfied (smiling) and with higher ratings for working conditions (brow angle).

Beyond a broad visual impression, there is little in terms of precise inferences you can draw from a Chernoff’s faces plot. It really provides a visual sketch, nothing more. The fact that there is no obvious link between facial features, variables and score levels means that the Chernoff’s faces icon plot is difficult to interpret at the level of individual variables – a holistic impression of similarity and difference is what this type of plot facilitates.

Figure 5.16, produced using SYSTAT, shows a profile plot for the best and worst performing inspectors using their ratings of job satisfaction, working conditions and the nine attitude variables.

Like the Chernoff’s faces plot (Fig. 5.15), as you read across the rows of the plot from left to right, each plot corresponds respectively to an inspector in the sample who was either in the worst performer (red) or best performer (blue) category. The first attitude variable is jobsat and anchors the left end of each line graph; the last variable is polsatis and anchors the right end of the line graph. The remaining variables are represented in order from left to right across the X-axis of each graph. Figure 5.16 shows that these inspectors are rather different in their attitude profiles, with best performers tending to show taller profiles on the first two variables, for example.

Figure 5.17, produced using SYSTAT, shows a histogram plot for the best and worst performing inspectors based on their ratings of job satisfaction, working conditions and the nine attitude variables. This plot tells the same story as the profile plot, only using histogram bars. Some people would prefer the histogram icon plot to the profile plot because each histogram bar corresponds to one variable, making the visual linking of a specific bar to a specific variable much easier than visually linking a specific position along the profile line to a specific variable.

The sunray plot is actually a simplified adaptation of the radar plot (called a “star glyph”) used to represent scores on a set of variables for each individual within a sample or group. Remember that a radar plot basically arranges the variables around a central point like a clock face; the first variable is represented at the 12 o’clock position and the remaining variables follow around the plot in a clockwise direction.

Unlike a radar plot, while the spokes (the actual ‘star’ of the glyph’s name) of the plot are visible, no interpretive scale is evident. A variable’s score is visually represented by its distance from the central point. Thus, the star glyphs in a sunray plot are designed, like Chernoff’s faces, to provide a general visual impression, based on icon shape. A wide diameter well-rounded plot indicates an individual with high scores on all variables and a small diameter well-rounded plot vice-versa. Jagged plots represent individuals with highly variable scores across the variables. ‘Stars’ of similar size, shape and orientation represent similar individuals.

Figure 5.18 , produced using STATGRAPHICS, shows a sunray plot for the best and worst performing inspectors. An interpretation glyph is also shown in the lower right corner of Fig. 5.18 , where variables are aligned with the spokes of a star (e.g. jobsat is at the 12 o’clock position). This sunray plot could lead you to form the visual impression that the worst performing inspectors (group 1) have rather less rounded rating profiles than do the best performing inspectors (group 2) and that the jobsat and workcond spokes are generally lower for the worst performing inspectors.

Comparatively speaking, the sunray plot makes identifying similar individuals a bit easier (perhaps even easier than Chernoff’s faces) and, when ordered as STATGRAPHICS showed in Fig. 5.18 , permits easier visual comparisons between groups of individuals, but at the expense of precise knowledge about variable scores. Remember, a holistic impression is the goal pursued using a sunray plot.

Multivariate graphical methods provide summary techniques for visually presenting certain characteristics of a complex array of data on variables. Such visual representations are generally better at helping us to form holistic impressions of multivariate data than any sort of tabular representation or numerical index. They also allow us to compress many numerical measures into a finite representation that is generally easy to understand. Multivariate graphical displays can add interest to an otherwise dry statistical reporting of numerical data. They are designed to appeal to our pattern recognition skills, focusing our attention on features of the data such as shape, level, variability and orientation. Some multivariate graphs (e.g. radar plots, sunray plots and multiplots) are useful not only for representing score patterns for individuals but also for providing summaries of score patterns across groups of individuals.

Multivariate graphs tend to get very busy-looking and are hard to interpret if a great many variables or a large number of individuals need to be displayed (imagine any of the icon plots, for a sample of 200 questionnaire participants, displayed on an A4 page – each icon would be so small that its features could not be easily distinguished, thereby defeating the purpose of the display). In such cases, using numerical summary statistics (such as averages or correlations) in tabular form alone will provide a more economical and efficient summary. Also, some multivariate displays will work better for conveying certain types of information than others.

Information about variable relationships may be better displayed using a scatterplot matrix. Information about individual similarities and differences on a set of variables may be better conveyed using a histogram or sunray plot. Multiplots may be better suited to displaying information about group differences across a set of variables. Information about the overall similarity of individual entities in a sample might best be displayed using Chernoff’s faces.

Because people differ greatly in their visual capacities and preferences, certain types of multivariate displays will work for some people and not others. Sometimes, people will not see what you see in the plots. Some plots, such as Chernoff’s faces, may not strike a reader as a serious statistical procedure and this could adversely influence how convinced they will be by the story the plot conveys. None of the multivariate displays described here provide sufficiently precise information for solid inferences or interpretations; all are designed to simply facilitate the formation of holistic visual impressions. In fact, you may have noticed that some displays (scatterplot matrices and the icon plots, for example) provide no numerical scaling information that would help make precise interpretations. If precision in summary information is desired, the types of multivariate displays discussed here would not be the best strategic choices.

Virtually any research design which produces quantitative data/statistics for multiple variables provides opportunities for multivariate graphical data display which may help to clarify or illustrate important data characteristics or relationships. Thus, for survey research involving many identically-scaled attitudinal questions, a multivariate display may be just the device needed to communicate something about patterns in the data. Multivariate graphical displays are simply specialised communication tools designed to compress a lot of information into a meaningful and efficient format for interpretation—which tool to choose depends upon the message to be conveyed.

Generally speaking, visual representations of multivariate data could prove more useful in communicating to lay persons who are unfamiliar with statistics or who prefer visual as opposed to numerical information. However, these displays would probably require some interpretive discussion so that the reader clearly understands their intent.

Application | Procedures
SPSS and choose from the gallery; drag the chart type into the working area and customise the chart with desired variables, labels, etc. Only a few elements of each chart can be configured and altered.
NCSS Only a few elements of this plot are customisable in NCSS.
SYSTAT (and you can select what type of plot you want to appear in the diagonal boxes) or ( can be selected by choosing a variable. e.g. ) or or (for icon plots, you can choose from a range of icons including Chernoff’s faces, histogram, star, sun or profile amongst others). A large number of elements of each type of plot are easily customisable, although it may take some trial and error to get exactly the look you want.
STATGRAPHICS or or or Several elements of each type of plot are easily customisable, although it may take some trial and error to get exactly the look you want.
commander You can select what type of plot you want to appear in the diagonal boxes, and you can control some other features of the plot. Other multivariate data displays are available via various packages (e.g. the or package), but not through commander.

Procedure 5.4: Assessing Central Tendency

The three most commonly reported measures of central tendency are the mean, median and mode. Each measure reflects a specific way of defining central tendency in a distribution of scores on a variable and each has its own advantages and disadvantages.

The mean is the most widely used measure of central tendency (also called the arithmetic average). Very simply, a mean is the sum of all the scores for a specific variable in a sample divided by the number of scores used in obtaining the sum. The resulting number reflects the average score for the sample of individuals on which the scores were obtained. If one were asked to predict the score that any single individual in the sample would obtain, the best prediction, in the absence of any other relevant information, would be the sample mean. Many parametric statistical methods (such as Procedures 10.1007/978-981-15-2537-7_7#Sec22 , 10.1007/978-981-15-2537-7_7#Sec32 , 10.1007/978-981-15-2537-7_7#Sec42 and 10.1007/978-981-15-2537-7_7#Sec68) deal with sample means in one way or another. For any sample of data, there is one and only one possible value for the mean in a specific distribution. For most purposes, the mean is the preferred measure of central tendency because it utilises all the available information in a sample.
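The definition above can be sketched in a few lines of Python. The scores below are invented purely for illustration and are not taken from the QCI database; the variable names are likewise assumptions.

```python
from statistics import mean

# Hypothetical scores, invented for illustration (not QCI data).
scores = [4, 5, 5, 6, 7, 3]

# Definition from the text: sum of the scores divided by the number of scores.
manual_mean = sum(scores) / len(scores)

# The standard library's mean() applies the same definition.
library_mean = mean(scores)

print(manual_mean, library_mean)
```

Both calculations use every score in the sample, which is the property the text gives as the reason for preferring the mean.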

In the context of the QCI database, Maree could quite reasonably ask what inspectors scored on the average in terms of mental ability ( mentabil ), inspection accuracy ( accuracy ), inspection speed ( speed ), overall job satisfaction ( jobsat ), and perceived quality of their working conditions ( workcond ). Table 5.3 shows the mean scores for the sample of 112 quality control inspectors on each of these variables. The statistics shown in Table 5.3 were computed using the SPSS Frequencies ... procedure. Notice that the table indicates how many of the 112 inspectors had a valid score for each variable and how many were missing a score (e.g. 109 inspectors provided a valid rating for jobsat; 3 inspectors did not).

Table 5.3 Measures of central tendency for specific QCI variables

Each mean needs to be interpreted in terms of the original units of measurement for each variable. Thus, the inspectors in the sample showed an average mental ability score of 109.84 (higher than the general population mean of 100 for the test), an average inspection accuracy of 82.14%, and an average speed for making quality control decisions of 4.48 s. Furthermore, in terms of their work context, inspectors reported an average overall job satisfaction of 4.96 on the 7-point scale, a level of satisfaction nearly one full scale point above the Neutral point of 4, indicating a generally positive but not strong level of job satisfaction. They also reported an average perceived quality of work conditions of 4.21 on the 7-point scale, which is just about at the level of Stressful but Tolerable.

The mean is sensitive to the presence of extreme values, which can distort its value, giving a biased indication of central tendency. As we will see below, the median is an alternative statistic to use in such circumstances. However, it is also possible to compute what is called a trimmed mean where the mean is calculated after a certain percentage (say, 5% or 10%) of the lowest and highest scores in a distribution have been ignored (a process called ‘trimming’; see, for example, the discussion in Field 2018 , pp. 262–264). This yields a statistic less influenced by extreme scores. The drawbacks are that the decision as to what percentage to trim can be somewhat subjective and trimming necessarily sacrifices information (i.e. the extreme scores) in order to achieve a less biased measure. Some software packages, such as SPSS, SYSTAT or NCSS, can report a specific percentage trimmed mean, if that option is selected for descriptive statistics or exploratory data analysis (see Procedure 5.6 ) procedures. Comparing the original mean with a trimmed mean can provide an indication of the degree to which the original mean has been biased by extreme values.
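A trimmed mean can be sketched as follows. This is a minimal illustration, assuming that the stated percentage is removed from each tail and that the number of scores to drop is rounded down; statistical packages may apply different conventions. The helper name `trimmed_mean` and the decision times are hypothetical.

```python
from statistics import mean

def trimmed_mean(scores, proportion):
    """Mean after ignoring `proportion` of the scores in each tail (hypothetical helper)."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)  # how many scores to drop from each end
    return mean(ordered[k:len(ordered) - k])

# Hypothetical decision times in seconds, with one extreme value (not QCI data).
times = [2.0, 2.5, 3.0, 3.0, 3.5, 4.0, 4.0, 4.5, 5.0, 17.0]

ordinary = mean(times)
trimmed = trimmed_mean(times, 0.10)  # ignore the lowest 10% and the highest 10%

print(ordinary, trimmed)
```

Comparing `ordinary` with `trimmed` gives the kind of indication described above of how far extreme values have pulled the original mean.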

Very simply, the median is the centre or middle score of a set of scores. By ‘centre’ or ‘middle’ is meant that, when the entire distribution of scores is rank ordered from the lowest to highest value, 50% of the data values are smaller than or equal to the median and 50% of the data values are larger. Thus, we can say that the median is that score in the sample which occurs at the 50th percentile. [Note that a ‘percentile’ is attached to a specific score that a specific percentage of the sample scored at or below. Thus, a score at the 25th percentile means that 25% of the sample achieved this score or a lower score.] Table 5.3 shows the 25th, 50th and 75th percentile scores for each variable; note how the 50th percentile score is exactly equal to the median in each case.
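The link between the median and the 50th percentile can be checked with a short sketch. The scores are hypothetical, and `percent_at_or_below` is an illustrative helper rather than a library function. Note that software packages differ in how they interpolate percentiles, so quartile values from `statistics.quantiles` may not match those from another package exactly.

```python
from statistics import median, quantiles

# Hypothetical scores (not QCI data).
scores = [3, 7, 1, 9, 5, 8, 2, 10]

def percent_at_or_below(values, cutoff):
    """Percentage of values that are less than or equal to cutoff."""
    return 100 * sum(v <= cutoff for v in values) / len(values)

middle = median(scores)                      # middle of the rank-ordered scores
share = percent_at_or_below(scores, middle)  # percentage scoring at or below it

# 25th, 50th and 75th percentile scores (default 'exclusive' method).
q1, q2, q3 = quantiles(scores, n=4)

print(middle, share, q2)
```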

The median is reported somewhat less frequently than the mean but does have some advantages over the mean in certain circumstances. One such circumstance is when the sample of data has a few extreme values in one direction (either very large or very small relative to all other scores). In this case, the mean would be influenced (biased) to a much greater degree than would the median since all of the data are used to calculate the mean (including the extreme scores) whereas only the single centre score is needed for the median. For this reason, many nonparametric statistical procedures (such as Procedures 10.1007/978-981-15-2537-7_7#Sec27 , 10.1007/978-981-15-2537-7_7#Sec37 and 10.1007/978-981-15-2537-7_7#Sec63) focus on the median as the comparison statistic rather than on the mean.
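The differing sensitivity of the two statistics can be seen by adding a single extreme score to a small sample. The data below are made up for illustration only.

```python
from statistics import mean, median

# Hypothetical scores without and with one extreme value (not QCI data).
base = [2, 3, 3, 4, 4, 5, 5, 6]
with_extreme = base + [40]

mean_before, median_before = mean(base), median(base)
mean_after, median_after = mean(with_extreme), median(with_extreme)

# The mean shifts because the extreme score enters the sum;
# the median depends only on the middle of the rank order.
print(mean_before, mean_after)
print(median_before, median_after)
```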

A discrepancy between the values for the mean and median of a variable provides some insight into the degree to which the mean is being influenced by the presence of extreme data values. In a distribution where there are no extreme values on either side of the distribution (or where extreme values balance each other out on either side of the distribution, as happens in a normal distribution – see Fundamental Concept II), the mean and the median will coincide at the same value and the mean will not be biased.

For highly skewed distributions, however, the value of the mean will be pulled toward the long tail of the distribution because that is where the extreme values lie. However, in such skewed distributions, the median will be insensitive (statisticians call this property ‘robustness’) to extreme values in the long tail. For this reason, the direction of the discrepancy between the mean and median can give a very rough indication of the direction of skew in a distribution (‘mean larger than median’ signals possible positive skewness; ‘mean smaller than median’ signals possible negative skewness). Like the mean, there is one and only one possible value for the median in a specific distribution.

In Fig. 5.19 , the left graph shows the distribution of speed scores and the right-hand graph shows the distribution of accuracy scores. The speed distribution clearly shows the mean being pulled toward the right tail of the distribution whereas the accuracy distribution shows the mean being just slightly pulled toward the left tail. The effect on the mean is stronger in the speed distribution indicating a greater biasing effect due to some very long inspection decision times.

Fig. 5.19 Effects of skewness in a distribution on the values for the mean and median

If we refer to Table 5.3 , we can see that the median score for each of the five variables has also been computed. Like the mean, the median must be interpreted in the original units of measurement for the variable. We can see that for mentabil , accuracy , and workcond , the value of the median is very close to the value of the mean, suggesting that these distributions are not strongly influenced by extreme data values in either the high or low direction. However, note that the median speed was 3.89 s compared to the mean of 4.48 s, suggesting that the distribution of speed scores is positively skewed (the mean is larger than the median—refer to Fig. 5.19 ). Conversely, the median jobsat score was 5.00 whereas the mean score was 4.96 suggesting very little substantive skewness in the distribution (mean and median are nearly equal).
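The rough mean-versus-median check can be written as a small helper. The function name `skew_hint` and its labels are illustrative assumptions; the input values are the means and medians quoted in the text for speed and jobsat.

```python
def skew_hint(mean_value, median_value):
    """Return the mean-median gap and a very rough direction-of-skew label."""
    gap = round(mean_value - median_value, 2)
    if gap > 0:
        label = "possible positive skew"
    elif gap < 0:
        label = "possible negative skew"
    else:
        label = "no skew indicated"
    return gap, label

# Means and medians quoted in the text for Table 5.3.
speed_hint = skew_hint(4.48, 3.89)
jobsat_hint = skew_hint(4.96, 5.00)

print(speed_hint)
print(jobsat_hint)
```

The sign of the gap is only half of the picture: the jobsat gap is tiny relative to its 7-point scale, which is why the text reads it as showing very little substantive skewness.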

The mode is the simplest measure of central tendency. It is defined as the most frequently occurring score in a distribution. Put another way, it is the score that more individuals in the sample obtain than any other score. An interesting problem associated with the mode is that there may be more than one in a specific distribution. In the case where multiple modes exist, the issue becomes which value do you report? The answer is that you must report all of them. In a ‘normal’ bell-shaped distribution, there is only one mode and it is indeed at the centre of the distribution, coinciding with both the mean and the median.

Table 5.3 also shows the mode for each of the five variables. For example, more inspectors achieved a mentabil score of 111 than any other score, and inspectors reported a jobsat rating of 6 more often than any other rating. SPSS only ever reports one mode even if several are present, so one must be careful and look at a histogram plot for each variable to make a final determination of the mode(s) for that variable.
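Where software reports only one mode, all modes can be recovered directly from the scores. The sketch below uses Python's `statistics.multimode` on hypothetical ratings; it is offered as an illustration and not as a description of how any of the packages discussed here work.

```python
from statistics import multimode

# Hypothetical 7-point ratings in which two values tie for most frequent.
ratings = [6, 4, 6, 5, 4, 7, 3]

# multimode() returns every value that shares the highest frequency.
modes = sorted(multimode(ratings))

print(modes)
```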

All three measures of central tendency yield information about what is going on in the centre of a distribution of scores. The mean and median provide a single number which can summarise the central tendency in the entire distribution. The mode can yield one or multiple indices. With many measurements on individuals in a sample, it is advantageous to have single number indices which can describe the distributions in summary fashion. In a normal or near-normal distribution of sample data, the mean, the median, and the mode will all generally coincide at the one point. In this instance, all three statistics will provide approximately the same indication of central tendency. Note however that it is seldom the case that all three statistics would yield exactly the same number for any particular distribution. The mean is the most useful statistic, unless the data distribution is skewed by extreme scores, in which case the median should be reported.

While measures of central tendency are useful descriptors of distributions, summarising data using a single numerical index necessarily reduces the amount of information available about the sample. Not only do we need to know what is going on in the centre of a distribution, we also need to know what is going on around the centre of the distribution. For this reason, most social and behavioural researchers report not only measures of central tendency, but also measures of variability (see Procedure 5.5 ). The mode is the least informative of the three statistics because of its potential for producing multiple values.

Measures of central tendency are useful in almost any type of experimental design, survey or interview study, and in any observational studies where quantitative data are available and must be summarised. The decision as to whether the mean or median should be reported depends upon the nature of the data which should ideally be ascertained by visual inspection of the data distribution. Some researchers opt to report both measures routinely. Computation of means is a prelude to many parametric statistical methods (see, for example, Procedure 10.1007/978-981-15-2537-7_7#Sec22 , 10.1007/978-981-15-2537-7_7#Sec32 , 10.1007/978-981-15-2537-7_7#Sec42 , 10.1007/978-981-15-2537-7_7#Sec52 , 10.1007/978-981-15-2537-7_7#Sec68 , 10.1007/978-981-15-2537-7_7#Sec76 and 10.1007/978-981-15-2537-7_7#Sec105); comparison of medians is associated with many nonparametric statistical methods (see, for example, Procedure 10.1007/978-981-15-2537-7_7#Sec27 , 10.1007/978-981-15-2537-7_7#Sec37 , 10.1007/978-981-15-2537-7_7#Sec63 and 10.1007/978-981-15-2537-7_7#Sec81).

Application | Procedures
SPSS then press the ‘ ’ button and choose mean, median and mode. To see trimmed means, you must use the Exploratory Data Analysis procedure; see .
NCSS then select the reports and plots that you want to see; make sure you indicate that you want to see the ‘Means Section’ of the Report. If you want to see trimmed means, tick the ‘Trimmed Section’ of the Report.
SYSTAT … then select the mean, median and mode (as well as any other statistics you might wish to see). If you want to see trimmed means, tick the ‘Trimmed mean’ section of the dialog box and set the percentage to trim in the box labelled ‘Two-sided’.
STATGRAPHICS or then choose the variable(s) you want to describe and select Summary Statistics (you don’t get any options for statistics to report – measures of central tendency and variability are automatically produced). STATGRAPHICS will not report modes and you will need to use and request ‘Percentiles’ in order to see the 50%ile score which will be the median; however, it won’t be labelled as the median.
Commander then select the central tendency statistics you want to see. Commander will not produce modes and to see the median, make sure that the ‘Quantiles’ box is ticked – the .5 quantile score (= 50%ile) score is the median; however, it won’t be labelled as the median.

Procedure 5.5: Assessing Variability

There are a variety of measures of variability to choose from including the range, interquartile range, variance and standard deviation. Each measure reflects a specific way of defining variability in a distribution of scores on a variable and each has its own advantages and disadvantages. Most measures of variability are associated with a specific measure of central tendency so that researchers are now commonly expected to report both a measure of central tendency and its associated measure of variability whenever they display numerical descriptive statistics on continuous or ranked-ordered variables.

The range is the simplest measure of variability for a sample of data scores. It is merely the largest score in the sample minus the smallest score in the sample. The range is the one measure of variability not explicitly associated with any measure of central tendency. It gives a very rough indication as to the extent of spread in the scores. However, since the range uses only two of the total available scores in the sample, the rest of the scores are ignored, which means that a lot of potentially useful information is being sacrificed. There are also problems if either the highest or lowest (or both) scores are atypical or too extreme in their value (as in highly skewed distributions). When this happens, the range gives a very inflated picture of the typical variability in the scores. Thus, the range tends not to be a frequently reported measure of variability.

Table 5.4 shows a set of descriptive statistics, produced by the SPSS Frequencies procedure, for the mentabil, accuracy, speed, jobsat and workcond measures in the QCI database. In the table, you will find three rows labelled ‘Range’, ‘Minimum’ and ‘Maximum’.

Table 5.4 Measures of central tendency and variability for specific QCI variables

Using the data from these three rows, we can draw the following descriptive picture. Mentabil scores spanned a range of 50 (from a minimum score of 85 to a maximum score of 135). Speed scores had a range of 16.05 s (from 1.05 s, the fastest quality decision, to 17.10 s, the slowest quality decision). Accuracy scores had a range of 43 percentage points (from 57%, the least accurate inspector, to 100%, the most accurate inspector). Both work context measures ( jobsat and workcond ) exhibited a range of 6, the largest possible range given the 1 to 7 scale of measurement for these two variables.
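The ranges quoted above follow directly from the reported minimum and maximum scores, as this short check shows. The dictionary layout is an illustrative assumption; the numbers are those given in the text.

```python
# (minimum, maximum) pairs as quoted in the text for Table 5.4.
reported = {
    "mentabil": (85, 135),
    "speed": (1.05, 17.10),
    "accuracy": (57, 100),
}

# Range = largest score minus smallest score; rounded to tidy float arithmetic.
ranges = {name: round(high - low, 2) for name, (low, high) in reported.items()}

print(ranges)
```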

Interquartile Range

The Interquartile Range ( IQR ) is a measure of variability that is specifically designed to be used in conjunction with the median. The IQR also takes care of the extreme data problem which typically plagues the range measure. The IQR is defined as the range that is covered by the middle 50% of scores in a distribution once the scores have been ranked in order from lowest value to highest value. It is found by locating the value in the distribution at or below which 25% of the sample scored and subtracting this number from the value in the distribution at or below which 75% of the sample scored. The IQR can also be thought of as the range one would compute after the bottom 25% of scores and the top 25% of scores in the distribution have been ‘chopped off’ (or ‘trimmed’ as statisticians call it).

The IQR gives a much more stable picture of the variability of scores and, like the median, is relatively insensitive to the biasing effects of extreme data values. Some behavioural researchers prefer to divide the IQR in half which gives a measure called the Semi-Interquartile Range ( S-IQR ) . The S-IQR can be interpreted as the distance one must travel away from the median, in either direction, to reach the value which separates the top (or bottom) 25% of scores in the distribution from the remaining 75%.

The IQR or S-IQR is typically not produced by descriptive statistics procedures by default in many computer software packages; however, it can usually be requested as an optional statistic to report or it can easily be computed by hand using percentile scores. Both the median and the IQR figure prominently in Exploratory Data Analysis, particularly in the production of boxplots (see Procedure 5.6 ).

Figure 5.20 illustrates the conceptual nature of the IQR and S-IQR compared to that of the range. Assume that 100% of data values are covered by the distribution curve in the figure. It is clear that these three measures would provide very different values for a measure of variability. Your choice would depend on your purpose. If you simply want to signal the overall span of scores between the minimum and maximum, the range is the measure of choice. But if you want to signal the variability around the median, the IQR or S-IQR would be the measure of choice.

Fig. 5.20 How the range, IQR and S-IQR measures of variability conceptually differ

Note: Some behavioural researchers refer to the IQR as the hinge-spread (or H-spread ) because of its use in the production of boxplots:

  • the 25th percentile data value is referred to as the ‘lower hinge’;
  • the 75th percentile data value is referred to as the ‘upper hinge’; and
  • their difference gives the H-spread.

Midspread is another term you may see used as a synonym for interquartile range.

Referring back to Table 5.4 , we can find statistics reported for the median and for the ‘quartiles’ (25th, 50th and 75th percentile scores) for each of the five variables of interest. The ‘quartile’ values are useful for finding the IQR or S-IQR because SPSS does not report these measures directly. The median clearly equals the 50th percentile data value in the table.

If we focus, for example, on the speed variable, we could find its IQR by subtracting the 25th percentile score of 2.19 s from the 75th percentile score of 5.71 s to give a value for the IQR of 3.52 s (the S-IQR would simply be 3.52 divided by 2 or 1.76 s). Thus, we could report that the median decision speed for inspectors was 3.89 s and that the middle 50% of inspectors showed scores spanning a range of 3.52 s. Alternatively, we could report that the median decision speed for inspectors was 3.89 s and that the middle 50% of inspectors showed scores which ranged 1.76 s either side of the median value.
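The hand calculation for speed can be reproduced in a few lines. The helper name `iqr_from_percentiles` is hypothetical; the percentile scores are the ones quoted in the text.

```python
def iqr_from_percentiles(p25, p75):
    """Return (IQR, S-IQR) from the 25th and 75th percentile scores."""
    iqr = p75 - p25
    return iqr, iqr / 2

# 25th and 75th percentile scores for speed, as quoted in the text.
speed_iqr, speed_s_iqr = iqr_from_percentiles(2.19, 5.71)

print(round(speed_iqr, 2), round(speed_s_iqr, 2))
```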

Note: We could compare the ‘Minimum’ or ‘Maximum’ scores to the 25th percentile score and 75th percentile score respectively to get a feeling for whether the minimum or maximum might be considered extreme or uncharacteristic data values.

The variance uses information from every individual in the sample to assess the variability of scores relative to the sample mean. Variance assesses the average squared deviation of each score from the mean of the sample. Deviation refers to the difference between an observed score value and the mean of the sample—they are squared simply because adding them up in their naturally occurring unsquared form (where some differences are positive and others are negative) always gives a total of zero, which is useless for an index purporting to measure something.

If many scores are quite different from the mean, we would expect the variance to be large. If all the scores lie fairly close to the sample mean, we would expect a small variance. If all scores exactly equal the mean (i.e. all the scores in the sample have the same value), then we would expect the variance to be zero.
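A brief sketch makes these points concrete: raw deviations from the mean sum to zero, squaring them yields a usable index, and identical scores give zero variance. The scores are hypothetical. The text describes variance as the average squared deviation (division by n, which `pvariance` computes); many packages divide by n − 1 for sample data instead (`variance`), so both are shown.

```python
from statistics import mean, pvariance, variance

# Hypothetical scores (not QCI data).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

centre = mean(scores)
deviations = [s - centre for s in scores]

# Unsquared deviations cancel out, which is why they are squared.
deviation_total = sum(deviations)
squared_total = sum(d ** 2 for d in deviations)

average_squared = pvariance(scores)   # squared_total / n
sample_variance = variance(scores)    # squared_total / (n - 1)

# If every score equals the mean, the variance is zero.
no_spread = pvariance([100, 100, 100])

print(deviation_total, squared_total, average_squared, sample_variance, no_spread)
```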

Figure 5.21 illustrates some possibilities regarding variance of a distribution of scores having a mean of 100. The very tall curve illustrates a distribution with small variance. The distribution of medium height illustrates a distribution with medium variance and the flattest distribution is a distribution with large variance.

Fig. 5.21 The concept of variance

If we had a distribution with no variance, the curve would simply be a vertical line at a score of 100 (meaning that all scores were equal to the mean). You can see that as variance increases, the tails of the distribution extend further outward and the concentration of scores around the mean decreases. You may have noticed that variance and range (as well as the IQR) will be related, since the range focuses on the difference between the ends of the two tails in the distribution and larger variances extend the tails. So, a larger variance will generally be associated with a larger range and IQR compared to a smaller variance.

It is generally difficult to descriptively interpret the variance measure in a meaningful fashion since it involves squared deviations around the sample mean. [Note: If you look back at Table 5.4 , you will see the variance listed for each of the variables (e.g. the variance of accuracy scores is 84.118), but the numbers themselves make little sense and do not relate to the original measurement scale for the variables (which, for the accuracy variable, went from 0% to 100% accuracy).] Instead, we use the variance as a steppingstone for obtaining a measure of variability that we can clearly interpret, namely the standard deviation . However, you should know that variance is an important concept in its own right simply because it provides the statistical foundation for many of the correlational procedures and statistical inference procedures described in Chaps. 10.1007/978-981-15-2537-7_6 , 10.1007/978-981-15-2537-7_7 and 10.1007/978-981-15-2537-7_8.

When considering either correlations or tests of statistical hypotheses, we frequently speak of one variable explaining or sharing variance with another (see Procedure 10.1007/978-981-15-2537-7_6#Sec27 and 10.1007/978-981-15-2537-7_7#Sec47 ). In doing so, we are invoking the concept of variance as set out here—what we are saying is that variability in the behaviour of scores on one particular variable may be associated with or predictive of variability in scores on another variable of interest (e.g. it could explain why those scores have a non-zero variance).

Standard Deviation

The standard deviation (often abbreviated as SD, sd or Std. Dev.) is the most commonly reported measure of variability because it has a meaningful interpretation and is used in conjunction with reports of sample means. Variance and standard deviation are closely related measures in that the standard deviation is found by taking the square root of the variance. The standard deviation, very simply, is a summary number that reflects the ‘average distance of each score from the mean of the sample’. In many parametric statistical methods, both the sample mean and sample standard deviation are employed in some form. Thus, the standard deviation is a very important measure, not only for data description, but also for hypothesis testing and the establishment of relationships as well.

Referring again back to Table 5.4 , we’ll focus on the results for the speed variable for discussion purposes. Table 5.4 shows that the mean inspection speed for the QCI sample was 4.48 s. We can also see that the standard deviation (in the row labelled ‘Std Deviation’) for speed was 2.89 s.

This standard deviation has a straightforward interpretation: we would say that ‘on the average, an inspector’s quality inspection decision speed differed from the mean of the sample by about 2.89 s in either direction’. In a normal distribution of scores (see Fundamental Concept II ), we would expect to see about 68% of all inspectors having decision speeds between 1.59 s (the mean minus one amount of the standard deviation) and 7.37 s (the mean plus one amount of the standard deviation).
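The arithmetic behind these figures can be verified as follows. The values are those quoted in the text (the variance of accuracy scores, and the mean and standard deviation for speed); the variable names are illustrative assumptions.

```python
from math import sqrt

# Standard deviation is the square root of the variance.
accuracy_variance = 84.118            # variance of accuracy quoted in the text
accuracy_sd = sqrt(accuracy_variance)

# One standard deviation either side of the mean for speed (seconds).
speed_mean, speed_sd = 4.48, 2.89
lower = speed_mean - speed_sd
upper = speed_mean + speed_sd

print(round(accuracy_sd, 2), round(lower, 2), round(upper, 2))
```

The interval from `lower` to `upper` is the band that would hold about 68% of scores only if the distribution were normal, as the text notes.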

We noted earlier that the range of the speed scores was 16.05 s. However, the fact that the maximum speed score was 17.1 s compared to the 75th percentile score of just 5.71 s seems to suggest that this maximum speed might be rather atypically large compared to the bulk of speed scores. This means that the range is likely to be giving us a false impression of the overall variability of the inspectors’ decision speeds.

Furthermore, given that the mean speed score was higher than the median speed score, suggesting that speed scores were positively skewed (this was confirmed by the histogram for speed shown in Fig. 5.19 in Procedure 5.4 ), we might consider emphasising the median and its associated IQR or S-IQR rather than the mean and standard deviation. Of course, similar diagnostic and interpretive work could be done for each of the other four variables in Table 5.4 .

Measures of variability (particularly the standard deviation) provide a summary measure that gives an indication of how variable (spread out) a particular sample of scores is. When used in conjunction with a relevant measure of central tendency (particularly the mean), a reasonable yet economical description of a set of data emerges. When there are extreme data values or severe skewness is present in the data, the IQR (or S-IQR) becomes the preferred measure of variability to be reported in conjunction with the sample median (or 50th percentile value). These latter measures are much more resistant (‘robust’) to influence by data anomalies than are the mean and standard deviation.

As mentioned above, the range is a very cursory index of variability; thus, it is not as useful as variance or standard deviation. Variance has little meaningful interpretation as a descriptive index; hence, standard deviation is most often reported. However, the standard deviation (or IQR) has little meaning if the sample mean (or median) is not reported along with it.

Knowing that the standard deviation for accuracy is 9.17 tells you little unless you also know the mean accuracy (82.14) around which that deviation is measured.

Like the sample mean, the standard deviation can be strongly biased by the presence of extreme data values or severe skewness in a distribution in which case the median and IQR (or S-IQR) become the preferred measures. The biasing effect will be most noticeable in samples which are small in size (say, less than 30 individuals) and far less noticeable in large samples (say, in excess of 200 or 300 individuals). [Note that, in a manner similar to a trimmed mean, it is possible to compute a trimmed standard deviation to reduce the biasing effect of extreme data values, see Field 2018 , p. 263.]

It is important to realise that the resistance of the median and IQR (or S-IQR) to extreme values is only gained by deliberately sacrificing a good deal of the information available in the sample (nothing is obtained without a cost in statistics). What is sacrificed is information from all other members of the sample other than those members who scored at the median and 25th and 75th percentile points on a variable of interest; information from all members of the sample would automatically be incorporated in mean and standard deviation for that variable.

Any investigation where you might report on or read about measures of central tendency on certain variables should also report measures of variability. This is particularly true for data from experiments, quasi-experiments, observational studies and questionnaires. It is important to consider measures of central tendency and measures of variability to be inextricably linked—one should never report one without the other if an adequate descriptive summary of a variable is to be communicated.

Other descriptive measures, such as those for skewness and kurtosis 1 may also be of interest if a more complete description of any variable is desired. Most good statistical packages can be instructed to report these additional descriptive measures as well.

Of all the statistics you are likely to encounter in the business, behavioural and social science research literature, means and standard deviations will dominate as measures for describing data. Additionally, these statistics will usually be reported when any parametric tests of statistical hypotheses are presented as the mean and standard deviation provide an appropriate basis for summarising and evaluating group differences.

Application | Procedures
SPSS then press the ‘ ’ button and choose Std. Deviation, Variance, Range, Minimum and/or Maximum as appropriate. SPSS does not produce or have an option to produce either the IQR or S-IQR; however, if you request ‘Quantiles’ you will see the 25th and 75th %ile scores, which can then be used to quickly compute either variability measure. Remember to select appropriate central tendency measures as well.
NCSS then select the reports and plots that you want to see; make sure you indicate that you want to see the Variance Section of the Report. Remember to select appropriate central tendency measures as well (by opting to see the Means Section of the Report).
SYSTAT … then select SD, Variance, Range, Interquartile range, Minimum and/or Maximum as appropriate. Remember to select appropriate central tendency measures as well.
STATGRAPHICS or then choose the variable(s) you want to describe and select Summary Statistics (you don’t get any options for statistics to report – measures of central tendency and variability are automatically produced). STATGRAPHICS does not produce either the IQR or S-IQR; however, ‘Percentiles’ can be requested in order to see the 25th and 75th %ile scores, which can then be used to quickly compute either variability measure.
Commander then select either the Standard Deviation or Interquartile Range as appropriate. Commander will not produce the range statistic or report minimum or maximum scores. Remember to select appropriate central tendency measures as well.
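
Where a package reports only the 25th and 75th percentile scores, the two variability measures mentioned above follow from one subtraction. The short sketch below illustrates this, assuming that the S-IQR is the semi-interquartile range (half of the IQR); the function names and the percentile values of 77 and 89 are illustrative only and are not output from any package.

```python
def iqr_from_percentiles(p25, p75):
    """Interquartile range: distance between the 75th and 25th percentile scores."""
    return p75 - p25


def semi_iqr_from_percentiles(p25, p75):
    """Semi-interquartile range (S-IQR): assumed here to be half of the IQR."""
    return iqr_from_percentiles(p25, p75) / 2


# Illustrative percentile scores (assumed values, not package output)
p25, p75 = 77, 89
print("IQR   =", iqr_from_percentiles(p25, p75))
print("S-IQR =", semi_iqr_from_percentiles(p25, p75))
```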

Fundamental Concept I: Basic Concepts in Probability

The concept of simple probability.

In Procedures 5.1 and 5.2 , you encountered the idea of the frequency of occurrence of specific events such as particular scores within a sample distribution. Furthermore, it is a simple operation to convert the frequency of occurrence of a specific event into a number representing the relative frequency of that event. The relative frequency of an observed event is merely the number of times the event is observed divided by the total number of times one makes an observation. The resulting number ranges between 0 and 1 but we typically re-express this number as a percentage by multiplying it by 100%.

In the QCI database, Maree Lakota observed data from 112 quality control inspectors of which 58 were male and 51 were female (gender indications were missing for three inspectors). The statistics 58 and 51 are thus the frequencies of occurrence for two specific types of research participant, a male inspector or a female inspector.

If she divided each frequency by the total number of observations (i.e. 112), she would obtain .52 for males and .46 for females (leaving .02 of observations with unknown gender). These statistics are relative frequencies which indicate the proportion of times that Maree obtained data from a male or female inspector. Multiplying each relative frequency by 100% would yield 52% and 46% which she could interpret as indicating that 52% of her sample was male and 46% was female (leaving 2% of the sample with unknown gender).
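
The arithmetic behind these relative frequencies can be sketched in a few lines. The frequencies (58 and 51) and the sample size (112) come from the QCI example; the function name `relative_frequency` is simply a label chosen for this illustration.

```python
def relative_frequency(count, n_observations):
    """Number of times an event is observed divided by the total number of observations."""
    return count / n_observations


n_inspectors = 112  # total observations in the QCI sample
frequencies = {"male": 58, "female": 51}

for gender, count in frequencies.items():
    rf = relative_frequency(count, n_inspectors)
    # Show the proportion and the same value re-expressed as a percentage
    print(f"{gender}: {rf:.2f} or {rf * 100:.0f}%")
```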

It does not take much of a leap in logic to move from the concept of ‘relative frequency’ to the concept of ‘probability’. In our discussion above, we focused on relative frequency as indicating the proportion or percentage of times a specific category of participant was obtained in a sample. The emphasis here is on data from a sample.

Imagine now that Maree had infinite resources and research time and was able to obtain ever larger samples of quality control inspectors for her study. She could still compute the relative frequencies for obtaining data from males and females in her sample but as her sample size grew larger and larger, she would notice these relative frequencies converging toward some fixed values.

If, by some miracle, Maree could observe all of the quality control inspectors on the planet today, she would have measured the entire population and her computations of relative frequency for males and females would yield two precise numbers, each indicating the proportion of the population of inspectors that was male and the proportion that was female.

If Maree were then to list all of these inspectors and randomly choose one from the list, the chances that she would choose a male inspector would be equal to the proportion of the population of inspectors that was male and this logic extends to choosing a female inspector. The number used to quantify this notion of ‘chances’ is called a probability. Maree would therefore have established the probability of randomly observing a male or a female inspector in the population on any specific occasion.

Probability is expressed on a 0.0 (the observation or event will certainly not be seen) to 1.0 (the observation or event will certainly be seen) scale where values close to 0.0 indicate observations that are less certain to be seen and values close to 1.0 indicate observations that are more certain to be seen (a value of .5 indicates an even chance that an observation or event will or will not be seen – a state of maximum uncertainty). Statisticians often interpret a probability as the likelihood of observing an event or type of individual in the population.

In the QCI database, we noted that the relative frequency of observing males was .52 and for females was .46. If we take these relative frequencies as estimates of the proportions of each gender in the population of inspectors, then .52 and .46 represent the probability of observing a male or female inspector, respectively.

Statisticians would state this as “the probability of observing a male quality control inspector is .52” or in a more commonly used shorthand code, the likelihood of observing a male quality control inspector is p = .52 (p for probability). For some, probabilities make more sense if they are converted to percentages (by multiplying by 100%). Thus, p = .52 can also be understood as a 52% chance of observing a male quality control inspector.

We have seen that relative frequency is a sample statistic that can be used to estimate the population probability. Our estimate will get more precise as we use larger and larger samples (technically, as the size of our samples more closely approximates the size of our population). In most behavioural research, we never have access to entire populations so we must always estimate our probabilities.
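
The way a relative frequency settles toward the population probability as samples grow can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population proportion of .52 is a hypothetical value fixed for the demonstration, the function name is invented here, and the random seed is set only so that the run is repeatable.

```python
import random


def estimate_probability(true_p, sample_size, rng):
    """Relative frequency of the event in one random sample of the given size."""
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_p)
    return hits / sample_size


TRUE_P = 0.52               # hypothetical population proportion (assumption)
rng = random.Random(2020)   # fixed seed so the illustration is repeatable

for n in (10, 100, 1_000, 10_000, 100_000):
    estimate = estimate_probability(TRUE_P, n, rng)
    # The absolute error tends to shrink as the sample size increases
    print(f"n = {n:>6}: estimate = {estimate:.3f}, error = {abs(estimate - TRUE_P):.3f}")
```

Small samples can land some distance from the assumed population value, whereas very large samples will usually sit close to it.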

In some very special populations, having a known number of fixed possible outcomes, such as results of coin tosses or rolls of a die, we can analytically establish event probabilities without doing an infinite number of observations; all we must do is assume that we have a fair coin or die. Thus, with a fair coin, the probability of observing a H or a T on any single coin toss is ½ or .5 or 50%; the probability of observing a 6 on any single throw of a die is 1/6 or .16667 or 16.667%. With behavioural data, though, we can never measure all possible behavioural outcomes, which thereby forces researchers to depend on samples of observations in order to make estimates of population values.

The concept of probability is central to much of what is done in the statistical analysis of behavioural data. Whenever a behavioural scientist wishes to establish whether a particular relationship exists between variables or whether two groups, treated differently, actually show different behaviours, he/she is playing a probability game. Given a sample of observations, the behavioural scientist must decide whether what he/she has observed is providing sufficient information to conclude something about the population from which the sample was drawn.

This decision always has a non-zero probability of being in error simply because in samples that are much smaller than the population, there is always the chance or probability that we are observing something rare and atypical instead of something which is indicative of a consistent population trend. Thus, the concept of probability forms the cornerstone for statistical inference about which we will have more to say later (see Fundamental Concept 10.1007/978-981-15-2537-7_7#Sec6). Probability also plays an important role in helping us to understand theoretical statistical distributions (e.g. the normal distribution) and what they can tell us about our observations. We will explore this idea further in Fundamental Concept II .

The Concept of Conditional Probability

It is important to understand that the concept of probability as described above focuses upon the likelihood or chances of observing a specific event or type of observation for a specific variable relative to a population or sample of observations. However, many important behavioural research issues may focus on the question of the probability of observing a specific event given that the researcher has knowledge that some other event has occurred or been observed (this latter event is usually measured by a second variable). Here, the focus is on the potential relationship or link between two variables or two events.

With respect to the QCI database, Maree could ask the quite reasonable question “what is the probability (estimated in the QCI sample by a relative frequency) of observing an inspector being female given that she knows that an inspector works for a Large Business Computer manufacturer?”

To address this question, all she needs to know is:

  • how many inspectors from Large Business Computer manufacturers are in the sample ( 22 ); and
  • how many of those inspectors were female ( 7 ) (inspectors who were missing a score for either company or gender have been ignored here).

If she divides 7 by 22, she would obtain the probability that an inspector is female given that they work for a Large Business Computer manufacturer – that is, p = .32 .

This type of question points to the important concept of conditional probability (‘conditional’ because we are asking “what is the probability of observing one event conditional upon our knowledge of some other event”).

Continuing with the previous example, Maree would say that the conditional probability of observing a female inspector working for a Large Business Computer manufacturer is .32 or, equivalently, a 32% chance. Compare this conditional probability of p  = .32 to the overall probability of observing a female inspector in the entire sample ( p  = .46 as shown above).

This means that there is evidence for a connection or relationship between gender and the type of company an inspector works for. That is, the chances are lower for observing a female inspector from a Large Business Computer manufacturer than they are for simply observing a female inspector at all.

Maree therefore has evidence suggesting that females may be relatively under-represented in Large Business Computer manufacturing companies compared to the overall population. Knowing something about the company an inspector works for therefore can help us make a better prediction about their likely gender.

Suppose, however, that Maree’s conditional probability had been exactly equal to p = .46. This would mean that there was exactly the same chance of observing a female inspector working for a Large Business Computer manufacturer as there was of observing a female inspector in the general population. Here, knowing something about the company an inspector works for doesn’t help Maree make any better prediction about their likely gender. This would mean that the two variables are statistically independent of each other.

A classic case of events that are statistically independent is two successive throws of a fair die: rolling a six on the first throw gives us no information for predicting how likely it will be that we would roll a six on the second throw. The conditional probability of observing a six on the second throw given that I have observed a six on the first throw is 0.16667 (= 1 divided by 6), which is the same as the simple probability of observing a six on any specific throw. This statistical independence also means that if we wanted to know the probability of throwing two sixes on two successive throws of a fair die, we would just multiply the probabilities for each independent event (i.e., throw) together; that is, .16667 × .16667 = .02778 (this is known as the multiplication rule of probability, see, for example, Smithson 2000, p. 114).
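
The die example can be checked by listing all 36 equally likely outcomes of two throws and counting. The sketch below uses exact fractions to avoid rounding; the helper `prob` is a name introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first throw, second throw) outcomes for a fair die
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event as the share of outcomes for which it is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


p_first_six = prob(lambda o: o[0] == 6)
p_second_six = prob(lambda o: o[1] == 6)
p_two_sixes = prob(lambda o: o == (6, 6))

# Conditional probability: six on the second throw given a six on the first
p_second_given_first = p_two_sixes / p_first_six

print("P(six on second throw)             =", p_second_six)
print("P(six on second | six on first)    =", p_second_given_first)
print("P(two sixes) by counting           =", p_two_sixes)
print("P(two sixes) by multiplication rule =", p_first_six * p_second_six)
print("As a decimal:", round(float(p_two_sixes), 5))
```

Because the conditional probability equals the simple probability, the two throws are independent, and the product of the two simple probabilities matches the count-based result.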

Finally, you should know that conditional probabilities are often asymmetric. This means that for many types of behavioural variables, reversing the conditional arrangement will change the story about the relationship. Bayesian statistics (see Fundamental Concept 10.1007/978-981-15-2537-7_7#Sec73) relies heavily upon this asymmetric relationship between conditional probabilities.

Maree has already learned that the conditional probability that an inspector is female given that they worked for a Large Business Computer manufacturer is p = .32. She could easily turn the conditional relationship around and ask what is the conditional probability that an inspector works for a Large Business Computer manufacturer given that the inspector is female?

From the QCI database, she can find that 51 inspectors in her total sample were female and of those 51, 7 worked for a Large Business Computer manufacturer. If she divided 7 by 51, she would get p = .14 (did you notice that all that changed was the number she divided by?). Thus, there is only a 14% chance of observing an inspector working for a Large Business Computer manufacturer given that the inspector is female – a rather different probability from p = .32, which tells a different story.
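
The asymmetry is easy to see when both conditional probabilities are computed from the same three counts reported for the QCI sample. In this sketch only the divisor changes between the two calls; the variable and function names are illustrative.

```python
n_female_and_lbc = 7   # female inspectors at Large Business Computer manufacturers
n_lbc = 22             # all inspectors at Large Business Computer manufacturers
n_female = 51          # all female inspectors in the sample


def conditional_probability(n_joint, n_condition):
    """Relative frequency of the joint event among cases where the condition holds."""
    return n_joint / n_condition


p_female_given_lbc = conditional_probability(n_female_and_lbc, n_lbc)
p_lbc_given_female = conditional_probability(n_female_and_lbc, n_female)

print(f"p(female | Large Business Computer) = {p_female_given_lbc:.2f}")
print(f"p(Large Business Computer | female) = {p_lbc_given_female:.2f}")
```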

As you will see in Procedures 10.1007/978-981-15-2537-7_6#Sec14 and 10.1007/978-981-15-2537-7_7#Sec17, conditional relationships between categorical variables are precisely what crosstabulation contingency tables are designed to reveal.

Procedure 5.6: Exploratory Data Analysis

There are a variety of visual display methods for EDA, including stem & leaf displays, boxplots and violin plots. Each method reflects a specific way of displaying features of a distribution of scores or measurements and, of course, each has its own advantages and disadvantages. In addition, EDA displays are surprisingly flexible and can combine features in various ways to enhance the story conveyed by the plot.

Stem & Leaf Displays

The stem & leaf display is a simple data summary technique which not only rank orders the data points in a sample but presents them visually so that the shape of the data distribution is reflected. Stem & leaf displays are formed from data scores by splitting each score into two parts: the first part of each score serving as the ‘stem’, the second part as the ‘leaf’ (e.g. for 2-digit data values, the ‘stem’ is the number in the tens position; the ‘leaf’ is the number in the ones position). Each stem is then listed vertically, in ascending order, followed horizontally by all the leaves in ascending order associated with it. The resulting display thus shows all of the scores in the sample, but reorganised so that a rough idea of the shape of the distribution emerges. As well, extreme scores can be easily identified in a stem & leaf display.
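
The construction rule just described can be sketched for 2-digit (or larger) whole-number scores. This is a minimal illustration, not the algorithm used by any particular package: it uses whole stems only (no half-stems), the function name is invented here and the scores are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Return display lines for non-negative whole-number scores.

    The stem is everything except the final digit; the leaf is the final digit.
    """
    leaves = defaultdict(list)
    for score in sorted(scores):          # sorting puts leaves in ascending order
        stem, leaf = divmod(score, 10)
        leaves[stem].append(leaf)
    lines = []
    # List every stem from lowest to highest, keeping empty stems so gaps stay visible
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem:>2} | {leaf_text}")
    return lines


# Hypothetical accuracy scores (not the QCI data)
sample = [83, 75, 91, 57, 88, 79, 82, 94, 86, 73, 81, 85]
for line in stem_and_leaf(sample):
    print(line)
```

Every original score can be read back from the display (stem 7 with leaves 3, 5 and 9 gives 73, 75 and 79), and the isolated low score stands apart from the rest.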

Consider the accuracy and speed scores for the 112 quality control inspectors in the QCI sample. Figure 5.22 (produced by the R Commander Stem-and-leaf display … procedure) shows the stem & leaf displays for inspection accuracy (left display) and speed (right display) data.

Fig. 5.22. Stem & leaf displays produced by R Commander

[The first six lines reflect information from R Commander about each display: lines 1 and 2 show the actual R command used to produce the plot (the variable name has been highlighted in bold); line 3 gives a warning indicating that inspectors with missing values (= NA in R) on the variable have been omitted from the display; line 4 shows how the stems and leaves have been defined; line 5 indicates what a leaf unit represents in value; and line 6 indicates the total number (n) of inspectors included in the display.] In Fig. 5.22, for the accuracy display on the left-hand side, the ‘stems’ have been split into ‘half-stems’: one (which is starred) associated with the ‘leaves’ 0 through 4 and the other associated with the ‘leaves’ 5 through 9. This strategy gives the display better balance and visual appeal.

Notice how the left stem & leaf display conveys a fairly clear (yet sideways) picture of the shape of the distribution of accuracy scores. It has a rather symmetrical bell-shape to it with only a slight suggestion of negative skewness (toward the extreme score at the top). The right stem & leaf display clearly depicts the highly positively skewed nature of the distribution of speed scores. Importantly, we could reconstruct the entire sample of scores for each variable using its display, which means that unlike most other graphical procedures, we didn’t have to sacrifice any information to produce the visual summary.

Some programs, such as SYSTAT, embellish their stem & leaf displays by indicating in which stem or half-stem the ‘median’ (50th percentile), the ‘upper hinge score’ (75th percentile), and ‘lower hinge score’ (25th percentile) occur in the distribution (recall the discussion of interquartile range in Procedure 5.5 ). This is shown in Fig. 5.23 , produced by SYSTAT, where M and H indicate the stem locations for the median and hinge points, respectively. This stem & leaf display labels a single extreme accuracy score as an ‘outside value’ and clearly shows that this actual score was 57.

Fig. 5.23. Stem & leaf display, produced by SYSTAT, of the accuracy QCI variable

Boxplots

Another important EDA technique is the boxplot or, as it is sometimes known, the box-and-whisker plot. This plot provides a symbolic representation that preserves less of the original nature of the data (compared to a stem & leaf display) but typically gives a better picture of the distributional characteristics. The basic boxplot, shown in Fig. 5.24, utilises information about the median (50th percentile score) and the upper (75th percentile score) and lower (25th percentile score) hinge points in the construction of the ‘box’ portion of the graph (the ‘median’ defines the centre line in the box; the ‘upper’ and ‘lower hinge values’ define the end boundaries of the box, so the box encompasses the middle 50% of data values).

Fig. 5.24. Boxplots for the accuracy and speed QCI variables

Additionally, the boxplot utilises the IQR (recall Procedure 5.5) as a way of defining what are called ‘fences’, which are used to indicate score boundaries beyond which we would consider a score in a distribution to be an ‘outlier’ (or an extreme or unusual value). In SPSS, the inner fence is typically placed 1.5 times the IQR beyond the hinges in each direction, and a ‘far’ outlier or extreme case is typically defined as a score lying more than 3 times the IQR beyond the hinges in either direction (Field 2018, p. 193). The ‘whiskers’ in a boxplot extend out to the data values which are closest to the upper and lower inner fences (in most cases, the vast majority of data values will be contained within the fences). Outliers beyond these ‘whiskers’ are then individually listed. ‘Near’ outliers are those lying just beyond the inner fences and ‘far’ outliers lie well beyond the inner fences.
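
The fence rules can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions rather than a reproduction of what SPSS does: it measures the fences from the hinges, takes the hinges to be the quartiles returned by Python's `statistics.quantiles` with the ‘inclusive’ method (packages differ in how they locate hinges), and uses hypothetical speed scores. The function names are invented for this example.

```python
import statistics


def boxplot_summary(scores):
    """Hinges, IQR, fences and whisker ends for a list of scores."""
    # Assumption: quartiles from the 'inclusive' method stand in for the hinges
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)   # inner fences
    outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)   # boundary for 'far' outliers
    inside = [x for x in scores if inner[0] <= x <= inner[1]]
    whiskers = (min(inside), max(inside))      # data values closest to the inner fences
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "inner": inner, "outer": outer, "whiskers": whiskers}


def classify(score, summary):
    """Label a score as within the fences, a 'near' outlier or a 'far' outlier."""
    lo_in, hi_in = summary["inner"]
    lo_out, hi_out = summary["outer"]
    if score < lo_out or score > hi_out:
        return "far outlier"
    if score < lo_in or score > hi_in:
        return "near outlier"
    return "within fences"


# Hypothetical decision times in seconds (not the QCI data)
speeds = [2, 3, 3, 4, 4, 5, 6, 12, 16]
summary = boxplot_summary(speeds)
print(summary)
for s in speeds:
    print(s, classify(s, summary))
```

With these scores the upper whisker stops at the largest value inside the upper inner fence, and the two slowest times are flagged separately as a ‘near’ and a ‘far’ outlier.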

Figure 5.24 shows two simple boxplots (produced using SPSS), one for the accuracy QCI variable and one for the speed QCI variable. The accuracy plot shows a median value of about 83, roughly 50% of the data fall between about 77 and 89 and there is one outlier, inspector 83, in the lower ‘tail’ of the distribution. The accuracy boxplot illustrates data that are relatively symmetrically distributed without substantial skewness. Such data will tend to have their median in the middle of the box, whiskers of roughly equal length extending out from the box and few or no outliers.

The speed plot shows a median value of about 4 s, roughly 50% of the data fall between 2 s and 6 s and there are four outliers, inspectors 7, 62, 65 and 75 (although inspectors 65 and 75 fall at the same place and are rather difficult to read), all falling in the slow speed ‘tail’ of the distribution. Inspectors 65, 75 and 7 are shown as ‘near’ outliers (open circles) whereas inspector 62 is shown as a ‘far’ outlier (asterisk). The speed boxplot illustrates data which are asymmetrically distributed because of skewness in one direction. Such data may have their median offset from the middle of the box and/or whiskers of unequal length extending out from the box and outliers in the direction of the longer whisker. In the speed boxplot, the data are clearly positively skewed (the longer whisker and extreme values are in the slow speed ‘tail’).

Boxplots are very versatile representations in that side-by-side displays for sub-groups of data within a sample can permit easy visual comparisons of groups with respect to central tendency and variability. Boxplots can also be modified to incorporate information about error bands associated with the median producing what is called a ‘notched boxplot’. This helps in the visual detection of meaningful subgroup differences, where boxplot ‘notches’ don’t overlap.

Figure 5.25 (produced using NCSS), compares the distributions of accuracy and speed scores for QCI inspectors from the five types of companies, plotted side-by-side.

Fig. 5.25. Comparisons of the accuracy (regular boxplots) and speed (notched boxplots) QCI variables for different types of companies

Focus first on the left graph in Fig. 5.25 which plots the distribution of accuracy scores broken down by company using regular boxplots. This plot clearly shows the differing degree of skewness in each type of company (indicated by one or more outliers in one ‘tail’, whiskers which are not the same length and/or the median line being offset from the centre of a box), the differing variability of scores within each type of company (indicated by the overall length of each plot—box and whiskers), and the differing central tendency in each type of company (the median lines do not all fall at the same level of accuracy score). From the left graph in Fig. 5.25 , we could conclude that: inspection accuracy scores are most variable in PC and Large Electrical Appliance manufacturing companies and least variable in the Large Business Computer manufacturing companies; Large Business Computer and PC manufacturing companies have the highest median level of inspection accuracy; and inspection accuracy scores tend to be negatively skewed (many inspectors toward higher levels, relatively fewer who are poorer in inspection performance) in the Automotive manufacturing companies. One inspector, working for an Automotive manufacturing company, shows extremely poor inspection accuracy performance.

The right display compares types of companies in terms of their inspection speed scores, using ‘notched’ boxplots. The notches define upper and lower error limits around each median. Aside from the very obvious positive skewness for speed scores (with a number of slow speed outliers) in every type of company (least so for Large Electrical Appliance manufacturing companies), the story conveyed by this comparison is that inspectors from Large Electrical Appliance and Automotive manufacturing companies have substantially faster median decision speeds compared to inspectors from Large Business Computer and PC manufacturing companies (i.e. their ‘notches’ do not overlap, in terms of speed scores, on the display).

Boxplots can also add interpretive value to other graphical display methods through the creation of hybrid displays. Such displays might combine a standard histogram with a boxplot along the X-axis to provide an enhanced picture of the data distribution as illustrated for the mentabil variable in Fig. 5.26 (produced using NCSS). This hybrid plot also employs a data ‘smoothing’ method called a density trace to outline an approximate overall shape for the data distribution. Any one graphical method would tell some of the story, but combined in the hybrid display, the story of a relatively symmetrical set of mentabil scores becomes quite visually compelling.

Fig. 5.26. A hybrid histogram-density-boxplot of the mentabil QCI variable

Violin Plots

Violin plots are a more recent and interesting EDA innovation, implemented in the NCSS software package (Hintze 2012 ). The violin plot gets its name from the rough shape that the plots tend to take on. Violin plots are another type of hybrid plot, this time combining density traces (mirror-imaged right and left so that the plots have a sense of symmetry and visual balance) with boxplot-type information (median, IQR and upper and lower inner ‘fences’, but not outliers). The goal of the violin plot is to provide a quick visual impression of the shape, central tendency and variability of a distribution (the length of the violin conveys a sense of the overall variability whereas the width of the violin conveys a sense of the frequency of scores occurring in a specific region).
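
To give a feel for how a mirrored density trace produces the violin shape, the sketch below draws a crude text version. It rests on several assumptions that are not taken from NCSS: the density trace is approximated with a Gaussian kernel density estimate, the bandwidth of 1.0 is arbitrary, the scores are hypothetical, and the boxplot-type information is omitted.

```python
import math


def density_trace(scores, bandwidth):
    """Return a function giving a Gaussian kernel density estimate at x (an assumed method)."""
    norm = len(scores) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in scores) / norm

    return density


# Hypothetical, positively skewed speed scores (not the QCI data)
speeds = [2, 3, 3, 4, 4, 4, 5, 5, 6, 11]
density = density_trace(speeds, bandwidth=1.0)

# One text row per score level; centring each bar mirrors the density
# on both sides of the midline, which gives the violin its outline.
for level in range(12, 0, -1):
    half_width = round(density(level) * 50)
    print(f"{level:>2} " + ("#" * (2 * half_width)).center(40))
```

The wide part of the printed shape marks where scores are most frequent, while its overall length reflects the spread of the scores.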

Figure 5.27 (produced using NCSS) compares the distributions of speed scores for QCI inspectors across the five types of companies, plotted side-by-side. The violin plot conveys a similar story to the boxplot comparison for speed in the right graph of Fig. 5.25. However, notice that with the violin plot, unlike with a boxplot, you also get a sense of distributions that have ‘clumps’ of scores in specific areas. Some violin plots, like that for Automotive manufacturing companies in Fig. 5.27, have a shape suggesting a multi-modal distribution (recall Procedure 5.4 and the discussion of the fact that a distribution may have multiple modes). The violin plot in Fig. 5.27 has also been produced to show where the median (solid line) and mean (dashed line) would fall within each violin. This facilitates two interpretations: (1) a relative comparison of central tendency across the five types of companies and (2) the relative degree of skewness in the distribution for each type of company (indicated by the separation of the two lines within a violin; skewness is particularly pronounced for the Large Business Computer manufacturing companies).

Fig. 5.27. Violin plot comparisons of the speed QCI variable for different types of companies

EDA methods (of which we have illustrated only a small subset; we have not reviewed dot density diagrams, for example) provide summary techniques for visually displaying certain characteristics of a set of data. The advantage of the EDA methods over more traditional graphing techniques such as those described in Procedure 5.2 is that as much of the original integrity of the data is maintained as possible while maximising the amount of summary information available about distributional characteristics.

Stem & leaf displays maintain the data in as close to their original form as possible whereas boxplots and violin plots provide more symbolic and flexible representations. EDA methods are best thought of as communication devices designed to facilitate quick visual impressions and they can add interest to any statistical story being conveyed about a sample of data. NCSS, SYSTAT, STATGRAPHICS and R Commander generally offer more options and flexibility in the generation of EDA displays than SPSS.

EDA methods tend to get cumbersome if a great many variables or groups need to be summarised. In such cases, using numerical summary statistics (such as means and standard deviations) will provide a more economical and efficient summary. Boxplots or violin plots are generally more space efficient summary techniques than stem & leaf displays.

Often, EDA techniques are used as data screening devices, which are typically not reported in actual write-ups of research (we will discuss data screening in more detail in Procedure 10.1007/978-981-15-2537-7_8#Sec11). This is a perfectly legitimate use for the methods although there is an argument for researchers to put these techniques to greater use in published literature.

Software packages may use different rules for constructing EDA plots which means that you might get rather different looking plots and different information from different programs (you saw some evidence of this in Figs. 5.22 and 5.23 ). It is important to understand what the programs are using as decision rules for locating fences and outliers so that you are clear on how best to interpret the resulting plot—such information is generally contained in the user’s guides or manuals for NCSS (Hintze 2012 ), SYSTAT (SYSTAT Inc. 2009a , b ), STATGRAPHICS (StatPoint Technologies Inc. 2010 ) and SPSS (Norušis 2012 ).

Virtually any research design which produces numerical measures (even to the extent of just counting the number of occurrences of several events) provides opportunities for employing EDA displays which may help to clarify data characteristics or relationships. One extremely important use of EDA methods is as data screening devices for detecting outliers and other data anomalies, such as non-normality and skewness, before proceeding to parametric statistical analyses. In some cases, EDA methods can help the researcher to decide whether parametric or nonparametric statistical tests would be best to apply to his or her data because critical data characteristics such as distributional shape and spread are directly reflected.

Application | Procedures
SPSS | produces stem-and-leaf displays and boxplots by default; variables may be explored on a whole-of-sample basis or broken down by the categories of a specific variable (called a ‘factor’ in the procedure). Cases can also be labelled with a variable (like in the QCI database), so that outlier points in the boxplot are identifiable.
SPSS | can also be used to custom build different types of boxplots.
NCSS | produces a stem-and-leaf display by default.
NCSS | can be used to produce box plots with different features (such as ‘notches’ and connecting lines).
NCSS | can be configured to produce violin plots (by selecting the plot shape as ‘density with reflection’).
SYSTAT | can be used to produce stem-and-leaf displays for variables; however, you cannot really control any features of these displays.
SYSTAT | can be used to produce boxplots of many types, with a number of features being controllable.
STATGRAPHICS | allows you to do a complete exploration of a single variable, including stem-and-leaf display (you need to select this option) and boxplot (produced by default). Some features of the boxplot can be controlled, but not features of the stem-and-leaf diagram.
STATGRAPHICS | and select either or which can produce not only descriptive statistics but also boxplots with some controllable features.
R Commander | or the dialog box for each procedure offers some features of the display or plot that can be controlled; whole-of-sample boxplots or boxplots by groups are possible.

Procedure 5.7: Standard ( z ) Scores

In certain practical situations in behavioural research, it may be desirable to know where a specific individual’s score lies relative to all other scores in a distribution. A convenient measure is to observe how many standard deviations (see Procedure 5.5 ) above or below the sample mean a specific score lies. This measure is called a standard score or z -score . Very simply, any raw score can be converted to a z -score by subtracting the sample mean from the raw score and dividing that result by the sample’s standard deviation. z -scores can be positive or negative and their sign simply indicates whether the score lies above (+) or below (−) the mean in value. A z -score has a very simple interpretation: it measures the number of standard deviations above or below the sample mean a specific raw score lies.

In the QCI database, we have a sample mean for speed scores of 4.48 s and a standard deviation for speed scores of 2.89 s (recall Table 5.4 in Procedure 5.5). If we are interested in the z-score for Inspector 65’s raw speed score of 11.94 s, we would obtain a z-score of +2.58 using the method described above (subtract 4.48 from 11.94 and divide the result by 2.89). The interpretation of this number is that a raw decision speed score of 11.94 s lies about 2.6 standard deviations above the mean decision speed for the sample.
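
The conversion described above is a single subtraction followed by a division. A minimal sketch, using the mean, standard deviation and raw score quoted in the text (the function name is chosen only for this illustration):

```python
def z_score(raw_score, sample_mean, sample_sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw_score - sample_mean) / sample_sd


z = z_score(11.94, sample_mean=4.48, sample_sd=2.89)
print(f"z = {z:+.2f}")

# A score below the mean gives a negative z-score
print(f"z = {z_score(1.52, 4.48, 2.89):+.2f}")
```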

z -scores have some interesting properties. First, if one converts (statisticians would say ‘transforms’) every available raw score in a sample to z -scores, the mean of these z -scores will always be zero and the standard deviation of these z -scores will always be 1.0. These two facts about z -scores (mean = 0; standard deviation = 1) will be true no matter what sample you are dealing with and no matter what the original units of measurement are (e.g. seconds, percentages, number of widgets assembled, amount of preference for a product, attitude rating, amount of money spent). This is because transforming raw scores to z -scores automatically changes the measurement units from whatever they originally were to a new system of measurements expressed in standard deviation units.
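
These two properties can be verified on any set of scores. The sketch below uses a handful of hypothetical raw scores and assumes the sample standard deviation (n − 1 in the denominator) is used both for the transformation and for the check; the function name is illustrative.

```python
import statistics


def standardise(scores):
    """Transform every raw score in a sample to a z-score."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)   # sample standard deviation (n - 1 denominator)
    return [(x - mean) / sd for x in scores]


# Hypothetical raw scores in seconds (not the QCI data)
raw = [2.0, 3.5, 4.0, 4.5, 6.0, 11.0]
z_scores = standardise(raw)

print("z-scores:", [round(z, 2) for z in z_scores])
print("mean of z-scores:", round(statistics.mean(z_scores), 10))
print("SD of z-scores:  ", round(statistics.stdev(z_scores), 10))
```

Changing the units of the raw scores (for example, from seconds to milliseconds) leaves the z-scores unchanged, because both the deviations from the mean and the standard deviation are rescaled by the same factor.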

Suppose Maree was interested in the performance statistics for the top 25% most accurate quality control inspectors in the sample. Given a sample size of 112, this would mean finding the top 28 inspectors in terms of their accuracy scores. Since Maree is interested in performance statistics, speed scores would also be of interest. Table 5.5 (generated using the SPSS Descriptives … procedure, listed using the Case Summaries … procedure and formatted for presentation using Excel) shows accuracy and speed scores for the top 28 inspectors in descending order of accuracy scores. The z -score transformation for each of these scores is also shown (last two columns) as are the type of company, education level and gender for each inspector.

Table 5.5. Listing of the 28 (top 25%) most accurate QCI inspectors’ accuracy and speed scores as well as standard (z) score transformations for each score

| Case number | Inspector | company | educlev | gender | accuracy | speed | Zaccuracy | Zspeed |
|---|---|---|---|---|---|---|---|---|
| 1 | 8 | PC Manufacturer | High School Only | Male | 100 | 1.52 | 1.95 | −1.03 |
| 2 | 9 | PC Manufacturer | High School Only | Female | 100 | 3.32 | 1.95 | −0.40 |
| 3 | 14 | PC Manufacturer | High School Only | Male | 100 | 3.83 | 1.95 | −0.23 |
| 4 | 17 | PC Manufacturer | High School Only | Female | 99 | 7.07 | 1.84 | 0.90 |
| 5 | 101 | PC Manufacturer | High School Only | | 98 | 3.11 | 1.73 | −0.47 |
| 6 | 19 | PC Manufacturer | Tertiary Qualified | Female | 94 | 3.84 | 1.29 | −0.22 |
| 7 | 34 | Large Electrical Appliance Manufacturer | Tertiary Qualified | Male | 94 | 1.90 | 1.29 | −0.89 |
| 8 | 65 | Large Business Computer Manufacturer | High School Only | Male | 94 | 11.94 | 1.29 | 2.58 |
| 9 | 67 | Large Business Computer Manufacturer | High School Only | Male | 94 | 2.34 | 1.29 | −0.74 |
| 10 | 80 | Large Business Computer Manufacturer | High School Only | Female | 94 | 4.68 | 1.29 | 0.07 |
| 11 | 5 | PC Manufacturer | Tertiary Qualified | Male | 93 | 4.18 | 1.18 | −0.10 |
| 12 | 18 | PC Manufacturer | Tertiary Qualified | Male | 93 | 7.32 | 1.18 | 0.98 |
| 13 | 46 | Small Electrical Appliance Manufacturer | Tertiary Qualified | Female | 93 | 2.01 | 1.18 | −0.86 |
| 14 | 64 | Large Business Computer Manufacturer | High School Only | Female | 92 | 5.18 | 1.08 | 0.24 |
| 15 | 77 | Large Business Computer Manufacturer | Tertiary Qualified | Female | 92 | 6.11 | 1.08 | 0.56 |
| 16 | 79 | Large Business Computer Manufacturer | High School Only | Male | 92 | 4.38 | 1.08 | −0.03 |
| 17 | 106 | Large Electrical Appliance Manufacturer | Tertiary Qualified | Male | 92 | 1.70 | 1.08 | −0.96 |
| 18 | 58 | Small Electrical Appliance Manufacturer | High School Only | Male | 91 | 4.12 | 0.97 | −0.12 |
| 19 | 63 | Large Business Computer Manufacturer | High School Only | Male | 91 | 4.73 | 0.97 | 0.09 |
| 20 | 72 | Large Business Computer Manufacturer | Tertiary Qualified | Male | 91 | 4.72 | 0.97 | 0.08 |
| 21 | 20 | PC Manufacturer | High School Only | Male | 90 | 4.53 | 0.86 | 0.02 |
| 22 | 69 | Large Business Computer Manufacturer | High School Only | Male | 90 | 4.94 | 0.86 | 0.16 |
| 23 | 71 | Large Business Computer Manufacturer | High School Only | Female | 90 | 10.46 | 0.86 | 2.07 |
| 24 | 85 | Automobile Manufacturer | Tertiary Qualified | Female | 90 | 3.14 | 0.86 | −0.46 |
| 25 | 111 | Large Business Computer Manufacturer | High School Only | Male | 90 | 4.11 | 0.86 | −0.13 |
| 26 | 6 | PC Manufacturer | High School Only | Male | 89 | 5.46 | 0.75 | 0.34 |
| 27 | 61 | Large Business Computer Manufacturer | Tertiary Qualified | Male | 89 | 5.71 | 0.75 | 0.43 |
| 28 | 75 | Large Business Computer Manufacturer | High School Only | Male | 89 | 12.05 | 0.75 | 2.62 |

There are three inspectors (8, 9 and 14) who scored maximum accuracy of 100%. Such accuracy converts to a z -score of +1.95. Thus 100% accuracy is 1.95 standard deviations above the sample’s mean accuracy level. Interestingly, all three inspectors worked for PC manufacturers and all three had only high school-level education. The least accurate inspector in the top 25% had a z -score for accuracy that was .75 standard deviations above the sample mean.

Interestingly, the top three inspectors in terms of accuracy had decision speeds that fell below the sample’s mean speed; inspector 8 was the fastest inspector of the three with a speed just over 1 standard deviation ( z  = −1.03) below the sample mean. The slowest inspector in the top 25% was inspector 75 (case #28 in the list) with a speed z -score of +2.62; i.e., he was over two and a half standard deviations slower in making inspection decisions relative to the sample’s mean speed.

The fact that z -scores always have a common measurement scale having a mean of 0 and a standard deviation of 1.0 leads to an interesting application of standard scores. Suppose we focus on inspector number 65 (case #8 in the list) in Table 5.5 . It might be of interest to compare this inspector’s quality control performance in terms of both his decision accuracy and decision speed. Such a comparison is impossible using raw scores since the inspector’s accuracy score and speed scores are different measures which have differing means and standard deviations expressed in fundamentally different units of measurement (percentages and seconds). However, if we are willing to assume that the score distributions for both variables are approximately the same shape and that both accuracy and speed are measured with about the same level of reliability or consistency (see Procedure 10.1007/978-981-15-2537-7_8#Sec1), we can compare the inspector’s two scores by first converting them to z -scores within their own respective distributions as shown in Table 5.5 .

Inspector 65 looks rather anomalous in that he demonstrated a relatively high level of accuracy (raw score = 94%; z  = +1.29) but took a very long time to make those accurate decisions (raw score = 11.94 s; z  = +2.58). Contrast this with inspector 106 (case #17 in the list) who demonstrated a similar level of accuracy (raw score = 92%; z  = +1.08) but took a much shorter time to make those accurate decisions (raw score = 1.70 s; z  = −.96). In terms of evaluating performance, from a company perspective, we might conclude that inspector 106 is performing at an overall higher level than inspector 65 because he can achieve a very high level of accuracy but much more quickly; accurate and fast is more cost effective and efficient than accurate and slow.

Note: We should be cautious here since we know from our previous explorations of the speed variable in Procedure 5.6 , that accuracy scores look fairly symmetrical and speed scores are positively skewed, so assuming that the two variables have the same distribution shape, so that z -score comparisons are permitted, would be problematic.

You might have noticed that as you scanned down the two columns of z -scores in Table 5.5 , there was a suggestion of a pattern between the signs attached to the respective z -scores for each person. There seems to be a very slight preponderance of pairs of z -scores where the signs are reversed (12 out of 22 pairs). This observation provides some very preliminary evidence to suggest that there may be a relationship between inspection accuracy and decision speed, namely that a more accurate decision tends to be associated with a faster decision speed. Of course, this pattern would be better verified using the entire sample rather than the top 25% of inspectors. However, you may find it interesting to learn that it is precisely this sort of suggestive evidence (about agreement or disagreement between z -score signs for pairs of variable scores throughout a sample) that is captured and summarised by a single statistical indicator called a ‘correlation coefficient’ (see Fundamental Concept 10.1007/978-981-15-2537-7_6#Sec1 and Procedure 10.1007/978-981-15-2537-7_6#Sec4).

z -scores are not the only type of standard score that is commonly used. Three other types of standard scores are: stanines (standard nines), IQ scores and T-scores (not to be confused with the t -test described in Procedure 10.1007/978-981-15-2537-7_7#Sec22). These other types of scores have the advantage of producing only positive integer scores rather than positive and negative decimal scores. This makes interpretation somewhat easier for certain applications. However, you should know that almost all other types of standard scores come from a specific transformation of z -scores. This is because once you have converted raw scores into z -scores, they can then be quite readily transformed into any other system of measurement by simply multiplying a person’s z -score by the new desired standard deviation for the measure and adding to that product the new desired mean for the measure.

T-scores are simply z-scores transformed to have a mean of 50.0 and a standard deviation of 10.0; IQ scores are simply z-scores transformed to have a mean of 100 and a standard deviation of 15 (or 16 in some systems). For more information, see Fundamental Concept II .
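The rescaling rule (multiply the z-score by the desired standard deviation, then add the desired mean) is sketched below. The helper name `rescale` is hypothetical; the example z-score of +1.29 is Inspector 65’s accuracy z-score from Table 5.5, and the IQ version assumes the 15-point standard deviation.

```python
def rescale(z, new_mean, new_sd):
    """Convert a z-score to a standard score with the desired mean and SD."""
    return new_mean + z * new_sd


z_accuracy = 1.29  # Inspector 65's accuracy z-score (Table 5.5)

t_score = rescale(z_accuracy, new_mean=50, new_sd=10)     # T-score scale
iq_style = rescale(z_accuracy, new_mean=100, new_sd=15)   # IQ-style scale

print(f"T-score = {t_score:.1f}, IQ-style score = {iq_style:.2f}")
```

For this inspector the results are about 62.9 on the T-score scale and about 119.35 on the IQ-style scale; both describe exactly the same relative position as z = +1.29.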

Standard scores are useful for representing the position of each raw score within a sample distribution relative to the mean of that distribution. The unit of measurement becomes the number of standard deviations a specific score is away from the sample mean. As such, z -scores can permit cautious comparisons across samples or across different variables having vastly differing means and standard deviations within the constraints of the comparison samples having similarly shaped distributions and roughly equivalent levels of measurement reliability. z -scores also form the basis for establishing the degree of correlation between two variables. Transforming raw scores into z -scores does not change the shape of a distribution or rank ordering of individuals within that distribution. For this reason, a z -score is referred to as a linear transformation of a raw score. Interestingly, z -scores provide an important foundational element for more complex analytical procedures such as factor analysis ( Procedure 10.1007/978-981-15-2537-7_6#Sec36), cluster analysis ( Procedure 10.1007/978-981-15-2537-7_6#Sec41) and multiple regression analysis (see, for example, Procedure 10.1007/978-981-15-2537-7_6#Sec27 and 10.1007/978-981-15-2537-7_7#Sec86).

While standard scores are useful indices, they are subject to restrictions if used to compare scores across samples or across different variables. The samples must have similar distribution shapes for the comparisons to be meaningful and the measures must have similar levels of reliability in each sample. The groups used to generate the z -scores should also be similar in composition (with respect to age, gender distribution, and so on). Because z -scores are not an intuitively meaningful way of presenting scores to lay-persons, many other types of standard score schemes have been devised to improve interpretability. However, most of these schemes produce scores that run a greater risk of facilitating lay-person misinterpretations simply because their connection with z -scores is hidden or because the resulting numbers ‘look’ like a more familiar type of score which people do intuitively understand.

It is extremely rare for a T-score to exceed 100 or go below 0 because this would mean that the raw score was in excess of 5 standard deviations away from the sample mean. This unfortunately means that T-scores are often misinterpreted as percentages because they typically range between 0 and 100 and therefore ‘look’ like percentages. However, T-scores are definitely not percentages.

Finally, a common misunderstanding of z -scores is that transforming raw scores into z -scores makes them follow a normal distribution (see Fundamental Concept II ). This is not the case. The distribution of z -scores will have exactly the same shape as that for the raw scores; if the raw scores are positively skewed, then the corresponding z -scores will also be positively skewed.

z -scores are particularly useful in evaluative studies where relative performance indices are of interest. Whenever you compute a correlation coefficient ( Procedure 10.1007/978-981-15-2537-7_6#Sec4), you are implicitly transforming the two variables involved into z -scores (which equates the variables in terms of mean and standard deviation), so that only the patterning in the relationship between the variables is represented. z -scores are also useful as a preliminary step to more advanced parametric statistical methods when variables differing in scale, range and/or measurement units must be equated for means and standard deviations prior to analysis.
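To make the link between z-scores and correlation concrete, the sketch below computes a Pearson correlation as the sum of the products of paired z-scores divided by n − 1. The function name and the small data set are hypothetical, and the sample standard deviation is assumed throughout.

```python
import statistics


def pearson_r_from_z(x, y):
    """Pearson correlation computed from paired z-scores (sample SD, n - 1)."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    z_x = [(v - mean_x) / sd_x for v in x]
    z_y = [(v - mean_y) / sd_y for v in y]
    # Pairs whose z-scores share a sign add to r; pairs with opposite signs subtract.
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


# Hypothetical accuracy (%) and speed (s) scores for six inspectors.
accuracy = [100, 94, 92, 90, 85, 78]
speed = [1.5, 3.8, 1.7, 4.5, 6.1, 7.3]

r = pearson_r_from_z(accuracy, speed)
print(f"r = {r:.3f}")
```

Because both variables are standardised first, their original units (percentages and seconds) play no part in the result.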

| Application | Procedures |
|---|---|
| SPSS | … and tick the box labelled ‘Save standardized values as variables’. z-scores are saved as new variables (labelled as Z followed by the original variable name, as shown in Table 5.5) which can then be listed or analysed further. |
| NCSS | … and select a new variable to hold the z-scores, then select the ‘STANDARDIZE’ transformation from the list of available functions. z-scores are saved as new variables which can then be listed or analysed further. |
| SYSTAT | … where z-scores are saved as new variables which can then be listed or analysed further. |
| STATGRAPHICS | Open the … window, and select an empty column in the database, then … and choose the ‘STANDARDIZE’ transformation, choose the variable you want to transform and give the new variable a name. |
| R Commander | … and select the variables you want to standardize; R Commander automatically saves the transformed variable to the data base, appending Z. to the front of each variable’s name. |

Fundamental Concept II: The Normal Distribution

Arguably the most fundamental distribution used in the statistical analysis of quantitative data in the behavioural and social sciences is the normal distribution (also known as the Gaussian or bell-shaped distribution ). Many behavioural phenomena, if measured on a large enough sample of people, tend to produce ‘normally distributed’ variable scores. This includes most measures of ability, performance and productivity, personality characteristics and attitudes. The normal distribution is important because it is the one form of distribution that you must assume describes the scores of a variable in the population when parametric tests of statistical inference are undertaken. The standard normal distribution is defined as having a population mean of 0.0 and a population standard deviation of 1.0. The normal distribution is also important as a means of interpreting various types of scoring systems.

Figure 5.28 displays the standard normal distribution (mean = 0; standard deviation = 1.0) and shows that there is a clear link between z -scores and the normal distribution. Statisticians have analytically calculated the probability (also expressed as percentages or percentiles) that observations will fall above or below any specific z -score in the theoretical standard normal distribution. Thus, a z -score of +1.0 in the standard normal distribution will have 84.13% (equals a probability of .8413) of observations in the population falling at or below one standard deviation above the mean and 15.87% falling above that point. A z -score of −2.0 will have 2.28% of observations falling at that point or below and 97.72% of observations falling above that point. It is clear then that, in a standard normal distribution, z -scores have a direct relationship with percentiles .
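The percentages quoted above can be reproduced from the cumulative distribution function of the standard normal distribution. The sketch below uses `NormalDist` from Python’s standard `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

proportion_below = {}
for z in (1.0, -2.0):
    below = std_normal.cdf(z)  # proportion of observations at or below z
    proportion_below[z] = below
    print(f"z = {z:+.1f}: {below:.2%} at or below, {1 - below:.2%} above")
```

This prints 84.13% and 15.87% for z = +1.0, and 2.28% and 97.72% for z = −2.0, matching the figures in the text.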

Figure 5.28. The normal (bell-shaped or Gaussian) distribution

Figure 5.28 also shows how T-scores relate to the standard normal distribution and to z -scores. The mean T-score falls at 50 and each increment or decrement of 10 T-score units means a movement of another standard deviation away from this mean of 50. Thus, a T-score of 80 corresponds to a z -score of +3.0—a score 3 standard deviations higher than the mean of 50.

Of special interest to behavioural researchers are the values for z -scores in a standard normal distribution that encompass 90% of observations ( z  = ±1.645—isolating 5% of the distribution in each tail), 95% of observations ( z  = ±1.96—isolating 2.5% of the distribution in each tail), and 99% of observations ( z  = ±2.58—isolating 0.5% of the distribution in each tail).
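These cut-off values come from the inverse of the standard normal cumulative distribution function. A minimal sketch, again using the standard library’s `NormalDist` (the dictionary name `critical` is just for illustration):

```python
from statistics import NormalDist

std_normal = NormalDist()  # defaults to mean 0, standard deviation 1

critical = {}
for coverage in (0.90, 0.95, 0.99):
    tail = (1 - coverage) / 2             # proportion isolated in each tail
    z_crit = std_normal.inv_cdf(1 - tail)  # upper cut-off; lower cut-off is its negative
    critical[coverage] = z_crit
    print(f"{coverage:.0%} of observations lie within z = ±{z_crit:.3f}")
```

The 99% value comes out at roughly 2.576, which the text rounds to 2.58.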

Depending upon the degree of certainty required by the researcher, these bands describe regions outside of which one might define an observation as being atypical or as perhaps not belonging to a distribution being centred at a mean of 0.0. Most often, what is taken as atypical or rare in the standard normal distribution is a score at least two standard deviations away from the mean, in either direction. Why choose two standard deviations? Since in the standard normal distribution, only about 5% of observations will fall outside a band defined by z -scores of ±1.96 (rounded to 2 for simplicity), this equates to data values that are 2 standard deviations away from their mean. This can give us a defensible way to identify outliers or extreme values in a distribution.
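As an illustration of this two-standard-deviation rule, the sketch below flags inspectors whose speed z-scores lie at least 2 standard deviations from the mean. The z-scores are a handful of values copied from Table 5.5; the cut-off of 2.0 and the variable names are assumptions made for the example.

```python
# Speed z-scores for a few inspectors, copied from Table 5.5
# (keys are inspector numbers).
z_speed = {8: -1.03, 17: 0.90, 65: 2.58, 71: 2.07, 75: 2.62, 106: -0.96}

CUTOFF = 2.0  # "two standard deviations" rule of thumb (1.96 rounded)

# Flag scores at least CUTOFF standard deviations from the mean, in either direction.
atypical = sorted(insp for insp, z in z_speed.items() if abs(z) >= CUTOFF)
print(atypical)  # [65, 71, 75]
```

Note that this rule leans on the normal distribution as the reference; for a strongly skewed variable such as speed, the flagged cases should be read with that caveat in mind.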

Thinking ahead to what you will encounter in Chap. 10.1007/978-981-15-2537-7_7, this ‘banding’ logic can be extended into the world of statistics (like means and percentages) as opposed to just the world of observations. You will frequently hear researchers speak of some statistic estimating a specific value (a parameter ) in a population, plus or minus some other value.

A survey organisation might report political polling results in terms of a percentage and an error band, e.g. 59% of Australians indicated that they would vote Labor at the next federal election, plus or minus 2%.

Most commonly, this error band (±2%) is defined by possible values for the population parameter that are about two standard deviations (or two standard errors—a concept discussed further in Fundamental Concept 10.1007/978-981-15-2537-7_7#Sec14) away from the reported or estimated statistical value. In effect, the researcher is saying that on 95% of the occasions he/she would theoretically conduct his/her study, the population value estimated by the statistic being reported would fall between the limits imposed by the endpoints of the error band (the official name for this error band is a confidence interval ; see Procedure 10.1007/978-981-15-2537-7_8#Sec18). The well-understood mathematical properties of the standard normal distribution are what make such precise statements about levels of error in statistical estimates possible.
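To give a feel for where an error band of about ±2% might come from, the sketch below applies the usual large-sample formula for the standard error of a proportion. The text does not state the poll’s sample size or the formula used, so both the sample size (`n = 2400`) and the choice of formula are assumptions made purely for illustration.

```python
import math

p_hat = 0.59    # reported percentage from the polling example
n = 2400        # hypothetical sample size (not given in the text)
z_crit = 1.96   # about two standard errors, for a 95% band

# Large-sample standard error of a proportion (an assumed, commonly used formula).
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z_crit * standard_error
low, high = p_hat - margin, p_hat + margin

print(f"{p_hat:.0%} ± {margin:.1%} (interval {low:.1%} to {high:.1%})")
```

Under these assumptions the margin works out to roughly 2 percentage points; a smaller sample would widen the band and a larger one would narrow it.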

Checking for Normality

It is important to understand that transforming the raw scores for a variable to z-scores (recall Procedure 5.7) does not produce z-scores which follow a normal distribution; rather they will have the same distributional shape as the original scores. However, if you are willing to assume that the normal distribution is the correct reference distribution in the population, then you are justified in interpreting z-scores in light of the known characteristics of the normal distribution.

In order to justify this assumption, not only to enhance the interpretability of z -scores but more generally to enhance the integrity of parametric statistical analyses, it is helpful to actually look at the sample frequency distributions for variables (using a histogram (illustrated in Procedure 5.2 ) or a boxplot (illustrated in Procedure 5.6 ), for example), since non-normality can often be visually detected. It is important to note that in the social and behavioural sciences as well as in economics and finance, certain variables tend to be non-normal by their very nature. This includes variables that measure time taken to complete a task, achieve a goal or make decisions and variables that measure, for example, income, occurrence of rare or extreme events or organisational size. Such variables tend to be positively skewed in the population, a pattern that can often be confirmed by graphing the distribution.

If you cannot justify an assumption of ‘normality’, you may be able to force the data to be normally distributed by using what is called a ‘normalising transformation’. Such transformations will usually involve a nonlinear mathematical conversion (such as computing the logarithm, square root or reciprocal) of the raw scores. Such transformations will force the data to take on a more normal appearance so that the assumption of ‘normality’ can be reasonably justified, but at the cost of creating a new variable whose units of measurement and interpretation are more complicated. [For some non-normal variables, such as the occurrence of rare, extreme or catastrophic events (e.g. a 100-year flood or forest fire, coronavirus pandemic, the Global Financial Crisis or other type of financial crisis, man-made or natural disaster), the distributions cannot be ‘normalised’. In such cases, the researcher needs to model the distribution as it stands. For such events, extreme value theory (e.g. see Diebold et al. 2000 ) has proven very useful in recent years. This theory uses a variation of the Pareto or Weibull distribution as a reference, rather than the normal distribution, when making predictions.]

Figure 5.29 displays before and after pictures of the effects of a logarithmic transformation on the positively skewed speed variable from the QCI database. Each graph, produced using NCSS, is of the hybrid histogram-density trace-boxplot type first illustrated in Procedure 5.6 . The left graph clearly shows the strong positive skew in the speed scores and the right graph shows the result of taking the log 10 of each raw score.

Figure 5.29. Combined histogram-density trace-boxplot graphs displaying the before and after effects of a ‘normalising’ log10 transformation of the speed variable

Notice how the long tail toward slow speed scores is pulled in toward the mean and the very short tail toward fast speed scores is extended away from the mean. The result is a more ‘normal’ appearing distribution. The assumption would then be that we could assume normality of speed scores, but only in a log 10 format (i.e. it is the log of speed scores that we assume is normally distributed in the population). In general, taking the logarithm of raw scores provides a satisfactory remedy for positively skewed distributions (but not for negatively skewed ones). Furthermore, anything we do with the transformed speed scores now has to be interpreted in units of log 10 (seconds) which is a more complex interpretation to make.
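The contrast between a linear z-score transformation (which leaves the shape untouched) and a nonlinear log10 transformation (which changes it) can be checked numerically. The sketch below uses made-up, positively skewed speed scores and a simple moment-based skewness measure; SPSS applies a small-sample adjustment, so its values would differ slightly. All names and data are hypothetical.

```python
import math
import statistics


def skewness(values):
    """Simple moment-based skewness: the mean of the cubed standardised deviations."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


# Hypothetical, positively skewed decision speeds in seconds.
speed = [1.5, 1.9, 2.3, 3.1, 3.8, 4.7, 7.1, 11.9]

m, s = statistics.mean(speed), statistics.stdev(speed)
z_speed = [(v - m) / s for v in speed]      # linear transformation
log_speed = [math.log10(v) for v in speed]  # nonlinear transformation

skew_raw = skewness(speed)
skew_z = skewness(z_speed)
skew_log = skewness(log_speed)
print(f"raw: {skew_raw:.3f}  z-scores: {skew_z:.3f}  log10: {skew_log:.3f}")
```

The raw scores and their z-scores give the same skewness, while the log10 scores give a value much closer to zero.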

Another visual method for detecting non-normality is to graph what is called a normal Q-Q plot (the Q-Q stands for Quantile-Quantile). This plots the percentiles for the observed data against the percentiles for the standard normal distribution (see Cleveland 1995 for more detailed discussion; also see Lane 2007, http://onlinestatbook.com/2/advanced_graphs/q-q_plots.html). If the pattern for the observed data follows a normal distribution, then all the points on the graph will fall approximately along a diagonal line.
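The coordinates behind such a plot can be computed directly, as in the sketch below. It pairs each sorted observation with the standard normal quantile for its plotting position; the position formula (i − 0.5) / n is one common convention and is an assumption here, since statistical packages differ in the exact formula they use. The function name and data are hypothetical.

```python
from statistics import NormalDist


def normal_qq_points(values):
    """Return (theoretical normal quantile, observed value) pairs for a Q-Q plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, observed in enumerate(ordered, start=1):
        position = (i - 0.5) / n  # assumed plotting-position convention
        points.append((std_normal.inv_cdf(position), observed))
    return points


# Hypothetical, positively skewed decision speeds in seconds.
speed = [1.5, 1.9, 2.3, 3.1, 3.8, 4.7, 7.1, 11.9]

qq = normal_qq_points(speed)
for theoretical, observed in qq:
    print(f"{theoretical:+.3f}  {observed:5.1f}")
```

Plotting the second column against the first gives the Q-Q plot; for skewed data like these, the largest observations rise well above the line suggested by the rest of the points.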

Figure 5.30 shows the normal Q-Q plots for the original speed variable and the transformed log-speed variable, produced using the SPSS Explore... procedure. The diagnostic diagonal line is shown on each graph. In the left-hand plot, for speed , the plot points clearly deviate from the diagonal in a way that signals positive skewness. The right-hand plot, for log_speed, shows the plot points generally falling along the diagonal line thereby conforming much more closely to what is expected in a normal distribution.

Figure 5.30. Normal Q-Q plots for the original speed variable and the new log_speed variable

In addition to visual ways of detecting non-normality, there are also numerical ways. As highlighted in Chap. 10.1007/978-981-15-2537-7_1, there are two additional characteristics of any distribution, namely skewness (asymmetric distribution tails) and kurtosis (peakedness of the distribution). Both have an associated statistic that provides a measure of that characteristic, similar to the mean and standard deviation statistics. In a normal distribution, the values for the skewness and kurtosis statistics are both zero (skewness = 0 means a symmetric distribution; kurtosis = 0 means a mesokurtic distribution). The further away each statistic is from zero, the more the distribution deviates from a normal shape. Both the skewness statistic and the kurtosis statistic have standard errors (see Fundamental Concept 10.1007/978-981-15-2537-7_7#Sec14) associated with them (which work very much like the standard deviation, only for a statistic rather than for observations); these can be routinely computed by almost any statistical package when you request a descriptive analysis. Without going into the logic right now (this will come in Fundamental Concept 10.1007/978-981-15-2537-7_7#Sec1), a rough rule of thumb you can use to check for normality using the skewness and kurtosis statistics is to do the following:

  • Prepare : Take the standard error for the statistic and multiply it by 2 (or 3 if you want to be more conservative).
  • Interval : Add the result from the Prepare step to the value of the statistic and subtract the result from the value of the statistic. You will end up with two numbers, one low - one high, that define the ends of an interval (what you have just created approximates what is called a ‘confidence interval’, see Procedure 10.1007/978-981-15-2537-7_8#Sec18).
  • Check : If zero falls inside of this interval (i.e. between the low and high endpoints from the Interval step), then there is likely to be no significant issue with that characteristic of the distribution. If zero falls outside of the interval (i.e. lower than the low value endpoint or higher than the high value endpoint), then you likely have an issue with non-normality with respect to that characteristic.

Visually, we saw in the left graph in Fig. 5.29 that the speed variable was highly positively skewed. What if Maree wanted to check some numbers to support this judgment? She could ask SPSS to produce the skewness and kurtosis statistics for both the original speed variable and the new log_speed variable using the Frequencies... or the Explore... procedure. Table 5.6 shows what SPSS would produce if the Frequencies ... procedure were used.

Table 5.6. Skewness and kurtosis statistics and their standard errors for both the original speed variable and the new log_speed variable

Using the 3-step check rule described above, Maree could roughly evaluate the normality of the two variables as follows:

  • speed, skewness : [Prepare] 2 × .229 = .458 ➔ [Interval] 1.487 − .458 = 1.029 and 1.487 + .458 = 1.945 ➔ [Check] zero does not fall inside the interval bounded by 1.029 and 1.945, so there appears to be a significant problem with skewness. Since the value for the skewness statistic (1.487) is positive, this means the problem is positive skewness, confirming what the left graph in Fig. 5.29 showed.
  • speed, kurtosis : [Prepare] 2 × .455 = .91 ➔ [Interval] 3.071 − .91 = 2.161 and 3.071 + .91 = 3.981 ➔ [Check] zero does not fall inside the interval bounded by 2.161 and 3.981, so there appears to be a significant problem with kurtosis. Since the value for the kurtosis statistic (3.071) is positive, this means the problem is leptokurtosis: the distribution is too peaked relative to what is expected in a normal distribution.
  • log_speed, skewness : [Prepare] 2 × .229 = .458 ➔ [Interval] −.050 − .458 = −.508 and −.050 + .458 = .408 ➔ [Check] zero falls inside the interval bounded by −.508 and .408, so there appears to be no problem with skewness. The log transform appears to have corrected the problem, confirming what the right graph in Fig. 5.29 showed.
  • log_speed, kurtosis : [Prepare] 2 × .455 = .91 ➔ [Interval] −.672 − .91 = −1.582 and −.672 + .91 = .238 ➔ [Check] zero falls inside the interval bounded by −1.582 and .238, so there appears to be no problem with kurtosis. The log transform appears to have corrected this problem as well, rendering the distribution more approximately mesokurtic (i.e. normal) in shape.
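The Prepare, Interval and Check steps can be wrapped in a small helper, sketched below. The function name `normality_check` is hypothetical; the statistics and standard errors are the values worked through in the text.

```python
def normality_check(statistic, std_error, multiplier=2):
    """Rough rule of thumb: does zero fall inside statistic ± multiplier × SE?"""
    half_width = multiplier * std_error                               # Prepare
    low, high = statistic - half_width, statistic + half_width        # Interval
    return low, high, low <= 0 <= high                                # Check


# (statistic, standard error) pairs as quoted in the worked example.
values = {
    ("speed", "skewness"): (1.487, 0.229),
    ("speed", "kurtosis"): (3.071, 0.455),
    ("log_speed", "skewness"): (-0.050, 0.229),
    ("log_speed", "kurtosis"): (-0.672, 0.455),
}

results = {}
for (variable, characteristic), (stat, se) in values.items():
    low, high, ok = normality_check(stat, se)
    results[(variable, characteristic)] = (low, high, ok)
    verdict = "no apparent problem" if ok else "likely problem"
    print(f"{variable:9s} {characteristic:8s} [{low:.3f}, {high:.3f}] -> {verdict}")
```

Passing `multiplier=3` gives the more conservative version of the check mentioned in the Prepare step.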

There are also more formal tests of significance (see Fundamental Concept 10.1007/978-981-15-2537-7_7#Sec1) that one can use to numerically evaluate normality, such as the Kolmogorov-Smirnov test and the Shapiro-Wilk test. Each of these tests, for example, can be produced by SPSS on request, via the Explore... procedure.

1 For more information, see Chap. 10.1007/978-981-15-2537-7_1 – The language of statistics .

References for Procedure 5.1

  • Allen P, Bennett K, Heritage B. SPSS statistics: A practical guide. 4. South Melbourne, VIC: Cengage Learning Australia Pty; 2019.
  • George D, Mallery P. IBM SPSS statistics 25 step by step: A simple guide and reference. 15. New York: Routledge; 2019.

Useful Additional Readings for Procedure 5.1

  • Agresti A. Statistical methods for the social sciences. 5. Boston: Pearson; 2018.
  • Argyrous G. Statistics for research: With a guide to SPSS. 3. London: Sage; 2011.
  • De Vaus D. Analyzing social science data: 50 key problems in data analysis. London: Sage; 2002.
  • Glass GV, Hopkins KD. Statistical methods in education and psychology. 3. Upper Saddle River, NJ: Pearson; 1996.
  • Gravetter FJ, Wallnau LB. Statistics for the behavioural sciences. 10. Belmont, CA: Wadsworth Cengage; 2017.
  • Steinberg WJ. Statistics alive. 2. Los Angeles: Sage; 2011.

References for Procedure 5.2

  • Chang W. R graphics cookbook: Practical recipes for visualizing data. 2. Sebastopol, CA: O’Reilly Media; 2019.
  • Jacoby WG. Statistical graphics for univariate and bivariate data. Thousand Oaks, CA: Sage; 1997.
  • McCandless D. Knowledge is beautiful. London: William Collins; 2014.
  • Smithson MJ. Statistics with confidence. London: Sage; 2000.
  • Toseland M, Toseland S. Infographica: The world as you have never seen it before. London: Quercus Books; 2012.
  • Wilkinson L. Cognitive science and graphic design. In: SYSTAT Software Inc, editor. SYSTAT 13: Graphics. Chicago, IL: SYSTAT Software Inc; 2009. pp. 1–21.

Useful Additional Readings for Procedure 5.2

  • Field A. Discovering statistics using SPSS for windows. 5. Los Angeles: Sage; 2018.
  • George D, Mallery P. IBM SPSS statistics 25 step by step: A simple guide and reference. 15. Boston, MA: Pearson Education; 2019.
  • Hintze JL. NCSS 8 help system: Graphics. Kaysville, UT: Number Cruncher Statistical Systems; 2012.
  • StatPoint Technologies, Inc. STATGRAPHICS Centurion XVI user manual. Warrenton, VA: StatPoint Technologies Inc.; 2010.
  • SYSTAT Software Inc. SYSTAT 13: Graphics. Chicago, IL: SYSTAT Software Inc; 2009.

References for Procedure 5.3

  • Cleveland WR. Visualizing data. Summit, NJ: Hobart Press; 1995.
  • Jacoby WG. Statistical graphics for visualizing multivariate data. Thousand Oaks, CA: Sage; 1998.

Useful Additional Readings for Procedure 5.3

  • Kirk A. Data visualisation: A handbook for data driven design. Los Angeles: Sage; 2016.
  • Knaflic CN. Storytelling with data: A data visualization guide for business professionals. Hoboken, NJ: Wiley; 2015.
  • Tufte E. The visual display of quantitative information. 2. Cheshire, CN: Graphics Press; 2001.

References for Procedure 5.4

Useful Additional Readings for Procedure 5.4

  • Rosenthal R, Rosnow RL. Essentials of behavioral research: Methods and data analysis. 2. New York: McGraw-Hill Inc; 1991.

References for Procedure 5.5

Useful Additional Readings for Procedure 5.5

  • Gravetter FJ, Wallnau LB. Statistics for the behavioural sciences. 9. Belmont, CA: Wadsworth Cengage; 2012.

References for Fundamental Concept I

Useful Additional Readings for Fundamental Concept I

  • Howell DC. Statistical methods for psychology. 8. Belmont, CA: Cengage Wadsworth; 2013.

References for Procedure 5.6

  • Norušis MJ. IBM SPSS statistics 19 guide to data analysis. Upper Saddle River, NJ: Prentice Hall; 2012.
  • Field A. Discovering statistics using SPSS for Windows. 5. Los Angeles: Sage; 2018.
  • Hintze JL. NCSS 8 help system: Introduction. Kaysville, UT: Number Cruncher Statistical System; 2012.
  • SYSTAT Software Inc. SYSTAT 13: Statistics - I. Chicago, IL: SYSTAT Software Inc; 2009.

Useful Additional Readings for Procedure 5.6

  • Hartwig F, Dearing BE. Exploratory data analysis. Beverly Hills, CA: Sage; 1979. [ Google Scholar ]
  • Leinhardt G, Leinhardt L. Exploratory data analysis. In: Keeves JP, editor. Educational research, methodology, and measurement: An international handbook. 2. Oxford: Pergamon Press; 1997. pp. 519–528. [ Google Scholar ]
  • Rosenthal R, Rosnow RL. Essentials of behavioral research: Methods and data analysis. 2. New York: McGraw-Hill, Inc.; 1991. [ Google Scholar ]
  • Tukey JW. Exploratory data analysis. Reading, MA: Addison-Wesley Publishing; 1977. [ Google Scholar ]
  • Velleman PF, Hoaglin DC. ABC’s of EDA. Boston: Duxbury Press; 1981. [ Google Scholar ]

Useful Additional Readings for Procedure 5.7

References for fundemental concept ii.

  • Diebold FX, Schuermann T, Stroughair D. Pitfalls and opportunities in the use of extreme value theory in risk management. The Journal of Risk Finance. 2000; 1 (2):30–35. doi: 10.1108/eb043443. [ CrossRef ] [ Google Scholar ]
  • Lane D. Online statistics education: A multimedia course of study. Houston, TX: Rice University; 2007. [ Google Scholar ]

Useful Additional Readings for Fundemental Concept II

  • Keller DK. The tao of statistics: A path to understanding (with no math) Thousand Oaks, CA: Sage; 2006. [ Google Scholar ]

Quant Analysis 101: Descriptive Statistics

Everything You Need To Get Started (With Examples)

By: Derek Jansen (MBA) | Reviewers: Kerryn Warren (PhD) | October 2023

If you’re new to quantitative data analysis, one of the first terms you’re likely to hear being thrown around is descriptive statistics. In this post, we’ll unpack the basics of descriptive statistics, using straightforward language and loads of examples. So grab a cup of coffee and let’s crunch some numbers!

Overview: Descriptive Statistics

  • What are descriptive statistics?
  • Descriptive vs inferential statistics
  • Why the descriptives matter
  • The “Big 7” descriptive statistics
  • Key takeaways

What are descriptive statistics?

At the simplest level, descriptive statistics summarise and describe relatively basic but essential features of a quantitative dataset – for example, a set of survey responses. They provide a snapshot of the characteristics of your dataset and allow you to better understand, roughly, how the data are “shaped” (more on this later). For example, a descriptive statistic could include the proportion of males and females within a sample or the percentages of different age groups within a population.

Another common descriptive statistic is the humble average (which in statistics-talk is called the mean). For example, if you undertook a survey and asked people to rate their satisfaction with a particular product on a scale of 1 to 10, you could then calculate the average rating. This is a very basic statistic, but as you can see, it gives you some idea of how the data are shaped.

What about inferential statistics?

Now, you may have also heard the term inferential statistics being thrown around, and you’re probably wondering how that’s different from descriptive statistics. Simply put, descriptive statistics describe and summarise the sample itself, while inferential statistics use the data from a sample to make inferences or predictions about a population.

Put another way, descriptive statistics help you understand your dataset, while inferential statistics help you make broader statements about the population, based on what you observe within the sample.

Why do descriptive statistics matter?

While descriptive statistics are relatively simple from a mathematical perspective, they play a very important role in any research project. All too often, students skim over the descriptives and run ahead to the seemingly more exciting inferential statistics, but this can be a costly mistake.

The reason for this is that descriptive statistics help you, as the researcher, comprehend the key characteristics of your sample without getting lost in vast amounts of raw data. In doing so, they provide a foundation for your quantitative analysis. Additionally, they enable you to quickly identify potential issues within your dataset – for example, suspicious outliers, missing responses and so on. Just as importantly, descriptive statistics inform the decision-making process when it comes to choosing which inferential statistics you’ll run, as each inferential test has specific requirements regarding the shape of the data.

Long story short, it’s essential that you take the time to dig into your descriptive statistics before looking at more “advanced” inferentials. It’s also worth noting that, depending on your research aims and questions, descriptive stats may be all that you need in any case. So, don’t discount the descriptives!

The “Big 7” descriptive statistics

With the what and why out of the way, let’s take a look at the most common descriptive statistics. Beyond the counts, proportions and percentages we mentioned earlier, we have what we call the “Big 7” descriptives. These can be divided into two categories – measures of central tendency and measures of dispersion.

Measures of central tendency

True to the name, measures of central tendency describe the centre or “middle section” of a dataset. In other words, they provide some indication of what a “typical” data point looks like within a given dataset. The three most common measures are:

  • The mean, which is the mathematical average of a set of numbers – in other words, the sum of all numbers divided by the count of all numbers.
  • The median, which is the middlemost number in a set of numbers, when those numbers are ordered from lowest to highest.
  • The mode, which is the most frequently occurring number in a set of numbers (in any order). Naturally, a dataset can have one mode, no mode (no number occurs more than once) or multiple modes.
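
The three possibilities for the mode (one, none, or several) can be sketched in a few lines of Python. This is only an illustration: the helper name `modes` and the example lists are hypothetical, and the "no mode" rule follows the definition above (no number occurs more than once).

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest count, or [] when nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:  # no value occurs more than once, so there is no mode
        return []
    return sorted(value for value, count in counts.items() if count == top)

print(modes([2, 5, 5, 5, 6, 8]))  # one mode
print(modes([1, 2, 3, 4]))        # no mode
print(modes([3, 3, 7, 7, 9]))     # two modes
```

Returning a list rather than a single number keeps the "multiple modes" case visible instead of silently picking one value.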

To make this a little more tangible, let’s look at a sample dataset, along with the corresponding mean, median and mode. This dataset reflects the service ratings (on a scale of 1 – 10) from 15 customers.

Example set of descriptive stats

As you can see, the mean of 5.8 is the average rating across all 15 customers. Meanwhile, 6 is the median. In other words, if you were to list all the responses in order from low to high, Customer 8 would be in the middle (with their service rating being 6). Lastly, the number 5 is the most frequent rating (appearing 3 times), making it the mode.

Together, these three descriptive statistics give us a quick overview of how these customers feel about the service levels at this business. In other words, most customers feel rather lukewarm and there’s certainly room for improvement. From a more statistical perspective, this also means that the data tend to cluster around the 5-6 mark, since the mean and the median are fairly close to each other.

To take this a step further, let’s look at the frequency distribution of the responses. In other words, let’s count how many times each rating was received, and then plot these counts onto a bar chart.

Example frequency distribution of descriptive stats

As you can see, the responses tend to cluster toward the centre of the chart, creating something of a bell-shaped curve. In statistical terms, data with this shape are described as (approximately) normally distributed.

As you delve into quantitative data analysis, you’ll find that normal distributions are very common, but they’re certainly not the only type of distribution. In some cases, the data can lean toward the left or the right of the chart (i.e., toward the low end or high end). This lean is reflected by a measure called skewness, and it’s important to pay attention to this when you’re analysing your data, as this will have an impact on what types of inferential statistics you can use on your dataset.

Example of skewness
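
To make the sign of skewness concrete, here is a minimal sketch of one common moment-based formula (the third central moment divided by the second central moment raised to the power 1.5). The function name and the three small datasets are hypothetical, and statistical packages often apply a small-sample correction, so their results can differ slightly from this version.

```python
def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using divide-by-n moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

symmetric = [3, 4, 5, 6, 7]          # balanced around the mean
tail_right = [1, 1, 2, 2, 3, 10]     # bulk at the low end, long tail to the right
tail_left = [1, 8, 9, 9, 10, 10]     # bulk at the high end, long tail to the left

for name, data in [("symmetric", symmetric),
                   ("tail_right", tail_right),
                   ("tail_left", tail_left)]:
    print(f"{name}: {skewness(data):.2f}")
```

With this formula a symmetric dataset gives 0, a long right-hand tail gives a positive value, and a long left-hand tail gives a negative value.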

Measures of dispersion

While the measures of central tendency provide insight into how “centred” the dataset is, it’s also important to understand how dispersed that dataset is. In other words, to what extent the data cluster toward the centre – specifically, the mean. In some cases, the majority of the data points will sit very close to the centre, while in other cases, they’ll be scattered all over the place. Enter the measures of dispersion, of which there are three:

Range, which measures the difference between the largest and smallest number in the dataset. In other words, it indicates how spread out the dataset really is.

Variance, which measures how much each number in a dataset varies from the mean (average). More technically, it calculates the average of the squared differences between each number and the mean. A higher variance indicates that the data points are more spread out, while a lower variance suggests that the data points are closer to the mean.

Standard deviation, which is the square root of the variance. It serves the same purposes as the variance, but is a bit easier to interpret as it presents a figure that is in the same unit as the original data. You’ll typically present this statistic alongside the means when describing the data in your research.
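
One practical wrinkle is worth a quick sketch. The "average of the squared differences" described above divides by the number of data points (the population variance), whereas many tools report the sample variance, which divides by one less than that. The snippet below uses Python's standard `statistics` module to show both on a small hypothetical set of ratings (the values are made up for illustration and are not the dataset in the figures).

```python
import statistics

ratings = [2, 4, 4, 5, 5, 7, 8]  # hypothetical ratings, for illustration only

data_range = max(ratings) - min(ratings)

pop_var = statistics.pvariance(ratings)    # divides by n
sample_var = statistics.variance(ratings)  # divides by n - 1
pop_sd = statistics.pstdev(ratings)        # square root of pop_var
sample_sd = statistics.stdev(ratings)      # square root of sample_var

print(f"range = {data_range}")
print(f"population variance = {pop_var:.2f}, SD = {pop_sd:.2f}")
print(f"sample variance = {sample_var:.2f}, SD = {sample_sd:.2f}")
```

When you report a standard deviation, it helps to know which of the two your software produced, since the difference is noticeable in small samples.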

Again, let’s look at our sample dataset to make this all a little more tangible.

As you can see, the range of 8 reflects the difference between the highest rating (10) and the lowest rating (2). The standard deviation of 2.18 tells us that, on average, ratings within the dataset sit roughly 2.18 points away from the mean (of 5.8), reflecting a relatively dispersed set of data.

For the sake of comparison, let’s look at another much more tightly grouped (less dispersed) dataset.

Example of skewed data

As you can see, all the ratings lie between 5 and 8 in this dataset, resulting in a much smaller range, variance and standard deviation. You might also notice that the data are clustered toward the right side of the graph – in other words, the data are skewed. If we calculate the skewness for this dataset, we get a result of -0.12. The negative sign confirms this lean: the bulk of the ratings sit toward the high end, with the thinner tail stretching toward the low end (a negative, or left, skew).

In summary, range, variance and standard deviation all provide an indication of how dispersed the data are. These measures are important because they help you interpret the measures of central tendency within context. In other words, if your measures of dispersion are all fairly high numbers, you need to interpret your measures of central tendency with some caution, as the results are not particularly centred. Conversely, if the data are all tightly grouped around the mean (i.e., low dispersion), the mean becomes a much more “meaningful” statistic.

Key Takeaways

We’ve covered quite a bit of ground in this post. Here are the key takeaways:

  • Descriptive statistics, although relatively simple, are a critically important part of any quantitative data analysis.
  • Measures of central tendency include the mean (average), median and mode.
  • Skewness indicates whether a dataset leans to one side or another.
  • Measures of dispersion include the range, variance and standard deviation.

Descriptive Statistics | Definitions, Types, Examples

Published on 4 November 2022 by Pritha Bhandari . Revised on 9 January 2023.

Descriptive statistics summarise and organise characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population.

In quantitative research, after collecting data, the first step of statistical analysis is to describe characteristics of the responses, such as the average of one variable (e.g., age), or the relation between two variables (e.g., age and creativity).

The next step is inferential statistics, which help you decide whether your data confirms or refutes your hypothesis and whether it is generalisable to a larger population.

Table of contents

  • Types of descriptive statistics
  • Frequency distribution
  • Measures of central tendency
  • Measures of variability
  • Univariate descriptive statistics
  • Bivariate descriptive statistics
  • Frequently asked questions

There are 3 main types of descriptive statistics:

  • The distribution concerns the frequency of each value.
  • The central tendency concerns the averages of the values.
  • The variability or dispersion concerns how spread out the values are.

Types of descriptive statistics

You can apply these to assess only one variable at a time, in univariate analysis, or to compare two or more, in bivariate and multivariate analysis.

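For example, suppose you survey people about their leisure activities and ask how many times in the past year they did each of the following:

  • Go to a library
  • Watch a movie at a theater
  • Visit a national park

Frequency distribution

A data set is made up of a distribution of values, or scores. In tables or graphs, you can summarise the frequency of every possible value of a variable in numbers or percentages.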

Simple frequency distribution table

Gender    Number
Male      182
Female    235
Other     27

From this table, you can see that more women than men or people with another gender identity took part in the study. In a grouped frequency distribution, you can group numerical response values and add up the number of responses for each group. You can also convert each of these numbers to percentages.

Grouped frequency distribution table

Library visits in the past year   Percent
0–4                               6%
5–8                               20%
9–12                              42%
13–16                             24%
17+                               8%
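
As a rough sketch of how a grouped frequency distribution like this could be built, the Python below bins a list of visit counts and converts the counts to percentages. The responses are hypothetical (they are not the survey data behind the table), the helper name `visit_group` is made up, and the bin labels use plain hyphens.

```python
from collections import Counter

def visit_group(visits):
    """Map a visit count to one of the bins used in the grouped table."""
    if visits <= 4:
        return "0-4"
    if visits <= 8:
        return "5-8"
    if visits <= 12:
        return "9-12"
    if visits <= 16:
        return "13-16"
    return "17+"

# Hypothetical responses, for illustration only.
responses = [15, 3, 12, 0, 24, 3, 9, 10, 6, 11]

labels = ["0-4", "5-8", "9-12", "13-16", "17+"]
counts = Counter(visit_group(v) for v in responses)
percentages = {label: 100 * counts[label] / len(responses) for label in labels}

for label in labels:
    print(f"{label:>5}  {counts[label]:>2}  {percentages[label]:.0f}%")
```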

Measures of central tendency

Measures of central tendency estimate the center, or average, of a data set. The mean, median and mode are 3 ways of finding the average.

Here we will demonstrate how to calculate the mean, median, and mode using the first 6 responses of our survey.

The mean , or M , is the most commonly used method for finding the average.

To find the mean, simply add up all response values and divide the sum by the total number of responses. The total number of responses or observations is called N .

Mean number of library visits
Data set                    15, 3, 12, 0, 24, 3
Sum of all values           15 + 3 + 12 + 0 + 24 + 3 = 57
Total number of responses   N = 6
Mean                        Divide the sum of values by N to find M: 57/6 = 9.5

The median is the value that’s exactly in the middle of a data set.

To find the median, order each response value from the smallest to the biggest. Then, the median is the number in the middle. If there are two numbers in the middle, find their mean.

Median number of library visits
Ordered data set   0, 3, 3, 12, 15, 24
Middle numbers     3, 12
Median             Find the mean of the two middle numbers: (3 + 12)/2 = 7.5

The mode is simply the most popular or most frequent response value. A data set can have no mode, one mode, or more than one mode.

To find the mode, order your data set from lowest to highest and find the response that occurs most frequently.

Mode number of library visits
Ordered data set   0, 3, 3, 12, 15, 24
Mode               Find the most frequently occurring response: 3
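
The three calculations above can be checked with Python's standard `statistics` module. This is just one convenient way to verify the hand calculations; the variable names are arbitrary.

```python
import statistics

visits = [15, 3, 12, 0, 24, 3]  # the six library-visit responses used above

mean = statistics.mean(visits)        # sum of values divided by N: 57 / 6
median = statistics.median(visits)    # mean of the two middle values, 3 and 12
modes = statistics.multimode(visits)  # list of every most-frequent value

print(mean, median, modes)
```

`multimode` returns a list, so a data set with more than one mode is reported in full rather than reduced to a single value.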

Measures of variability

Measures of variability give you a sense of how spread out the response values are. The range, standard deviation and variance each reflect different aspects of spread.

The range gives you an idea of how far apart the most extreme response scores are. To find the range, simply subtract the lowest value from the highest value. For the six library responses used above, the range is 24 – 0 = 24.

Standard deviation

The standard deviation ( s ) is the average amount of variability in your dataset. It tells you, on average, how far each score lies from the mean. The larger the standard deviation, the more variable the data set is.

There are six steps for finding the standard deviation:

  1. List each score and find their mean.
  2. Subtract the mean from each score to get the deviation from the mean.
  3. Square each of these deviations.
  4. Add up all of the squared deviations.
  5. Divide the sum of the squared deviations by N – 1.
  6. Find the square root of the number you found.

Raw data   Deviation from mean   Squared deviation
15         15 – 9.5 = 5.5        30.25
3          3 – 9.5 = -6.5        42.25
12         12 – 9.5 = 2.5        6.25
0          0 – 9.5 = -9.5        90.25
24         24 – 9.5 = 14.5       210.25
3          3 – 9.5 = -6.5        42.25
M = 9.5    Sum = 0               Sum of squares = 421.5

Step 5: 421.5/5 = 84.3

Step 6: √84.3 = 9.18
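
The six steps translate almost line for line into code. The sketch below follows them on the same six scores, dividing by N – 1 as in step 5; the variable names are illustrative only.

```python
import math

scores = [15, 3, 12, 0, 24, 3]

mean = sum(scores) / len(scores)               # step 1: find the mean
deviations = [x - mean for x in scores]        # step 2: deviation from the mean
squared = [d ** 2 for d in deviations]         # step 3: square each deviation
sum_of_squares = sum(squared)                  # step 4: add up the squares
variance = sum_of_squares / (len(scores) - 1)  # step 5: divide by N - 1
std_dev = math.sqrt(variance)                  # step 6: take the square root

print(f"mean = {mean}, sum of squares = {sum_of_squares}")
print(f"variance = {variance:.1f}, standard deviation = {std_dev:.2f}")
```

A useful sanity check is that the deviations in step 2 always sum to zero, which is why they are squared before being added up.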

The variance is the average of squared deviations from the mean. Variance reflects the degree of spread in the data set. The more spread the data, the larger the variance is in relation to the mean.

To find the variance, simply square the standard deviation. The symbol for variance is s². In the example above, s² = 84.3, the value found in step 5 before taking the square root.

Univariate descriptive statistics

Univariate descriptive statistics focus on only one variable at a time. It’s important to examine data from each variable separately using multiple measures of distribution, central tendency and spread. Programs like SPSS and Excel can be used to easily calculate these.

Visits to the library
N                    6
Mean                 9.5
Median               7.5
Mode                 3
Standard deviation   9.18
Variance             84.3
Range                24

If you were to only consider the mean as a measure of central tendency, your impression of the ‘middle’ of the data set can be skewed by outliers, unlike the median or mode.

Likewise, while the range is sensitive to extreme values, you should also consider the standard deviation and variance to get easily comparable measures of spread.

Bivariate descriptive statistics

If you’ve collected data on more than one variable, you can use bivariate or multivariate descriptive statistics to explore whether there are relationships between them.

In bivariate analysis, you simultaneously study the frequency and variability of two variables to see if they vary together. You can also compare the central tendency of the two variables before performing further statistical tests.

Multivariate analysis is the same as bivariate analysis but with more than two variables.

Contingency table

In a contingency table, each cell represents the intersection of two variables. Usually, an independent variable (e.g., gender) appears along the vertical axis and a dependent one appears along the horizontal axis (e.g., activities). You read ‘across’ the table to see how the independent and dependent variables relate to each other.

Number of visits to the library in the past year
Group      0–4   5–8   9–12   13–16   17+
Children   32    68    37     23      22
Adults     36    48    43     83      25

Interpreting a contingency table is easier when the raw data is converted to percentages. Percentages make each row comparable to the other by making it seem as if each group had only 100 observations or participants. When creating a percentage-based contingency table, you add the N for each group (each level of the independent variable) at the end of its row.

Visits to the library in the past year (Percentages)
Group      0–4   5–8   9–12   13–16   17+   N
Children   18%   37%   20%    13%     12%   182
Adults     15%   20%   18%    35%     11%   235

From this table, it is clearer that similar proportions of children and adults go to the library 17 or more times a year. Additionally, children most commonly went to the library between 5 and 8 times, while for adults, this number was between 13 and 16.
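
The conversion from counts to row percentages can be sketched as follows, using the counts from the contingency table above. The function name `row_percentages` is illustrative, and percentages are rounded to whole numbers as in the table.

```python
bins = ["0-4", "5-8", "9-12", "13-16", "17+"]
counts = {
    "Children": [32, 68, 37, 23, 22],
    "Adults": [36, 48, 43, 83, 25],
}

def row_percentages(row):
    """Return (rounded percentages, row total N) for one group's counts."""
    total = sum(row)
    return [round(100 * count / total) for count in row], total

table = {group: row_percentages(row) for group, row in counts.items()}

print("Group     " + "".join(f"{b:>7}" for b in bins) + "      N")
for group, (percents, n) in table.items():
    cells = "".join(f"{p:>6}%" for p in percents)
    print(f"{group:<10}{cells}{n:>7}")
```

Because each cell is rounded separately, a row of percentages will not always add up to exactly 100.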

Scatter plots

A scatter plot is a chart that shows you the relationship between two or three variables. It’s a visual representation of the strength of a relationship.

In a scatter plot, you plot one variable along the x-axis and another one along the y-axis. Each data point is represented by a point in the chart.

From your scatter plot, you see that as the number of movies seen at movie theaters increases, the number of visits to the library decreases. Based on your visual assessment of a possible linear relationship, you perform further tests of correlation and regression.

Descriptive statistics: Scatter plot
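
A common numerical follow-up to a scatter plot is Pearson's correlation coefficient. The sketch below computes it by hand on hypothetical data (the movie and library counts are invented for illustration, not taken from the survey), and `pearson_r` is an illustrative helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: movies seen at a theater vs. visits to the library.
movies = [0, 2, 4, 6, 8, 10]
library = [20, 17, 15, 9, 6, 2]

r = pearson_r(movies, library)
print(f"r = {r:.2f}")
```

A value close to -1 matches the visual impression of a downward-sloping cloud of points; a value close to 0 would suggest no linear relationship.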

Frequently asked questions

Descriptive statistics summarise the characteristics of a data set. Inferential statistics allow you to test a hypothesis or assess whether your data is generalisable to the broader population.

The 3 main types of descriptive statistics concern the frequency distribution, central tendency, and variability of a dataset.

  • Distribution refers to the frequencies of different responses.
  • Measures of central tendency give you a typical, or average, value of the responses.
  • Measures of variability show you the spread or dispersion of your dataset.
  • Univariate statistics summarise only one variable at a time.
  • Bivariate statistics compare two variables.
  • Multivariate statistics compare more than two variables.

Cite this Scribbr article

Bhandari, P. (2023, January 09). Descriptive Statistics | Definitions, Types, Examples. Scribbr. Retrieved 3 September 2024, from https://www.scribbr.co.uk/stats/descriptive-statistics-explained/

Chapter 12: Descriptive Statistics

Expressing Your Results

Learning Objectives

  • Write out simple descriptive statistics in American Psychological Association (APA) style.
  • Interpret and create simple APA-style graphs—including bar graphs, line graphs, and scatterplots.
  • Interpret and create simple APA-style tables—including tables of group or condition means and correlation matrixes.

Once you have conducted your descriptive statistical analyses, you will need to present them to others. In this section, we focus on presenting descriptive statistical results in writing, in graphs, and in tables—following American Psychological Association (APA) guidelines for written research reports. These principles can be adapted easily to other presentation formats such as posters and slide show presentations.

Presenting Descriptive Statistics in Writing

When you have a small number of results to report, it is often most efficient to write them out. There are a few important APA style guidelines here. First, statistical results are always presented in the form of numerals rather than words and are usually rounded to two decimal places (e.g., “2.00” rather than “two” or “2”). They can be presented either in the narrative description of the results or parenthetically—much like reference citations. Here are some examples:

The mean age of the participants was 22.43 years with a standard deviation of 2.34.

Among the low self-esteem participants, those in a negative mood expressed stronger intentions to have unprotected sex (M = 4.05, SD = 2.32) than those in a positive mood (M = 2.15, SD = 2.27).

The treatment group had a mean of 23.40 (SD = 9.33), while the control group had a mean of 20.87 (SD = 8.45).

The test-retest correlation was .96.

There was a moderate negative correlation between the alphabetical position of respondents’ last names and their response time (r = −.27).

Notice that when presented in the narrative, the terms mean and standard deviation are written out, but when presented parenthetically, the symbols M and SD are used instead. Notice also that it is especially important to use parallel construction to express similar or comparable results in similar ways. The third example is much better than the following nonparallel alternative:

The treatment group had a mean of 23.40 (SD = 9.33), while 20.87 was the mean of the control group, which had a standard deviation of 8.45.
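
If you are generating these statements from analysis output, a small formatting helper keeps the rounding and symbols consistent. The sketch below is a hypothetical convenience, not part of any APA tool: it rounds to two decimal places and, following the examples above, drops the leading zero from correlations. It writes the minus sign as a plain hyphen.

```python
def apa_mean_sd(mean, sd):
    """Format a mean and standard deviation for parenthetical reporting."""
    return f"(M = {mean:.2f}, SD = {sd:.2f})"

def apa_correlation(r):
    """Format a correlation to two decimals without a leading zero."""
    rounded = round(r, 2)
    text = f"{abs(rounded):.2f}"
    if text.startswith("0"):
        text = text[1:]          # ".96" rather than "0.96"
    sign = "-" if rounded < 0 else ""
    return f"(r = {sign}{text})"

print(apa_mean_sd(23.4, 9.33))
print(apa_mean_sd(2, 2.27))
print(apa_correlation(0.96))
print(apa_correlation(-0.27))
```

Formatting every group with the same helper also makes parallel construction the default, since each result comes out in the same shape.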

Presenting Descriptive Statistics in Graphs

When you have a large number of results to report, you can often do it more clearly and efficiently with a graph. When you prepare graphs for an APA-style research report, there are some general guidelines that you should keep in mind. First, the graph should always add important information rather than repeat information that already appears in the text or in a table. (If a graph presents information more clearly or efficiently, then you should keep the graph and eliminate the text or table.) Second, graphs should be as simple as possible. For example, the Publication Manual discourages the use of colour unless it is absolutely necessary (although colour can still be an effective element in posters, slide show presentations, or textbooks). Third, graphs should be interpretable on their own. A reader should be able to understand the basic result based only on the graph and its caption and should not have to refer to the text for an explanation.

There are also several more technical guidelines for graphs that include the following:

  • The graph should be slightly wider than it is tall.
  • The independent variable should be plotted on the x-axis and the dependent variable on the y-axis.
  • Values should increase from left to right on the x-axis and from bottom to top on the y-axis.
  • Axis labels should be clear and concise and include the units of measurement if they do not appear in the caption.
  • Axis labels should be parallel to the axis.
  • Legends should appear within the boundaries of the graph.
  • Text should be in the same simple font throughout and differ by no more than four points.
  • Captions should briefly describe the figure, explain any abbreviations, and include the units of measurement if they do not appear in the axis labels.
  • Captions in an APA manuscript should be typed on a separate page that appears at the end of the manuscript. See Chapter 11 for more information.

As we have seen throughout this book, bar graphs are generally used to present and compare the mean scores for two or more groups or conditions. The bar graph in Figure 12.11 is an APA-style version of Figure 12.4. Notice that it conforms to all the guidelines listed. A new element in Figure 12.11 is the smaller vertical bars that extend both upward and downward from the top of each main bar. These are error bars, and they represent the variability in each group or condition. Although they sometimes extend one standard deviation in each direction, they are more likely to extend one standard error in each direction (as in Figure 12.11). The standard error is the standard deviation of the group divided by the square root of the sample size of the group. The standard error is used because, as a rough rule of thumb, a difference between group means that is greater than two standard errors is statistically significant. Thus one can “see” whether a difference is likely to be statistically significant based on a bar graph with error bars.

Sample APA-style bar graph.
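
The standard error definition above is easy to sketch in code. The means and standard deviations below are the example values quoted earlier in this section, but the group size of 30 is a hypothetical value chosen purely for illustration, as are the function and variable names.

```python
import math

def standard_error(sd, n):
    """Standard error of a group mean: SD divided by the square root of n."""
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return sd / math.sqrt(n)

# Means and SDs come from the written example earlier in this section;
# n = 30 per group is an assumption made only for this sketch.
groups = {
    "treatment": {"mean": 23.40, "sd": 9.33, "n": 30},
    "control": {"mean": 20.87, "sd": 8.45, "n": 30},
}

error_bars = {}
for name, g in groups.items():
    se = standard_error(g["sd"], g["n"])
    # An error bar extends one standard error above and below the mean.
    error_bars[name] = (g["mean"] - se, g["mean"] + se)
    print(f"{name}: M = {g['mean']:.2f}, SE = {se:.2f}")
```

The resulting lower and upper values are what a plotting tool would draw as the ends of each error bar.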

Line Graphs

Line graphs are used to present correlations between quantitative variables when the independent variable has, or is organized into, a relatively small number of distinct levels. Each point in a line graph represents the mean score on the dependent variable for participants at one level of the independent variable. Figure 12.12 is an APA-style version of the results of Carlson and Conard. Notice that it includes error bars representing the standard error and conforms to all the stated guidelines.

Sample APA-style line graph.

In most cases, the information in a line graph could just as easily be presented in a bar graph. In Figure 12.12, for example, one could replace each point with a bar that reaches up to the same level and leave the error bars right where they are. This emphasizes the fundamental similarity of the two types of statistical relationship. Both are differences in the average score on one variable across levels of another. The convention followed by most researchers, however, is to use a bar graph when the variable plotted on the x-axis is categorical and a line graph when it is quantitative.

Scatterplots

Scatterplots are used to present relationships between quantitative variables when the variable on the x-axis (typically the independent variable) has a large number of levels. Each point in a scatterplot represents an individual rather than the mean for a group of individuals, and there are no lines connecting the points. The graph in Figure 12.13 is an APA-style version of Figure 12.7, which illustrates a few additional points. First, when the variables on the x-axis and y-axis are conceptually similar and measured on the same scale (as here, where they are measures of the same variable on two different occasions), this can be emphasized by making the axes the same length. Second, when two or more individuals fall at exactly the same point on the graph, one way this can be indicated is by offsetting the points slightly along the x-axis. Other ways are by displaying the number of individuals in parentheses next to the point or by making the point larger or darker in proportion to the number of individuals. Finally, the straight line that best fits the points in the scatterplot, which is called the regression line, can also be included.

Sample APA-style scatterplot.

Expressing Descriptive Statistics in Tables

Like graphs, tables can be used to present large amounts of information clearly and efficiently. The same general principles apply to tables as apply to graphs. They should add important information to the presentation of your results, be as simple as possible, and be interpretable on their own. Again, we focus here on tables for an APA-style manuscript.

The most common use of tables is to present several means and standard deviations—usually for complex research designs with multiple independent and dependent variables. Figure 12.14, for example, shows the results of a hypothetical study similar to the one by MacDonald and Martineau (2002) [1] discussed in  Chapter 5 . (The means in Figure 12.14 are the means reported by MacDonald and Martineau, but the standard errors are not). Recall that these researchers categorized participants as having low or high self-esteem, put them into a negative or positive mood, and measured their intentions to have unprotected sex. Although not mentioned in  Chapter 5 , they also measured participants’ attitudes toward unprotected sex. Notice that the table includes horizontal lines spanning the entire table at the top and bottom, and just beneath the column headings. Furthermore, every column has a heading—including the leftmost column—and there are additional headings that span two or more columns that help to organize the information and present it more efficiently. Finally, notice that APA-style tables are numbered consecutively starting at 1 (Table 1, Table 2, and so on) and given a brief but clear and descriptive title.

Sample APA-style table presenting means and standard deviations. Long description available.

Another common use of tables is to present correlations, usually measured by Pearson’s r, among several variables. This kind of table is called a correlation matrix. Figure 12.15 is a correlation matrix based on a study by David McCabe and colleagues (McCabe, Roediger, McDaniel, Balota, & Hambrick, 2010) [2]. They were interested in the relationships between working memory and several other variables. We can see from the table that the correlation between working memory and executive function, for example, was an extremely strong .96, that the correlation between working memory and vocabulary was a medium .27, and that all the measures except vocabulary tend to decline with age. Notice here that only half the table is filled in because the other half would have identical values. For example, the Pearson’s r value in the upper right corner (working memory and age) would be the same as the one in the lower left corner (age and working memory). The correlation of a variable with itself is always 1.00, so these values are replaced by dashes to make the table easier to read.
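As a rough illustration of why only half of a correlation matrix needs to be filled in, the sketch below computes Pearson's r for every pair of variables and keeps only the cells below the diagonal. The variable names and scores are made up for illustration and are not data from the McCabe study; `pearson_r`, `format_r`, and `lower_triangle` are hypothetical helper names.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r for two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def format_r(r):
    """Two decimals, no leading zero, as in APA-style tables (.96, -.59)."""
    return f"{r:.2f}".replace("0.", ".", 1)


def lower_triangle(scores):
    """One text row per variable, filling in only the cells below the diagonal."""
    names = list(scores)
    rows = []
    for i, row_name in enumerate(names):
        cells = [format_r(pearson_r(scores[row_name], scores[names[j]]))
                 for j in range(i)]
        # The diagonal (a variable with itself) is shown as a dash.
        rows.append(f"{i + 1}. {row_name}: " + " ".join(cells + ["-"]))
    return rows


# Hypothetical scores for six participants.
scores = {
    "memory": [12, 15, 11, 18, 14, 16],
    "speed": [30, 34, 29, 40, 33, 35],
    "age": [70, 62, 75, 55, 66, 60],
}

for row in lower_triangle(scores):
    print(row)
```

Because r for (memory, age) equals r for (age, memory), the upper half of the table would only repeat the lower half.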

Sample APA-style table (correlation matrix). Long description available.

As with graphs, precise statistical results that appear in a table do not need to be repeated in the text. Instead, the writer can note major trends and alert the reader to details (e.g., specific correlations) that are of particular interest.

Key Takeaways

  • In an APA-style article, simple results are most efficiently presented in the text, while more complex results are most efficiently presented in graphs or tables.
  • APA style includes several rules for presenting numerical results in the text. These include using words only for numbers less than 10 that do not represent precise statistical results, rounding results to two decimal places, and using words (e.g., “mean”) in the text and symbols (e.g., “M”) in parentheses.
  • APA style includes several rules for presenting results in graphs and tables. Graphs and tables should add information rather than repeating information, be as simple as possible, and be interpretable on their own with a descriptive caption (for graphs) or a descriptive title (for tables).

Long Descriptions

“Convincing” long description: A four-panel comic strip. In the first panel, a man says to a woman, “I think we should give it another shot.” The woman says, “We should break up, and I can prove it.”

In the second panel, there is a line graph with a downward trend titled “Our Relationship.”

In the third panel, the man, bent over and looking at the graph in the woman’s hands, says, “Huh.”

In the fourth panel, the man says, “Maybe you’re right.” The woman says, “I knew data would convince you.” The man replies, “No, I just think I can do better than someone who doesn’t label her axes.” [Return to “Convincing”]

Figure 12.11 long description: A sample APA-style bar graph, with a horizontal axis labelled “Condition” and a vertical axis labelled “Clinician Rating of Severity.” The caption of the graph says, “Figure X. Mean clinician’s rating of phobia severity for participants receiving the education treatment and the exposure treatment. Error bars represent standard errors.” At the top of each data bar is an error bar, which looks like a capital I: a vertical line with short horizontal lines attached to its top and bottom. The bottom half of each error bar overlaps the top of the data bar, while the top half extends above it. [Return to Figure 12.11]

Figure 12.12 long description: A sample APA-style line graph with a horizontal axis labelled “Last Name Quartile” and a vertical axis labelled “Response Times (z Scores).” The caption of the graph says, “Figure X. Mean response time by the alphabetical position of respondents’ names in the alphabet. Response times are expressed as z scores. Error bars represent standard errors.” Each data point has an error bar sticking out of its top and bottom. [Return to Figure 12.12]

Figure 12.13 long description: Sample APA-style scatterplot with a horizontal axis labelled “Time 1” and a vertical axis labelled “Time 2.” Each axis has values from 10 to 30. The caption of the scatterplot says, “Figure X. Relationship between scores on the Rosenberg self-esteem scale taken by 25 research methods students on two occasions one week apart. Pearson’s r = .96.” Most of the data points are clustered around the dashed regression line that extends from approximately (12, 11) to (29, 22). [Return to Figure 12.13]

Figure 12.14 long description: Sample APA-style table presenting means and standard deviations. The table is titled “Table X” and is captioned, “Means and Standard Deviations of Intentions to Have Unprotected Sex and Attitudes Toward Unprotected Sex as a Function of Both Mood and Self-Esteem.” The data is organized into negative and positive mood and details intentions and attitudes toward unprotected sex.

                     Negative mood      Positive mood
Self-esteem          M       SD         M       SD
Intentions
  High               2.46    1.97       2.45    2.00
  Low                4.05    2.32       2.15    2.27
Attitudes
  High               1.65    2.23       1.82    2.32
  Low                1.95    2.01       1.23    1.75

[Return to Figure 12.14]

Figure 12.15 long description: Sample APA-style correlation matrix, titled “Table X: Correlations Between Five Cognitive Variables and Age.” The five cognitive variables are:

  • Working memory
  • Executive function
  • Processing speed
  • Vocabulary
  • Episodic memory

The correlations are as follows:

Table X: Correlations Between Five Cognitive Variables and Age
Measure                  1       2       3       4       5
1. Working memory        –
2. Executive function    .96     –
3. Processing speed      .78     .78     –
4. Vocabulary            .27     .45     .08     –
5. Episodic memory       .73     .75     .52     .38     –
6. Age                   −.59    −.56    −.82    .22     −.41

Media Attributions

  • Convincing by XKCD, CC BY-NC (Attribution NonCommercial)
  • MacDonald, T. K., & Martineau, A. M. (2002). Self-esteem, mood, and intentions to use condoms: When does low self-esteem lead to risky health behaviours? Journal of Experimental Social Psychology, 38, 299–306.
  • McCabe, D. P., Roediger, H. L., McDaniel, M. A., Balota, D. A., & Hambrick, D. Z. (2010). The relationship between working memory capacity and executive functioning. Neuropsychology, 24(2), 222–243. doi:10.1037/a0017619
  • Buss, D. M., & Schmitt, D. P. (1993). Sexual strategies theory: A contextual evolutionary analysis of human mating. Psychological Review, 100, 204–232.

Bar graph: A figure in which the heights of the bars represent the group means.

Error bars: Small bars at the top of each main bar in a bar graph that represent the variability in each group or condition.

Standard error: The standard deviation of the group divided by the square root of the sample size of the group.

Line graph: A graph used to present correlations between quantitative variables when the independent variable has, or is organized into, a relatively small number of distinct levels.

Scatterplot: A graph which shows correlations between quantitative variables; each point represents one person’s score on both variables.

Correlation matrix: A table showing the correlation between every possible pair of variables in the study.
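The standard error definition above can be written out as a short calculation. This is a minimal sketch: the ratings are invented, `standard_error` is a hypothetical helper name, and it assumes the sample standard deviation (n − 1 in the denominator), which the definition does not specify.

```python
from math import sqrt
from statistics import stdev


def standard_error(values):
    """Sample standard deviation divided by the square root of the sample size."""
    return stdev(values) / sqrt(len(values))


# Hypothetical severity ratings for one group of eight participants.
ratings = [4, 6, 5, 7, 3, 5, 6, 4]
print(f"SE = {standard_error(ratings):.2f}")
```

The value returned is what each error bar in a bar graph or line graph would extend above and below the group mean.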

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Descriptive Statistics: Reporting the Answers to the 5 Basic Questions of Who, What, Why, When, Where, and a Sixth, So What?

Affiliation.

  • 1 From the Department of Surgery and Perioperative Care, Dell Medical School at the University of Texas at Austin, Austin, Texas.
  • PMID: 28891910
  • DOI: 10.1213/ANE.0000000000002471

Descriptive statistics are specific methods basically used to calculate, describe, and summarize collected research data in a logical, meaningful, and efficient way. Descriptive statistics are reported numerically in the manuscript text and/or in its tables, or graphically in its figures. This basic statistical tutorial discusses a series of fundamental concepts about descriptive statistics and their reporting. The mean, median, and mode are 3 measures of the center or central tendency of a set of data. In addition to a measure of its central tendency (mean, median, or mode), another important characteristic of a research data set is its variability or dispersion (ie, spread). In simplest terms, variability is how much the individual recorded scores or observed values differ from one another. The range, standard deviation, and interquartile range are 3 measures of variability or dispersion. The standard deviation is typically reported for a mean, and the interquartile range for a median. Testing for statistical significance, along with calculating the observed treatment effect (or the strength of the association between an exposure and an outcome), and generating a corresponding confidence interval are 3 tools commonly used by researchers (and their collaborating biostatistician or epidemiologist) to validly make inferences and more generalized conclusions from their collected data and descriptive statistics. A number of journals, including Anesthesia & Analgesia, strongly encourage or require the reporting of pertinent confidence intervals. A confidence interval can be calculated for virtually any variable or outcome measure in an experimental, quasi-experimental, or observational research study design. Generally speaking, in a clinical trial, the confidence interval is the range of values within which the true treatment effect in the population likely resides. 
In an observational study, the confidence interval is the range of values within which the true strength of the association between the exposure and the outcome (eg, the risk ratio or odds ratio) in the population likely resides. There are many possible ways to graphically display or illustrate different types of data. While there is often latitude as to the choice of format, ultimately, the simplest and most comprehensible format is preferred. Common examples include a histogram, bar chart, line chart or line graph, pie chart, scatterplot, and box-and-whisker plot. Valid and reliable descriptive statistics can answer basic yet important questions about a research data set, namely: "Who, What, Why, When, Where, How, How Much?"


Describing the participants in a study


R. M. Pickering, Describing the participants in a study, Age and Ageing , Volume 46, Issue 4, July 2017, Pages 576–581, https://doi.org/10.1093/ageing/afx054


This paper reviews the use of descriptive statistics to describe the participants included in a study. It discusses the practicalities of incorporating statistics in papers for publication in Age and Ageing, concisely and in ways that are easy for readers to understand and interpret.

Most papers reporting analysis of clinical data will at some point use statistics to describe the socio-demographic characteristics and medical history of the study participants. An important reason for doing this is to give the reader some idea of the extent to which study findings can be generalised to their own local situation. Producing descriptive statistics is straightforward: most statistical packages produce all the statistics one could possibly desire, so a choice has to be made over which ones to present. These then have to be included in a paper in a manner that is easy for readers to assimilate. There may be constraints on the amount of space available, and it is in any case a good idea to make statistical display as concise as possible. This article reviews the statistics that might be used to describe a sample of older people, and gives tips on how best to do this in a paper for publication in Age and Ageing. It builds on a previously published paper [ 1 ].

The values observed in a group of subjects, when measurements of a quantitative characteristic are made, are called the distribution of values. Graphical displays can be used to show the detail of the distribution in a variety of ways, but they take up a considerable amount of space. A precis of two key features of the distribution, its centre and its spread, is usually presented using descriptive statistics. The centre of a distribution can be described by its mean or median, and the spread by its standard deviation (SD), range, or inter-quartile range (IQR). Definitions and properties of these statistics are given in statistical textbooks [ 2 ].

Figure 1 a shows an idealised symmetric distribution for a quantitative variable. The mean might be used here to describe where the centre of the distribution lies and the SD to give an idea of how spread out values are around the centre. SDs are particularly appropriate where a symmetric distribution approximately follows the bell-shaped pattern shown in Figure 1 a which is called the normal distribution. For such a distribution the large majority, 95%, of values observed in a sample will fall between the values two SDs above and below the mean, called the normal range. Presentation of the mean and SD invites the reader to calculate the normal range and think of it as covering most of the distribution of values. Another reason for presenting the SD is that it is required in calculations of sample size for approximately normally distributed outcomes, and can be used by readers in planning future studies. A graphical display of approximately normally distributed real data (age at admission amongst 373 study participants) is shown in Figure 1 c: with relatively small sample size a smooth distribution such as that shown in Figure 1 a cannot be achieved. The mean (82.9) and SD (6.8) of the age distribution lead to the normal range 69.3–96.5 years, which can be seen in Figure 1 c to cover most of the ages in the sample: 14 subjects fall below 69.3 and 7 fall above 96.5, so that the range actually covers 352 (94.4%) of the 373 participants, close to the anticipated 95%. For familiar measurements, such as age, there is additional value in presenting the range, the minimum and maximum values attained. Knowing that the study included people aged between 65 and 101 years is immediately meaningful, whereas the value of the SD is more difficult to interpret.
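The normal range arithmetic in the paragraph above can be reproduced in a few lines. This sketch uses only the figures quoted in the text (mean, SD, sample size, and the counts falling outside the range); `normal_range` is a hypothetical helper name.

```python
def normal_range(mean, sd):
    """Return (lower, upper) limits two SDs either side of the mean."""
    return mean - 2 * sd, mean + 2 * sd


# Figures reported in the text for age at admission.
mean_age, sd_age, n = 82.9, 6.8, 373
below, above = 14, 7  # participants below and above the normal range

lower, upper = normal_range(mean_age, sd_age)
inside = n - below - above
coverage = 100 * inside / n

print(f"Normal range: {lower:.1f} to {upper:.1f} years")
print(f"Covered: {inside} of {n} ({coverage:.1f}%)")
```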

Figure 1. Idealised and real data distributions. (a) Symmetrical distribution. (b) Skewed distribution. (c) Dotplot (each dot representing one value) of an approximate symmetrical distribution indicating the normal range: age in years at admission (n = 373). (d) Dotplot (each dot representing one value) of a skewed distribution with outliers emphasised and indicating mean and median: hours in A&E (n = 348).

When a distribution is skewed (Figure 1 b) just one or two extreme values, ‘outliers’, in one of the tails of the distribution (to the right in Figure 1 b) pull the mean away from the obvious central value. An alternative statistic describing central location is the median, defined as the point with 50% of the sample falling above it and 50% below. Figure 1 d shows the distribution of real data (hours in A&E amongst 348 study participants) following a skewed distribution. A few excessively long A&E stays pull the mean to the higher value of 4.9 h compared to the median of 4.4 h: the effect would be greater with a higher proportion of subjects having long stays. The median is often recommended as the preferred statistic to describe the centre of a skewed distribution, but the mean can be helpful. If the attribute being described takes only a limited number of values, the medians of two groups can take the same value in spite of substantial differences in the tails. In these circumstances, the mean can be sensitive to an overall shift in distribution while the median is not. When a comparison of cost based on length of stay is to be made, presenting means of the skewed distributions facilitates calculation of cost savings per subject by applying unit cost to the difference in means. Figure 1 b suggests that the value with highest frequency might be a useful descriptor of the centre of a distribution. In practice, this can prove awkward: depending on the precision of measurement there may be no value occurring more than once.
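Two points from the paragraph above lend themselves to a small demonstration: an outlier moves the mean far more than the median, and two groups can share a median while their means differ. All values below are invented for illustration and are not the study data.

```python
from statistics import mean, median

# Hypothetical lengths of stay in hours, then the same sample plus one long stay.
hours = [3.1, 3.6, 4.0, 4.4, 4.9, 5.3, 6.0]
with_outlier = hours + [40.3]

mean_shift = mean(with_outlier) - mean(hours)
median_shift = median(with_outlier) - median(hours)
print(f"Mean moves by {mean_shift:.2f} h, median by {median_shift:.2f} h")

# Hypothetical scores on a scale with few distinct values: the medians agree
# even though group_b has a much heavier upper tail.
group_a = [1, 2, 2, 2, 3]
group_b = [1, 2, 2, 5, 9]
print(median(group_a), median(group_b))  # same median
print(mean(group_a), mean(group_b))      # different means
```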

It is clear from Figure 1 b that no single number can adequately describe the spread of a skewed distribution because spread is greater in one direction than the other. The range (from 1.7 to 40.3 h in A&E in our skewed example) could be used. Another possibility is the IQR (from 3.5 to 5.4 h in A&E) covering the central 50% of the distribution. The SD may be presented even though a distribution is skewed, and could be useful to readers for approximate power calculations, but the normal range derived from the mean and SD will be misleading. With mean(SD) = 4.9(3.2), the lower limit of the normal range of hours in A&E is the impossible negative value of –1.5 h, while the upper limit of 11.3 h lies well below the extreme values exhibited in Figure 1 d.
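A sketch of the two alternative spread descriptors mentioned above, the range and the IQR, follows. The data are invented to be right-skewed, `spread_summary` is a hypothetical helper, and the quartiles come from Python's `statistics.quantiles` with its default method; other packages may use slightly different quartile definitions.

```python
from statistics import quantiles


def spread_summary(values):
    """Minimum, maximum and quartiles of a list of numbers."""
    q1, _median, q3 = quantiles(values, n=4)  # three cut points for n=4
    return {"min": min(values), "max": max(values), "q1": q1, "q3": q3}


# Hypothetical right-skewed lengths of stay in hours.
hours = [1.7, 2.9, 3.5, 3.8, 4.1, 4.4, 4.6, 5.0, 5.4, 6.2, 9.8, 40.3]

s = spread_summary(hours)
print(f"Range: {s['min']} to {s['max']} h")
print(f"IQR:   {s['q1']:.1f} to {s['q3']:.1f} h")
# In a right-skewed sample the upper tail is much longer than the lower one.
print(f"Upper tail: {s['max'] - s['q3']:.1f} h, lower tail: {s['q1'] - s['min']:.1f} h")
```

Reporting both limits of the range or IQR, rather than a single width, keeps the asymmetry visible to the reader.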

Descriptive statistics may be presented in text, for example [ 3 ]:

Participants’ ages ranged from 50 to 87 years ( M  = 66.1, SD = 7.8) with 56% identified as female, 64% married or partnered, 23% reported being retired or not working, 55% had post-secondary and higher education, and <20% reported living alone. Over 60% of the participants identified as NZ European. The mean of net personal annual income was $34,615. The participants reported the diagnosis of an average of 2.63 (±2.07) chronic health conditions, with 50% reported having three or more chronic health conditions.

There are perhaps too many attributes (age, gender, marital status, employment status, educational level, living arrangements, nationality, personal income and number of chronic conditions) being described in the excerpt above: it would be easier to assimilate this information from a table.

Table 1. Characteristics of subjects at admission and their operations before (1998/99) and after (2000/01) implementation of a care pathway [ 4 ]. Figures are number (% of non-missing values) unless otherwise stated.

                                                1998/99 (n = 395)   2000/01 (n = 373)
Age on admission (years)
  Mean (SD)                                     83 (7)              83 (7)
  Minimum–maximum                               65–101              65–101
Gender
  Male                                          90 (23%)            90 (24%)
  Female                                        305 (77%)           283 (76%)
Admission domicile
  Own home                                      219 (55%)           202 (54%)
  Sheltered accommodation                       47 (12%)            58 (16%)
  Residential care                              90 (23%)            83 (22%)
  Nursing home                                  18 (5%)             15 (4%)
  Other ward SUHT                               7 (2%)              2 (1%)
  Other trust                                   14 (4%)             13 (4%)
Ambulation score
  Bed/chair bound                               8 (2%)              5 (1%)
  Presence 1+                                   12 (3%)             7 (2%)
  1 person                                      25 (6%)             20 (5%)
  Unable 50 m                                   145 (37%)           138 (38%)
  Able 50 m                                     200 (51%)           197 (54%)
                                                (n = 390)           (n = 367)
Time in A&E (h)
  Mean (SD)                                     4.9 (3.2)           5.6 (2.4)
  Minimum–maximum                               1.7–40.3            0–21.4
                                                (n = 348)           (n = 328)
History of dementia                             79 (20%)            85 (23%)
                                                (n = 395)           (n = 371)
Confused on admission                           124 (32%)           125 (34%)
                                                (n = 394)           (n = 371)
Type of fracture
  Intra-capsular                                192 (54%)           173 (52%)
  Extra-capsular                                165 (46%)           161 (48%)
                                                (n = 357)           (n = 334)
Operation more than 48 h after ward admission   183 (52%)           205 (64%)
                                                (n = 354)           (n = 323)
Reason for delayed operation
  Medical                                       61 (35%)            74 (43%)
  Organisational                                66 (38%)            72 (42%)
  Both                                          45 (26%)            27 (16%)
                                                (n = 172)           (n = 173)
Type of operation
  Thompson's hemiarthroplasty                   101 (27%)           87 (24%)
  Austin-Moore hemiarthroplasty                 69 (19%)            18 (5%)
  Dynamic screw                                 162 (43%)           165 (46%)
  Asnis screws                                  38 (11%)            38 (11%)
  Bipolar hemiarthroplasty                      3 (1%)              48 (14%)
                                                (n = 373)           (n = 356)
Grade of surgeon
  Consultant                                    46 (12%)            110 (32%)
  SPR                                           318 (86%)           220 (63%)
  SHO                                           6 (2%)              18 (5%)
                                                (n = 355)           (n = 348)
Grade of anaesthetist
  Consultant                                    120 (34%)           175 (55%)
  SPR                                           99 (28%)            52 (16%)
  SHO                                           133 (38%)           81 (29%)
                                                (n = 352)           (n = 318)

The distributions of the two quantitative variables in Table 1 are described by mean (SD) and range. The statistics being presented should be stated in the context of the table, here in the left hand column, and could differ across variables. If the same statistics are presented for all the variables in a table they can be indicated in the column headings or title. From the mean (SD) and range in each phase, we can see that the age distribution is reasonably symmetrical because the mean falls close to the centre of the range, and the mean ± 2 SD approach the limits of the range. The distribution of hours in A&E is skewed to the right but has been summarised with the same statistics. We can see that the distribution is skewed because the mean is much closer to the minimum than the maximum, and, if the normal range is calculated, the upper limit does not approach the high values in either phase. For these reasons, the normal range should not be interpreted as covering 95% of values. These conclusions from descriptive statistics alone can be verified in Figure 1 c and d.
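The reasoning in the paragraph above, judging symmetry from the mean, SD, minimum and maximum alone, can be sketched as a small helper. The inputs are the summary statistics from Table 1; `describe_shape` and its output keys are hypothetical names, and the checks are informal heuristics rather than a formal test of skewness.

```python
def describe_shape(mean, sd, minimum, maximum):
    """Heuristic shape checks from summary statistics only."""
    lower, upper = mean - 2 * sd, mean + 2 * sd
    # 0.5 means the mean sits exactly mid-way between minimum and maximum.
    position = (mean - minimum) / (maximum - minimum)
    return {
        "normal_range": (round(lower, 1), round(upper, 1)),
        "mean_position": round(position, 2),
        "lower_limit_below_minimum": lower < minimum,
    }


# Summary statistics as reported in Table 1.
age_1998 = describe_shape(mean=83, sd=7, minimum=65, maximum=101)
ae_1998 = describe_shape(mean=4.9, sd=3.2, minimum=1.7, maximum=40.3)
ae_2000 = describe_shape(mean=5.6, sd=2.4, minimum=0, maximum=21.4)

for label, result in [("Age 1998/99", age_1998),
                      ("A&E 1998/99", ae_1998),
                      ("A&E 2000/01", ae_2000)]:
    print(label, result)
```

For age, the mean sits mid-range and the normal range of 69 to 97 approaches the limits of 65 and 101. For hours in A&E, the mean sits close to the minimum and the upper limit falls far short of the maximum in both phases.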

A choice arises when describing the distribution of an ordinal variable indicating ordered response categories, such as ambulation score in Table 1 . If the variable takes many distinct values, it can be treated as a quantitative variable and described in terms of centre and spread: ordinal variables often extend from the minimum to maximum possible values and in this case stating the range is not helpful. The meaning of the extremes should be stated in the context of the table to aid interpretation of results. Ordinal variables taking only a few distinct values are better treated as categorical variables and number (%) presented for each category. With only five categories the latter approach was adopted for ambulation score. Display as a categorical variable can be facilitated by combining infrequently occurring adjacent values.

In the original study, 3,182 of 5,719 admissions were screened and 2,286 were eligible. Six hundred and ten patients were not available on the hospital units when the RA [Research Assistant] arrived to complete the CAM [Confusion Assessment Method]; 1,582 patients assented to complete the CAM and 94 patients did not assent; the CAM was not completed for 728 patients because an informant was not available to confirm an acute change and fluctuation in mental status prior to admission or enrolment. The CAM was completed for 854 patients; 375 had delirium; 278 were enroled. Of the 278 enroled patients, 172 were discharged before the follow-up assessment, 73 were still hospitalised, 8 withdrew from the study and 27 died. Of the 172 discharged patients, delirium recovery status was determined for 152, 16 withdrew from the study after discharge and 4 died.

The authors start with the 5,719 admissions and report the numbers lost at successive stages, to arrive at the analysis sample of 152. It may be easier to assimilate the detail of the process from tabular or graphical presentation. The CONSORT guidelines [ 6 ] concerning the reporting of Randomised Controlled Trials (RCTs) recommend that progress of participants through a trial be presented as a flow chart, and an example is shown in Figure 2 . These charts are unequivocally helpful and are now presented in studies other than RCTs.
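One practical benefit of tabulating attrition is that the counts at each stage can be checked against the stage total. The sketch below tallies the figures exactly as quoted in the excerpt above; `stages` and `discrepancies` are hypothetical names. As quoted, the four outcomes listed for the 278 enrolled patients sum to 280, which may reflect overlapping categories or a transcription slip; only the original report could settle this.

```python
# Counts as quoted in the excerpt: (stage, stage total, outcomes at that stage).
stages = [
    ("eligible", 2286,
     {"not available": 610, "assented": 1582, "did not assent": 94}),
    ("assented", 1582,
     {"CAM not completed": 728, "CAM completed": 854}),
    ("enrolled", 278,
     {"discharged": 172, "still hospitalised": 73, "withdrew": 8, "died": 27}),
    ("discharged", 172,
     {"recovery status determined": 152, "withdrew": 16, "died": 4}),
]


def discrepancies(stages):
    """Map each stage to (sum of its listed outcomes) minus (stage total)."""
    return {name: sum(parts.values()) - total for name, total, parts in stages}


for name, diff in discrepancies(stages).items():
    status = "balances" if diff == 0 else f"off by {diff:+d}"
    print(f"{name}: {status}")
```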

Figure 2. Recruitment and attrition rates in an RCT of WiiActive exercises in community dwelling older adults [ 7 ].

In addition to loss of participants at each time point as shown in a flow chart, information on specific variables may be missing even though a participant was available at the study point in question. Taking Table 1 as an example, there were 395 and 373 admissions during the 1998/99 and 2000/01 phases, respectively, as stated in the column headings, but the number of participants providing information varies considerably across the characteristics in the table. The reader should be able to establish how many cases contribute to each result, and to this end wherever the number available is lower than the total for the phase, it is stated below the descriptive statistics. For example, ambulation score was only available for 390 of the 395 participants in the 1998/99 phase. The percentages presented for ambulation score were calculated amongst cases where information was available, and this was done for all percentages in the table as indicated in the title. Alternatively, missing values in a categorical variable may be treated as a category in their own right. Where there is a large amount of missing information, this may be the best way of handling the situation, with percentages calculated using the total sample size as the denominator. Stating the numbers available allows the reader to check this point. Only participants whose operation was delayed by more than 48 h gave a ‘reason why operation was delayed’ in the table, and from the stated numbers the reader can see that a reason was not given for all delayed cases.
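The two ways of handling missing values described above differ only in the denominator. The sketch below uses the 1998/99 ambulation score counts from Table 1; `number_percent` is a hypothetical helper name, and rounding to whole percentages is an assumption chosen to match the table.

```python
def number_percent(counts, total=None):
    """Format counts as 'n (p%)'.

    With total=None the denominator is the number of non-missing values.
    With a total, missing cases are added as their own 'Missing' category
    and the total sample size is the denominator.
    """
    available = sum(counts.values())
    rows = dict(counts)
    denominator = available
    if total is not None:
        rows["Missing"] = total - available
        denominator = total
    return {label: f"{n} ({100 * n / denominator:.0f}%)"
            for label, n in rows.items()}


# Ambulation score, 1998/99 phase (Table 1): 390 of 395 participants have a value.
ambulation_1998 = {
    "Bed/chair bound": 8,
    "Presence 1+": 12,
    "1 person": 25,
    "Unable 50 m": 145,
    "Able 50 m": 200,
}

print(number_percent(ambulation_1998))             # % of non-missing values
print(number_percent(ambulation_1998, total=395))  # missing as its own category
```

With only five missing values the two approaches give nearly identical percentages here; the choice matters more when a large share of the information is missing.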

In reports of RCTs, a table describing baseline characteristics in each trial arm demonstrates whether or not randomisation was successful in producing similar groups, as well as addressing the generalisability issue. If there are differences at baseline, comparison of outcome may be confounded. Statistical tests of significance should not be used to decide whether any differences need to be taken into account [ 8 , 9 ]. If the allocation was properly randomised, we know that any differences at baseline must be due to chance. The question facing the researcher is whether or not the magnitude of a difference at baseline is sufficient to confound comparison of outcome, and this depends on the strength of the relationship between the potential confounder and the outcome, as well as the baseline difference. A statistical test for baseline differences does not address this question; furthermore, there may be insufficient numbers available to detect quite large baseline differences. Statistics describing baseline characteristics are used to judge whether any differences are large enough to be important. If they are, additional analyses of outcome controlled for characteristics that differ at baseline may be performed. On the other hand, in non-randomised studies, groups are likely to differ, and statistical significance tests can be used to evaluate the evidence that the selection process of patients to each intervention results in different groups. In this situation a primary analysis controlled for many predictors of outcome would probably have been planned, and should be carried out irrespective of any differences, or lack of them, between study groups.

Describing the main features of the distribution of important characteristics of the participants included in a study is the first step in most papers reporting statistical analysis. It is important in establishing the generalisability of research findings, and in the context of comparative studies, flags the need for controlled analysis. Usually space constraints limit the presentation of many descriptive statistics, and in any case, too many statistics can confuse rather than enhance insight. The attrition of subjects during a study should also be described, so that study subjects can be related to the patient base from which they were drawn.

Descriptive statistics are used to describe the participants in a study so that readers can assess the generalisability of study findings to their own clinical practice.

They need to be appropriate to the variable or participant characteristic they aim to describe, and presented in a fashion that is easy for readers to understand.

When many patient characteristics are being described, the detail of the statistics used and number of participants contributing to analysis are best incorporated in tabular presentation.

The author would like to thank Dr Helen Roberts for kindly granting permission to use data from the care pathway study [ 4 ] to produce Figure 1 c and d.

None declared.

1. Pickering RM. Describing the subjects in a study. Palliat Med 2001; 15: 69–75.

2. Altman DG. Practical Statistics for Medical Research. London: Chapman & Hall, 1991.

3. Yeung P, Breheny M. Using the capability approach to understand the determinants of subjective well-being among community-dwelling older people in New Zealand. Age Ageing 2016; 45: 292–8.

4. Roberts HC, Pickering RM, Onslow E et al. The effectiveness of implementing a care pathway for femoral neck fracture in older people: a prospective controlled before and after study. Age Ageing 2004; 33: 178–84.

5. Cole MG, McCusker JM, Bailey R et al. Partial and no recovery from delirium after hospital discharge predict increased adverse events. Age Ageing 2017; 46: 90–5.

6. Schulz KF, Altman DG, Moher D, for the CONSORT Group. CONSORT 2010 statement: updated guidelines for reporting parallel-group randomised trials. BMJ 2010; 340: 698–702.

7. Kwok BC, Pua YH. Effects of WiiActive exercises on fear of falling and functional outcomes in community-dwelling older adults: a randomised control trial. Age Ageing 2016; 45: 621–28.

8. Assmann SF, Pocock SJ, Enos LE, Kasten LE. Subgroup analysis and other (mis)uses of baseline data in clinical trials. Lancet 2000; 355: 1064–9.

9. Altman DG. Comparability of randomized groups. Statistician 1985; 34: 125–36.

  • Online ISSN 1468-2834
  • Copyright © 2024 British Geriatrics Society

Research Method

Research Results Section – Writing Guide and Examples

Research Results

Research results refer to the findings and conclusions derived from a systematic investigation or study conducted to answer a specific question or hypothesis. These results are typically presented in a written report or paper and can include various forms of data such as numerical data, qualitative data, statistics, charts, graphs, and visual aids.

Results Section in Research

The results section of the research paper presents the findings of the study. It is the part of the paper where the researcher reports the data collected during the study and analyzes it to draw conclusions.

In the results section, the researcher should describe the data that was collected, the statistical analysis performed, and the findings of the study. It is important to be objective and not interpret the data in this section. Instead, the researcher should report the data as accurately and objectively as possible.

Structure of Research Results Section

The structure of the research results section can vary depending on the type of research conducted, but in general, it should contain the following components:

  • Introduction: The introduction should provide an overview of the study, its aims, and its research questions. It should also briefly explain the methodology used to conduct the study.
  • Data presentation: This section presents the data collected during the study. It may include tables, graphs, or other visual aids to help readers better understand the data. The data presented should be organized in a logical and coherent way, with headings and subheadings used to help guide the reader.
  • Data analysis: In this section, the data presented in the previous section are analyzed and interpreted. The statistical tests used to analyze the data should be clearly explained, and the results of the tests should be presented in a way that is easy to understand.
  • Discussion of results: This section should provide an interpretation of the results of the study, including a discussion of any unexpected findings. The discussion should also address the study’s research questions and explain how the results contribute to the field of study.
  • Limitations: This section should acknowledge any limitations of the study, such as sample size, data collection methods, or other factors that may have influenced the results.
  • Conclusions: The conclusions should summarize the main findings of the study and provide a final interpretation of the results. The conclusions should also address the study’s research questions and explain how the results contribute to the field of study.
  • Recommendations: This section may provide recommendations for future research based on the study’s findings. It may also suggest practical applications for the study’s results in real-world settings.

Outline of Research Results Section

The following is an outline of the key components typically included in the Results section:

I. Introduction

  • A brief overview of the research objectives and hypotheses
  • A statement of the research question

II. Descriptive statistics

  • Summary statistics (e.g., mean, standard deviation) for each variable analyzed
  • Frequencies and percentages for categorical variables

III. Inferential statistics

  • Results of statistical analyses, including tests of hypotheses
  • Tables or figures to display statistical results

IV. Effect sizes and confidence intervals

  • Effect sizes (e.g., Cohen’s d, odds ratio) to quantify the strength of the relationship between variables
  • Confidence intervals to estimate the range of plausible values for the effect size

V. Subgroup analyses

  • Results of analyses that examined differences between subgroups (e.g., by gender, age, treatment group)

VI. Limitations and assumptions

  • Discussion of any limitations of the study and potential sources of bias
  • Assumptions made in the statistical analyses

VII. Conclusions

  • A summary of the key findings and their implications
  • A statement of whether the hypotheses were supported or not
  • Suggestions for future research

Example of Research Results Section

An example of a research results section could be:

I. Introduction

  • This study sought to examine the relationship between sleep quality and academic performance in college students.
  • Hypothesis: College students who report better sleep quality will have higher GPAs than those who report poor sleep quality.
  • Methodology: Participants completed a survey about their sleep habits and academic performance.

II. Participants

  • Participants were college students (N=200) from a mid-sized public university in the United States.
  • The sample was evenly split by gender (50% female, 50% male) and predominantly white (85%).
  • Participants were recruited through flyers and online advertisements.

III. Results

  • Participants who reported better sleep quality had significantly higher GPAs (M=3.5, SD=0.5) than those who reported poor sleep quality (M=2.9, SD=0.6).
  • See Table 1 for a summary of the results.
  • Participants who reported consistent sleep schedules had higher GPAs than those with irregular sleep schedules.
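To give a sense of how large the reported GPA difference is, the means and standard deviations above can be turned into a standardized effect size. The sketch below computes Cohen's d with a pooled standard deviation; it assumes the two sleep-quality groups are the same size, which the example does not state, and the function name `cohens_d` is chosen here for illustration.

```python
import math


def cohens_d(mean_1, sd_1, mean_2, sd_2):
    """Cohen's d for two groups of equal size, using the pooled SD."""
    pooled_sd = math.sqrt((sd_1 ** 2 + sd_2 ** 2) / 2)
    return (mean_1 - mean_2) / pooled_sd


# Summary statistics quoted in the example; equal group sizes are an assumption.
d = cohens_d(3.5, 0.5, 2.9, 0.6)
print(f"Cohen's d = {d:.2f}")
```

With unequal group sizes, the pooled standard deviation would need to weight each group's variance by its degrees of freedom.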

IV. Discussion

  • The results support the hypothesis that better sleep quality is associated with higher academic performance in college students.
  • These findings have implications for college students, as prioritizing sleep could lead to better academic outcomes.
  • Limitations of the study include self-reported data and the lack of control for other variables that could impact academic performance.

V. Conclusion

  • College students who prioritize sleep may see a positive impact on their academic performance.
  • These findings highlight the importance of sleep in academic success.
  • Future research could explore interventions to improve sleep quality in college students.

Example of Research Results in a Research Paper:

Our study aimed to compare the performance of three different machine learning algorithms (Random Forest, Support Vector Machine, and Neural Network) in predicting customer churn in a telecommunications company. We collected a dataset of 10,000 customer records, with 20 predictor variables and a binary churn outcome variable.

Our analysis revealed that all three algorithms performed well in predicting customer churn, with an overall accuracy of 85%. However, the Random Forest algorithm showed the highest accuracy (88%), followed by the Support Vector Machine (86%) and the Neural Network (84%).

Furthermore, we found that the most important predictor variables for customer churn were monthly charges, contract type, and tenure. Random Forest identified monthly charges as the most important variable, while Support Vector Machine and Neural Network identified contract type as the most important.

Overall, our results suggest that machine learning algorithms can be effective in predicting customer churn in a telecommunications company, and that Random Forest is the most accurate algorithm for this task.

Example 3:

Title: The Impact of Social Media on Body Image and Self-Esteem

Abstract: This study aimed to investigate the relationship between social media use, body image, and self-esteem among young adults. A total of 200 participants were recruited from a university and completed self-report measures of social media use, body image satisfaction, and self-esteem.

Results: The results showed that social media use was significantly associated with body image dissatisfaction and lower self-esteem. Specifically, participants who reported spending more time on social media platforms had lower levels of body image satisfaction and self-esteem compared to those who reported less social media use. Moreover, the study found that comparing oneself to others on social media was a significant predictor of body image dissatisfaction and lower self-esteem.

Conclusion: These results suggest that social media use can have negative effects on body image satisfaction and self-esteem among young adults. It is important for individuals to be mindful of their social media use and to recognize the potential negative impact it can have on their mental health. Furthermore, interventions aimed at promoting positive body image and self-esteem should take into account the role of social media in shaping these attitudes and behaviors.

Importance of Research Results

Research results are important for several reasons, including:

  • Advancing knowledge: Research results can contribute to the advancement of knowledge in a particular field, whether it be in science, technology, medicine, social sciences, or humanities.
  • Developing theories: Research results can help to develop or modify existing theories and create new ones.
  • Improving practices: Research results can inform and improve practices in various fields, such as education, healthcare, business, and public policy.
  • Identifying problems and solutions: Research results can identify problems and provide solutions to complex issues in society, including issues related to health, environment, social justice, and economics.
  • Validating claims: Research results can validate or refute claims made by individuals or groups in society, such as politicians, corporations, or activists.
  • Providing evidence: Research results can provide evidence to support decision-making, policy-making, and resource allocation in various fields.

How to Write Results in a Research Paper

Here are some general guidelines on how to write results in a research paper:

  • Organize the results section: Start by organizing the results section in a logical and coherent manner. Divide the section into subsections if necessary, based on the research questions or hypotheses.
  • Present the findings: Present the findings in a clear and concise manner. Use tables, graphs, and figures to illustrate the data and make the presentation more engaging.
  • Describe the data: Describe the data in detail, including the sample size, response rate, and any missing data. Provide relevant descriptive statistics such as means, standard deviations, and ranges.
  • Interpret the findings: Interpret the findings in light of the research questions or hypotheses. Discuss the implications of the findings and the extent to which they support or contradict existing theories or previous research.
  • Discuss the limitations: Discuss the limitations of the study, including any potential sources of bias or confounding factors that may have affected the results.
  • Compare the results: Compare the results with those of previous studies or theoretical predictions. Discuss any similarities, differences, or inconsistencies.
  • Avoid redundancy: Avoid repeating information that has already been presented in the introduction or methods sections. Instead, focus on presenting new and relevant information.
  • Be objective: Be objective in presenting the results, avoiding any personal biases or interpretations.

When to Write Research Results

Here are the situations in which to write research results:

  • After conducting research on the chosen topic and obtaining relevant data, organize the findings in a structured format that accurately represents the information gathered.
  • Once the data has been analyzed and interpreted, and conclusions have been drawn, begin the writing process.
  • Before starting to write, ensure that the research results adhere to the guidelines and requirements of the intended audience, such as a scientific journal or academic conference.
  • Begin by writing an abstract that briefly summarizes the research question, methodology, findings, and conclusions.
  • Follow the abstract with an introduction that provides context for the research, explains its significance, and outlines the research question and objectives.
  • The next section should be a literature review that provides an overview of existing research on the topic and highlights the gaps in knowledge that the current research seeks to address.
  • The methodology section should provide a detailed explanation of the research design, including the sample size, data collection methods, and analytical techniques used.
  • Present the research results in a clear and concise manner, using graphs, tables, and figures to illustrate the findings.
  • Discuss the implications of the research results, including how they contribute to the existing body of knowledge on the topic and what further research is needed.
  • Conclude the paper by summarizing the main findings, reiterating the significance of the research, and offering suggestions for future research.

Purpose of Research Results

The purposes of Research Results are as follows:

  • Informing policy and practice: Research results can provide evidence-based information to inform policy decisions, such as in the fields of healthcare, education, and environmental regulation. They can also inform best practices in fields such as business, engineering, and social work.
  • Addressing societal problems: Research results can be used to help address societal problems, such as reducing poverty, improving public health, and promoting social justice.
  • Generating economic benefits: Research results can lead to the development of new products, services, and technologies that can create economic value and improve quality of life.
  • Supporting academic and professional development: Research results can be used to support academic and professional development by providing opportunities for students, researchers, and practitioners to learn about new findings and methodologies in their field.
  • Enhancing public understanding: Research results can help to educate the public about important issues and promote scientific literacy, leading to more informed decision-making and better public policy.
  • Evaluating interventions: Research results can be used to evaluate the effectiveness of interventions, such as treatments, educational programs, and social policies. This can help to identify areas where improvements are needed and guide future interventions.
  • Contributing to scientific progress: Research results can contribute to the advancement of science by providing new insights and discoveries that can lead to new theories, methods, and techniques.
  • Informing decision-making: Research results can provide decision-makers with the information they need to make informed decisions. This can include decision-making at the individual, organizational, or governmental levels.
  • Fostering collaboration: Research results can facilitate collaboration between researchers and practitioners, leading to new partnerships, interdisciplinary approaches, and innovative solutions to complex problems.

Advantages of Research Results

Some Advantages of Research Results are as follows:

  • Improved decision-making: Research results can help inform decision-making in various fields, including medicine, business, and government. For example, research on the effectiveness of different treatments for a particular disease can help doctors make informed decisions about the best course of treatment for their patients.
  • Innovation: Research results can lead to the development of new technologies, products, and services. For example, research on renewable energy sources can lead to the development of new and more efficient ways to harness renewable energy.
  • Economic benefits: Research results can stimulate economic growth by providing new opportunities for businesses and entrepreneurs. For example, research on new materials or manufacturing techniques can lead to the development of new products and processes that can create new jobs and boost economic activity.
  • Improved quality of life: Research results can contribute to improving the quality of life for individuals and society as a whole. For example, research on the causes of a particular disease can lead to the development of new treatments and cures, improving the health and well-being of millions of people.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

APA Results Section – Explanation & Examples

The APA results section summarizes data and includes reporting statistics in a quantitative research study. The APA results section is an essential part of your research paper and typically begins with a brief overview of the data followed by a systematic and detailed reporting of each hypothesis tested. The interpreted results will then be presented in the discussion sections. Ensure you adhere to APA style guidelines consistently throughout the paper.

Table of contents

  • 1 APA Results Section – In a Nutshell
  • 2 Definition: APA results section
  • 3 What’s included in the APA results section?
  • 4 APA results section: Introducing the data
  • 5 APA results section: Summarizing the data
  • 6 APA results section: Reporting the results
  • 7 APA results section: Formatting numbers
  • 8 APA results section: Don’t include these

APA Results Section – In a Nutshell

  • The APA results section of empirical manuscripts reports the quantitative results of a study conducted on a data set.
  • The APA results section provides concrete evidence to disprove or confirm the hypothesis.

Definition: APA results section

The American Psychological Association recommends the APA style guide for presenting results in a manuscript. A research manuscript’s APA results section describes the researcher’s findings following a thorough data analysis and interpretation of the results. It uses obtained data to test or refute the theory of a research study.

What’s included in the APA results section?

The APA results section includes preliminary details on the data, participants, and statistics, and the results of the exploratory analysis, as discussed below.

  • Participants – The number of participants is reported at every study stage
  • Missing data – Identifying the amount of data excluded from the final analysis.
  • Adverse effects – Report any unforeseen events for clinical studies
  • Descriptive statistics – Summarize the secondary and primary outcomes of a study
  • Inferential statistics – Helps researchers draw conclusions and make predictions from the data.
  • Confidence interval and effect size – A confidence interval gives a range of plausible values for a population parameter such as the mean; an effect size quantifies the magnitude of a result.
  • Results of exploratory analysis – An exploratory analysis investigates the data to test a hypothesis, check assumptions, and find anomalies.

APA results section: Introducing the data

Before you discuss your research findings, start by clearly describing the participants at each study stage. If any data was excluded from the eventual analysis, indicate that too.

Participants

Recruitment, participant flow, and attrition should be reported. Attrition bias can threaten both external and internal validity and lead to erroneous results.

A flow chart is often the best way to report the number of participants per group per stage and their reasons for attrition. Below is an example of how to report participant flow.

  • 25% of the 400 participants who signed up and completed the first survey were eliminated for not fitting the research criteria.
  • 15% didn’t use fiber optics internet exclusively.
  • 10% did not have internet access.
  • 300 participants progressed to the final survey round for a gift bag.
  • 52 people didn’t complete the survey.

This resulted in 248 research participants.
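The arithmetic behind this participant flow can be checked with a short Python sketch. The function name `participant_flow` and the stage labels are illustrative assumptions; the numbers are those given in the example above.

```python
def participant_flow(n_start, exclusions, n_dropouts):
    """Track how many participants remain after each stage.

    exclusions : dict mapping reason -> fraction of n_start excluded
    n_dropouts : participants who did not complete the final survey
    """
    remaining = n_start
    stages = [("Completed first survey", remaining)]
    for reason, fraction in exclusions.items():
        remaining -= round(n_start * fraction)
        stages.append((f"After excluding: {reason}", remaining))
    remaining -= n_dropouts
    stages.append(("Completed final survey", remaining))
    return stages


flow = participant_flow(
    400,
    {"not exclusively fiber optics internet": 0.15, "no internet access": 0.10},
    52,
)
for label, n in flow:
    print(f"{label}: n = {n}")
```

Keeping the count for every stage, rather than only the final sample size, provides exactly the numbers a flow chart needs.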

Missing data & adverse effects

In any study, missing data must be reported. Unexpected events, poor storage, and equipment failures can cause missing data. In any instance, clearly explain why you couldn’t use the data.

Data outliers can be excluded from the final study, but you must explain why. Include how you handled missing data. Standard procedures include mean-value imputation, interpolation, extrapolation, and substitution.

  • Results of 33 participants were excluded from the study as they did not meet the research criteria.
  • The data for another 4 participants were lost due to human error.
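Of the procedures mentioned above, mean-value imputation is the simplest to illustrate. The following is a minimal sketch with hypothetical scores, where `None` marks a missing response; the function name `impute_mean` is an assumption made for this example.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical scores; None marks a missing response.
scores = [4, None, 6, 8, None, 2]
imputed = impute_mean(scores)
n_missing = sum(v is None for v in scores)
print(imputed, f"({n_missing} of {len(scores)} values imputed)")
```

Mean imputation leaves the sample mean unchanged but reduces the variance, which is one reason the handling method and the number of imputed values should be reported.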

APA results section: Summarizing the data

It is important to note that you should provide a summary of your study’s results. However, you can create a supplemental archive for other researchers to access raw data.

Descriptive Statistics

Descriptive statistics are concise coefficients that summarize a specific data collection, such as a sample drawn from a population. The APA results section can include descriptive statistics such as:

  • Central tendency measures describe a data set by identifying the center of the data set (mode, median, mean).
  • Measures of variability describe the score dispersion within a data set (standard deviation, range, variance, and interquartile range).
  • Sample sizes
  • Variables of interest, which are measured, changing quantities in experimental studies. Be sure to explain how you operationalized any variable of interest you use.
  • 20 athletes in five trials were given 400 mg of a performance-enhancing substance to measure their speed (m/s) and reaction time (s).
  • After averaging each athlete’s speed and response time, the group’s averages were calculated.

The group that used the performance-enhancing drug had a higher speed (m/s) than the group that did not use the drug ( M = 4, SD=1.25 )
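A minimal sketch of how such summary values can be computed with Python's standard `statistics` module. The speed values below are hypothetical and are not the data behind the example above.

```python
from statistics import mean, median, mode, stdev

# Hypothetical average speeds (m/s) for six athletes.
speeds = [2.5, 3.0, 4.0, 4.0, 4.5, 6.0]

m = mean(speeds)                          # central tendency: mean
med = median(speeds)                      # central tendency: median
mo = mode(speeds)                         # central tendency: mode
sd = stdev(speeds)                        # variability: sample standard deviation
value_range = max(speeds) - min(speeds)   # variability: range

# APA-style summary string for the mean and standard deviation.
summary = f"M = {m:.2f}, SD = {sd:.2f}"
print(summary)
```

Note that `stdev` computes the sample standard deviation (dividing by n - 1), which is the usual choice when the data are a sample rather than the whole population.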

APA results section: Reporting the results

APA journal standards require all the appropriate hypothesis tests, confidence intervals, and effect size estimates to be reported in the APA results section.

Inferential statistics

Inferential statistics help researchers draw conclusions and make predictions based on the data.

When you are reporting the inferential statistics in the APA results section, use the following:

  • Degrees of freedom
  • Test statistic (such as the z-score, t-value, or F-ratio)
  • Error term (if needed; it is not included in correlations and non-parametric tests)
  • The exact p-value (unless it is below .001)

In keeping with the hypotheses, athletes who take performance-enhancing drugs have increased reaction times and speeds, t(20) = 1s, p < .001
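APA style reports exact p-values without a leading zero and switches to "p < .001" for very small values. The sketch below shows one way to apply that convention; the helper names `format_p` and `report_t` and the example values are hypothetical.

```python
def format_p(p):
    """Format a p-value APA-style: three decimals, no leading zero, floor at < .001."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if p < 0.001:
        return "p < .001"
    text = f"{p:.3f}"
    # Drop the leading zero ("0.031" becomes ".031"); "1.000" is kept as is.
    return "p = " + (text[1:] if text.startswith("0") else text)

def report_t(df, t_value, p):
    """Assemble a t-test report: degrees of freedom, test statistic, p-value."""
    return f"t({df}) = {t_value:.2f}, {format_p(p)}"

# Hypothetical values, for illustration only.
line = report_t(20, 2.5, 0.021)
print(line)
print(format_p(0.0004))
```

Keeping the formatting in one helper makes every reported test in the section follow the same convention.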

Confidence intervals & effect sizes

A confidence interval can be described as a range of possible values for the mean derived from the sample data. It helps show the variability that is around point estimates. You should include confidence intervals any time you report estimates for population parameters.

Night guards consume an average of 600 mg of caffeine weekly, 93% CI [90, 200]
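As a rough sketch, a confidence interval for a mean can be computed from the sample mean and its standard error. The version below assumes a normal approximation (a z critical value); for small samples a t critical value is more appropriate. The function name `mean_ci` and the caffeine values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci(sample, level=0.95):
    """Confidence interval for the mean using a normal (z) approximation."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))       # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)    # two-sided critical value
    return m - z * se, m + z * se

# Hypothetical weekly caffeine intake (mg) for five night guards.
caffeine = [560, 580, 600, 620, 640]
lower, upper = mean_ci(caffeine)
print(f"M = {mean(caffeine):.0f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

A point estimate should always fall inside its own interval, which is a quick consistency check before reporting.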

Effect size measures the magnitude of an observed effect and so indicates its practical significance. Since effect size is an estimate, confidence intervals should be included.

Moderate amounts of performance-enhancing drugs increase speed significantly, Cohen’s d = 1.4, 93% CI [0.92, 1.57]
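Cohen's d is the difference between two group means divided by their pooled standard deviation. A minimal sketch for two independent groups, using hypothetical speed data:

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical speeds (m/s) for a treatment group and a control group.
treatment = [5.0, 6.0, 7.0]
control = [3.0, 4.0, 5.0]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

The confidence interval around d is usually obtained from a statistics package or by bootstrapping, which this sketch does not cover.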

Subgroup & exploratory analyses

Exploratory analysis tests a hypothesis, checks assumptions, and finds patterns and anomalies in data. If you find notable results, report them as exploratory, not confirmatory, to avoid overstating their value.

APA results section: Formatting numbers

Use figures, text, and tables to show numbers in APA results sections properly.

✓ For three or fewer numbers, use a sentence; for 4 to 20 numbers, use a table; for more than 20 numbers, use a figure.

✓ Number and title the APA tables and figures, and include any relevant notes. If you have already presented the data in a table, do not repeat it in a figure and vice versa.

✓ Statistics in your APA results section must be abbreviated, capitalized, and italicized.

✓ Use APA norms for reporting statistics and writing numbers.

✓ Look up these guidelines if you are unsure how to present certain symbols.

APA results section: Don’t include these

Besides knowing what to include in an APA results section, it is just as important to know what not to have. Below is an outline of what you should exclude from an APA results section.

The APA results section should have results that are presented concisely.

Save any interpretation for the discussion section and only objectively report findings in the APA results section.
Assume the readers have professional knowledge of statistical inferences.
Only include data relevant to the research question in the APA results section.

What should be included in the APA results section?

The APA results section should include details on the participants, descriptive statistics and inferential statistics , missing data , and the results of any exploratory analysis.

What tense should I use to write my results?

Write the APA results section in the past tense.

When should I include tables and figures?

Include tables and figures if you will discuss them in the body text of the APA results section.

Data Analysis of Students Marks with Descriptive Statistics

  • 2(5):2321-8169

Bhawana Mathur at Sri Balaji College of Engineering and Technology

Manju Kaushik at JECRC University

A geographical analysis of social enterprises: the case of Ireland

Social Enterprise Journal

ISSN : 1750-8614

Article publication date: 29 April 2024

Issue publication date: 4 July 2024

Purpose

This study aims to conduct a geographical analysis of the distribution and type of activities developed by social enterprises in rural and urban areas of Ireland.

Design/methodology/approach

The study analyses data of more than 4,000 social enterprises against a six-tier rural/urban typology, using descriptive statistics and non-parametric tests to test six hypotheses.

Findings

The study shows a geographical rural–urban pattern in the distribution of social enterprises in Ireland, with a positive association between the remoteness of an area and the ratio of social enterprises, and a lack of capital-city effect related to the density of social enterprises. The analysis also shows a statistically significant geographical rural–urban pattern for the types of activities developed by social enterprises. The authors observe a positive association between the remoteness of the areas and the presence of social enterprises operating in the community and local development sector, whereas the association is not significant for social enterprises developing welfare services.

Research limitations/implications

The paper shows the potential of using recently developed rural–urban typologies and tools such as geographical information systems for conducting geographical research on social enterprises. The findings also have implications for informing spatially sensitive policymaking on social enterprises.

Originality/value

The merging of a large national data set of social enterprises with geographical tools and data at subregional level contributes to the methodological advancement of the field of social enterprises, providing tools and frameworks for a nuanced and spatially sensitive analysis of these organisations.

  • Rural social enterprises
  • Urban social enterprises
  • Quantitative research
  • Social economy organisations

Olmedo, L., O’Shaughnessy, M. and Holloway, P. (2024), "A geographical analysis of social enterprises: the case of Ireland", Social Enterprise Journal, Vol. 20 No. 4, pp. 499-521. https://doi.org/10.1108/SEJ-09-2023-0105

Emerald Publishing Limited

Copyright © 2024, Lucas Olmedo, Mary O’Shaughnessy and Paul Holloway.

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial & non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode

1. Introduction

Social and solidarity economy organisations, and especially social enterprises, have recently been brought to the fore by international institutions including the European Commission, the Organisation for Economic Co-operation and Development (OECD) and the United Nations (European Commission, 2021; OECD, 2022; United Nations, 2023). These institutions acknowledge the contribution and potential of social enterprises to address complex challenges such as climate change, ageing populations and lack of access to employment for vulnerable groups, mainly because social enterprises combine social and/or environmental aims with an economic activity and democratic decision-making (Galera and Borzaga, 2009; Defourny and Nyssens, 2017).

Social enterprises in Ireland have traditionally been considered relevant actors providing goods and services to disadvantaged communities and enabling work integration of vulnerable groups (O’Hara and O’Shaughnessy, 2021). In 2019, the Irish Government launched the first National Social Enterprise Policy for Ireland, representing a milestone for the recognition and institutionalisation of social enterprises in the country (Olmedo et al., 2021). This policy establishes an official definition of social enterprises as follows:

An enterprise whose objective is to achieve a social, societal or environmental impact, rather than maximising profit for its owners or shareholders. It pursues its objectives by trading on an ongoing basis through the provision of goods and/or services, and by reinvesting surpluses into achieving social objectives. It is governed in a fully accountable and transparent manner and is independent of the public sector. If dissolved, it should transfer its assets to another organisation with a similar mission (Government of Ireland, 2019, p. 8).

This policy recognises, in line with previous research reports on Irish social enterprises (Hynes, 2016; European Commission, 2020), the contribution of Irish social enterprises in delivering a wide range of goods and services and in supporting the attainment of government policy goals in areas such as labour market activation, health care, climate action, social cohesion and rural development.

Despite common features shared across social enterprises, previous research has highlighted differences between social enterprises operating in rural and urban areas in terms of their community focus, leadership style and funding sources (Smith and McColl, 2016; Barraket et al., 2019). The geographical context where social enterprises operate has been acknowledged as a relevant factor for explaining the work of these organisations (Steiner and Teasdale, 2019; Olmedo et al., 2023) and their contribution to urban and rural development (Angelidou and Mora, 2019; Olmedo and O’Shaughnessy, 2022). Geographically sensitive research on social enterprises has been developed mainly at the local level (Mazzei, 2017; Jammulamadaka and Chakraborty, 2018; Pinch and Sunley, 2016), with some research also conducted at the regional level (Buckingham et al., 2011; Woo and Jung, 2023); however, less is known about the differences in the distribution and the type of activities that social enterprises develop in different rural–urban areas of a country. Therefore, the aim of this paper is to explore the distribution and type of activities developed by social enterprises in different rural and urban areas in Ireland.

To achieve this, we use a six-tier rural–urban typology developed by the Irish Central Statistics Office (CSO, 2019) combined with data on 4,335 social enterprises collected in Ireland. Using geographical information systems (GIS), we georeferenced social enterprises and tested six hypotheses. The spatially sensitive and quantitative empirical data analysis provided by this study adds knowledge to previous calls for geographical research on social enterprises (Munoz, 2010) and provides relevant evidence for the development of spatially sensitive policies for social enterprises (Mazzei and Roy, 2017).

The rest of the paper is structured as follows. Section 2 presents a literature review on previous geographical research on social enterprises. Section 3 outlines the research framework and the hypotheses of this study. Section 4 explains the methodology used in the research. Section 5 presents the findings of this study, with a subsection presenting descriptive statistics and another presenting the analysis of the hypotheses tested. Section 6 discusses the findings and Section 7 outlines the conclusions and limitations of this study, ending with some proposals for further research.

2. Literature review – geographical research on social enterprises

The publication in 2010 of a seminal call “Towards a geographical research agenda for social enterprise” (Munoz, 2010) marked a significant milestone for the development of a geographically sensitive perspective towards the study of social enterprises. Within this body of research [1], some authors have adopted a micro-geographical perspective to study social enterprises as spaces of well-being (Munoz et al., 2015). For example, Farmer et al. (2020) used GIS to link specific sites within a social enterprise to the well-being experienced by the employees of three Australian Work Integration Social Enterprises. Their findings show how the social enterprises studied acted as “socially-supportive workplaces which focus on deploying, developing and supporting talents and not simply allocating people to one job in one location for all time” (Farmer et al., 2020, p. 9).

Another stream of studies has focused on the local geography of social enterprises (Jammulamadaka and Chakraborty, 2018). Some of these studies have a specific urban focus. For example, Pinch and Sunley (2016) investigated whether social enterprises in four major UK cities benefited from urban agglomeration effects, concluding that agglomeration enables greater demand for social enterprises’ goods and services and better access to institutional support, funding, knowledge and networks. Similarly, Mazzei (2017) stressed the influence of “place” on the incentives and opportunities for two social enterprises operating within English cities.

Previous research has also taken a geographical perspective to study social enterprises in rural areas. Drawing from social network theory, Richter (2019) showed how social enterprises operating in rural Austria and Poland act as embedded intermediaries between their localities and supra-regional networks. In studies conducted in rural Scotland, Steiner and Steinerowska-Streb (2012) and Steiner and Teasdale (2019) stated that rural areas are a fertile ground for social enterprises due to characteristics associated with rurality, such as reduced market competition and high levels of social capital. Moreover, these studies explain how rural social enterprises use advantages of the rural context, such as the skills and knowledge of retired people who moved to rural localities, to develop social entrepreneurial activities. In a study conducted in rural Scotland exploring the role of social enterprises in addressing social isolation and loneliness, Kelly et al. (2019) concluded that although these organisations offer more flexible solutions than statutory services, relying on social enterprises as solutions to these challenges is not realistic. This was attributed to features associated with the rural context of the study, such as remoteness, small labour markets and depopulation.

This echoes research on social enterprises in rural Ireland conducted by O’Shaughnessy and O’Hara (2016), who stated that the geographic isolation and limited job creation associated with the rural context challenge the development of social enterprises. More recently, Olmedo et al. (2023) showed how social enterprises in three Irish rural localities, through a process of “placial substantive hybridity”, harness and (re)valorise untapped local resources and complement these with extra-local resources to foster social innovation and contribute to an integrated development of their localities.

Geographical research has also been conducted comparing social enterprises operating in rural and urban localities. Smith and McColl (2016) explored the influence of context on four social enterprises based in Scottish urban and rural communities. The authors found that rural social enterprises show a strong linkage between the geographical characteristics of where they are based, their community identity and ownership, and the type of business developed. By contrast, in the urban social enterprises they studied, it was a social need rather than a geographical aspect which drove the organisations’ aim. In a study conducted in Australia, Barraket et al. (2019) compared the resourcefulness strategies of 11 locally oriented urban and rural social enterprises. The authors showed the great relevance of community networks for rural-based social enterprises to access financial and physical assets; however, those social enterprises based in urban areas were more inclined to leverage public funding related to welfare objectives and resources from corporates.

Despite the plethora of research investigating social enterprises at the urban and rural levels, few studies have researched social enterprises from a regional perspective. In this regard, Buckingham et al. (2011) attempted to unmask the “enigmatic regional geography of social enterprises in the UK” using statistical data from different surveys related to social enterprises conducted between 2005 and 2009. The authors concluded that interregional variations (north–south and east–west) were relatively small and without statistical significance, except for high levels of social enterprise activity in London due to its dynamic and innovative business environment and the effect that the headquarters location of national social enterprises (mainly in London) might have on the data. More recently, Woo and Jung (2023) have explored the regional determinants of the emergence of social enterprises in South Korea. Combining longitudinal data sets (2012–2019) from the Korea Social Enterprise Promotion Agency and Korea Statistics and using an entrepreneurial ecosystems perspective, the authors concluded that the emergence of social enterprises is especially significant in regions experiencing government or market failure and in regions with greater incidences of start-ups, human capital and financial resources.

At the national (country) and international level, research on social enterprises has been mainly conducted from an institutional perspective, influenced by the seminal work of Kerlin (2013) and the International Comparative Social Enterprise Models project (Defourny and Nyssens, 2017; Defourny et al., 2020), with few studies adopting a geographical perspective. A notable exception can be found in a study conducted by Douglas et al. (2018) exploring social enterprises in Fiji, in which the geography of the country, a small remote island in the Pacific Ocean, is considered (together with its history and its social, economic, political and cultural institutions) a determinant factor shaping social enterprises in the country.

In summary (see Table 1), the review of the literature shows how geographical research on social enterprises has been conducted at various levels, from micro-organisational to national level; however, to date this research has predominantly focused on the influence of local geographical elements in shaping the work of social enterprises. Within the local level, urban and rural localities have been subject to research and some differences have been identified in the ways rural–urban social enterprises operate. Regarding the methodologies used, most geographical research on social enterprises has used qualitative methods, with some exceptions in studies that take a regional perspective. In these instances, studies have predominantly used existing survey data and registers of social enterprises (Woo and Jung, 2023). In terms of theoretical perspectives, some studies are based on economic geography theories such as agglomeration and cluster theory (e.g. Pinch and Sunley, 2016; Jammulamadaka and Chakraborty, 2018) and concepts such as “place” borrowed from human geography (e.g. Mazzei, 2017; Olmedo et al., 2023). However, the studies reviewed generally use theories from disciplines such as sociology (e.g. social network theory) and business/entrepreneurship (e.g. entrepreneurial ecosystems), complementing these with spatially sensitive elements such as the use of methodological tools like GIS in their analysis (Farmer et al., 2020), the multi-scalar analysis of networks (Richter, 2019) or a spatial rural–urban comparison of the cases studied (Barraket et al., 2019).

Despite the significant progress of geographical research on social enterprises in recent years, studies have focused on how geographical elements of the context influence the features and work of social enterprises, rather than exploring the basic and critical question (for research and policy) of how social enterprises are geographically distributed, and why. According to Buckingham et al. (2011, p. 90), it “seems likely that the most significant geographical differences in the distribution of social enterprises are to be found at the sub-regional level […] and there is clearly a need for further, more fine-grained investigation” (see also Steiner et al., 2019). This study aims to fill this gap for the case of Ireland by exploring the distribution and type of activities developed by social enterprises in rural and urban areas. To do so, we draw from a combination of increasingly complex thinking about rural–urban spatial heterogeneity, the advancement of methodological tools for rural–urban spatial classification at sub-regional level and from statistical information gathered on Irish social enterprises.

3. Research framework and hypothesis

3.1 Territorial and rural–urban classifications

This paper is based on a geographical perspective towards the study of social enterprises in Ireland, and more specifically on the analysis of social enterprises in rural and urban areas. The definition of what constitutes a rural and urban area has been subject to extensive debate (see, for example, Mantino et al., 2023; Eurostat, 2021). Within Europe there is no definitive agreement between Member States on what is considered a rural/urban area; for example, in Ireland, rural areas are defined in terms of settlements with a population of fewer than 1,500 persons (CSO, 2019), whereas in Spain rural areas are considered as those municipalities with fewer than 5,000 inhabitants but also those with fewer than 30,000 inhabitants and a density lower than 100 inhabitants/km² (Government of Spain, 2007). These definitions classify rural–urban areas mainly in terms of population densities.
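The two national definitions quoted above can be written as simple rules, which makes the difference between them easy to see. This is only an illustrative sketch of the thresholds stated in the text; the function names are hypothetical and the official definitions contain further detail.

```python
def is_rural_ireland(settlement_population):
    """Irish definition as quoted above: settlements with fewer than 1,500 persons."""
    return settlement_population < 1500

def is_rural_spain(municipality_population, density_per_km2):
    """Spanish definition as quoted above.

    Rural if the municipality has fewer than 5,000 inhabitants, or if it has
    fewer than 30,000 inhabitants and a density below 100 inhabitants/km².
    """
    if municipality_population < 5000:
        return True
    return municipality_population < 30000 and density_per_km2 < 100

# A hypothetical place with 4,000 inhabitants at 150 inhabitants/km²
# is rural under the Spanish rule but not under the Irish one.
print(is_rural_ireland(4000))
print(is_rural_spain(4000, 150))
```

The same place can therefore fall on different sides of the rural/urban line depending on the Member State's definition.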

The OECD (2006) classification is based on three criteria:

  • population density;
  • the percentage of the population of a region living in rural communities; and
  • the presence of large urban centres in such regions.

According to these criteria, NUTS 3 [2] regions are classified into Predominantly Rural, Intermediate and Predominantly Urban (OECD, 2006) [3]. This methodology has been revised by Eurostat (2021), incorporating finer-grained data at Local Administrative Units Level 2 (LAU2) and grid cells of 1 km² to categorise territories into cities, towns, semi-dense areas and rural areas. Eurostat (2021) has also included a further subclassification based on population density and size. Towns and semi-dense areas were sub-divided into dense towns, semi-dense towns and suburban or peri-urban areas. Rural areas were also sub-divided into villages, dispersed rural areas and mostly inhabited rural areas. This finer classification replaces an abrupt differentiation between urban and rural areas with a rural–urban continuum that acknowledges the heterogeneity of both.

Besides the classification of rural–urban areas based on population density and size, classifications based on the functions and relations between areas have also been developed (Mantino et al., 2023). These classifications tend to incorporate indicators related to economic factors, for example, economic growth/decline, the degree of productive activities (agriculture, forestry, manufacturing and construction) and consumption activities (tourism, recreation, housing and services) (Copus et al., 2011). Environmental indicators, for example, those related to ecosystem functions (climate regulation, water supply and regulation, soil retention and formation, biodiversity), are also incorporated to classify rural–urban areas based on their (multi)functionality (Mantino et al., 2023). A key aspect of the relationship between rural–urban areas includes the mobility of workers and the access to services. In this regard, indicators of proximity related, for example, to the time needed to access services and infrastructures have also been considered in the classification of rural–urban areas (Eurostat, 2021).

These functional classifications are usually interlinked with the abovementioned rural–urban classifications based on population density, creating increasingly nuanced typologies through multiple criteria that reflect the complexity of relationships between urban and rural areas (Perpiñá Castillo et al., 2022). In this line, the Central Statistics Office of Ireland (CSO) developed in 2019 a six-tier rural–urban typology (CSO, 2019). This typology was developed using the place of work as a measure of distance to services and amenities, combined with population density from Census 2016. The typology is applied to small area population (SAP), and includes the following six categories: cities, satellite urban towns, independent urban towns, rural areas with high urban influence, rural areas with moderate urban influence and highly rural/remote areas (see Table 2).

3.2 Hypothesis development

Our study uses the typology developed by the CSO to conduct a geographical analysis of social enterprises in Ireland. Based on this framework, and some of the characteristics of social enterprises presented in the literature review of this paper, six hypotheses have been developed.

Previous studies have suggested that social enterprises are influenced by their geographical context, with differences in the spread of social enterprises in rural and urban areas (Buckingham et al., 2011; CEIS and Social Value Lab, 2023). Some studies stress that rural areas represent a fertile ground for social enterprises (Steiner and Steinerowska-Streb, 2012) and that social enterprises tend to emerge and develop in regions experiencing government or market failure (Woo and Jung, 2023). However, the studies of Buckingham et al. (2011) and Pinch and Sunley (2016) suggest that capital cities attract social enterprises. They attribute this capital-city effect to a dynamic and innovative business environment, the location of the headquarters of national social enterprises, greater demand for social enterprises’ goods and services and better access to institutional support, funding, knowledge and networks, that is, to more supportive social entrepreneurial ecosystems (see also Diaz Gonzalez and Dentchev, 2020).

H1. The presence of social enterprises is significantly associated with the type of rural–urban areas.

H2. The presence of social enterprises is positively associated with areas with lower population density and greater distance to services and amenities (remoteness).

H3. The presence of social enterprises within the capital city (Dublin) is significantly higher compared to the national average and to other rural and urban areas of Ireland.

Previous research has also pointed towards the influence of the geographical context in the activities developed by social enterprises (Mazzei, 2017; Smith and McColl, 2016). Looking at rural–urban differences and the sector of activities of social enterprises, research has highlighted the key role of social enterprises in community and local development in (remote) rural areas (van Twuijver et al., 2020; Olmedo et al., 2023) and in providing services related to welfare objectives in urban centres (Barraket et al., 2019).

H4: There is a significant relationship between the sectors of activity in which social enterprises operate and the type of rural–urban area in which they are based.

H5: There is a positive association between areas with lower population density and greater distance to services and amenities (remoteness) and the presence of social enterprises in the sector of community and local development.

H6: There is a negative association between areas with lower population density and greater distance to services and amenities (remoteness) and the presence of social enterprises operating in sectors related to welfare objectives.

4. Methodology

Nationwide data on Irish social enterprises were obtained from a social enterprise baseline data collection exercise conducted in 2022. This exercise followed a bottom-up methodology in which a population of social enterprises for Ireland was built from lists of social enterprises provided by 36 intermediary organisations and public institutions delivering social enterprise programmes [4]. The population included 4,335 organisations; geographical-location information was gathered for 4,234 of them and data about their sector of activity for 4,329.

The location of each social enterprise was georeferenced using the organisation’s Eircode (the Irish equivalent of a postal code/zip code), which allows for precise geolocation. The Eircode was either provided by the social enterprise or, when not available, obtained by entering the organisation’s address into the “Eircode finder” website. Geographical coordinates for each Eircode were obtained using ArcGIS Online. Once the coordinates were obtained, each social enterprise was mapped using the software QGIS [5].

Data on the CSO rural–urban typology, containing information about the type of area (six categories) and population [6], were obtained from the Ordnance Survey Ireland – Open Data Portal [7]. The rural–urban typology developed by the CSO (2019) was applied at small-area level. Small areas are the lowest level of geography for the compilation of statistics by the CSO in line with data protection guidelines and typically contain between 80 and 120 dwellings (CSO, 2019). A shapefile of small areas (ungeneralised, National Statistical Boundaries) was used; it subdivides the territory of the Republic of Ireland into 18,641 small areas. The small-area information was vectorised and mapped using QGIS. Information about the six rural–urban categories was joined to each small area within QGIS and a choropleth map was created to differentiate between the types of rural–urban areas. Colours from light green (rural areas with high urban influence) to dark green (highly rural/remote areas) were used for rural areas, whereas dark red was used for cities, light red for satellite urban towns and pink for independent urban towns (see Figure 1).

The statistical analysis of this study includes three variables: type of rural–urban area, presence of social enterprises and sector of activity of social enterprises. As the aforementioned six-tier typology combines population density with distance to services and amenities, the categories were ordered according to their level of remoteness, creating an ordinal variable in which cities are coded as 1 (least remote) and highly rural/remote areas as 6 (most remote). The presence of social enterprises was calculated as the number of social enterprises per 10,000 inhabitants, following international guidelines from previous social enterprise census/baseline studies (see, for example, CEIS and Social Value Lab, 2023). The activities of social enterprises were codified following the sectoral categories of the Scottish social enterprise census. This decision was made given the similarities between the two countries (Scotland and Ireland) and Scotland's long experience in constructing this census.
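As an illustration of the two derived variables, the following minimal Python sketch recomputes the ratio of social enterprises per 10,000 inhabitants from the counts and populations reported in Table 3. The remoteness codes 2 to 5 are an assumption: the text only fixes cities at 1 and highly rural/remote areas at 6, and the intermediate order is inferred here from the order of the categories in the tables. The function names are illustrative and are not taken from the authors' R scripts.

```python
# Counts and populations as reported in Table 3 of this paper.
# Remoteness codes 2-5 are an ASSUMPTION (only 1 and 6 are stated in the text).
AREAS = [
    # (remoteness code, area type, social enterprises, population)
    (1, "Cities", 1058, 1_567_945),
    (2, "Satellite urban towns", 293, 597_355),
    (3, "Independent urban towns", 991, 770_329),
    (4, "Rural areas with high urban influence", 447, 754_794),
    (5, "Rural areas with moderate urban influence", 580, 587_041),
    (6, "Highly rural/remote areas", 865, 412_457),
]


def ratio_per_10k(enterprises, population):
    """Social enterprises per 10,000 inhabitants."""
    return enterprises / population * 10_000


def pooled_ratio(rows):
    """Ratio for several area types combined: sum the counts, then divide."""
    total_se = sum(row[2] for row in rows)
    total_pop = sum(row[3] for row in rows)
    return ratio_per_10k(total_se, total_pop)


ratios = {name: round(ratio_per_10k(se, pop), 1) for _, name, se, pop in AREAS}
urban = [row for row in AREAS if row[0] <= 3]
rural = [row for row in AREAS if row[0] >= 4]

for code, name, _, _ in AREAS:
    print(code, name, ratios[name])
print("urban total", round(pooled_ratio(urban), 1))
print("rural total", round(pooled_ratio(rural), 1))
print("national", round(pooled_ratio(AREAS), 1))
```

The rural, urban and national figures are pooled ratios (total enterprises over total population), not averages of the category ratios, which is why they are weighted towards the more populous categories.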

Statistical analysis for this study was conducted using the software R, version 4.2.2, within RStudio. We conducted a descriptive analysis of the variables before undertaking bivariate analyses to test our hypotheses. Due to the (partially categorical) nature of our data, we used non-parametric statistical tests such as the Kruskal–Wallis H test (including a post hoc Dunn’s test), the chi-square test and the Jonckheere–Terpstra test to investigate our hypotheses. The specific tests used for each hypothesis are explained in the following section.

5.1 Descriptive statistics

Social enterprises are distributed across rural and urban areas of Ireland (see Figure 2). In terms of total number, social enterprises are concentrated in the counties containing the most populated Irish cities, such as Dublin (17.9% of total social enterprises) and Cork (10.5%) (see Figure 3). However, when considering the ratio of social enterprises to population (social enterprises per 10,000 inhabitants), the highest ratios are found in the north and northwest of the country (see Figure 4) and in counties with a high density of rural areas, such as Leitrim (26.2 social enterprises per 10,000 inhabitants), Donegal (18.5), Monaghan (17.3) and Mayo (16.5).

The descriptive analysis of social enterprises in relation to the rural–urban typology (see Table 3) shows that rural areas present a higher ratio of social enterprises (10.8 social enterprises per 10,000 inhabitants) than urban areas (8.0). However, the ratios differ considerably across the rural and urban subcategories: highly rural/remote areas have a ratio of 21.0 social enterprises per 10,000 inhabitants, against 5.9 in rural areas with high urban influence. Within urban areas, independent urban towns have a higher concentration of social enterprises (12.9) than cities (6.7) and satellite urban towns (4.9).

The descriptive statistical analysis of the sectors of activity of Irish social enterprises also shows some differences between rural and urban areas (see Table 4). For example, over 20% of social enterprises within each type of rural area focus on community infrastructure and local development, whereas only 7.9% of social enterprises in cities operate within this sector. On the other hand, approximately 20% of social enterprises in cities and satellite urban towns develop activities related to health, youth services and social care, whereas in rural areas less than 10% of social enterprises operate within this sector. Social enterprises in sectors such as training and work integration, and information and support services, are more prominent in cities: approximately 10% of city-based social enterprises operate in each of these sectors, whereas each represents less than 5% of the total social enterprises based in Irish rural areas.

5.2 Hypothesis testing

Based on previous literature, we developed six hypotheses about the distribution and sectors of activity of social enterprises in rural and urban areas in Ireland (see Appendix for the results of the statistical tests conducted).

H1 stated that the presence of social enterprises (measured by the ratio of social enterprises per 10,000 inhabitants) is significantly associated with the type of rural–urban area (operationalised following the six-tier typology developed by the Irish CSO). To test this hypothesis we conducted a Kruskal–Wallis H test, a non-parametric alternative to one-way ANOVA suitable for assessing differences among three or more groups of a categorical/ordinal variable (rural–urban typology) on a non-normally distributed continuous variable (social enterprise ratio) (Vargha and Delaney, 1998). The results show a statistically significant relationship between the variables (p < 0.01), supporting H1. As the rural–urban typology comprises six categories, a post hoc Dunn test with Bonferroni adjustment (Dinno, 2022) was conducted to compare each pair of categories. The results show significant differences for all pairs of categories except “cities and satellite urban towns” and “cities and rural areas with high urban influence”.
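The two steps of the H1 test can be sketched in plain Python, under stated assumptions. The first part computes a Kruskal–Wallis H statistic on three small groups of hypothetical values (not the study's small-area data); with three groups the statistic has 2 degrees of freedom, for which the chi-square upper tail is simply exp(−H/2). The second part applies the Bonferroni adjustment to a few of the unadjusted Dunn p-values reported in the appendix: with six categories there are 15 pairwise comparisons, so each p-value is multiplied by 15 and capped at 1. The function names are illustrative and this is not the authors' R code.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with the standard correction for ties."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(c ** 3 - c for c in (pooled.count(v) for v in set(pooled)))
    return h / (1 - ties / (n ** 3 - n))


def bonferroni(p_values, n_comparisons):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    return [min(1.0, p * n_comparisons) for p in p_values]


# HYPOTHETICAL ratios for three area types (illustration only).
toy_groups = [[1.2, 3.4, 2.2], [5.1, 6.3, 4.8], [9.9, 8.7, 7.5]]
h = kruskal_wallis_h(toy_groups)
p_value = math.exp(-h / 2)  # chi-square upper tail, valid for df = 2 only

# Unadjusted Dunn p-values taken from the appendix table of this paper.
n_pairs = len(list(combinations(range(6), 2)))  # 15 pairwise comparisons
unadjusted = [1.26e-10, 0.000111, 0.005563, 0.379665]
adjusted = bonferroni(unadjusted, n_pairs)
print(h, p_value, adjusted)
```

The third value illustrates why the pair “rural areas with high urban influence and cities” is not significant after adjustment: its unadjusted p-value is below 0.05, but multiplying by 15 moves it above that threshold.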

H2 refers to the positive association between the presence (ratio) of social enterprises and areas with lower population density and greater distance to services and amenities (remoteness). The six rural–urban categories were ordered into an ordinal variable from 1 to 6 according to their degree of “remoteness”. To test the (positive) directional association between the ratio of social enterprises and the rural–urban areas ordered by remoteness, we conducted a Jonckheere–Terpstra test, a non-parametric test similar to the Kruskal–Wallis H test but preferred when the groups are assumed to be arranged in order (ascending or descending) (Ali et al., 2015). The results show a significant positive association (p < 0.01) between the remoteness of the rural–urban areas studied and the presence (ratio) of social enterprises, supporting H2.

H3 refers to the significantly higher presence (ratio) of social enterprises within the capital city (Dublin) compared to the national average and to other rural–urban areas of Ireland. To test this hypothesis, we first calculated the ratio of social enterprises for the small areas belonging to the category “cities” within County Dublin, which is 6.2 social enterprises per 10,000 inhabitants. Although social enterprises based in the city of Dublin represent 16.4% of total Irish social enterprises, the ratio of social enterprises in the city of Dublin (6.2) is below the national average (9.0) and lower than in other urban areas, including other Irish cities of more than 50,000 inhabitants (8.3) and independent urban towns (12.9). The ratio of social enterprises in Dublin city is also lower than in rural areas with moderate urban influence (9.9) and highly rural/remote areas (21.0).

Conversely, the ratio of social enterprises in Dublin city is higher than in satellite urban towns (4.9) and rural areas with high urban influence (5.9). To assess whether the differences between Dublin city and these two categories with lower ratios are statistically significant, we used Welch’s two-sample t-test, which is suitable for comparing the means of groups with unequal variances (Lu and Yuan, 2010). The results show no statistically significant difference between these means (p > 0.05); thus H3 was not supported.
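For readers unfamiliar with Welch’s test, the following minimal sketch shows how the t statistic and the Welch–Satterthwaite degrees of freedom are obtained from two samples with unequal variances. The two samples are hypothetical and are not the study's small-area ratios; the function name is illustrative. The p-value itself requires the Student t distribution, which the Python standard library does not provide, so it is left out here.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# HYPOTHETICAL samples with different sizes and variances (illustration only).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12]
t_stat, dof = welch_t(group_a, group_b)
print(round(t_stat, 3), round(dof, 2))
```

Because the degrees of freedom depend on the two variances, they are generally not an integer, which is consistent with the fractional values (for example 5,163.3) reported in the appendix.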

H4 refers to the significant relationship between the sectors of activity in which social enterprises operate and the type of rural–urban area in which they are based. Due to the categorical nature of both variables, a Pearson chi-square test (test of independence) was conducted (Franke et al., 2012). The results show a statistically significant relationship between the variables (p < 0.01), supporting H4.
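A minimal sketch of the chi-square test of independence follows, assuming a small hypothetical contingency table of area types (rows) by sectors (columns); the counts are invented for illustration and are not derived from the study's data. The statistic compares each observed count with the count expected if area type and sector were independent, and the degrees of freedom are (rows − 1) × (columns − 1). The closed-form tail probability used here holds only for even degrees of freedom, which is why a 3 × 3 table (df = 4) was chosen.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (even df only)."""
    if df % 2:
        raise ValueError("this closed form needs an even number of df")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# HYPOTHETICAL counts: 3 area types (rows) by 3 sectors (columns).
toy_table = [
    [30, 10, 10],
    [20, 20, 10],
    [10, 10, 30],
]
chi2, dof = chi_square_independence(toy_table)
p_value = chi2_sf_even_df(chi2, dof)
print(chi2, dof, p_value)
```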

H5 refers to a positive association between areas with lower population density and greater distance to services and amenities (remoteness) and the presence of social enterprises in the community and local development sector, and H6 refers to a negative association between areas with lower population density and greater distance to services and amenities (remoteness) and the presence of social enterprises operating in sectors associated with welfare objectives such as “childcare” and “health, youth services and social care”. We followed the procedure explained for H2, using an ordinal variable to order the rural–urban categories according to their remoteness. Social enterprises within the category “community infrastructure and local development” were used to test H5. Data on social enterprises from two categories, i.e. “childcare” and “health, youth services and social care”, were used to test H6.

To test the directional association between the rural–urban areas, ordered by their degree of “remoteness”, and the ratio of social enterprises in community and local development (H5) and in welfare services (H6), a Jonckheere–Terpstra test (Ali et al., 2015) was conducted. The results show a statistically significant relationship (p < 0.05) for the variables of H5, supporting this hypothesis. However, the results for H6 were not statistically significant (p > 0.05); thus this hypothesis was not supported.
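The logic of the Jonckheere–Terpstra test for H5 and H6 can be sketched with one value per area type. The sketch below uses the sector percentages from Table 4, ordered from least to most remote; both the use of these percentages as inputs and the intermediate ordering of the categories are assumptions, since the text does not spell them out. Under these assumptions the statistic counts the pairs of categories whose values increase with remoteness, and an exact one-sided p-value is obtained by enumerating all 720 orderings of the six values. The resulting statistics and p-values coincide with those reported in the appendix, although this should not be read as the authors' exact procedure.

```python
from itertools import combinations, permutations

# Percentages from Table 4, ordered: cities, satellite urban towns,
# independent urban towns, rural areas with high urban influence,
# rural areas with moderate urban influence, highly rural/remote areas.
# The intermediate ordering is an ASSUMPTION.
COMMUNITY_DEV = [7.9, 16.4, 14.6, 22.4, 21.9, 23.8]
CHILDCARE = [28.9, 23.5, 23.8, 23.7, 32.2, 28.7]
HEALTH_CARE = [18.9, 20.1, 14.5, 9.4, 9.0, 8.6]
WELFARE = [c + h for c, h in zip(CHILDCARE, HEALTH_CARE)]


def jt_statistic(values):
    """Jonckheere-Terpstra statistic with one observation per ordered group:
    number of pairs (earlier, later) whose values increase; ties count 0.5."""
    total = 0.0
    for earlier, later in combinations(values, 2):
        if earlier < later:
            total += 1.0
        elif earlier == later:
            total += 0.5
    return total


def exact_p(values, alternative):
    """Observed statistic and exact one-sided p-value over all orderings."""
    observed = jt_statistic(values)
    stats = [jt_statistic(order) for order in permutations(values)]
    if alternative == "increasing":
        hits = sum(s >= observed for s in stats)
    else:  # "decreasing"
        hits = sum(s <= observed for s in stats)
    return observed, hits / len(stats)


jt_h5, p_h5 = exact_p(COMMUNITY_DEV, "increasing")
jt_h6, p_h6 = exact_p(WELFARE, "decreasing")
print("H5", jt_h5, round(p_h5, 5))
print("H6", jt_h6, round(p_h6, 5))
```

With only six ordered values, the smallest attainable one-sided p-value is 1/720, so the test has limited power at this level of aggregation.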

In summary, our statistical analysis supports four of our six hypotheses (see Table 5). The supported hypotheses show a geographical rural–urban pattern in the distribution of social enterprises in Ireland (H1), with a positive, statistically significant association between the remoteness of the area and the ratio of social enterprises (H2). However, our analysis suggests that there is no capital effect attracting a higher ratio of social enterprises to Dublin city (H3). The statistical analysis also shows a geographical rural–urban pattern between the types of activities developed by social enterprises and the type of area where they are based (H4), with a positive association between the degree of remoteness of the area where social enterprises are based and the ratio of social enterprises in the community and local development sector (H5). However, our analysis does not support a negative association between the degree of remoteness of the area and the ratio of social enterprises in activities related to welfare services, such as childcare and health, youth services and social care (H6).

6. Discussion

The aim of this paper is to explore the distribution and type of activities developed by social enterprises in different rural and urban areas in Ireland. The results of our analysis show distinctive rural–urban patterns in the distribution of these organisations. Our research advances previous regional analysis of social enterprises (Buckingham et al., 2011) by providing fine-grained statistical data at subregional level and by focusing on heterogeneous rural and urban areas instead of following regional/county administrative divisions. The use of the six-tier rural–urban typology and the geolocation of social enterprises provide detailed evidence that regional development actors and public authorities can use as a basis for developing targeted measures for social enterprises in geographically diverse areas (Mazzei and Roy, 2017; Steiner and Teasdale, 2019).

Our results show a positive association between the presence of social enterprises and the degree of remoteness (low population density and low access to services and amenities). These results align with previous studies that suggested rural areas and regions characterised by state and market failure as fertile ground for social enterprises (Steiner and Steinerowska-Streb, 2012; Woo and Jung, 2023). Our study does not support the hypothesis that the capital city, in this case Dublin, with its greater entrepreneurial and innovation ecosystem, acts as a significant area of social enterprise development, at least relative to its population. This result contradicts the analysis of Buckingham et al. (2011), which stressed the greater presence of social enterprises in London compared to other UK regions due to its capital effect.

Our results show the relevance of social enterprises in “lagged behind areas” and their aim to respond to unsatisfied needs, especially of marginalised people and territories (Olmedo et al., 2023). The strong presence of social enterprises in these remote territories has meant the development of a wide range of services and community infrastructure which otherwise would not have been provided to the local population (Aiken et al., 2016; van Twuijver et al., 2020). However, the presence of social enterprises cannot be automatically related to a greater capacity of these areas to overcome their challenges. Previous studies on rural social enterprises have shown their great potential to contribute to socially inclusive and territorially integrated development when cooperating with other development actors, including for-profit businesses and public institutions; however, these studies also show the incapacity of rural social enterprises to change, by themselves, the structural-exogenous forces affecting marginalised territories (Bock, 2016; Olmedo and O’Shaughnessy, 2022).

Our analysis of social enterprises by sector of activity in different geographical areas does not show a relationship between social enterprises operating in urban areas and a greater focus on welfare objectives, contrary to the findings of Barraket et al. (2019). It is important to note that in Ireland (community) childcare services represent an important share of social enterprises (over 25%) and that these are spread across the whole territory without a clear distinctive geographical pattern. Descriptive statistics by sector of activity show that social enterprises focusing on activities related to health, youth services and social care represent over 14% of social enterprises in each type of urban area and less than 10% in each type of rural area (see Table 4), which would be more in line with the results of Barraket et al. (2019) in Australia and Smith and McColl (2016) in Scotland when comparing urban and rural social enterprises.

Our results also show a significant focus of social enterprises in remote and rural areas on community and local development activities. This aligns with previous research on rural social enterprises that stresses the relevance of community social entrepreneurship in rural territories (Peredo and Chrisman, 2006) and the important role of rural (community-based) social enterprises in local development (O’Shaughnessy et al., 2011; Steiner and Teasdale, 2019; van Twuijver et al., 2020). The significant developmental role of social enterprises in rural areas aligns with a key feature of rural social enterprises, namely their tendency to merge social, economic and/or environmental aims, contributing to an integrated territorial development (Olmedo et al., 2023). However, this focus of social enterprises in rural areas on community and local development activities often implies the development of basic infrastructure and services that are usually provided by public administrations in urban areas (Bock, 2016). Thus social enterprises can, in this instance, be interpreted as a substitute arising from the absence and/or retrenchment of the state and public services (Roy and Grant, 2019); this, in turn, can overburden the citizens of these areas and increase the disparities between better-equipped and vulnerable social groups and territories (Bock, 2016).

7. Conclusions, limitations and further research

This paper explored the distribution and sectors of activity of social enterprises in Ireland against a six-tier rural–urban typology that combines population density and access to services and amenities, adding a timely contribution to the body of geographical research on social enterprises. We suggest that the combination of national data of social enterprises with geographical tools and data at subregional level contributes to the methodological advancement of the field of social enterprises, through the provision of tools and frameworks for a nuanced and spatially sensitive analysis of these organisations. Moreover, this study contributes to testing, through a quantitative analysis, hypotheses developed from the findings of previous geographical research on social enterprises.

Our findings show geographical patterns in the distribution of social enterprises, such as their greater presence in highly rural/remote areas and the lack of a capital city effect in terms of density of social enterprises. Our analysis also shows a geographical rural–urban differentiation in terms of sectors of activity, with social enterprises in the community and local development sector being especially relevant in rural areas. Given this evidence, we conclude that social enterprise policies should incorporate territorially sensitive and place-based measures that account for the diversity of rural and urban areas. To this end, the alignment of social enterprise and rural development policies is key to harnessing the potential of these organisations in rural areas. We also conclude that there is great scope for the development of social enterprises in specific sectors in rural and remote areas, such as the creative industry, sustainable agri-food and the circular economy. The development of social enterprises within these sectors is linked to fostering a more socially and territorially inclusive society, but also to wider aspects related to the twin (digital and green) transitions.

This study is not without limitations. Social enterprises are context-specific, and the rural–urban typology used in this study was created by the Irish CSO with specific criteria. This makes international comparability difficult, and any generalisation of the results of this study to other contexts/countries should be made with caution. Interestingly, the Scottish Social Enterprise Census (latest edition: 2021) also follows a six-tier rural–urban typology and shows an important presence of social enterprises in remote rural areas; however, the use of different indicators for developing the Scottish rural–urban typology does not allow for a rigorous comparison with the data shown in this study. Recently developed methodologies such as the Global Human Settlement Layer by the Joint Research Centre of the European Commission (Dijkstra et al., 2021), which harmonise indicators for urban and rural areas to support consistent international comparisons, represent an interesting avenue for further research comparing geographical patterns of social enterprises in different countries. In this regard, the increasing amount of geolocation information and geographically sensitive data collection on social enterprises, and more generally on social economy organisations, can also represent an important advancement for future research.

A final suggestion for further research relates to the combination of geographical and institutional frameworks for the (quantitative) study of spatial patterns in social enterprises that can inform place-based social enterprise policies. This study can be further developed by isolating specific clusters of social enterprises at regional level and exploring their impact on the development of their areas and the critical factors supporting and/or hindering this impact.

Figure 1. Map of the rural–urban typology for the Republic of Ireland

Figure 2. Map of social enterprises by rural–urban typology

Figure 3. Map of the total number of social enterprises by county

Figure 4. Map of the ratio of social enterprises by county

Table 1. Summary of literature on geographical research on social enterprises

| Geographical analytical level | Relevant findings | Examples of articles |
|---|---|---|
| Micro | Social enterprises and spaces of well-being | (2015); (2020) |
| Urban | Agglomeration in cities enables greater demand and better access to institutional support, funding, knowledge and networks for social enterprises. Characteristics of place influence incentives and opportunities for social enterprises | ; |
| Rural | Social enterprises as embedded intermediaries between their localities and supra-regional networks. Social enterprises harness and (re)valorise untapped local resources and complement these with extra-local resources for integrated development of localities. Rural areas are a fertile ground for social enterprises due to some characteristics associated with rurality | ; ; ; (2023) |
| Urban–rural | Rural social enterprises more attached to geographical needs and community networks; urban social enterprises more focused on social needs and welfare objectives | ; (2019) |
| Regional | Low interregional variations (UK) in distribution of social enterprises, except for capital. Emergence of social enterprises related to regions experiencing government or market failure | (2011); |
| National | Geographical location of Fiji influences the shaping of social enterprises | (2018) |

Source: Authors’ own creation

Table 2

| Type | Definition |
|---|---|
| Cities | Towns/settlements with populations greater than 50,000 |
| Satellite urban towns | Towns/settlements with populations between 1,500 and 49,999, where 20% or more of the usually resident employed population’s workplace address is in “Cities” |
| Independent urban towns | Towns/settlements with populations between 1,500 and 49,999, where less than 20% of the usually resident employed population’s workplace address is in “Cities” |
| Rural areas with high urban influence; Rural areas with moderate urban influence; Highly rural/remote areas | Rural areas (themselves defined as having an area type with a population less than 1,500 persons, as per census 2016) are allocated to one of three sub-categories, based on their dependence on urban areas. Again, employment location is the defining variable. The allocation is based on a weighted percentage of resident employed adults of a rural small area who work in the three standard categories of urban area (for simplicity the methodology uses main, secondary and minor urban area). The percentages working in each urban area were weighted through the use of multipliers. The multipliers allowed for the increasing urbanisation for different sized urban areas. For example, the percentage of rural people working in a main urban area had double the impact of the same percentage working in a minor urban area. The weighting acknowledges the impact that a large urban centre has on its surrounding area. The adopted weights are 2 for main urban areas, 1.5 for satellite urban communities and 1 for independent urban communities. The weighted percentages are divided into tertiles to assign one of the three rural breakdowns |

Table 3

| Area/Typology | Social enterprises (n) | Social enterprises (%) | Population (n) | Population (%) | Ratio (SE/10,000 inhabitants) | Ratio, rural/urban total |
|---|---|---|---|---|---|---|
| Highly rural/remote areas | 865 | 20.4 | 412,457 | 8.8 | 21.0 | 10.8 (total rural) |
| Rural areas with moderate urban influence | 580 | 13.7 | 587,041 | 12.5 | 9.9 | |
| Rural areas with high urban influence | 447 | 10.6 | 754,794 | 16.1 | 5.9 | |
| Independent urban towns | 991 | 23.4 | 770,329 | 16.4 | 12.9 | 8.0 (total urban) |
| Satellite urban towns | 293 | 6.9 | 597,355 | 12.8 | 4.9 | |
| Cities | 1,058 | 25.0 | 1,567,945 | 33.4 | 6.7 | |
| Total | 4,234 | 100 | 4,689,921 | 100 | 9.0 | |

Source: Authors’ own creation

Table 4

| Type of area | Childcare (%) | Community infrastructure and local development (%) | Health, youth services and social care (%) | Heritage, festivals, arts and creative industry (%) | Sport and leisure (%) | Training and work integration (%) | Information, support and financial services (%) | Housing (%) | Food, agriculture, catering (%) | Environment, circular economy and renewable energy (%) | Retailing (%) | Transport (%) | Manufacturing (%) | Other (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Highly rural/remote areas | 28.7 | 23.8 | 8.6 | 15.7 | 5.3 | 3.5 | 3.6 | 2.9 | 3.1 | 2.3 | 1.4 | 0.2 | 0.2 | 0.6 |
| Rural areas with moderate urban influence | 32.2 | 21.9 | 9.0 | 10.5 | 9.5 | 3.6 | 2.4 | 2.2 | 3.1 | 2.4 | 1.2 | 0.9 | 0.2 | 0.9 |
| Rural areas with high urban influence | 23.7 | 22.4 | 9.4 | 10.1 | 13.4 | 4.7 | 5.6 | 3.4 | 3.6 | 2.2 | 0.4 | 0.0 | 0.2 | 1.1 |
| Independent urban towns | 23.8 | 14.6 | 14.5 | 12.7 | 8.9 | 5.8 | 7.1 | 5.2 | 2.2 | 1.3 | 1.8 | 0.7 | 0.4 | 1.1 |
| Satellite urban towns | 23.5 | 16.4 | 20.1 | 8.5 | 7.5 | 6.1 | 4.1 | 4.4 | 1.4 | 2.4 | 2.7 | 1.7 | 0.3 | 0.7 |
| Cities | 28.9 | 7.9 | 18.9 | 5.6 | 4.3 | 9.5 | 9.8 | 7.0 | 2.2 | 4.0 | 0.6 | 0.2 | 0.4 | 0.7 |
| All Ireland | 26.7 | 16.4 | 13.7 | 10.7 | 7.6 | 6.1 | 5.8 | 4.5 | 2.7 | 2.6 | 1.2 | 0.5 | 0.3 | 0.9 |

Source: Authors’ own creation

Table 5

| Hypothesis | Decision |
|---|---|
| H1: the presence of social enterprises is significantly associated with the type of rural–urban areas | Supported |
| H2: the presence of social enterprises is positively associated with areas with lower population density and greater distance to services and amenities (remoteness) | Supported |
| H3: the presence of social enterprises within the capital city (Dublin) is significantly higher compared to the national average and to other rural and urban areas of Ireland | Not supported |
| H4: there is a significant relationship between the sectors of activities in which social enterprises operate and the type of rural–urban areas in which they are based | Supported |
| H5: there is a positive association between areas with lower population density and greater distance to services and amenities (remoteness) and the presence of social enterprises in community and local development | Supported |
| H6: there is a negative association between areas with lower population density and greater distance to services and amenities (remoteness) and the presence of social enterprises operating in sectors related to welfare objectives such as childcare, health and social care | Not supported |

Source: Authors’ own creation

H1. Kruskal–Wallis H test

| Variables | Test statistic | df | p-value | Decision |
|---|---|---|---|---|
| SEs ratio – rural/urban area | 309.17 | 5 | 2.2e ** | Supported |

Note: ** p < 0.01

Source: Authors’ own creation

H1. Kruskal–Wallis post hoc Dunn test (pairwise group comparison)

| Comparison (pairwise) | Z | P. unadj | P. adj (Bonferroni) |
|---|---|---|---|
| Highly rural/remote areas – Rural areas with moderate urban influence | 6.432 | 1.26E-10 | 1.89E-09** |
| Highly rural/remote areas – Rural areas with high urban influence | 9.694 | 3.21E-22 | 4.81E-21** |
| Highly rural/remote areas – Independent urban towns | 3.866 | 0.000111 | 0.0017** |
| Highly rural/remote areas – Satellite urban towns | 12.304 | 8.65E-35 | 1.308E-33** |
| Highly rural/remote areas – Cities | −14.341 | 1.21E-46 | 1.81E-45** |
| Rural areas with moderate urban influence – Rural areas with high urban influence | −3.256 | 0.001129 | 0.0169* |
| Rural areas with moderate urban influence – Independent urban towns | 3.007 | 0.002637 | 0.0396* |
| Rural areas with moderate urban influence – Satellite urban towns | 6.11 | 9.98E-10 | 1.50E-08** |
| Rural areas with moderate urban influence – Cities | −6.657 | 2.79E-11 | 4.19E-10** |
| Rural areas with high urban influence – Independent urban towns | 6.491 | 8.55E-11 | 1.28E-09** |
| Rural areas with high urban influence – Satellite urban towns | 2.979 | 0.002889 | 0.0433* |
| Rural areas with high urban influence – Cities | −2.772 | 0.005563 | 0.0834 |
| Independent urban towns – Satellite urban towns | 9.378 | 6.74E-21 | 1.01E-19** |
| Cities – Independent urban towns | −11.05 | 2.19E-28 | 3.28E-27** |
| Cities – Satellite urban towns | 0.879 | 0.379665 | 1 |

Notes: * p < 0.05; ** p < 0.01

Source: Authors’ own creation

H2. Jonckheere–Terpstra test

| Hypothesis | Alternative hypothesis | JT | p-value | Decision |
|---|---|---|---|---|
| Positive association area remoteness and ratio social enterprises | Increasing | 73161607 | 0.001** | Supported |

Note: ** p < 0.01

Source: Authors’ own creation

H3. Welch two-sample t-test

| Pairs (categories) compared | t | df | p-value | CI (95%) | Decision |
|---|---|---|---|---|---|
| Dublin City – satellite urban towns | 1.6129 | 5,163.3 | 0.1068 | (−0.22, 2.24) | Not supported |
| Dublin City – rural areas with high urban influence | 1.1337 | 6,491.1 | 0.2569 | (−0.46, 1.75) | Not supported |

Source: Authors’ own creation

H4. Chi-square test (test of independence)

| Variables | Test statistic | df | p-value | Decision |
|---|---|---|---|---|
| Association between sector of activity SEs and rural–urban typology | 445.99 | 70 | 2.2e ** | Supported |

Note: ** p < 0.01

Source: Authors’ own creation

H5 and H6. Jonckheere–Terpstra test

| H5 and H6 | Alternative hypothesis | JT | p-value | Decision |
|---|---|---|---|---|
| H5: Positive association rural–urban remoteness and ratio social enterprises in community local development | Increasing | 13 | 0.02778* | Supported |
| H6: Negative association rural–urban remoteness and ratio social enterprises in welfare services | Decreasing | 3 | 0.06806 | Not supported |

Note: * p < 0.05

Source: Authors’ own creation

The main source for selecting the papers for the literature review was a search on Scopus (conducted in early 2023), with the search string: TITLE-ABSTRACT-KEYWORDS (“geography” OR “rural” OR “urban” OR “regional”) AND “social enterprises”. From this search only papers where geography was considered an explanatory factor/dimension in the analysis of the features and/or work of social enterprises were selected. The article Douglas et al. (2018) was added by the authors.

Nomenclature of territorial units for statistics (see Eurostat, https://ec.europa.eu/eurostat/web/nuts/background )

The classification of regions into one of the three categories is based on the following criteria:

Population density. A community is defined as rural if its population density is below 150 inhabitants per km² (500 inhabitants for Japan to account for the fact that its national population density exceeds 300 inhabitants per km²).

Regions by % population in rural communities. A region is classified as predominantly rural if more than 50% of its population lives in rural communities, predominantly urban if less than 15% of the population lives in rural communities, and intermediate if the share of the population living in rural communities is between 15% and 50%.

Urban centres. A region that would be classified as rural on the basis of the general rule is classified as intermediate if it has an urban centre of more than 200,000 inhabitants (500,000 for Japan) representing no less than 25% of the regional population. A region that would be classified as intermediate on the basis of the general rule is classified as predominantly urban if it has an urban centre of more than 500,000 inhabitants (1,000,000 for Japan) representing no less than 25% of the regional population.
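The three criteria above can be read as a small decision procedure. The following is a minimal sketch in Python, assuming the thresholds exactly as quoted in the text; the function and parameter names are hypothetical and not part of the original methodology.

```python
def is_rural_community(density_per_km2: float, japan: bool = False) -> bool:
    """Criterion 1: a community is rural below 150 inhabitants per km2 (500 for Japan)."""
    threshold = 500 if japan else 150
    return density_per_km2 < threshold


def classify_region(rural_share: float,
                    largest_centre_pop: int = 0,
                    largest_centre_share: float = 0.0,
                    japan: bool = False) -> str:
    """Classify a region using the general rule, then the urban-centre adjustment.

    rural_share: fraction (0-1) of the regional population living in rural communities.
    largest_centre_pop: inhabitants of the region's largest urban centre.
    largest_centre_share: fraction (0-1) of the regional population living in that centre.
    """
    # Criterion 2: general rule based on the share of population in rural communities.
    if rural_share > 0.50:
        category = "predominantly rural"
    elif rural_share < 0.15:
        category = "predominantly urban"
    else:
        category = "intermediate"

    # Criterion 3: a large urban centre holding at least 25% of the population
    # moves the region one step towards "urban". The adjustment is applied once,
    # to the category obtained from the general rule.
    rural_cutoff = 500_000 if japan else 200_000
    intermediate_cutoff = 1_000_000 if japan else 500_000
    has_dominant_centre = largest_centre_share >= 0.25
    if category == "predominantly rural":
        if has_dominant_centre and largest_centre_pop > rural_cutoff:
            category = "intermediate"
    elif category == "intermediate":
        if has_dominant_centre and largest_centre_pop > intermediate_cutoff:
            category = "predominantly urban"
    return category
```

For instance, under these assumptions a region with 60% of its population in rural communities but an urban centre of 250,000 inhabitants holding 30% of the regional population would be reclassified from predominantly rural to intermediate.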

More information about this methodology is available at: “Social Enterprises in Ireland – a Baseline data collection exercise”, www.gov.ie/ga/foilsiuchan/b30e5-social-enterprises-in-ireland-a-baseline-data-collection-exercise/

QGIS (Quantum Geographical Information System) is a free and open-source software for spatial analysis. See https://qgis.org/en/site/

Now Tailte Éireann, see https://data-osi.opendata.arcgis.com/

The most recent population data at small area level available at the time of this study were from Census 2016.

Aiken, M., Taylor, M. and Moran, R. (2016), “Always look a gift horse in the mouth: community organisations controlling assets”, VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, Vol. 27 No. 4, pp. 1669-1693.

Ali, A., Rasheed, A., Siddiqui, A., Naseer, M., Wasim, S. and Akhtar, W. (2015), “Non-parametric test for ordered medians: the Jonckheere Terpstra test”, International Journal of Statistics in Medical Research, Vol. 4 No. 2, pp. 203-207, doi: 10.6000/1929-6029.2015.04.02.8.

Angelidou, M. and Mora, L. (2019), “Developing synergies between social entrepreneurship and urban planning”, disP - the Planning Review, Vol. 55 No. 4, pp. 28-45, doi: 10.1080/02513625.2019.1708068.

Barraket, J., Eversole, R., Luke, B. and Barth, S. (2019), “Resourcefulness of locally-oriented social enterprises: implications for rural community development”, Journal of Rural Studies, Vol. 70, pp. 188-197, doi: 10.1016/j.jrurstud.2017.12.031.

Bock, B.B. (2016), “Rural marginalisation and the role of social innovation; a turn towards nexogenous development and rural reconnection”, Sociologia Ruralis, Vol. 56 No. 4, pp. 552-573, doi: 10.1111/soru.12119.

Buckingham, H., Pinch, S. and Sunley, P. (2011), “The enigmatic regional geography of social enterprise in the UK: a conceptual framework and synthesis of the evidence”, Area, Vol. 44 No. 1, pp. 83-91, doi: 10.1111/j.1475-4762.2011.01043.x.

CEIS and Social Value Lab (2023), “Social enterprise in Scotland. Census 2021”, Scottish Government, available at: https://socialenterprisecensus.org.uk/wp-content/themes/census19/pdf/2021-report.pdf (accessed 29 August 2023).

Copus, A., Courtney, P., Dax, T., Meredith, D., Noguera, J., Talbot, H. and Shucksmith, M. (2011), “EDORA: European development opportunities for rural areas”, Final Report, Luxembourg, ESPON.

CSO (2019), “Urban and rural life in Ireland”, CSO, available at: www.cso.ie/en/releasesandpublications/ep/p-urli/urbanandrurallifeinireland2019/introduction/ (accessed 22 April 2024).

Defourny, J. and Nyssens, M. (2017), “Fundamentals for an international typology of social enterprise models”, VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, Vol. 28 No. 6, pp. 2469-2497, doi: 10.1007/s11266-017-9884-7.

Defourny, J., Nyssens, M. and Brolis, O. (2020), “Testing social enterprise models across the world: evidence from the ‘international comparative social enterprise models (ICSEM) project’”, Nonprofit and Voluntary Sector Quarterly, Vol. 50 No. 2, p. 89976402095947, doi: 10.1177/0899764020959470.

Diaz Gonzalez, A. and Dentchev, N. (2020), “Ecosystems in support of social entrepreneurs: a literature review”, Social Enterprise Journal, Vol. 17 No. 3, pp. 329-360, doi: 10.1108/SEJ-08-2020-0064.

Dijkstra, L., Florczyk, A.J., Freire, S., Kemper, T., Melchiorri, M., Pesaresi, M. and Schiavina, M. (2021), “Applying the degree of urbanisation to the globe: a new harmonised definition reveals a different picture of global urbanisation”, Journal of Urban Economics, Vol. 125, p. 103312, doi: 10.1016/j.jue.2020.103312.

Dinno, A. (2022), “Dunn’s test of multiple comparisons using rank sums”, available at: https://cran.r-project.org/web/packages/dunn.test/dunn.test.pdf (accessed 30 August 2023).

Douglas, H., Eti-Tofinga, B. and Singh, G. (2018), “Contextualising social enterprise in Fiji”, Social Enterprise Journal, Vol. 14 No. 2, pp. 208-224, doi: 10.1108/SEJ-05-2017-0032.

European Commission (2020), “Social enterprises and their ecosystems in Europe”, Country Report Ireland, Luxembourg, Publications Office of the European Union.

European Commission (2021), “Building an economy that works for people: an action plan for the social economy”, Luxembourg, Publications Office of the European Union.

Eurostat (2021), “Applying the degree of urbanisation – a new international manual for defining cities, towns and rural areas – 2021 edition”, available at: https://ec.europa.eu/eurostat/web/products-catalogues/-/ks-04-20-676 (accessed 29 August 2023).

Farmer, J., Kamstra, P., Brennan-Horley, C., De Cotta, T., Roy, M., Barraket, J., Munoz, S.-A. and Kilpatrick, S. (2020), “Using micro-geography to understand the realisation of wellbeing: a qualitative GIS study of three social enterprises”, Health and Place, Vol. 62, p. 102293, doi: 10.1016/j.healthplace.2020.102293.

Franke, T.M., Ho, T. and Christie, C.A. (2012), “The chi-square test: often used and more often misinterpreted”, American Journal of Evaluation, Vol. 33 No. 3, pp. 448-458, doi: 10.1177/1098214011426594.

Galera, G. and Borzaga, C. (2009), “Social enterprise: an international overview of its conceptual evolution and legal implementation”, Social Enterprise Journal, Vol. 5 No. 3, pp. 210-228, doi: 10.1108/17508610911004313.

Government of Ireland (2019), National Social Enterprise Policy for Ireland 2019-2022, Government of Ireland, Dublin.

Government of Spain (2007), “Ley 45/2007 de 13 diciembre, Para el desarrollo sostenible del medio rural”, Boletín Oficial Del Estado, 14 de Diciembre de 2007, (299), pp. 51339-51349.

Hynes, B. (2016), Creating an Enabling, Supportive Environment for the Social Enterprise Sector in Ireland, The Irish Local Development Network, Ireland.

Jammulamadaka, N. and Chakraborty, K. (2018), “Local geographies of developing country social enterprises”, Social Enterprise Journal, Vol. 14 No. 3, pp. 367-386, doi: 10.1108/SEJ-11-2016-0051.

Kelly, D., Steiner, A., Mazzei, M. and Baker, R. (2019), “Filling a void? The role of social enterprise in addressing social isolation and loneliness in rural communities”, Journal of Rural Studies, Vol. 70, pp. 225-236, doi: 10.1016/j.jrurstud.2019.01.024.

Kerlin, J.A. (2013), “Defining social enterprise across different contexts: a conceptual framework based on institutional factors”, Nonprofit and Voluntary Sector Quarterly, Vol. 42 No. 1, pp. 84-108, doi: 10.1177/0899764011433040.

Lu, Z. and Yuan, K.-H. (2010), “Welch’s t test”, in Salkind, N.J. (Ed.), Encyclopedia of Research Design, Sage, Thousand Oaks, CA, pp. 1620-1623, doi: 10.13140/RG.2.1.3057.9607.

Mantino, F., Forcina, B. and Morse, A. (2023), “Exploring the rural-urban continuum”, Methodological framework to define Functional Rural Areas and rural transitions. RUSTIK. D1.1., available at: https://rustik-he.eu/wp-content/uploads/2023/04/RUSTIK_D-1-1_Methodological_Framework_31.03.23.pdf (accessed 25 August 2023).

Mazzei, M. (2017), “Understanding difference: the importance of ‘place’ in the shaping of local social economies”, VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, Vol. 28 No. 6, pp. 2763-2784, doi: 10.1007/s11266-016-9803-3.

Mazzei, M. and Roy, M.J. (2017), “From policy to practice: exploring practitioners’ perspectives on social enterprise policy claims”, VOLUNTAS: International Journal of Voluntary and Nonprofit Organizations, Vol. 28 No. 6, pp. 2449-2468, doi: 10.1007/s11266-017-9856-y.

Munoz, S.-A. (2010), “Towards a geographical research agenda for social enterprise”, Area, Vol. 42 No. 3, pp. 302-312, doi: 10.1111/j.1475-4762.2009.00926.x.

Munoz, S.-A., Farmer, J., Winterton, R. and Barraket, J. (2015), “The social enterprise as a space of well-being: an exploratory case study”, Social Enterprise Journal, Vol. 11 No. 3, pp. 281-302, doi: 10.1108/SEJ-11-2014-0041.

O’Hara, P. and O’Shaughnessy, M. (2021), “Social enterprise in Ireland. State support key to the predominance of work integration social enterprise (WISE)”, in Defourny, J. and Nyssens, M. (Eds), Social Enterprise in Western Europe. Theory, Models and Practice, Routledge, London/New York, NY, pp. 112-130.

O’Shaughnessy, M. and O’Hara, P. (2016), “Towards an explanation of Irish rural-based social enterprises”, International Review of Sociology, Vol. 26 No. 2, pp. 223-233, doi: 10.1080/03906701.2016.1181389.

O’Shaughnessy, M., Casey, E. and Enright, P. (2011), “Rural transport in peripheral rural areas: the role of social enterprises in meeting the needs of rural citizens”, Social Enterprise Journal, Vol. 7 No. 2, pp. 183-190, doi: 10.1108/17508611111156637.

OECD (2006), “The new rural paradigm. Policies and governance”, OECD Publishing, Paris.

OECD (2022), “Recommendation of the council on the social and solidarity economy and social innovation”, OECD/LEGAL/0472.

Olmedo, L. and O’Shaughnessy, M. (2022), “Community-based social enterprises as actors for Neo-Endogenous rural development: a multi-stakeholder approach”, Rural Sociology, Vol. 87 No. 4, pp. 1191-1218, doi: 10.1111/ruso.12462.

Olmedo, L., van Twuijver, M. and O’Shaughnessy, M. (2023), “Rurality as context for innovative responses to social challenges – the role of rural social enterprises”, Journal of Rural Studies, Vol. 99, pp. 272-283, doi: 10.1016/j.jrurstud.2021.04.020.

Olmedo, L., van Twuijver, M., O’Shaughnessy, M. and Sloane, A. (2021), “Irish rural social enterprises and the national policy framework”, Administration, Vol. 69 No. 4, pp. 9-37.

Peredo, A.M. and Chrisman, J.J. (2006), “Toward a theory of community-based enterprise”, Academy of Management Review, Vol. 31 No. 2, pp. 309-328.

Perpiñá Castillo, C., Heerden, S., Barranco, R., Jacobs-Crisioni, C., Kompil, M., Kučas, A., Aurambout, J.-P., Silva, F. and Lavalle, C. (2022), “Urban-rural continuum: an overview of their interactions and territorial disparities”, Regional Science Policy and Practice, Vol. 15 No. 4, doi: 10.1111/rsp3.12592.

Pinch, S. and Sunley, P. (2016), “Do urban social enterprises benefit from agglomeration? Evidence from four UK cities”, Regional Studies, Vol. 50 No. 8, pp. 1290-1301, doi: 10.1080/00343404.2015.1034667.

Richter, R. (2019), “Rural social enterprises as embedded intermediaries: the innovative power of connecting rural communities with supra-regional networks”, Journal of Rural Studies, Vol. 70, pp. 179-187, doi: 10.1016/j.jrurstud.2017.12.005.

Roy, M. and Grant, S. (2019), “The contemporary relevance of Karl Polanyi to critical social enterprise scholarship”, Journal of Social Entrepreneurship, Vol. 11 No. 2, doi: 10.1080/19420676.2019.1621363.

Smith, A.M. and McColl, J. (2016), “Contextual influences on social enterprise management in rural and urban communities”, Local Economy: The Journal of the Local Economy Policy Unit, Vol. 31 No. 5, pp. 572-588, doi: 10.1177/0269094216655519.

Steiner, A. and Steinerowska-Streb, I. (2012), “Can social enterprise contribute to creating sustainable rural communities? Using the lens of structuration theory to analyse the emergence of rural social enterprise”, Local Economy: The Journal of the Local Economy Policy Unit, Vol. 27 No. 2, pp. 167-182, doi: 10.1177/0269094211429650.

Steiner, A. and Teasdale, S. (2019), “Unlocking the potential of rural social enterprise”, Journal of Rural Studies, Vol. 70, pp. 144-154, doi: 10.1016/j.jrurstud.2017.12.021.

Steiner, A., Farmer, J. and Bosworth, G. (2019), “Rural social enterprise–evidence to date, and a research agenda”, Journal of Rural Studies, Vol. 70, pp. 139-143, doi: 10.1016/j.jrurstud.2019.08.008.

United Nations (2023), “Promoting the social and solidarity economy for sustainable development”, United Nations Inter-Agency Task Force on Social and Solidarity Economy, available at: https://unsse.org/wp-content/uploads/2023/05/A_RES_77_281-EN.pdf (accessed 28 August 2023).

Van Twuijver, M.W., Olmedo, L., O’Shaughnessy, M. and Hennessy, T. (2020), “Rural social enterprises in Europe: a systematic literature review”, Local Economy: The Journal of the Local Economy Policy Unit, Vol. 35 No. 2, pp. 121-142, doi: 10.1177/0269094220907024.

Vargha, A. and Delaney, H.D. (1998), “The Kruskal-Wallis test and stochastic homogeneity”, Journal of Educational and Behavioral Statistics, Vol. 23 No. 2, pp. 170-192, doi: 10.2307/1165320.

Woo, C. and Jung, H. (2023), “Exploring the regional determinants of the emergence of social enterprises in South Korea: an entrepreneurial ecosystem perspective”, Nonprofit and Voluntary Sector Quarterly, Vol. 52 No. 3, pp. 723-744, doi: 10.1177/08997640221110211.

Acknowledgements

This study has been funded by the Department of Rural and Community Development, Government of Ireland – NUI Post-Doctoral Fellowship in Rural Development 2022. The authors would like to thank the funders for their support, and three anonymous reviewers and the editors of the journal for their feedback.


Objectives This cohort study reported descriptive statistics in athletes engaged in Summer and Winter Olympic sports who sustained a sport-related concussion (SRC) and assessed the impact of access to multidisciplinary care and injury modifiers on recovery.

Methods 133 athletes formed two subgroups treated in a Canadian sport institute medical clinic: early (≤7 days) and late (≥8 days) access. Descriptive sample characteristics were reported and unrestricted return to sport (RTS) was evaluated based on access groups as well as injury modifiers. Correlations were assessed between time to RTS, history of concussions, the number of specialist consults and initial symptoms.

Results 160 SRC (median age 19.1 years; female=86 (54%); male=74 (46%)) were observed with a median (IQR) RTS duration of 34.0 (21.0–63.0) days. Median days to care access differed between the early (1 day; 77 SRC) and late (20 days; 83 SRC) groups, resulting in median (IQR) RTS durations of 26.0 (17.0–38.5) and 45.0 (27.5–84.5) days, respectively (p<0.001). Initial symptoms displayed a meaningful correlation with prognosis in this study (p<0.05), and female athletes (52 days (95% CI 42 to 101)) had longer recovery trajectories than male athletes (39 days (95% CI 31 to 65)) in the late access group (p<0.05).

Conclusions Olympic athletes in this cohort experienced an RTS time frame of about a month, partly due to limited access to multidisciplinary care and resources. Earlier access to care shortened the RTS delay. Greater initial symptoms and female sex in the late access group were meaningful modifiers of a longer RTS.

Data availability statement

Data are available on reasonable request. Due to the confidential nature of the dataset, it will be shared through a controlled access repository and made available on specific and reasonable requests.

https://doi.org/10.1136/bjsports-2024-108211


WHAT IS ALREADY KNOWN ON THIS TOPIC

Most data regarding the impact of sport-related concussion (SRC) guidelines on return to sport (RTS) are derived from collegiate or recreational athletes. In these groups, time to RTS has steadily increased in the literature since 2005, coinciding with the evolution of RTS guidelines. However, current evidence suggests that earlier access to care may accelerate recovery and RTS time frames.

WHAT THIS STUDY ADDS

This study reports epidemiological data on the occurrence of SRC in athletes from several Summer and Winter Olympic sports with either early or late access to multidisciplinary care. We found the median time to RTS for Olympic athletes with an SRC was 34.0 days, which is longer than that reported in other athletic groups such as professional or collegiate athletes. Time to RTS was reduced by prompt access to multidisciplinary care following SRC, and sex influenced recovery in the late access group, with female athletes having a longer RTS timeline. Greater initial symptoms, but not prior concussion history, were also associated with a longer time to RTS.

HOW THIS STUDY MIGHT AFFECT RESEARCH, PRACTICE OR POLICY

Considerable differences exist in access to care for athletes engaged in Olympic sports, which impact their recovery. In this cohort, several concussions occurred during international competitions where athletes are confronted with poor access to organised healthcare. Pathways for prompt access to multidisciplinary care should be considered by healthcare authorities, especially for athletes who travel internationally and may not have the guidance or financial resources to access recommended care.

Introduction

After two decades of consensus statements, sport-related concussion (SRC) remains a high focus of research, with incidence ranging from 0.1 to 21.5 SRC per 1000 athlete exposures, varying according to age, sex, sport and level of competition. 1 2 Evidence-based guidelines have been proposed by experts to improve its identification and management, such as those from the Concussion in Sport Group. 3 Notably, they recommend specific strategies to improve SRC detection and monitoring such as immediate removal, 4 prompt access to healthcare providers, 5 evidence-based interventions 6 and multidisciplinary team approaches. 7 It is believed that these guidelines contribute to improving the early identification and management of athletes with an SRC, thereby potentially mitigating its long-term consequences.

Nevertheless, evidence regarding the impact of SRC guidelines implementation remains remarkably limited, especially within high-performance sport domains. In fact, most reported SRC data focus on adolescent student-athletes, collegiate and sometimes professional athletes in the USA but often neglect Olympians. 1 2 8–11 Athletes engaged in Olympic sports, often referred to as elite amateurs, are typically classified among the highest performers in elite sport, alongside professional athletes. 12 13 They train year-round and uniquely compete regularly on the international stage in sports that often lack professional leagues and rely on highly variable resources and facilities, mostly dependent on winning medals. 14 Unlike professional athletes, Olympians do not have access to large financial rewards. Although some Olympians work or study in addition to their intensive sports practice, they can devote more time to full-time sports practice compared with collegiate athletes. Competition calendars in Olympians differ from collegiate athletes, with periodic international competitions (eg, World Cups, World Championships) throughout the whole year rather than regular domestic competitions within a shorter season (eg, semester). Olympians outclass most collegiate athletes, and only the best collegiate athletes will have the chance to become Olympians and/or professionals. 12 13 15 In Canada, a primary reason for limited SRC data in Olympic sports is that the Canadian Olympic and Paralympic Sports Institute (COPSI) network only adopted official guidelines in 2018 to standardise care for athletes’ SRC nationwide. 16 17 The second reason could be the absence of a centralised medical structure and surveillance systems, identified as key factors contributing to the under-reporting and underdiagnosis of athletes with an SRC. 18

Among the available evidence on the evolution of SRC management, a 2023 systematic review and meta-analysis in athletic populations including children, adolescents and adults indicated that a full return to sport (RTS) could take up to a month but is estimated to require 19.8 days on average (15.4 days in adults), as opposed to the initial expectation of approximately 10.0 days based on studies published prior to 2005. 19 In comparison, studies focusing strictly on American collegiate athletes report median times to RTS of 16 days. 9 20 21 Notably, a recent study of military cadets reported even longer return-to-duty times of 29.4 days on average, attributed to poorer access to care and fewer incentives to return to play compared with elite sports. 22 In addition, several modifiers have also been identified as influencing the time to RTS, such as the history of concussions, type of sport, sex, past medical problems (eg, preinjury modifiers), as well as the initial number of symptoms and their severity (eg, postinjury modifiers). 20 22 The evidence regarding the potential influence of sex on the time to RTS has yielded mixed findings. 23–25 In fact, females are typically under-represented in SRC research, highlighting the need for additional studies that incorporate more balanced sample representation across sexes and control for known sources of bias. 26 Interestingly, a recent Concussion Assessment, Research and Education Consortium study, which included a high representation of concussed female athletes (615 out of 1071 patients), revealed no meaningful differences in RTS between females and males (13.5 and 11.8 days, respectively). 27 Importantly, findings in the sporting population suggested that earlier initiation of clinical care is linked to shorter recovery after concussion. 5 28 However, these factors affecting the time to RTS require a more thorough investigation, especially among athletes engaged in Olympic sports who may or may not have equal access to prompt, high-quality care.

Therefore, the primary objective of this study was to provide descriptive statistics among athletes with SRC engaged in both Summer and Winter Olympic sport programmes over a quadrennial, and to assess the influence of recommended guidelines of the COPSI network and the fifth International Consensus Conference on Concussion in Sport on the duration of RTS performance. 16 17 Building on available evidence, the international schedule constraints, variability in resources 14 and high-performance expectation among this elite population, 22 prolonged durations for RTS, compared with what is typically reported (eg, 16.0 or 15.4 days), were hypothesised in Olympians. 3 19 The secondary objective was to more specifically evaluate the impact of access to multidisciplinary care and injury modifiers on the time to RTS. Based on current evidence, 5 7 29 30 the hypothesis was formulated that athletes with earlier multidisciplinary access would experience a faster RTS. Regarding injury modifiers, it was expected that female and male athletes would show similar time to RTS despite presenting sex-specific characteristics of SRC. 31 The history of concussions, the severity of initial symptoms and the number of specialist consults were expected to be positively correlated to the time to RTS. 20 32

Participants

A total of 133 athletes (F=72; M=61; mean age±SD: 20.7±4.9 years old) who received medical care at the Institut national du sport du Québec, a COPSI training centre set up with a medical clinic, were included in this cohort study with retrospective analysis. They participated in 23 different Summer and Winter Olympic sports which were classified into six categories: team (soccer, water polo), middle distance/power (rowing, swimming), speed/strength (alpine skiing, para alpine skiing, short and long track speed skating), precision/skill-dependent (artistic swimming, diving, equestrian, figure skating, gymnastics, skateboard, synchronised skating, trampoline) and combat/weight-making (boxing, fencing, judo, para judo, karate, para taekwondo, wrestling) sports. 13 This sample consists of two distinct groups: (1) early access group in which athletes had access to a medical integrated support team of multidisciplinary experts within 7 days following their SRC and (2) late access group composed of athletes who had access to a medical integrated support team of multidisciplinary experts eight or more days following their SRC. 5 30 Inclusion criteria for the study were participation in a national or international-level sports programme 13 and having sustained at least one SRC diagnosed by an authorised healthcare practitioner (eg, physician and/or physiotherapist).

Clinical context

The institute clinic provides multidisciplinary services for care of patients with SRC including a broad range of recommended tests for concussion monitoring ( table 1 ). The typical pathway for the athletes consisted of an initial visit to either a sports medicine physician or their team sports therapist. A clinical diagnosis of SRC was then confirmed by a sports medicine physician, and referral for the required multidisciplinary assessments ensued based on the patient’s signs and symptoms. Rehabilitation progression was based on the evaluation of exercise tolerance, 33 priority to return to cognitive tasks and additional targeted support based on clinical findings of a cervical, visual or vestibular nature. 17 The expert team worked in an integrated manner with the athlete and their coaching staff for the rehabilitation phase, including regular round tables and ongoing communication. 34 For some athletes, access to recommended care was fee based, without a priori agreements with a third party payer (eg, National Sports Federation).

Table 1 Main evaluations performed to guide the return to sport following sport-related concussion

Data collection

Data were collected at the medical clinic using a standardised injury surveillance form based on International Olympic Committee guidelines. 35 All injury characteristics were extracted from the central injury database between 1 July 2018 and 31 July 2022. This period corresponds to a Winter Olympic sports quadrennial but also covers 3 years for Summer Olympic sports due to the postponing of the Tokyo 2020 Olympic Games. Therefore, the observation period includes a typical volume of competitions across sports and minimises differences in exposure based on major sports competition schedules. The information extracted from the database included: participant ID, sex, date of birth, sport, date of injury, type of injury, date of their visit at the clinic, clearance date of unrestricted RTS (eg, defined as step 6 of the RTS strategy with a return to normal gameplay including competitions), the number and type of specialist consults, mechanism of injury (eg, fall, hit), environment where the injury took place (eg, training, competition), history of concussions, history of modifiers (eg, previous head injury, migraines, learning disability, attention deficit disorder or attention deficit/hyperactivity disorder, depression, anxiety, psychotic disorder), as well as the number of symptoms and the total severity score from the first Sport Concussion Assessment Tool 5 (SCAT5) assessment following SRC. 17

Following a Shapiro-Wilk test, medians, IQR and non-parametric tests were used for the analyses because none of the variables in the dataset were normally distributed (all p<0.001). The skewness was introduced by the presence of individuals who required lengthy recovery periods. One participant was removed from the analysis because their time to consult with the multidisciplinary team was extremely delayed (>1 year).
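To illustrate why a median with IQR is a more robust summary than a mean when a few athletes need very long recoveries, here is a minimal sketch in Python. The recovery times are invented for illustration and are not study data; the quartile method (linear interpolation, `method="inclusive"`) is an assumption, since the article does not state which quartile definition was used.

```python
import statistics


def median_iqr(values):
    """Return (median, Q1, Q3) using linear interpolation between data points."""
    q1, q2, q3 = statistics.quantiles(sorted(values), n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical return-to-sport durations in days, with one very long recovery.
days_to_rts = [12, 15, 18, 21, 26, 34, 40, 63, 120]

median, q1, q3 = median_iqr(days_to_rts)
mean = statistics.mean(days_to_rts)

# The single 120-day recovery pulls the mean well above the median,
# while the median and quartiles are unaffected by how extreme it is.
print(f"median (IQR): {median} ({q1}-{q3}) days; mean: {mean:.1f} days")
```

In this made-up sample the mean lies far above the median, which is the pattern a right-skewed distribution of recovery times would produce.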

Descriptive statistics were used to describe the participant’s demographics, SRC characteristics and risk factors in the total sample. Estimated incidences of SRC were also reported for seven resident sports at the institute for which it was possible to quantify a detailed estimate of training volume based on the annual number of training and competition hours as well as the number of athletes in each sport.
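The incidence estimate described above can be sketched as events divided by exposure hours, scaled to 1000 hours. The following Python sketch assumes exposure is the product of athletes, annual hours and years, and uses a normal approximation to a Poisson count for the 95% CI; the article does not state how its intervals were computed, so that choice, the function name and the input values are all hypothetical.

```python
import math


def incidence_per_1000h(n_events, n_athletes, hours_per_athlete_per_year, years, z=1.96):
    """Return (rate, lower, upper) per 1000 exposure hours.

    The CI uses a normal approximation to a Poisson count (assumption,
    not the article's stated method); the lower bound is clipped at 0.
    """
    exposure_hours = n_athletes * hours_per_athlete_per_year * years
    rate = n_events / exposure_hours * 1000
    half_width = z * math.sqrt(n_events) / exposure_hours * 1000
    return rate, max(0.0, rate - half_width), rate + half_width


# Hypothetical sport: 10 concussions, 25 athletes, 1000 hours per athlete per year, 4 years.
rate, lower, upper = incidence_per_1000h(10, 25, 1000, 4)
print(f"{rate:.2f}/1000 hours (95% CI {lower:.2f} to {upper:.2f})")
```

With small event counts an exact Poisson interval would usually be preferable to this approximation; the sketch only shows how the rate and its scaling are assembled.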

To assess if access to multidisciplinary care modified the time to RTS, we compared time to RTS between early and late access groups using a method based on median differences described elsewhere. 36 Wilcoxon rank sum tests were also performed to make between-group comparisons on single variables of age, time to first consult, the number of specialists consulted and medical visits. Fisher’s exact tests were used to compare count data between groups on variables of sex, history of concussion, time since the previous concussion, presence of injury modifiers, environment and mechanism of injury. Bonferroni corrections were applied for multiple comparisons in case of meaningful differences.
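As an illustration of the between-group comparisons described above, the sketch below implements a two-sided Wilcoxon rank sum test (equivalent to the Mann-Whitney U test) with a normal approximation, plus a Bonferroni adjustment. It is written with the Python standard library only and omits tie and continuity corrections, so its p-values may differ slightly from those produced by R, which the authors used; all function names and data are hypothetical.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Return (U, z, two-sided p) using a normal approximation without corrections."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical days to first consult in two groups.
early = [1, 2, 3, 4, 5]
late = [6, 7, 8, 9, 10]
u, z, p = wilcoxon_rank_sum(early, late)
print(u, round(z, 3), round(p, 4))
```

For samples this small an exact test would normally be used instead of the normal approximation; the approximation is kept here only to keep the sketch short.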

To assess if injury modifiers modified time to RTS in the total sample, we compared time to RTS between sexes, history of concussions, time since previous concussion or other injury modifiers using a method based on median differences described elsewhere. 36 Kaplan-Meier curves were drawn to illustrate time to RTS differences between sexes (origin and start time: date of injury; end time: clearance date of unrestricted RTS). Trajectories were then assessed for statistical differences using a Cox proportional hazards model. Wilcoxon rank sum tests were employed for comparing the total number of symptoms and severity scores on the SCAT5. The association of multilevel variables with return to play duration was evaluated in the total sample with Kruskal-Wallis rank tests for environment, mechanism of injury, history of concussions and time since previous concussion. For all subsequent analyses of correlations between SCAT5 results and secondary variables, only data obtained from SCAT5 assessments within the acute phase of injury (≤72 hours) were considered (n=65 SRC episodes in the early access group). 37 Spearman rank correlations were estimated between RTS duration, history of concussions, number of specialist consults and total number of SCAT5 symptoms or total symptom severity. All statistical tests were performed using RStudio (R V.4.1.0, The R Foundation for Statistical Computing). The significance level was set to p<0.05.
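A Kaplan-Meier curve of the kind described above estimates, at each observed clearance time, the proportion of athletes who have not yet returned to sport. The sketch below is a minimal Python version of the product-limit estimator; the function name, the handling of censored cases (athletes without a recorded clearance date) and the data are assumptions for illustration, not the authors' R code.

```python
def kaplan_meier(durations, observed):
    """Return [(time, probability of not yet having returned)] at each event time.

    durations: days from injury to clearance (or to last follow-up if censored).
    observed: True if unrestricted RTS was recorded, False if censored.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all athletes sharing the same time point.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                events += 1
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical example: four athletes, all cleared for unrestricted RTS.
curve = kaplan_meier([10, 20, 20, 30], [True, True, True, True])
for day, not_returned in curve:
    print(day, round(not_returned, 2))
```

Plotting one such curve per sex gives the visual comparison the text refers to; testing the difference between curves (here via a Cox model in the article) is a separate step not covered by this sketch.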

Equity, diversity and inclusion statement

The study population is representative of the Canadian athletic population in terms of age, gender, demographics and includes a balanced representation of female and male athletes. The study team consists of investigators from different disciplines and countries, but with a predominantly white composition and under-representation of other ethnic groups. Our study population encompasses data from the Institut national du sport du Québec, covering individuals of all genders, ethnicities and geographical regions across Canada.

Patient and public involvement

The patients or the public were not involved in the design, conduct, reporting or dissemination plans of our research.

Sample characteristics

During the 4-year period covered by this retrospective chart review, a total of 160 SRC episodes were recorded in 132 athletes with a median (IQR) age of 19.1 (17.8–22.2) years (table 2). Thirteen female and 10 male athletes had multiple SRC episodes during this time. The sample had a relatively balanced number of females (53.8%) and males (46.2%) with SRC included. A history of concussion was reported by 60% of the sample, with 35.0% reporting having experienced more than two episodes. However, most of these concussions had occurred more than 1 year before the SRC for which they were being treated. Within this sample, 33.1% of participants reported a history of injury modifiers. Importantly, the median (IQR) time to first clinic consult was 10.0 (1.0–20.0) days and the median (IQR) time to RTS was 34.0 (21.0–63.0) days in this sample (table 3). The majority of SRCs occurred during training (56.3%) rather than competition (33.1%) and were mainly due to a fall (63.7%) or a hit (31.3%). The median (IQR) numbers of follow-up consultations and specialists consulted after the SRC were 9 (5.0–14.3) and 3 (2.0–4.0), respectively.

Participants demographics

Sport-related concussion characteristics

Among seven sports of the total sample (n=89 SRC), the estimated incidence of athletes with SRC was highest in short-track speed skating (0.47/1000 hours; 95% CI 0.3 to 0.6), and lower in boxing, trampoline, water polo, judo, artistic swimming and diving (0.24 (95% CI 0.0 to 0.5), 0.16 (95% CI 0.0 to 0.5), 0.13 (95% CI 0.1 to 0.2), 0.11 (95% CI 0.1 to 0.2), 0.09 (95% CI 0.0 to 0.2) and 0.06 (95% CI 0.0 to 0.1)/1000 hours, respectively; online supplemental material). Furthermore, most athletes sustained an SRC in training (66.5%; 95% CI 41.0 to 92.0) rather than competition (26.0%; 95% CI 0.0 to 55.0) except for judo athletes (20.0% (95% CI 4.1 to 62.0) and 80.0% (95% CI 38.0 to 96.0), respectively). Falls were the most common injury mechanism in speed skating, trampoline and judo while hits were the most common injury mechanism in boxing, water polo, artistic swimming and diving.

Access to care.

The median difference in time to RTS was 19 days (95% CI 9.3 to 28.7; p<0.001) between the early (26 (IQR 17.0–38.5) days) and late (45 (IQR 27.5–84.5) days) access groups (table 3; figure 1). Importantly, the distribution of SRC environments differed between the groups (p=0.008). The post hoc analysis demonstrated a meaningful difference in the distribution of SRC in training and competition environments between groups (p=0.029) but not for the other comparisons. There was a meaningful difference between the groups in time to first consult (p<0.001; 95% CI −23.0 to −15.0), but no meaningful differences between groups in median age (p=0.176; 95% CI −0.3 to 1.6), sex distribution (p=0.341; 95% CI 0.7 to 2.8), concussion history (p=0.210), time since last concussion (p=0.866), mechanisms of SRC (p=0.412), the presence of modifiers (p=0.313; 95% CI 0.3 to 1.4), the number of consulted specialists (p=0.368; 95% CI −5.4 to 1.0) or the number of medical visits (p=0.162; 95% CI −1.0 to 3.0).

Time to return to sport following sport-related concussion as a function of group’s access to care and sex. Outliers: below=Q1−1.5×IQR; above=Q3+1.5×IQR.
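
The outlier rule stated in this caption can be sketched in a few lines. The example below is illustrative only: the function names and the durations are hypothetical, and the quartiles come from Python's `statistics.quantiles` with the `inclusive` method, which may differ slightly from the quartile definition used by the authors' R plotting routine.

```python
import statistics


def tukey_fences(values):
    """Return (lower, upper) fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values):
    """Return the values lying strictly outside the fences."""
    lower, upper = tukey_fences(values)
    return [v for v in values if v < lower or v > upper]


# Hypothetical days to RTS; one athlete has a very long recovery.
example_days_to_rts = [17, 21, 26, 30, 34, 38, 45, 63, 150]
example_fences = tukey_fences(example_days_to_rts)
example_outliers = find_outliers(example_days_to_rts)
```

Because time to RTS cannot be negative, only the upper fence is likely to flag any observations in data of this kind.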

The median difference in time to RTS was 6.5 days (95% CI −19.3 to 5.3; p=0.263; figure 1) between female (37.5 (IQR 22.0–65.3) days) and male (31.0 (IQR 20.0–48.0) days) athletes. Survival analyses highlighted an increased hazard of a longer recovery trajectory in female compared with male athletes (HR 1.4; 95% CI 1.4 to 0.7; p=0.052; figure 2A), which was mainly driven by the late (HR 1.8; 95% CI 1.8 to 0.6; p=0.019; figure 2C) rather than the early (HR 1.1; 95% CI 1.1 to 0.9; p=0.700; figure 2B) access group. Interestingly, more female athletes (n=15) than male athletes (n=6) required longer than 100 days for RTS. There were no meaningful differences between sexes in the total number of symptoms recorded on the SCAT5 (p=0.539; 95% CI −1.0 to 2.0) or the total symptom severity score (p=0.989; 95% CI −5.0 to 5.0).

Time analysis of sex differences in the time to return to sport following sport-related concussion in the (A) total sample, as well as (B) early, and (C) late groups using survival curves with 95% confidence bands and tables of time-specific number of patients at risk (censoring proportion: 0%).

History of modifiers

SRC modifiers are presented in table 2, and their influence on RTP is shown in table 4. The median difference in time to RTS was 1.5 days (95% CI −10.6 to 13.6; p=0.807) between athletes with no and one previous concussion, 3.5 days (95% CI −13.9 to 19.9; p=0.728) between athletes with no and two or more previous concussions, and 2 days (95% CI −12.4 to 15.4; p=0.832) between athletes with one and two or more previous concussions. The history of concussions (none, one, two or more) had no meaningful impact on the time to RTS (p=0.471). The median difference in time to RTS was 4.5 days (95% CI −21.0 to 30.0; p=0.729) between athletes with no concussion and those with a concussion in the previous year, 2 days (95% CI −10.0 to 14.0; p=0.744) between athletes with no concussion and those with a concussion more than 1 year ago, and 2.5 days (95% CI −27.7 to 22.7; p=0.846) between athletes with a concussion in the previous year and those with a concussion more than 1 year ago. Time since the most recent concussion did not change the time to RTS (p=0.740). The longest time to RTS was observed in the late access group in which athletes had a concussion in the previous year, with a very large spread of durations (65.0 (IQR 33.0–116.5) days). The median difference in time to RTS was 3 days (95% CI −13.1 to 7.1; p=0.561) between athletes with and without other injury modifiers. The history of other injury modifiers had no meaningful influence on the time to RTS (95% CI −6.0 to 11.0; p=0.579).

Preinjury modifiers of time to return to sport following SRC

SCAT5 symptoms and severity scores

Positive associations were observed between the time to RTS and the number of initial symptoms (r=0.3; p=0.010; 95% CI 0.1 to 0.5) or initial severity score (r=0.3; p=0.008; 95% CI 0.1 to 0.5) from the SCAT5. The associations were not meaningful between the number of specialist consultations and the initial number of symptoms (r=−0.1; p=0.633; 95% CI −0.3 to 0.2) or initial severity score (r=−0.1; p=0.432; 95% CI −0.3 to 0.2). Anecdotally, the most frequently reported symptoms following SRC were ‘headache’ (86.2%) and ‘pressure in the head’ (80.0%), followed by ‘fatigue’ (72.3%), ‘neck pain’ (70.8%) and ‘not feeling right’ (67.7%; online supplemental material).
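
For readers unfamiliar with rank correlations, the sketch below shows how a Spearman coefficient of the kind reported above can be computed: both variables are converted to ranks (ties receive the average rank) and a Pearson correlation is taken on the ranks. The function names and the data are hypothetical, and p values and confidence intervals are not computed here.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (assumes neither is constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / sqrt(var_x * var_y)


# Hypothetical initial symptom counts and days to RTS for five athletes.
example_symptoms = [3, 8, 8, 12, 15]
example_days = [14, 30, 25, 40, 90]
example_rho = spearman_rho(example_symptoms, example_days)
```

Working on ranks rather than raw values makes the coefficient insensitive to the long right tail typical of recovery durations.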

This study is the first to report descriptive data on athletes with SRC collected across several sports during an Olympic quadrennial, including athletes who received the most recent evidence-based care at the time of data collection. Primarily, results indicate that the time to RTS in athletes engaged in Summer and Winter Olympic sports may require a median (IQR) of 34.0 (21.0–63.0) days. Importantly, findings demonstrated that athletes with earlier (≤7 days) access to multidisciplinary concussion care showed faster RTS compared with those with late access. Time to RTS exhibited large variability where sex had a meaningful influence on the recovery pathway in the late access group. Initial symptoms, but not history of concussion, were correlated with prognosis in this sample. The main reported symptoms were consistent with previous studies. 38 39

Time to RTS in Olympic sports

This study provides descriptive data on the impact of SRC monitoring programmes on recovery in elite athletes engaged in Olympic sports. As hypothesised, the median time to RTS found in this study (eg, 34.0 days) was about three times longer than those found in reports from before 2005, and 2 weeks longer than the typical median values (eg, 19.8 days) recently reported across athletic levels including youth (high heterogeneity, I²=99.3%). 19 These durations were also twice as long as the median unrestricted time to RTS observed among American collegiate athletes, which averages around 16 days. 9 20 21 However, they were more closely aligned with findings from collegiate athletes with slow recovery (eg, 34.7 days) and evidence from military cadets with poor access where return to duty duration was 29.4 days. 8 22 Several reasons could explain such extended time to RTS, but the most likely seems to be related to the diversity in access among these sports to multidisciplinary services (median (IQR) of 10.0 (1–20) days), well beyond the delays experienced by collegiate athletes, for example (median of 0.0 days (0–2)). 40 In the total sample, the delays to first consult with the multidisciplinary clinic were notably mediated by the group with late access, whose athletes had more SRC during international competition. One of the issues for athletes engaged in Olympic sports is that they travel abroad year-round for competitions, in contrast with collegiate athletes who compete domestically. These circumstances likely make access to quality care very variable and make the follow-up of care less centralised. Also, access to resources among these sports is highly variable (eg, medal-dependent), 14 and at the discretion of the sport’s leadership (eg, sport federation), who may decide to allocate more or fewer resources to concussion management considering the relatively low incidence of this injury.
Another explanation for the longer recovery times in these athletes could be the lack of financial incentives to return to play faster, which are less prevalent in Olympic sports than in professional sports. However, the stakes of performance and return to play are still very high among these athletes.

Additionally, it is plausible that studies differ in their outcomes because of shifting operational definitions such as resolution of symptoms, return to activities, graduated return to play or unrestricted RTS. 19 40 It is understood that resolution of symptoms may occur much earlier than return to preinjury performance levels. Finally, an aspect that has been little studied to date is the influence of the sport’s demands on the RTS. For example, acrobatic sports requiring precision/technical skills such as figure skating, trampoline and diving, which involve high visuospatial and vestibular demands, 41 might require more time to recover or elicit symptoms for longer times. Anecdotally, athletes who experienced a long time to RTS (>100 days) were mostly from precision/skill-dependent sports in this sample. Sport demands should be further considered as an injury modifier. More epidemiological reports that consider the latest guidelines are therefore necessary to gain a better understanding of the true time to RTS and impact following SRC in Olympians.

Supporting early multidisciplinary access to care

In this study, athletes who obtained early access to multidisciplinary care after SRC recovered faster than those with late access to multidisciplinary care. This result aligns with findings showing that delayed access to a healthcare practitioner delays recovery, 19 including previous evidence in a sample of patients from a sports medicine clinic (ages 12–22), indicating that a delayed first clinical visit (eg, 8–20 days) was associated with a 5.8 times increased likelihood of a recovery longer than 30 days. 5 A prompt multidisciplinary approach for patients with SRC is suggested to be more effective than usual care, 3 6 17 which is currently being evaluated in a randomised controlled trial. 42 Notably, early physical exercise and prescribed exercise (eg, 48 hours postinjury) are effective in improving recovery compared with strict rest or stretching. 43 44 In fact, preclinical and clinical studies have shown that exercise has the potential to improve neurotransmission, neuroplasticity and cerebral blood flow, which supports the view that a physically trained brain recovers better. 45 46 Prompt access to specialised healthcare professionals can be challenging in some contexts (eg, during international travel), and the cost of accessing medical care privately may prove further prohibitive. This barrier to recovery should be a priority for stakeholders in Olympic sports and given more consideration by health authorities.

Estimated incidences and implications

The estimated incidences of SRC were in the lower range compared with what is reported in other elite sport populations. 1 2 However, the burden of injury remained high for these sports, and the financial resources as well as expertise required to facilitate athletes’ rehabilitation were considerable (median number of consultations: 9.0). Notably, the current standard of public healthcare in Canada does not subsidise the level of support recommended following SRC as first-line care, and the financial subsidisation of this recommended care within each federation is highly dependent on the available funding, varying significantly between sports. 14 Therefore, the ongoing efforts to improve education, prevention and early recognition, modification of rules to make the environments safer and multidisciplinary care access for athletes remain crucial. 7

Strength and limitations

This unique study provides multisport characteristics following the evolution of concussion guidelines in Summer and Winter Olympic sports in North America. Notably, it features a balance between the number of female and male athletes, allowing the analysis of sex differences. 23 26 In a previous review of 171 studies informing consensus statements, samples were mostly composed of more than 80% male participants, and more than 40% of these studies did not include female participants at all. 26 This study also included multiple non-traditional sports typically not encompassed in SRC research, a feature previously identified as a key requirement of future epidemiological research. 47

However, it must be acknowledged that potential confounding factors could influence the results. For example, the number of SRC detected during the study period does not account for potentially unreported concussions. Nevertheless, this figure should be minimal because these athletes are supervised both in training and in competition by medical staff. Next, the sport types were heterogeneous, with inconsistent risk for head impacts or inconsistent sport demand which might have an influence on recovery. Furthermore, the number of participants or sex in each sport was not evenly distributed, with short-track speed skaters representing a large portion of the overall sample (32.5%), for example. Additionally, the number of participants with specific modifiers was too small in the current sample to conclude whether the presence of precise characteristics (eg, history of concussion) impacted the time to RTS. Also, the group with late access was more likely to consist of athletes who sought specialised care for persistent symptoms. These complex cases are often expected to require additional time to recover. 48 Furthermore, athletes in the late group may have sought support outside of the institute medical clinic, without a coordinated multidisciplinary approach. Therefore, the estimation of clinical consultations was tentative for this group and may represent a potential confounding factor in this study.

This is the first study to provide evidence of the prevalence of athletes with SRC and modifiers of recovery in both female and male elite-level athletes across a variety of Summer and Winter Olympic sports. There was a high variability in access to care in this group, and the median (IQR) time to RTS following SRC was 34.0 (21.0–63.0) days. Athletes with earlier access to multidisciplinary care took nearly half the time to RTS compared with those with late access. Sex had a meaningful influence on the recovery pathway in the late access group. Initial symptom number and severity score but not history of concussion were meaningful modifiers of recovery. Injury surveillance programmes targeting national sport organisations should be prioritised to help evaluate the efficacy of recommended injury monitoring programmes and to help athletes engaged in Olympic sports who travel a lot internationally have better access to care. 35 49

Ethics statements

Patient consent for publication.

Not applicable.

Ethics approval

This study involves human participants and was approved by the ethics board of Université de Montréal (certificate #2023-4052). Participants gave informed consent to participate in the study before taking part.

Acknowledgments

The authors would like to thank the members of the concussion interdisciplinary clinic of the Institut national du sport du Québec for collecting the data and for their unconditional support to the athletes.

X @ThomasRomeas

Correction notice This article has been corrected since it published Online First. The ORCID details have been added for Dr Croteau.

Contributors TR, FC and SL were involved in planning, conducting and reporting the work. François Bieuzen and Magdalena Wojtowicz critically reviewed the manuscript. TR is guarantor.

Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient and public involvement Patients and/or the public were not involved in the design, or conduct, or reporting, or dissemination plans of this research.

Provenance and peer review Not commissioned; externally peer reviewed.

Supplemental material This content has been supplied by the author(s). It has not been vetted by BMJ Publishing Group Limited (BMJ) and may not have been peer-reviewed. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by BMJ. BMJ disclaims all liability and responsibility arising from any reliance placed on the content. Where the content includes any translated material, BMJ does not warrant the accuracy and reliability of the translations (including but not limited to local regulations, clinical guidelines, terminology, drug names and drug dosages), and is not responsible for any error and/or omissions arising from translation and adaptation or otherwise.

Democracy and Foreign Direct Investment in BRICS-TM Countries for Sustainable Development

The study aims to examine the long-term cointegration between the democracy index and foreign direct investment (FDI). The sample group chosen for this investigation comprises BRICS-TM (Brazil, Russia, India, China, South Africa, Turkey [Türkiye], and Mexico) countries due to their increasing strategic importance and potential growth in the global economy. Data from 1994 to 2018 were analyzed, with panel data analysis techniques employed to accommodate potential structural breaks. The level of democracy serves as the independent variable in the model, while FDI is the dependent variable. Inflation and income per capita are considered control variables due to their impact on FDI. The analysis revealed a long-term relationship with structural breaks among the model’s variables. Democratic progress and FDI demonstrate a correlated, balanced relationship over time in these countries. Therefore, governments and policymakers in emerging economies aiming to attract FDI should account for structural breaks and the correlation between democracy and FDI. Furthermore, the Kónya causality tests revealed a causality from democracy to FDI at a 1% significance level in Mexico, 5% in China, and 10% in Russia. From FDI to democracy (DEMOC), there is causality at a 5% significance level in Mexico and a 10% significance level in Russia. Thus, the findings suggest that supporting democratic development with macroeconomic indicators in BRICS-TM countries will positively impact foreign direct capital inflows.

Introduction

Economies and governments require capital infusion to augment their production and employment levels. Underdeveloped and developing nations, despite having an abundance of land and labor, grapple with capital deficiencies. Consequently, these countries often seek foreign direct investment (FDI) to address this capital shortfall. Even emerging market economies are not immune to this phenomenon, with challenges intensifying globally after the COVID-19 pandemic. Khan et al. (2023) highlighted the pivotal role of institutional quality and good governance in attracting FDI. The need for FDI has grown exponentially in an increasingly globalized world characterized by interdependence among states. Democracy and the democratic status of states emerge as critical indicators of institutional quality. Kilci and Yilanci (2022) posit that the prolonged pandemic triggered the third most significant recession, after the Great Depression of 1929 and the Global Financial Crisis of 2008–2009. Consequently, the demand for FDI has surged, positioning foreign investment as the foremost resource for fostering sustainable economic development. Within this framework, this study addresses the following research questions:

What factors attract foreign direct investment to a country?

Which factors positively impact FDI?

Reviewing the existing literature reveals that scholars from diverse disciplines address similar questions using political variables like political stability and democracy levels or economic variables such as economic stability and natural resources. However, the impact of democracy on FDI is often overlooked. For example, studies by Baghestani et al. (2019) and Gür (2020) investigated variables like oil prices, exchange rates, exports, imports, and the global innovation index but seldom considered democracy’s role in attracting FDI. Similarly, studies examining the relationship between democracy and FDI, like those by Yusuf et al. (2020) and Ahmed et al. (2021), generally excluded data from BRICS-TM countries.

Li and Resnick (2003) assert that the two paramount features of modern international political economy are the proliferation of democracy and increased economic globalization. It has become apparent that FDI inflow is a manifestation of high-level globalization and the diffusion of democracy. According to 2002 data from the United Nations Conference on Trade and Development (UNCTAD), three-quarters of total international foreign direct capital between 1990 and 2000 was directed toward democratic and developed countries (Busse, 2003).

The conceptualization of democracy, within both theoretical and historical frameworks, has been marked by inherent challenges (Suny, 2017). Aliefendioğlu (2005) defines democracy as the amalgamation of the ancient Greek terms “Demos” and “Kratos,” centered on the principle of self-governance by the people. In essence, democracy encompasses the utilization of popular sovereignty by and for the citizenry (Keser et al., 2023). Haydaroğlu and Gülşah (2016) contend that the contemporary manifestation of democracies is rooted in representative democracy, wherein individuals exercise their sovereignty by selecting representatives to act on their behalf. The spread of liberal or representative democracy is believed to be a driving force behind this shift in economic structures. The relational intersection between FDI flow and democratic mechanisms needs to be investigated. At this point, Voicu and Peral (2014) argue that economic development and modernization operate as background factors that affect the development of support for democracy. Therefore, an opinion emerges that there is an inevitable intersection between FDI flow and democratic mechanisms.

Despite the sustained attention from academia and the public, the detailed understanding of democracy’s effect on FDI remains limited (Li & Resnick, 2003). There is a noticeable gap in the literature concerning studies investigating the impact of democracy on FDI, specifically in BRICS-TM countries, which are emerging markets that attract significant FDI. Moreover, the absence of structural break panel cointegration tests in previous analyses accentuates these gaps, forming the primary motivation for this research. The study aims to fill these voids by empirically examining the relationship between democracy and FDI using data from the emerging markets of BRICS-TM countries. These countries require substantial foreign capital and are crucial for the stable development of the global economy since they are expected to become pivotal centers in the multipolar world system. The study differs from other publications by employing distinctive methods, such as structural break panel cointegration tests, to address these objectives.

Reducing costs, increasing employment-oriented production, and enhancing export capacity are paramount in global competition. If a country cannot achieve these advancements with its existing potential and dynamics, attracting foreign capital becomes imperative, necessitating the creation of multiple attraction points to entice foreign direct investments. Consequently, attracting foreign capital is significant in today’s globalized world. This study provides insights into this pressing issue in the contemporary global competitive landscape by analyzing the long-term relationship between democracy and foreign direct investment. Considering their prominence in the world economy due to recent economic growth and competitive structures, the selection of BRICS-TM countries as a sample group underscores the study’s importance. The study acknowledges the strategic importance and increasing power of BRICS-TM countries, especially China and India, which have consistently attracted significant foreign capital in recent years. Using panel data analysis techniques that incorporate structural breaks addresses a crucial gap in the literature, offering a more accurate analysis of the democracy-foreign direct investment relationship in the BRICS-TM sample group. However, data constraints related to model variables alongside the limitations of evaluating results within the framework of the chosen sample group are acknowledged later in the “Discussion” section.

Lastly, there appears to be a gap in the existing literature concerning studies that investigate the impact of democracy on FDI flow in BRICS-TM countries. The countries that attract more FDI than others raise the question of whether their democracy level empirically influences the amount of FDI. Moreover, upon examining the limited studies exploring the relationship between democracy and FDI, it is evident that none applied the structural break panel cointegration test in their analyses. These gaps collectively serve as the primary motivation for this research. Thus, the study aims to address these gaps in the existing literature and scrutinizes whether there is cointegration between the level of democracy and FDI in a country by utilizing sample group data from emerging markets of BRICS-TM countries. This selection is significant as these countries are among emerging economies with considerable developmental potential. In essence, this study aims to empirically unveil the relationship between democracy and FDI, a crucial requirement for developing economies striving to attract more foreign capital for sustainable development. Additionally, this study employs distinctive methods, such as the structural break panel cointegration test, to investigate the subject, further elaborated in the “Research Method and Econometric Analysis” section.

In global competition, the imperative to reduce costs, increase employment-oriented production, and enhance export capacity is paramount. Given a country’s potential and dynamics, if these enhancements prove elusive, the necessity arises to attract foreign capital and establish various attraction points to incentivize foreign direct investments. Therefore, attracting foreign direct investment (FDI) to a country holds tremendous significance in today’s globalized world. Before investing, foreign capital rigorously assesses the potential profit opportunities and scrutinizes various socio-economic indicators, especially democracy. For these reasons, by analyzing the long-term relationship between democracy and foreign direct investment in the BRICS-TM sample, this study incorporates analyses and inferences regarding this crucial challenge in today’s globally competitive environment.

Furthermore, it is anticipated that the strategic importance and influence of BRICS-TM countries will continue to escalate in the upcoming years. Notably, countries in the sample group, particularly China and India, have consistently attracted substantial foreign capital, and their economies exhibit ongoing growth. As evident from the graphical analysis in the study, China stands out as the world leader in attracting foreign direct investment. Considering the economic size of Russia and Brazil, the geo-strategic location of Türkiye, and the natural resource wealth of China, India, and Mexico, it is apparent that these countries are central attractions for foreign direct capital. Events with significant consequences on the global stage, such as economic crises, wars, earthquakes, and elections, can induce substantial fluctuations and structural breaks in national economies. Hence, using panel data analysis techniques that allow for structural breaks in the study fills a critical gap in the literature. This approach provides a more accurate analysis of the democracy-foreign direct investment relationship in the BRICS-TM sample group. The primary limitation in the study’s analysis is the constraint arising from the variables included in the model. Additionally, selecting the BRICS-TM sample group as the focus on developing countries can be considered another limitation, restricting the evaluation of results within this specific sample framework. The study anticipates that the policy recommendations derived from the analysis findings will guide policymakers, market players, and new researchers.

The article is organized into the following sections: (1) “Introduction” section: This section initially furnishes broad information concerning the subject matter, elucidating the lacunae in the existing literature and delineating the limitations of the study. (2) “Theoretical Frame and Literature Review” section: Subsequently, the second section delves into the examination of the theoretical framework, scrutinizing the prevailing status of the literature. (3) “Research Method and Econometric Analysis” section: The third section comprehensively addresses the research methodology employed and expounds upon the econometric analysis conducted. (4) “Results” section: The ensuing fourth section presents the study’s findings and results. (5) “Discussion” section: These results and findings are then systematically expounded upon in the fifth section within the context of the current literature. (6) “Conclusion” section: Culminating the study is a concluding section encapsulating the critical insights derived, followed by policy recommendations.

Theoretical Frame and Literature Review

As previously indicated, scarce studies have delved into the correlation between democracy and foreign direct investment (FDI). A comprehensive examination of the existing literature reveals a notable dearth of research focused on BRICS-TM countries, with most of them overlooking “democracy” as a variable and/or the connection between “democracy and FDI.” Conversely, researchers investigating FDI predominantly explore its associations with other variables, such as “exports and imports.”

The Status of the Literature on BRICS-TM Countries and Democracy and Foreign Direct Investment

The following two tables summarize the status of the current literature on the issue and its findings. In Table  1 , the literature on BRICS and/or BRIC + S + T + M countries, as well as its variables, methods, and findings, is given. Then, in Table  2 , the studies researching the relationship between democracy and FDI, their methodology, sample groups, and findings are summarized.

As can be seen in Table  1 , BRICS-TM countries were very rarely studied, and almost all of these studies neglected “democracy” as a variable and/or the relation between “democracy and FDI.” Alternatively, the studies that did examine FDI researched its relation with other variables such as export and import. Unique methods, such as structural break panel cointegration tests, were applied to investigate the issue, and this method comprises the novel part of the study. The details can be seen under the “ Research Method and Econometric Analysis ” section.

In summary, the literature review provided in Table  1 covers the relationship between democracy, foreign direct investment (FDI), and various other economic variables, focusing on BRICS-TM countries. Below is an analysis of the essential findings and gaps identified in the literature:

By applying AI (ChatGPT) to the information provided in Table  1 (studies on BRIC + S + T + M countries), key findings are double-checked and summarized below:

Limited focus on BRICS-TM countries: The literature review notes a scarcity of studies on BRICS-TM countries, with a lack of attention to the “democracy” variable in the context of FDI.

Variable relationships explored: Various studies investigate the relationships between different economic variables and FDI, such as oil prices, exchange rates, gross domestic product (GDP), international tourism, economic output, carbon emissions, exports, imports, and innovation.

Diverse methodologies: Researchers employ diverse methodologies, including directional analysis, panel ARDL cointegration, survey research, and panel cointegration, to analyze the relationships among variables.

Within this frame, a summary of the studies investigating the relationship between democracy and FDI or using similar variables is given in Table  2 .

As presented in Table 2, none of the above studies analyzed the relationship among democracy, FDI, inflation, and GDP variables for BRICS-TM countries. In addition, none of the studies applied a structural break panel cointegration test in their analysis. All these gaps motivate the authors of this study to conduct such research.

Additionally, applying AI (ChatGPT) to the information provided in Table  2 , key findings from Table  2 are double-checked and summarized below (studies on the relationship between democracy and economics):

Limited studies on democracy and FDI in BRICS-TM: The literature highlights a gap in research, as none of the studies in Table  2 specifically analyze the relationship between democracy, FDI, inflation, and GDP variables in BRICS-TM countries.

Contradictory findings on democracy and economic growth: The studies in Table  2 present contradictory findings on the impact of democracy on economic growth. Some find a positive and significant effect, while others do not establish a significant relationship.

Methodological variety: Various methods, such as dynamic fixed effects, panel data regression analysis, panel cointegration, and causality analysis, are employed to explore the relationships between democracy, FDI, and economic growth.

Upon inspection of the limited studies, contradictory results emerge, even when employing data from diverse sample groups. An illustrative example is found in the work of Busse ( 2003 ), whose research can be summarized as follows:

Results from regression analysis between FDI and democracy reveal that analogous to studies by Rodrik ( 1996 ) and Harms and Ursprung ( 2002 ), multinational corporations (MNCs) exhibit a preference for countries where political rights and freedoms are legally and practically safeguarded.

Countries that enhance their democratic rights and freedoms tend to attract more FDI per capita than predicted (Busse, 2003 ).

Li and Resnick ( 2003 ) posited that investors typically favor regimes with advanced democracy and robust legal systems over states where their properties are at risk in dictatorial regimes. From this standpoint, one can infer that a significantly high level of democracy correlates with a markedly high level of FDI. In other words, property rights violations are diminished in developing countries with robust democracies, leading to increased FDI levels (Li & Resnick, 2003 ).

However, Haggard ( 1990 ) presents a contrary perspective, arguing that authoritarian regimes may appeal more to investors seeking to safeguard their economic assets and properties. An amalgamation of opposing views arises: investors from countries with underdeveloped democracies prefer collaboration with authoritarian regimes, whereas investors from developed nations lean toward familiar democratic regimes.

Despite the contradictory and complex findings from the limited number of studies on the potential relationship between democracy and FDI, it is contended that two influential factors contribute to investment flow toward countries with legally guaranteed and well-developed democratic rights. Firstly, as proposed by Spar (1999), a transition occurs from critical sectors like agriculture and raw materials to production and tertiary sectors in the flow and stock structure of FDI in developing countries. Secondly, there is a transformation in the interest and motivation of multinational enterprises toward developing countries based on sectoral development (Busse, 2003). This underscores the impact of democratic organizations established to secure democratic rights on FDI. In instances where poor democratic governance renders a country less appealing to foreign investors, the country faces a dilemma: choosing between the limited options of “loss of foreign capital” or “democratization” (Li & Resnick, 2003). Spar (1999) emphasizes that as the reliance on governments and their policies decreases, the need for a more democratic environment, a reliable and stable legal system, and appropriate market conditions becomes increasingly crucial for the overall well-being of the country’s economy.

Upon scrutinizing the most recent studies on the subject, a trend of contradictory findings becomes apparent. For instance, Yusuf et al. ( 2020 ) found that the democracy coefficient, as a variable signifying its impact on economic growth, lacks significance for West African countries in the short and long run. In contrast, Putra and Putri ( 2021 ) asserted that “democracy has a positive and significant effect on economic growth in 7 Asia Pacific countries.” Similar to Yusuf et al., in a panel data analysis encompassing the period from 1970 to 2014 and involving 115 developing countries, Lacroix et al. ( 2021 ) concluded that “democratic transitions do not affect foreign direct investment (FDI) inflows.”

A comprehensive review of existing empirical studies reveals a notable scarcity in the number of inquiries into the relationship between democracy and foreign direct investment (FDI) (Li & Resnick, 2003 ). Moreover, the available studies yield contradictory results on this matter. Addressing this issue, it is noteworthy that Oneal ( 1994 ) conducted one of the initial qualitative examinations on the impact of regime characteristics on FDI. Despite not identifying a statistically valid relationship between regime type and FDI flow, Oneal’s research is an early exploration of this intricate relationship.

Explorations into the connection between investor behavior and political regime characteristics, particularly in determining whether democratic or authoritarian features foster more foreign direct investment (FDI), have yielded divergent outcomes. Derbali et al. ( 2015 ) found a statistically significant relationship between FDI and democratic transformation. Through an econometric analysis encompassing a sample of 173 countries, with 44 undergoing democratic transformation between 1980 and 2010, the authors observed a substantial increase in FDI flow associated with democratic transitions.

Castro (2014) examined the relationship between foreign direct investment (FDI) flow (the ratio of FDI flow to GDP) and indicators of “democracy” and “dictatorship” using a dynamic panel data model. Although the results did not furnish evidence of a direct connection between FDI and democracy, the author emphasizes that this outcome does not negate the impact of political institutions on the flow of FDI. According to Mathur and Singh (2013), their study stands out as the inaugural examination focusing on the “importance given to economic freedom rather than political freedom” in the decision-making process of foreign investors. The authors concluded that, contrary to conventional expectations, even democratic countries may attract less FDI if they do not guarantee economic freedom. Malikane and Chitambara (2017) explored the relationship between democracy and FDI, employing data from eight Southern African countries over 1980–2014. The research findings indicate a direct and positive impact of FDI on economic growth, with robust democratic institutions emerging as crucial catalysts in the respective sample countries.

Consequently, Malikane and Chitambara’s (2017:92) study suggests that the influence of FDI on economic growth is contingent upon the level of democracy in the host country. Upon scrutinizing the studies above, a pattern of conflicting findings emerges concerning the relationship between the level of democracy and the influx of foreign direct investment (FDI) to a country. Studies commonly emphasize that the impact of democracy on FDI depends upon each country’s developmental stage. The prevalence of confusion, varying findings, and conflicting results underscores the significance of empirical analyses on this matter. An overview of the literature, the identified gaps, and the need for new research are detailed under the subsequent subheading.

Overview of the Literature, Identified Gaps, and Requirements for New Research

After a detailed overview of the existing literature, the main features and gaps can be identified as follows:

Limited studies on democracy and FDI: The literature notes a scarcity of studies examining the relationship between democracy and FDI, and existing studies present conflicting results.

Context-dependent impact of democracy: Contradictory findings suggest that democracy’s impact on FDI may vary depending on a country’s development level.

Gap in BRICS-TM studies: The identified gap in the literature is the lack of research specifically addressing the relationship between democracy and FDI in BRICS-TM countries. The need for a structural break panel cointegration test is also emphasized.

Influence of political institutions: Some studies argue that solid democratic institutions positively influence FDI, while others suggest that economic freedom, rather than political freedom, may be more crucial for attracting FDI.

Requirements for new research: To fill the gap in the literature, new research should be conducted specifically targeting BRICS-TM countries.

Thus, when considering the contradictory findings, future studies should explore the contextual factors influencing the relationship between democracy and FDI in different country settings. Conducting longitudinal analyses could provide insights into the dynamic relationship between democracy and FDI over time. Comparative studies between countries with different levels of democratic development can help in understanding the nuanced impact of democracy on FDI. Last but not least, given the emphasis on structural break panel cointegration tests, future research could incorporate these analytical tools for a more comprehensive understanding of the relationships under consideration.

Last but not least, Olorogun ( 2023 ) conducted research using data from sub-Saharan countries from 1978 to 2019 and found a “long-run covariance between sustainable economic development and foreign direct investment (FDI)” and a “significant level of causality between economic growth and financial development in the private sector, FDI, and export.” So, if a significant relationship can be found between democracy and foreign direct investment, the results may also provide a useful assessment for sustainable development.

In summary, while the literature review reveals valuable insights into the complex relationship between democracy, FDI, and economic variables, there is a clear need for more targeted research in the context of BRICS-TM countries by further exploration of the contextual factors influencing these relationships.

Research Method and Econometric Analysis

This section of the study delves into the analysis methods and interpretations of the relationship between democracy and foreign direct investment (FDI). The presentation encompasses the dataset and model specifications concerning the variables under scrutiny. Specifically, analyses were conducted utilizing econometric analysis programs, namely, EViews 12, Gauss 23, and StataMP 64. The study culminated with interpreting findings and formulating policy recommendations based on the results obtained.

Data Set and Model

The study scrutinized the hypothesis to address the initial research inquiry, asserting a correlation between democracy and foreign direct investment (FDI). The research targeted BRICS-TM countries (Brazil, Russia, India, China, South Africa, Türkiye, Mexico) recognized for their increasing prominence in the global economy and anticipated growth in strategic significance. These seven emerging markets were chosen due to their demonstrated potential to attract FDI. The research covered annual data spanning 1994–2018 by employing panel data analysis techniques capable of accommodating structural breaks. Both democracy and foreign direct investments are susceptible to the influence of local and global dynamics, which can induce significant disruptions in the variables.

Consequently, the study utilized tests allowing for structural breaks to enhance the robustness of the analyses. The investigation aimed to uncover the long-term relationship between foreign direct investment and democracy, a critical indicator of economic development for emerging markets in recent years. The model developed for examining the relationship between democracy and foreign direct investment within the specified sample and data range is represented by Eq. 1:

In the model, the cross-section dimension is indexed by i = 1, 2, 3, …, N, the time dimension by t = 1, 2, 3, …, T, and the error term is denoted by ε.
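For reference, a specification consistent with the variable definitions in this section (LNFDI as the dependent variable, DEMOC as the independent variable, and INF and LNPGDP as controls) can be written as follows. The coefficient symbols β0 to β3 and the linear additive form are assumptions made here for illustration; the original notation of Eq. 1 may differ.

```latex
\begin{equation}
LNFDI_{it} = \beta_{0} + \beta_{1}\, DEMOC_{it} + \beta_{2}\, INF_{it} + \beta_{3}\, LNPGDP_{it} + \varepsilon_{it},
\qquad i = 1, \dots, N, \quad t = 1, \dots, T
\end{equation}
```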

The study’s model setup and variables were adapted from Yusuf et al. ( 2020 ), Putra and Putri ( 2021 ), and Lacroix et al. ( 2021 ) in the literature. Figure  1 shows the research design.

Figure 1. Research design

Table 3 shows the variables and data sources used in the model.

The study designated foreign direct investment (FDI), denoted as LNFDI, as the dependent variable. The independent variable was conceptualized as the democracy variable (DEMOC). To account for potential influencing factors, inflation (INF) and per capita income (PGDP) variables, known to impact FDI, were introduced into the model as control variables to draw upon insights from the existing literature. In the context of panel data analyses, selecting control variables involves consulting the literature to identify factors with substantial influence on the dependent variable. When examining factors impacting foreign direct investment (FDI), a frequently encountered category comprises various macroeconomic variables, among which inflation and per capita income are recurrently employed. Given the study’s sample composition—comprising the BRICS-TM countries—these two variables were incorporated into the model as control variables. This decision was motivated by their recurrent utilization in the literature and their direct relevance to foreign direct investments and production costs. Furthermore, the inclusion of these variables addressed a shared data constraint.

During the data collection phase, the study utilized indices reflecting “political rights” and “civil liberties,” which were acknowledged indicators of “democracy” in the literature. These indices, sourced from the Freedom House Database ( 2020 ), were incorporated into the analysis by calculating their means, which were then used as values for the democracy variable. This approach aligns with the practices of several researchers in the existing literature, such as Kebede and Takyi ( 2017 ), Doucoligaos and Ulubasoglu ( 2008 ), and Tavares and Wacziarg ( 2001 ), who have employed this index. The index operates on a scale from 1 to 7, where 1 represents the highest state of democracy and 7 corresponds to the lowest state. To facilitate analyses, calculations, and interpretation, the index values were scaled to ensure a range between 0 and 100.
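The construction of the democracy variable described above can be sketched as follows. The text states only that the two Freedom House ratings were averaged and then scaled to a 0–100 range; the linear min-max formula, the function name democracy_score, and the optional invert flag are assumptions introduced here for illustration, not the authors' documented procedure.

```python
def democracy_score(political_rights, civil_liberties, invert=False):
    """Average two Freedom House ratings (each on the 1-7 scale) and rescale to 0-100.

    Hypothetical helper: the paper does not state its exact scaling formula.
    """
    for rating in (political_rights, civil_liberties):
        if not 1 <= rating <= 7:
            raise ValueError("Freedom House ratings must lie between 1 and 7")

    mean_rating = (political_rights + civil_liberties) / 2

    # Linear min-max rescaling that preserves direction: 1 -> 0 and 7 -> 100.
    scaled = (mean_rating - 1) / 6 * 100

    # With invert=True the most democratic rating (1) maps to 100 instead of 0.
    return 100 - scaled if invert else scaled
```

Whether the rescaled index preserves or reverses the original direction (1 = most democratic) matters for reading the sign of any estimated coefficient on DEMOC, which is why the direction is exposed as an explicit parameter in this sketch.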

Freedom House assesses the degree of democratic governance in 29 countries from Central Europe to Central Asia through its annual “Nations in Transit” report. The democracy score encompasses distinct ratings on various facets, including national and local governance, electoral processes, independent media, civil society, judicial framework and independence, and corruption. Many researchers (Dolunay et al., 2017; Martin et al., 2016; Osiewicz & Skrzypek, 2020; Steiner, 2016) utilize the data provided by Freedom House in their studies. In addition to the independent variable of democracy (DEMOC), the model integrates control variables influencing FDI. Per capita income (LNPGDP) and inflation (INF) variables were incorporated within this framework. A review of the existing literature reveals that factors affecting FDI, including inflation and per capita income, have been employed in models by researchers (Botric & Skuflic, 2005; Chakrabarti, 2001; Jadhav, 2012; Ranjan & Agraval, 2011; Vijayakumar et al., 2010).

In the literature, various variables such as “trade openness, level of human capital, unemployment rates, government supports, tax costs,” which are believed to influence foreign capital, are employed as control variables in models. On the other hand, in some research, the impact of institutional quality, such as democracy and governance, on environmental quality is studied. Within this frame, Shahbaz et al. ( 2023 ) found that “institutional quality variables impacted environmental quality differently. In this sense, it is detrimental for policymakers to consider concerted measures to decrease institutional vulnerabilities and reduce the level of the informal economy.” However, in this study, inflation and per capita income variables were chosen due to their prominence as the most frequently used variables in the literature (detailed in the “ Theoretical Frame and Literature Review ” section) and their comprehensive impact on foreign direct capital in terms of macroeconomics.

Furthermore, a shared data problem is evident for the BRICS-TM sample group over 1994–2018, particularly in variables other than the control variables in the model. These issues did not arise for inflation and per capita income, because both are comprehensive and belong to general macroeconomic data. Additionally, including many control variables in the model might obscure the significance of the effect on the dependent variable in hypothesis tests examining the relationship between democracy and foreign direct investment. Finally, real GDP data, rather than nominal, were utilized in the analysis, and the logarithm of the data was represented as LNPGDP.

As explored earlier, foreign investors prioritize economic freedom over political freedom when making investment decisions (Mathur & Singh, 2013). In this context, the assurance of economic liberty and the legal protection of property rights may be linked to the level of democracy, particularly in developed countries. This condition explains why the relevant variables should be incorporated into the model and tested. The logarithms of the FDI (LNFDI) and per capita income (LNPGDP) variables were employed in the analyses. The rationale behind the logarithmic transformation lies in its capacity to facilitate the interpretation of analysis results and to place variables on a comparable scale. Additionally, taking logarithms of a series does not result in information loss; it can also help mitigate autocorrelation issues and bring the series closer to a normal distribution.
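To illustrate how the logarithmic transformation eases interpretation, consider a generic regression with a logged dependent variable, one logged regressor, and one regressor in levels. The symbols y, x, z and the coefficients below are illustrative placeholders, not the notation of this study.

```latex
\begin{align}
\ln y &= \beta_{0} + \beta_{1} \ln x + \beta_{2}\, z + \varepsilon \\
\frac{\partial \ln y}{\partial \ln x} &= \beta_{1}
  && \text{(elasticity: a 1\% rise in } x \text{ is associated with a } \beta_{1}\% \text{ change in } y\text{)} \\
\frac{\partial \ln y}{\partial z} &= \beta_{2}
  && \text{(semi-elasticity: a one-unit rise in } z \text{ is associated with roughly a } 100\,\beta_{2}\% \text{ change in } y\text{)}
\end{align}
```

Under this reading, a coefficient on a logged regressor such as per capita income is interpreted as an elasticity, while a coefficient on a regressor kept in levels, such as an index or a rate, is interpreted as an approximate percentage effect per unit change.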

Econometric Method

The primary motivation behind the study is to investigate the impact of the variable “democracy” on foreign direct investments through newly developed panel data analysis tests that allow for structural breaks, which are not commonly used in political science. In this regard, the study aims to be one of the pioneering works testing the relationship between variables related to political science and economics with an interdisciplinary perspective through innovative empirical studies. The methodological framework of this study, which analyzes the relationship between democracy and FDI using annual data for the 1994–2018 period, panel data analysis, and causality tests, is outlined below:

Graphical representation of variables and analysis of descriptive statistics.

CD lm1 (Breusch & Pagan, 1980 ), CD lm1 , and LM adj tests (Pesaran et al., 2008 ) were used in the analysis to find the presence of cross-section dependence of variables.

Panel LM test (Im, Lee, & Tieslau, 2010 ) determined whether variables in the model have a unit root.

Delta test (Pesaran & Yamagata, 2008 ) was used to determine the homogeneity or heterogeneity of variables.

Cointegration test with multiple structural breaks (Westerlund & Edgerton, 2008 ) was conducted to determine the presence of cointegration between variables.

Kónya’s causality test (Kónya, 2006 ) was conducted to investigate the existence of causal relationships between variables.

In terms of methodology, the study aims to address a significant gap in the literature on democracy. Given the chosen sample group and the specified period, it becomes evident that structural changes must be considered in the analysis because the variables of democracy and foreign direct investment are particularly susceptible to global developments, leading to substantial shifts in the markets. A literature review indicates a preference for general country-based time series analyses over new-generation tests, with classical panel data analyses commonly employed for the selected country group. In summary, an examination of the literature reveals that studies on this issue predominantly rely on first- and second-generation linear panel data analysis techniques. Therefore, incorporating unit root and cointegration tests is crucial in significantly contributing to the literature, particularly by acknowledging and addressing structural breaks in the study. Additionally, it aligns with the theoretical framework that variables such as democracy and foreign direct capital investments, susceptible to the influence of global developments, are prone to structural changes. Consequently, employing panel data analysis techniques with structural breaks gains significance and enhances the motivation and scientific robustness of the study, mainly when a substantial data range is available.

The study focuses on the BRICS-TM countries: Brazil, Russia, India, China, South Africa, Türkiye (Turkey), and Mexico. These nations have gained prominence in the global economy, and their strategic significance is anticipated to grow. The selection of this sample group is based on their demonstrated high performance and potential to attract substantial foreign direct investment globally. The study’s unique contribution lies in its examination of the impact of the democracy variable on foreign direct investments within this specific country group, employing innovative techniques not commonly found in the existing literature. Furthermore, the potential increase in foreign direct investment within these countries is expected to influence national and per capita incomes positively. The continuous enhancement of economic well-being and the rising accumulation of foreign direct investments could position these countries as new focal points of attraction in the medium and long term, fortifying their appealing characteristics.

Descriptive Statistics and Graphical Analysis of Variables

Graphical analyses provide valuable insights into the changes and fluctuations of variables over the years in econometric studies. The visual representation and interpretations of the study variables are presented in Fig.  2 .

Figure 2. Graphical representation of variables

Foreign direct investment (FDI): The graphical analysis reveals the trend and volatility of FDI over the study period (1994–2018). Peaks and troughs may indicate significant events or economic shifts influencing FDI.

Democracy index: The graphical representation illustrates the changes in the democracy index across the selected countries. Distinct patterns or shifts may be observed, indicating periods of democratic development or regression.

Inflation (INF): The inflation variable is depicted graphically, highlighting its trajectory over the analyzed years. Fluctuations in inflation rates may correlate with economic events impacting FDI.

Per capita income (PGDP): The per capita income variable is visually presented, demonstrating its variations and trends. Per capita income changes can influence countries’ attractiveness for foreign investments.

These graphical analyses serve as a foundation for understanding the dynamics of the variables under investigation and provide a visual context for further econometric interpretations.

So Fig.  2 provides a comprehensive overview of the variables examined in the study. The following key observations can be made:

Foreign direct investment (FDI): China stands out as the leader in attracting the highest FDI among the BRICS-TM countries. South Africa exhibits the lowest FDI levels in the sample group.

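Democracy index: China also holds the highest score on the democracy index, and South Africa the lowest. Because the underlying Freedom House scale runs from 1 (most democratic) to 7 (least democratic), a higher score indicates weaker political rights and civil liberties, provided the rescaling to the 0–100 range preserved the direction of the original scale.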

Per capita income (PGDP): Russia demonstrates the highest per capita income among the countries, suggesting a relatively higher economic well-being. India, conversely, has the lowest per capita income in the sample group.

Inflation (INF): Russia and Türkiye experience the highest inflation rates, while other countries exhibit fluctuating patterns at lower and similar levels.

Table 4 provides a detailed overview of the descriptive statistics for the variables under consideration. The following key statistics offer insights into the central tendencies and variations within the sample group.

The analysis of the basic descriptive statistics in Table  4 yields several noteworthy findings:

Kurtosis values: The INF variable stands out with a kurtosis value exceeding 3, indicating a sharp peak and heavy tails in its distribution. All other variables exhibit kurtosis values below 3, indicating flatter distributions with lighter tails than the normal distribution.

Skewness values: LNFDI and LNPGDP variables display negative skewness values, suggesting a longer left tail in their distributions. DEMOC and INF variables exhibit positive skewness values, indicating longer right tails in their distributions.

Jarque–Bera test: The Jarque–Bera test statistics are statistically significant, so the null hypothesis of normality is rejected and the variables are found to deviate from a normal distribution. This departure from normality may reflect specific factors or events influencing the distributions of the variables.

These findings provide insights into the shapes and characteristics of the variable distributions. As indicated by skewness and kurtosis values, the deviations from normality suggest that the variables may be subject to specific influences or events, contributing to their non-normal distributions. Researchers should consider these distributional characteristics when interpreting the results and drawing conclusions from the dataset.
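The three statistics discussed above can be computed directly from a series. The sketch below uses only the Python standard library; the function names are introduced here for illustration, and the moment-based formulas (population moments, kurtosis with the normal benchmark at 3) are the textbook definitions, which may differ in small-sample corrections from what a given econometrics package reports.

```python
import math


def skewness_kurtosis(series):
    """Return (skewness, kurtosis) based on population central moments."""
    n = len(series)
    mean = sum(series) / n
    m2 = sum((v - mean) ** 2 for v in series) / n
    m3 = sum((v - mean) ** 3 for v in series) / n
    m4 = sum((v - mean) ** 4 for v in series) / n
    skew = m3 / m2 ** 1.5   # > 0: longer right tail, < 0: longer left tail
    kurt = m4 / m2 ** 2     # equals 3 for a normal distribution
    return skew, kurt


def jarque_bera(series):
    """Return the Jarque-Bera statistic and its asymptotic p-value.

    Under the null of normality the statistic is chi-squared with 2 degrees
    of freedom, whose survival function is exactly exp(-x / 2).
    """
    n = len(series)
    skew, kurt = skewness_kurtosis(series)
    statistic = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    p_value = math.exp(-statistic / 2)
    return statistic, p_value
```

A small p-value leads to rejecting normality. With only 25 annual observations per country, the asymptotic p-value should be read with some caution.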

Cross-section Dependence Test

The escalating interdependence among countries in global economies has rendered them susceptible to the impact of positive or negative developments in one nation affecting others. This phenomenon directly results from the deepening global integration associated with globalization. Consequently, econometric studies must incorporate cross-section dependence tests to gauge the extent of interaction between nations. Such tests aim to quantify how a shock in one country reverberates across borders, influencing other countries of the global economic landscape.

Studies addressing cross-section dependency (Andrews, 2005; Pesaran, 2006; Phillips & Sul, 2003) emphasize that failing to account for cross-section dependence may lead to biased and inconsistent results. Thus, relevant studies should test for cross-sectional dependence before proceeding with further analysis (Breusch & Pagan, 1980; Pesaran, 2004).

The tests used to determine cross-section dependence were as follows:

When the time dimension is greater than the cross-section dimension ( T  >  N ), analyses were conducted using Breusch and Pagan’s ( 1980 ) CD lm1 test.

In cases when the time dimension is equal to the cross-section dimension ( T  =  N ), the CD lm2 test (Pesaran, 2004 ) was used to conduct analyses.

In cases when the time dimension was smaller than the cross-section dimension ( T  <  N ), analyses were conducted by CD lm test (Pesaran, 2004 ).

Whether the time dimension is smaller or greater than the cross-section dimension, analyses can be conducted using the LM adj test (Pesaran et al., 2008).
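The statistics behind the first three tests in this list share one ingredient: the pairwise correlations between cross-section units. The sketch below computes the Breusch and Pagan (1980) LM statistic, the scaled LM statistic, and the CD statistic of Pesaran (2004) from a panel supplied as a list of N series of length T. The mapping of these formulas to the labels CD lm1, CD lm2, and CD lm is an assumption based on common usage; the bias-adjusted LM adj statistic is omitted because it requires additional moment terms. In applied work the correlations are usually taken from regression residuals rather than raw series.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cross_section_dependence(panel):
    """Compute LM, scaled LM and CD statistics for a panel of N series of length T."""
    n = len(panel)
    t = len(panel[0])
    rhos = [correlation(panel[i], panel[j])
            for i in range(n) for j in range(i + 1, n)]

    # Breusch-Pagan LM: chi-squared with N(N-1)/2 degrees of freedom (suited to T > N).
    lm = t * sum(r * r for r in rhos)
    # Scaled LM: asymptotically standard normal.
    scaled_lm = math.sqrt(1 / (n * (n - 1))) * sum(t * r * r - 1 for r in rhos)
    # Pesaran CD: asymptotically standard normal, based on the sum of correlations.
    cd = math.sqrt(2 * t / (n * (n - 1))) * sum(rhos)
    # Two-sided p-value for CD from the standard normal distribution.
    cd_p_value = math.erfc(abs(cd) / math.sqrt(2))

    return {"lm": lm, "scaled_lm": scaled_lm, "cd": cd, "cd_p_value": cd_p_value}
```

For the panel in this study, the input would contain N = 7 series with T = 25 observations each, so the LM statistic would be compared with a chi-squared distribution with 21 degrees of freedom.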

This study’s analysis focuses on the relationship between democracy and FDI across BRICS-TM countries, involving seven countries. With annual data spanning 1994–2018, the cross-section dimension is denoted by N  = 7 and the time dimension by T  = 25. Given that T  >  N , the study utilized the CD lm1 test (Breusch & Pagan, 1980 ) and CD lm1 and LM adj tests (Pesaran et al., 2008 ).

Given that T  >  N for the countries and time dimension, the decision-making is informed by the results of the CD lm1 and LM adj tests. Notably, LM adj test results were prioritized, considering the potential bias in cross-section dependency tests associated with the CD lm1 test. The findings of the cross-section dependence tests are presented in Table  5 .

Upon reviewing Table  5 , it is evident that the probability values for all variables are less than 0.01. Consequently, based on the LM adj test results, the null hypothesis stating “there is no dependence between sections” is rejected, while the alternative hypothesis suggesting “cross-section dependence between sections” is accepted.

The outcomes of the tests align with the characteristics of the contemporary global landscape, where any impactful event or development in one of the BRICS-TM countries has reverberations across others. Whether positive or negative, changes in one BRICS-TM nation can influence others, particularly in areas related to foreign direct investment (FDI) and democracy. As a result, policymakers in these countries should craft their future strategies with a keen awareness of this interconnectedness and the potential spillover effects on FDI and democracy. Indeed, the obtained result is consistent with theoretical expectations. The observed interdependence and influential power of the BRICS-TM country group align with the current dynamics of the globalized world. Their growing significance in the world economy and their strategic importance reinforce the conclusion that developments within these countries have substantial implications beyond their borders. This outcome underscores the need for a nuanced approach to respond to the interconnected nature of these nations in the contemporary global landscape.

Panel Unit Root Test

In the initial phase of the econometric analysis, the stationarity of the variables in the models was determined through unit root analyses to address the spurious regression problem. Accurate results cannot be obtained when a unit root is present in a series of variables (Granger & Newbold, 1974 ). In panel data analysis, the primary consideration in stationarity tests is whether the countries are independent of each other or not. Unit root tests in panel data analysis comprise first- and second-generation tests, each with distinct characteristics. The first generation of unit root tests is further divided based on the homogeneity and heterogeneity assumptions of the countries. Some authors conducted tests under the homogeneity assumption (Breitung, 2005 ; Hadri, 2000 ; Levin et al., 2002 ), while some others pursued their analysis under the heterogeneity assumption (Choi, 2001 ; Im et al., 2003 ; Maddala & Wu, 1999 ).

Additionally, second-generation tests incorporate cross-section dependency into their analyses, whereas first-generation tests do not account for it. Given the dynamics of the global world, the use of second-generation tests in the literature is deemed more beneficial, as it is more realistic to assume that other countries will be affected by a shock experienced by one of the countries in the panel. Panel unit root tests have gained broader acceptance in time series analysis due to their ability to provide more meaningful results than standard stationarity tests. In recent years, there has been a preference for tests that allow for structural breaks, especially in series sensitive to economic variations such as foreign trade, exchange rates, and foreign capital. Hence, this study utilized panel unit root tests that consider structural breaks to assess the stationarity of variables susceptible to cyclical fluctuations, including democracy, inflation, per capita income, and FDI. Conducting stationarity tests without accounting for structural breaks can yield misleading results, making panel LM unit root tests with structural breaks the method of choice for this study.

The panel LM test (Im, Lee, & Tieslau, 2010) examines series in models with level and trend, allowing for one or two breaks. In this study, analyses with a single break were preferred because of the shortness of the time interval and the small number of events expected to cause breaks in the period. The LM test statistics were used to assess the null hypothesis that “there is a unit root” (ϕ_i = 0). A distinctive feature of this test, compared with others, is that it allows different break dates for different countries. It also permits a structural break under both the null and the alternative hypotheses, which is an additional advantage. The asymptotic distribution of the test is standard normal and is unaffected by the presence of a structural break. Table 6 presents the stationarity analysis results of the series for the seven countries based on the model allowing a break in level.

The analysis of Table  6  yields the following observations:

In the unit root models allowing for a break in the constant, all variables in the panel become stationary when their first differences are taken. In other words, since the series are I(1) for the panel as a whole, the necessary condition for cointegration tests is met. The estimated break dates suggest that global and local developments caused structural breaks in these countries.

On a country basis, the following conclusions can be drawn from Table  6 :

The FDI variable is stationary at the level value in Russia and India, while for the series whose differences are calculated the same variable is stationary in India and Türkiye.

The per capita income variable is stationary at a level value only in Türkiye. However, the same variable is stationary in Brazil, India, and Türkiye for the series whose differences are computed.

The inflation variable is stationary at the level value in South Africa and Mexico. However, the same variable is stationary for the series whose differences are computed in Brazil, Russia, and China.

The democracy variable is stationary at the level value in Brazil, South Africa, and Türkiye. However, the variable is stationary in Brazil, Türkiye, and Mexico for the series whose differences are computed.

Table 7 shows the stationarity analysis results of seven countries based on the model that allows breaks in level and trend.

The results in Table  7 can be analyzed based on the following points:

General panel evaluation: Foreign direct investment (FDI) and per capita income are stationary at the level values when the panel is considered as a whole. Taking the difference of these variables increases the degree of stationarity. Inflation and democracy, the other variables in the model, exhibit unit roots at the level values but are stationary once the difference is taken. Overall, all series are stationary at the I(1) level with structural breaks for the entire panel, which suggests that the necessary conditions for the cointegration test are met. The dates of the structural breaks indicate that significant social, political, and economic developments in the BRICS-TM countries included in the sample are likely to have caused these breaks in the series.

Results from Table  7 can be interpreted on a country-specific basis as follows:

Brazil: FDI and per capita income are stationary at the level value. Inflation is stationary at the level, while democracy is stationary at the difference.

Russia: FDI and per capita income are stationary at the level value. Inflation is stationary at the level, while democracy is stationary at the difference.

India: FDI is stationary at the level value. Per capita income is stationary at the level, while inflation and democracy are stationary at the difference.

China: FDI is stationary at the difference. Per capita income is stationary at the level, while inflation and democracy are stationary at the difference.

South Africa: FDI is stationary at the level value. Per capita income is stationary at the level, while inflation and democracy are stationary at the difference.

Türkiye: FDI is stationary at the level value, per capita income is stationary at the level, and inflation and democracy are stationary at the difference.

Mexico: FDI is stationary at the difference. Per capita income is stationary at the level, while inflation and democracy are stationary at the difference.

These country-specific findings indicate variations in the stationarity characteristics of the variables, highlighting the importance of considering individual country dynamics in the analysis. The results of the panel unit root tests, both with and without structural breaks, provide insights into the stationarity of the variables. The interpretation suggests that a shock to one of the countries included in the model can lead to permanent effects that do not dissipate immediately. As confirmed by the tests, the non-stationarity of the series establishes the necessary condition for cointegration tests.

Moreover, when the same tests are conducted on the first differences of all series, the variables are found to be stationary; that is, they are integrated of order one, I(1), in line with theoretical expectations. The I(1) property implies that shocks to the levels of the variables are persistent, while their first differences revert to a stable mean. Because all variables share the same order of integration, it is meaningful to test for long-run (cointegrating) relationships among them.
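The distinction between a level series and its first difference can be illustrated with a small simulation. The sketch below is not the panel LM test used in the study. It assumes a pure random walk driven by hypothetical Gaussian shocks, builds the level series, and shows that differencing recovers the shocks, which is what it means for a series to be I(1).

```python
import random


def random_walk(shocks, start=0.0):
    """Cumulate shocks into a level series: y_t = y_{t-1} + e_t."""
    levels, current = [], start
    for e in shocks:
        current += e
        levels.append(current)
    return levels


def first_difference(series):
    """Return y_t - y_{t-1} for t = 2..T."""
    return [b - a for a, b in zip(series, series[1:])]


random.seed(0)
# Hypothetical shocks; T = 25 mirrors the study's time dimension only.
shocks = [random.gauss(0.0, 1.0) for _ in range(25)]
levels = random_walk(shocks)      # every shock stays in the level for good
diffs = first_difference(levels)  # equals the shocks from the second period on
```

Each shock remains in every later value of `levels`, so the level series has no fixed mean to return to, whereas `diffs` is just the sequence of shocks and is stationary by construction.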

Homogeneity Test of Cointegration Coefficients

The homogeneity of coefficients plays a crucial role in determining the relationship between variables in panel data studies. It helps organize subsequent tests used in the analysis. The homogeneity test examines whether the change in one country is affected at the same level by other countries. Coefficients are expected to be homogeneous in models for countries with similar economic structures, while they may be heterogeneous for countries with different economic structures. Pesaran and Yamagata ( 2008 ) developed the delta test based on Swamy ( 1970 ) to determine whether the slope parameters of cross-sections are homogeneous. The null hypothesis for this test is “slope coefficients are homogeneous.” Homogeneity, in the context of panel data analysis, implies that the coefficients of the slopes are the same for all units or entities within the panel. On the other hand, heterogeneity indicates that, at least in one of the entities, the slope coefficients differ from those in the rest of the panel. Testing for homogeneity helps assess whether the relationship between variables is consistent across all units or if there are significant variations.

As seen in Table  8 , the delta homogeneity test was performed to determine whether the slope coefficients of the model differ between units.

The delta test results indicate that the slope coefficients vary between units in the long term, given that the probability values for both test statistics are smaller than 0.05, as presented in Table 8. The null hypothesis of homogeneous slopes is therefore rejected: the relationships between the variables are not uniform across units over the long term. This result aligns with expectations and with theory, since the countries in the BRICS-TM sample have different economic structures and behaviors.
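The final standardization step of the delta test can be sketched as follows. The code assumes that the dispersion statistic (written here as `s_tilde`, a weighted sum of squared deviations of the country slopes from the pooled slope) has already been computed, and it applies the standardization as commonly stated for Pesaran and Yamagata (2008): mean k and variance 2k for the basic statistic, and variance 2k(T-k-1)/(T+1) for the small-sample adjusted one. The function name, the argument names, and the example values are hypothetical, and the formulas should be checked against the original paper before use.

```python
import math


def delta_statistics(s_tilde, n_units, t_obs, k):
    """Return (delta, delta_adj) from a precomputed dispersion statistic.

    s_tilde : Swamy-type dispersion statistic (assumed given)
    n_units : number of cross-section units N
    t_obs   : number of time periods T
    k       : number of slope regressors
    """
    scaled = s_tilde / n_units - k
    delta = math.sqrt(n_units) * scaled / math.sqrt(2 * k)
    # Small-sample variance used by the adjusted statistic.
    var_adj = 2 * k * (t_obs - k - 1) / (t_obs + 1)
    delta_adj = math.sqrt(n_units) * scaled / math.sqrt(var_adj)
    return delta, delta_adj


# Hypothetical input: N = 7 and T = 25 as in the study, k = 3 assumed.
delta, delta_adj = delta_statistics(s_tilde=42.0, n_units=7, t_obs=25, k=3)
```

Both statistics are compared with standard normal critical values; large positive values lead to rejection of slope homogeneity.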

Panel Cointegration Test with Structural Break

Different methods are employed to determine the existence of long-term cointegration among the model’s variables. First-generation tests do not account for cross-section dependence. Second-generation tests consider cross-section dependence but do not incorporate structural breaks (Koç & Sarica, 2016). To obtain realistic and unbiased results, it is crucial to use cointegration tests that take structural breaks into account. Therefore, the panel cointegration test of Westerlund and Edgerton (2008), hereafter PCWE, was employed, given that the series are stationary at the I(1) level.

PCWE was developed based on unit root tests that utilize Lagrange multiplier (LM) statistics, obtained from multiple repetitions (bootstrap). The merits of this test can be succinctly summarized as follows (Koç & Sarica, 2016 ; Göçer, 2013 ):

It takes into account cross-section dependency and structural breaks.

It accommodates heteroscedasticity and autocorrelation.

It identifies breaks at different dates for each country in terms of both constants and slopes.

Potential inherent problems in the model can be addressed with fully adjusted least squares estimators.

This test is effective in yielding reliable results even with small sample sizes.

This study opted for PCWE tests, given their robust characteristics. Additionally, considering the limited number of countries in the sample and the anticipation of few structural breaks in the specified period, the PCWE test was the preferred choice. As depicted in Table  9 , the determination of statistically significant cointegration between variables is made based on the significance levels of the probability values.

As indicated in Table 9, cointegration is observed at the 5% significance level in the regime change model and at the 1% significance level in the model without a break, whereas no cointegration is observed in the “change at level” model. The presence of cointegration suggests a long-term relationship between democracy and FDI in BRICS-TM. In simpler terms, democratic developments and FDI move together over the long run, indicating an equilibrium relationship between them. This study specifically tested the existence of a long-term relationship between FDI and democracy, and the inclusion of structural breaks was found to be significant. These results align with the study’s hypothesis: once the structural break periods of the sample countries are taken into account, a long-term relationship between the variables in the model becomes evident. Governments and decision-makers, particularly in developing countries such as BRICS-TM, should therefore consider the relationship between democracy and FDI, and take structural breaks into account, to attract foreign investment effectively. It is thus emphasized that “any development related to democracy has the potential to influence FDI, and considering this factor is beneficial in the formulation and implementation of socio-economic policies.” This finding underscores the importance of considering not only the overall relationship between democracy and FDI but also the specific historical contexts and transitions in individual countries that might contribute to this relationship. Future researchers may explore the direction of the relationship between these variables across different samples.

Regarding structural breaks in countries in the sample within the scope of cointegration in the regime change model, local and global developments, in general, cause breaks. The reasons for structural break dates in the sample countries are given in Table  10 .

The following items can be aligned with the breaking dates provided in Table  10 :

In Russia, a macroeconomic recovery and positive expectations regarding agreements with the IMF became prominent in 1996, during the country’s transition period.

In Brazil, 2000 is regarded as the start of a rapid growth trend, following the move to inflation targeting after the 1999 Russian Crisis.

China’s accession to the World Trade Organization in 2001 was regarded as an essential development in the global economy.

In Türkiye, the biggest economic crisis in the country’s history and the start of a dominant single-party regime were remarkable developments in 2002.

In Mexico, the 2005 election results, hurricane disasters, and an 8.7-magnitude earthquake created significant socio-economic problems that year.

In South Africa, the ANC’s coming to power alone in 2009 was interpreted as a sign of continuity for the national and regional economy, and it also removed a series of uncertainties.

The devaluation experienced in India in 2016 has created a significant break.

Of course, the impact of such structural breaks should be considered. Toguç et al. (2023) argued that “differentiating these short-term and long-term effects has implications for risk management and policymaking.” Since structural breaks increase risk and uncertainty, foreign capital tends to prefer other destinations.

Kónya’s Causality Test

This test (Kónya, 2006 ) investigates the existence of causality between variables using the seemingly unrelated regression (SUR) estimator (Zellner, 1962 ). One advantage of this test is that the causality test can be applied separately to the countries that make up the heterogeneous panel. Another important advantage is that it is unnecessary to apply unit root and cointegration tests, as country-specific critical values are produced. According to the test results, if the Wald statistics calculated for each country are greater than the critical values at the chosen significance level, the null hypothesis of “no causality between the variables” is rejected. In other words, a Wald statistic greater than the critical value indicates that there is causality between the variables.
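The decision rule described above can be sketched as a comparison between a country's Wald statistic and critical values read off a bootstrap distribution. The code below covers only that comparison step. It assumes that the bootstrap Wald statistics for the country have already been generated under the null (the SUR estimation and the resampling are not shown), and the function names and the numbers are hypothetical.

```python
def bootstrap_critical_values(bootstrap_walds, levels=(10, 5, 1)):
    """Upper-tail empirical critical values at the given percent levels."""
    ordered = sorted(bootstrap_walds)
    n = len(ordered)
    crit = {}
    for pct in levels:
        # Integer ceiling of (100 - pct) / 100 * n, converted to a 0-based index.
        idx = ((100 - pct) * n + 99) // 100 - 1
        crit[pct] = ordered[max(0, min(n - 1, idx))]
    return crit


def strongest_rejection(wald, critical_values):
    """Smallest percent level at which `wald` exceeds its critical value.

    Returns None when the null of no causality cannot be rejected
    at any of the supplied levels.
    """
    rejected = [pct for pct, c in critical_values.items() if wald > c]
    return min(rejected) if rejected else None


# Hypothetical bootstrap distribution for one country (illustration only).
crit = bootstrap_critical_values(list(range(1, 101)))
level = strongest_rejection(96, crit)
```

Because the critical values are built separately for each country, the same Wald statistic can be significant in one country and insignificant in another, which is why the results are reported country by country.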

The Kónya causality test results provided in Table  11 revealed a causality from democracy (DEMOC) to FDI at a 1% significance level in Mexico, 5% in China, and 10% in Russia. In addition, from FDI to democracy (DEMOC), there is causality at a 5% significance level in Mexico and a 10% significance level in Russia.

According to the results in Table  12 for the causality between foreign direct investment (FDI) and PGDP, the Kónya causality tests revealed a one-way causality from PGDP to FDI at a 10% significance level in Mexico.

According to the results provided in Table  13 for the causality between foreign direct investment (FDI) and inflation (INF), the results of the Kónya causality tests revealed a one-way causality from inflation to FDI at a 10% significance level in Türkiye and, conversely, a one-way causality from FDI to inflation at a 10% significance level in South Africa.

The study investigated the nexus between democracy and foreign direct investment (FDI) using annual data from a sample of seven emerging-market countries over 1994–2018. According to the cross-section dependence test results, the probability values of all variables were less than 0.01, indicating significant cross-section dependence. The rejection of the null hypothesis stating “there is no dependence between sections” in favor of the alternative hypothesis suggesting “there is cross-section dependence between sections” aligns with the contemporary global landscape. In today’s interconnected world, any impactful event or development in one of the BRICS-TM countries has reverberations across the others, particularly in areas related to FDI and democracy. These findings underscore the imperative for governments and policymakers in these countries to craft future strategies with a keen awareness of this interconnectedness and the potential spillover effects on FDI and democracy.

Furthermore, the outcomes of the panel unit root test indicate that all variables in the panel become stationary at the I(1) level when their differences are calculated, meeting the necessary conditions for cointegration tests. This result suggests that global and local developments in countries cause structural breaks when considering these break dates. Variations in stationarity characteristics of variables were observed on a country basis, highlighting the importance of considering individual country dynamics in the analysis.

The delta homogeneity test results suggest that the variables exhibit heterogeneity, implying that the relationships between variables are inconsistent across all units over the long term. This aligns with expectations and emphasizes the diverse economic characteristics and behaviors within the sample group of BRICS-TM countries.

The Westerlund-Edgerton cointegration test results reveal significant cointegration between variables, observed at a 1% significance level in the model without a break and a 5% level in the regime change model. This result signifies a sustained relationship between FDI and democracy in BRICS-TM countries over the long term. Future researchers may explore the direction of these variables across different samples, while governments and decision-makers should consider this relationship, particularly in developing countries, to attract foreign investment effectively.

Kónya’s causality test results also indicated significant causality between some of the variables in some countries within the sample group. First, there is causality from democracy (DEMOC) to FDI in Mexico (1% significance level), China (5% significance level), and Russia (10% significance level). Second, there is significant causality from FDI to democracy (DEMOC) in Mexico (5% significance level) and Russia (10% significance level). Third, a one-way causality from PGDP to FDI was found only in Mexico (10% significance level). Fourth, there is a one-way causality from inflation to FDI in Türkiye (10% significance level) and a one-way causality from FDI to inflation in South Africa (10% significance level). Thus, Kónya’s causality test results supported the hypothesis of the research with significant results.

In conclusion, the empirical findings establish a statistically significant and robust relationship between the level of democracy and the flow of FDI in BRICS-TM countries. These findings underscore the intertwined nature of political and economic dynamics within these nations and highlight the importance of considering both aspects in policy formulation and decision-making processes.

The relationship between the level of democracy and foreign direct investment (FDI) in BRICS-TM countries is an area that requires further exploration, and comparing the findings of this study with those of previous research helps to show its significance. While earlier studies predominantly concentrated on the preferences of host countries in attracting foreign investment, few delved into the factors influencing foreign investors’ choices. A notable exception is Li and Resnick (2003), who highlighted the pivotal question of “Why do companies invest in foreign countries?” and proposed a theory positing that “democratic institutions impact FDI flow in both positive and negative ways” (Li & Resnick, 2003, p. 176). Their conclusions from data analysis of 53 developing countries spanning 1982–1995 align with the current study’s outcomes. Specifically, they found that (1) advancements in democracy lead to heightened property rights protection, fostering increased FDI inflows, and (2) conversely, democratic improvements in underdeveloped nations result in diminished FDI flows. These findings correspond with our study, given that the sampled countries are a mix of developing and developed nations, mirroring the first scenario described by Li and Resnick.

Derbali et al. ( 2015 ) concluded in a similar vein in their study, examining a massive dataset spanning from 1980 to 2010 with 173 countries, 44 of which underwent democratic transformation. Their observation that “variables related to human development and individual freedom initiate the democratic transformation process, contrary to the social heterogeneity variable” aligns with the results of the present study when interpreted in reverse. This scenario prompts a chicken-and-egg question: Does the level of democracy positively influence the flow of FDI, or does FDI flow positively impact the level of democracy? The authors tackled this issue in the second stage of their analysis and determined that democratic transformation leads to a substantial increase in FDI inflows. Our findings corroborate this perspective with evidence from a different sample group of countries.

Malikane and Chitambara (2017), analyzing the relationship between FDI, democracy, and economic growth in eight Southern African countries from 1980 to 2014, concluded that FDI has a direct and positive impact on economic development, implying that strong democratic institutions serve as notable drivers of economic growth. Their findings suggest that the effect of FDI on economic growth is contingent on the level of democracy in the host country. In another study on developing countries, Khan et al. (2023) found that specific determinants of good governance, such as control of corruption, political stability, and voice and accountability, significantly attract FDI inflows. However, other determinants, including government effectiveness, regulatory quality, political system, and institutional quality, significantly reduce FDI inflows. By contrast, they found that in Asian countries, all institutional quality indicators except control of corruption have a significant and positive effect on FDI inflows (Khan et al., 2023). The significant relationships identified between these phenomena across various indicators for developing and Asian countries align with the findings of our study.

Developed and developing nations actively engage in concerted efforts to attract foreign capital investments in the contemporary global economic landscape. Foreign direct investments (FDIs) stand out as a pivotal form of investment that significantly influences a country’s growth and development trajectory. The inflow of direct foreign capital brings multifaceted contributions to a nation’s economy, encompassing vital aspects such as capital infusion, technological advancement, elevated management standards, expanded foreign trade opportunities, employment generation, sectoral discipline, access to skilled labor, and risk mitigation.

In addition, foreign direct investment (FDI) holds significant importance not only in the general context of sustainability but also specifically in sustainable development. To better understand this close relationship between sustainable development and FDI, it is useful first to examine the concept of sustainability briefly. Simply put, sustainability entails maintaining a favorable condition through methods that cause no harm yet are supportable, legally and scientifically verifiable, defendable, and implementable (Ratiu, 2013). From a developmental perspective, it signifies maintaining continuity without losing control. According to Menger (2010), sustainability can be defined as the ability to grow and survive independently. The author emphasizes that the concept of sustainability is closely related to “creativity” and “cultural vitality,” as well as being an “internally growing” and “self-sustaining” trend with innovative effects that also attract different social strata.

Within the context of all these existing barriers and dilemmas, managing the process of reducing the negative aspects while increasing and offering the positives to people must be handled with care. This intricate process, termed sustainable development, is like the search for the cosmos in chaos as it aims to balance the economic, environmental, and social dimensions of both local urban areas and regional and national areas, and even the global sphere, especially with climate change becoming one of the main negative impacts on the environmental dimension. Gazibey et al. ( 2014 ) also noted that, while some problem areas, such as “poverty reduction” are mainly related to the economic and somewhat to the social dimensions of sustainability, other issues like “climate change” and “reduction of carbon footprint” are more related to the environmental dimension. An in-depth examination reveals that many problems, which may initially seem related to a single dimension, are intertwined with multiple dimensions. Thus, while attracting foreign direct investment to a country may seem primarily related to the economic dimension at first glance, it is closely linked to environmental and social dimensions.

In the most straightforward terms, once the basic needs of individuals are met, higher-level needs take priority. This, in turn, supports sustainable development in all three dimensions. Thus, while foreign capital invested in a country may initially support economic sustainability, its contribution to the socio-economic level of individuals lays the groundwork, in the medium and long term, primarily for social and educational improvement and secondarily for environmental enhancement that results in a more livable environment. For example, Xu et al. (2024) argued that “China is currently exploring a sustainable development mode of collaborative governance.” Under good governance, all social partners expected to be affected by possible policies are included in the decision-making process. This process is related to and supports the participation dimension of democracy. So, like the links of a chain, a good level of democracy supports the level of governance, and governance supports the accumulation of FDI and economic performance. Consequently, these favorable conditions might pave the way for sustainable development. Another study (Olorogun, 2023) found a long-run relationship between financial development in the private sector and economic growth in sub-Saharan Africa, with data spanning from 1978 to 2019. According to the results of the author’s research, there is a long-run covariance between sustainable economic development and foreign direct investment (FDI) and a significant level of causality between economic growth and financial development in the private sector, FDI, and export.

Indeed, sustainability resembles a ball resting on a three-legged stool: Any absence or imbalance in one of this tripod’s economic, social, or environmental legs will cause the ball to fall. In other words, sustainable development requires addressing all three dimensions in a balanced manner.

This idea brings us to the focus of this research: the level of democracy, FDI, and the relationship between them essentially concern all three dimensions. In countries with a higher level of democracy, the possibility of developing policies that consider citizens’ demands and preferences is higher than in countries with lower levels of democracy. Conversely, in countries with lower levels of democracy, the likelihood of prioritizing the preferences and gains of specific individuals or groups over issues such as sustainability, environmental protection, and social welfare is higher. Consequently, this situation will negatively affect both the potential level of FDI attracted to the less developed country and, ultimately, the sustainable development momentum.

To sum up, numerous factors play a crucial role in shaping decisions related to foreign direct investments. Particularly in underdeveloped and developing countries, where domestic capital accumulation might be insufficient, the preference for attracting direct foreign capital investments emerges as a strategic choice over external borrowing. This strategic approach is driven by fostering economic development and sustainable growth while leveraging the benefits associated with foreign capital inflows.

The empirical evidence on the relationship between democracy and the level of foreign direct investment (FDI) often presents conflicting results, influenced by variations in study periods and sample compositions. Notably, these disparities can be traced back to the differing development levels of countries under scrutiny.

Reviewing previous studies reveals a recurring pattern wherein developed countries exhibit a positive and significant correlation between democracy and FDI. Conversely, in underdeveloped or developing nations, a negative relationship tends to prevail between these two variables. This disparity hinges on the distinct behavior of capital owners seeking to invest in already developed countries, where business transactions are grounded in established legal frameworks, property rights, and the rule of law. In contrast, underdeveloped and developing countries often witness capital owners engaging in potentially illicit and unethical business dealings with high risks and potential returns.

These arrangements are frequently based on different interests and assurances with individuals and groups in positions of power. In essence, the ease of resource acquisition, processing, and exportation in underdeveloped countries becomes contingent upon the presence of authoritarian regimes. Such relationships of interest with authoritarian regimes provide investment security for global investors. However, these regimes—keen on preserving these relationships—are disinclined to have their dealings exposed, which in turn leads to increased pressure on their citizens. The resulting mutualistic relationship transforms into a lucrative exploitation process.

When the outcomes of the panel data analysis incorporating structural breaks were examined, it was found that all variables demonstrated significance at the 1% level. The cross-sectional dependency analysis results indicated a significant cross-sectional relationship between the variables. In the panel unit root test, it was observed that the variables in the model exhibited unit roots at the level, but their differences rendered all variables stationary. The delta homogeneity test findings suggested that the variables lacked homogeneity. Furthermore, the results of the panel cointegration test with structural breaks affirmed a long-term relationship, with significance levels of 1% in the model without breaks and 5% in the regime change model. Lastly, the bidirectional and one-directional causality found between FDI and democracy, and between FDI and other economic variables such as inflation and PGDP, in some of the sample countries requires policymakers to attend carefully to each variable, and especially to the level of democracy, if they aim to attract a high level of FDI.

In conclusion, the findings of this study suggest the presence of a long-term relationship between democracy and FDI also supported by causality in some countries within the sample, as revealed through the analysis of data from BRICS-TM countries within emerging markets spanning the period 1994–2018. The significance of this relationship is particularly evident when considering the impact of structural breaks. It is emphasized that governments and policymakers in emerging markets (including those in BRICS-TM), which aim to bolster their economy’s resilience against various shocks, should not only consider structural breaks but also recognize the intricate connection between democracy and FDI. The study underscores that developments in democracy have the potential to influence FDI, emphasizing the importance of factoring this relationship into the formulation and execution of socio-economic policies. Lastly, using panel tests with a structural break, a method uncommonly employed in the empirical analysis of the democracy variable, may contribute as an additional dimension to the existing literature in this field.

In analyzing the relationship between democracy and foreign direct investment, the findings suggest a long-term relationship in all models except for the level change model. These results highlight the significance of democratic developments in the BRICS-TM countries influencing the inflow of foreign direct capital. Therefore, policymakers in emerging markets, particularly within BRICS-TM countries, are encouraged to prioritize democracy and foster democratic developments to attract foreign direct investments. Additionally, given the impact of global and local developments leading to structural breaks, it becomes crucial for these policymakers to closely monitor and interpret international and global events that may affect the resilience of their national economies, both negatively and positively. By doing so, emerging markets can enhance their resilience against various shocks, enabling policymakers to adeptly prepare their economies, private sectors, and stock markets for potential global risks.

Opting for direct foreign capital investments over external debt or short-term investments is a more rational approach for developing countries to accumulate capital for their overall development. As many countries seek to address the scarcity of capital, the understanding of the contributions of foreign capital to development improves, while global competition intensifies to attract foreign capital. Therefore, policymakers should focus on enhancing macroeconomic indicators such as inflation and national income and fostering democratic development, a fundamental trust factor for foreign capital. Demographic and institutional factors also affect the global or social fiscal pressure (Nuță & Nuță, 2020 ). Thus, as an institutional factor, positive developments at the level of democracy are fundamental in attracting foreign capital.

It is crucial for developing countries to prioritize and keep pace with indicators that foreign capital considers significant. Global companies prioritize countries they can trust, where, given the potential risks, investments can yield profits swiftly. The foundation of democracy in developing nations starts in the family and education realms. Proper education on the importance and necessity of democracy in the curriculum contributes to long-term awareness of democracy. Developing effective education policies within families can address intra-family democracy, fostering a culture of democracy throughout the country.

The reasons listed up to this point reiterate that attracting foreign direct investment is critically important for supporting sustainable development in all aspects of a nation. As noted in the discussion section, although sustainability may at first glance appear to relate solely to the economic dimension, an increase in foreign direct investment toward a country can also indirectly and positively affect the social and environmental dimensions of sustainability. Given that the level of democracy has a similar effect on the level of FDI, the level of democracy in a country should be expected to be strongly correlated with sustainable development.

In conclusion, new researchers interested in this subject are recommended to conduct analyses on different country groups. Updating established models and testing hypotheses using various socio-economic indicators and analysis methods can further contribute to the literature.

Data Availability

The data set is uploaded to the system as a supplementary file and is also available on Figshare with the DOI https://doi.org/10.6084/m9.figshare.21701966.

Turkey’s name changed to Türkiye: According to the United Nations (UN)-Türkiye, the country’s name has been officially changed to Türkiye at the UN upon a letter received on June 1 from the Turkish Foreign Ministry (UN-Türkiye, 2022). UN-Türkiye. (2022). Turkey’s name changed to Türkiye. URL: https://turkiye.un.org/en/184798-turkeys-name-changed-turkiye, Accessed on: 02.07.2022.

Abbreviations

Brazil, Russia, India, China, South Africa, Türkiye, Mexico

The Democracy Index variable

Ecological footprint

Gross domestic product

Logarithm of foreign direct investment

Logarithm of per capita income

Multinational corporations

Per capita income

Political institutions

Regression coefficient value

World Development Indicators

Ahmed, Z., Ahmad, Z., Rjoub, H., Kalugina, O. A., & Hussain, N. (2021). Economic growth, renewable energy consumption, and ecological footprint: Exploring the role of environmental regulations and democracy in sustainable development. Sustainable Development, 30 (4), 595–605. https://doi.org/10.1002/sd.2251

Aliefendioğlu, Y. (2005). Temsili demokrasinin ‘seçim’ ayağı (The election leg of the representative democracy). TBB Dergisi (TBB Journal), 60 (2005), 71–96.

Andrews, D. W. K. (2005). Cross-section regression with common shocks. Econometrica, 73 (5), 1551–1585. https://doi.org/10.1111/j.1468-0262.2005.00629.x

Baghestani, H., Chazi, A., & Khallaf, A. (2019). A directional analysis of oil prices and real exchange rates in BRIC countries. Research in International Business and Finance, 50 (C), 450–456. https://doi.org/10.1016/j.ribaf.2019.06.013

Banday, U. J., & Ismail, S. (2017). Does tourism development lead to a positive or negative impact on economic growth and environment in BRICS countries? A panel data analysis. Economics Bulletin, 37 (1), 553–567.

Botric, V., & Skuflic, L. (2005). Main determinants of foreign direct investment in the Southeast European countries. Transition Studies Review, 13 (2), 359–377. https://doi.org/10.1007/s11300-006-0110-3

Breitung, J. (2005). A parametric approach to the estimation of cointegrating vectors in panel data. Econometric Reviews, 24 (2), 151–173. https://doi.org/10.1081/ETC-200067895

Breusch, T. S., & Pagan, A. R. (1980). The Lagrange multiplier test and its applications to model specification in econometrics. Review of Economic Studies, 47 (1), 239–253. https://doi.org/10.2307/2297111

Busse, M. (2003). Democracy and FDI. HWWA Discussion Paper 220 . Hamburg Institute of International Economics (HWWA).

Castro, D. (2014). Foreign direct investment and democracy. The Honors Program Senior Capstone Project , Dissertation in Bryant University, May. pp.1–26. Retrieved June 10 2023, from https://digitalcommons.bryant.edu/honors_economics/18/ . Accessed 05.03.2024.

Chakrabarti, A. (2001). The determinants of foreign direct investment: Sensitivity analyses of cross-country regressions. Kyklos, 54 , 89–114. https://doi.org/10.1111/1467-6435.00142

Choi, I. (2001). Unit roots test for panel data. Journal of International Money and Finance , 20(2), 249–272. Retrieved June 10 2023, from https://doi.org/10.1016/S0261-5606(00)00048-6

Derbali, A., Trabelsi, L., & Zitouna, M. H. (2015). Democratic transition and FDI: Transition process matters. Munich Personal RePEc Archive MPRA Paper No. 77518 , posted 16 Mar 2017, 11 August, 1–38. Retrieved June 10 2023, from https://mpra.ub.uni-muenchen.de/id/eprint/77518 . Accessed 18 Jan 2024.

Dolunay, A., Kasap, F., & Keçeci, G. (2017). Freedom of mass communication in the digital age in the case of the Internet: Freedom house and the USA Example. Sustainability, 9 (10), 1739, 1–21. https://doi.org/10.3390/su9101739

Doucouliagos, H., & Ulubasoglu, M. A. (2008). Democracy and economic growth: A meta-analysis. American Journal of Political Science, 52 (1), 61–83. https://doi.org/10.1111/j.1540-5907.2007.00299.x

Erdoğan, S., Yıldırım, D. Ç., & Gedikli, A. (2019). Investigation of causality analysis between economic growth and CO2 emissions: The case of BRICS – T countries. International Journal of Energy Economics and Policy, 9 (6), 430–438. Retrieved June 10 2023, from https://doi.org/10.32479/ijeep.8546

Fernandes, G. W., de Oliveira Roque, F. O., Fernandes, S., de Viveiros Grelle, C. E., Ochoa-Quintero, J. M., Toma, T. S. P., Vilela, E. F., & Fearnside, P. M. (2023). Brazil’s democracy and sustainable agendas: A nexus in urgent need of strengthening. Perspectives in Ecology and Conservation, 21 (3), 197–199. https://doi.org/10.1016/j.pecon.2023.06.001

Freedom House. (2020). Democracy scores , Retrieved April 25 2022, from https://freedomhouse.org/report/freedom-world . Accessed 12.10. 2023.

Gazibey, Y., Keser, A., & Gökmen, Y. (2014). Türkiye’de illerin sürdürülebilirlik boyutlari açisindan değerlendirilmesi (The evaluation of the cities in Türkiye according to the dimensions of sustainability). Ankara University SBF Journal, 69 (3), 511–541.

Göçer, İ. (2013). Ar-Ge Harcamalarının Yüksek Teknolojili Ürün İhracatı, Dış Ticaret Dengesi ve Ekonomik Büyüme Üzerindeki Etkileri (Effects of RandD expenditures on high technology exports, balance of foreign trade and economic growth). Maliye Dergisi, 165 , 215–240. Retrieved June 10 2023, from https://www.researchgate.net/publication/296621402_Ar-Ge_Harcamalarinin_Yuksek_Teknolojili_Urun_Ihracati_Dis_Ticaret_Dengesi_ve_Ekonomik_Buyume_Uzerindeki_Etkileri

Granger, C. W. J., & Newbold, P. (1974). Spurious regressions in econometrics. Journal of Econometrics, 2 (2), 111–120. https://doi.org/10.1016/0304-4076(74)90034-7

Gür, B. (2020). The effect of foreign trade on innovation: The case of BRICS-T countries. Journal of Social, Humanities and Administrative Sciences, 6 (27), 819–830. Retrieved June 10 2023, from https://www.researchgate.net/publication/339847375_The_Effect_of_Foreign_Trade_on_Innovation_The_Case_of_BRICS-T_Countries . Accessed 25 Mar 2024.

Hadri, K. (2000). Testing for stationarity in heterogeneous panels. Econometrics Journal, 3 (2), 148–161. Retrieved June 10 2023, from https://doi.org/10.1111/1368-423X.00043

Haggard, S. (1990). Pathways from the periphery: The politics of growth in the newly industrializing countries . Cornell University Press.

Harms, P., & Ursprung, H. (2002). Do civil and political repression boost FDI? Economic Inquiry, 40 (4), 651–663. https://doi.org/10.1093/ei/40.4.651

Haydaroğlu, C., & Gülşah, Ç. (2016). Türkiye’de seçim sistemlerinin demokrasi ve ekonomi ilişkisi çerçevesinde incelenmesi. Uluslararası Politik Araştırmalar Dergisi, 2 (1), 51–63. https://doi.org/10.25272/j.2149-8539.2016.2.1.05

Im, K.S., Lee, J., & Tieslau, M. (2010). Panel LM unit root tests with trend shifts (March 1, 2010). FDIC Center for Financial Research Working Paper, No. 2010–1 . https://doi.org/10.2139/ssrn.1619918

Im, K. S., Pesaran, M. H., & Shin, Y. (2003). Testing for unit roots in heterogeneous panels. Journal of Econometrics, 115 (1), 53–74. https://doi.org/10.1016/S0304-4076(03)00092-7

Jadhav, P. (2012). Determinants of foreign direct investment in BRICS economies: Analysis of economic institutional and political factor. Social and Behavioral Sciences, 37 , 5–14. https://doi.org/10.1016/j.sbspro.2012.03.270

Kebede, J. G., & Takyi, P. O. (2017). Causality between institutional quality and economic growth: Evidence from sub-Saharan Africa. European Journal of Economic and Financial Research, 2 (1), 114–131. https://doi.org/10.5281/zenodo.438146

Keser, A., Kılıç, B., & Özbek, C.A. (2023). How the Demos [public] regulate the Kratos [administration] through repeated elections: Lessons learned from the elections in Türkiye for the government and opposition. İnsan Ve Toplum , 13(4), 66–93. https://doi.org/10.12658/M0704

Khan, H., Dong, Y., Bibi, R., & Khan, I. (2023). Institutional quality and foreign direct investment: Global evidence. Journal of the Knowledge Economy . https://doi.org/10.1007/s13132-023-01508-1

Kilci, E. N., & Yilanci, V. (2022). Impact of monetary aggregates on consumer behavior: A study on the policy response of the federal reserve against COVID-19. Asian Journal of Applied Economics, 29 (1), 100–122. Retrieved June 10 2023, from https://so01.tci-thaijo.org/index.php/AEJ/article/view/248476 . Accessed 21 Apr 2024.

Koç, A., & Sarica, D. (2016). Analysis on the relationship between the share of labour income and the level of union organization in selected OECD countries in the neoliberal era. Journal of Current Researches on Business and Economics, 6 (2), 29–56. Retrieved June 10 2023, from https://www.jocrebe.com/imagesbuyuk/0d2436-2-say%C4%B1%20tam%20dosyas%C4%B1.pdf . Accessed 14 Apr 2024.

Kónya, L. (2006). Exports and growth: Granger causality analysis on OECD countries with a panel approach. Economic Modelling, 23 (6), 978–992. https://doi.org/10.1016/j.econmod.2006.04.008

Lacroix, J., Meon, P. G., & Sekkat, K. (2021). Democratic transitions can attract foreign direct investment: Effect, trajectories, and the role of political risk. Journal of Comparative Economics, 49 (2), 340–357. https://doi.org/10.1016/j.jce.2020.09.003

Levin, A., Lin, C. F., & Chu, C. J. (2002). Unit root tests in panel data: Asymptotic and finite sample properties. Journal of Econometrics, 108 , 1–24. https://doi.org/10.1016/S0304-4076(01)00098-7

Li, Q., & Resnick, A. (2003). Reversal of fortunes: Democratic institutions and FDI inflows to developing countries. International Organization, 57 (1), 175–211. https://doi.org/10.1017/S0020818303571077

Maddala, G. S., & Wu, S. (1999). A comparative study of unit root tests with panel data and a new simple test. Oxford Bulletin of Economics and Statistics, 61 , 631–652. https://doi.org/10.1111/1468-0084.0610s1631

Magazzino, C. (2023). Ecological footprint, electricity consumption, and economic growth in China: Geopolitical risk and natural resources governance. Empirical Economics . https://doi.org/10.1007/s00181-023-02460-4

Magazzino, C., & Mele, M. (2022). Can a change in FDI accelerate GDP growth? Time-series and ANNs evidence on Malta. The Journal of Economic Asymmetries, 25 , e00243. https://doi.org/10.1016/j.jeca.2022.e00243

Magazzino, C., & Mele, M. (2022). A new machine learning algorithm to explore the CO2 emissions-energy use-economic growth trilemma. Annals of Operations Research . https://doi.org/10.1007/s10479-022-04787-0

Malikane, C., & Chitambara, P. (2017). FDI, democracy, and economic growth in Southern Africa. African Development Review, 29 (1), 92–102. https://doi.org/10.1111/1467-8268.12242

Martin, J. D., Abbas, D., & Martins, R. J. (2016). The validity of global press ratings. Journalism Practice, 10 (1), 93–108. https://doi.org/10.1080/17512786.2015.1010851

Mathur, A., & Singh, K. (2013). Foreign direct investment, corruption and democracy. Applied Economics, 45 (8), 991–1002. https://doi.org/10.1080/00036846.2011.613786

Menger, P.-M. (2010). Cultural policies in Europe from a state to a city-centered perspective on cultural generativity. GRIPS Discussion Paper No. 10–28, GRIPS Policy Research Center, 1–9, Tokyo, Japan. Retrieved May 10 2022, from https://www.grips.ac.jp/r-center/wpcontent/uploads/10-28.pdf

Muhammad, B., Khan, M. K., Khan, M. I., & Khan, S. (2022). Impact of foreign direct investment, natural resources, renewable energy consumption, and economic growth on environmental degradation: Evidence from BRICS, developing, developed and global countries. Environmental Science and Pollution Research, 28 , 21789–21798. https://doi.org/10.1007/s11356-021-16861-4

Nuță, A. C., & Nuță, F. M. (2020). Modelling the influences of economic, demographic, and institutional factors on fiscal pressure using OLS, PCSE, and FD-GMM approaches. Sustainability, 12 (4), 1681. https://doi.org/10.3390/su12041681

Ojekemi, O. S., Ağa, M., & Magazzino, C. (2023). Towards achieving sustainability in the BRICS economies: The role of renewable energy consumption and economic risk. Energies, 16 (14), 5287. https://doi.org/10.3390/en16145287

Olorogun, L. A. (2023). Modelling financial development in the private sector, FDI, and sustainable economic growth in sub-Saharan Africa: ARDL bound test-FMOLS, DOLS robust analysis. Journal of the Knowledge Economy . https://doi.org/10.1007/s13132-023-01224-w

Oneal, J. R. (1994). The affinity of foreign investors for authoritarian regimes. Political Research Quarterly, 47 (3), 565–588. https://doi.org/10.1177/106591299404700302

Osiewicz, P., & Skrzypek, M. (2020). Is Spain becoming a militant democracy? Empirical evidence from freedom house reports. Aportes-Revista de Historia Contemporanea, 35 (103), 7–33. Retrieved April 10 2022, from https://www.revistaaportes.com/index.php/aportes/article/view/526/296 . Accessed 28 Jan 2024.

Pesaran, M. H. (2004). General diagnostic tests for cross-section dependence in panels. IZA Discussion Paper No. 1240. Bonn, Germany. Retrieved June 10 2023, from https://docs.iza.org/dp1240.pdf. Accessed 10 Jun 2024.

Pesaran, M. H. (2006). Estimation and inference in large heterogeneous panels with a multifactor error structure. Econometrica, 74 (4), 967–1012. https://doi.org/10.1111/j.1468-0262.2006.00692.x

Pesaran, M. H., Ullah, A., & Yamagata, T. (2008). A bias-adjusted LM test of error cross-section independence. The Econometrics Journal, 11 (1), 105–127. https://doi.org/10.1111/j.1368-423X.2007.00227.x

Pesaran, M. H., & Yamagata, T. (2008). Testing slope homogeneity in large panels. Journal of Econometrics, 142 (1), 50–93. https://doi.org/10.1016/j.jeconom.2007.05.010

Phillips, P. C. B., & Sul, D. (2003). Dynamic panel estimation and homogeneity testing under cross-section dependence. The Econometrics Journal, 6 (1), 217–259. https://doi.org/10.1111/1368-423X.00108

Putra, R. F., & Putri, D. Z. (2021). The effect of corruption, democracy and foreign debt on economic growth in Asian Pacific countries. Jambura Equilibrium Journal, 3 (2), 66–71. https://doi.org/10.37479/jej.v3i2.10272

Raghutla, C., & Chittedi, K. R. (2021). Financial development, energy consumption, technology, urbanization, economic output and carbon emissions nexus in BRICS countries: An empirical analysis. Management of Environmental Quality, 32 (2), 290–307. https://doi.org/10.1108/MEQ-02-2020-0035

Rahalkar, H., Sheppard, A., Lopez Morales, C. A., Lobo, L., & Salek, S. (2021). Challenges faced by the biopharmaceutical industry in the development and marketing authorization of biosimilar medicines in BRICS TM countries: An exploratory study. Pharmaceutical Medicine, 35 , 235–251. https://doi.org/10.1007/s40290-021-00395-8

Ranjan, V., & Agraval, G. (2011). FDI inflow determinants in BRIC countries: A panel data analysis. International Business Research, 4 (4), 255–263. https://doi.org/10.5539/ibr.v4n4p255

Ratiu, D. E. (2013). Creative cities and/or sustainable cities: Discourses and practices. City, Culture and Society, 4 , 125–135. https://doi.org/10.1016/j.ccs.2013.04.002

Rodrik, D. (1996). Labor standards in international trade: Do they matter and what do we do about them? In R. Lawrence, D. Rodrik, & J. Whalley (Eds.), Emerging Agenda for Global Trade: High States for Developing Countries (pp. 35–79). Johns Hopkins University Press.

Shahbaz, M., Nuta, A. C., Mishra, P., & Ayad, H. (2023). The impact of informality and institutional quality on environmental footprint: The case of emerging economies in a comparative approach. Journal of Environmental Management, 348 , 119325. https://doi.org/10.1016/j.jenvman.2023.119325

Spar, D. (1999). Foreign investment and human rights. Challenge, 42 (1), 55–80. https://doi.org/10.1080/05775132.1999.11472078

Steiner, N. D. (2016). Comparing freedom house democracy scores to alternative indices and testing for political bias: Are US allies rated as more democratic by freedom house? Journal of Comparative Policy Analysis: Research and Practice, 18 (4), 329–349. https://doi.org/10.1080/13876988.2013.877676

Suny, R. G. (2017). The crisis of bourgeois democracy: The fate of an experiment in the age of nationalism, populism, and neo-liberalism. New Perspectives on Turkey, 57 , 115–141. https://doi.org/10.1017/npt.2017.32

Swamy, P. (1970). Efficient inference in a random coefficient regression model. Econometrica, 38 (2), 311–323. https://doi.org/10.2307/1913012

Tavares, J., & Wacziarg, R. (2001). How democracy affects growth. European Economic Review, 45 (2001), 1341–1373. https://doi.org/10.1016/S0014-2921(00)00093-3

Toguç, N., Kuşkaya, S., Magazzino, C., & Bilgili, F. (2023). The impact of natural disaster shocks on business confidence level and Istanbul stock exchange: A wavelet coherence approach. Geological Journal, 58 (12), 4610–4624. https://doi.org/10.1002/gj.4868

Vijayakumar, N., Sridharan, P., & Rao, K. C. S. (2010). Determinants of FDI in BRICS countries: A panel analysis. International Journal of Business Science & Applied Management (IJBSAM), 5 (3), 1–13.

Voicu, M., & Peral, E. B. (2014). Support for democracy and early socialization in a non-democratic country: Does the regime matter? Democratization, 21 (3), 554–573. https://doi.org/10.1080/13510347.2012.751974

Westerlund, J., & Edgerton, D. L. (2008). A simple test for cointegration in dependent panels with structural breaks. Oxford Bulletin of Economics and Statistics, 70 , 665–704. https://doi.org/10.1111/j.1468-0084.2008.00513.x

World Bank. (2020). World Development Indicators , Retrieved April 25 2020, from https://databank.worldbank.org/indicator/NE.EXP.GNFS.ZS/1ff4a498/Popular-Indicators# . Accessed 11 Nov 2023.

Xu, J., Wang, J., Yang, X., Jin, Z., & Liu, Y. (2024). Digital economy and sustainable development: Insight from synergistic pollution control and carbon reduction. Journal of the Knowledge Economy . https://doi.org/10.1007/s13132-024-01950-9

Yang, M., Magazzino, C., Abraham, A. A., & Abdulloev, N. (2024). Determinants of load capacity factor in BRICS countries: A panel data analysis. Natural Resources Forum, 48 (2), 525–548. https://doi.org/10.1111/1477-8947.12331

Yusuf, H. A., Shittu, W. O., Akanbi, S. B., Umar, H. M. B., & Abdulrahman, I. A. (2020). The role of foreign direct investment, financial development, democracy, and political (in) stability on economic growth in West Africa. International Trade, Politics and Development, 4 (1), 27–46. https://doi.org/10.1108/ITPD-01-2020-0002

Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. Journal of the American Statistical Association, 57 (298), 348–368. https://doi.org/10.1080/01621459.1962.10480664

Acknowledgements

We appreciate all the efforts and time spent by the editorial office members and anonymous reviewers for all their comments, which contribute to the quality of the article.

Open access funding provided by the Scientific and Technological Research Council of Türkiye (TÜBİTAK). No funds were received from any institution.

Author information

Authors and affiliations.

Department of Economics, Hasan Kalyoncu University, Şahinbey, Gaziantep, Turkey

Ibrahim Cutcu

Department of Political Science and International Relations, Hasan Kalyoncu University, Şahinbey, Gaziantep, Turkey

Ahmet Keser

Corresponding author

Correspondence to Ahmet Keser.

Ethics declarations

Ethics approval.

The research was conducted within all ethical standards.

Conflict of Interest

The authors declare no competing interests.

Permission to reproduce material from other sources

Not Applicable.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Practice Points/Highlights

1. From 1994 to 2018, there was significant cointegration between democracy and foreign direct investment (FDI) in BRICS-TM countries among the emerging markets.

2. Democratic developments and FDI move together in the long run and have a balanced relationship between them in Emerging Market Economies.

3. Policymakers in BRICS-TM countries need to develop democracy awareness and ensure democratic developments to attract foreign direct investment to secure a resilient economy in these emerging economies.

4. Governments and decision-makers in emerging economies, such as BRICS-TM, who want to attract FDI need to consider the structural breaks and the relationship between democracy and FDI.

Supplementary Information

Supplementary material 1.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cutcu, I., Keser, A. Democracy and Foreign Direct Investment in BRICS-TM Countries for Sustainable Development. J Knowl Econ (2024). https://doi.org/10.1007/s13132-024-02205-3

Received : 11 October 2023

Accepted : 14 June 2024

Published : 05 September 2024

DOI : https://doi.org/10.1007/s13132-024-02205-3

The Beginner's Guide to Statistical Analysis | 5 Steps & Examples

Statistical analysis means investigating trends, patterns, and relationships using quantitative data. It is an important research tool used by scientists, governments, businesses, and other organizations.

To draw valid conclusions, statistical analysis requires careful planning from the very start of the research process. You need to specify your hypotheses and make decisions about your research design, sample size, and sampling procedure.

After collecting data from your sample, you can organize and summarize the data using descriptive statistics. Then, you can use inferential statistics to formally test hypotheses and make estimates about the population. Finally, you can interpret and generalize your findings.

This article is a practical introduction to statistical analysis for students and researchers. We’ll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables.

Table of contents

Step 1: Write your hypotheses and plan your research design
Step 2: Collect data from a sample
Step 3: Summarize your data with descriptive statistics
Step 4: Test hypotheses or make estimates with inferential statistics
Step 5: Interpret your results
Other interesting articles

To collect valid data for statistical analysis, you first need to specify your hypotheses and plan out your research design.

Writing statistical hypotheses

The goal of research is often to investigate a relationship between variables within a population. You start with a prediction, and use statistical analysis to test that prediction.

A statistical hypothesis is a formal way of writing a prediction about a population. Every research prediction is rephrased into null and alternative hypotheses that can be tested using sample data.

While the null hypothesis always predicts no effect or no relationship between variables, the alternative hypothesis states your research prediction of an effect or relationship.
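As a rough illustration of how such a pair of hypotheses is often written symbolically, the block below sketches two cases. Both studies and all symbols are assumptions introduced here for illustration: a hypothetical before/after comparison of mean scores, and a hypothetical test of a population correlation. The alternatives are written as two-sided; a directional research prediction would use an inequality instead.

```latex
% Hypothetical experiment: mean score before vs. after an intervention
H_0:\ \mu_{\text{after}} - \mu_{\text{before}} = 0
\qquad
H_a:\ \mu_{\text{after}} - \mu_{\text{before}} \neq 0

% Hypothetical correlational study: population correlation \rho
H_0:\ \rho = 0
\qquad
H_a:\ \rho \neq 0
```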

Planning your research design

A research design is your overall strategy for data collection and analysis. It determines the statistical tests you can use to test your hypothesis later on.

First, decide whether your research will use a descriptive, correlational, or experimental design. Experiments directly influence variables, whereas descriptive and correlational studies only measure variables.

Your research design also concerns whether you’ll compare participants at the group level or individual level, or both.

First, you’ll take baseline test scores from participants. Then, your participants will undergo a 5-minute meditation exercise. Finally, you’ll record participants’ scores from a second math test.

In this experiment, the independent variable is the 5-minute meditation exercise, and the dependent variable is the math test score from before and after the intervention.

Example: Correlational research design
In a correlational study, you test whether there is a relationship between parental income and GPA in graduating college students. To collect your data, you will ask participants to fill in a survey and self-report their parents’ incomes and their own GPA.
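To make the difference between the two designs concrete, here is a minimal Python sketch of how the data from each might be handled. All numbers are invented for illustration, and the helper names (`paired_differences`, `pearson_r`) are hypothetical. The experimental data are treated as repeated measures on the same participants, and the correlational data as one pair of values per participant.

```python
from math import sqrt
from statistics import mean

# Hypothetical data, invented purely for illustration.
baseline_scores = [62, 70, 55, 81, 74]       # math test before meditation
final_scores = [66, 73, 59, 80, 79]          # same participants, after
parental_income = [40, 55, 70, 85, 100]      # in thousands, self-reported
gpa = [2.8, 3.0, 3.1, 3.4, 3.6]              # self-reported


def paired_differences(before, after):
    """Return each participant's change score (after minus before)."""
    if len(before) != len(after):
        raise ValueError("each participant needs both a before and an after score")
    return [a - b for b, a in zip(before, after)]


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Experimental design: compare each participant with themselves.
changes = paired_differences(baseline_scores, final_scores)
print("mean change in score:", mean(changes))

# Correlational design: measure both variables, manipulate neither.
print("income-GPA correlation:", round(pearson_r(parental_income, gpa), 3))
```

The experimental sketch works with change scores because the same people are measured twice, while the correlational sketch only describes how two measured variables move together.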

Measuring variables

When planning a research design, you should operationalize your variables and decide exactly how you will measure them.

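For statistical analysis, it’s important to consider the level of measurement of your variables, which tells you what kind of data they contain: categorical data represent groupings, on a nominal or ordinal scale, while quantitative data represent amounts, on an interval or ratio scale.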

Many variables can be measured at different levels of precision. For example, age data can be quantitative (8 years old) or categorical (young). If a variable is coded numerically (e.g., level of agreement from 1–5), it doesn’t automatically mean that it’s quantitative instead of categorical.

Identifying the measurement level is important for choosing appropriate statistics and hypothesis tests. For example, you can calculate a mean score with quantitative data, but not with categorical data.
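The point about matching the statistic to the measurement level can be sketched in a few lines of Python. The data and the `summarize` helper are hypothetical, made up for illustration. The mean is used for a quantitative variable, and the mode (the most frequent category) for a categorical one.

```python
from statistics import mean, mode

# Hypothetical participant data, invented for illustration.
ages = [19, 21, 22, 20, 23]                                      # quantitative (ratio)
genders = ["female", "male", "female", "nonbinary", "female"]    # categorical (nominal)


def summarize(values, level):
    """Return a measure of central tendency suited to the measurement level."""
    if level == "quantitative":
        return mean(values)      # arithmetic only makes sense for amounts
    if level == "categorical":
        return mode(values)      # most frequent category; no arithmetic involved
    raise ValueError(f"unknown measurement level: {level!r}")


print("mean age:", summarize(ages, "quantitative"))
print("most common gender:", summarize(genders, "categorical"))
```

Asking for the mean of `genders` would fail, because category labels cannot be added or divided. The same restriction applies when categories are coded as numbers.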

In a research study, along with measures of your variables of interest, you’ll often collect data on relevant participant characteristics.

Variable | Type of data
Age | Quantitative (ratio)
Gender | Categorical (nominal)
Race or ethnicity | Categorical (nominal)
Baseline test scores | Quantitative (interval)
Final test scores | Quantitative (interval)
Parental income | Quantitative (ratio)
GPA | Quantitative (interval)

Population vs sample

In most cases, it’s too difficult or expensive to collect data from every member of the population you’re interested in studying. Instead, you’ll collect data from a sample.

Statistical analysis allows you to apply your findings beyond your own sample as long as you use appropriate sampling procedures. You should aim for a sample that is representative of the population.

Sampling for statistical analysis

There are two main approaches to selecting a sample: probability sampling and non-probability sampling.

In theory, for highly generalizable findings, you should use a probability sampling method. Random selection reduces several types of research bias, like sampling bias, and helps ensure that data from your sample are typical of the population. Parametric tests can be used to make strong statistical inferences when data are collected using probability sampling.

But in practice, it’s rarely possible to gather the ideal sample. While non-probability samples are more likely to be at risk for biases like self-selection bias, they are much easier to recruit and collect data from. Non-parametric tests are more appropriate for non-probability samples, but they result in weaker inferences about the population.

If you want to use parametric tests for non-probability samples, you have to make the case that:

Keep in mind that external validity means that you can only generalize your conclusions to others who share the characteristics of your sample. For instance, results from Western, Educated, Industrialized, Rich and Democratic samples (e.g., college students in the US) aren’t automatically applicable to all non-WEIRD populations.

If you apply parametric tests to data from non-probability samples, be sure to elaborate on the limitations of how far your results can be generalized in your discussion section.

Create an appropriate sampling procedure

Based on the resources available for your research, decide on how you’ll recruit participants.

Your participants are self-selected by their schools. Although you’re using a non-probability sample, you aim for a diverse and representative sample.

Example: Sampling (correlational study)

Your main population of interest is male college students in the US. Using social media advertising, you recruit senior-year male college students from a smaller subpopulation: seven universities in the Boston area.

Calculate sufficient sample size

Before recruiting participants, decide on your sample size either by looking at other studies in your field or using statistics. A sample that’s too small may be unrepresentative of the population, while a sample that’s too large will be more costly than necessary.

There are many sample size calculators online. Different formulas are used depending on whether you have subgroups or how rigorous your study should be (e.g., in clinical research). As a rule of thumb, you need at least 30 units per subgroup.
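As a rough illustration of what such a calculator computes, the sketch below uses one common normal-approximation formula for comparing the means of two groups. The function name, the default significance level of 0.05, and the default power of 0.80 are assumptions made for this example; real calculators may use different formulas and will usually give slightly larger numbers when they rely on the t distribution.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    effect_size is the expected standardized mean difference (Cohen's d).
    Uses the normal approximation n = 2 * ((z_alpha + z_power) / d) ** 2.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(sample_size_per_group(0.5))  # medium expected effect: 63 per group
print(sample_size_per_group(0.8))  # large expected effect: 25 per group
```

The smaller the effect you expect to detect, the larger the sample you need, which is why the expected effect size is one of the inputs you have to think about before collecting data.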

To use these calculators, you have to understand and input these key components:

Once you’ve collected all of your data, you can inspect them and calculate descriptive statistics that summarize them.

Inspect your data

There are various ways to inspect your data, including the following:

By visualizing your data in tables and graphs, you can assess whether your data follow a skewed or normal distribution and whether there are any outliers or missing data.

A normal distribution means that your data are symmetrically distributed around a center where most values lie, with the values tapering off at the tail ends.

[Figure: Mean, median, mode, and standard deviation in a normal distribution]

In contrast, a skewed distribution is asymmetric and has more values on one end than the other. The shape of the distribution is important to keep in mind because only some descriptive statistics should be used with skewed distributions.

Extreme outliers can also produce misleading statistics, so you may need a systematic approach to dealing with these values.
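One systematic approach, shown here only as an example, is the 1.5 × IQR rule: values that lie more than 1.5 interquartile ranges below the first quartile or above the third quartile are flagged for review. The function name, the multiplier default, and the scores are hypothetical, and other rules (for instance, z score cutoffs) are equally defensible.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical test scores with one suspicious entry.
scores = [62, 65, 67, 68, 70, 71, 73, 74, 76, 120]
print(flag_outliers(scores))  # [120]
```

Flagging a value is not the same as deleting it. Whether an outlier is removed, corrected, or kept should be decided and reported explicitly.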

Calculate measures of central tendency

Measures of central tendency describe where most of the values in a data set lie. Three main measures of central tendency are often reported: the mode, the median, and the mean.

However, depending on the shape of the distribution and level of measurement, only one or two of these measures may be appropriate. For example, many demographic characteristics can only be described using the mode or proportions, while a variable like reaction time may not have a mode at all.
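The short sketch below, using Python's standard statistics module and made-up numbers, shows how the three measures can disagree when a distribution is skewed, and what "no mode" looks like for a continuous variable such as reaction time.

```python
from statistics import mean, median, multimode

# Hypothetical, right-skewed scores: one very high value pulls the mean up.
scores = [55, 60, 60, 65, 70, 75, 140]

print(mean(scores))       # 75
print(median(scores))     # 65
print(multimode(scores))  # [60]

# Hypothetical reaction times in seconds: every value occurs once,
# so multimode returns all of them and there is no single mode to report.
reaction_times = [0.41, 0.38, 0.52]
print(multimode(reaction_times))  # [0.41, 0.38, 0.52]
```

Here the median describes the typical score better than the mean, because the mean is sensitive to the extreme value.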

Calculate measures of variability

Measures of variability tell you how spread out the values in a data set are. Four main measures of variability are often reported: the range, the interquartile range, the standard deviation, and the variance.

Once again, the shape of the distribution and level of measurement should guide your choice of variability statistics. The interquartile range is the best measure for skewed distributions, while standard deviation and variance provide the best information for normal distributions.
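As an illustration with made-up scores, the following sketch computes the range, interquartile range, sample standard deviation, and sample variance using Python's standard library. The quartile method ("inclusive") is one of several conventions, so other software may report slightly different quartiles.

```python
from statistics import quantiles, stdev, variance

# Hypothetical, right-skewed scores (illustrative values only).
scores = [55, 60, 60, 65, 70, 75, 140]

value_range = max(scores) - min(scores)

q1, _, q3 = quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

sample_variance = variance(scores)  # divides by n - 1
sample_sd = stdev(scores)           # square root of the sample variance

print(value_range)                # 85
print(iqr)                        # 12.5
print(round(sample_variance, 2))  # 866.67
print(round(sample_sd, 2))        # 29.44
```

The single extreme score inflates the range and standard deviation, while the interquartile range stays small, which is why the interquartile range is preferred for skewed data.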

Using your table, you should check whether the units of the descriptive statistics are comparable for pretest and posttest scores. For example, are the variance levels similar across the groups? Are there any extreme values? If there are, you may need to identify and remove extreme outliers in your data set or transform your data before performing a statistical test.

                     Pretest scores   Posttest scores
Mean                 68.44            75.25
Standard deviation   9.43             9.88
Variance             88.96            97.96
Range                36.25            45.12
N                    30

From this table, we can see that the mean score increased after the meditation exercise, and the variances of the two scores are comparable. Next, we can perform a statistical test to find out if this improvement in test scores is statistically significant in the population.

Example: Descriptive statistics (correlational study)

After collecting data from 653 students, you tabulate descriptive statistics for annual parental income and GPA.

It’s important to check whether you have a broad range of data points. If you don’t, your data may be skewed towards some groups more than others (e.g., high academic achievers), and only limited inferences can be made about a relationship.

                     Parental income (USD)   GPA
Mean                 62,100                  3.12
Standard deviation   15,000                  0.45
Variance             225,000,000             0.16
Range                8,000–378,000           2.64–4.00
N                    653

A number that describes a sample is called a statistic, while a number describing a population is called a parameter. Using inferential statistics, you can make conclusions about population parameters based on sample statistics.

Researchers often use two main methods (simultaneously) to make inferences in statistics: estimation and hypothesis testing.

You can make two types of estimates of population parameters from sample statistics: point estimates and interval estimates.

If your aim is to infer and report population characteristics from sample data, it’s best to use both point and interval estimates in your paper.

You can consider a sample statistic a point estimate for the population parameter when you have a representative sample (e.g., in a wide public opinion poll, the proportion of a sample that supports the current government is taken as the population proportion of government supporters).

There’s always error involved in estimation, so you should also provide a confidence interval as an interval estimate to show the variability around a point estimate.

A confidence interval uses the standard error and the z score from the standard normal distribution to convey where you’d generally expect to find the population parameter most of the time.
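A minimal sketch of that calculation follows. It plugs in the pretest mean (68.44) and standard deviation (9.43) from the example table and takes n = 30 as an assumed sample size. The function name is hypothetical, and with a sample this small many analysts would use a t critical value instead of z, which gives a slightly wider interval.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Interval estimate: mean ± z * standard error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    standard_error = sample_sd / sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


low, high = z_confidence_interval(68.44, 9.43, 30)
print(f"95% CI: [{low:.2f}, {high:.2f}]")  # 95% CI: [65.07, 71.81]
```

Reporting the point estimate (68.44) together with this interval tells the reader both your best guess and how precise it is.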

Hypothesis testing

Using data from a sample, you can test hypotheses about relationships between variables in the population. Hypothesis testing starts with the assumption that the null hypothesis is true in the population, and you use statistical tests to assess whether the null hypothesis can be rejected or not.

Statistical tests determine where your sample data would lie on an expected distribution of sample data if the null hypothesis were true. These tests give two main outputs: a test statistic and a p value.

Statistical tests come in three main varieties: comparison tests, regression tests, and correlation tests.

Your choice of statistical test depends on your research questions, research design, sampling method, and data characteristics.

Parametric tests

Parametric tests make powerful inferences about the population based on sample data. But to use them, some assumptions must be met, and only some types of variables can be used. If your data violate these assumptions, you can perform appropriate data transformations or use alternative non-parametric tests instead.

A regression models the extent to which changes in a predictor variable result in changes in the outcome variable(s).

Comparison tests usually compare the means of groups. These may be the means of different groups within a sample (e.g., a treatment and control group), the means of one sample group taken at different times (e.g., pretest and posttest scores), or a sample mean and a population mean.
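For the pretest and posttest case, the comparison is made on the differences within each participant. The sketch below computes the test statistic for a dependent-samples t test from made-up scores. The function name and data are hypothetical, and the p value is left out because looking it up requires the t distribution, which Python's standard library does not provide.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for a dependent-samples t test."""
    if len(before) != len(after):
        raise ValueError("Each participant needs a before and an after score.")
    differences = [a - b for a, b in zip(after, before)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1


# Hypothetical scores for five participants (illustrative values only).
pretest = [60, 72, 65, 80, 70]
posttest = [66, 75, 70, 88, 73]

t, df = paired_t_statistic(pretest, posttest)
print(f"t({df}) = {t:.2f}")  # t(4) = 5.27
```

The resulting t value is then compared against the t distribution with the stated degrees of freedom, either in a table or in statistical software, to obtain the p value.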

The z and t tests have subtypes based on the number and types of samples and the hypotheses:

The only parametric correlation test is Pearson’s r. The correlation coefficient (r) tells you the strength of a linear relationship between two quantitative variables.

However, to test whether the correlation in the sample is strong enough to be important in the population, you also need to perform a significance test of the correlation coefficient, usually a t test, to obtain a p value. This test uses your sample size to calculate how much the correlation coefficient differs from zero in the population.
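The usual form of this test converts r into a t statistic with n − 2 degrees of freedom, t = r√(n − 2) / √(1 − r²). The sketch below applies it to made-up income and GPA values. The helper names and data are hypothetical, and the p value would again come from a t table or statistical software.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_t_statistic(r, n):
    """t statistic (df = n - 2) for testing whether r differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Hypothetical data: parental income in thousands of USD and GPA.
income = [30, 45, 60, 75, 90]
gpa = [2.8, 3.0, 3.1, 3.5, 3.6]

r = pearson_r(income, gpa)
t = correlation_t_statistic(r, len(income))
print(f"r = {r:.3f}, t({len(income) - 2}) = {t:.2f}")
```

Because n appears in the formula, the same r yields a larger t statistic in a larger sample, which is how sample size enters the significance test.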

You use a dependent-samples, one-tailed t test to assess whether the meditation exercise significantly improved math test scores. The test gives you:

Although Pearson’s r is a test statistic, it doesn’t tell you anything about how significant the correlation is in the population. You also need to test whether this sample correlation coefficient is large enough to demonstrate a correlation in the population.

A t test can also determine how significantly a correlation coefficient differs from zero based on sample size. Since you expect a positive correlation between parental income and GPA, you use a one-sample, one-tailed t test. The t test gives you:

The final step of statistical analysis is interpreting your results.

Statistical significance

In hypothesis testing, statistical significance is the main criterion for forming conclusions. You compare your p value to a set significance level (usually 0.05) to decide whether your results are statistically significant or non-significant.

Statistically significant results are considered unlikely to have arisen solely due to chance. There is only a very low chance of such a result occurring if the null hypothesis is true in the population.

This means that you believe the meditation intervention, rather than random factors, directly caused the increase in test scores.

Example: Interpret your results (correlational study)

You compare your p value of 0.001 to your significance threshold of 0.05. With a p value under this threshold, you can reject the null hypothesis. This indicates a statistically significant correlation between parental income and GPA in male college students.

Note that correlation doesn’t always mean causation, because there are often many underlying factors contributing to a complex variable like GPA. Even if one variable is related to another, this may be because of a third variable influencing both of them, or indirect links between the two variables.

Effect size

A statistically significant result doesn’t necessarily mean that there are important real life applications or clinical outcomes for a finding.

In contrast, the effect size indicates the practical significance of your results. It’s important to report effect sizes along with your inferential statistics for a complete picture of your results. You should also report interval estimates of effect sizes if you’re writing an APA style paper.
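One widely used effect size for a difference between two means is Cohen's d, the mean difference divided by a pooled standard deviation. The sketch below uses the pooled-variance version for two independent groups with made-up scores. The function name and data are hypothetical, and paired designs sometimes use other variants of d, so treat this as one option among several.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_b minus group_a), pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / sqrt(pooled_variance)


# Hypothetical scores (illustrative values only).
control = [60, 65, 70, 75, 80]
treatment = [66, 71, 76, 81, 86]

d = cohens_d(control, treatment)
print(f"Cohen's d = {d:.2f}")  # Cohen's d = 0.76
```

Unlike a p value, d does not grow just because the sample gets larger, which is what makes it useful for judging practical significance.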

With a Cohen’s d of 0.72, there’s medium to high practical significance to your finding that the meditation exercise improved test scores.

Example: Effect size (correlational study)

To determine the effect size of the correlation coefficient, you compare your Pearson’s r value to Cohen’s effect size criteria.

Decision errors

Type I and Type II errors are mistakes made in research conclusions. A Type I error means rejecting the null hypothesis when it’s actually true, while a Type II error means failing to reject the null hypothesis when it’s false.

You can aim to minimize the risk of these errors by selecting an optimal significance level and ensuring high power. However, there’s a trade-off between the two errors, so a fine balance is necessary.
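To make the trade-off concrete, the sketch below computes the power of a one-tailed z test for several significance levels while holding the effect size and sample size fixed. The function name, the effect size of 0.5, and the sample size of 30 are assumptions chosen for illustration, and the z test is a simplification of the t tests discussed earlier.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_power(alpha, effect_size, n):
    """Power of a one-tailed, one-sample z test for a standardized effect."""
    standard_normal = NormalDist()
    critical_z = standard_normal.inv_cdf(1 - alpha)
    return 1 - standard_normal.cdf(critical_z - effect_size * sqrt(n))


# A stricter alpha lowers the Type I risk but raises the Type II risk.
for alpha in (0.10, 0.05, 0.01):
    power = one_tailed_z_power(alpha, effect_size=0.5, n=30)
    print(f"alpha={alpha:.2f}  power={power:.3f}  Type II risk={1 - power:.3f}")
```

Lowering the significance level from 0.05 to 0.01 reduces power in this setup, so the only way to reduce both risks at once is to increase the sample size or study a larger effect.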

Frequentist versus Bayesian statistics

Traditionally, frequentist statistics emphasizes null hypothesis significance testing and always starts with the assumption of a true null hypothesis.

However, Bayesian statistics has grown in popularity as an alternative approach in the last few decades. In this approach, you use previous research to continually update your hypotheses based on your expectations and observations.

The Bayes factor compares the relative strength of evidence for the null versus the alternative hypothesis rather than making a conclusion about rejecting the null hypothesis or not.

COMMENTS

  1. Writing with Descriptive Statistics

    Usually there is no good way to write a statistic. It rarely sounds good, and often interrupts the structure or flow of your writing. Oftentimes the best way to write descriptive statistics is to be direct. If you are citing several statistics about the same topic, it may be best to include them all in the same paragraph or section.

  2. Reporting Research Results in APA Style

    Present descriptive statistics for each primary, secondary, and subgroup analysis. Don't provide formulas or citations for commonly used statistics (e.g., standard deviation) - but do provide them for new or rare equations. ... In the methods section of an APA research paper, you report in detail the participants, measures, and procedure of ...

  3. Descriptive Statistics

    Descriptive Statistics. The mean, the mode, the median, the range, and the standard deviation are all examples of descriptive statistics. Descriptive statistics are used because in most cases, it isn't possible to present all of your data in any form that your reader will be able to quickly interpret.

  4. Descriptive Statistics for Summarising Data

    Using the data from these three rows, we can draw the following descriptive picture. Mentabil scores spanned a range of 50 (from a minimum score of 85 to a maximum score of 135). Speed scores had a range of 16.05 s (from 1.05 s - the fastest quality decision to 17.10 - the slowest quality decision).

  5. PDF Reporting Results of Common Statistical Tests in APA Format

    In reporting the results of statistical tests, report the descriptive statistics, such as means and standard deviations, as well as the test statistic, degrees of freedom, obtained value of the test, and the probability of the result occurring by chance (p value). Test statistics and p values should be rounded to two decimal places.

  6. Descriptive Statistics

    There are 3 main types of descriptive statistics: The distribution concerns the frequency of each value. The central tendency concerns the averages of the values. The variability or dispersion concerns how spread out the values are. You can apply these to assess only one variable at a time, in univariate analysis, or to compare two or more, in ...

  7. Reporting Statistics in APA Style

    Report descriptive statistics to summarize your data. Quantitative data is often reported using means and standard deviations, while categorical data (e.g., demographic variables) is reported using proportions. ... In the methods section of an APA research paper, you report in detail the participants, measures, and procedure of your study.

  8. What Is Descriptive Statistics: Full Explainer With Examples

    Descriptive statistics, although relatively simple, are a critically important part of any quantitative data analysis. Measures of central tendency include the mean (average), median and mode. Skewness indicates whether a dataset leans to one side or another. Measures of dispersion include the range, variance and standard deviation.

  9. Descriptive Statistics

    Descriptive Statistics | Definitions, Types, Examples. Published on 4 November 2022 by Pritha Bhandari. Revised on 9 January 2023. Descriptive statistics summarise and organise characteristics of a data set. A data set is a collection of responses or observations from a sample or entire population. In quantitative research, after collecting data, the first step of statistical analysis is to ...

  10. Expressing Your Results

    In this section, we focus on presenting descriptive statistical results in writing, in graphs, and in tables—following American Psychological Association (APA) guidelines for written research reports. These principles can be adapted easily to other presentation formats such as posters and slide show presentations.

  11. (PDF) Introduction to Descriptive statistics

    Descriptive statistics are used to examine methods of collecting, tidying up, and presenting research data (Alabi & Bukola, 2023). In addition to descriptive, there is also an evaluative analysis ...

  12. Descriptive Statistics: Reporting the Answers to the 5 Basic Questions

    Descriptive statistics are specific methods basically used to calculate, describe, and summarize collected research data in a logical, meaningful, and efficient way. Descriptive statistics are reported numerically in the manuscript text and/or in its tables, or graphically in its figures. This basic …

  13. Describing the participants in a study

    This paper reviews the use of descriptive statistics to describe the participants included in a study. It discusses the practicalities of incorporating statistics in papers for publication in Age and Aging, concisely and in ways that are easy for readers to understand and interpret. older people, descriptive statistics, study participants ...

  14. Descriptive statistics: organizing, summarizing, describing, and

    In this paper, we present essential methods of Descriptive Statistics for biomedical science students and professionals. We explore data summary techniques such as the mean, median, and mode ...

  15. Research Results Section

    Research Results. Research results refer to the findings and conclusions derived from a systematic investigation or study conducted to answer a specific question or hypothesis. These results are typically presented in a written report or paper and can include various forms of data such as numerical data, qualitative data, statistics, charts, graphs, and visual aids.

  16. APA Results Section

    The APA results section summarizes data and includes reporting statistics in a quantitative research study. The APA results section is an essential part of your research paper and typically begins with a brief overview of the data followed by a systematic and detailed reporting of each hypothesis tested. The interpreted results will then be presented in the discussion sections.

  17. PDF Presenting Descriptive Statistics

    5.3 Writing about descriptive statistics The amount of a research report that is devoted to descriptive statistics varies depending on the research project and the type of publication. Some research projects present only descriptive statistics. This may be the case in mixed-methods studies or exploratory studies. However,

  18. How to Write a Results Section

    Checklist: Research results 0 / 7. I have completed my data collection and analyzed the results. I have included all results that are relevant to my research questions. I have concisely and objectively reported each result, including relevant descriptive statistics and inferential statistics. I have stated whether each hypothesis was supported ...

  19. Descriptive Statistics

    Abstract. Descriptive statistics provide an essential foundation for understanding and summarizing large datasets by offering valuable insights into the central tendencies, dispersion, and shape of the distribution. By leveraging measures such as mean, median, mode, range, variance, and standard deviation, researchers can succinctly present the ...

  20. Data Analysis of Students Marks with Descriptive Statistics

    Descriptive statistics is a powerful beast of burden: (1) It collects and summarizes vast amounts of data and information in a manageable and organized manner, (2) A fairly straightforward ...

  21. A geographical analysis of social enterprises: the case of Ireland

    Section 3 outlines the research framework and the hypotheses of this study. Section 4 explains the methodology used in the research. Section 5 presents the findings of this study, with a subsection presenting descriptive statistics and another presenting the analysis of the hypotheses' tested. Section 6 discusses the findings and Section 7 ...

  22. How to Write an APA Methods Section

    Research papers in the social and natural sciences often follow APA style. This article focuses on reporting quantitative research methods. In your APA methods section, you should report enough information to understand and replicate your study, including detailed information on the sample, measures, and procedures used.

  23. Where is the research on sport-related concussion in Olympic athletes

    Objectives This cohort study reported descriptive statistics in athletes engaged in Summer and Winter Olympic sports who sustained a sport-related concussion (SRC) and assessed the impact of access to multidisciplinary care and injury modifiers on recovery. Methods 133 athletes formed two subgroups treated in a Canadian sport institute medical clinic: earlier (≤7 days) and late (≥8 days ...

  24. Democracy and Foreign Direct Investment in BRICS-TM ...

    Graphical representation of variables and analysis of descriptive statistics, (2) CD lm1 (Breusch & Pagan, 1980), CD lm1, and LM adj tests (Pesaran et al., 2008) were used in the analysis to find the presence of cross-section dependence of variables. (3) Panel LM test (Im, Lee, & Tieslau, 2010) determined whether variables in the model have a ...

  25. Inferential Statistics

    Example: Inferential statistics. You randomly select a sample of 11th graders in your state and collect data on their SAT scores and other characteristics. You can use inferential statistics to make estimates and test hypotheses about the whole population of 11th graders in the state based on your sample data.

  26. The Beginner's Guide to Statistical Analysis

    Table of contents. Step 1: Write your hypotheses and plan your research design. Step 2: Collect data from a sample. Step 3: Summarize your data with descriptive statistics. Step 4: Test hypotheses or make estimates with inferential statistics.