Research Method

Correlation Analysis – Types, Methods and Examples

Correlation Analysis

Correlation analysis is a statistical method used to evaluate the strength and direction of the relationship between two or more variables. The result is summarized by a correlation coefficient, which ranges from -1 to 1.

  • A correlation coefficient of 1 indicates a perfect positive correlation. This means that as one variable increases, the other variable also increases.
  • A correlation coefficient of -1 indicates a perfect negative correlation. This means that as one variable increases, the other variable decreases.
  • A correlation coefficient of 0 means that there’s no linear relationship between the two variables, although a nonlinear relationship may still exist.

Correlation Analysis Methodology

Conducting a correlation analysis involves a series of steps, as described below:

  • Define the Problem: Identify the variables that you think might be related. For Pearson’s correlation, the variables must be measurable on an interval or ratio scale. For example, if you’re interested in studying the relationship between the amount of time spent studying and exam scores, these would be your two variables.
  • Data Collection: Collect data on the variables of interest. The data could be collected through various means such as surveys, observations, or experiments. It’s crucial to ensure that the data collected is accurate and reliable.
  • Data Inspection: Check the data for any errors or anomalies such as outliers or missing values. Outliers can greatly affect the correlation coefficient, so it’s crucial to handle them appropriately.
  • Choose the Appropriate Correlation Method: Select the correlation method that’s most appropriate for your data. If your data is interval or ratio level, the relationship is roughly linear, and the variables are approximately normally distributed (which matters mainly for significance testing), use Pearson’s correlation. If your data is ordinal, or the relationship is monotonic but not linear, consider using Spearman’s rank correlation or Kendall’s Tau.
  • Compute the Correlation Coefficient: Once you’ve selected the appropriate method, compute the correlation coefficient. This can be done using statistical software such as R, Python, or SPSS, or manually using the formulas.
  • Interpret the Results: Interpret the correlation coefficient you obtained. If the correlation is close to 1 or -1, the variables are strongly correlated. If the correlation is close to 0, the variables have little to no linear relationship. Also consider the sign of the correlation coefficient: a positive sign indicates a positive relationship (as one variable increases, so does the other), while a negative sign indicates a negative relationship (as one variable increases, the other decreases).
  • Check the Significance: It’s also important to test the statistical significance of the correlation. This typically involves a t-test of the null hypothesis that the true correlation is zero. A small p-value (commonly less than 0.05) indicates that a correlation at least this strong would be unlikely to arise by chance alone if the variables were unrelated.
  • Report the Results: The final step is to report your findings. This should include the correlation coefficient, the sample size, the significance level, and a discussion of what these findings mean in the context of your research question.
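The significance step above can be sketched as follows. This assumes the usual test for Pearson's r, in which t = r * sqrt((n - 2) / (1 - r²)) is compared against a t distribution with n - 2 degrees of freedom. The function name and the input values are hypothetical, and the sketch stops at the test statistic: turning it into a p-value needs a t-distribution table or a statistics package.

```python
import math

def correlation_t_statistic(r: float, n: int) -> tuple[float, int]:
    """Return (t, degrees of freedom) for testing H0: true correlation = 0."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    df = n - 2
    t = r * math.sqrt(df / (1 - r * r))
    return t, df

# Hypothetical study: r = 0.6 observed in a sample of 27 participants.
t_stat, df = correlation_t_statistic(r=0.6, n=27)
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

For these made-up inputs t is 3.75 on 25 degrees of freedom, which exceeds the two-tailed 5% critical value of roughly 2.06, so the correlation would be reported as statistically significant at that level.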

Types of Correlation Analysis

The main types of correlation analysis are as follows:

Pearson Correlation

This is the most common type of correlation analysis. Pearson correlation measures the linear relationship between two continuous variables. Significance tests on it conventionally assume that the variables are approximately normally distributed and that the spread of one variable is roughly constant across values of the other (homoscedasticity); the two variables do not need to have equal variances. The correlation coefficient (r) ranges from -1 to +1, with -1 indicating a perfect negative linear relationship, +1 indicating a perfect positive linear relationship, and 0 indicating no linear relationship.

Spearman Rank Correlation

Spearman’s rank correlation is a non-parametric measure that assesses how well the relationship between two variables can be described using a monotonic function. In other words, it evaluates the degree to which, as one variable increases, the other variable tends to consistently increase (or consistently decrease), without requiring that change to happen at a constant rate.

Kendall’s Tau

Kendall’s Tau is another non-parametric correlation measure used to detect the strength of dependence between two variables. Kendall’s Tau is often used for variables measured on an ordinal scale (i.e., where values can be ranked).

Point-Biserial Correlation

This is used when you have one dichotomous and one continuous variable and want to measure the association between them. It’s a special case of the Pearson correlation, obtained by coding the dichotomous variable as 0 and 1.

Phi Coefficient

This is used when both variables are dichotomous or binary (having two categories). It’s a measure of association for two binary variables.

Canonical Correlation

This measures the correlation between two sets of variables. The method finds a linear combination of the variables in each set such that the correlation between the two combinations is as large as possible.

Partial and Semi-Partial (Part) Correlations

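These are used when the researcher wants to understand the relationship between two variables while controlling for the effect of one or more additional variables. A partial correlation removes the effect of the control variables from both variables of interest; a semi-partial (part) correlation removes it from only one of them.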

Cross-Correlation

Used mostly in time series data to measure the similarity of two series as a function of the displacement of one relative to the other.

Autocorrelation

This is the correlation of a signal with a delayed copy of itself as a function of delay. It is often used in time series analysis to detect repeating patterns, such as seasonality, and to see how strongly current values depend on past values.
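As an illustration of the "delayed copy" idea, here is a minimal sketch of the sample autocorrelation at a given lag. It assumes one common definition: the sum of products of mean-centered values that are `lag` steps apart, divided by the total sum of squared deviations. The function name and the toy series are made up for the example.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a non-negative integer `lag`."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    deviations = [value - mean for value in series]
    denominator = sum(d * d for d in deviations)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Pair each value with the value `lag` steps later.
    numerator = sum(deviations[t] * deviations[t + lag] for t in range(n - lag))
    return numerator / denominator

signal = [0, 1, 0, -1] * 5           # toy series that repeats every 4 steps
lag_2 = autocorrelation(signal, 2)   # half a cycle out of phase
lag_4 = autocorrelation(signal, 4)   # one full cycle later
print(lag_2, lag_4)
```

For this toy series the lag-4 value is 0.8 and the lag-2 value is -0.9: the series lines up with itself after one full cycle and is nearly its own mirror image after half a cycle. Cross-correlation follows the same pattern with two different series in place of one.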

Correlation Analysis Formulas

There are several formulas for correlation analysis, each corresponding to a different type of correlation. Here are some of the most commonly used ones:

Pearson’s Correlation Coefficient (r)

Pearson’s correlation coefficient measures the linear relationship between two variables. The formula is:

   r = Σ[(xi – Xmean)(yi – Ymean)] / sqrt[(Σ(xi – Xmean)²)(Σ(yi – Ymean)²)]

  • xi and yi are the values of X and Y variables.
  • Xmean and Ymean are the mean values of X and Y.
  • Σ denotes the sum of the values.
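The formula above translates almost term by term into code. The following is a minimal sketch rather than a replacement for a statistics package; the function name and the study-time data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed directly from the sums in the formula."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    # Numerator: sum of products of deviations from the means.
    numerator = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    # Denominator: square root of the product of the two sums of squares.
    denominator = math.sqrt(
        sum((xi - x_mean) ** 2 for xi in x) * sum((yi - y_mean) ** 2 for yi in y)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

hours = [1, 2, 3, 4, 5]          # hypothetical weekly study hours
scores = [52, 60, 61, 70, 77]    # hypothetical exam scores
r = pearson_r(hours, scores)
print(round(r, 3))
```

With these numbers the numerator is 60 and the two sums of squares are 10 and 374, giving r of about 0.98, a strong positive linear relationship.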

Spearman’s Rank Correlation Coefficient (rs)

Spearman’s correlation coefficient measures the monotonic relationship between two variables. When there are no tied ranks, the formula is:

   rs = 1 – 6Σd² / [n(n² – 1)]

  • d is the difference between the ranks of corresponding variables.
  • n is the number of observations.
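A minimal sketch of the rank-difference formula follows. It assumes there are no tied values, which is the condition under which this shortcut is exact; with ties, the usual approach is to compute Pearson's r on average ranks instead. Function names and data are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rs(x, y):
    """Spearman's rs via 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("tied values: the shortcut formula does not apply")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))

monotonic = spearman_rs([1, 2, 3, 4, 5], [1, 8, 27, 64, 125])  # y = x^3
mixed = spearman_rs([10, 20, 30, 40, 50], [3, 1, 4, 2, 5])
print(monotonic, mixed)
```

The first pair is perfectly monotonic but not linear, so rs is 1.0. In the second pair the squared rank differences sum to 10, giving 1 - 60/120 = 0.5.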

Kendall’s Tau (τ)

Kendall’s Tau is a measure of rank correlation. The formula is:

   τ = (nc – nd) / [0.5 n(n – 1)]

  • nc is the number of concordant pairs.
  • nd is the number of discordant pairs.
  • n is the number of observations, so 0.5 n(n – 1) is the total number of pairs.
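Counting concordant and discordant pairs is easiest to see in code. This sketch implements the formula as written above (sometimes called tau-a), in which tied pairs count as neither concordant nor discordant; the function name and data are made up.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau from counts of concordant and discordant pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:      # both variables order the pair the same way
            concordant += 1
        elif direction < 0:    # the variables order the pair in opposite ways
            discordant += 1
    return (concordant - discordant) / (0.5 * n * (n - 1))

tau = kendall_tau([1, 2, 3, 4, 5], [3, 1, 4, 2, 5])
print(tau)
```

Of the 10 possible pairs here, 7 are concordant and 3 are discordant, so tau is (7 - 3) / 10 = 0.4.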

Point-Biserial Correlation

The point-biserial correlation is a special case of Pearson’s correlation, so it uses the same formula, with the dichotomous variable coded as 0 and 1.

Phi Coefficient

The phi coefficient is a measure of association for two binary variables. It’s equivalent to Pearson’s correlation computed on the two variables coded as 0 and 1.

Partial Correlation

The formula for partial correlation is more complex and depends on the Pearson’s correlation coefficients between the variables.

For partial correlation between X and Y given Z:

   rp(xy.z) = (rxy – rxz × ryz) / sqrt[(1 – rxz²)(1 – ryz²)]

  • rxy, rxz, ryz are the Pearson’s correlation coefficients.
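A short worked example shows how much controlling for Z can change the picture. The three input coefficients below are made up for illustration, and the function name is hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when X or Y is perfectly correlated with Z")
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical coefficients: X and Y correlate 0.5, but both track Z closely.
r_partial = partial_correlation(r_xy=0.5, r_xz=0.6, r_yz=0.8)
print(round(r_partial, 3))
```

Here the numerator is 0.5 - 0.48 = 0.02 and the denominator is sqrt(0.64 × 0.36) = 0.48, so the partial correlation is about 0.04. In this constructed case nearly all of the raw correlation between X and Y is accounted for by their shared relationship with Z.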

Correlation Analysis Examples

Here are a few examples of how correlation analysis could be applied in different contexts:

  • Education: A researcher might want to determine if there’s a relationship between the amount of time students spend studying each week and their exam scores. The two variables would be “study time” and “exam scores”. If a positive correlation is found, it means that students who study more tend to score higher on exams.
  • Healthcare: A healthcare researcher might be interested in understanding the relationship between age and cholesterol levels. If a positive correlation is found, it could mean that as people age, their cholesterol levels tend to increase.
  • Economics: An economist may want to investigate if there’s a correlation between the unemployment rate and the rate of crime in a given city. If a positive correlation is found, it could suggest that as the unemployment rate increases, the crime rate also tends to increase.
  • Marketing: A marketing analyst might want to analyze the correlation between advertising expenditure and sales revenue. A positive correlation would suggest that higher advertising spending is associated with higher sales revenue.
  • Environmental Science: A scientist might be interested in whether there’s a relationship between the amount of CO2 emissions and average temperature increase. A positive correlation would indicate that higher CO2 emissions are associated with higher average temperatures.

Importance of Correlation Analysis

Correlation analysis plays a crucial role in many fields of study for several reasons:

  • Understanding Relationships: Correlation analysis provides a statistical measure of the relationship between two or more variables. It helps in understanding how one variable may change in relation to another.
  • Predicting Trends: When variables are correlated, the value of one can help predict the value of another. This is particularly useful in fields like finance, weather forecasting, and technology, where forecasting trends is vital.
  • Data Reduction: If two variables are highly correlated, they are conveying similar information, and you may decide to use only one of them in your analysis, reducing the dimensionality of your data.
  • Testing Hypotheses: Correlation analysis can be used to test hypotheses about relationships between variables. For example, a researcher might want to test whether there’s a significant positive correlation between physical exercise and mental health.
  • Determining Factors: It can help identify factors that are associated with certain behaviors or outcomes. For example, public health researchers might analyze correlations to identify risk factors for diseases.
  • Model Building: Correlation is a fundamental concept in building multivariate statistical models, including regression models and structural equation models. These models often require an understanding of the inter-relationships (correlations) among multiple variables.
  • Validity and Reliability Analysis: In psychometrics, correlation analysis is used to assess the validity and reliability of measurement instruments such as tests or surveys.

Applications of Correlation Analysis

Correlation analysis is used in many fields to understand and quantify the relationship between variables. Here are some of its key applications:

  • Finance: In finance, correlation analysis is used to understand the relationship between different investment types or the risk and return of a portfolio. For example, if two stocks are positively correlated, they tend to move together; if they’re negatively correlated, they tend to move in opposite directions.
  • Economics: Economists use correlation analysis to understand the relationship between various economic indicators, such as GDP and unemployment rate, inflation rate and interest rates, or income and consumption patterns.
  • Marketing: Correlation analysis can help marketers understand the relationship between advertising spend and sales, or the relationship between price changes and demand.
  • Psychology: In psychology, correlation analysis can be used to understand the relationship between different psychological variables, such as the correlation between stress levels and sleep quality, or between self-esteem and academic performance.
  • Medicine: In healthcare, correlation analysis can be used to understand the relationships between various health outcomes and potential predictors. For example, researchers might investigate the correlation between physical activity levels and heart disease, or between smoking and lung cancer.
  • Environmental Science: Correlation analysis can be used to investigate the relationships between different environmental factors, such as the correlation between CO2 levels and average global temperature, or between pesticide use and biodiversity.
  • Social Sciences: In fields like sociology and political science, correlation analysis can be used to investigate relationships between different social and political phenomena, such as the correlation between education levels and political participation, or between income inequality and social unrest.

Advantages and Disadvantages of Correlation Analysis

About the author.

Muhammad Hassan

Researcher, Academic Writer, Web developer


Correlational Research | Guide, Design & Examples

Published on 5 May 2022 by Pritha Bhandari. Revised on 5 December 2022.

A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them.

A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

Table of contents

  • Correlational vs experimental research
  • When to use correlational research
  • How to collect correlational data
  • How to analyse correlational data
  • Correlation and causation
  • Frequently asked questions about correlational research

Correlational vs experimental research

Correlational and experimental research both use quantitative methods to investigate relationships between variables. But there are important differences in how data is collected and the types of conclusions you can draw.

When to use correlational research

Correlational research is ideal for gathering data quickly from natural settings. That helps you generalise your findings to real-life situations in an externally valid way.

There are a few situations where correlational research is an appropriate choice.

To investigate non-causal relationships

You want to find out if there is an association between two variables, but you don’t expect to find a causal relationship between them.

Correlational research can provide insights into complex real-world relationships, helping researchers develop theories and make predictions.

To explore causal relationships between variables

You think there is a causal relationship between two variables, but it is impractical, unethical, or too costly to conduct experimental research that manipulates one of the variables.

Correlational research can provide initial indications or additional support for theories about causal relationships.

To test new measurement tools

You have developed a new instrument for measuring your variable, and you need to test its reliability or validity.

Correlational research can be used to assess whether a tool consistently or accurately captures the concept it aims to measure.

How to collect correlational data

There are many different methods you can use in correlational research. In the social and behavioural sciences, the most common data collection methods for this type of research include surveys, observations, and secondary data.

It’s important to carefully choose and plan your methods to ensure the reliability and validity of your results. You should carefully select a representative sample so that your data reflects the population you’re interested in without bias.

Surveys

In survey research, you can use questionnaires to measure your variables of interest. You can conduct surveys online, by post, by phone, or in person.

Surveys are a quick, flexible way to collect standardised data from many participants, but it’s important to ensure that your questions are worded in an unbiased way and capture relevant insights.

Naturalistic observation

Naturalistic observation is a type of field research where you gather data about a behaviour or phenomenon in its natural environment.

This method often involves recording, counting, describing, and categorising actions and events. Naturalistic observation can include both qualitative and quantitative elements, but to assess correlation, you collect data that can be analysed quantitatively (e.g., frequencies, durations, scales, and amounts).

Naturalistic observation lets you easily generalise your results to real-world contexts, and you can study experiences that aren’t replicable in lab settings. But data analysis can be time-consuming and unpredictable, and researcher bias may skew the interpretations.

Secondary data

Instead of collecting original data, you can also use data that has already been collected for a different purpose, such as official records, polls, or previous studies.

Using secondary data is inexpensive and fast, because data collection is complete. However, the data may be unreliable, incomplete, or not entirely relevant, and you have no control over the reliability or validity of the data collection procedures.

How to analyse correlational data

After collecting data, you can statistically analyse the relationship between variables using correlation or regression analyses, or both. You can also visualise the relationships between variables with a scatterplot.

Different types of correlation coefficients and regression analyses are appropriate for your data based on their levels of measurement and distributions.

Correlation analysis

Using a correlation analysis, you can summarise the relationship between variables into a correlation coefficient: a single number that quantifies the strength and direction of the relationship between variables.

The Pearson product-moment correlation coefficient, also known as Pearson’s r, is commonly used for assessing a linear relationship between two quantitative variables.

Correlation coefficients are usually found for two variables at a time, but you can use a multiple correlation coefficient for three or more variables.

Regression analysis

With a regression analysis, you can predict how much a change in one variable will be associated with a change in the other variable. The result is a regression equation that describes the line on a graph of your variables.

You can use this equation to predict the value of one variable based on the given value(s) of the other variable(s). It’s best to perform a regression analysis after testing for a correlation between your variables.
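As a rough illustration of a regression equation, the sketch below fits a least-squares line to a handful of made-up points and uses it for a prediction. It covers only the simplest case, one predictor and a straight line; the function name and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("slope is undefined when x is constant")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

hours = [1, 2, 3, 4, 5]          # hypothetical weekly study hours
scores = [52, 60, 61, 70, 77]    # hypothetical exam scores

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6   # predicted score for 6 hours of study
print(slope, intercept, predicted)
```

For these numbers the fitted equation is score = 46 + 6 × hours, so the prediction for 6 hours is 82. The slope says how much the predicted score changes per extra hour; the correlation coefficient says how tightly the points cluster around that line.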

Correlation and causation

It’s important to remember that correlation does not imply causation. Just because you find a correlation between two things doesn’t mean you can conclude one of them causes the other, for a few reasons.

Directionality problem

If two variables are correlated, it could be because one of them is a cause and the other is an effect. But the correlational research design doesn’t allow you to infer which is which. To err on the side of caution, researchers don’t conclude causality from correlational studies.

Third variable problem

A confounding variable is a third variable that influences other variables to make them seem causally related even though they are not. Instead, there are separate causal links between the confounder and each variable.

In correlational research, there’s limited or no researcher control over extraneous variables. Even if you statistically control for some potential confounders, there may still be other hidden variables that disguise the relationship between your study variables.

Although a correlational study can’t demonstrate causation on its own, it can help you develop a causal hypothesis that’s tested in controlled experiments.

Frequently asked questions about correlational research

A correlation reflects the strength and/or direction of the association between two or more variables.

  • A positive correlation means that both variables change in the same direction.
  • A negative correlation means that the variables change in opposite directions.
  • A zero correlation means there’s no linear relationship between the variables.

A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. It’s a non-experimental type of quantitative research.

Controlled experiments establish causality, whereas correlational studies only show associations between variables.

  • In an experimental design, you manipulate an independent variable and measure its effect on a dependent variable. Other variables are controlled so they can’t impact the results.
  • In a correlational design, you measure variables without manipulating any of them. You can test whether your variables change together, but you can’t be sure that one variable caused a change in another.

In general, correlational research is high in external validity while experimental research is high in internal validity.

A correlation is usually tested for two variables at a time, but you can test correlations between three or more variables.

A correlation coefficient is a single number that describes the strength and direction of the relationship between your variables.

Different types of correlation coefficients might be appropriate for your data based on their levels of measurement and distributions. The Pearson product-moment correlation coefficient (Pearson’s r) is commonly used to assess a linear relationship between two quantitative variables.

Cite this Scribbr article

Bhandari, P. (2022, December 05). Correlational Research | Guide, Design & Examples. Scribbr. Retrieved 27 May 2024, from https://www.scribbr.co.uk/research-methods/correlational-research-design/

7.2 Correlational Research

Learning Objectives

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of nonexperimental research.

What Is Correlational Research?

Correlational research is a type of nonexperimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are essentially two reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

The other reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, Allen Kanner and his colleagues thought that the number of “daily hassles” (e.g., rude salespeople, heavy traffic) that people experience affects the number of physical and psychological symptoms they have (Kanner, Coyne, Schaefer, & Lazarus, 1981). But because they could not manipulate the number of daily hassles their participants experienced, they had to settle for measuring the number of daily hassles—along with the number of symptoms—using self-report questionnaires. Although the strong positive relationship they found between these two variables is consistent with their idea that hassles cause symptoms, it is also consistent with the idea that symptoms cause hassles or that some third variable (e.g., neuroticism) causes both.

A common misconception among beginning researchers is that correlational research must involve two quantitative variables, such as scores on two extraversion tests or the number of hassles and number of symptoms people have experienced. However, the defining feature of correlational research is that the two variables are measured—neither one is manipulated—and this is true regardless of whether the variables are quantitative or categorical. Imagine, for example, that a researcher administers the Rosenberg Self-Esteem Scale to 50 American college students and 50 Japanese college students. Although this “feels” like a between-subjects experiment, it is a correlational study because the researcher did not manipulate the students’ nationalities. The same is true of the study by Cacioppo and Petty comparing college faculty and factory workers in terms of their need for cognition. It is a correlational study because the researchers did not manipulate the participants’ occupations.

Figure 7.2 “Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists” shows data from a hypothetical study on the relationship between whether people make a daily list of things to do (a “to-do list”) and stress. Notice that it is unclear whether this is an experiment or a correlational study because it is unclear whether the independent variable was manipulated. If the researcher randomly assigned some participants to make daily to-do lists and others not to, then it is an experiment. If the researcher simply asked participants whether they made daily to-do lists, then it is a correlational study. The distinction is important because if the study was an experiment, then it could be concluded that making the daily to-do lists reduced participants’ stress. But if it was a correlational study, it could only be concluded that these variables are statistically related. Perhaps being stressed has a negative effect on people’s ability to plan ahead (the directionality problem). Or perhaps people who are more conscientious are more likely to make to-do lists and less likely to be stressed (the third-variable problem). The crucial point is that what defines a study as experimental or correlational is not the variables being studied, nor whether the variables are quantitative or categorical, nor the type of graph or statistics used to analyze the data. It is how the study is conducted.

Figure 7.2 Results of a Hypothetical Study on Whether People Who Make Daily To-Do Lists Experience Less Stress Than People Who Do Not Make Such Lists

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. However, because some approaches to data collection are strongly associated with correlational research, it makes sense to discuss them here. The two we will focus on are naturalistic observation and archival data. A third, survey research, is discussed in its own chapter.

Naturalistic Observation

Naturalistic observation is an approach to data collection that involves observing people’s behavior in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). It could involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are often not aware that they are being studied. Ethically, this is considered to be acceptable if the participants remain anonymous and the behavior occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behavior that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behavior” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

Researchers Robert Levine and Ara Norenzayan used naturalistic observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999). One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in the United States and Japan covered 60 feet in about 12 seconds on average, while people in Brazil and Romania took close to 17 seconds.

Because naturalistic observation takes place in the complex and even chaotic “real world,” there are two closely related issues that researchers must deal with before collecting data. The first is sampling. When, where, and under what conditions will the observations be made, and who exactly will be observed? Levine and Norenzayan described their sampling process as follows:

Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities. (p. 186)

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds.

The second issue is measurement. What specific behaviors will be observed? In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance. Often, however, the behaviors of interest are not so obvious or objective. For example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979). But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practiced by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

Figure: A woman bowling. Naturalistic observation has revealed that bowlers tend to smile when they turn away from the pins and toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication. Photo: sieneke toering – bowling big lebowski style – CC BY-NC-ND 2.0.

When the observations require a judgment on the part of the observers (as in Kraut and Johnston’s study), this process is often described as coding. Coding generally requires clearly defining a set of target behaviors. The observers then categorize participants individually in terms of which behavior they have engaged in and the number of times they engaged in each behavior. The observers might even record the duration of each behavior. The target behaviors must be defined in such a way that different observers code them in the same way. This is the issue of interrater reliability. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviors independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers agreed on the reactions exhibited 97% of the time, indicating good interrater reliability.
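As a rough illustration of how agreement between two raters could be quantified, the sketch below computes simple percent agreement and also Cohen's kappa, a chance-corrected statistic that the study described above does not report. The two lists of codes are invented for illustration and are not Kraut and Johnston's data; the function names are likewise hypothetical.

```python
from collections import Counter

def percent_agreement(rater_a, rater_b):
    """Share of items on which both raters assigned the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per code.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes from two observers for the same ten reactions.
rater_a = ["open smile", "neutral face", "laugh", "look down", "open smile",
           "closed smile", "neutral face", "look away", "open smile", "laugh"]
rater_b = ["open smile", "neutral face", "laugh", "look down", "closed smile",
           "closed smile", "neutral face", "look away", "open smile", "laugh"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"Percent agreement: {agreement:.0%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

Percent agreement is the easier figure to read, but it can look high simply because some codes are very common, which is why a chance-corrected statistic is often reported alongside it.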

Archival Data

Another approach to correlational research is the use of archival data, which are data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism,” the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005). In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988). In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as college students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as college students, the healthier they were as older men. Pearson’s r was +.25.

This is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as naturalistic observation requires specifying the behaviors of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
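The counting step of a content analysis can be sketched in a few lines. The example below is only a toy version of the idea: it counts whole-word, case-insensitive matches of a keyword list across a set of texts. The responses, the keywords, and the function name `count_keywords` are all made up for illustration and do not come from any of the studies cited here.

```python
import re
from collections import Counter

def count_keywords(documents, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword.

    Keywords are assumed to be single lowercase words.
    """
    totals = Counter({kw: 0 for kw in keywords})
    for text in documents:
        # Split the lowercased text into simple word tokens.
        words = re.findall(r"[a-z']+", text.lower())
        word_counts = Counter(words)
        for kw in keywords:
            totals[kw] += word_counts[kw]
    return totals

# Hypothetical open-ended responses; not data from any cited study.
responses = [
    "It was my fault and it will always be my fault.",
    "The weather was bad, but that was just bad luck.",
    "I always fail at this kind of thing.",
]
keywords = ["fault", "always", "luck"]

counts = count_keywords(responses, keywords)
print(dict(counts))
```

Real content analyses usually rely on trained human raters or more careful text processing, since simple keyword matching ignores context and meaning.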

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlational research is not defined by where or how the data are collected. However, some approaches to data collection are strongly associated with correlational research. These include naturalistic observation (in which researchers observe people’s behavior in the context in which it normally occurs) and the use of archival data that were already collected for some other purpose.

Discussion: For each of the following, decide whether it is most likely that the study described is experimental or correlational and explain why.

  • An educational researcher compares the academic performance of students from the “rich” side of town with that of students from the “poor” side of town.
  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.

Kanner, A. D., Coyne, J. C., Schaefer, C., & Lazarus, R. S. (1981). Comparison of two modes of stress measurement: Daily hassles and uplifts versus major life events. Journal of Behavioral Medicine, 4, 1–39.

Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.

Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.

Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.

Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Research Methods in Psychology Copyright © 2016 by University of Minnesota is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.

Correlation – Connecting the Dots, the Role of Correlation in Data Analysis

  • September 23, 2023

Correlation is a fundamental concept in statistics and data science. It quantifies the degree to which two variables are related. But what does this mean, and how can we use it to our advantage in real-world scenarios? Let’s dive deep into understanding correlation, how to measure it, and its practical implications.

In this Blog post we will learn:

  • What is Correlation?
  • Importance of Correlation in Data Science?
  • How to Measure Correlation?
      • 3.1. Types of Correlation
      • 3.2. Pearson Correlation Coefficient
      • 3.3. Formula
      • 3.4. Explanation
      • 3.5. Interpretation
  • Calculate Correlation Using Python
      • 4.1. Visualize Correlations
      • 4.2. Test for Significance in Correlation
      • 4.3. Handle Multiple Correlations
      • 4.4. Visualizing the Correlation Matrix with a Heatmap
      • 4.5. How to Account for Non-Linear Correlations?
  • Difference Between Correlation and Causation?

1. What is Correlation?

Correlation refers to a statistical measure that represents the strength and direction of a linear relationship between two variables. If you’ve ever wondered if one event or variable has a relationship with another, you’re thinking about correlation. For instance, does the number of hours you study correlate with your exam scores?

2. Importance of Correlation in Data Science?

Understanding correlations can help data scientists:

  • Discover relationships between variables.
  • Determine important variables for predictive modeling.
  • Uncover underlying patterns in data.
  • Make better business decisions by understanding key drivers.

3. How to Measure Correlation?

The most common measure of correlation is the Pearson correlation coefficient, often denoted as ‘r’. Its values range between -1 and 1. Here’s what these values indicate:

  • 1 or -1: Perfect correlation; 1 is positive, -1 is negative.
  • 0: No linear correlation.
  • Between 0 and ±1: Varying degrees of correlation, with strength increasing as it approaches ±1.

3.1. Types of Correlation

Positive Correlation:

  • Value: $r$ is between 0 and +1.
  • Meaning: When one variable increases, the other also tends to increase, and when one decreases, the other also tends to decrease.
  • Graphically, a positive correlation will generally display a line of best fit that slopes upwards.

Negative Correlation:

  • Value: $r$ is between 0 and -1.
  • Meaning: When one variable increases, the other tends to decrease, and vice versa.
  • Graphically, a negative correlation will typically show a line of best fit that slopes downwards.

No Correlation (Zero Correlation):

  • Value: $r$ is approximately 0.
  • Meaning: Changes in one variable do not predict any particular linear change in the other variable. Note that this does not guarantee the variables are independent, because a nonlinear relationship can still produce an $r$ near 0.
  • Graphically, data with no correlation will appear scattered with no discernible linear trend.

3.2. Pearson Correlation Coefficient

The Pearson correlation coefficient, often denoted as $r$, quantifies the linear relationship between two variables. Let’s delve into its formula and understand its significance.

3.3. Formula:

Given two variables, $X$ and $Y$, with data points $x_1, x_2, \dots, x_n$ and $y_1, y_2, \dots, y_n$ respectively, the Pearson correlation coefficient, $r$, is formulated as:

$ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}} $

Where:

  • $\bar{x}$ represents the mean of the $x$ values.
  • $\bar{y}$ represents the mean of the $y$ values.

3.4. Explanation:

  • The numerator, $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$, sums up the product of the deviations of each data point from their respective averages. This evaluates if the deviations of one variable coincide with the deviations of the other.
  • The denominator ensures normalization of the coefficient, ensuring that $r$ remains between -1 and 1. The terms $\sum_{i=1}^{n} (x_i - \bar{x})^2$ and $\sum_{i=1}^{n} (y_i - \bar{y})^2$ sum the squared deviations of each data point from their means for $X$ and $Y$ respectively.
  • $r = 1$: Indicates a perfect positive linear relationship between $X$ and $Y$.
  • $r = -1$: Signifies a perfect negative linear relationship between $X$ and $Y$.
  • $r = 0$: Suggests no evident linear trend between the variables.
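To make the formula concrete, here is a minimal sketch that computes $r$ directly from the definition using only the Python standard library. The function name `pearson_r` and the sample values are illustrative assumptions. The function does not handle the case where one variable is constant, in which the denominator is zero and $r$ is undefined.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed directly from the definition."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of paired deviations from the means.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of summed squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return numerator / math.sqrt(ss_x * ss_y)

# Illustrative values only.
xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # points on an increasing line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a decreasing line
print(r_up, r_down)
```

In practice a library routine is usually preferable, but writing the calculation out shows how the numerator and denominator described above fit together.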

3.5. Interpretation:

Envision plotting the data points of $X$ and $Y$ on a scatter plot. The Pearson correlation provides insight into how closely these points cluster around a straight line.

  • An $r$ value near 1 implies that as $X$ increases, $Y$ also tends to increase, resulting in an upward trending line.
  • An $r$ value near -1 indicates that as $X$ increases, $Y$ generally decreases, yielding a downward trending line.
  • A value approaching 0 indicates no discernible linear trend between the variables.

However, a crucial note is that correlation doesn’t signify causation. A strong correlation doesn’t necessarily indicate that one variable caused the other.

4. Calculate Correlation Using Python

Let’s assume you’re a teacher who wants to understand if there’s a relationship between the hours a student studies and their exam scores.

Scenario: You have data on 5 students: hours studied and their corresponding exam scores.
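The original data values are not reproduced here, so the sketch below uses made-up numbers for five students. It relies on NumPy's `np.corrcoef`, which returns a correlation matrix from which the single coefficient can be read.

```python
import numpy as np

# Hypothetical data for five students (assumed values, for illustration only).
hours_studied = np.array([1, 2, 3, 4, 5])
exam_scores = np.array([50, 55, 65, 70, 80])

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two variables.
r = np.corrcoef(hours_studied, exam_scores)[0, 1]
print(f"Pearson r = {r:.3f}")
```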

This output suggests a strong positive correlation between study hours and exam scores.

4.1. Visualize Correlations

A scatter plot is a common way to visualize the relationship between two variables.
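A minimal matplotlib sketch of such a plot is shown here. The data values and the output filename are assumptions for illustration, and the non-interactive `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

# Hypothetical data for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [50, 55, 65, 70, 80]

fig, ax = plt.subplots()
ax.scatter(hours_studied, exam_scores)
ax.set_xlabel("Hours studied")
ax.set_ylabel("Exam score")
ax.set_title("Hours studied vs. exam score")
fig.savefig("study_scatter.png")  # assumed output path
```

In an interactive session, `plt.show()` can be used instead of saving the figure to a file.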

4.2. Test for Significance in Correlation

It helps to determine whether the observed correlation is statistically significant. A significance test asks how likely it would be to observe a correlation at least this strong if there were no true linear relationship between the variables.

If the p-value is below a threshold (commonly 0.05), the correlation is considered statistically significant.
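One way to obtain both the coefficient and its p-value is SciPy's `scipy.stats.pearsonr`. The sketch below uses the same kind of made-up study data as an assumption; with only five observations, any real conclusion would need far more caution than this toy example suggests.

```python
from scipy import stats

# Hypothetical data for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_scores = [50, 55, 65, 70, 80]

# pearsonr returns the correlation coefficient and a two-sided p-value.
r, p_value = stats.pearsonr(hours_studied, exam_scores)

alpha = 0.05  # conventional significance threshold
print(f"r = {r:.3f}, p = {p_value:.4f}")
print("significant" if p_value < alpha else "not significant")
```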

4.3. Handle Multiple Correlations

In real-world datasets, you might want to check correlations between multiple variables. This can be done using a correlation matrix.
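With pandas, a correlation matrix can be produced with `DataFrame.corr()`, which computes pairwise Pearson correlations by default. The data frame below, including its column names and values, is a hypothetical example.

```python
import pandas as pd

# Hypothetical data; the column names and values are illustrative.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [50, 55, 65, 70, 80],
    "hours_slept": [8, 7, 7, 6, 5],
})

corr_matrix = df.corr()  # pairwise Pearson correlations by default
print(corr_matrix.round(2))
```

Each cell holds the correlation between the row variable and the column variable, so the diagonal is always 1 and the matrix is symmetric.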

4.4. Visualizing the Correlation Matrix with a Heatmap

A heatmap is a common way to quickly grasp the relationships among multiple variables in a dataset, and it is especially helpful for larger datasets where a table of numbers is hard to scan. The Python libraries pandas and seaborn can be used to display these correlations.

  • annot=True ensures that the correlation values appear on the heatmap.
  • cmap specifies the color palette. In this case, we’ve chosen ‘coolwarm’, but there are various palettes available in seaborn.
  • linewidths determines the width of the lines that will divide each cell.
  • vmin and vmax are used to anchor the colormap, ensuring that the center is set at a meaningful value.
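A minimal sketch of a `seaborn.heatmap` call using the parameters just listed might look like the following. The data frame, its column names, the `linewidths` value, and the output filename are assumptions made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data; the column names and values are illustrative.
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [50, 55, 65, 70, 80],
    "hours_slept": [8, 7, 7, 6, 5],
})
corr_matrix = df.corr()

fig, ax = plt.subplots()
sns.heatmap(
    corr_matrix,
    annot=True,        # write each correlation value in its cell
    cmap="coolwarm",   # diverging palette: cool for negative, warm for positive
    linewidths=0.5,    # thin lines between cells
    vmin=-1, vmax=1,   # fix the color scale to the full range of r
    ax=ax,
)
ax.set_title("Correlation matrix")
fig.savefig("correlation_heatmap.png", bbox_inches="tight")  # assumed path
```

Setting `vmin=-1` and `vmax=1` keeps the color scale symmetric around zero, so the same color always corresponds to the same correlation value across plots.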

4.5. How to Account for Non-Linear Correlations?

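Pearson’s correlation coefficient captures linear relationships. But what if the relationship is curved or nonlinear? One option is Spearman’s rank correlation. It is based on ranked values rather than raw data, so it captures monotonic relationships (where one variable consistently rises or falls with the other) even when they are not linear. It will not detect non-monotonic patterns such as a U-shape.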
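The difference can be seen on a small, made-up example in which $y$ grows as the cube of $x$: the relationship is perfectly monotonic but not linear. The sketch uses `scipy.stats.pearsonr` and `scipy.stats.spearmanr`; the data are an assumption chosen only to illustrate the contrast.

```python
from scipy import stats

# Illustrative data: y increases with x, but along a curve rather than a line.
x = [1, 2, 3, 4, 5, 6]
y = [value ** 3 for value in x]

pearson_r, _ = stats.pearsonr(x, y)
spearman_rho, _ = stats.spearmanr(x, y)

print(f"Pearson r    = {pearson_r:.3f}")
print(f"Spearman rho = {spearman_rho:.3f}")
```

Because the ranks of $x$ and $y$ line up exactly, Spearman's coefficient reaches its maximum, while Pearson's coefficient falls short of 1 because the points do not lie on a straight line.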

5. Difference Between Correlation and Causation?

It’s vital to note that correlation does not imply causation. Just because two variables are correlated doesn’t mean one caused the other. Using our example, while hours studied and exam scores are correlated, it doesn’t mean studying longer always causes better scores. Other factors might play a role.

6. Conclusion

Correlation is a powerful tool in data science, offering insights into relationships between variables. But it’s crucial to use it judiciously and remember that correlation doesn’t equate to causation. Python, with its rich library ecosystem, provides many tools and methods to efficiently calculate, visualize, and interpret correlations.

The key is to understand the data, choose the appropriate correlation measure, and always be aware of the underlying assumptions.

© Machinelearningplus. All rights reserved.

  • Climate Change
  • Conservation of the Environment (Social Science)
  • Environmentalist Thought and Ideology (Social Science)
  • Natural Disasters (Environment)
  • Social Impact of Environmental Issues (Social Science)
  • Browse content in Human Geography
  • Cultural Geography
  • Economic Geography
  • Political Geography
  • Browse content in Interdisciplinary Studies
  • Communication Studies
  • Museums, Libraries, and Information Sciences
  • Browse content in Politics
  • African Politics
  • Asian Politics
  • Chinese Politics
  • Comparative Politics
  • Conflict Politics
  • Elections and Electoral Studies
  • Environmental Politics
  • European Union
  • Foreign Policy
  • Gender and Politics
  • Human Rights and Politics
  • Indian Politics
  • International Relations
  • International Organization (Politics)
  • International Political Economy
  • Irish Politics
  • Latin American Politics
  • Middle Eastern Politics
  • Political Methodology
  • Political Communication
  • Political Philosophy
  • Political Sociology
  • Political Theory
  • Political Behaviour
  • Political Economy
  • Political Institutions
  • Politics and Law
  • Politics of Development
  • Public Administration
  • Public Policy
  • Quantitative Political Methodology
  • Regional Political Studies
  • Russian Politics
  • Security Studies
  • State and Local Government
  • UK Politics
  • US Politics
  • Browse content in Regional and Area Studies
  • African Studies
  • Asian Studies
  • East Asian Studies
  • Japanese Studies
  • Latin American Studies
  • Middle Eastern Studies
  • Native American Studies
  • Scottish Studies
  • Browse content in Research and Information
  • Research Methods
  • Browse content in Social Work
  • Addictions and Substance Misuse
  • Adoption and Fostering
  • Care of the Elderly
  • Child and Adolescent Social Work
  • Couple and Family Social Work
  • Direct Practice and Clinical Social Work
  • Emergency Services
  • Human Behaviour and the Social Environment
  • International and Global Issues in Social Work
  • Mental and Behavioural Health
  • Social Justice and Human Rights
  • Social Policy and Advocacy
  • Social Work and Crime and Justice
  • Social Work Macro Practice
  • Social Work Practice Settings
  • Social Work Research and Evidence-based Practice
  • Welfare and Benefit Systems
  • Browse content in Sociology
  • Childhood Studies
  • Community Development
  • Comparative and Historical Sociology
  • Economic Sociology
  • Gender and Sexuality
  • Gerontology and Ageing
  • Health, Illness, and Medicine
  • Marriage and the Family
  • Migration Studies
  • Occupations, Professions, and Work
  • Organizations
  • Population and Demography
  • Race and Ethnicity
  • Social Theory
  • Social Movements and Social Change
  • Social Research and Statistics
  • Social Stratification, Inequality, and Mobility
  • Sociology of Religion
  • Sociology of Education
  • Sport and Leisure
  • Urban and Rural Studies
  • Browse content in Warfare and Defence
  • Defence Strategy, Planning, and Research
  • Land Forces and Warfare
  • Military Administration
  • Military Life and Institutions
  • Naval Forces and Warfare
  • Other Warfare and Defence Issues
  • Peace Studies and Conflict Resolution
  • Weapons and Equipment

Design and Analysis for Quantitative Research in Music Education

6 Correlational Design and Analysis

  • Published: March 2018

Interests in how variables may relate to each other and how systems of relationships among variables may be at play often underlie the questions music education researchers pose. This chapter describes basic design and analysis considerations in research that involves the systematic investigation of whether and how variables are related; in other words, correlational research. The chapter poses correlational research as an extension of the book’s previous discussion of descriptive research. The chapter briefly describes the role of correlational studies in advancing theory, presents several issues to consider when designing studies, and provides an introduction to correlation as a statistical concept.

  • Clin Kidney J
  • v.14(11); 2021 Nov

Conducting correlation analysis: important limitations and pitfalls

Roemer J Janse

Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands

Tiny Hoekstra

Department of Nephrology, Amsterdam Cardiovascular Sciences, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands

Kitty J Jager

ERA-EDTA Registry, Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands

Carmine Zoccali

CNR-IFC, Center of Clinical Physiology, Clinical Epidemiology of Renal Diseases and Hypertension, Reggio Calabria, Italy

Giovanni Tripepi

Friedo W Dekker, Merel van Diepen

The correlation coefficient is a statistical measure often used in studies to show an association between variables or to look at the agreement between two methods. In this paper, we will discuss not only the basics of the correlation coefficient, such as its assumptions and how it is interpreted, but also important limitations when using the correlation coefficient, such as its assumption of a linear association and its sensitivity to the range of observations. We will also discuss why the coefficient is invalid when used to assess agreement of two methods aiming to measure a certain value, and discuss better alternatives, such as the intraclass coefficient and Bland–Altman’s limits of agreement. The concepts discussed in this paper are supported with examples from literature in the field of nephrology.

‘Correlation is not causation’: a saying often uttered when a person infers causality from two variables occurring together, without them truly affecting each other. Yet, though causation may not always be understood correctly, correlation too is a concept in which mistakes are easily made. Nonetheless, the correlation coefficient is often reported within the medical literature. It estimates the association between two variables (e.g. blood pressure and kidney function), or is used for the estimation of agreement between two methods of measurement that aim to measure the same variable [e.g. the Modification of Diet in Renal Disease (MDRD) formula and the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) formula for estimating the glomerular filtration rate (eGFR)]. Despite the wide use of the correlation coefficient, limitations and pitfalls for both situations exist, of which one should be aware when drawing conclusions from correlation coefficients. In this paper, we aim to describe the correlation coefficient and its limitations, together with methods that can be applied to avoid these limitations.

The basics: the correlation coefficient

Fundamentals

The correlation coefficient was described over a hundred years ago by Karl Pearson [ 1 ], taking inspiration from a similar idea of correlation from Sir Francis Galton, who developed linear regression and was the not-so-well-known half-cousin of Charles Darwin [ 2 ]. In short, the correlation coefficient, denoted with the Greek character rho ( ρ ) for the true (theoretical) population and r for a sample of the true population, aims to estimate the strength of the linear association between two variables. If we have variables X and Y that are plotted against each other in a scatter plot, the correlation coefficient indicates how well a straight line fits these data. The coefficient ranges from −1 to 1 and is dimensionless (i.e., it has no unit). Two correlations with r = −1 and r  = 1 are shown in Figure 1A and B , respectively. The values of −1 and 1 indicate that all observations can be described perfectly using a straight line, which in turn means that if X is known, Y can be determined deterministically and vice versa. Here, the minus sign indicates an inverse association: if X increases, Y decreases. Nonetheless, real-world data are often not perfectly summarized using a straight line. In a scatterplot as shown in Figure 1C , the correlation coefficient represents how well a linear association fits the data.
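As a minimal sketch of the definition above, the sample coefficient r can be computed from the deviations of each variable from its mean. The function name `pearson_r` and the small data sets are illustrative assumptions, not taken from the paper.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of the deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_pos = pearson_r(x, [2 * v + 1 for v in x])   # points exactly on a rising line
r_neg = pearson_r(x, [10 - 3 * v for v in x])  # points exactly on a falling line
r_mid = pearson_r(x, [2, 1, 4, 3, 6])          # points scattered around a rising line
```

The first two data sets lie exactly on a straight line, so they give r = 1 and r = −1; the third is only roughly linear and gives a value between 0 and 1.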

Figure 1. Different shapes of data and their correlation coefficients. (A) Linear association with r = −1. (B) A linear association with r = 1. (C) A scatterplot through which a straight line could plausibly be drawn, with r = 0.50. (D) A sinusoidal association with r = 0. (E) A quadratic association with r = 0. (F) An exponential association with r = 0.50.

It is also possible to test the hypothesis of whether X and Y are correlated, which yields a P-value indicating the chance of finding the correlation coefficient’s observed value or any value indicating a higher degree of correlation, given that the two variables are not actually correlated. Though the correlation coefficient will not vary depending on sample size, the P-value yielded with the t -test will.
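To illustrate why the P-value depends on sample size while r does not, the sketch below uses the standard t statistic for testing a zero correlation, t = r·sqrt((n − 2)/(1 − r²)) with n − 2 degrees of freedom. The paper only names the t-test; the formula and the function name `t_statistic` are supplied here as assumptions.

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r = 0.5
t_small = t_statistic(r, n=10)    # about 1.63
t_large = t_statistic(r, n=100)   # about 5.72
```

The same r = 0.5 gives a much larger t statistic, and therefore a much smaller P-value, in the larger sample.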

The value of the correlation coefficient is also not influenced by the units of measurement, but it is influenced by measurement error. If more error (also known as noise) is present in the variables X and Y , variability in X will be partially due to the error in X , and thus not solely explainable by Y . Moreover, the correlation coefficient is also sensitive to the range of observations, which we will discuss later in this paper.
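The effect of measurement error can be sketched with a small simulation. All values below (sample size, seed, error SD) are arbitrary assumptions chosen for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(1)
n = 2000
true_x = [random.gauss(0, 1) for _ in range(n)]
true_y = [2 * v for v in true_x]   # error-free Y is an exact linear function of X

# Add independent measurement error (noise) to both variables
obs_x = [v + random.gauss(0, 1) for v in true_x]
obs_y = [v + random.gauss(0, 1) for v in true_y]

r_true = pearson_r(true_x, true_y)  # 1 by construction, up to rounding
r_obs = pearson_r(obs_x, obs_y)     # noticeably lower because of the added noise
```

With these settings the expected correlation of the noisy measurements is roughly 0.63, even though the underlying association is perfect.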

An assumption of the Pearson correlation coefficient is that the joint distribution of the variables is normal. However, it has been shown that the correlation coefficient is quite robust with regard to this assumption, meaning that Pearson’s correlation coefficient may still be validly estimated in skewed distributions [ 3 ]. If desired, a non-parametric method is also available to estimate correlation; namely, the Spearman’s rank correlation coefficient. Instead of the actual values of observations, the Spearman’s correlation coefficient uses the rank of the observations when ordering observations from small to large, hence the ‘rank’ in its name [ 4 ]. This usage of the rank makes it robust against outliers [ 4 ].
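The rank-based idea can be sketched as follows: replace each observation by its rank and compute the Pearson coefficient on the ranks. The helper names and the data (with a deliberate outlier) are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """Rank from small to large (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman_r(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 7, 9, 100]   # the last value is an outlier

r_pearson = pearson_r(x, y)    # pulled down by the outlier (about 0.70)
r_spearman = spearman_r(x, y)  # 1, because the ordering is perfectly monotonic
```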

Explained variance and interpretation

One may also translate the correlation coefficient into a measure of the explained variance (also known as R²), by taking its square. The result can be interpreted as the proportion of statistical variability (i.e. variance) in one variable that can be explained by the other variable. In other words, to what degree can variable X be explained by Y and vice versa. For instance, as mentioned above, a correlation of −1 or +1 would both allow us to determine X from Y and vice versa without error, which is also shown in the coefficient of determination, which would be (−1)² = 1 or 1² = 1, indicating that 100% of variability in one variable can be explained by the other variable.
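A short check of this relationship, assuming a simple least-squares line and hypothetical data: the squared correlation coefficient equals one minus the ratio of the residual to the total sum of squares.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def explained_variance(x, y):
    """Proportion of the variance in y explained by a least-squares line on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]
r = pearson_r(x, y)
r_squared = explained_variance(x, y)   # equals r ** 2, about 0.68 here
```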

In some cases, the interpretation of the strength of correlation coefficient is based on rules of thumb, as is often the case with P-values (P-value <0.05 is statistically significant, P-value >0.05 is not statistically significant). However, such rules of thumb should not be used for correlations. Instead, the interpretation should always depend on context and purposes [ 5 ]. For instance, when studying the association of renin–angiotensin–system inhibitors (RASi) with blood pressure, patients with increased blood pressure may receive the perfect dosage of RASi until their blood pressure is exactly normal. Those with an already exactly normal blood pressure will not receive RASi. However, as the perfect dosage of RASi makes the blood pressure of the RASi users exactly normal, and thus equal to the blood pressure of the RASi non-users, no variation is left between users and non-users. Because of this, the correlation will be 0.

The linearity of correlation

An important limitation of the correlation coefficient is that it assumes a linear association. This also means that any linear transformation or change of scale of either variable X or Y, or both, will not affect the magnitude of the correlation coefficient (only a transformation with a negative slope flips its sign). However, variables X and Y may also have a non-linear association, which could yield a low correlation coefficient, as seen in Figure 1D and E, even though variables X and Y are clearly related. Nonetheless, the correlation coefficient will not always return 0 in case of a non-linear association, as portrayed in Figure 1F with an exponential association with r = 0.5. In short, a correlation coefficient is not a measure of the best-fitted line through the observations, but only the degree to which the observations lie on one straight line.
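Both points can be sketched with hypothetical data: a perfectly quadratic association that gives r = 0, and a rescaling (with positive scale factors) that leaves r unchanged.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -2, -1, 0, 1, 2, 3]

# Y is fully determined by X, yet the association is not linear
r_quadratic = pearson_r(x, [v ** 2 for v in x])   # 0

# Linear transformations with positive slope do not change r
y = [1, 3, 2, 5, 4, 7, 6]
r_original = pearson_r(x, y)
r_rescaled = pearson_r([10 * v + 5 for v in x], [0.5 * b - 2 for b in y])
```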

In general, before calculating a correlation coefficient, it is advised to inspect a scatterplot of the observations in order to assess whether the data could possibly be described with a linear association and whether calculating a correlation coefficient makes sense. For instance, the scatterplot in Figure 1C could plausibly fit a straight line, and a correlation coefficient would therefore be suitable to describe the association in the data.

The range of observations for correlation

An important pitfall of the correlation coefficient is that it is influenced by the range of observations. In Figure 2A , we illustrate hypothetical data with 50 observations, with r  = 0.87. Included in the figure is an ellipse that shows the variance of the full observed data, and an ellipse that shows the variance of only the 25 lowest observations. If we subsequently analyse these 25 observations independently as shown in Figure 2B , we will see that the ellipse has shortened. If we determine the correlation coefficient for Figure 2B , we will also find a substantially lower correlation: r  = 0.57.

Figure 2. The effect of the range of observations on the correlation coefficient, as shown with ellipses. (A) Set of 50 observations from hypothetical dataset X with r = 0.87, with an illustrative ellipse showing length and width of the whole dataset, and an ellipse showing only the first 25 observations. (B) Set of only the 25 lowest observations from hypothetical dataset X with r = 0.57, with an illustrative ellipse showing length and width.

The importance of the range of observations can further be illustrated using an example from a paper by Pierrat et al. [ 6 ] in which the correlation between the eGFR calculated using inulin clearance and eGFR calculated using the Cockcroft–Gault formula was studied both in adults and children. Children had a higher correlation coefficient than adults ( r  = 0.81 versus r  = 0.67), after which the authors mentioned: ‘The coefficients of correlation were even better […] in children than in adults.’ However, the range of observations in children was larger than the range of observations in adults, which in itself could explain the higher correlation coefficient observed in children. One can thus not simply conclude that the Cockcroft–Gault formula for eGFR correlates better with inulin in children than in adults. Because the range of observations influences the correlation coefficient, it is important to realize that correlation coefficients cannot be readily compared between groups or studies. Another consequence of this is that researchers could inflate the correlation coefficient by including additional low and high eGFR values.
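The range effect can be reproduced with a deterministic toy data set. The data below are an assumption made for illustration (they are not the data behind Figure 2): X runs from 0 to 49 and Y equals X plus an alternating error of +10 or −10, so the error is equally large everywhere.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: the error around the line is the same size across the whole range
x = list(range(50))
y = [v + (10 if v % 2 == 0 else -10) for v in x]

r_full = pearson_r(x, y)            # all 50 observations: about 0.82
r_low = pearson_r(x[:25], y[:25])   # only the 25 lowest observations: about 0.59
```

The association between X and Y is identical in both analyses; only the range of X differs, and that alone lowers the coefficient.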

The non-causality of correlation

Another important pitfall of the correlation coefficient is that it cannot be interpreted as causal. It is of course possible that there is a causal effect of one variable on the other, but there may also be other possible explanations that the correlation coefficient does not take into account. Take for example the phenomenon of confounding. We can study the association of prescribing angiotensin-converting enzyme (ACE)-inhibitors with a decline in kidney function. These two variables would be highly correlated, which may be due to the underlying factor albuminuria. A patient with albuminuria is more likely to receive ACE-inhibitors, but is also more likely to have a decline in kidney function. So ACE-inhibitors and a decline in kidney function are correlated not because of ACE-inhibitors causing a decline in kidney function, but because they have a shared underlying cause (also known as common cause) [ 7 ]. More reasons why associations may be biased exist, which are explained elsewhere [ 8 , 9 ].

It is however possible to adjust for such confounding effects, for example by using multivariable regression. Whereas a univariable (or ‘crude’) linear regression analysis is no different than calculating the correlation coefficient, a multivariable regression analysis allows one to adjust for possible confounder variables. Other factors need to be taken into account to estimate causal effects, but these are beyond the scope of this paper.
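A simulated sketch of a common cause: Z drives both X and Y, which have no effect on each other. As a simple stand-in for the multivariable regression mentioned above, the example adjusts for Z with a first-order partial correlation; this substitution, the variable names and all numeric settings are assumptions made for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(7)
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]    # common cause (confounder)
x = [v + random.gauss(0, 1) for v in z]       # depends on Z only
y = [v + random.gauss(0, 1) for v in z]       # depends on Z only

r_xy = pearson_r(x, y)   # clearly positive (about 0.5) despite no causal link
r_xz = pearson_r(x, z)
r_yz = pearson_r(y, z)

# Partial correlation of X and Y with the linear effect of Z removed
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

The crude correlation is substantial, whereas the correlation adjusted for Z is close to zero.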

Agreement between methods

We have discussed the correlation coefficient and its limitations when studying the association between two variables. However, the correlation coefficient is also often incorrectly used to study the agreement between two methods that aim to estimate the same variable. Here too, the correlation coefficient is an invalid measure.

The correlation coefficient aims to represent to what degree a straight line fits the data. This is not the same as agreement between methods (i.e. whether X  =  Y ). If methods completely agree, all observations would fall on the line of equality (i.e. the line on which the observations would be situated if X and Y had equal values). Yet the correlation coefficient looks at the best-fitted straight line through the data, which is not per se the line of equality. As a result, any method that would consistently measure a twice as large value as the other method would still correlate perfectly with the other method. This is shown in Figure 3 , where the dashed line shows the line of equality, and the other lines portray different linear associations, all with perfect correlation, but no agreement between X and Y . These linear associations may portray a systematic difference, better known as bias, in one of the methods.
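The example of a method that consistently measures twice the value of the other can be sketched as follows; the measurement values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

method_a = [30, 45, 60, 75, 90]
method_b = [2 * v for v in method_a]   # always twice the value of method A

r = pearson_r(method_a, method_b)      # perfect correlation
mean_difference = sum(b - a for a, b in zip(method_a, method_b)) / len(method_a)
```

The correlation is perfect, yet the two methods differ by 60 units on average, so they do not agree.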

Figure 3. A set of linear associations, with the dashed line (- - -) showing the line of equality where X = Y. The equations and correlations for the other lines are shown as well, which shows that only a linear association is needed for r = 1, and not specifically agreement.

This limitation applies to all comparisons of methods, where it is studied whether methods can be used interchangeably, and it also applies to situations where two individuals measure a value and where the results are then compared (inter-observer variation or agreement; here the individuals can be seen as the ‘methods’), and to situations where it is studied whether one method measures consistently at two different time points (also known as repeatability). Fortunately, other methods exist to compare methods [ 10 , 11 ], of which one was proposed by Bland and Altman themselves [ 12 ].

Intraclass coefficient

One valid method to assess interchangeability is the intraclass coefficient (ICC), which is a generalization of Cohen’s κ , a measure for the assessment of intra- and interobserver agreement. The ICC shows the proportion of the variability in the new method that is due to the normal variability between individuals. The measure takes into account both the correlation and the systematic difference (i.e. bias), which makes it a measure of both the consistency and agreement of two methods. Nonetheless, like the correlation coefficient, it is influenced by the range of observations. However, an important advantage of the ICC is that it allows comparison between multiple variables or observers. Similar to the ICC is the concordance correlation coefficient (CCC), though it has been stated that the CCC yields values similar to the ICC [ 13 ]. Nonetheless, the CCC may also be found in the literature [ 14 ].
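As a sketch of how such a measure penalizes bias, the example below computes the concordance correlation coefficient using Lin's formula, 2·cov(X, Y) / [var(X) + var(Y) + (mean(X) − mean(Y))²]. The formula is not given in the paper and is supplied here as an assumption, as are the function name and the data.

```python
def concordance_cc(x, y):
    """Lin's concordance correlation coefficient (population variances)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # The squared difference in means lowers the coefficient when one method is biased
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)

method_a = [90, 100, 110, 120, 130]
method_b = [v + 10 for v in method_a]   # systematic difference of 10 units

ccc = concordance_cc(method_a, method_b)   # 0.8
```

Pearson's r for these data would be 1, because the points lie on a straight line; the CCC drops to 0.8 because that line is not the line of equality.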

The 95% limits of agreement and the Bland–Altman plot

When they published their critique on the use of the correlation coefficient for the measurement of agreement, Bland and Altman also published an alternative method to measure agreement, which they called the limits of agreement (also referred to as a Bland–Altman plot) [ 12 ]. To illustrate the method of the limits of agreement, an artificial dataset was created using the MASS package (version 7.3-53) for R version 4.0.4 (R Core Team, Vienna, Austria). Two sets of observations (two observations per person) were derived from a normal distribution with a mean ( µ ) of 120 and a randomly chosen standard deviation ( σ ) between 5 and 15. The mean of 120 was chosen with the aim to have the values resemble measurements of high eGFR, where the first set of observed eGFRs was hypothetically acquired using the MDRD formula, and the second set of observed eGFRs was hypothetically acquired using the CKD-EPI formula. The observations can be found in Table 1 .

Table 1. Artificial data portraying hypothetically observed MDRD measurements and CKD-EPI measurements

The 95% limits of agreement can be easily calculated using the mean of the differences ($\bar{d}$) and the standard deviation (SD) of the differences. The upper limit (UL) of the limits of agreement would then be $\mathrm{UL} = \bar{d} + 1.96 \times \mathrm{SD}$ and the lower limit (LL) would be $\mathrm{LL} = \bar{d} - 1.96 \times \mathrm{SD}$. If we apply this to the data from Table 1 , we would find $\bar{d} = 0.32$ and $\mathrm{SD} = 4.09$. Subsequently, $\mathrm{UL} = 0.32 + 1.96 \times 4.09 = 8.34$ and $\mathrm{LL} = 0.32 - 1.96 \times 4.09 = -7.70$. Our limits of agreement are thus −7.70 to 8.34. We can now decide whether these limits of agreement are too broad. Imagine we decide that, if we want to replace the MDRD formula with the CKD-EPI formula, the difference may not be larger than 7 mL/min/1.73 m². Both limits lie further than 7 mL/min/1.73 m² from zero. Thus, on the basis of these (hypothetical) data, the MDRD and CKD-EPI formulas cannot be used interchangeably in our case. It should also be noted that, as the limits of agreement are statistical parameters, they are also subject to uncertainty. The uncertainty can be determined by calculating 95% confidence intervals for the limits of agreement, on which Bland and Altman elaborate in their paper [ 12 ].
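The calculation above can be sketched in a few lines. The first call reuses the mean difference and SD reported in the text; the paired measurements further down are hypothetical (they are not the data of Table 1), and taking the difference as MDRD minus CKD-EPI is an assumption.

```python
import statistics

def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Return (lower limit, upper limit) of the 95% limits of agreement."""
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff

# Using the mean difference and SD reported in the text
lower, upper = limits_of_agreement(0.32, 4.09)

# Compare with the maximum difference considered acceptable (7 mL/min/1.73 m2)
max_allowed = 7
interchangeable = lower >= -max_allowed and upper <= max_allowed

# Starting from raw paired measurements instead (hypothetical values)
mdrd = [118, 125, 110, 131, 122, 115]
ckd_epi = [120, 121, 113, 128, 126, 112]
differences = [a - b for a, b in zip(mdrd, ckd_epi)]
d_bar = statistics.mean(differences)
sd = statistics.stdev(differences)   # sample SD (n - 1 in the denominator)
lower_raw, upper_raw = limits_of_agreement(d_bar, sd)
```

With the values from the text, the limits are −7.70 and 8.34 after rounding, and `interchangeable` is `False`.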

The limits of agreement are also subject to two assumptions: (i) the mean and SD of the differences should be constant over the range of observations and (ii) the differences are approximately normally distributed. To check these assumptions, two plots were proposed: the Bland–Altman plot, which is the differences plotted against the means of their measurements, and a histogram of the differences. If in the Bland–Altman plot the means and SDs of the differences appear to be equal along the x -axis, the first assumption is met. The histogram of the differences should follow the pattern of a normal distribution. We checked these assumptions by creating a Bland–Altman plot in Figure 4A and a histogram of the differences in Figure 4B . As often done, we also added the limits of agreement to the Bland–Altman plot, between which approximately 95% of datapoints are expected to be. In Figure 4A , we see that the mean of the differences appears to be equal along the x -axis; i.e., these datapoints could plausibly fit the horizontal line of the total mean across the whole x -axis. Nonetheless, the SD does not appear to be distributed equally: the means of the differences at the lower values of the x -axis are closer to the total mean (thus a lower SD) than the means of the differences at the middle values of the x -axis (thus a higher SD). Therefore, the first assumption is not met. Nonetheless, the second assumption is met, because our differences follow a normal distribution, as shown in Figure 4B . Our failure to meet the first assumption can be due to a number of reasons, for which Bland and Altman also proposed solutions [ 15 ]. For example, data may be skewed. However, in that case, log-transforming variables may be a solution [ 16 ].
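The coordinates of a Bland–Altman plot are simple to derive. The sketch below also compares the SD of the differences in the lower and upper half of the means, as a crude numerical stand-in for the visual check of the first assumption; this comparison is an illustrative assumption, not a procedure from the paper, and the data are hypothetical.

```python
import statistics

# Hypothetical paired measurements (not the data of Table 1)
method_a = [100, 104, 108, 112, 116, 120, 124, 128, 132, 136]
method_b = [99, 105, 107, 113, 115, 126, 118, 134, 126, 142]

# Bland-Altman coordinates: x = mean of each pair, y = difference of each pair,
# sorted by the mean so the points can be split along the x-axis
points = sorted(((a + b) / 2, a - b) for a, b in zip(method_a, method_b))

half = len(points) // 2
sd_low = statistics.stdev([d for _, d in points[:half]])    # lower means
sd_high = statistics.stdev([d for _, d in points[half:]])   # higher means

# A clearly larger SD in one half suggests the SD is not constant over the range
sd_not_constant = sd_high > 2 * sd_low
```

In these data the differences spread out as the means increase, so the first assumption would be doubtful and a plot of `points` would show a widening band.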

Figure 4. Plots to check assumptions for the limits of agreement. (A) The Bland–Altman plot for the assumption that the mean and SD of the differences are constant over the range of observations. In our case, we see that the mean of the differences appears to be equal along the x-axis; i.e., these datapoints could plausibly fit the horizontal line of the total mean across the whole x-axis. Nonetheless, the SD does not appear to be distributed equally: the means of the differences at the lower values of the x-axis are closer to the total mean (thus a lower SD) than the means of the differences at the middle values of the x-axis (thus a higher SD). Therefore, the first assumption is not met. The limits of agreement and the mean are added as dashed (- - -) lines. (B) A histogram of the distribution of differences to ascertain the assumption of whether the differences are normally distributed. In our case, the observations follow a normal distribution and thus, the assumption is met.

It is often mistakenly thought that the Bland–Altman plot alone is the analysis to determine the agreement between methods, but the authors themselves spoke strongly against this [ 15 ]. We suggest that authors should both report the limits of agreement and show the Bland–Altman plot, to allow readers to assess for themselves whether they think the agreement is met.
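As a rough illustration of what reporting the limits of agreement involves, the sketch below computes the mean difference and the limits (mean difference ± 1.96 SD of the differences, the conventional range expected to contain about 95% of differences) and the coordinates used in a Bland–Altman plot. The function names and the paired measurements are hypothetical and are not taken from the article's data.

```python
import statistics

def limits_of_agreement(method_a, method_b):
    """Return (mean difference, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff

def bland_altman_points(method_a, method_b):
    """Return (mean of pair, difference of pair): x and y of the Bland-Altman plot."""
    return [((a + b) / 2, a - b) for a, b in zip(method_a, method_b)]

# Hypothetical paired measurements from two methods (illustrative values only)
method_a = [10.1, 12.3, 9.8, 14.2, 11.0, 13.5]
method_b = [9.7, 12.9, 10.1, 13.6, 11.4, 13.0]

mean_diff, lower, upper = limits_of_agreement(method_a, method_b)
points = bland_altman_points(method_a, method_b)
print(f"mean difference = {mean_diff:.2f}")
print(f"limits of agreement = [{lower:.2f}, {upper:.2f}]")
```

Plotting `points` with horizontal lines at `mean_diff`, `lower` and `upper` would give the kind of plot described above. Whether the spread of the points is constant along the x-axis still has to be judged by eye.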

The correlation coefficient is easy to calculate and provides a measure of the strength of linear association in the data. However, it also has important limitations and pitfalls, both when studying the association between two variables and when studying agreement between methods. These limitations and pitfalls should be taken into account when using and interpreting it. If necessary, researchers should look into alternatives to the correlation coefficient, such as regression analysis for causal research, and the ICC and the limits of agreement combined with a Bland–Altman plot when comparing methods.

CONFLICT OF INTEREST STATEMENT

None declared.

Contributor Information

Roemer J Janse, Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands.

Tiny Hoekstra, Department of Nephrology, Amsterdam Cardiovascular Sciences, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.

Kitty J Jager, ERA-EDTA Registry, Department of Medical Informatics, Amsterdam Public Health Research Institute, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands.

Carmine Zoccali, CNR-IFC, Center of Clinical Physiology, Clinical Epidemiology of Renal Diseases and Hypertension, Reggio Calabria, Italy.

Giovanni Tripepi, CNR-IFC, Center of Clinical Physiology, Clinical Epidemiology of Renal Diseases and Hypertension, Reggio Calabria, Italy.

Friedo W Dekker, Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands.

Merel van Diepen, Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, The Netherlands.

6.2 Correlational Research

Learning objectives.

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of non-experimental research.
  • Interpret the strength and direction of different correlation coefficients.
  • Explain why correlation does not imply causation.

What Is Correlational Research?

Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one or are not interested in causal relationships. Recall that two goals of science are to describe and to predict, and the correlational research strategy allows researchers to achieve both of these goals. Specifically, this strategy can be used to describe the strength and direction of the relationship between two variables, and if there is a relationship between the variables, then the researchers can use scores on one variable to predict scores on the other (using a statistical technique called regression).

Another reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher cannot manipulate the independent variable because it is impossible, impractical, or unethical. For example, while I might be interested in the relationship between how frequently people use cannabis and their memory abilities, I cannot ethically manipulate how frequently people use cannabis. As such, I must rely on the correlational research strategy: I must simply measure how frequently people use cannabis, measure their memory abilities using a standardized test of memory, and then determine whether frequency of cannabis use is statistically related to memory test performance.

Correlation is also used to establish the reliability and validity of measurements. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.

Another strength of correlational research is that it is often higher in external validity than experimental research. Recall that there is typically a trade-off between internal validity and external validity. As greater controls are added to experiments, internal validity is increased but often at the expense of external validity. In contrast, correlational studies typically have low internal validity because nothing is manipulated or controlled, but they often have high external validity. Since nothing is manipulated or controlled by the experimenter, the results are more likely to reflect relationships that exist in the real world.

Finally, extending upon this trade-off between internal and external validity, correlational research can help to provide converging evidence for a theory. If a theory is supported by a true experiment that is high in internal validity as well as by a correlational study that is high in external validity then the researchers can have more confidence in the validity of their theory. As a concrete example, correlational studies establishing that there is a relationship between watching violent television and aggressive behavior have been complemented by experimental studies confirming that the relationship is a causal one (Bushman & Huesmann, 2001) [1] .  These converging results provide strong evidence that there is a real relationship (indeed a causal relationship) between watching violent television and aggressive behavior.

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. 

Correlations Between Quantitative Variables

Correlations between quantitative variables are often presented using scatterplots . Figure 6.3 shows some hypothetical data on the relationship between the amount of stress people are under and the number of physical symptoms they have. Each point in the scatterplot represents one person’s score on both variables. For example, the circled point in Figure 6.3 represents a person whose stress score was 10 and who had three physical symptoms. Taking all the points into account, one can see that people under more stress tend to have more physical symptoms. This is a good example of a positive relationship , in which higher scores on one variable tend to be associated with higher scores on the other. A  negative relationship  is one in which higher scores on one variable tend to be associated with lower scores on the other. There is a negative relationship between stress and immune system functioning, for example, because higher stress is associated with lower immune system functioning.

Figure 6.3 Scatterplot Showing a Hypothetical Positive Relationship Between Stress and Number of Physical Symptoms. The circled point represents a person whose stress score was 10 and who had three physical symptoms. Pearson’s r for these data is +.51.

The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson’s Correlation Coefficient (or Pearson’s r). As Figure 6.4 shows, Pearson’s r ranges from −1.00 (the strongest possible negative relationship) to +1.00 (the strongest possible positive relationship). A value of 0 means there is no linear relationship between the two variables. When Pearson’s r is 0, the points on a scatterplot form a shapeless “cloud.” As its value moves toward −1.00 or +1.00, the points come closer and closer to falling on a single straight line. Correlation coefficients near ±.10 are considered small, values near ±.30 are considered medium, and values near ±.50 are considered large. Notice that the sign of Pearson’s r is unrelated to its strength. Pearson’s r values of +.30 and −.30, for example, are equally strong; it is just that one represents a moderate positive relationship and the other a moderate negative relationship. With the exception of reliability coefficients, most correlations that we find in psychology are small or moderate in size. The website http://rpsychologist.com/d3/correlation/ , created by Kristoffer Magnusson, provides an excellent interactive visualization of correlations that permits you to adjust the strength and direction of a correlation while witnessing the corresponding changes to the scatterplot.

Figure 6.4 Range of Pearson’s r, From −1.00 (Strongest Possible Negative Relationship), Through 0 (No Relationship), to +1.00 (Strongest Possible Positive Relationship)
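To make the definition concrete, here is a minimal sketch of how Pearson’s r can be computed from paired scores and labeled with the rough size guidelines given above. The helper names, the use of .10, .30 and .50 as cutoffs, the "negligible" label, and the stress and symptom scores are all assumptions made for illustration; they are not the data behind Figure 6.3.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariation of x and y divided by the product of their spreads."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def describe_strength(r):
    """Label |r| using .10 / .30 / .50 as approximate cutoffs (an assumption)."""
    size = abs(r)  # the sign gives direction only, not strength
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"

# Hypothetical stress scores and numbers of physical symptoms
stress = [2, 4, 6, 8, 10, 12]
symptoms = [1, 3, 2, 5, 3, 6]

r = pearson_r(stress, symptoms)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:+.2f} ({direction}, {describe_strength(r)})")
```

Because `describe_strength` works on the absolute value, +.30 and −.30 receive the same label, which mirrors the point that sign and strength are separate.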

There are two common situations in which the value of Pearson’s  r  can be misleading. Pearson’s  r  is a good measure only for linear relationships, in which the points are best approximated by a straight line. It is not a good measure for nonlinear relationships, in which the points are better approximated by a curved line. Figure 6.5, for example, shows a hypothetical relationship between the amount of sleep people get per night and their level of depression. In this example, the line that best approximates the points is a curve—a kind of upside-down “U”—because people who get about eight hours of sleep tend to be the least depressed. Those who get too little sleep and those who get too much sleep tend to be more depressed. Even though Figure 6.5 shows a fairly strong relationship between depression and sleep, Pearson’s  r  would be close to zero because the points in the scatterplot are not well fit by a single straight line. This means that it is important to make a scatterplot and confirm that a relationship is approximately linear before using Pearson’s  r . Nonlinear relationships are fairly common in psychology, but measuring their strength is beyond the scope of this book.

Figure 6.5 Hypothetical Nonlinear Relationship Between Sleep and Depression
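The following sketch shows how a perfectly systematic but curved relationship can still produce a Pearson’s r of essentially zero. The sleep and depression values are invented for illustration: depression is assumed to be lowest at eight hours of sleep and to rise symmetrically on either side.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: depression is lowest at 8 hours of sleep and rises
# symmetrically for people who sleep less or more than that
hours_of_sleep = [4, 5, 6, 7, 8, 9, 10, 11, 12]
depression = [(h - 8) ** 2 + 1 for h in hours_of_sleep]

r = pearson_r(hours_of_sleep, depression)
print(f"r = {r:.2f}")  # essentially zero despite the clear curved pattern
```

Here depression is completely determined by sleep, yet the rising and falling halves of the curve cancel each other out in the calculation. This is why a scatterplot should be inspected before r is interpreted.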

The other common situation in which the value of Pearson’s r can be misleading is when one or both of the variables have a limited range in the sample relative to the population. This problem is referred to as restriction of range. Assume, for example, that there is a strong negative correlation between people’s age and their enjoyment of hip hop music, as shown by the scatterplot in Figure 6.6. Pearson’s r here is −.77. However, if we were to collect data only from 18- to 24-year-olds (represented by the shaded area of Figure 6.6), then the relationship would seem to be quite weak. In fact, Pearson’s r for this restricted range of ages is 0. It is a good idea, therefore, to design studies to avoid restriction of range. For example, if age is one of your primary variables, then you can plan to collect data from people of a wide range of ages. Because restriction of range is not always anticipated or easily avoidable, however, it is good practice to examine your data for possible restriction of range and to interpret Pearson’s r in light of it. (There are also statistical methods to correct Pearson’s r for restriction of range, but they are beyond the scope of this book.)

Figure 6.6 Hypothetical Data Showing How a Strong Overall Correlation Can Appear to Be Weak When One Variable Has a Restricted Range. The overall correlation here is −.77, but the correlation for the 18- to 24-year-olds (in the blue box) is 0.
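Restriction of range can also be seen in a small simulation. The sketch below generates a hypothetical population in which enjoyment declines with age, then recomputes Pearson’s r using only the 18- to 24-year-olds. The slope, noise level, sample size and random seed are arbitrary assumptions, so the numbers will not match Figure 6.6 exactly, and the restricted correlation will be weak rather than exactly 0.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r computed from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

random.seed(0)  # fixed seed so the simulation is repeatable

# Hypothetical population: enjoyment declines with age, plus random noise
ages = [random.uniform(18, 80) for _ in range(5000)]
enjoyment = [100 - age + random.gauss(0, 15) for age in ages]

r_full = pearson_r(ages, enjoyment)

# Keep only the 18- to 24-year-olds (the restricted range)
young = [(a, e) for a, e in zip(ages, enjoyment) if a <= 24]
r_young = pearson_r([a for a, _ in young], [e for _, e in young])

print(f"all ages:        r = {r_full:.2f}")
print(f"ages 18-24 only: r = {r_young:.2f}")
```

The underlying relationship is identical in both calculations. Only the spread of ages differs, and with so little variation in age the noise dominates.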

Correlation Does Not Imply Causation

You have probably heard repeatedly that “Correlation does not imply causation.” An amusing example of this comes from a 2012 study that showed a positive correlation (Pearson’s r = 0.79) between the per capita chocolate consumption of a nation and the number of Nobel prizes awarded to citizens of that nation [2] . It seems clear, however, that this does not mean that eating chocolate causes people to win Nobel prizes, and it would not make sense to try to increase the number of Nobel prizes won by recommending that parents feed their children more chocolate.

There are two reasons that correlation does not imply causation. The first is called the directionality problem. Two variables, X and Y, can be statistically related because X causes Y or because Y causes X. Consider, for example, a study showing that whether or not people exercise is statistically related to how happy they are, such that people who exercise are happier on average than people who do not. This statistical relationship is consistent with the idea that exercising causes happiness, but it is also consistent with the idea that happiness causes exercise. Perhaps being happy gives people more energy or leads them to seek opportunities to socialize with others by going to the gym. The second reason that correlation does not imply causation is called the third-variable problem. Two variables, X and Y, can be statistically related not because X causes Y, or because Y causes X, but because some third variable, Z, causes both X and Y. For example, the fact that nations that have won more Nobel prizes tend to have higher chocolate consumption probably reflects geography, in that European countries tend to have higher rates of per capita chocolate consumption and invest more in education and technology (once again, per capita) than many other countries in the world. Similarly, the statistical relationship between exercise and happiness could mean that some third variable, such as physical health, causes both of the others. Being physically healthy could cause people to exercise and cause them to be happier. Correlations that are a result of a third variable are often referred to as spurious correlations.
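The third-variable problem can be illustrated with a short simulation. In the sketch below, physical health (Z) influences both exercise (X) and happiness (Y), while exercise and happiness have no direct effect on each other. All numbers, including the 0.25 cutoff used to select people of similar health, are assumptions chosen purely for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r computed from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

random.seed(1)  # fixed seed so the simulation is repeatable
n = 20000

health = [random.gauss(0, 1) for _ in range(n)]        # Z
exercise = [z + random.gauss(0, 1) for z in health]    # X depends on Z only
happiness = [z + random.gauss(0, 1) for z in health]   # Y depends on Z only

r_all = pearson_r(exercise, happiness)

# Look only at people whose health is nearly the same
similar = [(x, y) for z, x, y in zip(health, exercise, happiness) if abs(z) < 0.25]
r_similar = pearson_r([x for x, _ in similar], [y for _, y in similar])

print(f"everyone:            r = {r_all:.2f}")
print(f"similar health only: r = {r_similar:.2f}")
```

Across everyone, exercise and happiness are clearly correlated even though neither causes the other in this simulation. Among people with nearly the same health, the correlation largely disappears, which is what one would expect if health were producing it.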

Some excellent and funny examples of spurious correlations can be found at http://www.tylervigen.com  (Figure 6.7  provides one such example).

Figure 6.7 Example of a Spurious Correlation. Source: http://tylervigen.com/spurious-correlations (CC-BY 4.0)

“Lots of Candy Could Lead to Violence”

Although researchers in psychology know that correlation does not imply causation, many journalists do not. One website about correlation and causation, http://jonathan.mueller.faculty.noctrl.edu/100/correlation_or_causation.htm , links to dozens of media reports about real biomedical and psychological research. Many of the headlines suggest that a causal relationship has been demonstrated when a careful reading of the articles shows that it has not because of the directionality and third-variable problems.

One such article is about a study showing that children who ate candy every day were more likely than other children to be arrested for a violent offense later in life. But could candy really “lead to” violence, as the headline suggests? What alternative explanations can you think of for this statistical relationship? How could the headline be rewritten so that it is not misleading?

As you have learned by reading this book, there are various ways that researchers address the directionality and third-variable problems. The most effective is to conduct an experiment. For example, instead of simply measuring how much people exercise, a researcher could bring people into a laboratory and randomly assign half of them to run on a treadmill for 15 minutes and the rest to sit on a couch for 15 minutes. Although this seems like a minor change to the research design, it is extremely important. Now if the exercisers end up in more positive moods than those who did not exercise, it cannot be because their moods affected how much they exercised (because it was the researcher who determined how much they exercised). Likewise, it cannot be because some third variable (e.g., physical health) affected both how much they exercised and what mood they were in (because, again, it was the researcher who determined how much they exercised). Thus experiments eliminate the directionality and third-variable problems and allow researchers to draw firm conclusions about causal relationships.
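A small simulation can show why random assignment removes the third-variable problem. The sketch below deliberately assumes that exercise has no true effect on mood, so any mood gap between exercisers and non-exercisers must come from somewhere else. The variable names, effect sizes and sample size are hypothetical.

```python
import random
import statistics

random.seed(2)  # fixed seed so the simulation is repeatable
n = 40000

def mood_gap(exercised, mood):
    """Mean mood of exercisers minus mean mood of non-exercisers."""
    yes = [m for e, m in zip(exercised, mood) if e]
    no = [m for e, m in zip(exercised, mood) if not e]
    return statistics.mean(yes) - statistics.mean(no)

health = [random.gauss(0, 1) for _ in range(n)]
# Assumption: mood depends on health only; exercise has NO true effect here
mood = [h + random.gauss(0, 1) for h in health]

# Correlational study: healthier people are more likely to choose to exercise
chose_exercise = [h + random.gauss(0, 1) > 0 for h in health]

# Experiment: the researcher assigns exercise by coin flip, ignoring health
assigned_exercise = [random.random() < 0.5 for _ in range(n)]

gap_observed = mood_gap(chose_exercise, mood)
gap_experiment = mood_gap(assigned_exercise, mood)

print(f"self-selected exercise:     mood gap = {gap_observed:.2f}")
print(f"randomly assigned exercise: mood gap = {gap_experiment:.2f}")
```

When people select themselves into exercise, health differences between the groups create a sizable mood gap. When the researcher assigns exercise at random, the two groups are similar in health on average and the gap shrinks to roughly zero, as it should when there is no true effect.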

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlation does not imply causation. A statistical relationship between two variables,  X  and  Y , does not necessarily mean that  X  causes  Y . It is also possible that  Y  causes  X , or that a third variable,  Z , causes both  X  and  Y .
  • While correlational research cannot be used to establish causal relationships between variables, it does allow researchers to achieve many other important objectives (establishing reliability and validity, providing converging evidence, describing relationships, and making predictions).
  • Correlation coefficients can range from -1 to +1. The sign indicates the direction of the relationship between the variables and the numerical value indicates the strength of the relationship.

1. Practice: For each of the following studies, decide whether it is most likely correlational or experimental and explain why.

  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.

2. Practice: For each of the following statistical relationships, decide whether the directionality problem is present and think of at least one plausible third variable.

  • People who eat more lobster tend to live longer.
  • People who exercise more tend to weigh less.
  • College students who drink more alcohol tend to have poorer grades.

[1] Bushman, B. J., & Huesmann, L. R. (2001). Effects of televised violence on aggression. In D. Singer & J. Singer (Eds.), Handbook of children and the media (pp. 223–254). Thousand Oaks, CA: Sage.
[2] Messerli, F. H. (2012). Chocolate consumption, cognitive function, and Nobel laureates. New England Journal of Medicine, 367, 1562–1564.

Correlational Research: What it is with Examples

Use the correlational research method to conduct a correlational study and measure the statistical relationship between two variables.

Our minds can do some brilliant things. For example, they can memorize the jingle of a pizza truck. The louder the jingle, the closer the pizza truck is to us. Who taught us that? Nobody! We relied on our understanding and came to a conclusion. We don’t stop there, do we? If there are multiple pizza trucks in the area and each one has a different jingle, we would memorize them all and relate each jingle to its pizza truck.

This is precisely what correlational research is: establishing a relationship between two variables, “jingle” and “distance of the truck” in this particular example. A correlational study looks for variables that seem to vary together. When you see one variable changing, you have a fair idea of how the other variable will change.

What is Correlational research?

Correlational research is a type of non-experimental research method in which a researcher measures two variables and assesses the statistical relationship between them without manipulating either variable and with little or no effort to control extraneous variables. In statistical analysis, it is also essential to distinguish categorical data from numerical data: categorical data involves distinct categories or labels, while numerical data consists of measurable quantities.

Correlational Research Example

The correlation coefficient is a statistical measure of the strength and direction of the relationship between two variables, with a value between -1 and +1. When the correlation coefficient is close to +1, there is a strong positive correlation between the two variables. When the value is close to -1, there is a strong negative correlation. When the value is close to zero, there is little or no linear relationship between the two variables.

Let us take an example to understand correlational research.

Consider hypothetically, a researcher is studying a correlation between cancer and marriage. In this study, there are two variables: disease and marriage. Let us say marriage has a negative association with cancer. This means that married people are less likely to develop cancer.

However, this doesn’t necessarily mean that marriage directly prevents cancer. In correlational research, it is not possible to establish what causes what. It is also a misconception that a correlational study must involve two quantitative variables. What defines a correlational study is that two variables are measured but neither is manipulated, and this holds whether the variables are quantitative or categorical.
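To show that a correlation can be computed for categorical variables too, the sketch below codes marriage and disease as 0 or 1 and applies the ordinary Pearson formula (with two yes/no variables this is sometimes called the phi coefficient). The counts are entirely made up for illustration and are not real epidemiological data.

```python
import math

def pearson_r(x, y):
    """Pearson's r computed from deviations around the two means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample of 100 people, coded 1 = yes and 0 = no
# 50 married people, 5 of whom have the disease
# 50 unmarried people, 15 of whom have the disease
married = [1] * 50 + [0] * 50
disease = [1] * 5 + [0] * 45 + [1] * 15 + [0] * 35

r = pearson_r(married, disease)
print(f"r = {r:.2f}")  # about -0.25: a negative association in this made-up sample
```

The negative sign only says that, in this invented sample, disease is less common among the married. Nothing was manipulated, so the result says nothing about whether marriage protects against disease.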

Types of correlational research

Mainly three types of correlational research have been identified:

1. Positive correlation: A positive relationship between two variables is one in which an increase in one variable is accompanied by an increase in the other, and a decrease in one is accompanied by a decrease in the other. For example, the amount of money a person has might positively correlate with the number of cars the person owns.

2. Negative correlation: A negative correlation is quite literally the opposite of a positive relationship. If there is an increase in one variable, the second variable will show a decrease, and vice versa.

For example, a country’s education level might negatively correlate with its crime rate: where education levels are higher, crime rates tend to be lower, and vice versa. Please note that this doesn’t mean that a lack of education leads to crime. A lack of education and crime may instead share a common cause, such as poverty.

3. No correlation: In this third type, there is no correlation between the two variables. A change in one variable is not accompanied by any consistent change in the other. For example, being a millionaire and being happy might not be correlated, in which case knowing how much money a person has would tell us nothing about how happy that person is.

Characteristics of correlational research

Correlational research has three main characteristics. They are: 

  • Non-experimental : The correlational study is non-experimental. This means that researchers do not manipulate variables in order to support or refute a hypothesis. The researcher only measures and observes the relationship between the variables without altering them or subjecting them to external conditioning.
  • Backward-looking : Correlational research only looks back at historical data and observes events in the past. Researchers use it to measure and spot historical patterns between two variables. A correlational study may show a positive relationship between two variables, but this can change in the future.
  • Dynamic : The patterns between two variables found in correlational research are not fixed and can change over time. Two variables that were negatively correlated in the past can be positively correlated in the future due to various factors.

Data collection

The distinctive feature of correlational research is that the researcher can’t manipulate either of the variables involved. It doesn’t matter how or where the variables are measured. A researcher could observe participants in a closed environment or a public setting.

Researchers use two data collection methods to collect information in correlational research.

01. Naturalistic observation

Naturalistic observation is a method of data collection in which people’s behavior is observed in the natural environment in which they typically exist. This method is a type of field research. A researcher might observe people in a grocery store, at the cinema, on a playground, or in similar places.

Researchers who use this type of data collection make observations as unobtrusively as possible so that the participants are not aware that they are being observed; otherwise, they might not behave naturally.

Ethically, this method is acceptable if the participants remain anonymous and if the study is conducted in a public setting, a place where people would not normally expect complete privacy. In the grocery store example, people can be observed while they pick up items from the aisle and put them in their shopping bags. This is ethically acceptable, which is why most researchers choose public settings for recording their observations. This data collection method can be both qualitative and quantitative.

02. Archival data

Another approach to collecting correlational data is the use of archival data. Archival data is information that has previously been collected in the course of similar kinds of research. Archival data is usually made available through primary research.

In contrast to naturalistic observation, collecting information from archival data can be pretty straightforward. For example, counting the number of people named Richard in the various states of America based on social security records is relatively quick.

Psychological Research

Analyzing data: correlational and experimental research.

Did you know that as sales of ice cream increase, so does the overall rate of crime? Is it possible that indulging in your favorite flavor of ice cream could send you on a crime spree? Or, after committing a crime, do you think you might decide to treat yourself to a cone? There is no question that a relationship exists between ice cream and crime (e.g., Harper, 2013), but does one thing actually cause the other to occur?

It is much more likely that both ice cream sales and crime rates are related to the temperature outside. When the temperature is warm, there are lots of people out of their houses, interacting with each other, getting annoyed with one another, and sometimes committing crimes. Also, when it is warm outside, we are more likely to seek a refreshing treat like ice cream. How do we determine if there is indeed a relationship between two things? And when there is a relationship, how can we discern whether it is attributable to coincidence or causation? We do this through statistical analysis of the data. Which analysis we use will depend on several conditions outlined next.

Introduction to Statistical Thinking

Figure 2.6.1 . People around the world differ in their preferences for drinking coffee versus drinking tea. Would the results of the coffee study be the same in Canada as in China? [Image: Duncan, https://goo.gl/vbMyTm, CC BY-NC 2.0, https://goo.gl/l8UUGY]

Does drinking coffee actually increase your life expectancy? A recent study (Freedman, Park, Abnet, Hollenbeck, & Sinha, 2012) found that men who drank at least six cups of coffee a day had a 10% lower chance of dying (women 15% lower) than those who drank none. Does this mean you should pick up or increase your own coffee habit? Modern society has become awash in studies such as this; you can read about several such studies in the news every day. Conducting such a study well, and interpreting the results of such studies, requires understanding basic ideas of statistics, the science of gaining insight from data. Key components of a statistical investigation are:

  • Planning the study: Start by asking a testable research question and deciding how to collect data. For example, how long was the study period of the coffee study? How many people were recruited for the study, how were they recruited, and from where? How old were they? What other variables were recorded about the individuals? Were changes made to the participants’ coffee habits during the course of the study?
  • Examining the data: What are appropriate ways to examine the data? What graphs are relevant, and what do they reveal? What descriptive statistics can be calculated to summarize relevant aspects of the data, and what do they reveal? What patterns do you see in the data? Are there any individual observations that deviate from the overall pattern, and what do they reveal? For example, in the coffee study, did the proportions differ when we compared the smokers to the non-smokers?
  • Inferring from the data: What are valid statistical methods for drawing inferences “beyond” the data you collected? In the coffee study, is the 10%–15% reduction in risk of death something that could have happened just by chance?
  • Drawing conclusions: Based on what you learned from your data, what conclusions can you draw? Who do you think these conclusions apply to? (Were the people in the coffee study older? Healthy? Living in cities?) Can you draw a cause-and-effect conclusion about your treatments? (Are scientists now saying that the coffee drinking is the cause of the decreased risk of death?)

Notice that the numerical analysis (“crunching numbers” on the computer) comprises only a small part of overall statistical investigation. In this section, you will see how we can answer some of these questions and what questions you should be asking about any statistical investigation you read about.

Video 2.6.1.  Types of Statistical Studies explains the differences between correlational and experimental research.

Distributional Thinking

When data are collected to address a particular question, an important first step is to think of meaningful ways to organize and examine the data. Let’s take a look at an example.

Example 1 : Researchers investigated whether cancer pamphlets are written at an appropriate level to be read and understood by cancer patients (Short, Moriarty, & Cooley, 1995). Tests of reading ability were given to 63 patients. In addition, readability level was determined for a sample of 30 pamphlets, based on characteristics such as the lengths of words and sentences in the pamphlet. The results, reported in terms of grade levels, are displayed in Figure 2.6.2.

Table showing patients' reading levels and pamphlets' readability levels.

Figure 2.6.2 . Frequency tables of patient reading levels and pamphlet readability levels.

  • Data vary . More specifically, values of a variable (such as reading level of a cancer patient or readability level of a cancer pamphlet) vary.
  • Analyzing the pattern of variation, called the distribution of the variable, often reveals insights.

Addressing the research question of whether the cancer pamphlets are written at appropriate levels for the cancer patients requires comparing the two distributions. A naïve comparison might focus only on the centers of the distributions. Both medians turn out to be ninth grade, but considering only medians ignores the variability and the overall distributions of these data. A more illuminating approach is to compare the entire distributions, for example with a graph, as in Figure 2.6.3.

Bar graph showing that the reading level of pamphlets is typically higher than the reading level of the patients.

Figure 2.6.3 . Comparison of patient reading levels and pamphlet readability levels.

Figure 2.6.3 makes clear that the two distributions are not well aligned at all. The most glaring discrepancy is that many patients (17/63, or 27%, to be precise) have a reading level below that of the most readable pamphlet. These patients will need help to understand the information provided in the cancer pamphlets. Notice that this conclusion follows from considering the distributions as a whole, not simply measures of center or variability, and that the graph contrasts those distributions more immediately than the frequency tables.

Statistical Significance

Even when we find patterns in data, often there is still uncertainty in various aspects of the data. For example, there may be potential for measurement errors (even your own body temperature can fluctuate by almost 1°F over the course of the day). Or we may only have a “snapshot” of observations from a more long-term process or only a small subset of individuals from the population of interest. In such cases, how can we determine whether the patterns we see in our small set of data are convincing evidence of a systematic phenomenon in the larger process or population? Let’s take a look at another example.

Example 2 : In a study reported in the November 2007 issue of Nature , researchers investigated whether pre-verbal infants take into account an individual’s actions toward others in evaluating that individual as appealing or aversive (Hamlin, Wynn, & Bloom, 2007). In one component of the study, 10-month-old infants were shown a “climber” character (a piece of wood with “googly” eyes glued onto it) that could not make it up a hill in two tries. Then the infants were shown two scenarios for the climber’s next try, one where the climber was pushed to the top of the hill by another character (“helper”), and one where the climber was pushed back down the hill by another character (“hinderer”). The infant was alternately shown these two scenarios several times. Then the infant was presented with two pieces of wood (representing the helper and the hinderer characters) and asked to pick one to play with.

The researchers found that of the 16 infants who made a clear choice, 14 chose to play with the helper toy. One possible explanation for this clear majority result is that the helping behavior of the one toy increases the infants’ likelihood of choosing that toy. But are there other possible explanations? What about the color of the toy? Well, prior to collecting the data, the researchers arranged so that each color and shape (red square and blue circle) would be seen by the same number of infants. Or maybe the infants had right-handed tendencies and so picked whichever toy was closer to their right hand?

Well, prior to collecting the data, the researchers arranged it so half the infants saw the helper toy on the right and half on the left. Or, maybe the shapes of these wooden characters (square, triangle, circle) had an effect? Perhaps, but again, the researchers controlled for this by rotating which shape was the helper toy, the hinderer toy, and the climber. When designing experiments, it is important to control for as many variables as might affect the responses as possible. It is beginning to appear that the researchers accounted for all the other plausible explanations. But there is one more important consideration that cannot be controlled—if we did the study again with these 16 infants, they might not make the same choices. In other words, there is some randomness inherent in their selection process.

Maybe each infant had no genuine preference at all, and it was simply “random luck” that led to 14 infants picking the helper toy. Although this random component cannot be controlled, we can apply a probability model to investigate the pattern of results that would occur in the long run if random chance were the only factor.

If the infants were equally likely to pick between the two toys, then each infant had a 50% chance of picking the helper toy. It’s like each infant tossed a coin, and if it landed heads, the infant picked the helper toy. So if we tossed a coin 16 times, could it land heads 14 times? Sure, it’s possible, but it turns out to be very unlikely. Getting 14 (or more) heads in 16 tosses is about as likely as tossing a coin and getting 9 heads in a row. This probability is referred to as a p-value. The p-value is the probability of obtaining results at least as extreme as those observed if random chance were the only factor at work. Within psychology, the most common standard for p-values is “p < .05”. What this means is that results this extreme would occur less than 5% of the time by random chance alone. It does not mean that there is a 95% probability that the results reflect a meaningful pattern in human psychology: the p-value is calculated on the assumption that only chance is operating, so it cannot tell us how likely that assumption is to be true. When a p-value falls below the chosen standard, we say the result has statistical significance.

So, in the study above, if we assume that each infant was choosing equally, then the probability that 14 or more out of 16 infants would choose the helper toy is found to be 0.0021. We have only two logical possibilities: either the infants have a genuine preference for the helper toy, or the infants have no preference (50/50), and an outcome that would occur only 2 times in 1,000 iterations happened in this study. Because this p-value of 0.0021 is quite small, we conclude that the study provides very strong evidence that these infants have a genuine preference for the helper toy.

If we compare the p-value to some cut-off value, like 0.05, we see that the p-value is smaller. Because the p-value is smaller than that cut-off value, we reject the hypothesis that only random chance was at play here. In this case, these researchers would conclude that significantly more than half of the infants in the study chose the helper toy, giving strong evidence of a genuine preference for the toy with the helping behavior.
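The coin-tossing calculation for the infant study can be reproduced in a few lines. The following is a minimal sketch using only the Python standard library; the function name `binomial_tail` is an illustrative choice, not something from the original study.

```python
from math import comb

def binomial_tail(successes: int, trials: int, p: float = 0.5) -> float:
    """Probability of at least `successes` successes in `trials` independent tries."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 14 or more of the 16 infants choosing the helper toy, if each choice were a coin toss
p_value = binomial_tail(14, 16)

# Nine heads in a row, the comparison used in the text
nine_heads = 0.5 ** 9

print(f"p-value: {p_value:.4f}")
print(f"nine heads in a row: {nine_heads:.4f}")
print("below the .05 cut-off:", p_value < 0.05)
```

The exact tail probability is 137/65,536, which rounds to the 0.0021 reported above.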

Generalizability

Photo of a diverse group of college-aged students.

Figure 2.6.4 . Generalizability is an important research consideration: The results of studies with widely representative samples are more likely to generalize to the population. [Image: Barnacles Budget Accommodation]

One limitation to the study mentioned previously about the babies choosing the “helper” toy is that the conclusion only applies to the 16 infants in the study. We don’t know much about how those 16 infants were selected. Suppose we want to select a subset of individuals (a sample ) from a much larger group of individuals (the population ) in such a way that conclusions from the sample can be generalized to the larger population. This is the question faced by pollsters every day.

Example 3 : The General Social Survey (GSS) is a survey on societal trends conducted every other year in the United States. Based on a sample of about 2,000 adult Americans, researchers make claims about what percentage of the U.S. population consider themselves to be “liberal,” what percentage consider themselves “happy,” what percentage feel “rushed” in their daily lives, and many other issues. The key to making these claims about the larger population of all American adults lies in how the sample is selected. The goal is to select a sample that is representative of the population, and a common way to achieve this goal is to select a random sample that gives every member of the population an equal chance of being selected for the sample. In its simplest form, random sampling involves numbering every member of the population and then using a computer to randomly select the subset to be surveyed. Most polls don’t operate exactly like this, but they do use probability-based sampling methods to select individuals from nationally representative panels.
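The simplest form of random sampling described above (number everyone, then let a computer pick) can be sketched as follows. The population size of 1,000,000 and the helper name `simple_random_sample` are assumptions made for illustration; the GSS itself uses a more elaborate probability-based design.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Pick `sample_size` distinct ID numbers from 1..population_size.

    Every member of the numbered population has the same chance of selection.
    """
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), sample_size)

# Hypothetical numbered population of one million adults, sample of 2,000
sample_ids = simple_random_sample(population_size=1_000_000, sample_size=2_000, seed=1)

print(len(sample_ids), "people selected")
print("first five ID numbers:", sorted(sample_ids)[:5])
```

Passing a seed only makes the illustration repeatable; a real survey would not fix it in advance.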

In 2004, the GSS reported that 817 of 977 respondents (or 83.6%) indicated that they always or sometimes feel rushed. This is a clear majority, but we again need to consider variation due to random sampling . Fortunately, we can use the same probability model we did in the previous example to investigate the probable size of this error. (Note, we can use the coin-tossing model when the actual population size is much, much larger than the sample size, as then we can still consider the probability to be the same for every individual in the sample.) This probability model predicts that the sample result will be within 3 percentage points of the population value (roughly 1 over the square root of the sample size, the margin of error ). A statistician would conclude, with 95% confidence, that between 80.6% and 86.6% of all adult Americans in 2004 would have responded that they sometimes or always feel rushed.

The key to the margin of error is that when we use a probability sampling method, we can make claims about how often (in the long run, with repeated random sampling) the sample result would fall within a certain distance from the unknown population value by chance (meaning by random sampling variation) alone. Conversely, non-random samples are often subject to bias, meaning the sampling method systematically over-represents some segments of the population and under-represents others. We also still need to consider other sources of bias, such as individuals not responding honestly. These sources of error are not measured by the margin of error.
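The margin-of-error arithmetic for the 2004 GSS result can be checked directly. This is a small sketch of the "1 over the square root of the sample size" rule of thumb quoted above, using the counts given in the text; the variable names are illustrative.

```python
from math import sqrt

rushed, respondents = 817, 977  # counts reported for the 2004 GSS question

sample_proportion = rushed / respondents
margin_of_error = 1 / sqrt(respondents)  # rule of thumb for roughly 95% confidence

low = sample_proportion - margin_of_error
high = sample_proportion + margin_of_error

print(f"sample proportion: {sample_proportion:.1%}")
print(f"margin of error:   {margin_of_error:.1%}")
print(f"interval:          {low:.1%} to {high:.1%}")
```

Unrounded, the rule of thumb gives a margin of about 3.2 percentage points, so the computed interval is slightly wider than the 80.6% to 86.6% range obtained by rounding the margin to 3 points.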

Cause and Effect Conclusions

In many research studies, the primary question of interest concerns differences between groups. Then the question becomes how were the groups formed (e.g., selecting people who already drink coffee vs. those who don’t). In some studies, the researchers actively form the groups themselves. But then we have a similar question—could any differences we observe in the groups be an artifact of that group-formation process? Or maybe the difference we observe in the groups is so large that we can discount a “fluke” in the group-formation process as a reasonable explanation for what we find?

Example 4 : A psychology study investigated whether people tend to display more creativity when they are thinking about intrinsic (internal) or extrinsic (external) motivations (Ramsey & Schafer, 2002, based on a study by Amabile, 1985). The subjects were 47 people with extensive experience with creative writing. Subjects began by answering survey questions about either intrinsic motivations for writing (such as the pleasure of self-expression) or extrinsic motivations (such as public recognition). Then all subjects were instructed to write a haiku, and those poems were evaluated for creativity by a panel of judges. The researchers conjectured beforehand that subjects who were thinking about intrinsic motivations would display more creativity than subjects who were thinking about extrinsic motivations. The creativity scores from the 47 subjects in this study are displayed in Figure 2.6.5, where higher scores indicate more creativity.

Image showing a dot for creativity scores, which vary between 5 and 27, and the types of motivation each person was given as a motivator, either extrinsic or intrinsic.

Figure 2.6.5 . Creativity scores separated by type of motivation.

In this example, the key question is whether the type of motivation affects creativity scores. In particular, do subjects who were asked about intrinsic motivations tend to have higher creativity scores than subjects who were asked about extrinsic motivations?

Figure 2.6.5 reveals that both motivation groups saw considerable variability in creativity scores, and these scores have considerable overlap between the groups. In other words, it’s certainly not always the case that those with intrinsic motivations have higher creativity than those with extrinsic motivations, but there may still be a statistical tendency in this direction. (Psychologist Keith Stanovich (2013) refers to people’s difficulties with thinking about such probabilistic tendencies as “the Achilles heel of human cognition.”)

The mean creativity score is 19.88 for the intrinsic group, compared to 15.74 for the extrinsic group, which supports the researchers’ conjecture. Yet comparing only the means of the two groups fails to consider the variability of creativity scores in the groups. We can measure variability with statistics such as the standard deviation: 5.25 for the extrinsic group and 4.40 for the intrinsic group. The standard deviations tell us that most of the creativity scores are within about 5 points of the mean score in each group. We see that the mean score for the intrinsic group lies within one standard deviation of the mean score for the extrinsic group. So, although there is a tendency for the creativity scores to be higher in the intrinsic group, on average, the difference is not extremely large.

We again want to consider possible explanations for this difference. The study only involved individuals with extensive creative writing experience. Although this limits the population to which we can generalize, it does not explain why the mean creativity score was a bit larger for the intrinsic group than for the extrinsic group. Maybe women tend to receive higher creativity scores? Here is where we need to focus on how the individuals were assigned to the motivation groups. If only women were in the intrinsic motivation group and only men in the extrinsic group, then this would present a problem because we wouldn’t know if the intrinsic group did better because of the different type of motivation or because they were women. However, the researchers guarded against such a problem by randomly assigning the individuals to the motivation groups. Like flipping a coin, each individual was just as likely to be assigned to either type of motivation. Why is this helpful? Because this random assignment tends to balance out all the variables related to creativity we can think of, and even those we don’t think of in advance, between the two groups. So we should have a similar male/female split between the two groups; we should have a similar age distribution between the two groups; we should have a similar distribution of educational background between the two groups; and so on. Random assignment should produce groups that are as similar as possible except for the type of motivation, which presumably eliminates all those other variables as possible explanations for the observed tendency for higher scores in the intrinsic group.

But does this always work? No, so by “luck of the draw” the groups may be a little different prior to answering the motivation survey. So then the question is, is it possible that an unlucky random assignment is responsible for the observed difference in creativity scores between the groups? In other words, suppose each individual’s poem was going to get the same creativity score no matter which group they were assigned to, that the type of motivation in no way impacted their score. Then how often would the random-assignment process alone lead to a difference in mean creativity scores as large (or larger) than 19.88 – 15.74 = 4.14 points?

We again want to apply a probability model to approximate a p-value, but this time the model will be a bit different. Think of writing everyone’s creativity scores on an index card, shuffling up the index cards, and then dealing out 23 to the extrinsic motivation group and 24 to the intrinsic motivation group, and finding the difference in the group means. We (better yet, the computer) can repeat this process over and over to see how often, when the scores don’t change, random assignment leads to a difference in means at least as large as 4.14. Figure 2.6.6 shows the results from 1,000 such hypothetical random assignments for these scores.

Standard distribution in a typical bell curve.

Figure 2.6.6 . Differences in group means under random assignment alone.

Only 2 of the 1,000 simulated random assignments produced a difference in group means of 4.14 or larger. In other words, the approximate p-value is 2/1000 = 0.002. This small p-value indicates that it would be very surprising for the random assignment process alone to produce such a large difference in group means. Therefore, as with Example 2, we have strong evidence that focusing on intrinsic motivations tends to increase creativity scores, as compared to thinking about extrinsic motivations.
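The index-card shuffling procedure can be sketched as a short simulation. The individual creativity scores are not listed in the text, so the two lists below are hypothetical toy data chosen only to make the example run; the function name `permutation_p_value` is likewise an illustrative choice.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_shuffles=1000, seed=None):
    """Approximate one-sided p-value for mean(group_a) - mean(group_b).

    Pools all scores, reshuffles them into two groups of the original sizes,
    and counts how often the reshuffled difference in means is at least as
    large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)                              # shuffle the index cards
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])   # deal them into two groups
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical toy scores, NOT the data from the creativity study
toy_intrinsic = [22, 19, 24, 20, 26, 18]
toy_extrinsic = [15, 17, 12, 20, 14, 16]

p = permutation_p_value(toy_intrinsic, toy_extrinsic, n_shuffles=5000, seed=0)
print(f"approximate p-value: {p:.3f}")
```

Because the p-value is estimated by simulation, it changes slightly from run to run unless a seed is fixed; more shuffles give a steadier estimate.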

Notice that the previous statement implies a cause-and-effect relationship between motivation and creativity score; is such a strong conclusion justified? Yes, because of the random assignment used in the study. That should have balanced out any other variables between the two groups, so now that the small p-value convinces us that the higher mean in the intrinsic group wasn’t just a coincidence, the only reasonable explanation left is the difference in the type of motivation. Can we generalize this conclusion to everyone? Not necessarily—we could cautiously generalize this conclusion to individuals with extensive experience in creative writing similar to the individuals in this study, but we would still want to know more about how these individuals were selected to participate.

Close-up photo of mathematical equations.

Figure 2.6.7 . Researchers employ the scientific method that involves a great deal of statistical thinking: generate a hypothesis –> design a study to test that hypothesis –> conduct the study –> analyze the data –> report the results. [Image: widdowquinn]

Statistical thinking involves the careful design of a study to collect meaningful data to answer a focused research question, detailed analysis of patterns in the data, and drawing conclusions that go beyond the observed data. Random sampling is paramount to generalizing results from our sample to a larger population, and random assignment is key to drawing cause-and-effect conclusions. With both kinds of randomness, probability models help us assess how much random variation we can expect in our results, in order to determine whether our results could happen by chance alone and to estimate a margin of error.

So where does this leave us with regard to the coffee study mentioned previously (Freedman, Park, Abnet, Hollenbeck, & Sinha, 2012), which found that men who drank at least six cups of coffee a day had a 10% lower chance of dying (women 15% lower) than those who drank none? We can answer many of the questions:

  • This was a 14-year study conducted by researchers at the National Cancer Institute.
  • The results were published in the June issue of the New England Journal of Medicine , a respected, peer-reviewed journal.
  • The study reviewed coffee habits of more than 402,000 people ages 50 to 71 from six states and two metropolitan areas. Those with cancer, heart disease, and stroke were excluded at the start of the study. Coffee consumption was assessed once at the start of the study.
  • About 52,000 people died during the course of the study.
  • People who drank between two and five cups of coffee daily showed a lower risk as well, but the amount of reduction increased for those drinking six or more cups.
  • The sample sizes were fairly large and so the p-values are quite small, even though percent reduction in risk was not extremely large (dropping from a 12% chance to about 10%–11%).
  • Whether coffee was caffeinated or decaffeinated did not appear to affect the results.
  • This was an observational study, so no cause-and-effect conclusions can be drawn between coffee drinking and increased longevity, contrary to the impression conveyed by many news headlines about this study. In particular, it’s possible that those with chronic diseases don’t tend to drink coffee.

This study needs to be reviewed in the larger context of similar studies and consistency of results across studies, with the constant caution that this was not a randomized experiment. Whereas a statistical analysis can still “adjust” for other potential confounding variables, we are not yet convinced that researchers have identified them all or completely isolated why this decrease in death risk is evident. Researchers can now take the findings of this study and develop more focused studies that address new questions.

Explore these outside resources to learn more about applied statistics:

  • Video about p-values:  P-Value Extravaganza
  • Interactive web applets for teaching and learning statistics
  • Inter-university Consortium for Political and Social Research  where you can find and analyze data.
  • The Consortium for the Advancement of Undergraduate Statistics
  • Analyzing Data: Correlational and Experimental Research. Authored by : Nicole Arduini-Van Hoose. Provided by : Hudson Valley Community College. Located at : https://courses.lumenlearning.com/adolescent/chapter/analyzing-data-correlational-and-experimental-research/ . License : CC BY-NC-SA: Attribution-NonCommercial-ShareAlike
  • Psychology. Provided by : OpenStax. Located at : https://openstax.org/details/books/psychology . License : CC BY: Attribution
  • Types of Statistical Studies. Authored by : Sal Khan. Provided by : Khan Academy. Located at : https://youtu.be/z-Qi4w6Xkuc . License : CC BY-NC-ND: Attribution-NonCommercial-NoDerivatives

Chapter 7: Nonexperimental Research

Correlational Research

Learning Objectives

  • Define correlational research and give several examples.
  • Explain why a researcher might choose to conduct correlational research rather than experimental research or another type of nonexperimental research.

What Is Correlational Research?

Correlational research is a type of nonexperimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are essentially two reasons that researchers interested in statistical relationships between variables would choose to conduct a correlational study rather than an experiment. The first is that they do not believe that the statistical relationship is a causal one. For example, a researcher might evaluate the validity of a brief extraversion test by administering it to a large group of participants along with a longer extraversion test that has already been shown to be valid. This researcher might then check to see whether participants’ scores on the brief test are strongly correlated with their scores on the longer one. Neither test score is thought to cause the other, so there is no independent variable to manipulate. In fact, the terms independent variable and dependent variable do not apply to this kind of research.
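As an illustration of the validity check just described, the sketch below computes Pearson's correlation coefficient between two sets of test scores. The scores are hypothetical, invented only for this example, and `pearson_r` is a hand-written helper rather than a function from any particular statistics package.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]   # deviations from the mean
    dy = [yi - mean_y for yi in y]
    covariation = sum(a * b for a, b in zip(dx, dy))
    return covariation / sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Hypothetical scores for eight participants on the brief and the longer test
brief_test = [12, 15, 9, 20, 17, 11, 14, 18]
long_test = [48, 55, 40, 71, 66, 45, 52, 64]

r = pearson_r(brief_test, long_test)
print(f"r = {r:.2f}")
```

A value of r close to 1 would indicate that people who score high on the brief test also tend to score high on the longer one. Both scores are simply measured, so the result describes a relationship and says nothing about one score causing the other.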

The other reason that researchers would choose to use a correlational study rather than an experiment is that the statistical relationship of interest is thought to be causal, but the researcher  cannot  manipulate the independent variable because it is impossible, impractical, or unethical. For example, Allen Kanner and his colleagues thought that the number of “daily hassles” (e.g., rude salespeople, heavy traffic) that people experience affects the number of physical and psychological symptoms they have (Kanner, Coyne, Schaefer, & Lazarus, 1981). [1] But because they could not  manipulate  the number of daily hassles their participants experienced, they had to settle for  measuring  the number of daily hassles—along with the number of symptoms—using self-report questionnaires. Although the strong positive relationship they found between these two variables is consistent with their idea that hassles cause symptoms, it is also consistent with the idea that symptoms cause hassles or that some third variable (e.g., neuroticism) causes both.

A common misconception among beginning researchers is that correlational research must involve two quantitative variables, such as scores on two extraversion tests or the number of hassles and number of symptoms people have experienced. However, the defining feature of correlational research is that the two variables are measured (neither one is manipulated), and this is true regardless of whether the variables are quantitative or categorical. Imagine, for example, that a researcher administers the Rosenberg Self-Esteem Scale to 50 American university students and 50 Japanese university students. Although this “feels” like a between-subjects experiment, it is a correlational study because the researcher did not manipulate the students’ nationalities. The same is true of the study by Cacioppo and Petty comparing professors and factory workers in terms of their need for cognition. It is a correlational study because the researchers did not manipulate the participants’ occupations.

Figure 7.2 shows data from a hypothetical study on the relationship between whether people make a daily list of things to do (a “to-do list”) and stress. Notice that it is unclear whether this design is an experiment or a correlational study because it is unclear whether the independent variable was manipulated. If the researcher randomly assigned some participants to make daily to-do lists and others not to, then it is an experiment. If the researcher simply asked participants whether they made daily to-do lists, then it is a correlational study. The distinction is important because if the study was an experiment, then it could be concluded that making the daily to-do lists reduced participants’ stress. But if it was a correlational study, it could only be concluded that these variables are related. Perhaps being stressed has a negative effect on people’s ability to plan ahead (the directionality problem). Or perhaps people who are more conscientious are more likely to make to-do lists and less likely to be stressed (the third-variable problem). The crucial point is that what defines a study as experimental or correlational is not the variables being studied, nor whether the variables are quantitative or categorical, nor the type of graph or statistics used to analyze the data. It is  how  the study is conducted.

People who made a daily to-do list had a stress level of 18. People who didn't had a stress level of 25.

Data Collection in Correlational Research

Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. A researcher could have participants come to a laboratory to complete a computerized backward digit span task and a computerized risky decision-making task and then assess the relationship between participants’ scores on the two tasks. Or a researcher could go to a shopping mall to ask people about their attitudes toward the environment and their shopping habits and then assess the relationship between these two variables. Both of these studies would be correlational because no independent variable is manipulated. However, because some approaches to data collection are strongly associated with correlational research, it makes sense to discuss them here. The two we will focus on are naturalistic observation and archival data. A third, survey research, is discussed in its own chapter, Chapter 9.

Naturalistic Observation

Naturalistic observation  is an approach to data collection that involves observing people’s behaviour in the environment in which it typically occurs. Thus naturalistic observation is a type of field research (as opposed to a type of laboratory research). It could involve observing shoppers in a grocery store, children on a school playground, or psychiatric inpatients in their wards. Researchers engaged in naturalistic observation usually make their observations as unobtrusively as possible so that participants are often not aware that they are being studied. Ethically, this method is considered to be acceptable if the participants remain anonymous and the behaviour occurs in a public setting where people would not normally have an expectation of privacy. Grocery shoppers putting items into their shopping carts, for example, are engaged in public behaviour that is easily observable by store employees and other shoppers. For this reason, most researchers would consider it ethically acceptable to observe them for a study. On the other hand, one of the arguments against the ethicality of the naturalistic observation of “bathroom behaviour” discussed earlier in the book is that people have a reasonable expectation of privacy even in a public restroom and that this expectation was violated.

Researchers Robert Levine and Ara Norenzayan used naturalistic observation to study differences in the “pace of life” across countries (Levine & Norenzayan, 1999). [2] One of their measures involved observing pedestrians in a large city to see how long it took them to walk 60 feet. They found that people in some countries walked reliably faster than people in other countries. For example, people in Canada and Sweden covered 60 feet in just under 13 seconds on average, while people in Brazil and Romania took close to 17 seconds.

Because naturalistic observation takes place in the complex and even chaotic “real world,” there are two closely related issues that researchers must deal with before collecting data. The first is sampling. When, where, and under what conditions will the observations be made, and who exactly will be observed? Levine and Norenzayan described their sampling process as follows:

“Male and female walking speed over a distance of 60 feet was measured in at least two locations in main downtown areas in each city. Measurements were taken during main business hours on clear summer days. All locations were flat, unobstructed, had broad sidewalks, and were sufficiently uncrowded to allow pedestrians to move at potentially maximum speeds. To control for the effects of socializing, only pedestrians walking alone were used. Children, individuals with obvious physical handicaps, and window-shoppers were not timed. Thirty-five men and 35 women were timed in most cities.” (p. 186)

Precise specification of the sampling process in this way makes data collection manageable for the observers, and it also provides some control over important extraneous variables. For example, by making their observations on clear summer days in all countries, Levine and Norenzayan controlled for effects of the weather on people’s walking speeds.

The second issue is measurement. What specific behaviours will be observed? In Levine and Norenzayan’s study, measurement was relatively straightforward. They simply measured out a 60-foot distance along a city sidewalk and then used a stopwatch to time participants as they walked over that distance. Often, however, the behaviours of interest are not so obvious or objective. For example, researchers Robert Kraut and Robert Johnston wanted to study bowlers’ reactions to their shots, both when they were facing the pins and then when they turned toward their companions (Kraut & Johnston, 1979). [3] But what “reactions” should they observe? Based on previous research and their own pilot testing, Kraut and Johnston created a list of reactions that included “closed smile,” “open smile,” “laugh,” “neutral face,” “look down,” “look away,” and “face cover” (covering one’s face with one’s hands). The observers committed this list to memory and then practised by coding the reactions of bowlers who had been videotaped. During the actual study, the observers spoke into an audio recorder, describing the reactions they observed. Among the most interesting results of this study was that bowlers rarely smiled while they still faced the pins. They were much more likely to smile after they turned toward their companions, suggesting that smiling is not purely an expression of happiness but also a form of social communication.

When the observations require a judgment on the part of the observers—as in Kraut and Johnston’s study—this process is often described as  coding . Coding generally requires clearly defining a set of target behaviours. The observers then categorize participants individually in terms of which behaviour they have engaged in and the number of times they engaged in each behaviour. The observers might even record the duration of each behaviour. The target behaviours must be defined in such a way that different observers code them in the same way. This difficulty with coding is the issue of interrater reliability, as mentioned in Chapter 5. Researchers are expected to demonstrate the interrater reliability of their coding procedure by having multiple raters code the same behaviours independently and then showing that the different observers are in close agreement. Kraut and Johnston, for example, video recorded a subset of their participants’ reactions and had two observers independently code them. The two observers showed that they agreed on the reactions that were exhibited 97% of the time, indicating good interrater reliability.
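
The agreement check described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure Kraut and Johnston actually used: the function name `percent_agreement` and the ten coded reactions are hypothetical, and only the category labels are taken from the text.

```python
def percent_agreement(codes_a, codes_b):
    """Share of observations on which two raters assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both raters must code the same observations.")
    if not codes_a:
        raise ValueError("At least one observation is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes for ten bowlers (illustrative only, not study data).
rater_1 = ["open smile", "neutral face", "laugh", "look down", "closed smile",
           "neutral face", "open smile", "face cover", "look away", "laugh"]
rater_2 = ["open smile", "neutral face", "laugh", "look down", "closed smile",
           "neutral face", "closed smile", "face cover", "look away", "laugh"]

agreement = percent_agreement(rater_1, rater_2)
print(f"Agreement: {agreement:.0%}")  # the raters differ on one of ten codes
```

Simple percent agreement does not correct for agreement that would occur by chance; chance-corrected statistics such as Cohen's kappa exist for that purpose but are not shown here.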

Archival Data

Another approach to correlational research is the use of archival data, which are data that have already been collected for some other purpose. An example is a study by Brett Pelham and his colleagues on “implicit egotism,” the tendency for people to prefer people, places, and things that are similar to themselves (Pelham, Carvallo, & Jones, 2005). [4] In one study, they examined Social Security records to show that women with the names Virginia, Georgia, Louise, and Florence were especially likely to have moved to the states of Virginia, Georgia, Louisiana, and Florida, respectively.

As with naturalistic observation, measurement can be more or less straightforward when working with archival data. For example, counting the number of people named Virginia who live in various states based on Social Security records is relatively straightforward. But consider a study by Christopher Peterson and his colleagues on the relationship between optimism and health using data that had been collected many years before for a study on adult development (Peterson, Seligman, & Vaillant, 1988). [5] In the 1940s, healthy male college students had completed an open-ended questionnaire about difficult wartime experiences. In the late 1980s, Peterson and his colleagues reviewed the men’s questionnaire responses to obtain a measure of explanatory style—their habitual ways of explaining bad events that happen to them. More pessimistic people tend to blame themselves and expect long-term negative consequences that affect many aspects of their lives, while more optimistic people tend to blame outside forces and expect limited negative consequences. To obtain a measure of explanatory style for each participant, the researchers used a procedure in which all negative events mentioned in the questionnaire responses, and any causal explanations for them, were identified and written on index cards. These were given to a separate group of raters who rated each explanation in terms of three separate dimensions of optimism-pessimism. These ratings were then averaged to produce an explanatory style score for each participant. The researchers then assessed the statistical relationship between the men’s explanatory style as undergraduate students and archival measures of their health at approximately 60 years of age. The primary result was that the more optimistic the men were as undergraduate students, the healthier they were as older men. Pearson’s  r  was +.25.
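
The two computational steps in this example, averaging the raters' scores into one explanatory style score per participant and then computing Pearson's r against a health measure, might look like the following sketch. All participant IDs, ratings, and health scores are made up for illustration, and the sketch assumes that higher numbers mean more optimism and better health; it does not reproduce the study's data or its r of +.25.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two sequences of equal length (at least 2).")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant.")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings on three dimensions per participant (higher = more optimistic).
ratings = {
    "P1": [2, 3, 4],
    "P2": [5, 6, 4],
    "P3": [1, 2, 3],
    "P4": [6, 5, 7],
    "P5": [4, 4, 4],
}
# Hypothetical health scores at about age 60 (higher = healthier).
health = {"P1": 60, "P2": 70, "P3": 65, "P4": 72, "P5": 58}

# Step 1: average the ratings into one explanatory style score per person.
style_scores = {pid: sum(r) / len(r) for pid, r in ratings.items()}

# Step 2: correlate style scores with health, keeping participants aligned.
ids = sorted(ratings)
r = pearson_r([style_scores[p] for p in ids], [health[p] for p in ids])
print(f"Pearson's r = {r:+.2f}")
```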

This method is an example of content analysis, a family of systematic approaches to measurement using complex archival data. Just as naturalistic observation requires specifying the behaviours of interest and then noting them as they occur, content analysis requires specifying keywords, phrases, or ideas and then finding all occurrences of them in the data. These occurrences can then be counted, timed (e.g., the amount of time devoted to entertainment topics on the nightly news show), or analyzed in a variety of other ways.
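
The simplest form of the counting step can be sketched as follows. The function name `count_keywords`, the sample response, and the keyword list are all hypothetical, and real content analysis usually involves human raters and richer coding schemes than whole-word matching.

```python
import re
from collections import Counter


def count_keywords(text, keywords):
    """Count whole-word, case-insensitive occurrences of each keyword."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return {kw: counts[kw.lower()] for kw in keywords}


# Hypothetical questionnaire response (not taken from any study).
response = ("It was my fault that the patrol failed. I always fail under "
            "pressure, and I always will. The weather was bad too.")
keywords = ["fault", "always", "weather"]

print(count_keywords(response, keywords))
```

Because the match is on whole words, "failed" is not counted as an occurrence of "fail"; whether related word forms should be grouped together is a coding decision the researcher has to make in advance.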

Key Takeaways

  • Correlational research involves measuring two variables and assessing the relationship between them, with no manipulation of an independent variable.
  • Correlational research is not defined by where or how the data are collected. However, some approaches to data collection are strongly associated with correlational research. These include naturalistic observation (in which researchers observe people’s behaviour in the context in which it normally occurs) and the use of archival data that were already collected for some other purpose.

Discussion: For each of the following, decide whether it is most likely that the study described is experimental or correlational and explain why.

  • An educational researcher compares the academic performance of students from the “rich” side of town with that of students from the “poor” side of town.
  • A cognitive psychologist compares the ability of people to recall words that they were instructed to “read” with their ability to recall words that they were instructed to “imagine.”
  • A manager studies the correlation between new employees’ college grade point averages and their first-year performance reports.
  • An automotive engineer installs different stick shifts in a new car prototype, each time asking several people to rate how comfortable the stick shift feels.
  • A food scientist studies the relationship between the temperature inside people’s refrigerators and the amount of bacteria on their food.
  • A social psychologist tells some research participants that they need to hurry over to the next building to complete a study. She tells others that they can take their time. Then she observes whether they stop to help a research assistant who is pretending to be hurt.
  • Kanner, A. D., Coyne, J. C., Schaefer, C., & Lazarus, R. S. (1981). Comparison of two modes of stress measurement: Daily hassles and uplifts versus major life events. Journal of Behavioural Medicine, 4, 1–39.
  • Levine, R. V., & Norenzayan, A. (1999). The pace of life in 31 countries. Journal of Cross-Cultural Psychology, 30, 178–205.
  • Kraut, R. E., & Johnston, R. E. (1979). Social and emotional messages of smiling: An ethological approach. Journal of Personality and Social Psychology, 37, 1539–1553.
  • Pelham, B. W., Carvallo, M., & Jones, J. T. (2005). Implicit egotism. Current Directions in Psychological Science, 14, 106–110.
  • Peterson, C., Seligman, M. E. P., & Vaillant, G. E. (1988). Pessimistic explanatory style is a risk factor for physical illness: A thirty-five year longitudinal study. Journal of Personality and Social Psychology, 55, 23–27.

Naturalistic observation: An approach to data collection that involves observing people’s behaviour in the environment in which it typically occurs.

Coding: A process in which observers make judgments about behaviour by categorizing it according to a clearly defined set of target behaviours.

Archival data: Data that have already been collected for some other purpose.

Content analysis: A family of systematic approaches to measurement using complex archival data.

Research Methods in Psychology - 2nd Canadian Edition Copyright © 2015 by Paul C. Price, Rajiv Jhangiani, & I-Chant A. Chiang is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.


  • Correlational Research Designs: Types, Examples & Methods

busayo.longe

The human mind is a powerful tool that allows you to sift through seemingly unrelated variables and establish a connection with regard to a specific subject at hand. This skill is what comes into play when we talk about correlational research.

Correlational research is something that we do every day; think about how you establish a connection between the doorbell ringing at a particular time and the milkman’s arrival. As such, it is useful to understand the different types of correlational research that are available and, more importantly, how to go about it.

What is Correlational Research?

Correlational research is a type of research method that involves observing two variables in order to establish a statistical relationship between them. The aim of correlational research is to identify variables that are related to the extent that a change in one is accompanied by some change in the other.

This type of research is descriptive, unlike experimental research, which manipulates variables in order to test a hypothesis. For example, correlational research may reveal the statistical relationship between income and relocation; that is, whether people who earn more are more or less likely to relocate.

What are the Types of Correlational Research?

Essentially, there are 3 types of correlational research: positive correlational research, negative correlational research, and zero correlational research. Each of these types is defined by distinct characteristics.

  • Positive Correlational Research

Positive correlational research is a research method involving 2 variables that are statistically related such that an increase or decrease in one variable is accompanied by a like change in the other. An example is when an increase in workers’ remuneration is accompanied by an increase in the prices of goods and services, and vice versa.

  • Negative Correlational Research

Negative correlational research is a research method involving 2 variables that are statistically related in opposite directions, such that an increase in one variable is accompanied by a decrease in the other. An example of a negative correlation is when a rise in the prices of goods and services is accompanied by a decrease in demand, and vice versa.

  • Zero Correlational Research

Zero correlational research is a type of correlational research that involves 2 variables that are not statistically related. In this case, a change in one of the variables is not accompanied by a corresponding or opposite change in the other variable.

Zero correlational research applies to variables with no clear statistical relationship. For example, wealth and patience can be variables under zero correlational research if they turn out to be statistically independent.

Any apparent patterns between variables with zero correlation usually occur by chance and do not reflect a real relationship between them.

Correlational research can also be classified based on data collection methods. Based on these, there are 3 types of correlational research: Naturalistic observation research, survey research and archival research. 

What are the Data Collection Methods in Correlational research? 

Data collection methods in correlational research are the research methodologies adopted by persons carrying out correlational research in order to determine the linear statistical relationship between 2 variables. These data collection methods are used to gather information in correlational research. 

The 3 methods of data collection in correlational research are the naturalistic observation method, the archival data method, and the survey method. Each of these is explained in the subsequent paragraphs.

  • Naturalistic Observation

Naturalistic observation is a correlational research methodology that involves observing people’s behaviors in the natural environment where they occur, over a period of time. It is a type of field research in which the researcher pays close attention to the natural behavior patterns of the subjects under consideration.

This method is extremely demanding, as the researcher must take extra care to ensure that the subjects do not suspect that they are being observed; otherwise, they may deviate from their natural behavior patterns. It is best for all subjects under observation to remain anonymous in order to avoid a breach of privacy.

The major advantage of the naturalistic observation method is that it allows the researcher to observe the subjects in their natural state. However, it is a very expensive and time-consuming process, and the subjects can become aware of being observed at any time and may change their behavior as a result.

  • Archival Data

Archival data is a type of correlational research method that makes use of information about the variables that has already been gathered. Since this method relies on data that has already been collected, it is usually straightforward.

For this method of correlational research, the researcher makes use of earlier studies conducted by other researchers or the historical records of the variables being analyzed. This method helps a researcher to track already determined statistical patterns of the variables or subjects.

This method is less expensive, saves time and provides the researcher with more data to work with. However, it has the problem of data accuracy, as important information may be missing from previous research and the researcher has no control over the data collection process.

  • Survey Method

The survey method is the most common method of correlational research, especially in fields like psychology. It involves sampling participants, ideally at random, from the population of interest and having them fill out a questionnaire centered on the variables of interest.

This method is very flexible, as researchers can gather large amounts of data in very little time. However, it is subject to survey response bias and can also be affected by biased survey questions or under-representation of some groups of respondents.

Examples of Correlational Research

Correlational research examples are numerous and highlight several instances where a correlational study may be carried out in order to determine the statistical behavioral trend with regards to the variables under consideration. Here are 3 case examples of correlational research. 

  • You want to know if wealthy people are less likely to be patient. From your experience, you believe that wealthy people are impatient. However, you want to establish a statistical pattern that proves or disproves your belief. In this case, you can carry out correlational research to identify a trend that links both variables. 
  • You want to know if there’s a correlation between how much people earn and the number of children that they have. You do not believe that people with more spending power have more children than people with less spending power. 

You think that how much people earn hardly determines the number of children that they have. Yet, carrying out correlational research on both variables could reveal any correlational relationship that exists between them. 

  • You believe that domestic violence causes a brain hemorrhage. You cannot carry out an experiment as it would be unethical to deliberately subject people to domestic violence. 

However, you can carry out correlational research to find out if victims of domestic violence suffer brain hemorrhage more than non-victims. 

What are the Characteristics of Correlational Research? 

  • Correlational Research is non-experimental

Correlational research is non-experimental because it does not involve manipulating variables in order to test a hypothesis. In correlational research, the researcher simply observes and measures the natural relationship between 2 variables, without subjecting either of the variables to external conditioning.

  • Correlational Research is Backward-looking

Correlational research doesn’t take the future into consideration, as it only observes and measures the recent historical relationship that exists between 2 variables. In this sense, the statistical pattern resulting from correlational research is backward-looking and can cease to exist at any point going forward.

Correlational research observes and measures historical patterns between 2 variables such as the relationship between high-income earners and tax payment. Correlational research may reveal a positive relationship between the aforementioned variables but this may change at any point in the future. 

  • Correlational Research is Dynamic

Statistical patterns between 2 variables that result from correlational research can change over time. Because the correlation between 2 variables may shift, it should not be treated as fixed data for further research.

For example, the 2 variables can have a negative correlational relationship for a period of time, maybe 5 years. After this time, the correlational relationship between them can become positive; as observed in the relationship between bonds and stocks. 

  • Data resulting from correlational research are not constant and cannot be used as a standard variable for further research. 

What is the Correlation Coefficient? 

A correlation coefficient is an important value in correlational research that indicates whether the relationship between 2 variables is positive, negative or non-existent. It is usually represented by the symbol r and ranges from -1.0 to +1.0.

The strength of a correlation between quantitative variables is typically measured using a statistic called Pearson’s Correlation Coefficient (or Pearson’s r). A perfect positive correlation is indicated by a value of +1.0, a perfect negative correlation is indicated by a value of -1.0, and zero correlation is indicated by a value of 0.0.

It is important to note that a correlation coefficient only reflects the linear relationship between 2 variables; it does not capture non-linear relationships and cannot separate dependent and independent variables. The correlation coefficient helps you to determine the degree of statistical relationship that exists between variables. 
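
The limitation to linear relationships can be illustrated with a small sketch. The data are made up: y is exactly the square of x, so the two variables are perfectly related, yet Pearson's r comes out at zero because the relationship is a symmetric curve and not a line. The helper `pearson_r` is written out here for illustration and is not taken from the article.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: y depends entirely on x, but not linearly.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value ** 2 for value in x]

r_curve = pearson_r(x, y)
print(f"r for y = x squared: {r_curve:.2f}")
```

A coefficient near zero therefore rules out a linear relationship only; plotting the data is a common way to check for other patterns.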

What are the Advantages of Correlational Research?

  • In cases where carrying out experimental research is unethical, correlational research  can be used to determine the relationship between 2 variables. For example, when studying humans, carrying out an experiment can be seen as unsafe or unethical; hence, choosing correlational research would be the best option. 
  • Through correlational research, you can easily determine the statistical relationship between 2 variables.
  • Carrying out correlational research is less time-consuming and less expensive than experimental research. This becomes a strong advantage when working with a minimum of researchers and funding or when keeping the number of variables in a study very low. 
  • Correlational research allows the researcher to carry out shallow data gathering using different methods such as a short survey. A short survey does not require the researcher to personally administer it so this allows the researcher to work with a few people. 

What are the Disadvantages of Correlational Research? 

  • Correlational research is limiting in nature, as a single correlation coefficient describes the statistical relationship between only 2 variables. When more than 2 variables are involved, each pair has to be assessed separately.
  • It does not account for cause and effect between 2 variables as it doesn’t highlight which of the 2 variables is responsible for the statistical pattern that is observed. For example, finding that education correlates positively with vegetarianism doesn’t explain whether being educated leads to becoming a vegetarian or whether vegetarianism leads to more education.
  • Reasons for either can be assumed, but until more research is done, causation can’t be determined. Also, a third, unknown variable might be causing both. For instance, living in the city of Detroit might lead to both education and vegetarianism.
  • Correlational research depends on past statistical patterns to determine the relationship between variables. As such, its data cannot be fully depended on for further research. 
  • In correlational research, the researcher has no control over the variables. Unlike experimental research, correlational research only allows the researcher to observe the variables for connecting statistical patterns without introducing a catalyst. 
  • The information received from correlational research is limited. Correlational research only shows the relationship between variables and does not equate to causation. 
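
The third-variable problem noted in the list above can be illustrated with a small simulation. This sketch goes beyond the article: it uses made-up random data in which a hidden variable z drives both x and y, and it applies the standard first-order partial correlation formula, a technique the article does not mention, to show the x and y correlation shrinking toward zero once z is taken into account. All names and parameters are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson's r for two equal-length sequences of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


random.seed(0)  # fixed seed so the simulation is repeatable
n = 2000
z = [random.gauss(0, 1) for _ in range(n)]   # hidden third variable
x = [zi + random.gauss(0, 1) for zi in z]    # driven by z plus independent noise
y = [zi + random.gauss(0, 1) for zi in z]    # driven by z plus independent noise

r_xy = pearson_r(x, y)
r_controlled = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))
print(f"r(x, y) = {r_xy:.2f}; after controlling for z = {r_controlled:.2f}")
```

In this simulation x never influences y and y never influences x, yet the two are clearly correlated. In real correlational research the third variable is often unknown or unmeasured, so it cannot be controlled for this way.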

What are the Differences between Correlational and Experimental Research?  

  • Methodology

The major difference between correlational research and experimental research is methodology. In correlational research, the researcher looks for a statistical pattern linking 2 naturally-occurring variables while in experimental research, the researcher introduces a catalyst and monitors its effects on the variables. 

  • Observation

In correlational research, the researcher passively observes the phenomena and measures whatever relationship that occurs between them. However, in experimental research, the researcher actively observes phenomena after triggering a change in the behavior of the variables. 

In experimental research, the researcher introduces a catalyst and monitors its effects on the variables, that is, cause and effect. In correlational research, the researcher is not interested in cause and effect as it applies; rather, he or she identifies recurring statistical patterns connecting the variables in research. 

  • Number of Variables

Experimental research caters to an unlimited number of variables. Correlational research, on the other hand, caters to only 2 variables at a time.

  • Experimental research is causative while correlational research is relational.
  • Correlational research is preliminary and almost always precedes experimental research. 
  • Unlike correlational research, experimental research allows the researcher to control the variables.

How to Use Online Forms for Correlational Research

One of the most popular methods of conducting correlational research is by carrying out a survey which can be made easier with the use of an online form. Surveys for correlational research involve generating different questions that revolve around the variables under observation and, allowing respondents to provide answers to these questions. 

Using an online form for your correlational research survey helps you to gather more data in less time. In addition, you can reach more survey respondents than is feasible with printed correlational research survey forms.

In addition, the researcher would be able to swiftly process and analyze all responses in order to objectively establish the statistical pattern that links the variables in the research. Using an online form for correlational research also helps the researcher to minimize the cost incurred during the research period. 

To use an online form for a correlational research survey, you would need to sign up on a data-gathering platform like Formplus . Formplus allows you to create custom forms for correlational research surveys using the Formplus builder. 

You can customize your correlational research survey form by adding background images, new color themes or your company logo to make it appear even more professional. In addition, Formplus also has a survey form template that you can edit for a correlational research study. 

You can create different types of survey questions, including open-ended questions, rating questions, close-ended questions and multiple-answer questions, in the Formplus builder. After creating your correlational research survey, you can share the personalized link with respondents via email or social media.

Formplus also enables you to collect offline responses in your form.

Conclusion 

Correlational research enables researchers to establish the statistical pattern between 2 seemingly interconnected variables; as such, it is often the starting point of further research. It allows you to link 2 variables by observing their behaviors in the most natural state.

Unlike experimental research, correlational research does not identify the causative factor affecting 2 variables, and the patterns it reveals are subject to change. However, it is quicker, easier, less expensive and more convenient than experimental research.

It is important to always keep the aim of your research in mind when choosing the best type of research to adopt. If you need to observe how the variables react to a change that you introduce, then experimental research is the best type to adopt.

It is best to conduct correlational research using an online correlational research survey form, as this makes the data-gathering process more convenient. Formplus is a great online data-gathering platform that you can use to create custom survey forms for correlational research.


Correlation research: what is it and how can you use it?

If you want to find out if a new marketing campaign or product feature is connected to an increase in sales, correlation can help you determine if a relationship exists between those variables and whether there is a positive, negative or neutral impact.

What is correlation in research?

Correlation (often referred to as correlational study, correlation research, bivariate correlation or correlation analysis) is a core step in understanding your data (such as from survey research) or the relationship between variables in your dataset, typically expressed as x1 and x2.

If a correlation exists, one variable is correlated to another in a pairwise fashion.


Measuring correlation

To measure the degree to which any two variables are correlated, we use a correlation coefficient (of which there are many).

The most widely used correlation coefficient is Pearson’s correlation coefficient (Pearson’s r), a statistical value that is always between -1 and 1. Note: outliers can distort a coefficient, so a value that looks statistically significant is not necessarily meaningful or insightful.

Data points are plotted on a scatterplot and the shape of the data informs the researcher of the relationship between variables.

The flow of correlation

  • -1 indicates a perfectly linear negative correlation
  • 0 indicates no linear correlation
  • 1 indicates a perfectly positive linear correlation
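As a rough illustration of the two ends of this scale, the sketch below computes Pearson's r from its textbook definition (covariance divided by the product of the standard deviations). The function name `pearson_r` and the sample numbers are made up for this example and are not taken from any dataset in the article.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Illustrative values only
hours_worked = [10, 20, 30, 40]
money_earned = [150, 300, 450, 600]   # rises in step with hours worked
money_spent = [100, 200, 300, 400]
money_saved = [900, 800, 700, 600]    # falls as spending rises

print(pearson_r(hours_worked, money_earned))  # close to 1 (perfect positive)
print(pearson_r(money_spent, money_saved))    # close to -1 (perfect negative)
```

Real survey data will almost never sit exactly on a line, so values strictly between -1 and 1 are the norm.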

Negative correlation (or negative relationship)

A negative correlation is a relationship between two variables in which an increase in one variable is associated with a decrease in the other. For example, as you spend more money (increase) you save less (decrease).

Positive correlation (or positive relationship)

For positive correlation, both variables either increase or decrease at the same time. Let’s take hours worked versus money earned (assuming no set limit on working hours). As hours worked increases, so too does money earned.

What is a correlation matrix?

Once you’ve calculated your correlation coefficients for different variables, you can build a correlation matrix to display them (or use Stats iQ which can produce one for you). A correlation matrix essentially depicts the correlations between all possible pairs of variables in a table. It’s an easy way to summarise large datasets and identify visual patterns across the relationships you are testing.
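A minimal sketch of how such a matrix can be assembled by hand is shown below. It is not how Stats iQ works internally; the variable names (`ad_spend`, `sales`, `returns`) and their values are hypothetical and exist only to give the table something to show.

```python
import math

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def correlation_matrix(columns):
    """Return {row_name: {col_name: r}} for every pair of variables."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical dataset: one list of observations per variable
data = {
    "ad_spend": [1, 2, 3, 4, 5],
    "sales": [2, 4, 5, 4, 5],
    "returns": [5, 3, 4, 2, 1],
}

matrix = correlation_matrix(data)
print(" " * 10 + "".join(f"{name:>10}" for name in data))
for row_name, row in matrix.items():
    print(f"{row_name:>10}" + "".join(f"{r:>10.2f}" for r in row.values()))
```

The diagonal is always 1 (each variable correlates perfectly with itself) and the table is symmetric, so only one triangle needs to be read.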

Relate capability in Stats iQ  

Relate explores the relationships between variables. When you select two variables and then select Relate, Stats iQ will choose the appropriate statistical test based on the structure of the data, run that test, then translate the results into a simple and clear explanation.

When you select three or more variables, Stats iQ will relate each variable to the one variable that has the key by it, then bring the strongest relationships to the top. You can select dozens of variables at a time, so you can sift through many relationships quickly.

Again, “Descriptive Frequencies” and “Bivariate Correlation” are basic steps that every data analyst should take before they move onto regression.

Relating numbers and number variables

Note, a correlational analysis only provides information about variables at one specific point in time. The results could change if you repeat the study.

Furthermore, whilst a relationship may exist between variables, any change in one isn’t necessarily the cause of the change in the other. This brings us onto a basic rule and famous maxim: “Correlation does not imply causation.”

Correlation and causation

It’s a well-known saying that correlation doesn’t imply causation, but why?

In a correlational study, nothing is held constant, and this lack of control makes it impossible to determine cause and effect from a simple correlation study.

Correlation and causation can exist at the same time, but “causation” is a much higher standard. For example, you find your child standing by a table and there’s milk all over the place. You might conclude that the child spilled it, when in fact the cat did it before you walked into the room.

Causation explicitly applies to time and prior relationships where an action causes an outcome. Put simply: it indicates that one event is the result of another.

Correlation, on the other hand, is simply a reflection of a relationship between two variables — when one changes, so does the other, but it’s not necessarily the cause. The only way to prove or demonstrate a causal relationship is through an appropriately designed and controlled experiment.

As such, there are two basic reasons why correlation doesn’t imply causation:

1. Directionality problem

The directionality problem arises when two variables are correlated and may well be causally related, but the correlation alone cannot tell us which one is the cause and which is the effect. A change in the first variable may drive a change in the second, or the other way round, and the coefficient is the same in both cases. We therefore cannot say with certainty that the change in one of the variables is the cause of the change in the other.

2. Latent variables

A latent variable is a variable that you can’t observe or measure directly, but whose presence you can detect from its effects on other observable variables. Consider the psychological construct of happiness or the idea of customer satisfaction: you can’t directly see these variables, but you can measure them indirectly using observed variables.

An unmeasured third variable can also produce a correlation between two variables that have no direct causal link. For example, cities with more grocery stores also tend to have higher crime rates. However, these two variables are only correlated because each is strongly correlated with a third variable: population size.
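The grocery store example can be sketched with simulated data. Everything below is synthetic: the coefficients, noise levels and seed are arbitrary assumptions chosen so that store counts and crime counts both depend on population and on nothing else. The partial correlation formula used to "control for" population is the standard first-order one.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's r for two equal-length numeric sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic cities (arbitrary units): both outcomes depend only on population
population = [rng.uniform(10, 1000) for _ in range(2000)]
stores = [0.5 * p + rng.gauss(0, 20) for p in population]
crime = [2.0 * p + rng.gauss(0, 80) for p in population]

r_stores_crime = pearson_r(stores, crime)
r_partial = partial_r(
    r_stores_crime,
    pearson_r(stores, population),
    pearson_r(crime, population),
)

print(f"stores vs crime:                 {r_stores_crime:.2f}")
print(f"stores vs crime, population held fixed: {r_partial:.2f}")
```

In this setup the raw correlation is very high while the partial correlation is close to zero, which is what one would expect when the third variable accounts for the whole relationship.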

Measuring latent variables

To measure latent variables, we use observed variables and then mathematically estimate the unseen variables. This involves using advanced statistical techniques like factor analysis, latent class analysis (LCA), structural equation modelling (SEM), and Rasch analysis. These techniques rely on the inter-correlations of variables.

The next step is multiple regression/correlation, then causal or predictive modelling. But more on these methods in another topic. So, why use correlation?

Why use correlation?

Correlation is an essential part of any research study as it helps you to understand the relationships between variables, and therefore form hypotheses as the next step of the process.

The advantages of using correlation in research are:

Results are likely to be more truthful to natural occurrences.

If no variables are influenced, then the variables are existing and interacting together as they would in ‘real life’, so the findings will be a true and accurate reflection of the variables.

It identifies variables with strong relationships

During statistical analysis of the data, correlational research will be able to indicate whether there is a positive or negative relationship, or no correlation at all, between the variables. This can be invaluable for research teams trying to identify the right variables to be concentrating future research on.

Saves time and money

It can be time-consuming and costly to set up experiment conditions to test whether two variables interact with each other in a cause-and-effect way. Correlational research provides a stepping-stone that shows researchers the potential of variables in their natural setting, and may bring to light patterns that might not otherwise have been identified.

You should always use correlation in research, but you cannot always make inferences, because:

There is less external validity

Research findings may not be repeatable or conclusive, because the observations were done in a natural setting where the variables were not isolated and may have been influenced by other factors.

Having a strong correlation does not imply causation

While two variables may be strongly connected, there cannot be a clear assessment of the cause-and-effect to provide a conclusion.

There is little control over the variables

It’s not possible to isolate the variables to confirm that only the two variables are being explored. There is always the possibility of a third variable.

No guarantee of the results not changing

If results are gathered that a researcher wants to replicate, the method of correlational research is backwards-looking, so there is no guarantee that the variable results won’t change in the future.

Use an intelligent statistical tool to streamline the entire process

By using a survey software technology platform to do your correlation analysis and research, you can save time analysing your data yourself, and instead use the tool to conduct start-to-finish correlation analysis across the creation, data collection, analysis and reporting stages.

Qualtrics’ survey software streamlines your data collection methods and correlations, making it easy to access results, measure data trends, and uncover insights without the complexity or need to jump between systems.

What makes Qualtrics so different from other survey providers is that you can consult with trained research professionals, and it includes high-tech statistical software like Qualtrics Stats iQ ™. This can handle complicated analyses using these methods:

  • Regression analysis – This is vital in correlational research as it measures the degree of influence of independent variables on a dependent variable (the relationship between two variables).
  • Analysis of Variance (ANOVA) test – Commonly used with a regression study to find out what effect independent variables have on the dependent variable. It can compare multiple groups simultaneously to see if there is a relationship between them.
  • Conjoint analysis – Asks people to make trade-offs when making decisions, then analyses the results to give the most popular outcome. Helps you understand why people make the complex choices they do.
  • T-Test – Helps you compare whether two data groups have different mean values and allows the user to interpret whether differences are meaningful or merely coincidental.
  • Crosstab analysis – Used in quantitative market research to analyze categorical data – that is, variables that are different and mutually exclusive, and allows you to compare the relationship between two variables in contingency tables.

If you want to learn how the system is set up for conducting and analyzing correlational research, try out a Qualtrics survey software demo to see how it works.


  • Open access
  • Published: 27 May 2024

Patients’ satisfaction with cancer pain treatment at adult oncologic centers in Northern Ethiopia; a multi-center cross-sectional study

  • Molla Amsalu 1 ,
  • Henos Enyew Ashagrie 2 ,
  • Amare Belete Getahun 2 &
  • Yophtahe Woldegerima Berhe   ORCID: orcid.org/0000-0002-0988-7723 2  

BMC Cancer volume 24, Article number: 647 (2024)

Patient satisfaction is an important indicator of the quality of healthcare. Pain is one of the most common symptoms among cancer patients and needs optimal treatment; otherwise, it compromises the quality of life of patients.

To assess the levels and associated factors of satisfaction with cancer pain treatment among adult patients at cancer centers found in Northern Ethiopia in 2023.

After obtaining ethical approval, a multi-center cross-sectional study was conducted at four cancer care centers in northern Ethiopia. The data were collected using an interviewer-administered structured questionnaire that included the Lubeck Medication Satisfaction Questionnaire (LMSQ). The severity of pain was assessed by a numerical rating scale from 0 to 10 with a pain score of 0 = no pain, 1–3 = mild pain, 4–6 = moderate pain, and 7–10 = severe pain. Binary logistic regression analysis was employed, and the strength of association was described in an adjusted odds ratio with a 95% confidence interval.

A total of 397 cancer patients participated in this study, with a response rate of 98.3%. We found that 70.3% of patients were satisfied with their cancer pain treatment. Being married (AOR = 5.6, CI = 2.6–12, P  < 0.001) and being single (never married) (AOR = 3.5, CI = 1.3–9.7, P  = 0.017) as compared to divorced, receiving adequate pain management (AOR = 2.4, CI = 1.1–5.3, P  = 0.03) as compared to those who didn’t receive it, and having lower pain severity (AOR = 2.6, CI = 1.5–4.8, P  < 0.001) as compared to those who had higher level of pain severity were found to be associated with satisfaction with cancer pain treatment.

The majority of cancer patients were satisfied with cancer pain treatment. Being married, being single (never married), lower pain severity, and receiving adequate pain management were found to be associated with satisfaction with cancer pain treatment. It would be better to enhance the use of multimodal analgesia in combination with strong opioids to ensure adequate pain management and lower pain severity scores.


Introduction

Pain is defined as an unpleasant sensory and emotional experience associated with, or resembling that associated with, actual or potential tissue damage [ 1 ]. The prevalence of pain in cancer patients is 44.5–66%, with the prevalence of moderate to severe pain ranging from 30 to 38%, and it can persist in 5–10% of cancer survivors [ 2 ]. Using the World Health Organization’s (WHO) cancer pain management guidelines can effectively reduce cancer-related pain in 70–90% of patients [ 3 , 4 ]. Compared to traditional pain states, the mechanism of cancer-related pain is less understood; however, cancer-specific mechanisms, inflammatory, and neuropathic processes have been identified [ 5 ]. Uncontrolled pain can negatively affect patients’ daily lives, emotional health, social relationships, and adherence to cancer treatment [ 6 ]. Patients with moderate to severe pain have a higher fatigue score, a loss of appetite, and financial difficulties [ 7 ]. Patients fear the pain caused by cancer more than dying from the disease, since pain affects both the physical and mental aspects of their lives [ 8 ]. A meta-analysis of 30 studies found pain to be a significant prognostic factor for short-term survival in cancer patients [ 9 ]. Many cancer patients have a very poor prognosis. However, adequate pain treatment prevents suffering and improves their quality of life. Although the WHO suggested non-opioids for mild pain, weak opioids for moderate pain, and strong opioids for severe pain, pain treatment is not yet adequate in one-third of cancer patients [ 10 ].

Patient satisfaction with pain management is a valuable measure of treatment effectiveness and outcome. It could be used to evaluate the quality of care [ 11 , 12 , 13 ]. Patient satisfaction affects treatment compliance and adherence [ 12 ]. Studies have reported that 60-76% of patients were satisfied with pain treatment, and a variety of factors were found associated with levels of satisfaction [ 3 , 14 , 15 , 16 ]. Studies conducted in Ethiopia reported the prevalence of pain ranging from 59.9 to 93.4% [ 17 , 18 ]. These studies indicate that cancer pain is inadequately treated. Assessment of pain treatment satisfaction can help identify appropriate treatment modalities and further its effectiveness. We conducted this study since there was limited research-based evidence on cancer pain management in low-income countries like Ethiopia. Our research questions were: how satisfied are adult cancer patients with pain treatment, and what are the factors associated with the satisfaction of adult cancer patients with pain treatment?

Methodology

Study design, area, period, and population

A multi-center cross-sectional study was conducted at four cancer care centers in Amhara National Regional State, Northern Ethiopia from March to May 2023. Those cancer care centers were found in the University of Gondar Comprehensive Specialized Hospital (UoGCSH), Felege-Hiwot Comprehensive Specialized Hospital (FHCSH), Tibebe-Ghion Comprehensive Specialised Hospital (TGCSH) and Dessie Comprehensive Specialized Hospital (DCSH). We selected these centers as they were the only institutions providing oncologic care in the region during the study period.

The UoGCSH had 28 beds in its adult oncology ward and serves 450 cancer patients every month. Three specialist oncologists and 12 nurses provide services in the ward. The FHCSH had 22 beds and provides services for 325 cancer patients every month. Two specialist oncologists, two oncologic nurses, and 7 comprehensive nurses provide services. The TGCSH had eight beds and serves 300 cancer patients every month. There were three specialist oncologists and four oncologic nurses at the care center. The cancer care center at DCSH had 10 beds. It serves 350 cancer patients every month. There was one specialist oncologist, three oncologic nurses, and three comprehensive nurses.

All cancer patients who attended those cancer care centers were the source population, and adult (18+) cancer patients who were prescribed pain treatment for a minimum of one month were the study population. Unconscious patients, patients with psychiatric problems, patients with advanced cancer who were unable to cooperate, and patients with oncologic emergencies were excluded from this study.

Variables and operational definitions

The outcome variable was patient satisfaction with cancer pain treatment, which was measured by the Lubeck Medication Satisfaction Questionnaire. The independent variables were sociodemographic (age, sex, marital status, monthly income, and level of education), clinical (site of tumor, stage of cancer, metastasis), cancer treatment (surgery, chemotherapy, radiotherapy), level of pain, and analgesia (type of analgesia, severity of pain, adequacy of pain treatment, adjuvant analgesic).

  • Patient satisfaction: perceptions of the patients regarding the outcome of pain management and the extent to which it meets their needs and expectations. It was measured by a 4-point Likert scale (1 = strongly disagree, 2 = disagree, 3 = agree, 4 = strongly agree) using the LMSQ which has 18 items within 6 subscales that have 3 items in each (effectivity, practicality, side-effects, daily life, healthcare providers, and overall satisfaction) [ 19 ]. Final categorization was done by dichotomizing into satisfied and dissatisfied by using the demarcation threshold formula:

\(\left(\frac{\text{Total highest score} - \text{Total lowest score}}{2}\right) + \text{Total lowest score}\) [ 20 ]. The highest patient satisfaction score was 70 and the lowest satisfaction score was 26. A score < 48 was classified as dissatisfied, and a score ≥ 48 was classified as satisfied.
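The arithmetic behind the cut-off of 48 can be checked with a short sketch. The function names are illustrative only; the scores 70 and 26 are the highest and lowest totals reported above.

```python
def demarcation_threshold(highest, lowest):
    """Midpoint between the lowest and highest observed total scores."""
    return (highest - lowest) / 2 + lowest

def classify(total_score, threshold):
    """Scores at or above the threshold count as satisfied."""
    return "satisfied" if total_score >= threshold else "dissatisfied"

threshold = demarcation_threshold(highest=70, lowest=26)
print(threshold)                 # (70 - 26) / 2 + 26 = 48.0
print(classify(47, threshold))   # dissatisfied
print(classify(48, threshold))   # satisfied
```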

The Numeric Rating Scale (NRS) is a validated pain intensity assessment tool that helps patients express their subjective feeling of pain as a numerical value between 0 and 10, in which 0 = no pain, 1–3 = mild pain, 4–6 = moderate pain, 7–10 = severe pain [ 21 ].

The adequacy of cancer pain treatment was measured by calculating the Pain Management Index (PMI) according to the recommendations of the WHO pain management guideline [ 22 ]. The PMI was calculated by considering the most potent analgesic agent prescribed and the worst pain reported in the last 24 h [ 23 ]. The prescribed analgesics were scored as follows: 0 = no analgesia, 1 = non-opioid analgesia, 2 = weak opioids, and 3 = strong opioids. The reported NRS value was converted to a pain level (0 = no pain, 1 = mild pain, 2 = moderate pain, 3 = severe pain), and the PMI was calculated by subtracting this pain level from the score of the most potent analgesic administered. The calculated values of PMI ranged from − 3 (no analgesia therapy for a patient with severe pain) to + 3 (strong opioid for a patient with no pain). Patients with a positive PMI value were considered to be receiving adequate analgesia, whereas those with a negative PMI value were considered to be receiving inadequate analgesia.
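A minimal sketch of the PMI calculation follows, assuming the NRS value is first collapsed into the four pain levels (0 = none, 1 = mild, 2 = moderate, 3 = severe) stated in the data collection section. The function and dictionary names are illustrative, and the code is not the authors' implementation.

```python
# Analgesic scores as described in the text
ANALGESIC_SCORE = {
    "none": 0,
    "non-opioid": 1,
    "weak opioid": 2,
    "strong opioid": 3,
}

def pain_level(nrs):
    """Collapse a 0-10 NRS value into 0 (none), 1 (mild), 2 (moderate), 3 (severe)."""
    if not 0 <= nrs <= 10:
        raise ValueError("NRS must be between 0 and 10")
    if nrs == 0:
        return 0
    if nrs <= 3:
        return 1
    if nrs <= 6:
        return 2
    return 3

def pain_management_index(most_potent_analgesic, worst_nrs):
    """Analgesic score minus pain level; ranges from -3 to +3."""
    return ANALGESIC_SCORE[most_potent_analgesic] - pain_level(worst_nrs)

def is_inadequate(pmi):
    """A negative PMI indicates inadequate analgesia."""
    return pmi < 0

print(pain_management_index("none", 9))           # -3: severe pain, no analgesia
print(pain_management_index("strong opioid", 0))  # 3: strong opioid, no pain
print(is_inadequate(pain_management_index("non-opioid", 5)))  # True
```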

Sample size determination and sampling technique

A single population proportion formula was used to determine the sample size by considering 50% satisfaction with cancer pain treatment and a 5% margin of error at a 95% confidence interval (CI). A non-probability (consecutive) sampling technique was employed to attain the sample size within the two-month data collection period. After proportional allocation to each center and the addition of 5% for non-response, a total of 404 study participants were included in the study: 128 from the University of Gondar Comprehensive Specialized Hospital, 99 from Dessie Comprehensive Specialized Hospital, 92 from Felege Hiwot Comprehensive Specialized Hospital, and 85 from Tibebe Ghion Comprehensive Specialized Hospital.
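The reported figures can be reproduced with the usual single population proportion formula, n = z²·p·(1 − p)/d². The sketch below assumes z = 1.96 for the 95% level and assumes that the proportional allocation was weighted by the monthly patient volumes given for each center in the study area description; the paper does not spell out either detail, so treat both as a plausible reading that happens to match the published numbers.

```python
import math

def single_proportion_sample_size(p, margin, z=1.96):
    """n = z^2 * p * (1 - p) / d^2, before any adjustment."""
    return z ** 2 * p * (1 - p) / margin ** 2

n_base = single_proportion_sample_size(p=0.5, margin=0.05)  # about 384.16
n_total = math.ceil(n_base * 1.05)                          # add 5% non-response

# Monthly patient volumes per center (assumed allocation weights)
monthly_patients = {"UoGCSH": 450, "FHCSH": 325, "TGCSH": 300, "DCSH": 350}
total_monthly = sum(monthly_patients.values())

allocation = {
    center: round(n_total * volume / total_monthly)
    for center, volume in monthly_patients.items()
}

print(n_total)     # 404
print(allocation)  # {'UoGCSH': 128, 'FHCSH': 92, 'TGCSH': 85, 'DCSH': 99}
```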

Data collection, processing, and analysis

Ethical approval was obtained from the Ethical Review Committee of the School of Medicine at the University of Gondar (Reference number: CMHS/SM/06/01/4097/2015). Data were collected using an interviewer-administered structured questionnaire and chart review during outpatient and inpatient hospital visits by four trained data collectors (one for every center). Written informed consent was obtained from each participant after detailed explanations about the study. Informed consent with a fingerprint signature was obtained from patients who could not read or write after detailed explanations by the data collectors as approved by the Ethical Review Committee of the School of Medicine, at the University of Gondar.

Questions to assess the severity of pain and pain relief were taken from the American Pain Society patient outcome questionnaire [ 24 ]. Patients were asked to report the worst and least pain in the past 24 h and the current pain by using a numeric rating scale from 0 to 10, with a pain score of 0 = no pain, 1–3 = mild pain, 4–6 = moderate pain, 7–10 = severe pain.

The Pain Management Index (PMI), based on WHO guidelines, was used to quantify pain management by measuring the adequacy of cancer pain treatment [ 25 ]. The following scores were given: 0 = no analgesia, 1 = non-opioid analgesia, 2 = weak opioid, 3 = strong opioid. The level of pain was defined as 0 for no pain, 1 for mild pain, 2 for moderate pain, and 3 for severe pain. The Pain Management Index was calculated by subtracting the self-reported pain level from the score of the analgesia administered, and ranges from − 3 (no analgesic therapy for a patient with severe pain) to + 3 (strong opioid for a patient with no pain). Patients with negative PMI scores received inadequate analgesia.

Pain treatment satisfaction was measured by the Lübeck Medication Satisfaction Questionnaire (LMSQ), which consists of 18 items [ 19 ]. The LMSQ has six subclasses, each consisting of three equally weighted items with a similar context. The subclasses are satisfaction with the effectiveness of pain medication, satisfaction with the practicality or form of pain medication, satisfaction with the side effect profile of pain medication, satisfaction with daily life after receiving pain treatment, satisfaction with healthcare providers, and overall satisfaction. Satisfaction was expressed on a four-point Likert scale (4 = Strongly Agree, 3 = Agree, 2 = Disagree, 1 = Strongly Disagree). The side effect subclass was phrased negatively, marked with an asterisk, and reverse-scored in STATA before data analysis.
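Reverse scoring on a 1 to 4 scale simply mirrors each response around the midpoint (1 becomes 4, 2 becomes 3, and so on). The authors did this in STATA; the Python sketch below only illustrates the idea, and the three responses are hypothetical.

```python
def reverse_score(response, low=1, high=4):
    """Mirror a Likert response so that disagreement scores high."""
    if not low <= response <= high:
        raise ValueError("response outside the scale")
    return low + high - response

# Hypothetical answers to the three negatively phrased side-effect items
side_effect_responses = [1, 2, 4]
reversed_responses = [reverse_score(r) for r in side_effect_responses]
subscale_total = sum(reversed_responses)

print(reversed_responses)  # [4, 3, 1]
print(subscale_total)      # 8
```

After reversal, a higher subscale total means higher satisfaction for every subclass, so the six subclass totals can be summed into one overall score.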

Data were collected with an interviewer-administered questionnaire. The reliability of the questionnaire was assessed in a pretest with 40 participants, and the reliability coefficient (Cronbach’s alpha) of the questionnaire was 0.912. The collected data were checked for completeness, accuracy, and clarity by the investigators. The cleaned and coded data were entered in Epi-data software version 4.6 and exported to STATA version 17. The Shapiro-Wilk test, variance inflation factor, and Hosmer-Lemeshow test were used to assess distribution, multicollinearity, and model fitness, respectively. Descriptive, Chi-square and binary logistic regression analyses were performed to investigate the associations between the independent and dependent variables. The independent variables with a p-value < 0.2 in the bivariable binary logistic regression were fitted to the final multivariable binary logistic regression analysis. Variables with p-value < 0.05 in the final analysis were considered to have a statistically significant association. The strength of associations was described in adjusted odds ratio (AOR) at a 95% confidence interval.
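For readers unfamiliar with Cronbach's alpha, the sketch below applies the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The responses are hypothetical and far fewer than the 18 LMSQ items; the authors computed their coefficient in statistical software, not with this code.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, same respondents in each."""
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent across all items
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

# Hypothetical responses of five people to three Likert items
hypothetical_items = [
    [4, 3, 2, 4, 1],
    [4, 3, 3, 4, 2],
    [3, 3, 2, 4, 1],
]
print(round(cronbach_alpha(hypothetical_items), 3))
```

Values closer to 1 indicate that the items vary together, which is usually read as higher internal consistency.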

Sociodemographic and clinical characteristics

A total of 397 patients were involved in this study (response rate of 98.3%). Of the participants, 224 (56.4%) were female, and over half were from rural areas ( n = 210, 52.9%). The median (IQR) age was 48 (38–59) years [Table 1 ]. The most common type of cancer was gastrointestinal cancer ( n = 114, 28.7%). Most of the study participants, 213 (63.7%), were diagnosed with stage II to III cancer. The majority of the participants were taking chemotherapy alone ( n = 292, 73.6%) [Table 2 ]. Over 90% of patients reported pain; 42.3% reported mild pain, 39.8% reported moderate pain, and 10.1% reported severe pain. Pain treatment adequacy was assessed by self-reports from study participants following pain management guidelines, and 17.1% of patients reported having inadequate pain treatment. The largest group of patients, 132 (33.3%), were prescribed combinations of non-opioid and weak opioid analgesics for cancer pain treatment. Only 34 (8.6%) cancer patients used strong opioids, either alone or in combination with non-opioid analgesics.

Patients’ satisfaction with cancer pain treatment and correlation among the subscales

Most participants strongly agreed with item LMSQ18 in the “overall satisfaction” subscale ( n = 243, 61.2%) and strongly disagreed with item LMSQ2 in the “side-effect” subscale ( n = 206, 51.9%) [Table 3 ]. The highest satisfaction score was observed in the side-effect subscale, with a median (IQR) of 10 (9–11) [Table 4 ].

Two hundred and seventy-nine (70.3%) cancer patients were found to be satisfied with cancer pain treatment (CI = 65.6–74.6%). The highest satisfaction rate was observed in the “side-effects” subscale, in which 343 (86.4%) patients were satisfied [Fig. 1 ]. A Spearman’s correlation test revealed that there were correlations among the subscales of the LMSQ, and the strongest positive correlation was observed between the effectivity and healthcare workers subscales (r_s = 0.7, p < 0.0001). The correlation among the subscales is illustrated in a heatmap [Fig. 2 ].
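Spearman's coefficient is Pearson's r computed on ranks rather than raw scores, which suits ordinal Likert totals. The sketch below shows that definition with average ranks for ties; the two score lists are hypothetical subscale totals, not study data, and the function names are illustrative.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical subscale totals (3 items x 4-point scale gives 3..12)
effectivity = [12, 9, 7, 10, 5, 9]
healthcare_workers = [11, 9, 8, 12, 6, 7]
print(round(spearman_rs(effectivity, healthcare_workers), 2))
```

Repeating this for every pair of the six subscales gives the kind of matrix that a heatmap displays.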

Figure 1. Patient satisfaction with cancer pain treatment for each LMSQ subclass, n = 397

Figure 2. A heatmap showing the Spearman correlation between the subclasses of pain treatment satisfaction, n = 397

Factors associated with patient satisfaction with cancer pain treatment

In the bivariable binary logistic regression analysis, marital status, stage of cancer, types of cancer treatment, severity of pain in the last 24 h, current pain severity, types of analgesics, and pain management index met the threshold of P-value < 0.2 for inclusion in the final multivariable binary logistic regression analysis. In the final analysis, marital status, current pain severity, and pain management index were significantly associated with patient satisfaction (P-value < 0.05). Married and single cancer patients had higher odds of being satisfied with cancer pain treatment compared to divorced patients (AOR = 5.6, CI = 2.6–12.0, P < 0.001 and AOR = 3.5, CI = 1.3–9.7, P = 0.017, respectively). The odds of being satisfied with cancer pain treatment among patients who received adequate pain management were more than two times greater than those who received inadequate pain management (AOR = 2.4, CI = 1.1–5.3, P = 0.03). Patients who reported a lesser severity of current pain were nearly three times more likely to be satisfied with cancer pain treatment (AOR = 2.6, CI = 1.5–4.8, P < 0.001) [Table 5 ].

The objective of the present study was to assess patients’ satisfaction with cancer pain treatment at adult oncologic centers. Our study revealed that most cancer patients (70.3%) were satisfied with cancer pain treatment. This is consistent with studies done by Kaggwa et al. and Mazzotta et al. [ 16 , 26 ]. However, it is higher than the satisfaction rates of 33.0% [ 27 ] and 47.7% [ 28 ] reported in other studies. The differences might be explained by the use of different pain and satisfaction assessment tools, the greater inclusion (about 70%) of patients with advanced stages of cancer in those studies, the duration of cancer pain treatment, and the adequacy of pain management. In the current study, only 19.6% of patients had been diagnosed with stage IV cancer, patients had to have received pain treatment for at least one month, and over 80% of patients received adequate pain management according to the PMI. However, some studies have reported higher rates of satisfaction with cancer pain treatment [ 15 , 29 ]. The possible reason for the discrepancy might be the greater (over 40%) use of strong opioid analgesics in the previous studies. Strong opioids were prescribed for only 8.6% of patients in our study. Due to its complex pathophysiology, cancer pain involves multiple pain pathways. Hence, multimodal analgesia in combination with strong opioids is vital in cancer pain management [ 30 ]. Furthermore, the use of epidural analgesia could be another reason for higher rates of satisfaction [ 29 ].

Regarding satisfaction with subscales of the LMSQ, a previous study reported that about 80% of patients were satisfied with the information provided by the healthcare providers [ 27 ]. In our study, 67.8% of patients were satisfied with the education provided by healthcare providers about their disease and treatment. In contrast, a higher proportion of participants were satisfied with information provision in a study conducted by Kharel et al. [ 29 ]. Furthermore, we observed the lowest satisfaction rate in the daily life subscale. About 48% of cancer patients were not satisfied with their daily lives after receiving analgesic treatment for cancer pain.

Married and single (never married) cancer patients were found to have higher odds of being satisfied with cancer pain treatment as compared to divorced cancer patients. These findings could be explained by the presence of better social support from family or loved ones. Better social support can enhance positive coping mechanisms, increase a sense of well-being, and decrease anxiety and depression. It also improves a sense of societal vitality and results in higher patient satisfaction [ 31 , 32 ].

Patients who had a lower pain score were more likely to be satisfied than those who reported a higher pain score, and this is supported by multiple previous studies [ 16 , 26 , 27 , 29 , 33 , 34 ]. This could be explained by the negative impacts of pain on physical function, sleep, mood, and wellbeing [ 35 ]. Moreover, higher pain severity scores could increase financial expenses because of unnecessary or avoidable emergency department visits, which in turn leads to dissatisfaction [ 23 ]. On the contrary, there are studies that state pain severity does not affect patients’ satisfaction [ 36 , 37 ].

Positive PMI scores were significantly associated with cancer pain treatment satisfaction. Patients who received adequate pain management were highly likely to be satisfied with cancer pain treatment. This finding is similar to that of a study done in Taiwan [ 38 ]. However, a study conducted by Kaggwa et al. found no association between PMI scores and cancer pain satisfaction [ 16 ].

Satisfaction with healthcare workers and effectivity of analgesics

This study showed that there was a moderately positive correlation between satisfaction with healthcare workers and satisfaction with patients’ perceived effectiveness of analgesics. This might be explained by a positive relationship between healthcare professionals and patients receiving cancer pain treatment. Healthcare providers who provide health education regarding the effectiveness of analgesics may improve patients’ adherence to the prescribed analgesic agent and improve patients’ perceived satisfaction with the effectiveness of analgesics. A systematic review showed that the hope and positivity of healthcare professionals were important for patients to cope with cancer and increase satisfaction with care [ 39 ]. Increased patient satisfaction with care provided by healthcare workers may change attitude of patients who accepted cancer pain as God’s wisdom or punishment and create a positive attitude toward the effectiveness of analgesics [ 40 ]. Another study supported this finding and stated that healthcare providers who deliver health education regarding the prevention of drug addiction, side effects of analgesics, timing, and dosage of analgesics improve patient attitude and cancer pain treatment [ 41 ].

Correlation of each subclass of cancer pain treatment satisfaction

A Spearman correlation was run to assess the correlation between the subclasses of the LMSQ using the total sample. There was a strong positive correlation (r_s = 0.5–0.64) between most LMSQ subclasses at p < 0.01.
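
As a rough illustration of the statistic reported above, the following sketch computes Spearman's rank correlation in plain Python: each variable is converted to ranks (ties share the average rank) and Pearson's correlation is then taken on the ranks. The two score lists are made-up values, not the study's data, and the function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical subscale scores for six patients (illustrative only).
effectiveness = [3, 4, 2, 5, 4, 1]
healthcare_workers = [2, 4, 3, 5, 5, 1]
rho = spearman(effectiveness, healthcare_workers)
print(round(rho, 2))
```

Because the method works on ranks rather than raw values, it suits ordinal questionnaire scores such as satisfaction subscales. In practice a statistics package would also report the p-value, which this sketch omits.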

A cross-sectional study stated that the effectiveness of analgesics, the efficacy of medication, and patient–healthcare provider communication were associated with patient satisfaction [42]. In this study, 58.2% of patients were satisfied with the practicability of analgesic medications. Comparably, a cross-sectional study stated that patients who were prescribed convenient, fast-acting medications were more satisfied [43]. Another study stated that 100% of patients who received sufficient information on analgesic treatment and 97.9% of patients who received sufficient information about the side effects of analgesic treatment were satisfied with cancer pain management [44]. Patients who were satisfied with their pain levels reported statistically significantly lower mean pain scores (2.26 ± 1.70) than those who were not satisfied (4.68 ± 2.07) or not sure (4.21 ± 2.21) [27]. This may be explained by the impact of pain on daily activity: patients who report a lower average pain score may experience less impact of pain on physical activity than those who report a higher mean pain score. Another study also supports this evidence and states that patients who reported a severe pain score and lower quality of life had lower satisfaction with the treatment received [45].

As a secondary outcome, only 16% of patients were diagnosed with stage I cancer. This finding could indirectly indicate delays in diagnosing cancer at an early stage. Further studies may be required to substantiate this finding.

In this study, baseline pain before analgesic treatment was not assessed or documented. Because the study was cross-sectional, we could not draw cause-and-effect conclusions. Since oncologic pain treatment satisfaction was measured with self-reported questions, the answers might not be fully reliable. The expectations and opinions of the interviewer might also have affected the results. These are potential limitations of the study.

Conclusions

Although most cancer patients reported moderate to severe pain, the rate of satisfaction with cancer pain treatment was high. We recommend that hospitals, healthcare professionals, and administrators take measures to enhance the use of multimodal analgesia in combination with strong opioids to ensure adequate pain management, lower pain severity scores, and better daily life. We also urge the arrangement of better social support mechanisms for cancer patients, the improvement of information provision, and the deployment of professionals trained in pain management at cancer care centres.

Data availability

Data and materials used in this study are available and can be presented by the corresponding author upon reasonable request.

Abbreviations

Adjusted Odds Ratio

Crude Odds Ratio

Confidence Interval

Dessie Comprehensive Specialized Hospital

Felege-Hiwot Comprehensive Specialized Hospital

Inter-quartile Range

Lubeck Medication Satisfaction Questionnaire

Numerical Rating Scale

Pain Management Index

Standard Deviation

Tibebe-Ghion Comprehensive Specialized Hospital

University of Gondar Comprehensive Specialized Hospital

World Health Organization

Raja SN, Carr DB, Cohen M, Finnerup NB, Flor H, Gibson S, et al. The revised International Association for the study of Pain definition of pain: concepts, challenges, and compromises. Pain. 2020;161(9):1976–82.

Brown MR, Ramirez JD, Farquhar-Smith P. Pain in cancer survivors. Br J pain. 2014;8(4):139–53.

Hochstenbach LM, Joosten EA, Tjan-Heijnen VC, Janssen DJ. Update on Prevalence of Pain in Patients With Cancer: Systematic Review and Meta-Analysis. Journal of pain and symptom management. 2016;51(6):1070-90.e9.

Snijders RAH, Brom L, Theunissen M, van den Beuken-van Everdingen MHJ. Update on Prevalence of Pain in patients with Cancer 2022: a systematic literature review and Meta-analysis. Cancers. 2023;15(3).

Falk S, Dickenson AH. Pain and nociception: mechanisms of cancer-induced bone pain. J Clin Oncol. 2014;32(16):1647–54.

Gibson S, McConigley R. Unplanned oncology admissions within 14 days of non-surgical discharge: a retrospective study. Support Care Cancer. 2016;24:311–7.

Oliveira KG, von Zeidler SV, Podestá JRV, Sena A, Souza ED, Lenzi J, et al. Influence of pain severity on the quality of life in patients with head and neck cancer before antineoplastic therapy. BMC Cancer. 2014;14(1):39.

Smith MD, Meredith PJ, Chua SY. The experience of persistent pain and quality of life among women following treatment for breast cancer: an attachment perspective. Psycho-oncology. 2018;27(10):2442–9.

Zylla D, Steele G, Gupta P. A systematic review of the impact of pain on overall survival in patients with cancer. Support Care Cancer. 2017;25(5):1687–98.

Greco MT, Roberto A, Corli O, Deandrea S, Bandieri E, Cavuto S, et al. Quality of cancer pain management: an update of a systematic review of undertreatment of patients with cancer. J Clin Oncol. 2014;32(36):4149–54.

Baker TA, Krok-Schoen JL, O’Connor ML, Brooks AK. The influence of pain severity and interference on satisfaction with pain management among middle-aged and older adults. Pain Research and Management. 2016;2016.

Baker TA, O’Connor ML, Roker R, Krok JL. Satisfaction with pain treatment in older cancer patients: identifying variants of discrimination, trust, communication, and self-efficacy. J Hospice Palliat Nursing: JHPN: Official J Hospice Palliat Nurses Association. 2013;15(8).

Naidu A. Factors affecting patient satisfaction and healthcare quality. Int J Health care Qual Assur. 2009.

Davies A, Zeppetella G, Andersen S, Damkier A, Vejlgaard T, Nauck F, et al. Multi-centre European study of breakthrough cancer pain: pain characteristics and patient perceptions of current and potential management strategies. Eur J Pain. 2011;15(7):756–63.

Thinh DHQ, Sriraj W, Mansor M, Tan KH, Irawan C, Kurnianda J et al. Patient and physician satisfaction with analgesic treatment: findings from the analgesic treatment for cancer pain in Southeast Asia (ACE) study. Pain Research and Management. 2018;2018.

Kaggwa AT, Kituyi PW, Muteti EN, Ayumba RB. Cancer-related Bone Pain: patients’ satisfaction with analgesic Pain Control. Annals Afr Surg. 2022;19(3):144–52.

Adugna DG, Ayelign AA, Woldie HF, Aragie H, Tafesse E, Melese EB et al. Prevalence and associated factors of cancer pain among adult cancer patients evaluated at the Oncology unit in the University of Gondar Comprehensive Specialized Hospital, Northwest Ethiopia. Front Pain Res.3:231.

Tuem KB, Gebremeskel L, Hiluf K, Arko K, Hailu HG. Adequacy of cancer-related pain treatments and factors affecting proper management in Ayder Comprehensive Specialized Hospital, Mekelle, Ethiopia. Journal of Oncology. 2020;2020.

Matrisch L, Rau Y, Karsten H, Graßhoff H, Riemekasten G. The Lübeck medication satisfaction Questionnaire—A Novel Measurement Tool for Therapy satisfaction. J Personalized Med. 2023;13(3):505.

Bayable SD, Ahmed SA, Lema GF, Yaregal Melesse D. Assessment of Maternal Satisfaction and Associated Factors among Parturients Who Underwent Cesarean Delivery under Spinal Anesthesia at University of Gondar Comprehensive Specialized Hospital, Northwest Ethiopia, 2019. Anesthesiology research and practice. 2020;2020:8697651.

Breivik H, Borchgrevink P-C, Allen S-M, Rosseland L-A, Romundstad L, Breivik Hals E, et al. Assessment of pain. BJA: Br J Anaesth. 2008;101(1):17–24.

Tegegn HG, Gebreyohannes EA. Cancer Pain Management and Pain Interference with Daily Functioning among Cancer patients in Gondar University Hospital. Pain Res Manage. 2017;2017:5698640.

Shen W-C, Chen J-S, Shao Y-Y, Lee K-D, Chiou T-J, Sung Y-C, et al. Impact of undertreatment of cancer pain with analgesic drugs on patient outcomes: a nationwide survey of outpatient cancer patient care in Taiwan. J Pain Symptom Manag. 2017;54(1):55–65. e1.

Gordon DB, Polomano RC, Pellino TA, Turk DC, McCracken LM, Sherwood G, et al. Revised American Pain Society Patient Outcome Questionnaire (APS-POQ-R) for quality improvement of pain management in hospitalized adults: preliminary psychometric evaluation. J Pain. 2010;11(11):1172–86.

Thronæs M, Balstad TR, Brunelli C, Løhre ET, Klepstad P, Vagnildhaug OM, et al. Pain management index (PMI)—does it reflect cancer patients’ wish for focus on pain? Support Care Cancer. 2020;28:1675–84.

Mazzotta M, Filetti M, Piras M, Mercadante S, Marchetti P, Giusti R. Patients’ satisfaction with breakthrough cancer pain therapy: A secondary analysis of IOPS-MS study. Cancer Manage Res. 2022:1237–45.

Golas M, Park CG, Wilkie DJ. Patient satisfaction with Pain Level in patients with Cancer. Pain Manage Nursing: Official J Am Soc Pain Manage Nurses. 2016;17(3):218–25.

Tang ST, Tang W-R, Liu T-W, Lin C-P, Chen J-S. What really matters in pain management for terminally ill cancer patients in Taiwan. J Palliat Care. 2010;26(3):151–8.

Kharel S, Adhikari I, Shrestha K. Satisfaction on Pain Management among Cancer patient in selected Cancer Care Center Bhaktapur Nepal. Int J Med Sci Clin Res Stud. 2023;3(4):597–603.

Breivik H, Eisenberg E, O’Brien T. The individual and societal burden of chronic pain in Europe: the case for strategic prioritisation and action to improve knowledge and availability of appropriate care. BMC Public Health. 2013;13:1–14.

Gonzalez-Saenz de Tejada M, Bilbao A, Baré M, Briones E, Sarasqueta C, Quintana J, et al. Association between social support, functional status, and change in health‐related quality of life and changes in anxiety and depression in colorectal cancer patients. Psycho‐oncology. 2017;26(9):1263–9.

Yoo H, Shin DW, Jeong A, Kim SY, Yang H-k, Kim JS, et al. Perceived social support and its impact on depression and health-related quality of life: a comparison between cancer patients and general population. Jpn J Clin Oncol. 2017;47(8):728–34.

Hanna MN, González-Fernández M, Barrett AD, Williams KA, Pronovost P. Does patient perception of pain control affect patient satisfaction across surgical units in a tertiary teaching hospital? Am J Med Qual. 2012;27(5):411–6.

Naveh P. Pain severity, satisfaction with pain management, and patient-related barriers to pain management in patients with cancer in Israel. Number 4/July 2011. 2011;38(4):E305–13.

Black B, Herr K, Fine P, Sanders S, Tang X, Bergen-Jackson K, et al. The relationships among pain, nonpain symptoms, and quality of life measures in older adults with cancer receiving hospice care. Pain Med. 2011;12(6):880–9.

Kelly A-M. Patient satisfaction with pain management does not correlate with initial or discharge VAS pain score, verbal pain rating at discharge, or change in VAS score in the emergency department. J Emerg Med. 2000;19(2):113–6.

Lin J, Hsieh RK, Chen JS, Lee KD, Rau KM, Shao YY, et al. Satisfaction with pain management and impact of pain on quality of life in cancer patients. Asia-Pac J Clin Oncol. 2020;16(2):e91–8.

Su W-C, Chuang C-H, Chen F-M, Tsai H-L, Huang C-W, Chang T-K, et al. Effects of Good Pain Management (GPM) ward program on patterns of care and pain control in patients with cancer pain in Taiwan. Support Care Cancer. 2021;29(4):1903–11.

Prip A, Møller KA, Nielsen DL, Jarden M, Olsen M-H, Danielsen AK. The patient–healthcare professional relationship and communication in the oncology outpatient setting: a systematic review. Cancer Nurs. 2018;41(5):E11.

Orujlu S, Hassankhani H, Rahmani A, Sanaat Z, Dadashzadeh A, Allahbakhshian A. Barriers to cancer pain management from the perspective of patients: a qualitative study. Nurs open. 2022;9(1):541–9.

Uysal N. Clearing barriers in Cancer Pain Management: roles of nurses. Int J Caring Sci. 2018;11(2).

Beck SL, Towsley GL, Berry PH, Lindau K, Field RB, Jensen S. Core aspects of satisfaction with pain management: cancer patients’ perspectives. J Pain Symptom Manag. 2010;39(1):100–15.

Wada N, Handa S, Yamamoto H, Higuchi H, Okamoto K, Sasaki T, et al. Integrating Cancer patients’ satisfaction with Rescue Medication in Pain assessments. Showa Univ J Med Sci. 2020;32(3):181–91.

Antón A, Montalar J, Carulla J, Jara C, Batista N, Camps C, et al. Pain in clinical oncology: patient satisfaction with management of cancer pain. Eur J Pain. 2012;16(3):381–9.

Valero-Cantero I, Casals C, Espinar-Toledo M, Barón-López FJ, Martínez-Valero FJ, Vázquez-Sánchez MÁ. Cancer Patients’ Satisfaction with In-Home Palliative Care and Its Impact on Disease Symptoms. Healthcare. 2023;11(9):1272.

Acknowledgements

We would like to acknowledge the University of Gondar Comprehensive Specialized Hospital, Tibebe-Ghion Comprehensive Specialized Hospital, Felege-Hiwot Comprehensive Specialized Hospital, and Dessie Comprehensive Specialized Hospital. We would also like to acknowledge Ludwig Matrisch from the Department of Rheumatology and Clinical Immunology, Universität zu Lübeck, 23562 Lübeck, Germany, for supporting us in the use of the Lübeck Medication Satisfaction Questionnaire (LMSQ).

This study was supported by University of Gondar and Debre Birhan University with no conflict of interest. The support did not include publication charges.

Author information

Authors and affiliations.

Department of Anesthesia, Debre Birhan University, Debre Birhan, Ethiopia

Molla Amsalu

Department of Anaesthesia, University of Gondar, Gondar, Ethiopia

Henos Enyew Ashagrie, Amare Belete Getahun & Yophtahe Woldegerima Berhe

Contributions

M.A. conceptualized the study and objectives and developed the proposal. Y.W.B., H.E.A., and A.B.G. critiqued the proposal. All authors participated in the data management and statistical analyses. Y.W.B., M.A., and H.E.A. prepared the final manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Yophtahe Woldegerima Berhe.

Ethics declarations

Ethics approval and consent to participate.

Ethical approval was obtained from the Ethical Review Committee of the School of Medicine at the University of Gondar (Reference number: CMHS/SM/06/01/4097/2015, Date: March 24, 2023). Permission support letters were obtained from FHCSH, TGCSH, and DCSH. Written informed consent was obtained from each participant after detailed explanations about the study. For patients who could not read or write, informed consent with a fingerprint signature was obtained after detailed explanations by the data collectors, as approved by the Ethical Review Committee of the School of Medicine at the University of Gondar.

Consent for publication

Not applicable; this article does not include any personal details of any participant.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Amsalu, M., Ashagrie, H.E., Getahun, A.B. et al. Patients’ satisfaction with cancer pain treatment at adult oncologic centers in Northern Ethiopia; a multi-center cross-sectional study. BMC Cancer 24, 647 (2024). https://doi.org/10.1186/s12885-024-12359-7

Received : 17 October 2023

Accepted : 08 May 2024

Published : 27 May 2024

DOI : https://doi.org/10.1186/s12885-024-12359-7

  • Cancer pain treatment
  • Treatment satisfaction
  • Cancer pain

ISSN: 1471-2407

data analysis in correlational research

COMMENTS

  1. Correlational Research

    Correlational research is a type of study that explores how variables are related to each other. It can help you identify patterns, trends, and predictions in your data. In this guide, you will learn when and how to use correlational research, and what its advantages and limitations are. You will also find examples of correlational research questions and designs. If you want to know the ...

  2. Correlational Research

    Surveys are a common method used in correlational research. Researchers collect data by asking participants to complete questionnaires or surveys that measure different variables of interest. Surveys are useful for exploring the relationships between variables such as personality traits, attitudes, and behaviors.

  3. Correlation Analysis

    Correlation analysis is a statistical method used to evaluate the strength and direction of the relationship between two or more variables. The correlation coefficient ranges from -1 to 1. A correlation coefficient of 1 indicates a perfect positive correlation. This means that as one variable increases, the other variable also increases.

  4. Correlational Research

    Revised on 5 December 2022. A correlational research design investigates relationships between variables without the researcher controlling or manipulating any of them. A correlation reflects the strength and/or direction of the relationship between two (or more) variables. The direction of a correlation can be either positive or negative.

  5. 7.2 Correlational Research

    Data Collection in Correlational Research. Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. ... This is an example of content analysis —a family of systematic approaches to measurement using complex archival data. Just as naturalistic ...

  6. Correlational Study Overview & Examples

    A correlational study is a non-experimental design that evaluates only the correlation between variables. The researchers record measurements but do not control or manipulate the variables. Correlational research is a form of observational study. A correlation indicates that as the value of one variable increases, the other tends to change in a ...

  7. Correlation

    3.1. Types of Correlation. Positive Correlation: - Value: r is between 0 and +1. - Meaning: When one variable increases, the other also increases, and when one decreases, the other also decreases. - Graphically, a positive correlation will generally display a line of best fit that slopes upwards.

  8. Correlation Coefficient

    Using a correlation coefficient. In correlational research, you investigate whether changes in one variable are associated with changes in other variables.. Correlational research example You investigate whether standardized scores from high school are related to academic grades in college. You predict that there's a positive correlation: higher SAT scores are associated with higher college ...

  9. Correlational Design and Analysis

    This chapter describes basic design and analysis considerations in research that involves the systematic investigation of whether and how variables are related; in other words, correlational research. The chapter poses correlational research as an extension of the book's previous discussion of descriptive research. The chapter briefly ...

  10. How to use correlational research to spot patterns and trends

    In correlational research, you simply observe the two variables, their natural relationship, and their effects on each other. Observation takes place in the natural environment of the variables, and neither variable is manipulated. Data collection. Correlational research generally involves two or more sets of data.

  11. Conducting correlation analysis: important limitations and pitfalls

    The correlation coefficient is easy to calculate and provides a measure of the strength of linear association in the data. However, it also has important limitations and pitfalls, both when studying the association between two variables and when studying agreement between methods. These limitations and pitfalls should be taken into account when ...

  12. Pearson Correlation Coefficient (r)

    Revised on February 10, 2024. The Pearson correlation coefficient (r) is the most common way of measuring a linear correlation. It is a number between -1 and 1 that measures the strength and direction of the relationship between two variables. When one variable changes, the other variable changes in the same direction.

  13. 6.2 Correlational Research

    Correlational research is a type of non-experimental research in which the researcher measures two variables and assesses the statistical relationship (i.e., the correlation) between them with little or no effort to control extraneous variables. There are many reasons that researchers interested in statistical relationships between variables ...

  14. Correlational Research: What it is with Examples

    Correlational research is a type of non-experimental research method in which a researcher measures two variables and understands and assesses the statistical relationship between them with no influence from any extraneous variable. In statistical analysis, distinguishing between categorical data and numerical data is essential, as categorical ...

  15. Correlation Research: What It Is & How to Use It

    Correlation (often referred to as correlational study, correlation research, bivariate correlation or correlation analysis) is a core step in understanding your data (such as from survey research) or the relationship between variables in your dataset, typically expressed as x1 and x2. If a correlation exists, one variable is correlated to ...

  16. Correlation in Statistics: Correlation Analysis Explained

    Step 1: Type your data into a worksheet in Excel. The best format is two columns. Place your x-values in column A and your y-values in column B. Step 2: Click the "Data" tab and then click "Data Analysis." Step 3: Click "Correlation" and then click "OK." Step 4: Type the location for your x-y variables in the Input Range box.

  17. Understanding the Correlation Coefficient: A Complete Guide

    Correlation (Pearson, Kendall, Spearman) Correlation is a bivariate analysis that measures the strength of association between two variables and the direction of the relationship. In terms of the strength of relationship, the value of the correlation coefficient varies between +1 and -1. A value of ± 1 indicates a perfect degree of association ...

  18. Analyzing Data: Correlational and Experimental Research

    Within psychology, the most common standard for p-values is "p < .05". What this means is that, if there were truly no effect, results at least this extreme would be expected less than 5% of the time by random chance alone. We call this statistical significance. Note that it is not the same as a 95% probability that the results reflect a meaningful pattern in human psychology.

  19. Correlational Research

    Data Collection in Correlational Research. Again, the defining feature of correlational research is that neither variable is manipulated. It does not matter how or where the variables are measured. ... This method is an example of content analysis —a family of systematic approaches to measurement using complex archival data. Just as ...

  20. Using Correlation Analysis with Survey Data

    Correlation analysis is a statistical research technique used to determine if there is a relationship between two variables or datasets. In the area of market research, it's used to examine quantitative survey data to identify significant patterns, trends, or connections between the variables. Correlation should not be confused with causation.

  21. The Beginner's Guide to Statistical Analysis

    This article is a practical introduction to statistical analysis for students and researchers. We'll walk you through the steps using two research examples. The first investigates a potential cause-and-effect relationship, while the second investigates a potential correlation between variables. Example: Causal research question.

  22. Correlational Research Designs: Types, Examples & Methods

    Positive correlational research is a research method involving 2 variables that are statistically corresponding where an increase or decrease in 1 variable creates a like change in the other. An example is when an increase in workers' remuneration results in an increase in the prices of goods and services and vice versa.

  23. Correlation Research: What Is It & How Can You Use It?

    Correlation (often referred to as correlational study, correlation research, bivariate correlation or correlation analysis) is a core step in understanding your data (such as from survey research) or the relationship between variables in your dataset, typically expressed as x1 and x2. If a correlation exists, one variable is correlated to ...

  24. Patients' satisfaction with cancer pain treatment at adult oncologic

    Two hundred and seventy-nine (70.3%) cancer patients were found to be satisfied with cancer pain treatment (CI = 65.6–74.6%). The highest satisfaction rate was observed in the "side-effects" subscale, to which 343 (86.4%) responded satisfied [Fig. 1]. A Spearman's correlation test revealed that there were correlations among the subscales of LMSQ; and the strongest positive correlation ...