Research Hypothesis In Psychology: Types, & Examples

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

A research hypothesis (plural: hypotheses) is a specific, testable prediction about the anticipated results of a study, established at its outset. It is a key component of the scientific method.

Hypotheses connect theory to data and guide the research process towards expanding scientific understanding.

Some key points about hypotheses:

  • A hypothesis expresses an expected pattern or relationship. It connects the variables under investigation.
  • It is stated in clear, precise terms before any data collection or analysis occurs. This makes the hypothesis testable.
  • A hypothesis must be falsifiable. It should be possible, even if unlikely in practice, to collect data that disconfirms rather than supports the hypothesis.
  • Hypotheses guide research. Scientists design studies to explicitly evaluate hypotheses about how nature works.
  • For a hypothesis to be valid, it must be testable against empirical evidence. The evidence can then support or contradict the testable predictions.
  • Hypotheses are informed by background knowledge and observation, but go beyond what is already known to propose an explanation of how or why something occurs.
Predictions typically arise from a thorough knowledge of the research literature, curiosity about real-world problems or implications, and integrating this to advance theory. They build on existing literature while providing new insight.

Types of Research Hypotheses

Alternative Hypothesis

The research hypothesis is often called the alternative or experimental hypothesis in experimental research.

It typically suggests a potential relationship between two key variables: the independent variable, which the researcher manipulates, and the dependent variable, which is measured based on those changes.

The alternative hypothesis states a relationship exists between the two variables being studied (one variable affects the other).

An experimental hypothesis predicts what change(s) will occur in the dependent variable when the independent variable is manipulated.

It states that the results are not due to chance and are significant in supporting the theory being investigated.

The alternative hypothesis can be directional, indicating a specific direction of the effect, or non-directional, suggesting a difference without specifying its nature. It’s what researchers aim to support or demonstrate through their study.

Null Hypothesis

The null hypothesis states no relationship exists between the two variables being studied (one variable does not affect the other). There will be no changes in the dependent variable due to manipulating the independent variable.

It states results are due to chance and are not significant in supporting the idea being investigated.

The null hypothesis, positing no effect or relationship, is a foundational contrast to the research hypothesis in scientific inquiry. It establishes a baseline for statistical testing, promoting objectivity by initiating research from a neutral stance.

Many statistical methods are tailored to test the null hypothesis, determining the likelihood of observed results if no true effect exists.

This dual-hypothesis approach provides clarity, ensuring that research intentions are explicit, and fosters consistency across scientific studies, enhancing the standardization and interpretability of research outcomes.

Non-directional Hypothesis

A non-directional hypothesis, also known as a two-tailed hypothesis, predicts that there is a difference or relationship between two variables but does not specify the direction of this relationship.

It merely indicates that a change or effect will occur without predicting which group will have higher or lower values.

For example, “There is a difference in performance between Group A and Group B” is a non-directional hypothesis.

Directional Hypothesis

A directional (one-tailed) hypothesis predicts the nature of the effect of the independent variable on the dependent variable. It predicts in which direction the change will take place (i.e., greater, smaller, less, more).

It specifies whether one variable will be greater or lesser than another, rather than just indicating that there’s a difference without specifying its nature.

For example, “Exercise increases weight loss” is a directional hypothesis.
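
The difference between the two kinds of prediction carries over to how a result is evaluated. The sketch below is a minimal, illustrative Python example and is not taken from the original article. It runs an exhaustive permutation test on made-up scores for two hypothetical groups. It reports a one-tailed p-value, which matches a directional hypothesis that Group A scores higher, and a two-tailed p-value, which matches a non-directional hypothesis that the groups simply differ. The function name `permutation_p_values` and the data are assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_values(group_a, group_b):
    """Return (observed difference, one-tailed p, two-tailed p).

    The one-tailed p-value tests the directional prediction "A > B";
    the two-tailed p-value tests the non-directional prediction "A != B".
    Every way of splitting the pooled scores into two groups is enumerated,
    so this is only practical for small samples.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = mean(group_a) - mean(group_b)

    diffs = []
    for idx in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in idx)
        diffs.append(a_sum / n_a - (total - a_sum) / n_b)

    eps = 1e-12  # tolerance for floating-point ties
    one_tailed = sum(d >= observed - eps for d in diffs) / len(diffs)
    two_tailed = sum(abs(d) >= abs(observed) - eps for d in diffs) / len(diffs)
    return observed, one_tailed, two_tailed


# Made-up performance scores for two hypothetical groups.
group_a = [12, 15, 14, 16, 13]
group_b = [10, 11, 13, 9, 12]

observed, p_one, p_two = permutation_p_values(group_a, group_b)
print(f"observed difference in means: {observed}")
print(f"one-tailed p (A > B):  {p_one:.4f}")
print(f"two-tailed p (A != B): {p_two:.4f}")
```

With equal group sizes the permutation distribution is symmetric, so the two-tailed p-value is twice the one-tailed value whenever the observed difference is non-zero. This is why a directional prediction is easier to support when the effect goes in the predicted direction, and why it should be chosen before the data are collected.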

Falsifiability

The Falsification Principle, proposed by Karl Popper, is a way of demarcating science from non-science. It suggests that for a theory or hypothesis to be considered scientific, it must be testable and refutable, that is, capable of being proven false.

Falsifiability emphasizes that scientific claims shouldn’t just be confirmable but should also have the potential to be proven wrong.

It means that there should exist some potential evidence or experiment that could prove the proposition false.

However many confirming instances exist for a theory, it only takes one counter observation to falsify it. For example, the hypothesis that “all swans are white,” can be falsified by observing a black swan.

For Popper, science should attempt to disprove a theory rather than attempt to continually provide evidence to support a research hypothesis.

Can a Hypothesis be Proven?

Hypotheses make probabilistic predictions. They state the expected outcome if a particular relationship exists. However, a study result supporting a hypothesis does not definitively prove it is true.

All studies have limitations. There may be unknown confounding factors or issues that limit the certainty of conclusions. Additional studies may yield different results.

In science, hypotheses can realistically only be supported with some degree of confidence, not proven. The process of science is to incrementally accumulate evidence for and against hypothesized relationships in an ongoing pursuit of better models and explanations that best fit the empirical data. But hypotheses remain open to revision and rejection if that is where the evidence leads.
  • Disproving a hypothesis is definitive. Solid disconfirmatory evidence will falsify a hypothesis and require altering or discarding it based on the evidence.
  • However, confirming evidence is always open to revision. Other explanations may account for the same results, and additional or contradictory evidence may emerge over time.

We can never 100% prove the alternative hypothesis. Instead, we see if we can disprove, or reject the null hypothesis.

If we reject the null hypothesis, this doesn’t mean that our alternative hypothesis is correct but does support the alternative/experimental hypothesis.

Upon analysis of the results, an alternative hypothesis can be rejected or supported, but it can never be proven to be correct. We must avoid any reference to results proving a theory as this implies 100% certainty, and there is always a chance that evidence may exist which could refute a theory.

How to Write a Hypothesis

  • Identify variables. The researcher manipulates the independent variable, and the dependent variable is the measured outcome.
  • Operationalize the variables being investigated. Operationalization refers to the process of making the variables physically measurable or testable, e.g., if you are about to study aggression, you might count the number of punches given by participants.
  • Decide on a direction for your prediction. If there is evidence in the literature to support a specific effect of the independent variable on the dependent variable, write a directional (one-tailed) hypothesis. If there are limited or ambiguous findings in the literature regarding the effect of the independent variable on the dependent variable, write a non-directional (two-tailed) hypothesis.
  • Make it testable. Ensure your hypothesis can be tested through experimentation or observation. It should be possible to prove it false (principle of falsifiability).
  • Use clear and concise language. A strong hypothesis is concise (typically one to two sentences long) and formulated using clear and straightforward language, ensuring it’s easily understood and testable.

Consider a hypothesis many teachers might subscribe to: students work better on Monday morning than on Friday afternoon (IV = day, DV = standard of work).

Now, if we decide to study this by giving the same group of students a lesson on a Monday morning and a Friday afternoon and then measuring their immediate recall of the material covered in each session, we would end up with the following:

  • The alternative hypothesis states that students will recall significantly more information on a Monday morning than on a Friday afternoon.
  • The null hypothesis states that there will be no significant difference in the amount recalled on a Monday morning compared to a Friday afternoon. Any difference will be due to chance or confounding factors.
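
Because the same students are measured in both sessions, the two sets of recall scores are paired. One way such a pair of hypotheses could be checked is sketched below in Python, assuming made-up recall scores for eight hypothetical students. It uses a sign-flip permutation test, which is a technique chosen here for illustration and not one named in the article: if the null hypothesis is true, each student's Monday-minus-Friday difference is as likely to be positive as negative, so the observed total is compared against every possible pattern of signs. The function name `paired_sign_flip_test` is hypothetical.

```python
from itertools import product
from statistics import mean


def paired_sign_flip_test(first, second):
    """Return (mean difference, one-tailed p) for the prediction first > second.

    Scores are paired by position (same participant in both conditions).
    All 2**n sign patterns are enumerated, so keep n small.
    """
    diffs = [a - b for a, b in zip(first, second)]
    observed = sum(diffs)

    at_least_as_large = 0
    patterns = 0
    for signs in product((1, -1), repeat=len(diffs)):
        patterns += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            at_least_as_large += 1
    return mean(diffs), at_least_as_large / patterns


# Made-up immediate-recall scores (items recalled) for the same 8 students.
monday = [14, 12, 15, 11, 13, 16, 12, 14]
friday = [12, 11, 13, 12, 11, 14, 10, 13]

mean_diff, p_value = paired_sign_flip_test(monday, friday)
print(f"mean Monday - Friday difference: {mean_diff}")
print(f"one-tailed p-value: {p_value}")
```

A small p-value would lead to rejecting the null hypothesis and would support, but not prove, the alternative hypothesis.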

More Examples

  • Memory: Participants exposed to classical music during study sessions will recall more items from a list than those who studied in silence.
  • Social Psychology: Individuals who frequently engage in social media use will report higher levels of perceived social isolation compared to those who use it infrequently.
  • Developmental Psychology: Children who engage in regular imaginative play have better problem-solving skills than those who don’t.
  • Clinical Psychology: Cognitive-behavioral therapy will be more effective in reducing symptoms of anxiety over a 6-month period compared to traditional talk therapy.
  • Cognitive Psychology: Individuals who multitask between various electronic devices will have shorter attention spans on focused tasks than those who single-task.
  • Health Psychology: Patients who practice mindfulness meditation will experience lower levels of chronic pain compared to those who don’t meditate.
  • Organizational Psychology: Employees in open-plan offices will report higher levels of stress than those in private offices.
  • Behavioral Psychology: Rats rewarded with food after pressing a lever will press it more frequently than rats who receive no reward.

How to Write a Strong Hypothesis | Guide & Examples

Published on 6 May 2022 by Shona McCombes.

A hypothesis is a statement that can be tested by scientific research. If you want to test a relationship between two or more variables, you need to write hypotheses before you start your experiment or data collection.

Table of contents

  • What is a hypothesis?
  • Developing a hypothesis (with example)
  • Hypothesis examples
  • Frequently asked questions about writing hypotheses

A hypothesis states your predictions about what your research will find. It is a tentative answer to your research question that has not yet been tested. For some research projects, you might have to write several hypotheses that address different aspects of your research question.

A hypothesis is not just a guess – it should be based on existing theories and knowledge. It also has to be testable, which means you can support or refute it through scientific research methods (such as experiments, observations, and statistical analysis of data).

Variables in hypotheses

Hypotheses propose a relationship between two or more variables. An independent variable is something the researcher changes or controls. A dependent variable is something the researcher observes and measures.

For example, if you predict that exposure to the sun increases happiness, the independent variable is exposure to the sun (the assumed cause) and the dependent variable is the level of happiness (the assumed effect).

Step 1: Ask a question

Writing a hypothesis begins with a research question that you want to answer. The question should be focused, specific, and researchable within the constraints of your project.

Step 2: Do some preliminary research

Your initial answer to the question should be based on what is already known about the topic. Look for theories and previous studies to help you form educated assumptions about what your research will find.

At this stage, you might construct a conceptual framework to identify which variables you will study and what you think the relationships are between them. Sometimes, you’ll have to operationalise more complex constructs.

Step 3: Formulate your hypothesis

Now you should have some idea of what you expect to find. Write your initial answer to the question in a clear, concise sentence.

Step 4: Refine your hypothesis

You need to make sure your hypothesis is specific and testable. There are various ways of phrasing a hypothesis, but all the terms you use should have clear definitions, and the hypothesis should contain:

  • The relevant variables
  • The specific group being studied
  • The predicted outcome of the experiment or analysis

Step 5: Phrase your hypothesis in three ways

To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable.

In academic research, hypotheses are more commonly phrased in terms of correlations or effects, where you directly state the predicted relationship between variables.

If you are comparing two groups, the hypothesis can state what difference you expect to find between them.

Step 6: Write a null hypothesis

If your research involves statistical hypothesis testing, you will also have to write a null hypothesis. The null hypothesis is the default position that there is no association between the variables. The null hypothesis is written as H0, while the alternative hypothesis is H1 or Ha.
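
As an illustration of the notation, the pair of hypotheses for a comparison of two groups might be written as follows. This is a sketch, assuming that μ1 and μ2 stand for the population means of the two groups; these symbols do not appear in the original article. The first line corresponds to a non-directional prediction and the second to a directional prediction that the first group scores higher.

```latex
\begin{align*}
\text{Non-directional:}\quad & H_0 : \mu_1 = \mu_2 \quad\text{versus}\quad H_1 : \mu_1 \neq \mu_2 \\
\text{Directional:}\quad & H_0 : \mu_1 \le \mu_2 \quad\text{versus}\quad H_1 : \mu_1 > \mu_2
\end{align*}
```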

Hypothesis testing is a formal procedure for investigating our ideas about the world using statistics. It is used by scientists to test specific predictions, called hypotheses, by calculating how likely it is that a pattern or relationship between variables could have arisen by chance.

A research hypothesis is your proposed answer to your research question. The research hypothesis usually includes an explanation (‘x affects y because …’).

A statistical hypothesis, on the other hand, is a mathematical statement about a population parameter. Statistical hypotheses always come in pairs: the null and alternative hypotheses. In a well-designed study, the statistical hypotheses correspond logically to the research hypothesis.

McCombes, S. (2022, May 06). How to Write a Strong Hypothesis | Guide & Examples. Scribbr. Retrieved 15 April 2024, from https://www.scribbr.co.uk/research-methods/hypothesis-writing/

How to Write a Great Hypothesis

Hypothesis Format, Examples, and Tips

Kendra Cherry, MS, is a psychosocial rehabilitation specialist, psychology educator, and author of the "Everything Psychology Book."

Amy Morin, LCSW, is a psychotherapist and international bestselling author. Her books, including "13 Things Mentally Strong People Don't Do," have been translated into more than 40 languages. Her TEDx talk,  "The Secret of Becoming Mentally Strong," is one of the most viewed talks of all time.

A hypothesis is a tentative statement about the relationship between two or more variables. It is a specific, testable prediction about what you expect to happen in a study.

For example, a study designed to look at the relationship between sleep deprivation and test performance might have a hypothesis that states: "This study is designed to assess the hypothesis that sleep-deprived people will perform worse on a test than individuals who are not sleep-deprived."

This article explores how a hypothesis is used in psychology research, how to write a good hypothesis, and the different types of hypotheses you might use.

The Hypothesis in the Scientific Method

In the scientific method, whether it involves research in psychology, biology, or some other area, a hypothesis represents what the researchers think will happen in an experiment. The scientific method involves the following steps:

  • Forming a question
  • Performing background research
  • Creating a hypothesis
  • Designing an experiment
  • Collecting data
  • Analyzing the results
  • Drawing conclusions
  • Communicating the results

The hypothesis is a prediction, but it involves more than a guess. Most of the time, the hypothesis begins with a question which is then explored through background research. It is only at this point that researchers begin to develop a testable hypothesis. Unless you are creating an exploratory study, your hypothesis should always explain what you expect to happen.

In a study exploring the effects of a particular drug, the hypothesis might be that researchers expect the drug to have some type of effect on the symptoms of a specific illness. In psychology, the hypothesis might focus on how a certain aspect of the environment might influence a particular behavior.

Remember, a hypothesis does not have to be correct. While the hypothesis predicts what the researchers expect to see, the goal of the research is to determine whether this guess is right or wrong. When conducting an experiment, researchers might explore a number of factors to determine which ones might contribute to the ultimate outcome.

In many cases, researchers may find that the results of an experiment do not support the original hypothesis. When writing up these results, the researchers might suggest other options that should be explored in future studies.

In many cases, researchers might draw a hypothesis from a specific theory or build on previous research. For example, prior research has shown that stress can impact the immune system. So a researcher might hypothesize: "People with high-stress levels will be more likely to contract a common cold after being exposed to the virus than people who have low-stress levels."

In other instances, researchers might look at commonly held beliefs or folk wisdom. "Birds of a feather flock together" is one example of folk wisdom that a psychologist might try to investigate. The researcher might pose a specific hypothesis that "People tend to select romantic partners who are similar to them in interests and educational level."

Elements of a Good Hypothesis

So how do you write a good hypothesis? When trying to come up with a hypothesis for your research or experiments, ask yourself the following questions:

  • Is your hypothesis based on your research on a topic?
  • Can your hypothesis be tested?
  • Does your hypothesis include independent and dependent variables?

Before you come up with a specific hypothesis, spend some time doing background research. Once you have completed a literature review, start thinking about potential questions you still have. Pay attention to the discussion section in the journal articles you read. Many authors will suggest questions that still need to be explored.

To form a hypothesis, you should take these steps:

  • Collect as many observations about a topic or problem as you can.
  • Evaluate these observations and look for possible causes of the problem.
  • Create a list of possible explanations that you might want to explore.
  • After you have developed some possible hypotheses, think of ways that you could confirm or disprove each hypothesis through experimentation. This is known as falsifiability.

Falsifiability of a Hypothesis

In the scientific method, falsifiability is an important part of any valid hypothesis. In order to test a claim scientifically, it must be possible that the claim could be proven false.

Students sometimes confuse the idea of falsifiability with the idea that it means that something is false, which is not the case. What falsifiability means is that if something were false, then it would be possible to demonstrate that it is false.

One of the hallmarks of pseudoscience is that it makes claims that cannot be refuted or proven false.

Operational Definitions

A variable is a factor or element that can be changed and manipulated in ways that are observable and measurable. However, the researcher must also define how the variable will be manipulated and measured in the study.

For example, a researcher might operationally define the variable "test anxiety" as the results of a self-report measure of anxiety experienced during an exam. A "study habits" variable might be defined by the amount of studying that actually occurs as measured by time.

These precise descriptions are important because many things can be measured in a number of different ways. One of the basic principles of any type of scientific research is that the results must be replicable. By clearly detailing the specifics of how the variables were measured and manipulated, other researchers can better understand the results and repeat the study if needed.

Some variables are more difficult than others to define. How would you operationally define a variable such as aggression? For obvious ethical reasons, researchers cannot create a situation in which a person behaves aggressively toward others.

In order to measure this variable, the researcher must devise a measurement that assesses aggressive behavior without harming other people. In this situation, the researcher might utilize a simulated task to measure aggressiveness.

Hypothesis Checklist

  • Does your hypothesis focus on something that you can actually test?
  • Does your hypothesis include both an independent and dependent variable?
  • Can you manipulate the variables?
  • Can your hypothesis be tested without violating ethical standards?

Types of Hypotheses

The hypothesis you use will depend on what you are investigating and hoping to find. Some of the main types of hypotheses that you might use include:

  • Simple hypothesis: This type of hypothesis suggests that there is a relationship between one independent variable and one dependent variable.
  • Complex hypothesis: This type of hypothesis suggests a relationship between three or more variables, such as two independent variables and a dependent variable.
  • Null hypothesis: This hypothesis suggests no relationship exists between two or more variables.
  • Alternative hypothesis: This hypothesis states the opposite of the null hypothesis.
  • Statistical hypothesis: This hypothesis uses statistical analysis to evaluate a representative sample of the population and then generalizes the findings to the larger group.
  • Logical hypothesis: This hypothesis assumes a relationship between variables without collecting data or evidence.

Hypothesis Format

A hypothesis often follows a basic format of "If {this happens} then {this will happen}." One way to structure your hypothesis is to describe what will happen to the dependent variable if you change the independent variable.

The basic format might be: "If {these changes are made to a certain independent variable}, then we will observe {a change in a specific dependent variable}."

Hypotheses Examples

A few examples of simple hypotheses:

  • "Students who eat breakfast will perform better on a math exam than students who do not eat breakfast."
  • "Students who experience test anxiety before an English exam will get lower scores than students who do not experience test anxiety."
  • "Motorists who talk on the phone while driving will be more likely to make errors on a driving course than those who do not talk on the phone."

Examples of a complex hypothesis include:

  • "People with high-sugar diets and sedentary activity levels are more likely to develop depression."
  • "Younger people who are regularly exposed to green, outdoor areas have better subjective well-being than older adults who have limited exposure to green spaces."

Examples of a null hypothesis include:

  • "Children who receive a new reading intervention will have scores no different from those of students who do not receive the intervention."
  • "There will be no difference in scores on a memory recall task between children and adults."

Examples of an alternative hypothesis:

  • "Children who receive a new reading intervention will perform better than students who did not receive the intervention."
  • "Adults will perform better on a memory task than children." 

Collecting Data on Your Hypothesis

Once a researcher has formed a testable hypothesis, the next step is to select a research design and start collecting data. The research method depends largely on exactly what they are studying. There are two basic types of research methods: descriptive research and experimental research.

Descriptive Research Methods

Descriptive research methods such as case studies, naturalistic observations, and surveys are often used when it would be impossible or difficult to conduct an experiment. These methods are best used to describe different aspects of a behavior or psychological phenomenon.

Once a researcher has collected data using descriptive methods, a correlational study can then be used to look at how the variables are related. This type of research method might be used to investigate a hypothesis that is difficult to test experimentally.

Experimental Research Methods

Experimental methods are used to demonstrate causal relationships between variables. In an experiment, the researcher systematically manipulates a variable of interest (known as the independent variable) and measures the effect on another variable (known as the dependent variable).

Unlike correlational studies, which can only be used to determine if there is a relationship between two variables, experimental methods can be used to determine the actual nature of the relationship: whether changes in one variable actually cause another to change.

A Word From Verywell

The hypothesis is a critical part of any scientific exploration. It represents what researchers expect to find in a study or experiment. In situations where the hypothesis is unsupported by the research, the research still has value. Such research helps us better understand how different aspects of the natural world relate to one another. It also helps us develop new hypotheses that can then be tested in the future.

Frequently Asked Questions

Some examples of how to write a hypothesis include:

  • "Staying up late will lead to worse test performance the next day."
  • "People who consume one apple each day will visit the doctor fewer times each year."
  • "Breaking study sessions up into three 20-minute sessions will lead to better test results than a single 60-minute study session."

The four parts of a hypothesis are:

  • The research question
  • The independent variable (IV)
  • The dependent variable (DV)
  • The proposed relationship between the IV and DV

Choosing the Right Statistical Test | Types & Examples

Published on January 28, 2020 by Rebecca Bevans. Revised on June 22, 2023.

Statistical tests are used in hypothesis testing. They can be used to:

  • determine whether a predictor variable has a statistically significant relationship with an outcome variable.
  • estimate the difference between two or more groups.

Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values predicted by the null hypothesis.

If you already know what types of variables you’re dealing with, you can use the flowchart to choose the right statistical test for your data.

Statistical tests flowchart

Table of contents

  • What does a statistical test do?
  • When to perform a statistical test
  • Choosing a parametric test: regression, comparison, or correlation
  • Choosing a nonparametric test
  • Flowchart: choosing a statistical test
  • Other interesting articles
  • Frequently asked questions about statistical tests

Statistical tests work by calculating a test statistic – a number that describes how much the relationship between variables in your test differs from the null hypothesis of no relationship.

It then calculates a p-value (probability value). The p-value estimates how likely it is that you would see a difference at least as large as the one described by the test statistic if the null hypothesis of no relationship were true.

If the value of the test statistic is more extreme than the critical value expected under the null hypothesis, then you can infer a statistically significant relationship between the predictor and outcome variables.

If the value of the test statistic is less extreme than that critical value, then you cannot infer a statistically significant relationship between the predictor and outcome variables.
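
To make the idea of a test statistic concrete, here is a minimal Python sketch using only the standard library. It computes a pooled two-sample t statistic for made-up scores from two hypothetical groups of ten and compares its absolute value with 2.101, the two-tailed critical value from standard t tables for a 5% significance level and 18 degrees of freedom. The choice of test, the 5% level, the function name `pooled_t_statistic`, and the data are all assumptions made for illustration; they are not prescribed by the article.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Two-sample t statistic assuming equal variances in both groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Two-tailed critical value for alpha = 0.05 and df = 10 + 10 - 2 = 18,
# taken from standard t-distribution tables.
CRITICAL_T_DF18 = 2.101

# Made-up outcome scores for two hypothetical groups of 10 participants.
treatment = [78, 74, 80, 77, 75, 79, 81, 76, 73, 77]
control = [72, 75, 68, 71, 74, 69, 73, 70, 76, 72]

t_stat = pooled_t_statistic(treatment, control)
print(f"t statistic: {t_stat:.3f}")
if abs(t_stat) > CRITICAL_T_DF18:
    print("More extreme than the critical value: reject the null hypothesis.")
else:
    print("Not more extreme than the critical value: do not reject the null hypothesis.")
```

The critical value depends on the degrees of freedom and the significance level, so a different sample size or threshold would need a different table entry. Statistical software typically reports an exact p-value instead.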

You can perform statistical tests on data that have been collected in a statistically valid manner – either through an experiment, or through observations made using probability sampling methods.

For a statistical test to be valid, your sample size needs to be large enough to approximate the true distribution of the population being studied.

To determine which statistical test to use, you need to know:

  • whether your data meets certain assumptions.
  • the types of variables that you’re dealing with.

Statistical assumptions

Statistical tests make some common assumptions about the data they are testing:

  • Independence of observations (a.k.a. no autocorrelation): The observations/variables you include in your test are not related (for example, multiple measurements of a single test subject are not independent, while measurements of multiple different test subjects are independent).
  • Homogeneity of variance: the variance within each group being compared is similar among all groups. If one group has much more variation than others, it will limit the test's effectiveness.
  • Normality of data: the data follows a normal distribution (a.k.a. a bell curve). This assumption applies only to quantitative data.

If your data do not meet the assumptions of normality or homogeneity of variance, you may be able to perform a nonparametric statistical test, which allows you to make comparisons with fewer assumptions about the data distribution.

If your data do not meet the assumption of independence of observations, you may be able to use a test that accounts for structure in your data (repeated-measures tests or tests that include blocking variables).

Types of variables

The types of variables you have usually determine what type of statistical test you can use.

Quantitative variables represent amounts of things (e.g. the number of trees in a forest). Types of quantitative variables include:

  • Continuous (aka ratio variables): represent measures and can usually be divided into units smaller than one (e.g. 0.75 grams).
  • Discrete (aka integer variables): represent counts and usually can’t be divided into units smaller than one (e.g. 1 tree).

Categorical variables represent groupings of things (e.g. the different tree species in a forest). Types of categorical variables include:

  • Ordinal: represent data with an order (e.g. rankings).
  • Nominal: represent group names (e.g. brands or species names).
  • Binary: represent data with a yes/no or 1/0 outcome (e.g. win or lose).

Choose the test that fits the types of predictor and outcome variables you have collected (if you are doing an experiment, these are the independent and dependent variables). Consult the tables below to see which test best matches your variables.

Parametric tests usually have stricter requirements than nonparametric tests, and are able to make stronger inferences from the data. They can only be conducted with data that adheres to the common assumptions of statistical tests.

The most common types of parametric test include regression tests, comparison tests, and correlation tests.

Regression tests

Regression tests look for cause-and-effect relationships. They can be used to estimate the effect of one or more continuous variables on another variable.

Comparison tests

Comparison tests look for differences among group means. They can be used to test the effect of a categorical variable on the mean value of some other characteristic.

T-tests are used when comparing the means of precisely two groups (e.g., the average heights of men and women). ANOVA and MANOVA tests are used when comparing the means of more than two groups (e.g., the average heights of children, teenagers, and adults).
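To make the two-group comparison concrete, here is a minimal sketch of the test statistic for an independent two-sample t-test, assuming the pooled-variance (Student's) form, which relies on the homogeneity-of-variance assumption listed earlier. The function name `pooled_t_statistic` and the height values are hypothetical. Turning the statistic into a p-value requires the t distribution, which the Python standard library does not provide, so that step is left out here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance (assumes similar group variances).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df


# Hypothetical heights in cm, invented for illustration only.
heights_group_1 = [160, 162, 164, 166, 168]
heights_group_2 = [170, 172, 174, 176, 178]
t_value, degrees_of_freedom = pooled_t_statistic(heights_group_1, heights_group_2)
print(f"t = {t_value:.2f} with {degrees_of_freedom} degrees of freedom")
```

A statistics package would normally handle this calculation and the p-value together; the sketch only shows where the number comes from.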

Correlation tests

Correlation tests check whether variables are related without hypothesizing a cause-and-effect relationship.

These can be used to test whether two variables you want to use in (for example) a multiple regression test are correlated with each other.

Non-parametric tests don’t make as many assumptions about the data, and are useful when one or more of the common statistical assumptions are violated. However, the inferences they make aren’t as strong as with parametric tests.

This flowchart helps you choose among parametric tests. For nonparametric alternatives, check the table above.

Choosing the right statistical test

Statistical tests commonly assume that:

  • the data are normally distributed
  • the groups that are being compared have similar variance
  • the data are independent

If your data do not meet these assumptions, you might still be able to use a nonparametric statistical test, which has fewer requirements but also makes weaker inferences.

A test statistic is a number calculated by a statistical test. It describes how far your observed data are from the null hypothesis of no relationship between variables or no difference among sample groups.

The test statistic tells you how different two or more groups are from the overall population mean, or how different a linear slope is from the slope predicted by a null hypothesis. Different test statistics are used in different statistical tests.

Statistical significance is a term used by researchers to state that it is unlikely their observations could have occurred under the null hypothesis of a statistical test. Significance is usually denoted by a p-value, or probability value.

Statistical significance is arbitrary – it depends on the threshold, or alpha value, chosen by the researcher. The most common threshold is p < 0.05, which means that data at least as extreme as those observed would be expected less than 5% of the time under the null hypothesis.

When the p-value falls below the chosen alpha value, then we say the result of the test is statistically significant.

Quantitative variables are any variables where the data represent amounts (e.g. height, weight, or age).

Categorical variables are any variables where the data represent groups. This includes rankings (e.g. finishing places in a race), classifications (e.g. brands of cereal), and binary outcomes (e.g. coin flips).

You need to know what type of variables you are working with to choose the right statistical test for your data and interpret your results.

Discrete and continuous variables are two types of quantitative variables:

  • Discrete variables represent counts (e.g. the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g. water volume or weight).

Cite this Scribbr article

Bevans, R. (2023, June 22). Choosing the Right Statistical Test | Types & Examples. Scribbr. Retrieved April 15, 2024, from https://www.scribbr.com/statistics/statistical-tests/

Research Method

What is a Hypothesis – Types, Examples and Writing Guide

What is a Hypothesis

Definition:

A hypothesis is an educated guess or proposed explanation for a phenomenon, based on some initial observations or data. It is a tentative statement that can be tested and then supported or refuted through further investigation and experimentation.

Hypotheses are often used in scientific research to guide the design of experiments and the collection and analysis of data. They are an essential element of the scientific method, as they allow researchers to make predictions about the outcome of their experiments and to test those predictions to determine their accuracy.

Types of Hypothesis

The main types of hypothesis are as follows:

Research Hypothesis

A research hypothesis is a statement that predicts a relationship between variables. It is usually formulated as a specific statement that can be tested through research, and it is often used in scientific research to guide the design of experiments.

Null Hypothesis

The null hypothesis is a statement that assumes there is no significant difference or relationship between variables. It is often used as a starting point for testing the research hypothesis, and if the results of the study reject the null hypothesis, it suggests that there is a significant difference or relationship between variables.

Alternative Hypothesis

An alternative hypothesis is a statement that assumes there is a significant difference or relationship between variables. It is often used as an alternative to the null hypothesis and is tested against the null hypothesis to determine which statement is more accurate.

Directional Hypothesis

A directional hypothesis is a statement that predicts the direction of the relationship between variables. For example, a researcher might predict that increasing the amount of exercise will result in a decrease in body weight.

Non-directional Hypothesis

A non-directional hypothesis is a statement that predicts the relationship between variables but does not specify the direction. For example, a researcher might predict that there is a relationship between the amount of exercise and body weight, but they do not specify whether increasing or decreasing exercise will affect body weight.

Statistical Hypothesis

A statistical hypothesis is a statement that assumes a particular statistical model or distribution for the data. It is often used in statistical analysis to test the significance of a particular result.

Composite Hypothesis

A composite hypothesis is a statement that assumes more than one condition or outcome. It can be divided into several sub-hypotheses, each of which represents a different possible outcome.

Empirical Hypothesis

An empirical hypothesis is a statement that is based on observed phenomena or data. It is often used in scientific research to develop theories or models that explain the observed phenomena.

Simple Hypothesis

A simple hypothesis is a statement that assumes only one outcome or condition. It is often used in scientific research to test a single variable or factor.

Complex Hypothesis

A complex hypothesis is a statement that assumes multiple outcomes or conditions. It is often used in scientific research to test the effects of multiple variables or factors on a particular outcome.

Applications of Hypothesis

Hypotheses are used in various fields to guide research and make predictions about the outcomes of experiments or observations. Here are some examples of how hypotheses are applied in different fields:

  • Science : In scientific research, hypotheses are used to test the validity of theories and models that explain natural phenomena. For example, a hypothesis might be formulated to test the effects of a particular variable on a natural system, such as the effects of climate change on an ecosystem.
  • Medicine : In medical research, hypotheses are used to test the effectiveness of treatments and therapies for specific conditions. For example, a hypothesis might be formulated to test the effects of a new drug on a particular disease.
  • Psychology : In psychology, hypotheses are used to test theories and models of human behavior and cognition. For example, a hypothesis might be formulated to test the effects of a particular stimulus on the brain or behavior.
  • Sociology : In sociology, hypotheses are used to test theories and models of social phenomena, such as the effects of social structures or institutions on human behavior. For example, a hypothesis might be formulated to test the effects of income inequality on crime rates.
  • Business : In business research, hypotheses are used to test the validity of theories and models that explain business phenomena, such as consumer behavior or market trends. For example, a hypothesis might be formulated to test the effects of a new marketing campaign on consumer buying behavior.
  • Engineering : In engineering, hypotheses are used to test the effectiveness of new technologies or designs. For example, a hypothesis might be formulated to test the efficiency of a new solar panel design.

How to write a Hypothesis

Here are the steps to follow when writing a hypothesis:

Identify the Research Question

The first step is to identify the research question that you want to answer through your study. This question should be clear, specific, and focused. It should be something that can be investigated empirically and that has some relevance or significance in the field.

Conduct a Literature Review

Before writing your hypothesis, it’s essential to conduct a thorough literature review to understand what is already known about the topic. This will help you to identify the research gap and formulate a hypothesis that builds on existing knowledge.

Determine the Variables

The next step is to identify the variables involved in the research question. A variable is any characteristic or factor that can vary or change. There are two types of variables: independent and dependent. The independent variable is the one that is manipulated or changed by the researcher, while the dependent variable is the one that is measured or observed as a result of the independent variable.

Formulate the Hypothesis

Based on the research question and the variables involved, you can now formulate your hypothesis. A hypothesis should be a clear and concise statement that predicts the relationship between the variables. It should be testable through empirical research and based on existing theory or evidence.

Write the Null Hypothesis

The null hypothesis is the opposite of the alternative hypothesis, which is the hypothesis that you are testing. The null hypothesis states that there is no significant difference or relationship between the variables. It is important to write the null hypothesis because it allows you to compare your results with what would be expected by chance.

Refine the Hypothesis

After formulating the hypothesis, it’s important to refine it and make it more precise. This may involve clarifying the variables, specifying the direction of the relationship, or making the hypothesis more testable.

Examples of Hypothesis

Here are a few examples of hypotheses in different fields:

  • Psychology : “Increased exposure to violent video games leads to increased aggressive behavior in adolescents.”
  • Biology : “Higher levels of carbon dioxide in the atmosphere will lead to increased plant growth.”
  • Sociology : “Individuals who grow up in households with higher socioeconomic status will have higher levels of education and income as adults.”
  • Education : “Implementing a new teaching method will result in higher student achievement scores.”
  • Marketing : “Customers who receive a personalized email will be more likely to make a purchase than those who receive a generic email.”
  • Physics : “An increase in temperature will cause an increase in the volume of a gas, assuming all other variables remain constant.”
  • Medicine : “Consuming a diet high in saturated fats will increase the risk of developing heart disease.”

Purpose of Hypothesis

The purpose of a hypothesis is to provide a testable explanation for an observed phenomenon or a prediction of a future outcome based on existing knowledge or theories. A hypothesis is an essential part of the scientific method and helps to guide the research process by providing a clear focus for investigation. It enables scientists to design experiments or studies to gather evidence and data that can support or refute the proposed explanation or prediction.

The formulation of a hypothesis is based on existing knowledge, observations, and theories, and it should be specific, testable, and falsifiable. A specific hypothesis helps to define the research question, which is important in the research process as it guides the selection of an appropriate research design and methodology. Testability of the hypothesis means that it can be proven or disproven through empirical data collection and analysis. Falsifiability means that the hypothesis should be formulated in such a way that it can be proven wrong if it is incorrect.

In addition to guiding the research process, the testing of hypotheses can lead to new discoveries and advancements in scientific knowledge. When a hypothesis is supported by the data, it can be used to develop new theories or models to explain the observed phenomenon. When a hypothesis is not supported by the data, it can help to refine existing theories or prompt the development of new hypotheses to explain the phenomenon.

When to use Hypothesis

Here are some common situations in which hypotheses are used:

  • In scientific research , hypotheses are used to guide the design of experiments and to help researchers make predictions about the outcomes of those experiments.
  • In social science research , hypotheses are used to test theories about human behavior, social relationships, and other phenomena.
  • In business, hypotheses can be used to guide decisions about marketing, product development, and other areas. For example, a hypothesis might be that a new product will sell well in a particular market, and this hypothesis can be tested through market research.

Characteristics of Hypothesis

Here are some common characteristics of a hypothesis:

  • Testable : A hypothesis must be able to be tested through observation or experimentation. This means that it must be possible to collect data that will either support or refute the hypothesis.
  • Falsifiable : A hypothesis must be able to be proven false if it is not supported by the data. If a hypothesis cannot be falsified, then it is not a scientific hypothesis.
  • Clear and concise : A hypothesis should be stated in a clear and concise manner so that it can be easily understood and tested.
  • Based on existing knowledge : A hypothesis should be based on existing knowledge and research in the field. It should not be based on personal beliefs or opinions.
  • Specific : A hypothesis should be specific in terms of the variables being tested and the predicted outcome. This will help to ensure that the research is focused and well-designed.
  • Tentative: A hypothesis is a tentative statement or assumption that requires further testing and evidence to be confirmed or refuted. It is not a final conclusion or assertion.
  • Relevant : A hypothesis should be relevant to the research question or problem being studied. It should address a gap in knowledge or provide a new perspective on the issue.

Advantages of Hypothesis

Hypotheses have several advantages in scientific research and experimentation:

  • Guides research: A hypothesis provides a clear and specific direction for research. It helps to focus the research question, select appropriate methods and variables, and interpret the results.
  • Predictive power: A hypothesis makes predictions about the outcome of research, which can be tested through experimentation. This allows researchers to evaluate the validity of the hypothesis and make new discoveries.
  • Facilitates communication: A hypothesis provides a common language and framework for scientists to communicate with one another about their research. This helps to facilitate the exchange of ideas and promotes collaboration.
  • Efficient use of resources: A hypothesis helps researchers to use their time, resources, and funding efficiently by directing them towards specific research questions and methods that are most likely to yield results.
  • Provides a basis for further research: A hypothesis that is supported by data provides a basis for further research and exploration. It can lead to new hypotheses, theories, and discoveries.
  • Increases objectivity: A hypothesis can help to increase objectivity in research by providing a clear and specific framework for testing and interpreting results. This can reduce bias and increase the reliability of research findings.

Limitations of Hypothesis

Some limitations of hypotheses are as follows:

  • Limited to observable phenomena: Hypotheses are limited to observable phenomena and cannot account for unobservable or intangible factors. This means that some research questions may not be amenable to hypothesis testing.
  • May be inaccurate or incomplete: Hypotheses are based on existing knowledge and research, which may be incomplete or inaccurate. This can lead to flawed hypotheses and erroneous conclusions.
  • May be biased: Hypotheses may be biased by the researcher’s own beliefs, values, or assumptions. This can lead to selective interpretation of data and a lack of objectivity in research.
  • Cannot prove causation: Support for a hypothesis in observational data shows only a correlation between variables; it does not prove causation. Establishing causation requires further experimentation and analysis.
  • Limited to specific contexts: Hypotheses are limited to specific contexts and may not be generalizable to other situations or populations. This means that results may not be applicable in other contexts or may require further testing.
  • May be affected by chance : Hypotheses may be affected by chance or random variation, which can obscure or distort the true relationship between variables.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Statistics LibreTexts

5.1: Linear Regression and Correlation

  • John H. McDonald
  • University of Delaware

Learning Objectives

  • To use linear regression or correlation when you want to know whether one measurement variable is associated with another measurement variable; you want to measure the strength of the association (\(r^2\)); or you want an equation that describes the relationship and can be used to predict unknown values.

One of the most common graphs in science plots one measurement variable on the \(x\) (horizontal) axis vs. another on the \(y\) (vertical) axis. For example, here are two graphs. For the first, I dusted off the elliptical machine in our basement and measured my pulse after one minute of ellipticizing at various speeds:

For the second graph, I dusted off some data from McDonald (1989): I collected the amphipod crustacean Platorchestia platensis on a beach near Stony Brook, Long Island, in April, 1987, removed and counted the number of eggs each female was carrying, then freeze-dried and weighed the mothers:

There are three things you can do with this kind of data. One is a hypothesis test, to see if there is an association between the two variables; in other words, as the \(X\) variable goes up, does the \(Y\) variable tend to change (up or down). For the exercise data, you'd want to know whether pulse rate was significantly higher with higher speeds. The \(P\) value is \(1.3\times 10^{-8}\), but the relationship is so obvious from the graph, and so biologically unsurprising (of course my pulse rate goes up when I exercise harder!), that the hypothesis test wouldn't be a very interesting part of the analysis. For the amphipod data, you'd want to know whether bigger females had more eggs or fewer eggs than smaller amphipods, which is neither biologically obvious nor obvious from the graph. It may look like a random scatter of points, but there is a significant relationship (\(P=0.015)\).

The second goal is to describe how tightly the two variables are associated. This is usually expressed with \(r\), which ranges from \(-1\) to \(1\), or \(r^2\), which ranges from \(0\) to \(1\). For the exercise data, there's a very tight relationship, as shown by the \(r^2\) of \(0.98\); this means that if you knew my speed on the elliptical machine, you'd be able to predict my pulse quite accurately. The \(r^2\) for the amphipod data is a lot lower, at \(0.21\); this means that even though there's a significant relationship between female weight and number of eggs, knowing the weight of a female wouldn't let you predict the number of eggs she had with very much accuracy.
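The strength-of-association measures described above can be computed directly from paired measurements. The following is a minimal sketch of the Pearson correlation coefficient; the function name `pearson_r` and the speed/pulse numbers are hypothetical stand-ins, not the data behind the graphs in this section.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired measurements."""
    mean_x, mean_y = mean(xs), mean(ys)
    # Sums of cross-products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical speed (kph) and pulse (beats per minute) pairs, invented here.
speed = [0, 2, 4, 6, 8, 10]
pulse = [64, 70, 79, 86, 93, 101]

r = pearson_r(speed, pulse)
r_squared = r * r
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Because \(r^2\) is the square of \(r\), it discards the sign of the relationship and always falls between \(0\) and \(1\).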

The final goal is to determine the equation of a line that goes through the cloud of points. The equation of a line is given in the form \(\hat{Y}=a+bX\), where \(\hat{Y}\) is the value of \(Y\) predicted for a given value of \(X\), \(a\) is the \(Y\) intercept (the value of \(Y\) when \(X\) is zero), and \(b\) is the slope of the line (the change in \(\hat{Y}\) for a change in \(X\) of one unit). For the exercise data, the equation is \(\hat{Y}=63.5+3.75X\); this predicts that my pulse would be \(63.5\) when the speed of the elliptical machine is \(0\) kph, and my pulse would go up by \(3.75\) beats per minute for every \(1\) kph increase in speed. This is probably the most useful part of the analysis for the exercise data; if I wanted to exercise with a particular level of effort, as measured by pulse rate, I could use the equation to predict the speed I should use. For the amphipod data, the equation is \(\hat{Y}=12.7+1.60X\). For most purposes, just knowing that bigger amphipods have significantly more eggs (the hypothesis test) would be more interesting than knowing the equation of the line, but it depends on the goals of your experiment.
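Here is a small sketch of both steps: fitting \(\hat{Y}=a+bX\) by ordinary least squares, and then using the exercise equation quoted above for prediction. The helper names (`fit_line`, `predicted_pulse`, `speed_for_pulse`) and the toy points passed to `fit_line` are assumptions made for illustration; only the coefficients \(63.5\) and \(3.75\) come from the text.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares intercept a and slope b for Y-hat = a + b*X."""
    mean_x, mean_y = mean(xs), mean(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx              # slope: change in Y-hat per unit of X
    a = mean_y - b * mean_x      # intercept: Y-hat when X is zero
    return a, b


# Toy points lying exactly on Y = 1 + 2X, used only to check the fit.
a_demo, b_demo = fit_line([0, 1, 2, 3], [1, 3, 5, 7])

# Coefficients of the exercise equation quoted in the text.
A_EXERCISE, B_EXERCISE = 63.5, 3.75


def predicted_pulse(speed_kph):
    """Pulse predicted by the exercise equation for a given speed."""
    return A_EXERCISE + B_EXERCISE * speed_kph


def speed_for_pulse(target_pulse):
    """Invert the equation: the speed predicted to give a target pulse."""
    return (target_pulse - A_EXERCISE) / B_EXERCISE


print(predicted_pulse(10))             # pulse predicted at 10 kph
print(round(speed_for_pulse(120), 1))  # speed predicted for a pulse of 120
```

Inverting the line this way is only reasonable within the range of speeds that were actually measured; the equation says nothing reliable about speeds far outside it.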

When to use them

Use correlation/linear regression when you have two measurement variables, such as food intake and weight, drug dosage and blood pressure, air temperature and metabolic rate, etc.

There's also one nominal variable that keeps the two measurements together in pairs, such as the name of an individual organism, experimental trial, or location. I'm not aware that anyone else considers this nominal variable to be part of correlation and regression, and it's not something you need to know the value of—you could indicate that a food intake measurement and weight measurement came from the same rat by putting both numbers on the same line, without ever giving the rat a name. For that reason, I'll call it a "hidden" nominal variable.

The main value of the hidden nominal variable is that it lets me make the blanket statement that any time you have two or more measurements from a single individual (organism, experimental trial, location, etc.), the identity of that individual is a nominal variable; if you only have one measurement from an individual, the individual is not a nominal variable. I think this rule helps clarify the difference between one-way, two-way, and nested anova. If the idea of hidden nominal variables in regression confuses you, you can ignore it.

There are three main goals for correlation and regression in biology. One is to see whether two measurement variables are associated with each other; whether as one variable increases, the other tends to increase (or decrease). You summarize this test of association with the \(P\) value. In some cases, this addresses a biological question about cause-and-effect relationships; a significant association means that different values of the independent variable cause different values of the dependent. An example would be giving people different amounts of a drug and measuring their blood pressure. The null hypothesis would be that there was no relationship between the amount of drug and the blood pressure. If you reject the null hypothesis, you would conclude that the amount of drug causes the changes in blood pressure. In this kind of experiment, you determine the values of the independent variable; for example, you decide what dose of the drug each person gets. The exercise and pulse data are an example of this, as I determined the speed on the elliptical machine, then measured the effect on pulse rate.

In other cases, you want to know whether two variables are associated, without necessarily inferring a cause-and-effect relationship. In this case, you don't determine either variable ahead of time; both are naturally variable and you measure both of them. If you find an association, you infer that variation in \(X\) may cause variation in \(Y\), or variation in \(Y\) may cause variation in \(X\), or variation in some other factor may affect both \(Y\) and \(X\). An example would be measuring the amount of a particular protein on the surface of some cells and the pH of the cytoplasm of those cells. If the protein amount and pH are correlated, it may be that the amount of protein affects the internal pH; or the internal pH affects the amount of protein; or some other factor, such as oxygen concentration, affects both protein concentration and pH. Often, a significant correlation suggests further experiments to test for a cause and effect relationship; if protein concentration and pH were correlated, you might want to manipulate protein concentration and see what happens to pH, or manipulate pH and measure protein, or manipulate oxygen and see what happens to both. The amphipod data are another example of this; it could be that being bigger causes amphipods to have more eggs, or that having more eggs makes the mothers bigger (maybe they eat more when they're carrying more eggs?), or some third factor (age? food intake?) makes amphipods both larger and have more eggs.

The second goal of correlation and regression is estimating the strength of the relationship between two variables; in other words, how close the points on the graph are to the regression line. You summarize this with the \(r^2\) value. For example, let's say you've measured air temperature (ranging from \(15^{\circ}C\) to \(30^{\circ}C\)) and running speed in the lizard Agama savignyi, and you find a significant relationship: warmer lizards run faster. You would also want to know whether there's a tight relationship (high \(r^2\)), which would tell you that air temperature is the main factor affecting running speed; if the \(r^2\) is low, it would tell you that other factors besides air temperature are also important, and you might want to do more experiments to look for them. You might also want to know how the \(r^2\) for Agama savignyi compared to that for other lizard species, or for Agama savignyi under different conditions.

The third goal of correlation and regression is finding the equation of a line that fits the cloud of points. You can then use this equation for prediction. For example, if you have given volunteers diets with \(500 mg\) to \(2500 mg\) of salt per day, and then measured their blood pressure, you could use the regression line to estimate how much a person's blood pressure would go down if they ate \(500 mg\) less salt per day.

Correlation versus Linear Regression

The statistical tools used for hypothesis testing, describing the closeness of the association, and drawing a line through the points, are correlation and linear regression. Unfortunately, I find the descriptions of correlation and regression in most textbooks to be unnecessarily confusing. Some statistics textbooks have correlation and linear regression in separate chapters, and make it seem as if it is always important to pick one technique or the other. I think this overemphasizes the differences between them. Other books muddle correlation and regression together without really explaining what the difference is.

There are real differences between correlation and linear regression, but fortunately, they usually don't matter. Correlation and linear regression give the exact same \(P\) value for the hypothesis test, and for most biological experiments, that's the only really important result. So if you're mainly interested in the \(P\) value, you don't need to worry about the difference between correlation and regression.
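A minimal sketch of why the two analyses share a \(P\) value: the \(t\) statistic computed from the correlation coefficient and the \(t\) statistic computed as the regression slope divided by its standard error are algebraically the same number. The data values and function names below are hypothetical, invented only for illustration.

```python
import math

def sums_of_squares(x, y):
    """Return Sxx, Syy, Sxy and the two means for equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxx, syy, sxy, mean_x, mean_y

def t_from_correlation(x, y):
    """t statistic computed from the correlation coefficient r."""
    sxx, syy, sxy, _, _ = sums_of_squares(x, y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(len(x) - 2) / math.sqrt(1 - r ** 2)

def t_from_slope(x, y):
    """t statistic computed as the regression slope over its standard error."""
    sxx, _, sxy, mean_x, mean_y = sums_of_squares(x, y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    se_slope = math.sqrt(residual_ss / (len(x) - 2) / sxx)
    return slope / se_slope

# Hypothetical measurements, not taken from any data set in the text.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 2.9, 3.2, 4.8, 5.1, 5.5, 7.2, 7.9]

t_corr = t_from_correlation(x, y)
t_reg = t_from_slope(x, y)
print(t_corr, t_reg)  # the two values agree up to rounding error
```

Because both routes give the same \(t\) with the same \(n-2\) degrees of freedom, any table or software that converts \(t\) to \(P\) will return the same \(P\) value for either one.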

For the most part, I'll treat correlation and linear regression as different aspects of a single analysis, and you can consider correlation/linear regression to be a single statistical test. Be aware that my approach is probably different from what you'll see elsewhere.

The main difference between correlation and regression is that in correlation, you sample both measurement variables randomly from a population, while in regression you choose the values of the independent (\(X\)) variable. For example, let's say you're a forensic anthropologist, interested in the relationship between foot length and body height in humans. If you find a severed foot at a crime scene, you'd like to be able to estimate the height of the person it was severed from. You measure the foot length and body height of a random sample of humans, get a significant \(P\) value, and calculate \(r^2\) to be \(0.72\). This is a correlation, because you took measurements of both variables on a random sample of people. The \(r^2\) is therefore a meaningful estimate of the strength of the association between foot length and body height in humans, and you can compare it to other \(r^2\) values. You might want to see if the \(r^2\) for feet and height is larger or smaller than the \(r^2\) for hands and height, for example.

As an example of regression, let's say you've decided forensic anthropology is too disgusting, so now you're interested in the effect of air temperature on running speed in lizards. You put some lizards in a temperature chamber set to \(10^{\circ}C\), chase them, and record how fast they run. You do the same for \(10\) different temperatures, ranging up to \(30^{\circ}C\). This is a regression, because you decided which temperatures to use. You'll probably still want to calculate \(r^2\), just because high values are more impressive. But it's not a very meaningful estimate of anything about lizards. This is because the \(r^2\) depends on the values of the independent variable that you chose. For the exact same relationship between temperature and running speed, a narrower range of temperatures would give a smaller \(r^2\). Here are three graphs showing some simulated data, with the same scatter (standard deviation) of \(Y\) values at each value of \(X\). As you can see, with a narrower range of \(X\) values, the \(r^2\) gets smaller. If you did another experiment on humidity and running speed in your lizards and got a lower \(r^2\), you couldn't say that running speed is more strongly associated with temperature than with humidity; if you had chosen a narrower range of temperatures and a broader range of humidities, humidity might have had a larger \(r^2\) than temperature.
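The dependence of \(r^2\) on the chosen range of \(X\) can be checked with a small simulation. This is only a sketch under assumed values: the slope, intercept, scatter, temperature ranges, and sample sizes are all invented, and the same scatter of \(Y\) is used at every temperature.

```python
import random

def r_squared(x, y):
    """Coefficient of determination for a simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

def simulate_r_squared(temp_min, temp_max, lizards_per_temp=50, seed=1):
    """Simulate running speed at 10 evenly spaced, experimenter-chosen temperatures."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    chosen = [temp_min + i * (temp_max - temp_min) / 9 for i in range(10)]
    temps, speeds = [], []
    for temp in chosen:
        for _ in range(lizards_per_temp):
            temps.append(temp)
            # Same underlying relationship and same scatter in every experiment.
            speeds.append(2.0 + 0.5 * temp + rng.gauss(0, 3.0))
    return r_squared(temps, speeds)

r2_wide = simulate_r_squared(10, 30)    # broad range of temperatures
r2_narrow = simulate_r_squared(18, 22)  # narrow range, same relationship
print(r2_wide, r2_narrow)
```

The underlying relationship is identical in both runs; only the spread of the chosen \(X\) values differs, and that alone is enough to change \(r^2\) substantially.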

If you try to classify every experiment as either regression or correlation, you'll quickly find that there are many experiments that don't clearly fall into one category. For example, let's say that you study air temperature and running speed in lizards. You go out to the desert every Saturday for the eight months of the year that your lizards are active, measure the air temperature, then chase lizards and measure their speed. You haven't deliberately chosen the air temperature, just taken a sample of the natural variation in air temperature, so is it a correlation? But you didn't take a sample of the entire year, just those eight months, and you didn't pick days at random, just Saturdays, so is it a regression?

If you are mainly interested in using the \(P\) value for hypothesis testing, to see whether there is a relationship between the two variables, it doesn't matter whether you call the statistical test a regression or correlation. If you are interested in comparing the strength of the relationship (\(r^2\)) to the strength of other relationships, you are doing a correlation and should design your experiment so that you measure \(X\) and \(Y\) on a random sample of individuals. If you determine the \(X\) values before you do the experiment, you are doing a regression and shouldn't interpret the \(r^2\) as an estimate of something general about the population you've observed.

Correlation and Causation

You have probably heard people warn you, "Correlation does not imply causation." This is a reminder that when you are sampling natural variation in two variables, there is also natural variation in a lot of possible confounding variables that could cause the association between \(A\) and \(B\). So if you see a significant association between \(A\) and \(B\), it doesn't necessarily mean that variation in \(A\) causes variation in \(B\); there may be some other variable, \(C\), that affects both of them. For example, let's say you went to an elementary school, found \(100\) random students, measured how long it took them to tie their shoes, and measured the length of their thumbs. I'm pretty sure you'd find a strong association between the two variables, with longer thumbs associated with shorter shoe-tying times. I'm sure you could come up with a clever, sophisticated biomechanical explanation for why having longer thumbs causes children to tie their shoes faster, complete with force vectors and moment angles and equations and \(3-D\) modeling. However, that would be silly; your sample of \(100\) random students has natural variation in another variable, age, and older students have bigger thumbs and take less time to tie their shoes.
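The shoe-tying example can be sketched as a simulation in which age drives both variables and thumb length has no direct effect on tying time at all. Every number here (age range, growth rate, tying times, scatter, and a sample far larger than \(100\) students) is an assumption made up for illustration.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)  # fixed seed so the run is repeatable
students = []
for _ in range(7000):
    age = rng.randint(5, 11)
    # Both variables depend on age only; neither depends on the other.
    thumb_cm = 3.0 + 0.3 * age + rng.gauss(0, 0.3)
    tie_seconds = 60.0 - 4.0 * age + rng.gauss(0, 5.0)
    students.append((age, thumb_cm, tie_seconds))

# Correlation across all ages: strongly negative.
r_all = pearson_r([s[1] for s in students], [s[2] for s in students])

# Correlation among eight-year-olds only: close to zero.
eight = [s for s in students if s[0] == 8]
r_age8 = pearson_r([s[1] for s in eight], [s[2] for s in eight])
print(r_all, r_age8)
```

Holding age constant removes the association in this simulation because age was the only link built into it; in real data, other confounding variables could remain.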

So what if you make sure all your student volunteers are the same age, and you still see a significant association between shoe-tying time and thumb length; would that correlation imply causation? No, because think of why different children have different length thumbs. Some people are genetically larger than others; could the genes that affect overall size also affect fine motor skills? Maybe. Nutrition affects size, and family economics affects nutrition; could poor children have smaller thumbs due to poor nutrition, and also have slower shoe-tying times because their parents were too overworked to teach them to tie their shoes, or because they were so poor that they didn't get their first shoes until they reached school age? Maybe. I don't know, maybe some kids spend so much time sucking their thumb that the thumb actually gets longer, and having a slimy spit-covered thumb makes it harder to grip a shoelace. But there would be multiple plausible explanations for the association between thumb length and shoe-tying time, and it would be incorrect to conclude "Longer thumbs make you tie your shoes faster."

Since it's possible to think of multiple explanations for an association between two variables, does that mean you should cynically sneer "Correlation does not imply causation!" and dismiss any correlation studies of naturally occurring variation? No. For one thing, observing a correlation between two variables suggests that there's something interesting going on, something you may want to investigate further. For example, studies have shown a correlation between eating more fresh fruits and vegetables and lower blood pressure. It's possible that the correlation is because people with more money, who can afford fresh fruits and vegetables, have less stressful lives than poor people, and it's the difference in stress that affects blood pressure; it's also possible that people who are concerned about their health eat more fruits and vegetables and exercise more, and it's the exercise that affects blood pressure. But the correlation suggests that eating fruits and vegetables may reduce blood pressure. You'd want to test this hypothesis further, by looking for the correlation in samples of people with similar socioeconomic status and levels of exercise; by statistically controlling for possible confounding variables using techniques such as multiple regression; by doing animal studies; or by giving human volunteers controlled diets with different amounts of fruits and vegetables. If your initial correlation study hadn't found an association of blood pressure with fruits and vegetables, you wouldn't have a reason to do these further studies. Correlation may not imply causation, but it tells you that something interesting is going on.

In a regression study, you set the values of the independent variable, and you control or randomize all of the possible confounding variables. For example, if you are investigating the relationship between blood pressure and fruit and vegetable consumption, you might think that it's the potassium in the fruits and vegetables that lowers blood pressure. You could investigate this by getting a bunch of volunteers of the same sex, age, and socioeconomic status. You randomly choose the potassium intake for each person, give them the appropriate pills, have them take the pills for a month, then measure their blood pressure. All of the possible confounding variables are either controlled (age, sex, income) or randomized (occupation, psychological stress, exercise, diet), so if you see an association between potassium intake and blood pressure, the only possible cause would be that potassium affects blood pressure. So if you've designed your experiment correctly, regression does imply causation.

Null Hypothesis

The null hypothesis of correlation/linear regression is that the slope of the best-fit line is equal to zero; in other words, as the \(X\) variable gets larger, the associated \(Y\) variable gets neither higher nor lower.

It is also possible to test the null hypothesis that the \(Y\) value predicted by the regression equation for a given value of \(X\) is equal to some theoretical expectation; the most common would be testing the null hypothesis that the \(Y\) intercept is \(0\). This is rarely necessary in biological experiments, so I won't cover it here, but be aware that it is possible.

Independent vs. dependent variables

When you are testing a cause-and-effect relationship, the variable that causes the relationship is called the independent variable and you plot it on the \(X\) axis, while the effect is called the dependent variable and you plot it on the \(Y\) axis. In some experiments you set the independent variable to values that you have chosen; for example, if you're interested in the effect of temperature on calling rate of frogs, you might put frogs in temperature chambers set to \(10^{\circ}C\), \(15^{\circ}C\), \(20^{\circ}C\), etc. In other cases, both variables exhibit natural variation, but any cause-and-effect relationship could run in only one direction; if you measure the air temperature and frog calling rate at a pond on several different nights, both the air temperature and the calling rate would display natural variation, but if there's a cause-and-effect relationship, it's temperature affecting calling rate; the rate at which frogs call does not affect the air temperature.

Sometimes it's not clear which is the independent variable and which is the dependent, even if you think there may be a cause-and-effect relationship. For example, if you are testing whether salt content in food affects blood pressure, you might measure the salt content of people's diets and their blood pressure, and treat salt content as the independent variable. But if you were testing the idea that high blood pressure causes people to crave high-salt foods, you'd make blood pressure the independent variable and salt intake the dependent variable.

Sometimes, you're not looking for a cause-and-effect relationship at all, you just want to see if two variables are related. For example, if you measure the range-of-motion of the hip and the shoulder, you're not trying to see whether more flexible hips cause more flexible shoulders, or more flexible shoulders cause more flexible hips; instead, you're just trying to see if people with more flexible hips also tend to have more flexible shoulders, presumably due to some factor (age, diet, exercise, genetics) that affects overall flexibility. In this case, it would be completely arbitrary which variable you put on the \(X\) axis and which you put on the \(Y\) axis.

Fortunately, the \(P\) value and the \(r^2\) are not affected by which variable you call the \(X\) and which you call the \(Y\); you'll get mathematically identical values either way. The least-squares regression line does depend on which variable is the \(X\) and which is the \(Y\); the two lines can be quite different if the \(r^2\) is low. If you're truly interested only in whether the two variables covary, and you are not trying to infer a cause-and-effect relationship, you may want to avoid using the linear regression line as decoration on your graph.

Researchers in a few fields traditionally put the independent variable on the \(Y\) axis. Oceanographers, for example, often plot depth on the \(Y\) axis (with \(0\) at the top) and a variable that is directly or indirectly affected by depth, such as chlorophyll concentration, on the \(X\) axis. I wouldn't recommend this unless it's a really strong tradition in your field, as it could lead to confusion about which variable you're considering the independent variable in a linear regression.

How the test works

Regression line.

Linear regression finds the line that best fits the data points. There are actually a number of different definitions of "best fit," and therefore a number of different methods of linear regression that fit somewhat different lines. By far the most common is "ordinary least-squares regression"; when someone just says "least-squares regression" or "linear regression" or "regression," they mean ordinary least-squares regression.

In ordinary least-squares regression, the "best" fit is defined as the line that minimizes the squared vertical distances between the data points and the line. For a data point with an \(X\) value of \(X_1\) and a \(Y\) value of \(Y_1\), the difference between \(Y_1\) and \(\hat{Y_1}\) (the predicted value of \(Y\) at \(X_1\)) is calculated, then squared. This squared deviate is calculated for each data point, and the sum of these squared deviates measures how well a line fits the data. The regression line is the one for which this sum of squared deviates is smallest. I'll leave out the math that is used to find the slope and intercept of the best-fit line; you're a biologist and have more important things to think about.
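For readers who do want to see the calculation, here is a minimal sketch using the standard closed-form formulas for the ordinary least-squares slope and intercept. The data are hypothetical. The check at the end nudges the fitted line slightly in each direction to show that every nudge increases the sum of squared deviates.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_deviates(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and a line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, invented for illustration.
x = [2.0, 4.0, 5.0, 7.0, 8.0, 10.0]
y = [3.1, 4.9, 7.2, 8.8, 9.1, 12.4]

a, b = least_squares_line(x, y)
best = sum_squared_deviates(x, y, a, b)

# Any small change to the intercept or slope makes the fit worse.
nudged = [
    sum_squared_deviates(x, y, a + da, b + db)
    for da, db in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]
]
print(best, nudged)
```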

The equation for the regression line is usually expressed as \(\hat{Y}=a+bX\), where \(a\) is the \(Y\) intercept and \(b\) is the slope. Once you know \(a\) and \(b\), you can use this equation to predict the value of \(Y\) for a given value of \(X\). For example, the equation for the heart rate-speed experiment is \(\text{rate}=63.357+3.749\times \text{speed}\). I could use this to predict that for a speed of \(10 kph\), my heart rate would be \(100.8 bpm\). You should do this kind of prediction within the range of \(X\) values found in the original data set (interpolation). Predicting \(Y\) values outside the range of observed values (extrapolation) is sometimes interesting, but it can easily yield ridiculous results if you go far outside the observed range of \(X\). In the frog example below, you could mathematically predict that the inter-call interval would be about \(16\) seconds at \(-40^{\circ}C\). Actually, the inter-call interval would be infinity at that temperature, because all the frogs would be frozen solid.
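The heart rate prediction is simple arithmetic, sketched here with the intercept and slope quoted in the text. The function name is hypothetical.

```python
INTERCEPT = 63.357  # a, the Y intercept from the heart rate-speed equation
SLOPE = 3.749       # b, the slope from the same equation

def predict_heart_rate(speed_kph):
    """Predicted heart rate (bpm) from the regression line Y-hat = a + bX."""
    return INTERCEPT + SLOPE * speed_kph

predicted = predict_heart_rate(10)
print(round(predicted, 1))  # 100.8
```

The function will happily return a number for any speed, including ones far outside the observed range, so the caution about extrapolation has to be applied by the person using it.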

Sometimes you want to predict \(X\) from \(Y\). The most common use of this is constructing a standard curve. For example, you might weigh some dry protein and dissolve it in water to make solutions containing \(0,\; 100,\; 200…1000\; µg\) protein per \(ml\), add some reagents that turn color in the presence of protein, then measure the light absorbance of each solution using a spectrophotometer. Then when you have a solution with an unknown concentration of protein, you add the reagents, measure the light absorbance, and estimate the concentration of protein in the solution.

There are two common methods to estimate \(X\) from \(Y\). One way is to do the usual regression with \(X\) as the independent variable and \(Y\) as the dependent variable; for the protein example, you'd have protein as the independent variable and absorbance as the dependent variable. You get the usual equation, \(\hat{Y}=a+bX\), then rearrange it to solve for \(X\), giving you \(\hat{X}=\frac{(Y-a)}{b}\). This is called "classical estimation."

The other method is to do linear regression with \(Y\) as the independent variable and \(X\) as the dependent variable, also known as regressing \(X\) on \(Y\). For the protein standard curve, you would do a regression with absorbance as the \(X\) variable and protein concentration as the \(Y\) variable. You then use this regression equation to predict unknown values of \(X\) from \(Y\). This is known as "inverse estimation."

Several simulation studies have suggested that inverse estimation gives a more accurate estimate of \(X\) than classical estimation (Krutchkoff 1967, Krutchkoff 1969, Lwin and Maritz 1982, Kannan et al. 2007), so that is what I recommend. However, some statisticians prefer classical estimation (Sokal and Rohlf 1995, pp. 491-493). If the \(r^2\) is high (the points are close to the regression line), the difference between classical estimation and inverse estimation is pretty small. When you're constructing a standard curve for something like protein concentration, the \(r^2\) is usually so high that the difference between classical and inverse estimation will be trivial. But the two methods can give quite different estimates of \(X\) when the original points were scattered around the regression line. For the exercise and pulse data, with an \(r^2\) of \(0.98\), classical estimation predicts that to get a pulse of \(100 bpm\), I should run at \(9.8 kph\), while inverse estimation predicts a speed of \(9.7 kph\). The amphipod data has a much lower \(r^2\) of \(0.25\), so the difference between the two techniques is bigger; if I want to know what size amphipod would have \(30\) eggs, classical estimation predicts a size of \(10.8 mg\), while inverse estimation predicts a size of \(7.5 mg\).
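The two ways of estimating \(X\) from \(Y\) can be sketched side by side on a standard curve. The protein concentrations and absorbances below are hypothetical values invented for illustration, as are the function names and the unknown reading.

```python
def least_squares_line(x, y):
    """Return (intercept, slope) of the least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical standard curve: protein (micrograms per ml) and absorbance.
protein = [0, 200, 400, 600, 800, 1000]
absorbance = [0.02, 0.21, 0.38, 0.63, 0.79, 1.01]

# Classical estimation: regress absorbance on protein, then solve for protein.
a, b = least_squares_line(protein, absorbance)

def classical_estimate(measured_absorbance):
    return (measured_absorbance - a) / b

# Inverse estimation: regress protein on absorbance and predict directly.
c, d = least_squares_line(absorbance, protein)

def inverse_estimate(measured_absorbance):
    return c + d * measured_absorbance

unknown = 0.90  # hypothetical absorbance reading of an unknown sample
classical = classical_estimate(unknown)
inverse = inverse_estimate(unknown)
print(classical, inverse)
```

With a tight standard curve like this one, the two estimates differ only slightly; rerunning the sketch with noisier absorbance values makes the gap between them larger.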

Sometimes your goal in drawing a regression line is not predicting \(Y\) from \(X\), or predicting \(X\) from \(Y\), but instead describing the relationship between two variables. If one variable is the independent variable and the other is the dependent variable, you should use the least-squares regression line. However, if there is no cause-and-effect relationship between the two variables, the least-squares regression line is inappropriate. This is because you will get two different lines, depending on which variable you pick to be the independent variable. For example, if you want to describe the relationship between thumb length and big toe length, you would get one line if you made thumb length the independent variable, and a different line if you made big-toe length the independent variable. The choice would be completely arbitrary, as there is no reason to think that thumb length causes variation in big-toe length, or vice versa.

A number of different lines have been proposed to describe the relationship between two variables with a symmetrical relationship (where neither is the independent variable). The most common method is reduced major axis regression (also known as standard major axis regression or geometric mean regression). It gives a line that is intermediate in slope between the least-squares regression line of \(Y\) on \(X\) and the least-squares regression line of \(X\) on \(Y\); in fact, the slope of the reduced major axis line is the geometric mean of the slopes of the two least-squares regression lines (with both slopes expressed as \(Y\) against \(X\)).
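A short sketch of how the three slopes relate, using hypothetical thumb and big-toe lengths. The reduced major axis slope is computed here as the ratio of the standard deviations of \(Y\) and \(X\), carrying the sign of the correlation, which is the same as the geometric mean of the two least-squares slopes once both are expressed as \(Y\) against \(X\).

```python
import math

def three_slopes(x, y):
    """Return the Y-on-X slope, the X-on-Y slope re-expressed as Y against X,
    and the reduced major axis slope."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope_y_on_x = sxy / sxx
    # Regressing X on Y gives slope sxy / syy in X-per-Y units;
    # its reciprocal puts that line on the same Y-against-X scale.
    slope_x_on_y_as_y_vs_x = syy / sxy
    rma_slope = math.copysign(math.sqrt(syy / sxx), sxy)
    return slope_y_on_x, slope_x_on_y_as_y_vs_x, rma_slope

# Hypothetical lengths in cm, invented for illustration.
thumb = [5.1, 5.6, 6.0, 6.3, 6.8, 7.2]
big_toe = [4.2, 5.0, 4.6, 5.5, 5.2, 6.1]

b_yx, b_xy_inverted, b_rma = three_slopes(thumb, big_toe)
print(b_yx, b_rma, b_xy_inverted)  # the RMA slope falls between the other two
```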

While reduced major axis regression gives a line that is in some ways a better description of the symmetrical relationship between two variables (McArdle 2003, Smith 2009), you should keep two things in mind. One is that you shouldn't use the reduced major axis line for predicting values of \(X\) from \(Y\), or \(Y\) from \(X\); you should still use least-squares regression for prediction. The other thing to know is that you cannot test the null hypothesis that the slope of the reduced major axis line is zero, because it is mathematically impossible to have a reduced major axis slope that is exactly zero. Even if your graph shows a reduced major axis line, your \(P\) value is the test of the null that the least-squares regression line has a slope of zero.

Coefficient of determination (\(r^2\))

The coefficient of determination, or \(r^2\), expresses the strength of the relationship between the \(X\) and \(Y\) variables. It is the proportion of the variation in the \(Y\) variable that is "explained" by the variation in the \(X\) variable. \(r^2\) can vary from \(0\) to \(1\); values near \(1\) mean the \(Y\) values fall almost right on the regression line, while values near \(0\) mean there is very little relationship between \(X\) and \(Y\). As you can see, regressions can have a small \(r^2\) and not look like there's any relationship, yet they still might have a slope that's significantly different from zero.

To illustrate the meaning of \(r^2\), here are six pairs of \(X\) and \(Y\) values:

If you didn't know anything about the \(X\) value and were told to guess what a \(Y\) value was, your best guess would be the mean \(Y\); for this example, the mean \(Y\) is \(10\). The squared deviates of the \(Y\) values from their mean is the total sum of squares, familiar from analysis of variance. The vertical lines on the left graph below show the deviates from the mean; the first point has a deviate of \(8\), so its squared deviate is \(64\), etc. The total sum of squares for these numbers is \(64+1+1+1+16+25=108\).

If you did know the \(X\) value and were told to guess what a \(Y\) value was, you'd calculate the regression equation and use it. The regression equation for these numbers is \(\hat{Y}=2.0286+1.5429X\), so for the first \(X\) value you'd predict a \(Y\) value of \(2.0286+1.5429\times 1=3.5715\), etc. The vertical lines on the right graph above show the deviates of the actual \(Y\) values from the predicted \(\hat{Y}\) values. As you can see, most of the points are closer to the regression line than they are to the overall mean. Squaring these deviates and taking the sum gives us the regression sum of squares (many other texts call this quantity the residual or error sum of squares), which for these numbers is \(10.8\).

The regression sum of squares is \(10.8\), which is \(90\%\) smaller than the total sum of squares (\(108\)). This difference between the two sums of squares, expressed as a fraction of the total sum of squares, is the definition of \(r^2\). In this case we would say that \(r^2=0.90\); the \(X\) variable "explains" \(90\%\) of the variation in the \(Y\) variable.
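The arithmetic above can be reproduced in a few lines. The six pairs used here are an assumption: the table is not shown in this text, and these values were chosen because they reproduce every number quoted (a mean \(Y\) of \(10\), a total sum of squares of \(108\), the equation \(\hat{Y}=2.0286+1.5429X\), and a sum of squared deviates from the line of \(10.8\)).

```python
# Assumed data: picked to match the sums quoted in the text, not copied from a table.
x = [1, 3, 5, 6, 7, 9]
y = [2, 9, 9, 11, 14, 15]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Total sum of squares: squared deviates of Y from the mean of Y.
total_ss = sum((yi - mean_y) ** 2 for yi in y)

# Least-squares slope and intercept.
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Squared deviates of Y from the regression line (10.8 in the text).
line_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# r^2 is the proportional reduction from the total sum of squares.
r_squared = (total_ss - line_ss) / total_ss
print(round(intercept, 4), round(slope, 4))
print(round(total_ss, 1), round(line_ss, 1), round(r_squared, 2))
```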

The \(r^2\) value is formally known as the "coefficient of determination," although it is usually just called \(r^2\). The square root of \(r^2\), with a negative sign if the slope is negative, is the Pearson product-moment correlation coefficient, \(r\), or just "correlation coefficient." You can use either \(r\) or \(r^2\) to describe the strength of the association between two variables. I prefer \(r^2\), because it is used more often in my area of biology, it has a more understandable meaning (the proportional difference between total sum of squares and regression sum of squares), and it doesn't have those annoying negative values. You should become familiar with the literature in your field and use whichever measure is most common. One situation where \(r\) is more useful is if you have done linear regression/correlation for multiple sets of samples, with some having positive slopes and some having negative slopes, and you want to know whether the mean correlation coefficient is significantly different from zero; see McDonald and Dunn (2013) for an application of this idea.

Test statistic

The test statistic for a linear regression is \(t_s=\frac{\sqrt{d.f.\times r^2}}{\sqrt{1-r^2}}\). It gets larger as the degrees of freedom (\(n-2\)) get larger or the \(r^2\) gets larger. Under the null hypothesis, the test statistic is \(t\)-distributed with \(n-2\) degrees of freedom. When reporting the results of a linear regression, most people just give the \(r^2\) and degrees of freedom, not the \(t_s\) value. Anyone who really needs the \(t_s\) value can calculate it from the \(r^2\) and degrees of freedom.

For the heart rate–speed data, the \(r^2\) is \(0.976\) and there are \(9\) degrees of freedom, so the \(t_s\)-statistic is \(19.2\). It is significant (\(P=1.3\times 10^{-8}\)).
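As a rough check, the statistic can be recomputed from the rounded values quoted here, using the standard relationship \(t_s=\sqrt{d.f.\times r^2/(1-r^2)}\). The function name is hypothetical. The result comes out near \(19.1\) rather than \(19.2\), presumably because the \(r^2\) of \(0.976\) is itself rounded.

```python
import math

def regression_t(r_squared, degrees_of_freedom):
    """t statistic for a linear regression from r^2 and its degrees of freedom (n - 2)."""
    return math.sqrt(degrees_of_freedom * r_squared / (1 - r_squared))

t_s = regression_t(0.976, 9)  # rounded r^2 and d.f. quoted in the text
f_stat = t_s ** 2             # equivalent F statistic with 1 and 9 d.f.
print(round(t_s, 2), round(f_stat, 1))
```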

Some people square \(t_s\) and get an \(F\)-statistic with \(1\) degree of freedom in the numerator and \(n-2\) degrees of freedom in the denominator. The resulting \(P\) value is mathematically identical to that calculated with \(t_s\).

Because the \(P\) value is a function of both the \(r^2\) and the sample size, you should not use the \(P\) value as a measure of the strength of association. If the correlation of \(A\) and \(B\) has a smaller \(P\) value than the correlation of \(A\) and \(C\), it doesn't necessarily mean that \(A\) and \(B\) have a stronger association; it could just be that the data set for the \(A\)–\(B\) experiment was larger. If you want to compare the strength of association of different data sets, you should use \(r\) or \(r^2\).

Assumptions

Normality and homoscedasticity.

Two assumptions, similar to those for anova, are that for any value of \(X\), the \(Y\) values will be normally distributed and they will be homoscedastic. Although you will rarely have enough data to test these assumptions, they are often violated.

Fortunately, numerous simulation studies have shown that regression and correlation are quite robust to deviations from normality; this means that even if one or both of the variables are non-normal, the \(P\) value will be less than \(0.05\) about \(5\%\) of the time if the null hypothesis is true (Edgell and Noon 1984, and references therein). So in general, you can use linear regression/correlation without worrying about non-normality.

Sometimes you'll see a regression or correlation that looks like it may be significant due to one or two points being extreme on both the \(x\) and \(y\) axes. In this case, you may want to use Spearman's rank correlation, which reduces the influence of extreme values, or you may want to find a data transformation that makes the data look more normal. Another approach would be to analyze the data without the extreme values, and report the results both with and without the outlying points; your life will be easier if the results are similar with or without them.

When there is a significant regression or correlation, \(X\) values with higher mean \(Y\) values will often have higher standard deviations of \(Y\) as well. This happens because the standard deviation is often a constant proportion of the mean. For example, people who are \(1.5\) meters tall might have a mean weight of \(50 kg\) and a standard deviation of \(10 kg\), while people who are \(2\) meters tall might have a mean weight of \(100 kg\) and a standard deviation of \(20 kg\). When the standard deviation of \(Y\) is proportional to the mean, you can make the data be homoscedastic with a log transformation of the \(Y\) variable.
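A small simulation can illustrate the effect of the log transformation. It assumes multiplicative scatter around means of \(50 kg\) and \(100 kg\), so that the standard deviation is roughly \(20\%\) of the mean as in the example; the sample size, seed, and function name are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

def simulate_weights(typical_kg, n=2000):
    """Weights with multiplicative scatter, so the SD scales with the mean."""
    return [typical_kg * math.exp(rng.gauss(0, 0.2)) for _ in range(n)]

short_people = simulate_weights(50)   # people about 1.5 meters tall
tall_people = simulate_weights(100)   # people about 2 meters tall

# On the raw scale the taller group has about twice the standard deviation.
raw_ratio = statistics.stdev(tall_people) / statistics.stdev(short_people)

# After a log transformation the two standard deviations are about equal.
log_ratio = (statistics.stdev([math.log(w) for w in tall_people])
             / statistics.stdev([math.log(w) for w in short_people]))
print(raw_ratio, log_ratio)
```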

Linear regression and correlation assume that the data fit a straight line. If you look at the data and the relationship looks curved, you can try different data transformations of the \(X\), the \(Y\), or both, and see which makes the relationship straight. Of course, it's best if you choose a data transformation before you analyze your data. You can choose a data transformation beforehand based on previous data you've collected, or based on the data transformation that others in your field use for your kind of data.

A data transformation will often straighten out a J-shaped curve. If your curve looks U-shaped, S-shaped, or something more complicated, a data transformation won't turn it into a straight line. In that case, you'll have to use curvilinear regression.

Independence

Linear regression and correlation assume that the data points are independent of each other, meaning that the value of one data point does not depend on the value of any other data point. The most common violation of this assumption in regression and correlation is in time series data, where some \(Y\) variable has been measured at different times. For example, biologists have counted the number of moose on Isle Royale, a large island in Lake Superior, every year. Moose live a long time, so the number of moose in one year is not independent of the number of moose in the previous year, it is highly dependent on it; if the number of moose in one year is high, the number in the next year will probably be pretty high, and if the number of moose is low one year, the number will probably be low the next year as well. This kind of non-independence, or "autocorrelation," can give you a "significant" regression or correlation much more often than \(5\%\) of the time, even when the null hypothesis of no relationship between time and \(Y\) is true. If both \(X\) and \(Y\) are time series—for example, you analyze the number of wolves and the number of moose on Isle Royale—you can also get a "significant" relationship between them much too often.
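The inflation of false positives can be sketched with simulated data. Here two unrelated series of \(10\) points each are generated either as independent noise or as random walks (each value equals the previous value plus noise, a simple stand-in for an autocorrelated time series), and the correlation is tested at the \(5\%\) level. The random-walk model, the number of trials, and the function names are assumptions; \(2.306\) is the two-tailed \(5\%\) critical value of \(t\) with \(8\) degrees of freedom.

```python
import math
import random

T_CRITICAL_DF8 = 2.306  # two-tailed 5% critical value of t with 8 d.f.

def t_statistic(x, y):
    """t statistic for the correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def white_noise(rng, n):
    """Independent observations: each value ignores the previous one."""
    return [rng.gauss(0, 1) for _ in range(n)]

def random_walk(rng, n):
    """Autocorrelated observations: each value builds on the previous one."""
    total, walk = 0.0, []
    for _ in range(n):
        total += rng.gauss(0, 1)
        walk.append(total)
    return walk

def false_positive_rate(make_series, trials=2000, n=10, seed=3):
    """Fraction of unrelated series pairs declared 'significant' at 5%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = make_series(rng, n), make_series(rng, n)
        if abs(t_statistic(x, y)) > T_CRITICAL_DF8:
            hits += 1
    return hits / trials

noise_rate = false_positive_rate(white_noise)
walk_rate = false_positive_rate(random_walk)
print(noise_rate, walk_rate)
```

The independent series should be flagged close to \(5\%\) of the time, while the random walks are flagged far more often even though no pair of series is actually related.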

To illustrate how easy it is to fool yourself with time-series data, I tested the correlation between the number of moose on Isle Royale in the winter and the number of strikeouts thrown by major league baseball teams the following season, using data for 2004–2013. I did this separately for each baseball team, so there were 30 statistical tests. I'm pretty sure the null hypothesis is true (I can't think of anything that would affect both moose abundance in the winter and strikeouts the following summer), so with \(30\) baseball teams, you'd expect the \(P\) value to be less than \(0.05\) for \(5\%\) of the teams, or about one or two. Instead, the \(P\) value is significant for \(7\) teams, which means that if you were stupid enough to test the correlation of moose numbers and strikeouts by your favorite team, you'd have almost a \(1\)-in-\(4\) chance of convincing yourself there was a relationship between the two. Some of the correlations look pretty good: strikeout numbers by the Cleveland team and moose numbers have an \(r^2\) of \(0.70\) and a \(P\) value of \(0.002\):

There are special statistical tests for time-series data. I will not cover them here; if you need to use them, see how other people in your field have analyzed data similar to yours, then find out more about the methods they used.

Spatial autocorrelation is another source of non-independence. This occurs when you measure a variable at locations that are close enough together that nearby locations will tend to have similar values. For example, if you want to know whether the abundance of dandelions is associated with the amount of phosphate in the soil, you might mark a bunch of \(1\; m^2\) squares in a field, count the number of dandelions in each quadrat, and measure the phosphate concentration in the soil of each quadrat. However, both dandelion abundance and phosphate concentration are likely to be spatially autocorrelated; if one quadrat has a lot of dandelions, its neighboring quadrats will also have a lot of dandelions, for reasons that may have nothing to do with phosphate. Similarly, soil composition changes gradually across most areas, so a quadrat with low phosphate will probably be close to other quadrats that are low in phosphate. It would be easy to find a significant correlation between dandelion abundance and phosphate concentration, even if there is no real relationship. If you need to learn about spatial autocorrelation in ecology, Dale and Fortin (2009) is a good place to start.

Another area where spatial autocorrelation is a problem is image analysis. For example, if you label one protein green and another protein red, then look at the amount of red and green protein in different parts of a cell, the high level of autocorrelation between neighboring pixels makes it very easy to find a correlation between the amount of red and green protein, even if there is no true relationship. See McDonald and Dunn (2013) for a solution to this problem.

A common observation in ecology is that species diversity decreases as you get further from the equator. To see whether this pattern could be seen on a small scale, I used data from the Audubon Society's Christmas Bird Count, in which birders try to count all the birds in a \(15\; mile\) diameter area during one winter day. I looked at the total number of species seen in each area on the Delmarva Peninsula during the 2005 count. Latitude and number of bird species are the two measurement variables; location is the hidden nominal variable.

The result is \(r^2=0.214\), with \(15 d.f.\), so the \(P\) value is \(0.061\). The trend is in the expected direction, but it is not quite significant. The equation of the regression line is \(\text {number of species}=-12.039\times \text {latitude}+585.14\). Even if it were significant, I don't know what you'd do with the equation; I suppose you could extrapolate and use it to predict that above the \(49^{th}\) parallel, there would be fewer than zero bird species.
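The extrapolation joke can be checked with a few lines of arithmetic. This Python sketch simply plugs the slope and intercept quoted above into the regression equation; the function name `predicted_species` is introduced for illustration.

```python
# Regression line quoted in the text: species = -12.039 * latitude + 585.14
SLOPE = -12.039
INTERCEPT = 585.14


def predicted_species(latitude: float) -> float:
    """Predicted number of bird species at a given latitude (degrees north)."""
    return SLOPE * latitude + INTERCEPT


# Latitude at which the fitted line crosses zero species.
zero_latitude = -INTERCEPT / SLOPE

print(f"predicted at 38.6 degrees: {predicted_species(38.6):.1f}")
print(f"predicted at 49 degrees: {predicted_species(49.0):.1f}")
print(f"line reaches zero at: {zero_latitude:.1f} degrees")
```

The line crosses zero a little south of the \(49^{th}\) parallel, so the prediction there is negative, which illustrates why extrapolating far beyond the observed range of \(X\) is unsafe.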

Gayou (1984) measured the intervals between male mating calls in the gray tree frog, Hyla versicolor, at different temperatures. The regression line is \(\text {interval}=-0.205\times \text{temperature}+8.36\), and it is highly significant (\(r^2=0.29,\; 45 d.f.,\; P=9\times 10^{-5}\)). You could rearrange the equation, \(\text{temperature}=\frac{(\text{interval}-8.36)}{(-0.205)}\), measure the interval between frog mating calls, and estimate the air temperature. Or you could buy a thermometer.
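The rearrangement of the frog equation can be written as a pair of functions. This is a minimal Python sketch using only the slope and intercept quoted above; the function names are made up for illustration, and the interval is in whatever units the original study used.

```python
# Regression line quoted in the text: interval = -0.205 * temperature + 8.36
SLOPE = -0.205
INTERCEPT = 8.36


def interval_from_temperature(temperature: float) -> float:
    """Predicted interval between mating calls at a given air temperature."""
    return SLOPE * temperature + INTERCEPT


def temperature_from_interval(interval: float) -> float:
    """Rearranged equation: estimate air temperature from a measured interval."""
    return (interval - INTERCEPT) / SLOPE


interval_at_20 = interval_from_temperature(20.0)
print(f"predicted interval at 20 degrees: {interval_at_20:.2f}")
print(f"temperature recovered from that interval: "
      f"{temperature_from_interval(interval_at_20):.1f}")
```

The round trip recovers the starting temperature exactly only because both directions use the same fitted line; with real measurements the scatter around the line (\(r^2=0.29\)) would make the temperature estimate quite imprecise.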

Goheen et al. (2003) captured \(14\) female northern grasshopper mice (Onychomys leucogaster) in north-central Kansas, measured the body length, and counted the number of offspring. There are two measurement variables, body length and number of offspring, and the authors were interested in whether larger body size causes an increase in the number of offspring, so they did a linear regression. The results are significant: \(r^2=0.46,\; 12 d.f.,\; P=0.008\). The equation of the regression line is \(\text{offspring}=0.108\times \text{length}-7.88\).

Graphing the results

In a spreadsheet, you show the results of a regression on a scatter graph, with the independent variable on the \(X\) axis. To add the regression line to the graph, finish making the graph, then select the graph and go to the Chart menu. Choose "Add Trendline" and choose the straight line. If you want to show the regression line extending beyond the observed range of \(X\) values, choose "Options" and adjust the "Forecast" numbers until you get the line you want.

Similar tests

Sometimes it is not clear whether an experiment includes one measurement variable and two nominal variables, and should be analyzed with a two-way anova or paired \(t\)–test, or includes two measurement variables and one hidden nominal variable, and should be analyzed with correlation and regression. In that case, your choice of test is determined by the biological question you're interested in. For example, let's say you've measured the range of motion of the right shoulder and left shoulder of a bunch of right-handed people. If your question is "Is there an association between the range of motion of people's right and left shoulders (do people with more flexible right shoulders also tend to have more flexible left shoulders)?", you'd treat "right shoulder range-of-motion" and "left shoulder range-of-motion" as two different measurement variables, and individual as one hidden nominal variable, and analyze with correlation and regression. If your question is "Is the right shoulder more flexible than the left shoulder?", you'd treat "range of motion" as one measurement variable, "right vs. left" as one nominal variable, individual as one nominal variable, and you'd analyze with two-way anova or a paired \(t\)–test.
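The two shoulder questions lead to two different calculations on the same numbers. The Python sketch below uses made-up range-of-motion values for five people (hypothetical data, not from any study) to show that the correlation coefficient answers the association question while the paired \(t\) statistic answers the right-versus-left question.

```python
import math
from statistics import mean, stdev

# Hypothetical range-of-motion measurements (degrees) for five people.
right = [150.0, 160.0, 170.0, 180.0, 190.0]
left = [145.0, 156.0, 165.0, 176.0, 184.0]


def pearson_r(x, y):
    """Association question: do the two measurements rise and fall together?"""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Difference question: is the mean within-person difference non-zero?"""
    diffs = [a - b for a, b in zip(x, y)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


r = pearson_r(right, left)
t = paired_t(right, left)
print(f"correlation r = {r:.3f}")
print(f"paired t = {t:.2f} with {len(right) - 1} d.f.")
```

In this invented data set both answers happen to be strong, but they are logically separate: the shoulders could be highly correlated with no mean difference, or differ on average while being only weakly correlated.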

If the dependent variable is a percentage, such as percentage of people who have heart attacks on different doses of a drug, it's really a nominal variable, not a measurement. Each individual observation is a value of the nominal variable ("heart attack" or "no heart attack"); the percentage is not really a single observation, it's a way of summarizing a bunch of observations. One approach for percentage data is to arcsine transform the percentages and analyze with correlation and linear regression. You'll see this in the literature, and it's not horrible, but it's better to analyze using logistic regression.
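For reference, the arcsine transformation mentioned above is usually taken to mean the arcsine of the square root of the proportion. The Python sketch below assumes that convention and uses made-up percentages; it shows only the transformation step, not the regression that would follow.

```python
import math


def arcsine_transform(proportion: float) -> float:
    """Arcsine square-root transform of a proportion between 0 and 1 (radians)."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    return math.asin(math.sqrt(proportion))


# Hypothetical percentages, e.g. percent of people with heart attacks per dose.
percentages = [5.0, 25.0, 50.0, 75.0, 95.0]
transformed = [arcsine_transform(p / 100.0) for p in percentages]

for p, value in zip(percentages, transformed):
    print(f"{p:5.1f}% -> {value:.4f}")
```

The transform stretches out values near \(0\%\) and \(100\%\) relative to those near \(50\%\). It does not change the underlying point made above: each percentage still summarizes many nominal observations, which is why logistic regression is the better choice.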

If the relationship between the two measurement variables is best described by a curved line, not a straight one, one possibility is to try different transformations on one or both of the variables. The other option is to use curvilinear regression.

If one or both of your variables are ranked variables, not measurement, you should use Spearman rank correlation. Some people recommend Spearman rank correlation when the assumptions of linear regression/correlation (normality and homoscedasticity) are not met, but I'm not aware of any research demonstrating that Spearman is really better in this situation.
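Spearman rank correlation can be sketched as an ordinary Pearson correlation computed on ranks, with tied values given the average of the ranks they span. The Python code below is a minimal illustration with made-up data; the function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: Y always increases with X, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]
print(f"Pearson r = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the ranks of \(X\) and \(Y\) agree perfectly in this example, the rank correlation is \(1\) even though the linear correlation is lower.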

To compare the slopes or intercepts of two or more regression lines to each other, use ancova.

If you have more than two measurement variables, use multiple regression.

How to do the test

Spreadsheet.

I have put together a spreadsheet regression.xls to do linear regression and correlation on up to \(1000\) pairs of observations. It provides the following:

  • The regression coefficient (the slope of the regression line).
  • The \(Y\) intercept. With the slope and the intercept, you have the equation for the regression line: \(\hat{Y}=a+bX\), where \(a\) is the \(Y\) intercept and \(b\) is the slope.
  • The \(r^2\) value.
  • The degrees of freedom. There are \(n-2\) degrees of freedom in a regression, where \(n\) is the number of observations.
  • The \(P\) value. This gives you the probability of finding a slope that is as large or larger than the observed slope, under the null hypothesis that the true slope is \(0\).
  • A \(Y\) estimator and an \(X\) estimator. This enables you to enter a value of \(X\) and find the corresponding value of \(Y\) on the best-fit line, or vice-versa. This would be useful for constructing standard curves, such as used in protein assays for example.

Web pages that will perform linear regression are here, here, and here. They all require you to enter each number individually, and thus are inconvenient for large data sets. This web page does linear regression and lets you paste in a set of numbers, which is more convenient for large data sets.

Salvatore Mangiafico's \(R\) Companion has a sample R program for correlation and linear regression.

You can use either PROC GLM or PROC REG for a simple linear regression; since PROC REG is also used for multiple regression, you might as well learn to use it. In the MODEL statement, you give the \(Y\) variable first, then the \(X\) variable after the equals sign. Here's an example using the bird data from above.

DATA birds;
   INPUT town $ state $ latitude species;
   DATALINES;
Bombay_Hook          DE     39.217     128
Cape_Henlopen        DE     38.800     137
Middletown           DE     39.467     108
Milford              DE     38.958     118
Rehoboth             DE     38.600     135
Seaford-Nanticoke    DE     38.583      94
Wilmington           DE     39.733     113
Crisfield            MD     38.033     118
Denton               MD     38.900      96
Elkton               MD     39.533      98
Lower_Kent_County    MD     39.133     121
Ocean_City           MD     38.317     152
Salisbury            MD     38.333     108
S_Dorchester_County  MD     38.367     118
Cape_Charles         VA     37.200     157
Chincoteague         VA     37.967     125
Wachapreague         VA     37.667     114
;
PROC REG DATA=birds;
   MODEL species=latitude;
RUN;

The output includes an analysis of variance table. Don't be alarmed by this; if you dig down into the math, regression is just another variety of anova. Below the anova table are the \(r^2\), slope, intercept, and \(P\) value:

Root MSE            16.37357    R-Square    0.2143   (r^2)
Dependent Mean     120.00000    Adj R-Sq    0.1619
Coeff Var           13.64464

                        Parameter Estimates

                     Parameter     Standard
Variable     DF       Estimate        Error    t Value    Pr > |t|
Intercept     1      585.14462    230.02416       2.54      0.0225   (intercept)
latitude      1      -12.03922      5.95277      -2.02      0.0613   (slope and P value)

These results indicate an \(r^2\) of \(0.21\), intercept of \(585.1\), a slope of \(-12.04\), and a \(P\) value of \(0.061\).
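For readers without SAS, the same quantities can be computed from the usual sums of squares. The Python sketch below is an illustration, not part of the original handbook: it uses only the standard library and the bird data listed above, and the function name `linear_regression` is made up. It reports the slope, intercept, \(r^2\), degrees of freedom, and \(t\) value, but not the \(P\) value, which needs the \(t\) distribution.

```python
import math

# Latitude and number of species for the 17 Delmarva count areas listed above.
latitude = [39.217, 38.800, 39.467, 38.958, 38.600, 38.583, 39.733, 38.033,
            38.900, 39.533, 39.133, 38.317, 38.333, 38.367, 37.200, 37.967,
            37.667]
species = [128, 137, 108, 118, 135, 94, 113, 118, 96, 98, 121, 152, 108, 118,
           157, 125, 114]


def linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with r^2 and t value."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    df = n - 2                                   # n - 2 degrees of freedom
    mean_square_error = (syy - slope * sxy) / df
    slope_standard_error = math.sqrt(mean_square_error / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": sxy ** 2 / (sxx * syy),
        "df": df,
        "t_value": slope / slope_standard_error,
    }


result = linear_regression(latitude, species)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The printed slope, intercept, \(r^2\), and \(t\) value should agree with the SAS output above to within rounding.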

Power analysis

The G*Power program will calculate the sample size needed for a regression/correlation. The effect size is the absolute value of the correlation coefficient \(r\); if you have \(r^2\), take the positive square root of it. Choose "t tests" from the "Test family" menu and "Correlation: Point biserial model" from the "Statistical test" menu. Enter the \(r\) value you hope to see, your alpha (usually \(0.05\)) and your power (usually \(0.80\) or \(0.90\)).

For example, let's say you want to look for a relationship between calling rate and temperature in the barking tree frog, Hyla gratiosa. Gayou (1984) found an \(r^2\) of \(0.29\) in another frog species, H. versicolor, so you decide you want to be able to detect an \(r^2\) of \(0.25\) or more. The square root of \(0.25\) is \(0.5\), so you enter \(0.5\) for "Effect size", \(0.05\) for alpha, and \(0.8\) for power. The result is \(26\) observations of temperature and frog calling rate.
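A sample size like this can be sanity-checked by simulation. The Python sketch below is a rough Monte Carlo check under an assumption that differs from G*Power's point-biserial model: it draws \(X\) and \(Y\) from a bivariate normal distribution with a true correlation of \(0.5\), so its estimate need not equal \(0.80\) exactly. The critical values \(2.064\) and \(2.306\) are the two-tailed \(5\%\) points of the \(t\) distribution with \(24\) and \(8\) degrees of freedom; the function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulated_power(rho, n, t_crit, trials=4000, seed=1):
    """Fraction of simulated samples of size n in which r is significant."""
    rng = random.Random(seed)
    df = n - 2
    noise_scale = math.sqrt(1 - rho ** 2)
    hits = 0
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        # Y shares a fraction rho of its variation with X; the rest is noise.
        y = [rho * a + noise_scale * rng.gauss(0, 1) for a in x]
        r = pearson_r(x, y)
        t = abs(r) * math.sqrt(df / (1 - r * r))
        hits += t > t_crit
    return hits / trials


power_26 = simulated_power(rho=0.5, n=26, t_crit=2.064)
power_10 = simulated_power(rho=0.5, n=10, t_crit=2.306)
print(f"estimated power with 26 observations: {power_26:.2f}")
print(f"estimated power with 10 observations: {power_10:.2f}")
```

The estimate for \(26\) observations should land in the general neighborhood of the target power, and the estimate for \(10\) observations should be much lower, showing how quickly power falls with a smaller sample.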

It's important to note that the distribution of \(X\) variables, in this case air temperatures, should be the same for the proposed study as for the pilot study the sample size calculation was based on. Gayou (1984) measured frog calling rate at temperatures that were fairly evenly distributed from \(10^{\circ}C\) to \(34^{\circ}C\). If you looked at a narrower range of temperatures, you'd need a lot more observations to detect the same kind of relationship.

Dale, M.R.T., and M.-J. Fortin. 2009. Spatial autocorrelation and statistical tests: some solutions. Journal of Agricultural, Biological and Environmental Statistics 14: 188-206.

Edgell, S.E., and S.M. Noon. 1984. Effect of violation of normality on the t–test of the correlation coefficient. Psychological Bulletin 95: 576-583.

Gayou, D.C. 1984. Effects of temperature on the mating call of Hyla versicolor. Copeia 1984: 733-738.

Goheen, J.R., G.A. Kaufman, and D.W. Kaufman. 2003. Effect of body size on reproductive characteristics of the northern grasshopper mouse in north-central Kansas. Southwestern Naturalist 48: 427-431.

Kannan, N., J.P. Keating, and R.L. Mason. 2007. A comparison of classical and inverse estimators in the calibration problem. Communications in Statistics: Theory and Methods 36: 83-95.

Krutchkoff, R.G. 1967. Classical and inverse regression methods of calibration. Technometrics 9: 425-439.

Krutchkoff, R.G. 1969. Classical and inverse regression methods of calibration in extrapolation. Technometrics 11: 605-608.

Lwin, T., and J.S. Maritz. 1982. An analysis of the linear-calibration controversy from the perspective of compound estimation. Technometrics 24: 235-242.

McCardle, B.H. 2003. Lines, models, and errors: Regression in the field. Limnology and Oceanography 48: 1363-1366.

McDonald, J.H. 1989. Selection component analysis of the Mpi locus in the amphipod Platorchestia platensis. Heredity 62: 243-249.

McDonald, J.H., and K.W. Dunn. 2013. Statistical tests for measures of colocalization in biological microscopy. Journal of Microscopy 252: 295-302.

Smith, R.J. 2009. Use and misuse of the reduced major axis for line-fitting. American Journal of Physical Anthropology 140: 476-486.

Sokal, R.R., and F.J. Rohlf. 1995. Biometry. W.H. Freeman, New York.


Relationships between two variables, introduction.

Let's get started! Here is what you will learn in this lesson.

Learning objectives for this lesson

Upon completion of this lesson, you should be able to do the following:

  • Understand the relationship between the slope of the regression line and correlation,
  • Comprehend the meaning of the Coefficient of Determination, \(R^2\),
  • Know how to determine which variable is a response and which is an explanatory in a regression equation,
  • Understand that correlation measures the strength of a linear relationship between two variables,
  • Realize how outliers can influence a regression equation, and
  • Determine if variables are categorical or quantitative.

Examining Relationships Between Two Variables

Previously we considered the distribution of a single quantitative variable. Now we will study the relationship between two variables, each of which may be categorical (qualitative) or quantitative. When we consider the relationship between two variables, there are three possibilities:

  • Both variables are categorical. We analyze an association through a comparison of conditional probabilities and graphically represent the data using contingency tables. Examples of categorical variables are gender and class standing.
  • Both variables are quantitative. To analyze this situation we consider how one variable, called a response variable, changes in relation to changes in the other variable called an explanatory variable. Graphically we use scatterplots to display two quantitative variables. Examples are age, height, weight (i.e. things that are measured).
  • One variable is categorical and the other is quantitative, for instance height and gender. These are best compared by using side-by-side boxplots to display any differences or similarities in the center and variability of the quantitative variable (e.g. height) across the categories (e.g. Male and Female).
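The three cases above amount to a small lookup from the pair of variable types to the suggested display. The Python sketch below encodes just that mapping; the function name and the type labels are made up for illustration.

```python
def suggested_display(first_type: str, second_type: str) -> str:
    """Return the display suggested in the text for a pair of variable types."""
    allowed = {"categorical", "quantitative"}
    kinds = {first_type, second_type}
    if not kinds <= allowed:
        raise ValueError("each type must be 'categorical' or 'quantitative'")
    if kinds == {"categorical"}:
        return "contingency table"
    if kinds == {"quantitative"}:
        return "scatterplot"
    # One variable of each type, in either order.
    return "side-by-side boxplots"


print(suggested_display("categorical", "categorical"))    # gender, class standing
print(suggested_display("quantitative", "quantitative"))  # height, weight
print(suggested_display("quantitative", "categorical"))   # height, gender
```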

Research Questions & Hypotheses

Generally, in quantitative studies, reviewers expect hypotheses rather than research questions. However, research questions and hypotheses serve different purposes and can be beneficial when used together.

Research Questions

Clarify the research’s aim (Farrugia et al., 2010).

  • Research often begins with an interest in a topic, but a deep understanding of the subject is crucial to formulate an appropriate research question.
  • Descriptive: “What factors most influence the academic achievement of senior high school students?”
  • Comparative: “What is the performance difference between teaching methods A and B?”
  • Relationship-based: “What is the relationship between self-efficacy and academic achievement?”
  • Increasing knowledge about a subject can be achieved through systematic literature reviews, in-depth interviews with patients (and proxies), focus groups, and consultations with field experts.
  • Some funding bodies, like the Canadian Institute for Health Research, recommend conducting a systematic review or a pilot study before seeking grants for full trials.
  • The presence of multiple research questions in a study can complicate the design, statistical analysis, and feasibility.
  • It’s advisable to focus on a single primary research question for the study.
  • The primary question, clearly stated at the end of a grant proposal’s introduction, usually specifies the study population, intervention, and other relevant factors.
  • The FINER criteria underscore aspects that can enhance the chances of a successful research project, including specifying the population of interest, aligning with scientific and public interest, clinical relevance, and contribution to the field, while complying with ethical and national research standards.
  • The PICOT approach is crucial in developing the study’s framework and protocol, influencing inclusion and exclusion criteria and identifying patient groups for inclusion.
  • Defining the specific population, intervention, comparator, and outcome helps in selecting the right outcome measurement tool.
  • The more precise the population definition and stricter the inclusion and exclusion criteria, the more significant the impact on the interpretation, applicability, and generalizability of the research findings.
  • A restricted study population enhances internal validity but may limit the study’s external validity and generalizability to clinical practice.
  • A broadly defined study population may better reflect clinical practice but could increase bias and reduce internal validity.
  • An inadequately formulated research question can negatively impact study design, potentially leading to ineffective outcomes and affecting publication prospects.

Checklist: Good research questions for social science projects (Panke, 2018)

Research Hypotheses

Present the researcher’s predictions based on specific statements.

  • These statements define the research problem or issue and indicate the direction of the researcher’s predictions.
  • Formulating the research question and hypothesis from existing data (e.g., a database) can lead to multiple statistical comparisons and potentially spurious findings due to chance.
  • The research or clinical hypothesis, derived from the research question, shapes the study’s key elements: sampling strategy, intervention, comparison, and outcome variables.
  • Hypotheses can express a single outcome or multiple outcomes.
  • After statistical testing, the null hypothesis is either rejected or not rejected based on whether the study’s findings are statistically significant.
  • Hypothesis testing helps determine if observed findings are due to true differences and not chance.
  • Hypotheses can be 1-sided (specific direction of difference) or 2-sided (presence of a difference without specifying direction).
  • 2-sided hypotheses are generally preferred unless there’s a strong justification for a 1-sided hypothesis.
  • A solid research hypothesis, informed by a good research question, influences the research design and paves the way for defining clear research objectives.
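The practical difference between 1-sided and 2-sided hypotheses shows up in the \(P\) value. The Python sketch below assumes a test statistic that follows a standard normal distribution under the null hypothesis and uses a made-up value of \(z=1.8\); the function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def one_sided_p(z: float) -> float:
    """P value when only a difference in the positive direction counts."""
    return 1 - STANDARD_NORMAL.cdf(z)


def two_sided_p(z: float) -> float:
    """P value when a difference in either direction counts."""
    return 2 * (1 - STANDARD_NORMAL.cdf(abs(z)))


z = 1.8  # hypothetical test statistic
print(f"1-sided P = {one_sided_p(z):.4f}")
print(f"2-sided P = {two_sided_p(z):.4f}")
```

For this value the 1-sided \(P\) is below \(0.05\) while the 2-sided \(P\) is above it, which is one reason a 1-sided hypothesis needs a strong justification stated in advance.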

Types of Research Hypothesis

  • In a Y-centered research design, the focus is on the dependent variable (DV) which is specified in the research question. Theories are then used to identify independent variables (IV) and explain their causal relationship with the DV.
  • Example: “An increase in teacher-led instructional time (IV) is likely to improve student reading comprehension scores (DV), because extensive guided practice under expert supervision enhances learning retention and skill mastery.”
  • Hypothesis Explanation: The dependent variable (student reading comprehension scores) is the focus, and the hypothesis explores how changes in the independent variable (teacher-led instructional time) affect it.
  • In X-centered research designs, the independent variable is specified in the research question. Theories are used to determine potential dependent variables and the causal mechanisms at play.
  • Example: “Implementing technology-based learning tools (IV) is likely to enhance student engagement in the classroom (DV), because interactive and multimedia content increases student interest and participation.”
  • Hypothesis Explanation: The independent variable (technology-based learning tools) is the focus, with the hypothesis exploring its impact on a potential dependent variable (student engagement).
  • Probabilistic hypotheses suggest that changes in the independent variable are likely to lead to changes in the dependent variable in a predictable manner, but not with absolute certainty.
  • Example: “The more teachers engage in professional development programs (IV), the more their teaching effectiveness (DV) is likely to improve, because continuous training updates pedagogical skills and knowledge.”
  • Hypothesis Explanation: This hypothesis implies a probable relationship between the extent of professional development (IV) and teaching effectiveness (DV).
  • Deterministic hypotheses state that a specific change in the independent variable will lead to a specific change in the dependent variable, implying a more direct and certain relationship.
  • Example: “If the school curriculum changes from traditional lecture-based methods to project-based learning (IV), then student collaboration skills (DV) are expected to improve because project-based learning inherently requires teamwork and peer interaction.”
  • Hypothesis Explanation: This hypothesis presumes a direct and definite outcome (improvement in collaboration skills) resulting from a specific change in the teaching method.
  • Example: “Students who identify as visual learners will score higher on tests that are presented in a visually rich format compared to tests presented in a text-only format.”
  • Explanation: This hypothesis aims to describe the potential difference in test scores between visual learners taking visually rich tests and text-only tests, without implying a direct cause-and-effect relationship.
  • Example: “Teaching method A will improve student performance more than method B.”
  • Explanation: This hypothesis compares the effectiveness of two different teaching methods, suggesting that one will lead to better student performance than the other. It implies a direct comparison but does not necessarily establish a causal mechanism.
  • Example: “Students with higher self-efficacy will show higher levels of academic achievement.”
  • Explanation: This hypothesis predicts a relationship between the variable of self-efficacy and academic achievement. Unlike a causal hypothesis, it does not necessarily suggest that one variable causes changes in the other, but rather that they are related in some way.

Tips for developing research questions and hypotheses for research studies

  • Perform a systematic literature review (if one has not been done) to increase knowledge and familiarity with the topic and to assist with research development.
  • Learn about current trends and technological advances on the topic.
  • Seek careful input from experts, mentors, colleagues, and collaborators to refine your research question, as this will also help guide the research study.
  • Use the FINER criteria in the development of the research question.
  • Ensure that the research question follows PICOT format.
  • Develop a research hypothesis from the research question.
  • Ensure that the research question and objectives are answerable, feasible, and clinically relevant.

If your research hypotheses are derived from your research questions, particularly when multiple hypotheses address a single question, it’s recommended to use both research questions and hypotheses. However, if this isn’t the case, using hypotheses over research questions is advised. It’s important to note these are general guidelines, not strict rules. If you opt not to use hypotheses, consult with your supervisor for the best approach.

Farrugia, P., Petrisor, B. A., Farrokhyar, F., & Bhandari, M. (2010). Practical tips for surgical research: Research questions, hypotheses and objectives. Canadian Journal of Surgery. Journal canadien de chirurgie, 53(4), 278–281.

Hulley, S. B., Cummings, S. R., Browner, W. S., Grady, D., & Newman, T. B. (2007). Designing clinical research. Philadelphia.

Panke, D. (2018). Research design & method selection: Making good choices in the social sciences. Research Design & Method Selection, 1-368.

J Korean Med Sci, v.37(16); 2022 Apr 25

A Practical Guide to Writing Quantitative and Qualitative Research Questions and Hypotheses in Scholarly Articles

Edward Barroga

1 Department of General Education, Graduate School of Nursing Science, St. Luke’s International University, Tokyo, Japan.

Glafera Janet Matanguihan

2 Department of Biological Sciences, Messiah University, Mechanicsburg, PA, USA.

The development of research questions and the subsequent hypotheses are prerequisites to defining the main research purpose and specific objectives of a study. Consequently, these objectives determine the study design and research outcome. The development of research questions is a process based on knowledge of current trends, cutting-edge studies, and technological advances in the research field. Excellent research questions are focused and require a comprehensive literature search and in-depth understanding of the problem being investigated. Initially, research questions may be written as descriptive questions which could be developed into inferential questions. These questions must be specific and concise to provide a clear foundation for developing hypotheses. Hypotheses are more formal predictions about the research outcomes. These specify the possible results that may or may not be expected regarding the relationship between groups. Thus, research questions and hypotheses clarify the main purpose and specific objectives of the study, which in turn dictate the design of the study, its direction, and outcome. Studies developed from good research questions and hypotheses will have trustworthy outcomes with wide-ranging social and health implications.

INTRODUCTION

Scientific research is usually initiated by posing evidence-based research questions which are then explicitly restated as hypotheses [1, 2]. The hypotheses provide directions to guide the study, solutions, explanations, and expected results [3, 4]. Both research questions and hypotheses are essentially formulated based on conventional theories and real-world processes, which allow the inception of novel studies and the ethical testing of ideas [5, 6].

It is crucial to have knowledge of both quantitative and qualitative research [2], as both types of research involve writing research questions and hypotheses [7]. However, these crucial elements of research are sometimes overlooked or, if not overlooked, framed without the forethought and meticulous attention they need. Planning and careful consideration are needed when developing quantitative or qualitative research, particularly when conceptualizing research questions and hypotheses [4].

There is a continuing need to support researchers in the creation of innovative research questions and hypotheses, as well as for journal articles that carefully review these elements [1]. When research questions and hypotheses are not carefully thought out, unethical studies and poor outcomes usually ensue. Carefully formulated research questions and hypotheses define well-founded objectives, which in turn determine the appropriate design, course, and outcome of the study. This article therefore aims to discuss in detail the various aspects of crafting research questions and hypotheses, with the goal of guiding researchers as they develop their own. Examples from the authors and peer-reviewed scientific articles in the healthcare field are provided to illustrate key points.

DEFINITIONS AND RELATIONSHIP OF RESEARCH QUESTIONS AND HYPOTHESES

A research question is what a study aims to answer after data analysis and interpretation. The answer is written at length in the discussion section of the paper. Thus, the research question gives a preview of the different parts and variables of the study meant to address the problem posed in the research question [1]. An excellent research question clarifies the research writing while facilitating understanding of the research topic, objective, scope, and limitations of the study [5].

On the other hand, a research hypothesis is an educated statement of an expected outcome. This statement is based on background research and current knowledge [8, 9]. The research hypothesis makes a specific prediction about a new phenomenon [10] or a formal statement on the expected relationship between an independent variable and a dependent variable [3, 11]. It provides a tentative answer to the research question to be tested or explored [4].

Hypotheses employ reasoning to predict a theory-based outcome [10]. These can also be developed from theories by focusing on components of theories that have not yet been observed [10]. The validity of hypotheses is often based on the testability of the prediction made in a reproducible experiment [8].

Conversely, hypotheses can also be rephrased as research questions. Several hypotheses based on existing theories and knowledge may be needed to answer a research question. Developing ethical research questions and hypotheses creates a research design that has logical relationships among variables. These relationships serve as a solid foundation for the conduct of the study [4, 11]. Haphazardly constructed research questions can result in poorly formulated hypotheses and improper study designs, leading to unreliable results. Thus, the formulations of relevant research questions and verifiable hypotheses are crucial when beginning research [12].

CHARACTERISTICS OF GOOD RESEARCH QUESTIONS AND HYPOTHESES

Excellent research questions are specific and focused. These integrate collective data and observations to confirm or refute the subsequent hypotheses. Well-constructed hypotheses are based on previous reports and verify the research context. These are realistic, in-depth, sufficiently complex, and reproducible. More importantly, these hypotheses can be addressed and tested [13].

There are several characteristics of well-developed hypotheses. Good hypotheses 1) are empirically testable [7, 10, 11, 13]; 2) are backed by preliminary evidence [9]; 3) are testable by ethical research [7, 9]; 4) are based on original ideas [9]; 5) have evidence-based logical reasoning [10]; and 6) can be predicted [11]. Good hypotheses can infer ethical and positive implications, indicating the presence of a relationship or effect relevant to the research theme [7, 11]. These are initially developed from a general theory and branch into specific hypotheses by deductive reasoning. In the absence of a theory on which to base the hypotheses, inductive reasoning based on specific observations or findings forms more general hypotheses [10].

TYPES OF RESEARCH QUESTIONS AND HYPOTHESES

Research questions and hypotheses are developed according to the type of research, which can be broadly classified into quantitative and qualitative research. We provide a summary of the types of research questions and hypotheses under quantitative and qualitative research categories in Table 1 .

Research questions in quantitative research

In quantitative research, research questions inquire about the relationships among variables being investigated and are usually framed at the start of the study. These are precise and typically linked to the subject population, dependent and independent variables, and research design. 1 Research questions may also attempt to describe the behavior of a population in relation to one or more variables, or describe the characteristics of variables to be measured ( descriptive research questions ). 1 , 5 , 14 These questions may also aim to discover differences between groups within the context of an outcome variable ( comparative research questions ), 1 , 5 , 14 or elucidate trends and interactions among variables ( relationship research questions ). 1 , 5 We provide examples of descriptive, comparative, and relationship research questions in quantitative research in Table 2 .

Hypotheses in quantitative research

In quantitative research, hypotheses predict the expected relationships among variables. 15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable (simple hypothesis) or 2) between two or more independent and dependent variables (complex hypothesis). 4 , 11 Hypotheses may also specify the expected direction to be followed and imply an intellectual commitment to a particular outcome (directional hypothesis). 4 On the other hand, hypotheses may not predict the exact direction and are used in the absence of a theory, or when findings contradict previous studies (non-directional hypothesis). 4 In addition, hypotheses can 1) define interdependency between variables (associative hypothesis), 4 2) propose an effect on the dependent variable from manipulation of the independent variable (causal hypothesis), 4 3) state that no relationship exists between two variables (null hypothesis), 4 , 11 , 15 4) replace the working hypothesis if rejected (alternative hypothesis), 15 5) explain the relationship of phenomena to possibly generate a theory (working hypothesis), 11 6) involve quantifiable variables that can be tested statistically (statistical hypothesis), 11 or 7) express a relationship whose interlinks can be verified logically (logical hypothesis). 11 We provide examples of simple, complex, directional, non-directional, associative, causal, null, alternative, working, statistical, and logical hypotheses in quantitative research, as well as the definition of quantitative hypothesis-testing research in Table 3 .

Research questions in qualitative research

Unlike research questions in quantitative research, research questions in qualitative research are usually continuously reviewed and reformulated. The central question and associated subquestions are stated more often than hypotheses. 15 The central question broadly explores a complex set of factors surrounding the central phenomenon, aiming to present the varied perspectives of participants. 15

There are varied goals for which qualitative research questions are developed. These questions can function in several ways, such as to 1) identify and describe existing conditions (contextual research questions); 2) describe a phenomenon (descriptive research questions); 3) assess the effectiveness of existing methods, protocols, theories, or procedures (evaluation research questions); 4) examine a phenomenon or analyze the reasons or relationships between subjects or phenomena (explanatory research questions); or 5) focus on unknown aspects of a particular topic (exploratory research questions). 5 In addition, some qualitative research questions provide new ideas for the development of theories and actions (generative research questions) or advance specific ideologies of a position (ideological research questions). 1 Other qualitative research questions may build on a body of existing literature and become working guidelines (ethnographic research questions). Research questions may also be broadly stated without specific reference to the existing literature or a typology of questions (phenomenological research questions), may be directed towards generating a theory of some process (grounded theory questions), or may address a description of the case and the emerging themes (qualitative case study questions). 15 We provide examples of contextual, descriptive, evaluation, explanatory, exploratory, generative, ideological, ethnographic, phenomenological, grounded theory, and qualitative case study research questions in qualitative research in Table 4 , and the definition of qualitative hypothesis-generating research in Table 5 .

Qualitative studies usually pose at least one central research question and several subquestions starting with How or What . These research questions use exploratory verbs such as explore or describe . These also focus on one central phenomenon of interest, and may mention the participants and research site. 15

Hypotheses in qualitative research

Hypotheses in qualitative research are stated in the form of a clear statement concerning the problem to be investigated. Unlike in quantitative research where hypotheses are usually developed to be tested, qualitative research can lead to both hypothesis-testing and hypothesis-generating outcomes. 2 When studies require both quantitative and qualitative research questions, this suggests an integrative process between both research methods wherein a single mixed-methods research question can be developed. 1

FRAMEWORKS FOR DEVELOPING RESEARCH QUESTIONS AND HYPOTHESES

Research questions followed by hypotheses should be developed before the start of the study. 1 , 12 , 14 It is crucial to develop feasible research questions on a topic that is interesting to both the researcher and the scientific community. This can be achieved by a meticulous review of previous and current studies to establish a novel topic. Specific areas are subsequently focused on to generate ethical research questions. The relevance of the research questions is evaluated in terms of clarity of the resulting data, specificity of the methodology, objectivity of the outcome, depth of the research, and impact of the study. 1 , 5 These aspects constitute the FINER criteria (i.e., Feasible, Interesting, Novel, Ethical, and Relevant). 1 Clarity and effectiveness are achieved if research questions meet the FINER criteria. In addition to the FINER criteria, Ratan et al. described focus, complexity, novelty, feasibility, and measurability for evaluating the effectiveness of research questions. 14

The PICOT and PEO frameworks are also used when developing research questions. 1 The following elements are addressed in these frameworks, PICOT: P-population/patients/problem, I-intervention or indicator being studied, C-comparison group, O-outcome of interest, and T-timeframe of the study; PEO: P-population being studied, E-exposure to preexisting conditions, and O-outcome of interest. 1 Research questions are also considered good if these meet the “FINERMAPS” framework: Feasible, Interesting, Novel, Ethical, Relevant, Manageable, Appropriate, Potential value/publishable, and Systematic. 14
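The PICOT elements listed above can be treated as a simple checklist. The following is a minimal sketch in Python, not part of the cited frameworks: the class name `PicotQuestion`, its methods, the question wording, and the example values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PicotQuestion:
    """Holds the five PICOT elements of a research question (illustrative only)."""
    population: str    # P: population/patients/problem
    intervention: str  # I: intervention or indicator being studied
    comparison: str    # C: comparison group
    outcome: str       # O: outcome of interest
    timeframe: str     # T: timeframe of the study

    def missing_elements(self) -> List[str]:
        """Return the names of any elements that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def as_question(self) -> str:
        """Assemble the elements into one question; the wording is an assumption."""
        return (
            f"In {self.population}, does {self.intervention}, "
            f"compared with {self.comparison}, affect {self.outcome} "
            f"over {self.timeframe}?"
        )


# Hypothetical values chosen for illustration, not taken from a published protocol.
question = PicotQuestion(
    population="pregnant women attending antenatal clinics",
    intervention="job aid-supported counseling",
    comparison="standard counseling",
    outcome="awareness of the danger signs of pregnancy",
    timeframe="one antenatal visit",
)
print(question.missing_elements())
print(question.as_question())
```

A draft question with a blank element (for example, no comparison group) would be flagged by `missing_elements()`, which is one way to notice an underspecified question before the study design is fixed.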

As we indicated earlier, research questions and hypotheses that are not carefully formulated can result in unethical studies or poor outcomes. To illustrate this, we provide some examples of ambiguous research questions and hypotheses that result in unclear and weak research objectives in quantitative research ( Table 6 ) 16 and qualitative research ( Table 7 ) 17 , and how to transform these ambiguous research question(s) and hypothesis(es) into clear and good statements.

Notes to Table 6: a) These statements were composed for comparison and illustrative purposes only. b) These statements are direct quotes from Higashihara and Horiuchi. 16

Notes to Table 7: a) This statement is a direct quote from Shimoda et al. 17 The other statements were composed for comparison and illustrative purposes only.

CONSTRUCTING RESEARCH QUESTIONS AND HYPOTHESES

To construct effective research questions and hypotheses, it is very important to 1) clarify the background and 2) identify the research problem at the outset of the research, within a specific timeframe. 9 Then, 3) review or conduct preliminary research to collect all available knowledge about the possible research questions by studying theories and previous studies. 18 Afterwards, 4) construct research questions to investigate the research problem. Identify variables to be assessed from the research questions 4 and make operational definitions of constructs from the research problem and questions. Thereafter, 5) construct specific deductive or inductive predictions in the form of hypotheses. 4 Finally, 6) state the study aims. This general flow for constructing effective research questions and hypotheses prior to conducting research is shown in Fig. 1 .

[Fig. 1: General flow for constructing effective research questions and hypotheses prior to conducting research.]

Research questions are used more frequently in qualitative research than objectives or hypotheses. 3 These questions seek to discover, understand, explore or describe experiences by asking “What” or “How.” The questions are open-ended to elicit a description rather than to relate variables or compare groups. The questions are continually reviewed, reformulated, and changed during the qualitative study. 3 In quantitative research, research questions are used more frequently in survey projects, whereas hypotheses are used more frequently in experiments to compare variables and their relationships.

Hypotheses are constructed based on the variables identified and as an if-then statement, following the template, ‘If a specific action is taken, then a certain outcome is expected.’ At this stage, some ideas regarding expectations from the research to be conducted must be drawn. 18 Then, the variables to be manipulated (independent) and influenced (dependent) are defined. 4 Thereafter, the hypothesis is stated and refined, and reproducible data tailored to the hypothesis are identified, collected, and analyzed. 4 The hypotheses must be testable and specific, 18 and should describe the variables and their relationships, the specific group being studied, and the predicted research outcome. 18 Constructing a hypothesis involves deducing a testable proposition from theory and identifying independent and dependent variables that can be measured separately. 3 Therefore, good hypotheses must be based on good research questions constructed at the start of a study or trial. 12
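The if-then template and the elements a hypothesis should describe (variables, group, predicted outcome) can be written down as a small data structure. The sketch below is illustrative only: the class `Hypothesis`, its field names, and the completeness check are assumptions, and the example wording is a paraphrase composed for this illustration rather than a quotation from any study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """An if-then hypothesis with its elements made explicit (illustrative)."""
    group: str                 # the specific group being studied
    independent_variable: str  # the variable to be manipulated
    dependent_variable: str    # the variable expected to be influenced
    action: str                # the "if" clause
    predicted_outcome: str     # the "then" clause

    def statement(self) -> str:
        """Render the template: 'If <action is taken>, then <outcome is expected>.'"""
        return f"If {self.group} {self.action}, then {self.predicted_outcome}."

    def is_specific(self) -> bool:
        """Crude check that every element has been filled in."""
        return all(value.strip() for value in vars(self).values())


# Hypothetical example composed for illustration.
h = Hypothesis(
    group="first-time pregnant women",
    independent_variable="touching and holding an infant (yes/no)",
    dependent_variable="salivary cortisol level",
    action="touch and hold an infant for 30 min",
    predicted_outcome="their salivary cortisol level will decrease",
)
print(h.statement())
print(h.is_specific())
```

Writing the independent and dependent variables as separate fields makes it harder to state an outcome without also naming what is manipulated and what is measured.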

In summary, research questions are constructed after establishing the background of the study. Hypotheses are then developed based on the research questions. Thus, it is crucial to have excellent research questions to generate superior hypotheses. In turn, these would determine the research objectives and the design of the study, and ultimately, the outcome of the research. 12 Algorithms for building research questions and hypotheses are shown in Fig. 2 for quantitative research and in Fig. 3 for qualitative research.

[Fig. 2: Algorithm for building research questions and hypotheses in quantitative research.]

EXAMPLES OF RESEARCH QUESTIONS FROM PUBLISHED ARTICLES

  • EXAMPLE 1. Descriptive research question (quantitative research)
  • - Presents research variables to be assessed (distinct phenotypes and subphenotypes)
  • “BACKGROUND: Since COVID-19 was identified, its clinical and biological heterogeneity has been recognized. Identifying COVID-19 phenotypes might help guide basic, clinical, and translational research efforts.
  • RESEARCH QUESTION: Does the clinical spectrum of patients with COVID-19 contain distinct phenotypes and subphenotypes? ” 19
  • EXAMPLE 2. Relationship research question (quantitative research)
  • - Shows interactions between dependent variable (static postural control) and independent variable (peripheral visual field loss)
  • “Background: Integration of visual, vestibular, and proprioceptive sensations contributes to postural control. People with peripheral visual field loss have serious postural instability. However, the directional specificity of postural stability and sensory reweighting caused by gradual peripheral visual field loss remain unclear.
  • Research question: What are the effects of peripheral visual field loss on static postural control ?” 20
  • EXAMPLE 3. Comparative research question (quantitative research)
  • - Clarifies the difference among groups with an outcome variable (patients enrolled in COMPERA with moderate PH or severe PH in COPD) and another group without the outcome variable (patients with idiopathic pulmonary arterial hypertension (IPAH))
  • “BACKGROUND: Pulmonary hypertension (PH) in COPD is a poorly investigated clinical condition.
  • RESEARCH QUESTION: Which factors determine the outcome of PH in COPD?
  • STUDY DESIGN AND METHODS: We analyzed the characteristics and outcome of patients enrolled in the Comparative, Prospective Registry of Newly Initiated Therapies for Pulmonary Hypertension (COMPERA) with moderate or severe PH in COPD as defined during the 6th PH World Symposium who received medical therapy for PH and compared them with patients with idiopathic pulmonary arterial hypertension (IPAH) .” 21
  • EXAMPLE 4. Exploratory research question (qualitative research)
  • - Explores areas that have not been fully investigated (perspectives of families and children who receive care in clinic-based child obesity treatment) to have a deeper understanding of the research problem
  • “Problem: Interventions for children with obesity lead to only modest improvements in BMI and long-term outcomes, and data are limited on the perspectives of families of children with obesity in clinic-based treatment. This scoping review seeks to answer the question: What is known about the perspectives of families and children who receive care in clinic-based child obesity treatment? This review aims to explore the scope of perspectives reported by families of children with obesity who have received individualized outpatient clinic-based obesity treatment.” 22
  • EXAMPLE 5. Relationship research question (quantitative research)
  • - Defines interactions between dependent variable (use of ankle strategies) and independent variable (changes in muscle tone)
  • “Background: To maintain an upright standing posture against external disturbances, the human body mainly employs two types of postural control strategies: “ankle strategy” and “hip strategy.” While it has been reported that the magnitude of the disturbance alters the use of postural control strategies, it has not been elucidated how the level of muscle tone, one of the crucial parameters of bodily function, determines the use of each strategy. We have previously confirmed using forward dynamics simulations of human musculoskeletal models that an increased muscle tone promotes the use of ankle strategies. The objective of the present study was to experimentally evaluate a hypothesis: an increased muscle tone promotes the use of ankle strategies. Research question: Do changes in the muscle tone affect the use of ankle strategies ?” 23

EXAMPLES OF HYPOTHESES IN PUBLISHED ARTICLES

  • EXAMPLE 1. Working hypothesis (quantitative research)
  • - A hypothesis that is initially accepted for further research to produce a feasible theory
  • “As fever may have benefit in shortening the duration of viral illness, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response when taken during the early stages of COVID-19 illness .” 24
  • “In conclusion, it is plausible to hypothesize that the antipyretic efficacy of ibuprofen may be hindering the benefits of a fever response . The difference in perceived safety of these agents in COVID-19 illness could be related to the more potent efficacy to reduce fever with ibuprofen compared to acetaminophen. Compelling data on the benefit of fever warrant further research and review to determine when to treat or withhold ibuprofen for early stage fever for COVID-19 and other related viral illnesses .” 24
  • EXAMPLE 2. Exploratory hypothesis (qualitative research)
  • - Explores particular areas deeper to clarify subjective experience and develop a formal hypothesis potentially testable in a future quantitative approach
  • “We hypothesized that when thinking about a past experience of help-seeking, a self distancing prompt would cause increased help-seeking intentions and more favorable help-seeking outcome expectations .” 25
  • “Conclusion
  • Although a priori hypotheses were not supported, further research is warranted as results indicate the potential for using self-distancing approaches to increasing help-seeking among some people with depressive symptomatology.” 25
  • EXAMPLE 3. Hypothesis-generating research to establish a framework for hypothesis testing (qualitative research)
  • “We hypothesize that compassionate care is beneficial for patients (better outcomes), healthcare systems and payers (lower costs), and healthcare providers (lower burnout). ” 26
  • “Compassionomics is the branch of knowledge and scientific study of the effects of compassionate healthcare. Our main hypotheses are that compassionate healthcare is beneficial for (1) patients, by improving clinical outcomes, (2) healthcare systems and payers, by supporting financial sustainability, and (3) HCPs, by lowering burnout and promoting resilience and well-being. The purpose of this paper is to establish a scientific framework for testing the hypotheses above. If these hypotheses are confirmed through rigorous research, compassionomics will belong in the science of evidence-based medicine, with major implications for all healthcare domains.” 26
  • EXAMPLE 4. Statistical hypothesis (quantitative research)
  • - An assumption is made about the relationship among several population characteristics (gender differences in sociodemographic and clinical characteristics of adults with ADHD). Validity is tested by statistical experiment or analysis (chi-square test, Student's t-test, and logistic regression analysis)
  • “Our research investigated gender differences in sociodemographic and clinical characteristics of adults with ADHD in a Japanese clinical sample. Due to unique Japanese cultural ideals and expectations of women's behavior that are in opposition to ADHD symptoms, we hypothesized that women with ADHD experience more difficulties and present more dysfunctions than men . We tested the following hypotheses: first, women with ADHD have more comorbidities than men with ADHD; second, women with ADHD experience more social hardships than men, such as having less full-time employment and being more likely to be divorced.” 27
  • “Statistical Analysis
  • ( text omitted ) Between-gender comparisons were made using the chi-squared test for categorical variables and Students t-test for continuous variables…( text omitted ). A logistic regression analysis was performed for employment status, marital status, and comorbidity to evaluate the independent effects of gender on these dependent variables.” 27
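To make the idea of testing a statistical hypothesis concrete, the sketch below computes a Pearson chi-square test of independence for a 2x2 table, the kind of test used for categorical between-group comparisons. It is a minimal standard-library illustration: the function name `chi_square_2x2` is an assumption, no continuity correction is applied, and the counts are made up for illustration and are not data from the cited study.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table = [[a, b], [c, d]]: rows are groups, columns are outcome categories.
    Returns (statistic, p_value) for 1 degree of freedom, without
    continuity correction.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were unrelated (null hypothesis).
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Made-up counts for illustration only.
# Rows: group 1, group 2. Columns: employed full time, not employed full time.
counts = [[20, 30], [35, 15]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi-square = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value would count as evidence against the null hypothesis of no association between group and outcome. In practice, an established statistics package would normally be used instead of a hand-written function.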

EXAMPLES OF HYPOTHESIS AS WRITTEN IN PUBLISHED ARTICLES IN RELATION TO OTHER PARTS

  • EXAMPLE 1. Background, hypotheses, and aims are provided
  • “Pregnant women need skilled care during pregnancy and childbirth, but that skilled care is often delayed in some countries …( text omitted ). The focused antenatal care (FANC) model of WHO recommends that nurses provide information or counseling to all pregnant women …( text omitted ). Job aids are visual support materials that provide the right kind of information using graphics and words in a simple and yet effective manner. When nurses are not highly trained or have many work details to attend to, these job aids can serve as a content reminder for the nurses and can be used for educating their patients (Jennings, Yebadokpo, Affo, & Agbogbe, 2010) ( text omitted ). Importantly, additional evidence is needed to confirm how job aids can further improve the quality of ANC counseling by health workers in maternal care …( text omitted )” 28
  • “ This has led us to hypothesize that the quality of ANC counseling would be better if supported by job aids. Consequently, a better quality of ANC counseling is expected to produce higher levels of awareness concerning the danger signs of pregnancy and a more favorable impression of the caring behavior of nurses .” 28
  • “This study aimed to examine the differences in the responses of pregnant women to a job aid-supported intervention during ANC visit in terms of 1) their understanding of the danger signs of pregnancy and 2) their impression of the caring behaviors of nurses to pregnant women in rural Tanzania.” 28
  • EXAMPLE 2. Background, hypotheses, and aims are provided
  • “We conducted a two-arm randomized controlled trial (RCT) to evaluate and compare changes in salivary cortisol and oxytocin levels of first-time pregnant women between experimental and control groups. The women in the experimental group touched and held an infant for 30 min (experimental intervention protocol), whereas those in the control group watched a DVD movie of an infant (control intervention protocol). The primary outcome was salivary cortisol level and the secondary outcome was salivary oxytocin level.” 29
  • “ We hypothesize that at 30 min after touching and holding an infant, the salivary cortisol level will significantly decrease and the salivary oxytocin level will increase in the experimental group compared with the control group .” 29
  • EXAMPLE 3. Background, aim, and hypothesis are provided
  • “In countries where the maternal mortality ratio remains high, antenatal education to increase Birth Preparedness and Complication Readiness (BPCR) is considered one of the top priorities [1]. BPCR includes birth plans during the antenatal period, such as the birthplace, birth attendant, transportation, health facility for complications, expenses, and birth materials, as well as family coordination to achieve such birth plans. In Tanzania, although increasing, only about half of all pregnant women attend an antenatal clinic more than four times [4]. Moreover, the information provided during antenatal care (ANC) is insufficient. In the resource-poor settings, antenatal group education is a potential approach because of the limited time for individual counseling at antenatal clinics.” 30
  • “This study aimed to evaluate an antenatal group education program among pregnant women and their families with respect to birth-preparedness and maternal and infant outcomes in rural villages of Tanzania.” 30
  • “ The study hypothesis was if Tanzanian pregnant women and their families received a family-oriented antenatal group education, they would (1) have a higher level of BPCR, (2) attend antenatal clinic four or more times, (3) give birth in a health facility, (4) have less complications of women at birth, and (5) have less complications and deaths of infants than those who did not receive the education .” 30

Research questions and hypotheses are crucial components to any type of research, whether quantitative or qualitative. These questions should be developed at the very beginning of the study. Excellent research questions lead to superior hypotheses, which, like a compass, set the direction of research, and can often determine the successful conduct of the study. Many research studies have floundered because the development of research questions and subsequent hypotheses was not given the thought and meticulous attention needed. The development of research questions and hypotheses is an iterative process based on extensive knowledge of the literature and insightful grasp of the knowledge gap. Focused, concise, and specific research questions provide a strong foundation for constructing hypotheses which serve as formal predictions about the research outcomes. Research questions and hypotheses are crucial elements of research that should not be overlooked. They should be carefully thought of and constructed when planning research. This avoids unethical studies and poor outcomes by defining well-founded objectives that determine the design, course, and outcome of the study.

Disclosure: The authors have no potential conflicts of interest to disclose.

Author Contributions:

  • Conceptualization: Barroga E, Matanguihan GJ.
  • Methodology: Barroga E, Matanguihan GJ.
  • Writing - original draft: Barroga E, Matanguihan GJ.
  • Writing - review & editing: Barroga E, Matanguihan GJ.

Enago Academy

How to Develop a Good Research Hypothesis

The story of a research study begins by asking a question. Researchers all around the globe are asking curious questions and formulating research hypotheses. However, whether the research study provides an effective conclusion depends on how well one develops a good research hypothesis. Research hypothesis examples could help researchers get an idea as to how to write a good research hypothesis.

This blog will help you understand what a research hypothesis is, its characteristics, and how to formulate a research hypothesis.

What is a Hypothesis?

A hypothesis is an assumption or an idea proposed for the sake of argument so that it can be tested. It is a precise, testable statement of what the researchers predict will be the outcome of the study. A hypothesis usually involves proposing a relationship between two variables: the independent variable (what the researchers change) and the dependent variable (what the researchers measure).

What is a Research Hypothesis?

A research hypothesis is a statement that introduces a research question and proposes an expected result. It is an integral part of the scientific method that forms the basis of scientific experiments. Therefore, you need to be careful and thorough when building your research hypothesis. A minor flaw in the construction of your hypothesis could have an adverse effect on your experiment. In research, there is a convention that the hypothesis is written in two forms: the null hypothesis and the alternative hypothesis (called the experimental hypothesis when the method of investigation is an experiment).

Characteristics of a Good Research Hypothesis

Because a hypothesis is specific, it makes a testable prediction about what you expect to happen in a study. You may consider drawing your hypothesis from previously published research based on the theory.

A good research hypothesis involves more effort than just a guess. In particular, your hypothesis may begin with a question that could be further explored through background research.

To help you formulate a promising research hypothesis, you should ask yourself the following questions:

  • Is the language clear and focused?
  • What is the relationship between your hypothesis and your research topic?
  • Is your hypothesis testable? If yes, then how?
  • What are the possible explanations that you might want to explore?
  • Does your hypothesis include both an independent and dependent variable?
  • Can you manipulate your variables without hampering the ethical standards?
  • Does your hypothesis predict the relationship and outcome?
  • Is your hypothesis simple and concise (avoids wordiness)?
  • Is it clear, with no ambiguity or assumptions about the readers’ knowledge?
  • Does your research produce observable and testable results?
  • Is it relevant and specific to the research question or problem?

The questions listed above can be used as a checklist to make sure your hypothesis is based on a solid foundation. Furthermore, it can help you identify weaknesses in your hypothesis and revise it if necessary.

How to Formulate a Research Hypothesis

A testable hypothesis is not a simple statement. Rather, it is an intricate statement that needs to offer a clear introduction to a scientific experiment, its intentions, and the possible outcomes. There are some important things to consider when building a compelling hypothesis.

1. State the problem that you are trying to solve.

Make sure that the hypothesis clearly defines the topic and the focus of the experiment.

2. Try to write the hypothesis as an if-then statement.

Follow this template: If a specific action is taken, then a certain outcome is expected.

3. Define the variables

Independent variables are the ones that are manipulated, controlled, or changed. Independent variables are isolated from other factors of the study.

Dependent variables, as the name suggests, are dependent on other factors of the study. They are influenced by changes in the independent variable.

4. Scrutinize the hypothesis

Evaluate assumptions, predictions, and evidence rigorously to refine your understanding.

Types of Research Hypothesis

The types of research hypothesis are stated below:

1. Simple Hypothesis

It predicts the relationship between a single dependent variable and a single independent variable.

2. Complex Hypothesis

It predicts the relationship between two or more independent and dependent variables.

3. Directional Hypothesis

It specifies the expected direction to be followed to determine the relationship between variables and is derived from theory. Furthermore, it implies the researcher’s intellectual commitment to a particular outcome.

4. Non-directional Hypothesis

It does not predict the exact direction or nature of the relationship between the two variables. The non-directional hypothesis is used when there is no theory involved or when findings contradict previous research.

5. Associative and Causal Hypothesis

The associative hypothesis defines interdependency between variables. A change in one variable results in the change of the other variable. On the other hand, the causal hypothesis proposes an effect on the dependent variable due to manipulation of the independent variable.

6. Null Hypothesis

The null hypothesis is a negative statement asserting that there is no relationship between two variables. There will be no changes in the dependent variable due to the manipulation of the independent variable. Furthermore, it states that results are due to chance and are not significant in terms of supporting the idea being investigated.

7. Alternative Hypothesis

It states that there is a relationship between the two variables of the study and that the results are significant to the research topic. An experimental hypothesis predicts what changes will take place in the dependent variable when the independent variable is manipulated. Also, it states that the results are not due to chance and that they are significant in terms of supporting the theory being investigated.
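The contrast between the null and alternative hypotheses, and between directional and non-directional predictions, can be illustrated with a permutation test. The sketch below is a minimal example using only the Python standard library: the function name, the number of permutations, the seed, and the scores are all assumptions made for illustration. Under the null hypothesis the group labels are interchangeable, so the test asks how often a random relabeling produces a difference at least as large as the observed one.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate p-values for the difference in means (mean_a - mean_b).

    p_directional:     alternative hypothesis "group A scores higher than group B"
    p_non_directional: alternative hypothesis "the groups differ" (either direction)
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    greater = 0  # shuffled difference >= observed difference
    extreme = 0  # |shuffled difference| >= |observed difference|
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel participants at random (null hypothesis)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            greater += 1
        if abs(diff) >= abs(observed):
            extreme += 1
    return {
        "observed_difference": observed,
        "p_directional": greater / n_permutations,
        "p_non_directional": extreme / n_permutations,
    }


# Made-up scores for illustration only.
treatment = [12, 15, 14, 16, 13, 17, 15, 14]
control = [10, 11, 9, 12, 10, 11, 13, 10]
result = permutation_test(treatment, control)
print(result)
```

When the observed difference is in the predicted direction, the directional p-value is never larger than the non-directional one, which is why a directional hypothesis should be committed to before the data are seen.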

Research Hypothesis Examples of Independent and Dependent Variables

Research Hypothesis Example 1: A greater number of coal plants in a region (independent variable) increases water pollution (dependent variable). If you change the independent variable (building more coal plants), it will change the dependent variable (amount of water pollution).
Research Hypothesis Example 2: What is the effect of diet or regular soda (independent variable) on blood sugar levels (dependent variable)? If you change the independent variable (the type of soda you consume), it will change the dependent variable (blood sugar levels).
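As a rough illustration of how the first example could be examined with data, the sketch below computes the correlation and the least-squares slope between an independent and a dependent variable. The function names and the numbers are assumptions invented for illustration; they are not real measurements. Note that a strong correlation supports an associative hypothesis only. A causal claim would additionally require manipulating the independent variable or otherwise ruling out alternative explanations.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def least_squares_slope(x, y):
    """Change in y associated with a one-unit increase in x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov / var_x


# Invented data for illustration: five hypothetical regions.
coal_plants = [1, 2, 3, 4, 5]            # independent variable
pollution_index = [12, 15, 21, 24, 30]   # dependent variable (arbitrary units)

r = pearson_r(coal_plants, pollution_index)
slope = least_squares_slope(coal_plants, pollution_index)
print(f"r = {r:.3f}, slope = {slope:.2f} index points per additional plant")
```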

You should not ignore the importance of the above steps. The validity of your experiment and its results relies on a robust testable hypothesis. Developing a strong testable hypothesis has a few advantages: it compels us to think intensely and specifically about the outcomes of a study. Consequently, it enables us to understand the implication of the question and the different variables involved in the study. Furthermore, it helps us to make precise predictions based on prior research. Hence, forming a hypothesis would be of great value to the research.

More importantly, you need to build a robust testable research hypothesis for your scientific experiments. A testable hypothesis is a hypothesis that can be proved or disproved as a result of experimentation.

Importance of a Testable Hypothesis

To devise and perform an experiment using the scientific method, you need to make sure that your hypothesis is testable. To be considered testable, some essential criteria must be met:

  • There must be a possibility to prove that the hypothesis is true.
  • There must be a possibility to prove that the hypothesis is false.
  • The results of the hypothesis must be reproducible.

Without these criteria, the hypothesis and the results will be vague. As a result, the experiment will not prove or disprove anything significant.

Frequently Asked Questions

The steps to write a research hypothesis are:

  1. Stating the problem: Ensure that the hypothesis defines the research problem.
  2. Writing the hypothesis as an ‘if-then’ statement: Include the action and the expected outcome of your study by following an ‘if-then’ structure.
  3. Defining the variables: Define the variables as dependent or independent based on their dependency on other factors.
  4. Scrutinizing the hypothesis: Identify the type of your hypothesis.

Hypothesis testing is a statistical tool used to make inferences about a population from sample data and to draw conclusions about a particular hypothesis.

A hypothesis in statistics is a formal statement about the nature of a population within the structured framework of a statistical model. It is used to test an existing hypothesis by studying a population.

A research hypothesis is a statement that introduces a research question and proposes an expected result. It forms the basis of scientific experiments.

The different types of hypothesis in research are:

  • Null hypothesis: A null hypothesis is a negative statement that there is no relationship between two variables.
  • Alternate hypothesis: An alternate hypothesis predicts the relationship between the two variables of the study.
  • Directional hypothesis: A directional hypothesis specifies the expected direction of the relationship between variables.
  • Non-directional hypothesis: A non-directional hypothesis does not predict the exact direction or nature of the relationship between the two variables.
  • Simple hypothesis: A simple hypothesis predicts the relationship between a single dependent variable and a single independent variable.
  • Complex hypothesis: A complex hypothesis predicts the relationship between two or more independent and dependent variables.
  • Associative and causal hypothesis: An associative hypothesis states that a change in one variable is accompanied by a change in another, while a causal hypothesis proposes a cause-and-effect relationship between variables.
  • Empirical hypothesis: An empirical hypothesis can be tested via experiments and observation.
  • Statistical hypothesis: A statistical hypothesis utilizes statistical models to draw conclusions about broader populations.

15 Hypothesis Examples

A hypothesis is defined as a testable prediction, and is used primarily in scientific experiments as a potential or predicted outcome that scientists attempt to prove or disprove (Atkinson et al., 2021; Tan, 2022).

In my types of hypothesis article, I outlined 13 different hypotheses, including the directional hypothesis (which makes a prediction about whether the effect of a treatment will be positive or negative) and the associative hypothesis (which makes a prediction about the association between two variables).

This article will dive into some interesting examples of hypotheses and examine potential ways you might test each one.

Hypothesis Examples

1. “Inadequate Sleep Decreases Memory Retention”

Field: Psychology

Type: Causal Hypothesis A causal hypothesis explores the effect of one variable on another. This example posits that a lack of adequate sleep causes decreased memory retention. In other words, if you are not getting enough sleep, your ability to remember and recall information may suffer.

How to Test:

To test this hypothesis, you might devise an experiment whereby your participants are divided into two groups: one receives an average of 8 hours of sleep per night for a week, while the other gets less than the recommended sleep amount.

During this time, all participants would study and recall new, specific information each day. You’d then measure memory retention of this information for both groups using standard memory tests and compare the results.

Should the group with less sleep have statistically significantly poorer memory scores, the hypothesis would be supported.

Ensuring the integrity of the experiment requires taking into account factors such as individual health differences, stress levels, and daily nutrition.

Relevant Study: Sleep loss, learning capacity and academic performance (Curcio, Ferrara & De Gennaro, 2006)
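
As a rough sketch of the comparison step in the sleep example, the snippet below runs an exact permutation test on two small groups of memory scores. The scores are invented purely for illustration, and `permutation_test` is a hypothetical helper, not something taken from the cited study. A real analysis might well use a t-test from a statistics package instead.

```python
from itertools import combinations
from statistics import mean


def permutation_test(group_a, group_b):
    """Exact one-sided permutation test: is mean(group_a) larger than mean(group_b)?"""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = mean(group_a) - mean(group_b)
    at_least_as_large = 0
    total = 0
    # Try every possible way of relabelling which scores belong to group A.
    for indices in combinations(range(len(pooled)), n_a):
        chosen = set(indices)
        relabelled_a = [pooled[i] for i in chosen]
        relabelled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mean(relabelled_a) - mean(relabelled_b) >= observed - 1e-12:
            at_least_as_large += 1
    return observed, at_least_as_large / total


# Hypothetical memory-test scores (made up for illustration only).
rested = [82, 79, 88, 91, 85, 84]
sleep_deprived = [74, 70, 81, 77, 72, 69]

difference, p_value = permutation_test(rested, sleep_deprived)
print(f"Mean difference: {difference:.2f}, one-sided p-value: {p_value:.4f}")
```

A small p-value here means that a difference this large rarely appears when the group labels are shuffled, which is the sense in which the hypothesis would be "supported" rather than proven.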

2. “Increase in Temperature Leads to Increase in Kinetic Energy”

Field: Physics

Type: Deductive Hypothesis The deductive hypothesis applies the logic of deductive reasoning – it moves from a general premise to a more specific conclusion. This specific hypothesis assumes that as temperature increases, the kinetic energy of particles also increases – that is, when you heat something up, its particles move around more rapidly.

This hypothesis could be examined by heating a gas in a controlled environment and capturing the movement of its particles as a function of temperature.

You’d gradually increase the temperature and measure the kinetic energy of the gas particles with each increment. If the kinetic energy consistently rises with the temperature, your hypothesis gets supporting evidence.

Variables such as pressure and volume of the gas would need to be held constant to ensure validity of results.
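
If you assume an ideal gas (the example above does not name a specific model, so this is an assumption), the predicted relationship can be written down directly: the average translational kinetic energy per particle is (3/2) × k_B × T, where k_B is the Boltzmann constant and T is the absolute temperature in kelvin. The short sketch below computes that prediction at a few temperatures; the function name is illustrative.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant in joules per kelvin (exact SI value)


def mean_kinetic_energy(temperature_kelvin):
    """Average translational kinetic energy per particle of an ideal gas, in joules."""
    if temperature_kelvin < 0:
        raise ValueError("Absolute temperature cannot be negative")
    return 1.5 * BOLTZMANN_J_PER_K * temperature_kelvin


# Predicted values you could compare against measurements at each temperature step.
for temperature in (300, 450, 600):
    energy = mean_kinetic_energy(temperature)
    print(f"{temperature} K -> {energy:.3e} J per particle")
```

Under this model, doubling the absolute temperature doubles the average kinetic energy, which gives the experiment a concrete, falsifiable prediction.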

3. “Children Raised in Bilingual Homes Develop Better Cognitive Skills”

Field: Psychology/Linguistics

Type: Comparative Hypothesis The comparative hypothesis posits a difference between two or more groups based on certain variables. In this context, you might propose that children raised in bilingual homes have superior cognitive skills compared to those raised in monolingual homes.

Testing this hypothesis could involve identifying two groups of children: those raised in bilingual homes, and those raised in monolingual homes.

Cognitive skills in both groups would be evaluated using a standard cognitive ability test at different stages of development. The examination would be repeated over a significant time period for consistency.

If the group raised in bilingual homes persistently scores higher than the other, the hypothesis would thereby be supported.

The challenge for the researcher would be controlling for other variables that could impact cognitive development, such as socio-economic status, education level of parents, and parenting styles.

Relevant Study: The cognitive benefits of being bilingual (Marian & Shook, 2012)

4. “High-Fiber Diet Leads to Lower Incidences of Cardiovascular Diseases”

Field: Medicine/Nutrition

Type: Alternative Hypothesis The alternative hypothesis suggests an alternative to a null hypothesis. In this context, the implied null hypothesis could be that diet has no effect on cardiovascular health, which the alternative hypothesis contradicts by suggesting that a high-fiber diet leads to fewer instances of cardiovascular diseases.

To test this hypothesis, a longitudinal study could be conducted on two groups of participants; one adheres to a high-fiber diet, while the other follows a diet low in fiber.

After a fixed period, the cardiovascular health of participants in both groups could be analyzed and compared. If the group following a high-fiber diet has a lower number of recorded cases of cardiovascular diseases, it would provide evidence supporting the hypothesis.

Control measures should be implemented to exclude the influence of other lifestyle and genetic factors that contribute to cardiovascular health.

Relevant Study: Dietary fiber, inflammation, and cardiovascular disease (King, 2005)
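
One way to compare the recorded cases in the two diet groups is a two-proportion z-test. The sketch below is a minimal version under stated assumptions: the case counts and group sizes are hypothetical, the helper name is made up, and the test is one-sided because the hypothesis predicts a lower incidence in the high-fiber group.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(cases_a, n_a, cases_b, n_b):
    """One-sided z-test: is the incidence in group A lower than in group B?"""
    rate_a = cases_a / n_a
    rate_b = cases_b / n_b
    # Pooled rate under the null hypothesis that both groups share one incidence.
    pooled_rate = (cases_a + cases_b) / (n_a + n_b)
    std_error = sqrt(pooled_rate * (1 - pooled_rate) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    p_value = NormalDist().cdf(z)  # lower-tail probability
    return z, p_value


# Hypothetical counts: 18 cases among 400 high-fiber participants,
# 36 cases among 400 low-fiber participants.
z_score, p_value = two_proportion_z_test(18, 400, 36, 400)
print(f"z = {z_score:.2f}, one-sided p-value = {p_value:.4f}")
```

This only addresses the comparison itself; it does nothing to rule out the lifestyle and genetic confounders mentioned above.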

5. “Gravity Influences the Directional Growth of Plants”

Field: Agronomy / Botany

Type: Explanatory Hypothesis An explanatory hypothesis attempts to explain a phenomenon. In this case, the hypothesis proposes that gravity affects how plants direct their growth – both above-ground (toward sunlight) and below-ground (towards water and other resources).

The testing could be conducted by growing plants in a rotating cylinder to create artificial gravity.

Observations on the direction of growth, over a specified period, can provide insights into the influencing factors. If plants consistently direct their growth in a manner that indicates the influence of gravitational pull, the hypothesis is substantiated.

It is crucial to ensure that other growth-influencing factors, such as light and water, are uniformly distributed so that only gravity influences the directional growth.

6. “The Implementation of Gamified Learning Improves Students’ Motivation”

Field: Education

Type: Relational Hypothesis The relational hypothesis describes the relation between two variables. Here, the hypothesis is that the implementation of gamified learning has a positive effect on the motivation of students.

To validate this proposition, two sets of classes could be compared: one that implements a learning approach with game-based elements, and another that follows a traditional learning approach.

The students’ motivation levels could be gauged by monitoring their engagement, performance, and feedback over a considerable timeframe.

If the students engaged in the gamified learning context present higher levels of motivation and achievement, the hypothesis would be supported.

Control measures ought to be put into place to account for individual differences, including prior knowledge and attitudes towards learning.

Relevant Study: Does educational gamification improve students’ motivation? (Chapman & Rich, 2018)

7. “Mathematics Anxiety Negatively Affects Performance”

Field: Educational Psychology

Type: Research Hypothesis The research hypothesis involves making a prediction that will be tested. In this case, the hypothesis proposes that a student’s anxiety about math can negatively influence their performance in math-related tasks.

To assess this hypothesis, researchers must first measure the mathematics anxiety levels of a sample of students using a validated instrument, such as the Mathematics Anxiety Rating Scale.

Then, the students’ performance in mathematics would be evaluated through standard testing. If there’s a negative correlation between the levels of math anxiety and math performance (meaning as anxiety increases, performance decreases), the hypothesis would be supported.

It would be crucial to control for relevant factors such as overall academic performance and previous mathematical achievement.
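
The negative correlation described above can be checked with Pearson's correlation coefficient. The sketch below computes it by hand on invented anxiety and test scores (both lists are hypothetical, and `pearson_r` is an illustrative helper).

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical data: one anxiety rating and one math test score per student.
anxiety = [20, 35, 50, 65, 80, 45, 30, 70]
math_score = [88, 80, 71, 60, 52, 75, 85, 58]

r = pearson_r(anxiety, math_score)
print(f"Pearson r = {r:.3f}")
```

A value of r close to -1 would be consistent with the hypothesis, while a value near 0 would not. A correlation alone does not show that anxiety causes the lower performance.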

8. “Disruption of Natural Sleep Cycle Impairs Worker Productivity”

Field: Organizational Psychology

Type: Operational Hypothesis The operational hypothesis involves defining the variables in measurable terms. In this example, the hypothesis posits that disrupting the natural sleep cycle, for instance through shift work or irregular working hours, can lessen productivity among workers.

To test this hypothesis, you could collect data from workers who maintain regular working hours and those with irregular schedules.

Measuring productivity could involve examining the worker’s ability to complete tasks, the quality of their work, and their efficiency.

If workers with interrupted sleep cycles demonstrate lower productivity compared to those with regular sleep patterns, it would lend support to the hypothesis.

Consideration should be given to potential confounding variables such as job type, worker age, and overall health.

9. “Regular Physical Activity Reduces the Risk of Depression”

Field: Health Psychology

Type: Predictive Hypothesis A predictive hypothesis involves making a prediction about the outcome of a study based on the observed relationship between variables. In this case, it is hypothesized that individuals who engage in regular physical activity are less likely to suffer from depression.

A longitudinal study would be well suited to testing this hypothesis, tracking participants’ levels of physical activity and their mental health status over time.

The level of physical activity could be self-reported or monitored, while mental health status could be assessed using standard diagnostic tools or surveys.

If data analysis shows that participants maintaining regular physical activity have a lower incidence of depression, this would endorse the hypothesis.

However, care should be taken to control other lifestyle and behavioral factors that could interfere with the results.

Relevant Study: Regular physical exercise and its association with depression (Kim, 2022)

10. “Regular Meditation Enhances Emotional Stability”

Type: Empirical Hypothesis In the empirical hypothesis, predictions are based on amassed empirical evidence. This particular hypothesis theorizes that frequent meditation leads to improved emotional stability, resonating with numerous studies linking meditation to a variety of psychological benefits.

Earlier studies reported some correlations, but to test this hypothesis directly, you’d organize an experiment where one group meditates regularly over a set period while a control group doesn’t.

Both groups’ emotional stability levels would be measured at the start and end of the experiment using a validated emotional stability assessment.

If regular meditators display noticeable improvements in emotional stability compared to the control group, the hypothesis gains credit.

You’d have to ensure a similar emotional baseline for all participants at the start to avoid skewed results.

11. “Children Exposed to Reading at an Early Age Show Superior Academic Progress”

Type: Directional Hypothesis The directional hypothesis predicts the direction of an expected relationship between variables. Here, the hypothesis anticipates that early exposure to reading positively affects a child’s academic advancement.

A longitudinal study tracking children’s reading habits from an early age and their consequent academic performance could validate this hypothesis.

Parents could report their children’s exposure to reading at home, while standardized school exam results would provide a measure of academic achievement.

If the children exposed to early reading consistently perform better academically, it gives weight to the hypothesis.

However, it would be important to control for variables that might impact academic performance, such as socioeconomic background, parental education level, and school quality.

12. “Adopting Energy-efficient Technologies Reduces Carbon Footprint of Industries”

Field: Environmental Science

Type: Descriptive Hypothesis A descriptive hypothesis predicts the existence of an association or pattern related to variables. In this scenario, the hypothesis suggests that industries adopting energy-efficient technologies will consequently show a reduced carbon footprint.

Global industries making use of energy-efficient technologies could track their carbon emissions over time. At the same time, others not implementing such technologies continue their regular tracking.

After a defined time, the carbon emission data of both groups could be compared. If industries that adopted energy-efficient technologies demonstrate a notable reduction in their carbon footprints, the hypothesis would hold strong.

In the experiment, you would exclude variations brought by factors such as industry type, size, and location.

13. “Reduced Screen Time Improves Sleep Quality”

Type: Simple Hypothesis The simple hypothesis is a prediction about the relationship between two variables, excluding any other variables from consideration. This example posits that by reducing time spent on devices like smartphones and computers, an individual should experience improved sleep quality.

A sample group would need to reduce their daily screen time for a pre-determined period. Sleep quality before and after the reduction could be measured using self-report sleep diaries and objective measures like actigraphy, monitoring movement and wakefulness during sleep.

If the data show that sleep quality improved after the screen-time reduction, the hypothesis would be supported.

Other aspects affecting sleep quality, like caffeine intake, should be controlled during the experiment.

Relevant Study: Screen time use impacts low‐income preschool children’s sleep quality, tiredness, and ability to fall asleep (Waller et al., 2021)
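
Because the same people are measured before and after the reduction, the scores are paired. A minimal sketch of one suitable analysis is an exact sign-flip (paired permutation) test, shown below. The sleep-quality scores are hypothetical, and the helper name is illustrative.

```python
from itertools import product


def paired_sign_flip_test(before, after):
    """Exact one-sided sign-flip test for a mean improvement in paired scores."""
    diffs = [a - b for b, a in zip(before, after)]
    observed_total = sum(diffs)
    at_least_as_large = 0
    total = 0
    # Under the null hypothesis, each difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed_total:
            at_least_as_large += 1
    return observed_total / len(diffs), at_least_as_large / total


# Hypothetical sleep-quality scores for eight participants (higher is better).
before = [78, 82, 75, 80, 85, 79, 77, 83]
after = [84, 85, 79, 80, 88, 86, 76, 87]

mean_change, p_value = paired_sign_flip_test(before, after)
print(f"Mean change: {mean_change:.2f}, one-sided p-value: {p_value:.4f}")
```

Without a control group, an improvement like this could still reflect other changes over the same period, which is why factors such as caffeine intake need to be held steady.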

14. “Engaging in Brain-Training Games Improves Cognitive Functioning in Elderly”

Field: Gerontology

Type: Inductive Hypothesis Inductive hypotheses are based on observations leading to broader generalizations and theories. In this context, the hypothesis infers from observed instances that engaging in brain-training games can help improve cognitive functioning in the elderly.

A longitudinal study could be conducted where an experimental group of elderly people partakes in regular brain-training games.

Their cognitive functioning could be assessed at the start of the study and at regular intervals using standard neuropsychological tests.

If the group engaging in brain-training games shows better cognitive functioning scores over time compared to a control group not playing these games, the hypothesis would be supported.

15. “Farming Practices Influence Soil Erosion Rates”

Type: Null Hypothesis A null hypothesis is a negative statement assuming no relationship or difference between variables. The hypothesis in this context asserts there’s no effect of different farming practices on the rates of soil erosion.

Comparing soil erosion rates in areas with different farming practices over a considerable timeframe could help test this hypothesis.

If, statistically, the farming practices do not lead to differences in soil erosion rates, the null hypothesis is retained (strictly speaking, we fail to reject it).

However, if marked variation appears, the null hypothesis is rejected, meaning farming practices do influence soil erosion rates. It would be crucial to control for external factors like weather, soil type, and natural vegetation.

The variety of hypotheses mentioned above underscores the diversity of research constructs inherent in different fields, each with its unique purpose and way of testing.

While researchers may develop hypotheses primarily as tools to define and narrow the focus of the study, these hypotheses also serve as valuable guiding forces for the data collection and analysis procedures, making the research process more efficient and direction-focused.

Hypotheses serve as a compass for any form of academic research. The diverse examples provided, from Psychology to Educational Studies, Environmental Science to Gerontology, clearly demonstrate how certain hypotheses suit specific fields more aptly than others.

It is important to underline that although these varied hypotheses differ in their structure and methods of testing, each endorses the fundamental value of empiricism in research. Evidence-based decision making remains at the heart of scholarly inquiry, regardless of the research field, thus aligning all hypotheses to the core purpose of scientific investigation.

Testing hypotheses is an essential part of the scientific method. By doing so, researchers can either confirm their predictions, giving further validity to an existing theory, or they might uncover new insights that could potentially shift the field’s understanding of a particular phenomenon. In either case, hypotheses serve as the stepping stones for scientific exploration and discovery.

Atkinson, P., Delamont, S., Cernat, A., Sakshaug, J. W., & Williams, R. A. (2021). SAGE research methods foundations. SAGE Publications Ltd.

Curcio, G., Ferrara, M., & De Gennaro, L. (2006). Sleep loss, learning capacity and academic performance. Sleep medicine reviews, 10(5), 323-337.

Kim, J. H. (2022). Regular physical exercise and its association with depression: A population-based study short title: Exercise and depression. Psychiatry Research, 309, 114406.

King, D. E. (2005). Dietary fiber, inflammation, and cardiovascular disease. Molecular nutrition & food research, 49(6), 594-600.

Marian, V., & Shook, A. (2012, September). The cognitive benefits of being bilingual. In Cerebrum: the Dana forum on brain science (Vol. 2012). Dana Foundation.

Tan, W. C. K. (2022). Research Methods: A Practical Guide For Students And Researchers (Second Edition). World Scientific Publishing Company.

Waller, N. A., Zhang, N., Cocci, A. H., D’Agostino, C., Wesolek‐Greenson, S., Wheelock, K., … & Resnicow, K. (2021). Screen time use impacts low‐income preschool children’s sleep quality, tiredness, and ability to fall asleep. Child: care, health and development, 47(5), 618-626.

Chris Drew (PhD)

Dr. Chris Drew is the founder of the Helpful Professor. He holds a PhD in education and has published over 20 articles in scholarly journals. He is the former editor of the Journal of Learning Development in Higher Education.

What is Hypothesis?

We have heard of many hypotheses which have led to great inventions in science. Assumptions that are made on the basis of some evidence are known as hypotheses. In this article, let us learn in detail about the hypothesis and the type of hypothesis with examples.

A hypothesis is an assumption that is made based on some evidence. This is the initial point of any investigation that translates the research questions into predictions. It includes components like variables, population and the relation between the variables. A research hypothesis is a hypothesis that is used to test the relationship between two or more variables.

Characteristics of Hypothesis

Following are the characteristics of the hypothesis:

  • The hypothesis should be clear and precise to be considered reliable.
  • If the hypothesis is a relational hypothesis, then it should state the relationship between variables.
  • The hypothesis must be specific and should have scope for conducting more tests.
  • The hypothesis should be explained in simple terms, although its simplicity has no bearing on its significance.

Sources of Hypothesis

Following are the sources of hypothesis:

  • Resemblances between phenomena.
  • Observations from past studies, present-day experiences and from the competitors.
  • Scientific theories.
  • General patterns that influence the thinking process of people.

Types of Hypothesis

There are six forms of hypothesis and they are:

  • Simple hypothesis
  • Complex hypothesis
  • Directional hypothesis
  • Non-directional hypothesis
  • Null hypothesis
  • Associative and causal hypothesis

Simple Hypothesis

It shows a relationship between one dependent variable and a single independent variable. For example – If you eat more vegetables, you will lose weight faster. Here, eating more vegetables is an independent variable, while losing weight is the dependent variable.

Complex Hypothesis

It shows the relationship between two or more dependent variables and two or more independent variables. Eating more vegetables and fruits leads to weight loss, glowing skin, and reduces the risk of many diseases such as heart disease.

Directional Hypothesis

It reflects the researcher’s commitment to a particular expected outcome and predicts the nature (direction) of the relationship between the variables. For example, four-year-old children who eat proper meals over a five-year period will have higher IQ levels than children who do not have proper meals. This shows the effect and the direction of the effect.

Non-directional Hypothesis

It is used when there is no theory involved. It is a statement that a relationship exists between two variables, without predicting the exact nature (direction) of the relationship.

Null Hypothesis

It provides a statement that is contrary to the research hypothesis. It is a negative statement asserting that there is no relationship between the independent and dependent variables. It is denoted by “H₀”.

Associative and Causal Hypothesis

An associative hypothesis states that a change in one variable is accompanied by a change in the other variable, whereas a causal hypothesis proposes a cause-and-effect interaction between two or more variables.

Examples of Hypothesis

Following are the examples of hypotheses based on their types:

  • Consumption of sugary drinks every day leads to obesity is an example of a simple hypothesis.
  • All lilies have the same number of petals is an example of a null hypothesis.
  • If a person gets 7 hours of sleep, then he will feel less fatigue than if he sleeps less. It is an example of a directional hypothesis.

Functions of Hypothesis

Following are the functions performed by the hypothesis:

  • A hypothesis helps make observation and experiments possible.
  • It becomes the starting point for the investigation.
  • Hypothesis helps in verifying the observations.
  • It helps in directing the inquiries in the right direction.

How will Hypothesis help in the Scientific Method?

Researchers use hypotheses to put down their thoughts directing how the experiment would take place. Following are the steps that are involved in the scientific method:

  • Formation of question
  • Doing background research
  • Creation of hypothesis
  • Designing an experiment
  • Collection of data
  • Result analysis
  • Summarizing the experiment
  • Communicating the results

Frequently Asked Questions – FAQs

What is a hypothesis?

A hypothesis is an assumption made based on some evidence.

Give an example of simple hypothesis?

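What are the types of hypothesis?

Types of hypothesis are:

  • Simple hypothesis
  • Complex hypothesis
  • Directional hypothesis
  • Non-directional hypothesis
  • Null hypothesis
  • Associative and causal hypothesis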

State true or false: Hypothesis is the initial point of any investigation that translates the research questions into a prediction.

Define complex hypothesis.

A complex hypothesis shows the relationship between two or more dependent variables and two or more independent variables.

The Independent Variable vs. Dependent Variable in Research

In any scientific research, there are typically two variables of interest: independent variables and dependent variables. In forming the backbone of scientific experiments, they help scientists understand relationships, predict outcomes and, in general, make sense of the factors that they're investigating.

Understanding the independent variable vs. dependent variable is so fundamental to scientific research that you need to have a good handle on both if you want to design your own research study or interpret others' findings.

To grasp the distinction between the two, let's delve into their definitions and roles.

What Is an Independent Variable?

The independent variable, often denoted as X, is the variable that is manipulated or controlled by the researcher intentionally. It's the factor that researchers believe may have a causal effect on the dependent variable.

In simpler terms, the independent variable is the variable you change or vary in an experiment so you can observe its impact on the dependent variable.

What Is a Dependent Variable?

The dependent variable, often represented as Y, is the variable that is observed and measured to determine the outcome of the experiment.

In other words, the dependent variable is the variable that is expected to be affected by changes in the independent variable. Whether its values actually depend on the independent variable is what the experiment sets out to test.

Research Study Example

Let's consider an example to illustrate these concepts. Imagine you're conducting a research study aiming to investigate the effect of studying techniques on test scores among students.

In this scenario, the independent variable manipulated would be the studying technique, which you could vary by employing different methods, such as spaced repetition, summarization or practice testing.

The dependent variable, in this case, would be the test scores of the students. As the researcher following the scientific method, you would manipulate the independent variable (the studying technique) and then measure its impact on the dependent variable (the test scores).
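The design described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `run_study`, the group size and the average scores per technique are hypothetical, made-up values rather than findings from any real study.

```python
import random
import statistics


def run_study(techniques, students_per_group=30, seed=0):
    """Simulate test scores for groups that each use one studying technique."""
    rng = random.Random(seed)
    # Hypothetical average score per technique (invented numbers, not results).
    assumed_mean = {
        "spaced repetition": 78,
        "summarization": 70,
        "practice testing": 82,
    }
    results = {}
    for technique in techniques:  # independent variable: set by the researcher
        scores = [
            rng.gauss(assumed_mean[technique], 8)
            for _ in range(students_per_group)
        ]
        results[technique] = scores  # dependent variable: the measured outcome
    return results


scores_by_technique = run_study(
    ["spaced repetition", "summarization", "practice testing"]
)
mean_scores = {t: statistics.mean(s) for t, s in scores_by_technique.items()}
for technique, mean_score in mean_scores.items():
    print(f"{technique}: mean score {mean_score:.1f}")
```

The researcher chooses which technique each group uses, while the scores are only observed, which mirrors the split between independent and dependent variables.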

Predictor Variables vs. Outcome Variables

You can also categorize variables as predictor variables or outcome variables. Sometimes a researcher will refer to the independent variable as the predictor variable since they use it to predict or explain changes in the dependent variable, which is also known as the outcome variable.

Other Variables

When conducting an experiment or study, it's crucial to acknowledge the presence of other variables, or extraneous variables, which may influence the outcome of the experiment but are not the focus of study.

These variables can potentially confound the results if they aren't controlled. In the example from above, other variables might include the students' prior knowledge, level of motivation, time spent studying and preferred learning style.

As a researcher, it would be your goal to control these extraneous variables to ensure you can attribute any observed differences in the dependent variable to changes in the independent variable. In practice, however, it's not always possible to control every variable.
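One common way to limit the influence of extraneous variables is random assignment, which the text does not name explicitly; it is offered here as one possible approach, not the only one. The sketch below assumes a hypothetical helper `randomly_assign` and made-up student IDs. It shuffles participants and deals them evenly into conditions so that traits such as prior knowledge or motivation are balanced on average.

```python
import random


def randomly_assign(participants, conditions, seed=0):
    """Shuffle participants, then deal them round-robin into the conditions."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups


students = [f"student_{i}" for i in range(12)]  # hypothetical participant IDs
groups = randomly_assign(
    students, ["spaced repetition", "summarization", "practice testing"]
)
for condition, members in groups.items():
    print(condition, members)
```

Random assignment does not remove extraneous variables; it only makes systematic differences between groups unlikely, which is why larger samples help.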

The Relationship Between Independent and Dependent Variables

The distinction between independent and dependent variables is essential for designing and conducting research studies and experiments effectively.

By manipulating the independent variable and measuring its impact on the dependent variable while controlling for other factors, researchers can gain insights into the factors that influence outcomes in their respective fields.

Whether investigating the effects of a new drug on blood pressure or studying the relationship between socioeconomic factors and academic performance, understanding the role of independent and dependent variables is essential for advancing knowledge and making informed decisions.

Correlation vs. Causation

Understanding the relationship between independent and dependent variables is essential for making sense of research findings. Depending on the nature of this relationship, researchers may identify correlations or infer causation between the variables.

Correlation implies that changes in one variable are associated with changes in another variable, while causation suggests that changes in the independent variable directly cause changes in the dependent variable.
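To see how two variables can be correlated without one causing the other, consider the following minimal simulation. All names and numbers are assumptions chosen for illustration: a hidden third variable (`motivation`) drives both `hours_studied` and `test_score`, and neither of those two influences the other in the code.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


rng = random.Random(1)
# Hidden third variable that influences both measured variables.
motivation = [rng.gauss(0, 1) for _ in range(500)]
# Each measured variable depends on motivation plus its own noise,
# but neither one is computed from the other.
hours_studied = [m + rng.gauss(0, 0.5) for m in motivation]
test_score = [m + rng.gauss(0, 0.5) for m in motivation]

r = pearson_r(hours_studied, test_score)
print(f"correlation between hours studied and test score: {r:.2f}")
```

The correlation comes out strongly positive even though the simulation contains no causal path between the two measured variables, which is why a correlation on its own does not establish causation.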

Control and Intervention

In experimental research, the researcher has control over the independent variable, allowing them to manipulate it to observe its effects on the dependent variable. This controlled manipulation distinguishes experiments from other types of research designs.

For example, in observational studies, researchers merely observe variables without intervention, meaning they don't control or manipulate any variables.

Context and Analysis

Whether intentionally or not, independent, dependent and other variables can vary from one context to another, and their effects may differ depending on factors such as the participants' age and other characteristics, environmental influences and so on.

Researchers employ statistical analysis techniques to measure and analyze the relationships between these variables, helping them to draw meaningful conclusions from their data.
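As one example of such a technique, a simple least-squares line estimates how much the dependent variable changes per unit of the independent variable. The sketch below is an assumption-laden illustration: `fit_line` is a hypothetical helper, and the hours and scores are invented data, not measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


hours = [1, 2, 3, 4, 5]         # independent variable (made-up values)
scores = [52, 61, 68, 79, 85]   # dependent variable (made-up values)
slope, intercept = fit_line(hours, scores)
print(f"estimated change in score per extra hour: {slope:.1f}")
```

A fitted slope describes an association in the data; whether it reflects a causal effect still depends on how the study was designed.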

We created this article in conjunction with AI technology, then made sure it was fact-checked and edited by a HowStuffWorks editor.

Addressing the Environmental Kuznets Curve in the West African Countries: Exploring the Roles of FDI, Corruption, and Renewable Energy

  • Published: 15 April 2024

  • Lobna Abid, Sana Kacem & Haifa Saadaoui

Environmental degradation and economic growth are two closely related issues whose impact is steadily increasing in a global context marked by climate risks and corruption, notably in certain African countries. This research examines the impacts of economic growth, corruption, renewable energy, and foreign direct investment (FDI) on carbon dioxide (CO2) emissions for a set of West African economies between 1990 and 2020. The paper uses the PMG-ARDL panel method to assess the relationships between the variables investigated. The results indicate the long-term effects of the variables. They show that GDP per capita has a positive and significant effect on CO2 emissions and that the Kuznets curve is not validated in this case. Moreover, FDI confirms the pollution haven hypothesis, as it reduces environmental quality in the long run. In contrast, renewable energy consumption and control of corruption in West African countries are significant factors in the fight for environmental quality. The causality results reveal a unidirectional link from CO2 emissions to both income and corruption, and a unidirectional causality from FDI to CO2 emissions. Meanwhile, the link between renewable energy and CO2 emissions is neutral. In this respect, this research offers findings to help maintain influential procedures for environmental sustainability within the West African framework.


Wang, Z., Zhang, B., & Wang, B. (2018). The moderating role of corruption between economic growth and CO2 emissions: evidence from BRICS economies. Energy, 148 , 506–513.

Wang, S., Zhao, D., & Chen, H. (2020). Government corruption, resource misallocation, and ecological efficiency. Energy Economics, 85 (C). https://doi.org/10.1016/j.eneco.2019.104573

WDI. (2021). Available at https://databank.worldbank.org/source/world-development-indicators

Wei, C., Ren, S., Yang, P., Wang, Y., He, X., Xu, Z., et al. (2021). Effects of irrigation methods and salinity on CO2 emissions from farmland soil during growth and fallow periods. Science of the Total Environment, 752 , 141639.

Wen, J., Mughal, N., Zhao, J., Shabbir, M. S., Niedbała, G., Jain, V., & Anwar, A. (2021). Does globalization matter for environmental degradation? Nexus among energy consumption, economic growth, and carbon dioxide emission. Energy policy, 153 , 112230.

Wilson, J. K., & Damania, R. (2005). Corruption, political competition and environmental policy. Journal of Environmental Economics and Management, 49 (3), 516–535. https://doi.org/10.1016/j.jeem.2004.06.004

Xie, P., Yang, F., Mu, Z., & Gao, S. (2020). Influencing factors of the decoupling relationship between CO2 emission and economic development in China’s power industry. Energy, 209 , 118341.

Xu, F., Huang, Q., Yue, H., et al. (2020). Reexamining the relationship between urbanization and pollutant emissions in China based on the STIRPAT model. Journal of Environmental Management, 273 , 111134. https://doi.org/10.1016/j.jenvman.2020.111134

Xue, C., Shahbaz, M., Ahmed, Z., Ahmad, M., & Sinha, A. (2022). Clean energy consumption, economic growth, and environmental sustainability: what is the role of economic policy uncertainty? Renewable Energy, 184 , 899–907.

Yahaya, N. S., Mohd-Jali, M. R., & Raji, J. O. (2020). The role of financial development and corruption in environmental degradation of Sub-Saharan African countries. Management of Environmental Quality, 31 (4), 895–913. https://doi.org/10.1108/MEQ-09-2019-0190

Yang, F., Yuan, H., & Yi, N. (2022). Natural resources, environment and the sustainable development. Urban Climate, 42 . https://doi.org/10.1016/j.uclim.2022.101111

Yousefi-Sahzabi, A., Sasaki, K., Yousefi, H., Pirasteh, S., & Sugai, Y. (2011). GIS aided prediction of CO2 emission dispersion from geothermal electricity production. Journal of Cleaner Production, 19 (17-18), 1982–1993.

Yilanci, V., & Pata, U. K. (2020). Convergence of per capita ecological footprint among the ASEAN-5 countries: Evidence from a non-linear panel unit root test. Ecological Indicators, 113 , 106178. https://doi.org/10.1016/j.ecolind.2020.106178

Zhang, M., Yang, Z., Liu, L., & Zhou, D. (2021). Impact of renewable energy investment on carbon emissions in China - An empirical study using a nonparametric additive regression model. Science of the Total Environment, 785 , 147109. https://doi.org/10.1016/j.scitotenv.2021.147109

Zhang, Y., & Zhang, S. (2018). The impacts of GDP, trade structure, exchange rate and FDI inflows on China’s carbon emissions. Energy Policy, 120 (C), 347–353. https://doi.org/10.1016/j.enpol.2018.05.056

Zhou, A., & Li, J. (2019). Heterogeneous role of renewable energy consumption in economic growth and emissions reduction: Evidence from a panel quantile regression. Environmental Science and Pollution Research, 26 (22), 22575–22595. https://doi.org/10.1007/s11356-019-05447-w

Zoundi, Z. (2017). CO2 emissions, renewable energy and the environmental Kuznets curve, a panel cointegration approach. Renewable and Sustainable Energy Reviews, 72 , 1067–1075.

Download references

Availability of Data and Material

All data is provided in the results section of this manuscript.

Author information

Authors and affiliations.

University of Sfax, Tunisia, Higher Institute of Business Administration of Sfax, Laboratory of Probability and Statistics - PB 1171, Sfax, Tunisia

Lobna Abid & Sana Kacem

University of Sfax, Tunisia, Higher Institute of Business Administration of Sfax, Laboratory of Governance, Finance and Accounting-LR 13ES19, Sfax, Tunisia

Haifa Saadaoui

University of Sfax, Tunisia, Faculty of Economics and Management of Sfax, LED, Tunisia, Sfax

Sfax, Tunisia

Contributions

The authors contributed equally to this work.

Corresponding author

Correspondence to Lobna Abid.

Ethics declarations

Ethics approval.

Not applicable

Consent to Participate

Consent for publication, competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Abid, L., Kacem, S. & Saadaoui, H. Addressing the Environmental Kuznets Curve in the West African Countries: Exploring the Roles of FDI, Corruption, and Renewable Energy. J Knowl Econ (2024). https://doi.org/10.1007/s13132-024-01858-4

Received: 08 April 2022

Accepted: 13 February 2024

Published: 15 April 2024

DOI: https://doi.org/10.1007/s13132-024-01858-4

  • Economic growth
  • CO2 emissions
  • Renewable energy
  • West African economies

JEL Classification

COMMENTS

  1. Research Hypothesis In Psychology: Types, & Examples

    The alternative hypothesis states a relationship exists between the two variables being studied (one variable affects the other). A hypothesis is a testable statement or prediction about the relationship between two or more variables. It is a key component of the scientific method. Some key points about hypotheses: A hypothesis expresses an ...

  2. How to Write a Strong Hypothesis

    5. Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if…then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  3. What is a Research Hypothesis: How to Write it, Types, and Examples

    The hypothesis should state a predicted relationship between two or more variables that can be measured and manipulated. Improve the original draft till it is clear and meaningful. State the null hypothesis: The null hypothesis is a statement that there is no relationship between the variables you are studying.

  4. How to Write a Strong Hypothesis

    Step 5: Phrase your hypothesis in three ways. To identify the variables, you can write a simple prediction in if … then form. The first part of the sentence states the independent variable and the second part states the dependent variable. If a first-year student starts attending more lectures, then their exam scores will improve.

  5. How to Write a Great Hypothesis

    Simple hypothesis: This type of hypothesis suggests that there is a relationship between one independent variable and one dependent variable. Complex hypothesis: This type of hypothesis suggests a relationship between three or more variables, such as two independent variables and a dependent variable. Null hypothesis: This hypothesis suggests ...

  6. Choosing the Right Statistical Test

    determine whether a predictor variable has a statistically significant relationship with an outcome variable. estimate the difference between two or more groups. Statistical tests assume a null hypothesis of no relationship or no difference between groups. Then they determine whether the observed data fall outside of the range of values ...

  7. 11.2: Correlation Hypothesis Test

    The p-value is calculated using a t-distribution with n − 2 degrees of freedom. The formula for the test statistic is \(t = \dfrac{r\sqrt{n-2}}{\sqrt{1-r^2}}\). The value of the test statistic, t, is shown in the computer or calculator output along with the p-value. The test statistic t has the same sign as the correlation coefficient r.

  8. 5.2

    5.2 - Writing Hypotheses. The first step in conducting a hypothesis test is to write the hypothesis statements that are going to be tested. For each test you will have a null hypothesis (\(H_0\)) and an alternative hypothesis (\(H_a\)). When writing hypotheses there are three things that we need to know: (1) the parameter that we are testing (2) the ...

  9. What is a Hypothesis

    An alternative hypothesis is a statement that assumes there is a significant difference or relationship between variables. It is often used as an alternative to the null hypothesis and is tested against the null hypothesis to determine which statement is more accurate. Directional Hypothesis. A directional hypothesis is a statement that ...

  10. 12.1.2: Hypothesis Test for a Correlation

    The t-test is a statistical test for the correlation coefficient. It can be used when x and y are linearly related, the variables are random variables, and when the population of the variable y is normally distributed. The formula for the t-test statistic is \(t = r\sqrt{\dfrac{n-2}{1-r^2}}\).

  11. Types of Research Hypotheses

    A complex hypothesis predicts the relationship between two or more independent and dependent variables. Directional Hypothesis A directional hypothesis specifies the expected direction to be followed to determine the relationship between variables. This kind of hypothesis is derived from theory, and it also implies the researcher's academic ...

  12. PDF DEVELOPING HYPOTHESIS AND RESEARCH QUESTIONS

    "A hypothesis is a conjectural statement of the relation between two or more variables." (Kerlinger, 1956) "Hypothesis is a formal statement that presents the expected relationship between an independent and dependent variable." (Creswell, 1994) "A research question is essentially a hypothesis asked in the form of a question."

  13. 10.1

    Example 10.6: Hypotheses about the Relationship between Two Categorical Variables. Research Question: Do the odds of ... Alternative Hypothesis: There is a relationship between whether or not a person has a stroke and whether or not a person lives with a smoker (odds ratio between stroke and second-hand smoke situation is > 1). This is ...

  14. 5.1: Linear Regression and Correlation

    One is a hypothesis test, to see if there is an association between the two variables; in other words, as the \(X\) variable goes up, does the \(Y\) variable tend to change (up or down). ... The second goal of correlation and regression is estimating the strength of the relationship between two variables; in other words, how close the points on ...

  15. Relationships Between Two Variables

    Now we will study the relationship between two variables where both variables are qualitative, i.e. categorical, or quantitative. When we consider the relationship between two variables, there are three possibilities: Both variables are categorical. We analyze an association through a comparison of conditional probabilities and graphically ...

  16. Research Questions & Hypotheses

    Relationship-Based Hypothesis A relationship-based hypothesis predicts a relationship between two or more variables. It suggests that changes in one variable will correspond to changes in another, indicating a potential correlation or association. Example: "Students with higher self-efficacy will show higher levels of academic achievement."

  17. A Practical Guide to Writing Quantitative and Qualitative Research

    In quantitative research, hypotheses predict the expected relationships among variables.15 Relationships among variables that can be predicted include 1) between a single dependent variable and a single independent variable (simple hypothesis) or 2) between two or more independent and dependent variables (complex hypothesis).4,11 Hypotheses may ...

  18. What is a Research Hypothesis and How to Write a Hypothesis

    It states that there is a relationship between the two variables of the study and that the results are significant to the research topic. An experimental hypothesis predicts what changes will take place in the dependent variable when the independent variable is manipulated. ... • Simple hypothesis: Simple hypothesis predicts the relationship ...

  19. Statistical Inference for the Relationship Between Two Variables

    We hypothesize that the two variables are related, but we are reluctant to specify the direction of the relationship. Therefore, we want to test \(H_0: \rho = 0\) versus \(H_A: \rho \neq 0\). Previously, we found that the sample correlation coefficient between these two variables is r = −0.09 based on a sample of size n = 252 men.

  20. How to Write a Hypothesis in 6 Steps, With Examples

    2 Complex hypothesis. A complex hypothesis suggests the relationship between more than two variables, for example, two independents and one dependent, or vice versa. Examples: People who both (1) eat a lot of fatty foods and (2) have a family history of health problems are more likely to develop heart diseases.

  21. Variables and Hypotheses

    A hypothesis states a presumed relationship between two variables in a way that can be tested with empirical data. It may take the form of a cause-effect statement, or an "if x,...then y" statement. The cause is called the independent variable; and the effect is called the dependent variable. Relationships can be of several forms: linear, or ...

  22. 15 Hypothesis Examples (2024)

    The relational hypothesis describes the relation between two variables. Here, the hypothesis is that the implementation of gamified learning has a positive effect on the motivation of students. ... The simple hypothesis is a prediction about the relationship between two variables, excluding any other variables from consideration. ...

  23. What is Hypothesis

    This is the initial point of any investigation that translates the research questions into predictions. It includes components like variables, population and the relation between the variables. A research hypothesis is a hypothesis that is used to test the relationship between two or more variables.

  24. The Independent Variable vs. Dependent Variable in Research

    In any scientific research, there are typically two variables of interest: independent variables and dependent variables. In forming the backbone of scientific experiments, they help scientists understand relationships, predict outcomes and, in general, make sense of the factors that they're investigating. Understanding the independent variable vs. dependent variable is so fundamental to ...

  25. Addressing the Environmental Kuznets Curve in the West ...

    Environmental degradation and economic growth are two intricately related issues whose impact is in constant increase within a global context marked by climate risks and corruption, notably in certain African countries. This research work examines the impacts of economic growth, corruption, renewable energy, and foreign direct investment on carbon dioxide emissions for a set of West African ...

  26. Answered: You are testing the null hypothesis…

    H₀: β₁ ≥ 0, H₁: β₁ < 0. D. H₀: β₁ = 0, H₁: β₁ ≠ 0. You are testing the null hypothesis that there is no relationship between two variables, X and Y. From your sample of n = 20, you determine that SSR = 80 and SSE = 20. Complete parts (a) through (e) below. a.
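
As a rough illustration of the correlation hypothesis test quoted in the snippets above, the following Python sketch computes the test statistic t = r√(n − 2) / √(1 − r²) and approximates a two-sided p-value using only the standard library. The function names (`correlation_t_statistic`, `t_pdf`, `two_sided_p_value`) and the Simpson's-rule integration are assumptions made for this example and do not come from any of the quoted sources. The inputs r = −0.09 and n = 252 are the values given in the "Statistical Inference for the Relationship Between Two Variables" snippet.

```python
import math


def correlation_t_statistic(r, n):
    """Return (t, df) for testing H0: rho = 0 from a sample correlation r and sample size n."""
    if n <= 2:
        raise ValueError("n must be greater than 2")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1.0 - r * r)
    return t, df


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t, df, steps=20000):
    """Approximate P(|T| >= |t|) by integrating the density on [0, |t|] with Simpson's rule.

    steps must be even; this is an illustrative approximation, not a library routine.
    """
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3.0  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central)


# Values taken from the quoted snippet: r = -0.09, n = 252.
t_stat, dof = correlation_t_statistic(-0.09, 252)
p_value = two_sided_p_value(t_stat, dof)
print(f"t = {t_stat:.2f}, df = {dof}, two-sided p = {p_value:.2f}")
```

With these inputs, t comes out near −1.43 on 250 degrees of freedom and the approximate p-value is well above 0.05, so under this sketch the null hypothesis ρ = 0 would not be rejected at the 5% level. The test statistic has the same sign as r, as the quoted text notes.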