
A Beginner’s Guide to Hypothesis Testing in Business


  • 30 Mar 2021

Becoming a more data-driven decision-maker can bring several benefits to your organization, enabling you to identify new opportunities to pursue and threats to abate. Rather than allowing subjective thinking to guide your business strategy, backing your decisions with data can empower your company to become more innovative and, ultimately, profitable.

If you’re new to data-driven decision-making, you might be wondering how data translates into business strategy. The answer lies in generating a hypothesis and verifying or rejecting it based on what various forms of data tell you.

Below is a look at hypothesis testing and the role it plays in helping businesses become more data-driven.


What Is Hypothesis Testing?

To understand what hypothesis testing is, it’s important first to understand what a hypothesis is.

A hypothesis or hypothesis statement seeks to explain why something has happened, or what might happen, under certain conditions. It can also be used to understand how different variables relate to each other. Hypotheses are often written as if-then statements; for example, “If this happens, then this will happen.”

Hypothesis testing, then, is a statistical means of testing an assumption stated in a hypothesis. While the specific methodology depends on the nature of the hypothesis and the data available, hypothesis testing typically uses sample data to draw inferences about a larger population.

Hypothesis Testing in Business

Data-driven decision-making carries a certain amount of risk that the analysis itself will mislead you. This could be due to flawed thinking or observations, incomplete or inaccurate data, or the presence of unknown variables. The danger is that major strategic decisions based on flawed insights can lead to wasted resources, missed opportunities, and catastrophic outcomes.

The real value of hypothesis testing in business is that it allows professionals to test their theories and assumptions before putting them into action. This essentially allows an organization to verify its analysis is correct before committing resources to implement a broader strategy.

As one example, consider a company that wishes to launch a new marketing campaign to revitalize sales during a slow period. Doing so could be an incredibly expensive endeavor, depending on the campaign’s size and complexity. The company, therefore, may wish to test the campaign on a smaller scale to understand how it will perform.

In this example, the hypothesis that’s being tested would fall along the lines of: “If the company launches a new marketing campaign, then it will translate into an increase in sales.” It may even be possible to quantify how much of a lift in sales the company expects to see from the effort. Pending the results of the pilot campaign, the business would then know whether it makes sense to roll it out more broadly.


Key Considerations for Hypothesis Testing

1. Alternative Hypothesis and Null Hypothesis

In hypothesis testing, the hypothesis that’s being tested is known as the alternative hypothesis. Often, it’s expressed as a correlation or statistical relationship between variables. The null hypothesis, on the other hand, is a statement that’s meant to show there’s no statistical relationship between the variables being tested. It’s typically the exact opposite of whatever is stated in the alternative hypothesis.

For example, consider a company’s leadership team that historically and reliably sees $12 million in monthly revenue. They want to understand if reducing the price of their services will attract more customers and, in turn, increase revenue.

In this case, the alternative hypothesis may take the form of a statement such as: “If we reduce the price of our flagship service by five percent, then we’ll see an increase in sales and realize revenues greater than $12 million in the next month.”

The null hypothesis, on the other hand, would indicate that revenues wouldn’t increase from the base of $12 million, or might even decrease.


2. Significance Level and P-Value

Statistically speaking, if you were to run the same scenario 100 times, you’d likely receive somewhat different results each time. If you were to plot these results in a distribution plot, you’d see the most likely outcome is at the tallest point in the graph, with less likely outcomes falling to the right and left of that point.

[Figure: distribution plot]

With this in mind, imagine you’ve completed your hypothesis test and have your results, which indicate there may be a correlation between the variables you were testing. To understand your results’ significance, you’ll need to identify a p-value for the test, which indicates how strong the evidence against the null hypothesis is.

In statistics, the p-value is the probability that, assuming the null hypothesis is correct, you would still observe results at least as extreme as the results of your hypothesis test. The smaller the p-value, the stronger the evidence against the null hypothesis, and the greater the statistical significance of your results. Note that the p-value is not the probability that the alternative hypothesis is correct.
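As a rough sketch of that definition, the snippet below computes an exact one-sided p-value for a hypothetical pilot: 100 customers, 60 conversions, and a null hypothesis that the true conversion rate is 50%. The numbers and the function name `binomial_upper_tail` are assumptions made up for illustration, not figures from this article.

```python
from math import comb

def binomial_upper_tail(successes: int, trials: int, p_null: float) -> float:
    """P(X >= successes) when X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical pilot data: 60 of 100 customers converted.
# Null hypothesis: the true conversion rate is 0.5.
p_value = binomial_upper_tail(60, 100, 0.5)
print(f"one-sided p-value: {p_value:.4f}")
```

The sum adds up the probability of every outcome at least as extreme as the one observed, which is exactly what "at least as extreme" means for a one-sided test on a count.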

3. One-Sided vs. Two-Sided Testing

When it’s time to test your hypothesis, it’s important to leverage the correct testing method. The two most common hypothesis testing methods are one-sided and two-sided tests, or one-tailed and two-tailed tests, respectively.

Typically, you’d leverage a one-sided test when you have a strong conviction about the direction of change you expect to see due to your hypothesis test. You’d leverage a two-sided test when you’re less confident in the direction of change.
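To make the difference concrete, here is a minimal sketch that turns the same z statistic into a one-sided and a two-sided p-value using the standard normal distribution. The statistic of 1.8 and the helper name `z_p_values` are hypothetical, chosen only to show that the two tests can disagree.

```python
from statistics import NormalDist

def z_p_values(z: float) -> tuple[float, float]:
    """Return (one-sided upper-tail, two-sided) p-values for a z statistic."""
    upper_tail = 1 - NormalDist().cdf(z)
    two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return upper_tail, two_sided

one_sided, two_sided = z_p_values(1.8)  # hypothetical test statistic
alpha = 0.05
print(f"one-sided: {one_sided:.4f}, reject H0: {one_sided < alpha}")
print(f"two-sided: {two_sided:.4f}, reject H0: {two_sided < alpha}")
```

For a positive z, the two-sided p-value is double the one-sided one, so a result can be significant in a one-sided test and not in a two-sided test. That is why the direction should be chosen before looking at the data.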


4. Sampling

To perform hypothesis testing in the first place, you need to collect a sample of data to be analyzed. Depending on the question you’re seeking to answer or investigate, you might collect samples through surveys, observational studies, or experiments.

A survey involves asking a series of questions to a random population sample and recording self-reported responses.

Observational studies involve a researcher observing a sample population and collecting data as it occurs naturally, without intervention.

Finally, an experiment involves dividing a sample into multiple groups, one of which acts as the control group. For each non-control group, the variable being studied is manipulated to determine how the data collected differs from that of the control group.
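For the experimental approach, group assignment is usually randomized so the groups are comparable. The sketch below shows one simple way to do that; the customer labels, the fixed seed, and the function name `assign_groups` are illustrative assumptions.

```python
import random

def assign_groups(units: list[str], seed: int = 0) -> tuple[list[str], list[str]]:
    """Randomly split units into (control, treatment) groups of near-equal size."""
    shuffled = units[:]                    # copy so the input list is not modified
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

customers = [f"customer_{i}" for i in range(10)]  # hypothetical sample
control, treatment = assign_groups(customers)
print("control:", control)
print("treatment:", treatment)
```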


Learn How to Perform Hypothesis Testing

Hypothesis testing is a complex process involving different moving pieces that can allow an organization to effectively leverage its data and inform strategic decisions.

If you’re interested in better understanding hypothesis testing and the role it can play within your organization, one option is to complete a course that focuses on the process. Doing so can lay the statistical and analytical foundation you need to succeed.

Do you want to learn more about hypothesis testing? Explore Business Analytics, one of our online business essentials courses, and download our Beginner’s Guide to Data & Analytics.


  • Hypothesis Testing: Definition, Uses, Limitations + Examples

busayo.longe

Hypothesis testing is as old as the scientific method and is at the heart of the research process. 

Research exists to validate or disprove assumptions about various phenomena. The process of validation involves testing and it is in this context that we will explore hypothesis testing. 

What is a Hypothesis? 

A hypothesis is a calculated prediction or assumption about a population parameter based on limited evidence. The whole idea behind hypothesis formulation is testing—this means the researcher subjects his or her calculated assumption to a series of evaluations to know whether they are true or false. 

Typically, every study starts with a hypothesis: the investigator makes a claim and runs experiments to determine whether the claim is true or false. For instance, if you predict that students who drink milk before class perform better than those who don’t, then this becomes a hypothesis that can be confirmed or refuted using an experiment.


What are the Types of Hypotheses? 

1. Simple Hypothesis

Also known as a basic hypothesis, a simple hypothesis suggests that an independent variable is responsible for a corresponding dependent variable. In other words, an occurrence of the independent variable inevitably leads to an occurrence of the dependent variable. 

Typically, simple hypotheses are considered as generally true, and they establish a causal relationship between two variables. 

Examples of Simple Hypothesis  

  • Drinking soda and other sugary drinks can cause obesity. 
  • Smoking cigarettes daily leads to lung cancer.

2. Complex Hypothesis

A complex hypothesis is also known as a modal. It accounts for the causal relationship between two or more independent variables and the resulting dependent variables. This means that the combination of the independent variables leads to the occurrence of the dependent variables.

Examples of Complex Hypotheses  

  • Adults who do not smoke and drink are less likely to develop liver-related conditions.
  • Global warming causes icebergs to melt which in turn causes major changes in weather patterns.

3. Null Hypothesis

As the name suggests, a null hypothesis is formed when a researcher suspects that there’s no relationship between the variables in an observation. In this case, the purpose of the research is to support or reject this assumption.

Examples of Null Hypothesis

  • There is no significant change in a student’s performance if they drink coffee or tea before classes.
  • There’s no significant change in the growth of a plant if one uses distilled water only or vitamin-rich water. 

4. Alternative Hypothesis 

To disprove a null hypothesis, the researcher has to come up with an opposite assumption, known as the alternative hypothesis. This means if the null hypothesis says that A is false, the alternative hypothesis assumes that A is true.

An alternative hypothesis can be directional or non-directional depending on the direction of the difference. A directional alternative hypothesis specifies the direction of the tested relationship, stating that one variable is predicted to be larger or smaller than the null value while a non-directional hypothesis only validates the existence of a difference without stating its direction. 

Examples of Alternative Hypotheses  

  • Starting your day with a cup of tea instead of a cup of coffee can make you more alert in the morning. 
  • The growth of a plant improves significantly when it receives distilled water instead of vitamin-rich water. 

5. Logical Hypothesis

Logical hypotheses are some of the most common types of calculated assumptions in systematic investigations. A logical hypothesis is an attempt to use reasoning to connect different pieces of research and build a theory from limited evidence. In this case, the researcher uses whatever data is available to form a plausible assumption that can be tested.

Examples of Logical Hypothesis

  • Waking up early helps you to have a more productive day. 
  • Beings from Mars would not be able to breathe the air in the atmosphere of the Earth. 

6. Empirical Hypothesis  

After forming a logical hypothesis, the next step is to create an empirical or working hypothesis. At this stage, your logical hypothesis undergoes systematic testing to prove or disprove the assumption. An empirical hypothesis is subject to several variables that can trigger changes and lead to specific outcomes. 

Examples of Empirical Hypothesis

  • People who eat more fish run faster than people who eat meat.
  • Women taking vitamin E grow hair faster than those taking vitamin K.

7. Statistical Hypothesis

When forming a statistical hypothesis, the researcher examines the portion of a population of interest and makes a calculated assumption based on the data from this sample. A statistical hypothesis is most common with systematic investigations involving a large target audience. Here, it’s impossible to collect responses from every member of the population so you have to depend on data from your sample and extrapolate the results to the wider population. 

Examples of Statistical Hypothesis  

  • 45% of students in Louisiana have middle-income parents. 
  • 80% of the UK’s population gets a divorce because of irreconcilable differences.

What is Hypothesis Testing? 

Hypothesis testing is an assessment method that allows researchers to determine the plausibility of a hypothesis. It involves testing an assumption about a specific population parameter to know whether it’s true or false. These population parameters include variance, standard deviation, and median. 

Typically, hypothesis testing starts with developing a null hypothesis and then performing several tests that support or reject the null hypothesis. The researcher uses test statistics to compare the association or relationship between two or more variables. 


Researchers also use hypothesis testing to calculate the coefficient of variation and determine if the regression relationship and the correlation coefficient are statistically significant.

How Hypothesis Testing Works

The basis of hypothesis testing is to examine and analyze the null hypothesis and alternative hypothesis to know which one is the more plausible assumption. Since the two assumptions are mutually exclusive, only one can be true. In other words, if the null hypothesis holds, the alternative cannot, and vice versa.


What Are The Stages of Hypothesis Testing?  

To successfully confirm or refute an assumption, the researcher goes through five (5) stages of hypothesis testing:

  • Determine the null hypothesis
  • Specify the alternative hypothesis
  • Set the significance level
  • Calculate the test statistics and corresponding P-value
  • Draw your conclusion
  • Determine the Null Hypothesis

As mentioned earlier, hypothesis testing starts with creating a null hypothesis, which stands as an assumption that a certain statement is false or implausible. For example, the null hypothesis (H0) could suggest that different subgroups in the research population react to a variable in the same way.

  • Specify the Alternative Hypothesis

Once you know the variables for the null hypothesis, the next step is to determine the alternative hypothesis. The alternative hypothesis counters the null assumption by suggesting the statement or assertion is true. Depending on the purpose of your research, the alternative hypothesis can be one-sided or two-sided. 

Using the example we established earlier, the alternative hypothesis may argue that the different sub-groups react differently to the same variable based on several internal and external factors. 

  • Set the Significance Level

Many researchers set the significance level at 5%. This means they accept a 0.05 probability of rejecting the null hypothesis, and going with the alternative hypothesis, when the null hypothesis is actually true.

Something to note here is that the smaller the significance level, the greater the burden of proof needed to reject the null hypothesis and support the alternative hypothesis.
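One way to see what the significance level controls is to simulate many experiments in which the null hypothesis really is true and count how often a test rejects it anyway. The sketch below does this with a one-sided z-test; the sample size, trial count, seed, and function name `false_positive_rate` are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, fmean

def false_positive_rate(alpha: float, trials: int = 5000, n: int = 30, seed: int = 1) -> float:
    """Share of simulated experiments that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha)  # one-sided critical z-value
    rejections = 0
    for _ in range(trials):
        # Data drawn with mean 0 and standard deviation 1, so H0 (mean = 0) holds.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = fmean(sample) / (1 / n ** 0.5)
        if z > critical:
            rejections += 1
    return rejections / trials

rate = false_positive_rate(alpha=0.05)
print(f"false positive rate at alpha=0.05: {rate:.3f}")
```

The simulated rate should land close to the chosen significance level, and lowering alpha lowers it, which is the "greater burden of proof" described above.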

  • Calculate the Test Statistics and Corresponding P-Value 

The test statistic summarizes how far your sample data departs from what the null hypothesis predicts, while the p-value is the probability of obtaining a test statistic at least as extreme as the one observed if your null hypothesis is true. Depending on the test, the test statistic can be based on the mean, median, or a similar sample quantity.

If your p-value is 0.65, for example, then results at least as extreme as yours would occur about 65 in 100 times by pure chance if the null hypothesis were true, so the data gives little reason to reject it. The p-value is calculated from the test statistic and its sampling distribution under the null hypothesis.

  • Draw Your Conclusions

After conducting a series of tests, you should be able to support or refute the hypothesis based on feedback and insights from your sample data.

Applications of Hypothesis Testing in Research

Hypothesis testing isn’t only confined to numbers and calculations; it also has several real-life applications in business, manufacturing, advertising, and medicine. 

In factories and other manufacturing plants, hypothesis testing is an important part of quality and production control before the final products are approved and sent out to the consumer.

During ideation and strategy development, C-level executives use hypothesis testing to evaluate their theories and assumptions before any form of implementation. For example, they could leverage hypothesis testing to determine whether or not some new advertising campaign, marketing technique, etc. causes increased sales. 

In addition, hypothesis testing is used during clinical trials to prove the efficacy of a drug or new medical method before its approval for widespread human usage. 

What is an Example of Hypothesis Testing?

An employer claims that her workers are of above-average intelligence. She takes a random sample of 20 of them and gets the following results: 

Mean IQ Scores: 110

Standard Deviation: 15 

Mean Population IQ: 100

Step 1: Using the mean population IQ, state the null hypothesis: the workers’ mean IQ is 100 (H0: μ = 100).

Step 2: State the alternative hypothesis: the workers’ mean IQ is greater than 100 (Ha: μ > 100).

Step 3: State the alpha level as 0.05 or 5% 

Step 4: Find the rejection region area (given by your alpha level above) from the z-table. An area of .05 is equal to a z-score of 1.645.

Step 5: Calculate the test statistics using this formula

Z = (x̄ – μ) ÷ (σ ÷ √n), where x̄ is the sample mean, μ is the population mean under the null hypothesis, σ is the standard deviation, and n is the sample size.

Z = (110–100) ÷ (15÷√20) 

10 ÷ 3.354 = 2.98 

If the value of the test statistic is higher than the critical value that marks the rejection region, then you should reject the null hypothesis. If it is less, then you cannot reject the null. 

In this case, 2.98 > 1.645 so we reject the null. 
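The five steps above can be reproduced in a few lines. This is a minimal sketch that, like the worked example, treats the standard deviation of 15 as known and uses a one-sided z-test. With a standard deviation estimated from only 20 people, a t-test would be the more usual choice.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, population_mean = 110, 100
std_dev, n, alpha = 15, 20, 0.05

z = (sample_mean - population_mean) / (std_dev / sqrt(n))
critical = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
p_value = 1 - NormalDist().cdf(z)            # upper-tail p-value

print(f"z = {z:.2f}, critical value = {critical:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z > critical else "fail to reject H0")
```

Comparing z with the critical value and comparing the p-value with alpha are two views of the same decision, so both lead to rejecting the null here.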

Importance/Benefits of Hypothesis Testing 

The most significant benefit of hypothesis testing is that it allows you to evaluate the strength of your claim or assumption before acting on it. It also gives you a formal, widely accepted way to judge whether the evidence in your data supports a claim. Other benefits include:

  • Hypothesis testing provides a reliable framework for making any data decisions for your population of interest. 
  • It helps the researcher to successfully extrapolate data from the sample to the larger population. 
  • Hypothesis testing allows the researcher to determine whether the data from the sample is statistically significant. 
  • Hypothesis testing is one of the most important processes for measuring the validity and reliability of outcomes in any systematic investigation. 
  • It helps to provide links to the underlying theory and specific research questions.

Criticism and Limitations of Hypothesis Testing

Several limitations of hypothesis testing can affect the quality of data you get from this process. Some of these limitations include: 

  • The interpretation of a p-value for observation depends on the stopping rule and definition of multiple comparisons. This makes it difficult to calculate since the stopping rule is subject to numerous interpretations, plus “multiple comparisons” are unavoidably ambiguous. 
  • Conceptual issues often arise in hypothesis testing, especially if the researcher merges Fisher and Neyman-Pearson’s methods which are conceptually distinct. 
  • In an attempt to focus on the statistical significance of the data, the researcher might ignore the estimation and confirmation by repeated experiments.
  • Hypothesis testing can trigger publication bias, especially when it requires statistical significance as a criterion for publication.
  • When used to detect whether a difference exists between groups, hypothesis testing can trigger absurd assumptions that affect the reliability of your observation.


  • Guide: Hypothesis Testing


Daniel Croft

Daniel Croft is an experienced continuous improvement manager with a Lean Six Sigma Black Belt and a Bachelor's degree in Business Management. With more than ten years of experience applying his skills across various industries, Daniel specializes in optimizing processes and improving efficiency. His approach combines practical experience with a deep understanding of business fundamentals to drive meaningful change.

  • Last Updated: September 8, 2023
  • Learn Lean Sigma

In the world of data-driven decision-making, Hypothesis Testing stands as a cornerstone methodology. It serves as the statistical backbone for a multitude of sectors, from manufacturing and logistics to healthcare and finance. But what exactly is Hypothesis Testing, and why is it so indispensable? Simply put, it’s a technique that allows you to validate or invalidate claims about a population based on sample data. Whether you’re looking to streamline a manufacturing process, optimize logistics, or improve customer satisfaction, Hypothesis Testing offers a structured approach to reach conclusive, data-supported decisions.

The graphical example above provides a simplified snapshot of a hypothesis test. The bell curve represents a normal distribution, the green area is where you’d fail to reject the null hypothesis (H0), and the red area is the “rejection zone,” where you’d favor the alternative hypothesis (Ha). The vertical blue line represents the threshold value or “critical value,” beyond which you’d reject H0.

Example of Hypothesis Testing

In this graph:

  • The curve represents a standard normal distribution, often encountered in hypothesis tests.
  • The green-shaded area signifies the “Acceptance Region,” where you would fail to reject the null hypothesis (H0).
  • The red-shaded areas are the “Rejection Regions,” where you would reject H0 in favor of the alternative hypothesis (Ha).
  • The blue dashed lines indicate the “Critical Values” (±1.96), which are the thresholds for rejecting H0.

This graphical representation serves as a conceptual foundation for understanding the mechanics of hypothesis testing. It visually illustrates what it means to accept or reject a hypothesis based on a predefined level of significance.
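The regions described above can be expressed as a simple decision rule. The sketch below is one way to write it for a two-sided z-test; the function name `two_sided_decision` and the sample z values are hypothetical.

```python
from statistics import NormalDist

def two_sided_decision(z: float, alpha: float = 0.05) -> str:
    """Compare |z| with the two-sided critical value for the given alpha."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    return "reject H0" if abs(z) > critical else "fail to reject H0"

for z in (-2.5, 0.7, 2.1):  # hypothetical test statistics
    print(z, two_sided_decision(z))
```

Because the test is two-sided, alpha is split evenly between the two tails, which is where the ±1.96 thresholds come from.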


What is Hypothesis Testing?

Hypothesis testing is a structured procedure in statistics used for drawing conclusions about a larger population based on a subset of that population, known as a sample. The method is widely used across different industries and sectors for a variety of purposes. Below, we’ll dissect the key components of hypothesis testing to provide a more in-depth understanding.

The Hypotheses: H0 and Ha

In every hypothesis test, there are two competing statements:

  • Null Hypothesis (H0): This is the “status quo” hypothesis that you are trying to test against. It is a statement that asserts that there is no effect or difference. For example, in a manufacturing setting, the null hypothesis might state that a new production process does not improve the average output quality.
  • Alternative Hypothesis (Ha or H1): This is what you aim to prove by conducting the hypothesis test. It is the statement that there is an effect or difference. Using the same manufacturing example, the alternative hypothesis might state that the new process does improve the average output quality.

Significance Level (α)

Before conducting the test, you decide on a “Significance Level” (α), typically set at 0.05 or 5%. This level represents the probability of rejecting the null hypothesis when it is actually true. Lower α values make the test more stringent, reducing the chances of a ‘false positive’.

Data Collection

You then proceed to gather data, which is usually a sample from a larger population. The quality of your test heavily relies on how well this sample represents the population. The data can be collected through various means such as surveys, observations, or experiments.

Statistical Test

Depending on the nature of the data and what you’re trying to prove, different statistical tests can be applied (e.g., t-test, chi-square test, ANOVA, etc.). These tests will compute a test statistic (e.g., t, χ², F, etc.) based on your sample data.

[Figure: distributions used in the t-test, Chi-square test, and ANOVA]

Here are graphical examples of the distributions commonly used in three different types of statistical tests: t-test, Chi-square test, and ANOVA (Analysis of Variance), displayed side by side for comparison.

T-test (t-distribution)

  • Graph 1 (Leftmost): This graph represents a t-distribution, often used in t-tests. The t-distribution is similar to the normal distribution but tends to have heavier tails. It is commonly used when the sample size is small or the population variance is unknown.

Chi-square Test

  • Graph 2 (Middle): The Chi-square distribution is used in Chi-square tests, often for testing independence or goodness-of-fit. Unlike the t-distribution, the Chi-square distribution is not symmetrical and only takes on positive values.

ANOVA (F-distribution)

  • Graph 3 (Rightmost): The F-distribution is used in Analysis of Variance (ANOVA), a statistical test used to analyze the differences between group means. Like the Chi-square distribution, the F-distribution is also not symmetrical and takes only positive values.

These visual representations provide an intuitive understanding of the different statistical tests and their underlying distributions. Knowing which test to use and when is crucial for conducting accurate and meaningful hypothesis tests.
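As one concrete instance, the sketch below computes a Chi-square goodness-of-fit statistic by hand. The observed and expected counts are made up for illustration, and the critical value of 5.991 is the standard table value for 2 degrees of freedom at α = 0.05, hard-coded here as an assumption instead of being computed.

```python
def chi_square_statistic(observed: list[float], expected: list[float]) -> float:
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [50, 30, 20]   # hypothetical counts per category
expected = [40, 40, 20]   # counts expected under H0
CRITICAL_DF2_ALPHA_05 = 5.991  # chi-square table value for df = 2, alpha = 0.05

stat = chi_square_statistic(observed, expected)
print(f"chi-square = {stat:.2f}")
print("reject H0" if stat > CRITICAL_DF2_ALPHA_05 else "fail to reject H0")
```

Because every term is squared, the statistic can never be negative, which matches the positive-only shape of the Chi-square distribution described above.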

Decision Making

The test statistic is then compared to a critical value determined by the significance level (α) and the test’s sampling distribution, which for many tests depends on the sample size. Equivalently, you can convert the test statistic into a p-value. If the p-value is less than α, you reject the null hypothesis in favor of the alternative hypothesis. Otherwise, you fail to reject the null hypothesis.

Interpretation

Finally, you interpret the results in the context of what you were investigating. Rejecting the null hypothesis might mean implementing a new process or strategy, while failing to reject it might lead to a continuation of current practices.

To sum it up, hypothesis testing is not just a set of formulas but a methodical approach to problem-solving and decision-making based on data. It’s a crucial tool for anyone interested in deriving meaningful insights from data to make informed decisions.

Why is Hypothesis Testing Important?

Hypothesis testing is a cornerstone of statistical and empirical research, serving multiple functions in various fields. Let’s delve into each of the key areas where hypothesis testing holds significant importance:

Data-Driven Decisions

In today’s complex business environment, making decisions based on gut feeling or intuition is not enough; you need data to back up your choices. Hypothesis testing serves as a rigorous methodology for making decisions based on data. By setting up a null hypothesis and an alternative hypothesis, you can use statistical methods to judge whether a data sample provides enough evidence to reject the null in favor of the alternative. This structured approach reduces guesswork and adds empirical weight to your decisions, thereby increasing their credibility and effectiveness.

Risk Management

Hypothesis testing allows you to assign a ‘p-value’ to your findings, which is the probability of observing sample data at least as extreme as yours if the null hypothesis is true. Together with the significance level, it helps quantify risk. For instance, rejecting the null hypothesis only when the p-value is below α = 0.05 caps the risk of rejecting a true null hypothesis at 5%. This is invaluable in scenarios like product launches or changes in operational processes, where understanding the risk involved can be as crucial as the decision itself.

Here’s an example to help you understand the concept better.

[Figure: standard normal curve showing the rejection region, acceptance region, and critical value]

The graph above serves as a graphical representation to help explain the concept of a ‘p-value’ and its role in quantifying risk in hypothesis testing. Here’s how to interpret the graph:

Elements of the Graph

  • The curve represents a Standard Normal Distribution, which is often used to represent z-scores in hypothesis testing.
  • The red-shaded area on the right represents the Rejection Region. It corresponds to a 5% risk (α = 0.05) of rejecting the null hypothesis when it is actually true. If your test statistic falls in this area, you reject the null hypothesis.
  • The green-shaded area represents the Acceptance Region, which covers the remaining 95% of the distribution. If your test statistic falls in this region, you fail to reject the null hypothesis.
  • The blue dashed line is the Critical Value (approximately 1.645 in this example). If your standardized test statistic (z-value) exceeds this point, you enter the rejection region, and your p-value becomes less than 0.05, leading you to reject the null hypothesis.
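As a minimal sketch of where the critical value of roughly 1.645 comes from, the snippet below uses Python's standard-library `statistics.NormalDist`. The helper name `in_rejection_region` is hypothetical and only illustrates the right-tailed test shown in the graph.

```python
from statistics import NormalDist

alpha = 0.05  # significance level used in the graph

# Standard normal distribution (mean 0, standard deviation 1)
std_normal = NormalDist(mu=0.0, sigma=1.0)

# For a right-tailed test, the critical value leaves an area of alpha to its right
critical_value = std_normal.inv_cdf(1 - alpha)


def in_rejection_region(z, critical=critical_value):
    """Return True if the z-value falls in the right-tail rejection region."""
    return z > critical


print(f"critical value: {critical_value:.3f}")
print("z = 2.0 rejects H0:", in_rejection_region(2.0))
print("z = 1.0 rejects H0:", in_rejection_region(1.0))
```

A left-tailed test would use `inv_cdf(alpha)` instead and reject when the z-value falls below it.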

Relating to Risk Management

The significance level and the p-value can be directly related to risk management. For example, if you’re considering implementing a new manufacturing process, α sets the risk you are willing to accept of going ahead with the new process when it actually makes no difference (that is, rejecting a true null hypothesis). A low p-value (less than α ) means that data like yours would be unlikely if the null hypothesis were true, so the evidence in favor of the change meets your chosen risk threshold.

Quality Control

In sectors like manufacturing, automotive, and logistics, maintaining a high level of quality is not just an option but a necessity. Hypothesis testing is often employed in quality assurance and control processes to test whether a certain process or product conforms to standards. For example, if a car manufacturing line claims its error rate is below 5%, a hypothesis test on a sample of products can show whether the data support this claim. This helps ensure that quality is not compromised and that stakeholders can trust the end product.
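The error-rate example can be sketched as an exact one-sided binomial test. The inspection numbers below (200 units, 4 defects) are hypothetical and chosen only for illustration; the null hypothesis is taken to be "the error rate is at least 5%", so that a small p-value supports the claim that it is below 5%.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n_inspected = 200     # hypothetical sample size
defects_found = 4     # hypothetical number of defective units
claimed_rate = 0.05   # boundary of the claim "error rate is below 5%"
alpha = 0.05

# H0: error rate >= 5%   Ha: error rate < 5%
# p-value: probability of seeing this few defects (or fewer) if the rate were 5%
p_value = binom_cdf(defects_found, n_inspected, claimed_rate)
supports_claim = p_value < alpha

print(f"p-value = {p_value:.4f}, claim supported: {supports_claim}")
```

With a small sample, even zero defects may not be enough evidence, which is one reason sample size matters in quality control.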

Resource Optimization

Resource allocation is a significant challenge for any organization. Hypothesis testing can be a valuable tool in determining where resources will be most effectively utilized. For instance, in a manufacturing setting, you might want to test whether a new piece of machinery significantly increases production speed. A hypothesis test could provide the statistical evidence needed to decide whether investing in more of such machinery would be a wise use of resources.

In the realm of research and development, hypothesis testing can be a game-changer. When developing a new product or process, you’ll likely have various theories or hypotheses. Hypothesis testing allows you to systematically test these, filtering out the less likely options and focusing on the most promising ones. This not only speeds up the innovation process but also makes it more cost-effective by reducing the likelihood of investing in ideas that are statistically unlikely to be successful.

In summary, hypothesis testing is a versatile tool that adds rigor, reduces risk, and enhances the decision-making and innovation processes across various sectors and functions.

This graphical representation makes it easier to grasp how the p-value is used to quantify the risk involved in making a decision based on a hypothesis test.

Step-by-Step Guide to Hypothesis Testing

To make this guide practical for readers who are new to the concept, we explain each step of the process and follow it with an example applied to a manufacturing line, where you want to test whether a new process reduces the average time it takes to assemble a product.

Step 1: State the Hypotheses

The first and foremost step in hypothesis testing is to clearly define your hypotheses. This sets the stage for your entire test and guides the subsequent steps, from data collection to decision-making. At this stage, you formulate two competing hypotheses:

Null Hypothesis ( H 0)

The null hypothesis is a statement that there is no effect or no difference, and it serves as the hypothesis that you are trying to test against. It’s the default assumption that any kind of effect or difference you suspect is not real, and is due to chance. Formulating a clear null hypothesis is crucial, as your statistical tests will be aimed at challenging this hypothesis.

In a manufacturing context, if you’re testing whether a new assembly line process has reduced the time it takes to produce an item, your null hypothesis ( H 0) could be:

H0: “The new process does not reduce the average assembly time.”

Alternative Hypothesis ( Ha or H 1)

The alternative hypothesis is what you want to prove. It is a statement that there is an effect or difference. This hypothesis is considered only after you find enough evidence against the null hypothesis.

Continuing with the manufacturing example, the alternative hypothesis ( Ha ) could be:

Ha: “The new process reduces the average assembly time.”

Types of Alternative Hypothesis

Depending on what exactly you are trying to prove, the alternative hypothesis can be:

  • Two-Sided : You’re interested in deviations in either direction (greater or smaller).
  • One-Sided : You’re interested in deviations only in one direction (either greater or smaller).

one way and two way hypothesis test

Scenario: Reducing Assembly Time in a Car Manufacturing Plant

You are a continuous improvement manager at a car manufacturing plant. One of the assembly lines has been struggling with longer assembly times, affecting the overall production schedule. A new assembly process has been proposed, promising to reduce the assembly time per car. Before rolling it out on the entire line, you decide to conduct a hypothesis test to see if the new process actually makes a difference.

Null Hypothesis (H0)

In this context, the null hypothesis would be the status quo, asserting that the new assembly process doesn’t reduce the assembly time per car. Mathematically, you could state it as:

H0: The average assembly time per car with the new process ≥ the average assembly time per car with the old process.

Or simply:

H0: “The new process does not reduce the average assembly time per car.”

Alternative Hypothesis (Ha or H1)

The alternative hypothesis is what you aim to prove: that the new process is more efficient. Mathematically, it could be stated as:

Ha: The average assembly time per car with the new process < the average assembly time per car with the old process.

Or simply:

Ha: “The new process reduces the average assembly time per car.”

Types of Alternative Hypothesis

In this example, you’re only interested in knowing if the new process reduces the time, making it a One-Sided Alternative Hypothesis.
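In symbols, the scenario's hypotheses can be written as below. The notation is an assumption introduced here for illustration: μ_new and μ_old stand for the true mean assembly time per car under the new and old process.

```latex
H_0:\ \mu_{\text{new}} \ge \mu_{\text{old}}
\qquad\text{versus}\qquad
H_a:\ \mu_{\text{new}} < \mu_{\text{old}}
```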

Step 2: Determine the Significance Level ( α )

Once you’ve clearly stated your null and alternative hypotheses, the next step is to decide on the significance level, often denoted by α . The significance level is a threshold for the p-value: if the p-value falls below it, the null hypothesis is rejected. It quantifies the level of risk you’re willing to accept when making a decision based on the hypothesis test.

What is a Significance Level?

The significance level, usually expressed as a percentage, represents the probability of rejecting the null hypothesis when it is actually true. Common choices for α are 0.05, 0.01, and 0.10, representing 5%, 1%, and 10% levels of significance, respectively.

  • 5% Significance Level ( α =0.05) : This is the most commonly used level and implies that you are willing to accept a 5% chance of rejecting the null hypothesis when it is true.
  • 1% Significance Level ( α =0.01) : This is a more stringent level, used when you want to be more sure of your decision. The risk of falsely rejecting the null hypothesis is reduced to 1%.
  • 10% Significance Level ( α =0.10) : This is a more lenient level, used when you are willing to take a higher risk. Here, the chance of falsely rejecting the null hypothesis is 10%.

Continuing with the manufacturing example, let’s say you decide to set α =0.05, meaning you’re willing to take a 5% risk of concluding that the new process is effective when it might not be.
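To make the meaning of α concrete, here is a small simulation sketch. It repeatedly draws samples for which the null hypothesis is true and counts how often a one-sided z-test rejects it anyway. The setup (normally distributed data with known standard deviation 1, 30 observations per sample, 4,000 repetitions) is an assumption made purely for illustration.

```python
import random
from statistics import NormalDist, mean


def simulate_type_i_error(alpha=0.05, n=30, trials=4000, seed=1):
    """Estimate how often a true null hypothesis is rejected at level alpha."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(alpha)  # lower-tail critical value
    rejections = 0
    for _ in range(trials):
        # Data generated with a true mean of 0, so H0 (mean >= 0) holds
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = mean(sample) / (1.0 / n**0.5)  # z-statistic with known sigma = 1
        if z < critical:
            rejections += 1
    return rejections / trials


rate = simulate_type_i_error()
print(f"share of false rejections: {rate:.3f}")
```

The share of false rejections should land close to the chosen α, which is exactly the risk that the significance level is meant to cap.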

How to Choose the Right Significance Level?

Choosing the right significance level depends on the context and the consequences of making a wrong decision. Here are some factors to consider:

  • Criticality of Decision : For highly critical decisions with severe consequences if wrong, a lower α like 0.01 may be appropriate.
  • Resource Constraints : If the cost of collecting more data is high, you may choose a higher α to make a decision based on a smaller sample size.
  • Industry Standards : Sometimes, the choice of α may be dictated by industry norms or regulatory guidelines.

By the end of Step 2, you should have a well-defined significance level that will guide the rest of your hypothesis testing process. This level serves as the cut-off for determining whether the observed effect or difference in your sample is statistically significant or not.

Continuing the Scenario: Reducing Assembly Time in a Car Manufacturing Plant

After formulating the hypotheses, the next step is to set the significance level ( α ) that will be used to interpret the results of the hypothesis test. This is a critical decision as it quantifies the level of risk you’re willing to accept when making a conclusion based on the test.

Setting the Significance Level

Given that assembly time is a critical factor affecting the production schedule, and ultimately, the company’s bottom line, you decide to be fairly stringent in your test. You opt for a commonly used significance level:

α = 0.05

This means you are willing to accept a 5% chance of rejecting the null hypothesis when it is actually true. In practical terms, if you find that the p-value of the test is less than 0.05, you will conclude that the new process significantly reduces assembly time and consider implementing it across the entire line.

Why α = 0.05?

  • Industry Standard: A 5% significance level is widely accepted in many industries, including manufacturing, for hypothesis testing.
  • Risk Management: By setting α = 0.05, you’re limiting the risk of concluding that the new process is effective when it may not be to just 5%.
  • Balanced Approach: This level offers a balance between being too lenient (e.g., α = 0.10) and too stringent (e.g., α = 0.01), making it a reasonable choice for this scenario.

Step 3: Collect and Prepare the Data

After stating your hypotheses and setting the significance level, the next vital step is data collection. The data you collect serves as the basis for your hypothesis test, so it’s essential to gather accurate and relevant data.

Types of Data

Depending on your hypothesis, you’ll need to collect either:

  • Quantitative Data : Numerical data that can be measured. Examples include height, weight, and temperature.
  • Qualitative Data : Categorical data that represent characteristics. Examples include colors, gender, and material types.

Data Collection Methods

Various methods can be used to collect data, such as:

  • Surveys and Questionnaires : Useful for collecting qualitative data and opinions.
  • Observation : Collecting data through direct or participant observation.
  • Experiments : Especially useful in scientific research where control over variables is possible.
  • Existing Data : Utilizing databases, records, or any other data previously collected.

Sample Size

The sample size ( n ) is another crucial factor. A larger sample generally gives more precise estimates and a better chance of detecting a real effect, but it’s often constrained by resources like time and money. The choice of sample size might also depend on the statistical test you plan to use.

Continuing with the manufacturing example, suppose you decide to collect data on the assembly time of 30 randomly chosen products, 15 made using the old process and 15 made using the new process. Here, your sample size n =30.
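One common way to reason about sample size is a normal-approximation formula for comparing two means in a one-sided test. The sketch below is not taken from the guide: the function name `n_per_group`, the assumed standard deviation of 2 minutes, the target difference of 1.5 minutes, and the 80% power target are all illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate observations needed per group for a one-sided two-sample test.

    delta: smallest difference in means worth detecting
    sigma: assumed common standard deviation of the measurements
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # controls the Type I error rate
    z_power = NormalDist().inv_cdf(power)      # controls the Type II error rate
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Hypothetical planning values: detect a 1.5-minute reduction when sigma is 2 minutes
print(n_per_group(delta=1.5, sigma=2.0))
```

Halving the difference you want to detect roughly quadruples the required sample, which is why the smallest effect worth acting on should be agreed before data collection starts.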

Data Preparation

Once data is collected, it often needs to be cleaned and prepared for analysis. This could involve:

  • Removing Outliers : Outliers can skew the results and provide an inaccurate picture.
  • Data Transformation : Converting data into a format suitable for statistical analysis.
  • Data Coding : Categorizing or labeling data, necessary for qualitative data.

By the end of Step 3, you should have a dataset that is ready for statistical analysis. This dataset should be representative of the population you’re interested in and prepared in a way that makes it suitable for hypothesis testing.

With the hypotheses stated and the significance level set, you’re now ready to collect the data that will serve as the foundation for your hypothesis test. Given that you’re testing a change in a manufacturing process, the data will most likely be quantitative, representing the assembly time of cars produced on the line.

Data Collection Plan

You decide to use a Random Sampling Method for your data collection. For two weeks, assembly times for randomly selected cars will be recorded: one week using the old process and another week using the new process. Your aim is to collect data for 40 cars from each process, giving you a sample size ( n ) of 80 cars in total.

Types of Data

  • Quantitative Data: In this case, you’re collecting numerical data representing the assembly time in minutes for each car.

Data Preparation

  • Data Cleaning: Once the data is collected, you’ll need to inspect it for any anomalies or outliers that could skew your results. For example, if a significant machine breakdown happened during one of the weeks, you may need to adjust your data or collect more.
  • Data Transformation: Given that you’re dealing with time, you may not need to transform your data, but it’s something to consider, depending on the statistical test you plan to use.
  • Data Coding: Since you’re dealing with quantitative data in this scenario, coding is likely unnecessary unless you’re planning to categorize assembly times into bins (e.g., ‘fast’, ‘medium’, ‘slow’) for some reason.

Example Data Points:

Car_ID   Process_Type   Assembly_Time_Minutes
1        Old            38.53
2        Old            35.80
3        Old            36.96
4        Old            39.48
5        Old            38.74
6        Old            33.05
7        Old            36.90
8        Old            34.70
9        Old            34.79
…        …              …

The complete dataset would contain 80 rows: 40 for the old process and 40 for the new process.
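As a rough sketch of the preparation step, the snippet below loads the nine example rows shown above, groups the times by process, and screens for outliers. The 1.5 × IQR rule and the helper name `iqr_outliers` are assumptions chosen for illustration; the guide does not prescribe a particular outlier rule.

```python
import csv
import io
from statistics import mean, quantiles, stdev

# The nine example rows from the scenario, in CSV form
raw = """Car_ID,Process_Type,Assembly_Time_Minutes
1,Old,38.53
2,Old,35.80
3,Old,36.96
4,Old,39.48
5,Old,38.74
6,Old,33.05
7,Old,36.90
8,Old,34.70
9,Old,34.79
"""

# Group assembly times by process type
times_by_process = {}
for row in csv.DictReader(io.StringIO(raw)):
    minutes = float(row["Assembly_Time_Minutes"])
    times_by_process.setdefault(row["Process_Type"], []).append(minutes)


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]


old = times_by_process["Old"]
print(f"n = {len(old)}, mean = {mean(old):.2f}, sd = {stdev(old):.2f}")
print("possible outliers:", iqr_outliers(old))
```

A flagged value is a prompt to investigate (for example, a machine breakdown), not an automatic reason to delete the observation.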

Step 4: Conduct the Statistical Test

After you have your hypotheses, significance level, and collected data, the next step is to actually perform the statistical test. This step involves calculations that will lead to a test statistic, which you’ll then use to make your decision regarding the null hypothesis.

Choose the Right Test

The first task is to decide which statistical test to use. The choice depends on several factors:

  • Type of Data : Quantitative or Qualitative
  • Sample Size : Large or Small
  • Number of Groups or Categories : One-sample, Two-sample, or Multiple groups

For instance, you might choose a t-test for comparing means of two groups when you have a small sample size. Chi-square tests are often used for categorical data, and ANOVA is used for comparing means across more than two groups.

Calculation of Test Statistic

Once you’ve chosen the appropriate statistical test, the next step is to calculate the test statistic. This involves using the sample data in a specific formula for the chosen test.
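For the two-sample comparison of means used in this guide's example, one common form of the test statistic is shown below. This is an illustrative assumption (the unequal-variance, or Welch, form), not necessarily the formula the guide had in mind; x̄, s², and n denote the sample mean, sample variance, and sample size of each group.

```latex
t = \frac{\bar{x}_{\text{new}} - \bar{x}_{\text{old}}}
         {\sqrt{\dfrac{s_{\text{new}}^{2}}{n_{\text{new}}} + \dfrac{s_{\text{old}}^{2}}{n_{\text{old}}}}}
```

A negative value of t here means the new process had the lower sample mean.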

Obtain the p-value

After calculating the test statistic, the next step is to find the p-value associated with it. The p-value represents the probability of observing a test statistic at least as extreme as the one you calculated if the null hypothesis is true.

  • A small p-value (< α ) indicates strong evidence against the null hypothesis, so you reject the null hypothesis.
  • A large p-value (> α ) indicates weak evidence against the null hypothesis, so you fail to reject the null hypothesis.

Make the Decision

You now compare the p-value to the predetermined significance level ( α ):

  • If p < α , you reject the null hypothesis in favor of the alternative hypothesis.
  • If p > α , you fail to reject the null hypothesis.

In the manufacturing case, if your calculated p-value is 0.03 and your α is 0.05, you would reject the null hypothesis, concluding that the new process effectively reduces the average assembly time.
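The decision rule above can be sketched in a few lines. The example assumes a left-tailed z-test and a hypothetical z-value of -1.88, picked because it gives a p-value close to the 0.03 used in the text; the function names are illustrative.

```python
from statistics import NormalDist


def one_sided_p_value(z):
    """Lower-tail p-value: P(Z <= z) when the null hypothesis is true."""
    return NormalDist().cdf(z)


def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


z = -1.88  # hypothetical standardized test statistic
p = one_sided_p_value(z)
print(f"p-value = {p:.3f} -> {decide(p)}")
```

Fixing α before looking at the data keeps the decision rule from being adjusted to fit the result.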

By the end of Step 4, you will have either rejected or failed to reject the null hypothesis, providing a statistical basis for your decision-making process.

Now that you have collected and prepared your data, the next step is to conduct the actual statistical test to evaluate the null and alternative hypotheses. In this case, you’ll be comparing the mean assembly times between cars produced using the old and new processes to determine if the new process is statistically significantly faster.

Choosing the Right Test

Given that you have two sets of independent samples (old process and new process), a Two-sample t-test for Equality of Means seems appropriate for comparing the average assembly times.

Preparing Data for Minitab

First, prepare your data in an Excel sheet or CSV file with one column for the assembly times using the old process and another column for the assembly times using the new process. Import this file into Minitab.

Steps to Perform the Two-sample t-test in Minitab

  • Open Minitab: Launch the Minitab software on your computer.
  • Import Data: Navigate to File > Open and import your data file.
  • Navigate to the t-test Menu: Go to Stat > Basic Statistics > 2-Sample t... .
  • Select Columns: In the dialog box, specify the columns corresponding to the old and new process assembly times under “Sample 1” and “Sample 2.”
  • Options: Click on Options and make sure that you set the confidence level to 95% (which corresponds to α = 0.05).
  • Run the Test: Click OK to run the test.

In this example output, the p-value is 0.0012, which is less than the significance level α = 0.05. Hence, you would reject the null hypothesis. The t-statistic is -3.45, indicating that the mean of the new process is statistically significantly less than the mean of the old process, which aligns with your alternative hypothesis. The box plot below displays the data for both processes; the assembly times for the new process are visibly lower.

Box plot output from the hypothesis test
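For readers without Minitab, the same kind of comparison can be sketched with only Python's standard library. This is an illustration under stated assumptions, not a reproduction of the Minitab output: it uses the unequal-variance (Welch) form of the two-sample t-test, approximates the t-distribution's tail area numerically with Simpson's rule, and the `new_times` values are hypothetical. The `old_times` values are the nine example rows from Step 3.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def welch_t_test_less(new, old):
    """One-sided Welch test. H0: mean(new) >= mean(old); Ha: mean(new) < mean(old)."""
    v_new = variance(new) / len(new)
    v_old = variance(old) / len(old)
    t = (mean(new) - mean(old)) / math.sqrt(v_new + v_old)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v_new + v_old) ** 2 / (
        v_new**2 / (len(new) - 1) + v_old**2 / (len(old) - 1)
    )
    return t, df, t_cdf(t, df)  # lower-tail p-value


old_times = [38.53, 35.80, 36.96, 39.48, 38.74, 33.05, 36.90, 34.70, 34.79]
new_times = [34.1, 33.5, 35.2, 32.8, 34.9, 33.0, 35.6, 34.4, 33.7]  # hypothetical

t_stat, df, p_value = welch_t_test_less(new_times, old_times)
print(f"t = {t_stat:.2f}, df = {df:.1f}, one-sided p = {p_value:.4f}")
```

In practice a statistics package would compute the p-value directly; the hand-rolled integration is used here only to keep the example free of third-party dependencies.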

Why do a Hypothesis test?

After all this, you might ask why you should do a hypothesis test rather than simply compare the averages. It is a good question. While looking at average times might give you a general idea of which process is faster, hypothesis testing provides several advantages that a simple comparison of averages doesn’t offer:

Statistical Significance

Account for Random Variability : Hypothesis testing considers not just the averages, but also the variability within each group. This allows you to make more robust conclusions that account for random chance.

Quantify the Evidence : With hypothesis testing, you obtain a p-value that quantifies the strength of the evidence against the null hypothesis. A simple comparison of averages doesn’t provide this level of detail.

Control Type I Error : Hypothesis testing allows you to control the probability of making a Type I error (i.e., rejecting a true null hypothesis). This is particularly useful in settings where the consequences of such an error could be costly or risky.

Quantify Risk : Hypothesis testing provides a structured way to make decisions based on a predefined level of risk (the significance level, α ).

Decision-making Confidence

Objective Decision Making : The formal structure of hypothesis testing provides an objective framework for decision-making. This is especially useful in a business setting where decisions often have to be justified to stakeholders.

Replicability : The formal procedure makes the analysis easier to replicate. Another team could perform the same test on comparable data and expect to reach similar conclusions, which is not necessarily the case when comparing only averages.

Additional Insights

Understanding of Variability : Hypothesis testing often involves looking at measures of spread and distribution, not just the mean. This can offer additional insights into the processes you’re comparing.

Basis for Further Analysis : Once you’ve performed a hypothesis test, you can often follow it up with other analyses (like confidence intervals for the difference in means, or effect size calculations) that offer more detailed information.
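As one example of such a follow-up analysis, an effect size such as Cohen's d expresses the difference in means in units of the pooled standard deviation. The sketch below is illustrative; the `new_times` values are hypothetical, and the pooled-variance form of d is one of several conventions.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


old_times = [38.53, 35.80, 36.96, 39.48, 38.74, 33.05, 36.90, 34.70, 34.79]
new_times = [34.1, 33.5, 35.2, 32.8, 34.9, 33.0, 35.6, 34.4, 33.7]  # hypothetical

d = cohens_d(new_times, old_times)
print(f"Cohen's d = {d:.2f}")  # negative means the new process is faster
```

Unlike a p-value, an effect size does not shrink or grow just because the sample gets larger, so it is a useful companion when judging practical significance.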

In summary, while comparing averages is quicker and simpler, hypothesis testing provides a more reliable, nuanced, and objective basis for making data-driven decisions.

Step 5: Interpret the Results and Make Conclusions

Having conducted the statistical test and obtained the p-value, you’re now at a stage where you can interpret these results in the context of the problem you’re investigating. This step is crucial for transforming the statistical findings into actionable insights.

Interpret the p-value

The p-value you obtained tells you the significance of your results:

  • Low p-value ( p < α ) : Indicates that the results are statistically significant, and it’s unlikely that the observed effects are due to random chance. In this case, you generally reject the null hypothesis.
  • High p-value ( p > α ) : Indicates that the results are not statistically significant, and the observed effects could well be due to random chance. Here, you generally fail to reject the null hypothesis.

Relate to Real-world Context

You should then relate these statistical conclusions to the real-world context of your problem. This is where your expertise in your specific field comes into play.

In our manufacturing example, if you’ve found a statistically significant reduction in assembly time with a p-value of 0.03 (which is less than the α level of 0.05), you can confidently conclude that the new manufacturing process is more efficient. You might then consider implementing this new process across the entire assembly line.

Make Recommendations

Based on your conclusions, you can make recommendations for action or further study. For example:

  • Implement Changes : If the test results are significant, consider making the changes on a larger scale.
  • Further Research : If the test results are not clear or not significant, you may recommend further studies or data collection.
  • Review Methodology : If you find that the results are not as expected, it might be useful to review the methodology and see if the test was conducted under the right conditions and with the right test parameters.

Document the Findings

Lastly, it’s essential to document all the steps taken, the methodology used, the data collected, and the conclusions drawn. This documentation is not only useful for any further studies but also for auditing purposes or for stakeholders who may need to understand the process and the findings.

By the end of Step 5, you’ll have turned the raw statistical findings into meaningful conclusions and actionable insights. This is the final step in the hypothesis testing process, making it a complete, robust method for informed decision-making.

You’ve successfully conducted the hypothesis test and found strong evidence to reject the null hypothesis in favor of the alternative: The new assembly process is statistically significantly faster than the old one. It’s now time to interpret these results in the context of your business operations and make actionable recommendations.

Interpretation of Results

  • Statistical Significance: The p-value of 0.0012 is well below the significance level of α = 0.05, indicating that the results are statistically significant.
  • Practical Significance: The boxplot suggests that the difference is not just statistically but also practically significant. The new process appears to be both consistently and substantially faster.
  • Risk Assessment: Because the p-value is far below α, a result this extreme would be very unlikely if the null hypothesis were true, so you can reject the null hypothesis with a high degree of confidence.

Business Implications

  • Increased Productivity: Implementing the new process could lead to an increase in the number of cars produced, thereby enhancing productivity.
  • Cost Savings: Faster assembly time likely translates to lower labor costs.
  • Quality Control: Consider monitoring the quality of cars produced under the new process closely to ensure that the speedier assembly does not compromise quality.

Recommendations

  • Implement New Process: Given the statistical and practical significance of the findings, recommend implementing the new process across the entire assembly line.
  • Monitor and Adjust: Implement a control phase where the new process is monitored for both speed and quality. This could involve additional hypothesis tests or control charts.
  • Communicate Findings: Share the results and recommendations with stakeholders through a formal presentation or report, emphasizing both the statistical rigor and the potential business benefits.
  • Review Resource Allocation: Given the likely increase in productivity, assess if resources like labor and parts need to be reallocated to optimize the workflow further.

By following this step-by-step guide, you’ve journeyed through the rigorous yet enlightening process of hypothesis testing. From stating clear hypotheses to interpreting the results, each step has paved the way for making informed, data-driven decisions that can significantly impact your projects, business, or research.

Hypothesis testing is more than just a set of formulas or calculations; it’s a holistic approach to problem-solving that incorporates context, statistics, and strategic decision-making. While the process may seem daunting at first, each step serves a crucial role in ensuring that your conclusions are both statistically sound and practically relevant.

Q: What is hypothesis testing in the context of Lean Six Sigma?

A: Hypothesis testing is a statistical method used in Lean Six Sigma to determine whether there is enough evidence in a sample of data to infer that a certain condition holds true for the entire population. In the Lean Six Sigma process, it’s commonly used to validate the effectiveness of process improvements by comparing performance metrics before and after changes are implemented. A null hypothesis (H0) usually represents no change or effect, while the alternative hypothesis (H1) indicates a significant change or effect.

Q: How do I determine which statistical test to use for my hypothesis?

A: The choice of statistical test for hypothesis testing depends on several factors, including the type of data (nominal, ordinal, interval, or ratio), the sample size, the number of samples (one sample, two samples, paired), and whether the data distribution is normal. For example, a t-test is used for comparing the means of two groups when the data is normally distributed, while a Chi-square test is suitable for categorical data to test the relationship between two variables. It’s important to choose the right test to ensure the validity of your hypothesis testing results.

Q: What is a p-value, and how does it relate to hypothesis testing?

A: A p-value is a probability value that helps you determine the significance of your results in hypothesis testing. It represents the likelihood of obtaining a result at least as extreme as the one observed during the test, assuming that the null hypothesis is true. In hypothesis testing, if the p-value is lower than the predetermined significance level (commonly α = 0.05 ), you reject the null hypothesis, suggesting that the observed effect is statistically significant. If the p-value is higher, you fail to reject the null hypothesis, indicating that there is not enough evidence to support the alternative hypothesis.

Q: Can you explain Type I and Type II errors in hypothesis testing?

A: Type I and Type II errors are potential errors that can occur in hypothesis testing. A Type I error, also known as a “false positive,” occurs when the null hypothesis is true, but it is incorrectly rejected. It is equivalent to a false alarm. On the other hand, a Type II error, or a “false negative,” happens when the null hypothesis is false, but you fail to reject it. This means a real effect or difference was missed. The risk of a Type I error is represented by the significance level ( α ), while the risk of a Type II error is denoted by β . Minimizing these errors is crucial for the reliability of hypothesis tests in continuous improvement projects.

Daniel Croft is a seasoned continuous improvement manager with a Black Belt in Lean Six Sigma. With over 10 years of real-world application experience across diverse sectors, Daniel has a passion for optimizing processes and fostering a culture of efficiency. He's not just a practitioner but also an avid learner, constantly seeking to expand his knowledge. Outside of his professional life, Daniel has a keen interest in investing, statistics, and knowledge-sharing, which led him to create the website learnleansigma.com, a platform dedicated to Lean Six Sigma and process improvement insights.

Hypothesis Testing in Business Analytics – A Beginner’s Guide

Introduction  

Organizations must understand how their decisions can impact the business in this data-driven age. Hypothesis testing enables organizations to analyze and examine their decisions’ causes and effects before making important management decisions. According to Harvard Business School Online, hypothesis testing lets organizations investigate a decision in a proper “laboratory” setting before committing to it. By performing such tests, organizations can be more confident in their decisions. Read on to learn all about hypothesis testing, one of the essential concepts in Business Analytics.

What Is Hypothesis Testing?  

To learn about hypothesis testing, it is crucial that you first understand what the term hypothesis is.   

A hypothesis statement or hypothesis tries to explain why something happened or what may happen under specific conditions. A hypothesis can also help understand how various variables are connected to each other. These are generally framed as if-then statements; for example, “If something specific were to happen, then a specific condition will come true and vice versa.” Hypothesis testing, then, is a statistical method of testing a hypothesis or an assumption that has been stated in this way.

Becoming a data-driven decision-maker can bring several advantages to an organization, such as helping it recognize new opportunities to pursue and reduce the number of threats. In analytics, a hypothesis is an assumption or a supposition made about a specific population parameter, that is, a fixed measurement or quantity that describes the population, such as a parameter of its distribution. Common examples of parameters used in hypothesis testing are the mean and the variance. In simpler words, hypothesis testing in business analytics is a method that helps researchers, scientists, or anyone else test the validity of their hypotheses or claims about real-world events.

As an example of hypothesis testing in business analytics, consider a restaurant owner interested in learning how adding extra house sauce to their chicken burgers can impact customer satisfaction. Or consider a social media marketing organization, where a hypothesis test can be set up to explain how an increase in labor impacts productivity. Thus, hypothesis testing aims to discover the connection between two or more variables in an experimental setting.

How Does Hypothesis Testing Work?  

Generally, every piece of research begins with a hypothesis: the investigator makes a certain claim and runs an experiment to see whether the data support or contradict it. For example, if you claim that students who drink milk before class accomplish tasks better than those who do not, then this is a hypothesis that can be refuted or supported using an experiment. There are different kinds of hypotheses. They are:

  • Simple Hypothesis: A simple hypothesis, also known as a basic hypothesis, proposes that a single independent variable is responsible for a corresponding dependent variable. In simpler words, the occurrence of the independent variable results in the occurrence of the dependent variable. A simple hypothesis therefore asserts a causal relationship between the two variables. One example is “smoking cigarettes daily leads to cancer.”
  • Complex Hypothesis: This type of hypothesis is also termed a modal. It describes the relationship between two or more independent variables and a dependent variable. This means that the combination of independent variables results in the dependent variable. An example is “adults who don’t drink and smoke are less likely to have liver-related problems.”
  • Null Hypothesis: A null hypothesis is created when a researcher thinks that there is no connection between the variables being observed. An example is “a student’s performance is not impacted if they drink tea or coffee before classes.”
  • Alternative Hypothesis: If a researcher wants to disprove a null hypothesis, the researcher has to develop an opposite assumption, known as an alternative hypothesis. For example, “beginning your day with tea instead of coffee can keep you more alert.”
  • Logical Hypothesis: A proposed explanation supported by scant data is called a logical hypothesis. Generally, you test such postulations by converting a logical hypothesis into an empirical hypothesis. For example, “waking early helps one to have a productive day.”
  • Empirical Hypothesis: This type of hypothesis is based on real evidence that is verifiable by observation, as opposed to something that is correct in theory or by reasoning alone. It depends on variables that can result in specific outcomes. For example, “individuals eating more fish can run faster than those eating meat.”
  • Statistical Hypothesis: This kind of hypothesis is most common in systematic investigations that involve a large target population. For example, “in Louisiana, 45% of students have middle-income parents.”

Four Steps of Hypothesis Testing  

There are four main steps in hypothesis testing in business analytics:

Step 1: State the Null and Alternate Hypothesis  

After stating the initial research hypothesis, it is essential to restate it as a null hypothesis (H0) and an alternate hypothesis (Ha) so that it can be tested mathematically.

Step 2: Collate Data  

For a test to be valid, it is essential to do some sampling and collate data in a manner designed to test the hypothesis. If your data are not representative, then statistical inferences cannot be made about the population you are trying to analyze.  

Step 3: Perform a Statistical Test  

Many statistical tests are available, and a large family of them works by contrasting within-group variance (how spread out the data are inside a group) against between-group variance (how dissimilar the groups are from one another).

Step 4: Decide Whether to Reject or Fail to Reject Your Null Hypothesis

Based on the result of your statistical test, decide whether to reject or fail to reject your null hypothesis. A test never proves the null hypothesis, so "fail to reject" is more accurate than "accept."
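The four steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the satisfaction ratings are invented for the restaurant example mentioned earlier, and a permutation test is used as one possible statistical test because it needs nothing beyond the standard library.

```python
import random

# Step 1: H0: extra sauce does not change mean satisfaction. Ha: it does.
# Step 2: collect data. These ratings (1-10) are hypothetical.
with_sauce = [8, 7, 9, 6, 8, 9, 7, 8, 9, 8]
without_sauce = [6, 7, 5, 8, 6, 7, 6, 5, 7, 6]


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # relabel the ratings at random, as H0 allows
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed - 1e-12:
            extreme += 1
    # The add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


# Step 3: perform the statistical test.
p_value = permutation_p_value(with_sauce, without_sauce)

# Step 4: reject or fail to reject H0 at a pre-chosen significance level.
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```

With real data, the choice of test in Step 3 would depend on the design and the sample size.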

Hypothesis Testing in Business   

Data-driven decision-making always carries a certain amount of risk. That risk can stem from flawed observations or reasoning, inaccurate or incomplete information, or unknown variables. The danger is that if key strategic decisions are based on incorrect insights, the outcomes for an organization can be catastrophic. The real value of hypothesis testing is that it enables professionals to test their assumptions and theories before putting them into action, so that an organization can check the accuracy of its analysis before making key decisions.

Key Considerations for Hypothesis Testing  

Let us look at the following key considerations of hypothesis testing:  

  • Alternative Hypothesis and Null Hypothesis: A null hypothesis is created when a researcher thinks that there is no connection between the variables being observed. If a researcher wants to disprove a null hypothesis, the researcher has to develop an opposite assumption, known as an alternative hypothesis.
  • Significance Level and P-Value: The significance level (α) is a threshold chosen before the test, commonly 0.05. The p-value is computed from the data and lies between 0 and 1. The smaller the p-value, the stronger the evidence against the null hypothesis. A p-value at or below the chosen significance level (for example, p ≤ 0.05) is considered statistically significant.
  • One-Sided vs. Two-Sided Testing: One-sided tests look for an effect in a single direction only. Two-sided tests look for an effect in both directions, negative and positive. A one-sided test has more statistical power to detect an effect in its chosen direction than a two-sided test with the same significance level and design.
  • Sampling: For hypothesis testing, you need to collect a sample of data to examine. An analyst tests a statistical sample to assess how plausible the null hypothesis is. Analysts can test a hypothesis by measuring and examining a random sample of the population being studied.
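To make the difference between one-sided and two-sided testing concrete, the sketch below converts a single test statistic into both kinds of p-value. It assumes the statistic follows a standard normal distribution under the null hypothesis; the value z = 1.8 and the function name are illustrative only.

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Return one-sided and two-sided p-values for a z statistic."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    upper = 1.0 - std_normal.cdf(z)  # one-sided, alternative: effect > 0
    lower = std_normal.cdf(z)        # one-sided, alternative: effect < 0
    two_sided = 2.0 * min(upper, lower)
    return {"upper": upper, "lower": lower, "two_sided": two_sided}


alpha = 0.05
result = p_values_from_z(1.8)
for name, p in result.items():
    verdict = "significant" if p <= alpha else "not significant"
    print(f"{name:9s} p = {p:.4f} ({verdict} at alpha = {alpha})")
```

Here the upper one-sided p-value falls below 0.05 while the two-sided p-value does not, which is the extra power of a one-sided test in its chosen direction.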

Real-World Example of Hypothesis Testing  

The following two examples give a glimpse of the various situations in which hypothesis testing is used in real-world scenarios.  

Example: BioSciences  

Hypothesis tests are frequently used in the biological sciences. For example, suppose a biologist believes that a certain fertilizer will make plants grow more than they currently do, which is 10 inches on average. To test this, the fertilizer is applied to plants in the laboratory for a month. A hypothesis test is then set up as follows:

  • H0: μ = 10 inches (the fertilizer has no effect on the plant growth)  
  • HA: μ > 10 inches (the fertilizer leads to an increase in plant growth)  

If the p-value is less than the significance level (e.g., α = .04), the null hypothesis can be rejected, and it can be concluded that the fertilizer results in increased plant growth.

Example: Clinical Trials  

Consider a doctor who believes that a new medicine can decrease blood sugar in patients. To test this, the doctor can measure the blood sugar of 20 diabetic patients before and after administering the new drug for a month. A hypothesis test is then set up as follows:

  • H0: μafter = μbefore (blood sugar is the same before and after administering the new drug)
  • HA: μafter < μbefore (blood sugar is lower after the drug)

If the p-value is less than the significance level (e.g., α = .04), then the null hypothesis can be rejected, and it can be concluded that the new drug leads to reduced blood sugar.
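A before-and-after design like this one is a paired comparison. The sketch below uses a sign-flip permutation test, one of several possible paired tests, chosen here because it runs on the standard library alone. The 20 pairs of readings are invented for illustration and do not come from any real trial.

```python
import random

# Hypothetical blood sugar readings (mg/dL) for 20 patients.
before = [162, 148, 171, 155, 190, 143, 167, 158, 176, 150,
          181, 165, 149, 172, 160, 187, 154, 169, 146, 178]
after = [150, 145, 160, 157, 171, 140, 159, 151, 170, 149,
         166, 158, 151, 160, 152, 170, 150, 161, 147, 165]

# One difference per patient; negative values mean blood sugar went down.
diffs = [a - b for a, b in zip(after, before)]


def sign_flip_p_value(differences, n_resamples=20_000, seed=1):
    """One-sided p-value for HA: mean(after - before) < 0.

    Under H0 each difference is equally likely to be positive or negative,
    so the signs are flipped at random and the totals compared.
    """
    rng = random.Random(seed)
    observed = sum(differences)
    hits = 0
    for _ in range(n_resamples):
        flipped = sum(d if rng.random() < 0.5 else -d for d in differences)
        if flipped <= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


alpha = 0.04
p_value = sign_flip_p_value(diffs)
mean_change = sum(diffs) / len(diffs)
print(f"mean change = {mean_change:.2f}, p-value = {p_value:.5f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```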

Conclusion  

Now you are aware of the need for hypotheses in Business Analytics. A hypothesis is not just an assumption; it has to be based on prior knowledge and theories. It also needs to be testable, which means that you can reject or fail to reject it using scientific research methods (such as observations, experiments, and statistical data analysis).

Hypothesis Testing in Business Administration

  • Rand R. Wilcox, Department of Psychology, University of Southern California
  • https://doi.org/10.1093/acrefore/9780190224851.013.279
  • Published online: 27 August 2020

Hypothesis testing is an approach to statistical inference that is routinely taught and used. It is based on a simple idea: develop some relevant speculation about the population of individuals or things under study and determine whether data provide reasonably strong empirical evidence that the hypothesis is wrong. Consider, for example, two approaches to advertising a product. A study might be conducted to determine whether it is reasonable to assume that both approaches are equally effective. A Type I error is rejecting this speculation when in fact it is true. A Type II error is failing to reject when the speculation is false. A common practice is to test hypotheses with the Type I error probability set to 0.05 and to declare that there is a statistically significant result if the hypothesis is rejected.

There are various concerns about, limitations to, and criticisms of this approach. One criticism is the use of the term significant. Consider the goal of comparing the means of two populations of individuals. Saying that a result is significant suggests that the difference between the means is large and important. But in the context of hypothesis testing it merely means that there is empirical evidence that the means are not equal. Situations can and do arise where a result is declared significant, but the difference between the means is trivial and unimportant. Indeed, the goal of testing the hypothesis that two means are equal has been criticized based on the argument that surely the means differ at some decimal place. A simple way of dealing with this issue is to reformulate the goal. Rather than testing for equality, determine whether it is reasonable to make a decision about which group has the larger mean. The components of hypothesis-testing techniques can be used to address this issue with the understanding that the goal of testing some hypothesis has been replaced by the goal of determining whether a decision can be made about which group has the larger mean.

Another aspect of hypothesis testing that has seen considerable criticism is the notion of a p-value. Suppose some hypothesis is rejected with the Type I error probability set to 0.05. This leaves open the issue of whether the hypothesis would be rejected with Type I error probability set to 0.025 or 0.01. A p-value is the smallest Type I error probability for which the hypothesis is rejected. When comparing means, a p-value reflects the strength of the empirical evidence that a decision can be made about which has the larger mean. A concern about p-values is that they are often misinterpreted. For example, a small p-value does not necessarily mean that a large or important difference exists. Another common mistake is to conclude that if the p-value is close to zero, there is a high probability of rejecting the hypothesis again if the study is replicated. The probability of rejecting again is a function of the extent that the hypothesis is not true, among other things. Because a p-value does not directly reflect the extent the hypothesis is false, it does not provide a good indication of whether a second study will provide evidence to reject it.

Confidence intervals are closely related to hypothesis-testing methods. Basically, they are intervals that contain unknown quantities with some specified probability. For example, a goal might be to compute an interval that contains the difference between two population means with probability 0.95. Confidence intervals can be used to determine whether some hypothesis should be rejected. Clearly, confidence intervals provide useful information not provided by testing hypotheses and computing a p-value. But an argument for a p-value is that it provides a perspective on the strength of the empirical evidence that a decision can be made about the relative magnitude of the parameters of interest. For example, to what extent is it reasonable to decide whether the first of two groups has the larger mean? Even if a compelling argument can be made that p-values should be completely abandoned in favor of confidence intervals, there are situations where p-values provide a convenient way of developing reasonably accurate confidence intervals. Another argument against p-values is that because they are misinterpreted by some, they should not be used. But if this argument is accepted, it follows that confidence intervals should be abandoned because they are often misinterpreted as well.
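The link between confidence intervals and hypothesis tests can be illustrated with a small sketch. It assumes large samples, so that a normal approximation is reasonable, and the summary statistics for the two advertising approaches are hypothetical. It is not a method taken from the article.

```python
from math import sqrt
from statistics import NormalDist


def diff_means_ci_and_p(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Normal-approximation CI and two-sided p-value for mean1 - mean2."""
    std_normal = NormalDist()
    diff = mean1 - mean2
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)  # standard error of the difference
    z_crit = std_normal.inv_cdf(0.5 + confidence / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    p_value = 2 * (1 - std_normal.cdf(abs(diff / se)))
    return ci, p_value


# Hypothetical average spend per customer under two advertising approaches.
ci, p_value = diff_means_ci_and_p(52.0, 10.0, 200, 50.0, 10.0, 200)
print(f"95% CI for the difference: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"two-sided p-value: {p_value:.4f}")
# The 95% interval excludes zero exactly when the two-sided test rejects at 0.05.
print("CI excludes 0:", ci[0] > 0 or ci[1] < 0, "| p < 0.05:", p_value < 0.05)
```

The interval adds information the p-value lacks: it shows how large or small the difference could plausibly be.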

Classic hypothesis-testing methods for comparing means and studying associations assume sampling is from a normal distribution. A fundamental issue is whether nonnormality can be a source of practical concern. Based on hundreds of papers published during the last 50 years, the answer is an unequivocal Yes. Granted, there are situations where nonnormality is not a practical concern, but nonnormality can have a substantial negative impact on both Type I and Type II errors. Fortunately, there is a vast literature describing how to deal with known concerns. Results based solely on some hypothesis-testing approach have clear implications about methods aimed at computing confidence intervals. Nonnormal distributions that tend to generate outliers are one source for concern. There are effective methods for dealing with outliers, but technically sound techniques are not obvious based on standard training. Skewed distributions are another concern. The combination of what are called bootstrap methods and robust estimators provides techniques that are particularly effective for dealing with nonnormality and outliers.
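As a rough illustration of combining a bootstrap method with a robust estimator, the sketch below computes a percentile bootstrap confidence interval for a 20% trimmed mean. The data, the trimming proportion, and the function names are assumptions made for this example; the article itself does not prescribe this particular recipe.

```python
import random


def trimmed_mean(values, proportion=0.2):
    """Mean after dropping the lowest and highest `proportion` of values."""
    ordered = sorted(values)
    k = int(proportion * len(ordered))
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def percentile_bootstrap_ci(values, estimator, n_boot=2000,
                            confidence=0.95, seed=0):
    """Percentile bootstrap confidence interval for any estimator."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and re-estimate, n_boot times.
    estimates = sorted(
        estimator([rng.choice(values) for _ in range(n)])
        for _ in range(n_boot)
    )
    lo_idx = int((1 - confidence) / 2 * n_boot)
    hi_idx = int((1 + confidence) / 2 * n_boot) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Hypothetical sample with one large outlier.
data = [12, 14, 15, 15, 16, 17, 18, 18, 19, 20, 21, 22, 95]

plain_mean = sum(data) / len(data)
robust = trimmed_mean(data)
low, high = percentile_bootstrap_ci(data, trimmed_mean)
print(f"mean = {plain_mean:.2f}, 20% trimmed mean = {robust:.2f}")
print(f"95% bootstrap CI for the trimmed mean: ({low:.2f}, {high:.2f})")
```

The single outlier pulls the ordinary mean well above the bulk of the data, while the trimmed mean stays close to it.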

Classic methods for comparing means and studying associations also assume homoscedasticity. When comparing means, this means that groups are assumed to have the same amount of variance even when the means of the groups differ. Violating this assumption can have serious negative consequences in terms of both Type I and Type II errors, particularly when the normality assumption is violated as well. There is vast literature describing how to deal with this issue in a technically sound manner.
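One widely used way to compare two means without assuming equal variances is Welch's test. The article does not name a specific method, so the formulas below are offered only as an illustration. The notation is introduced here: \(\bar{x}_j\), \(s_j^2\), and \(n_j\) are the sample mean, sample variance, and sample size of group \(j\).

\[
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}{\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
\]

The statistic \(t\) is compared with a Student's t distribution with \(\nu\) degrees of freedom. Each group contributes its own variance, so no common variance is assumed. The normality concerns raised above still apply.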

  • hypothesis testing
  • significance
  • confidence intervals
  • nonnormality
  • bootstrap methods
  • robust estimators

Printed from Oxford Research Encyclopedias, Business and Management. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 16 May 2024

S.3 Hypothesis Testing

In reviewing hypothesis tests, we start first with the general idea. Then, we keep returning to the basic procedures of hypothesis testing, each time adding a little more detail.

The general idea of hypothesis testing involves:

  • Making an initial assumption.
  • Collecting evidence (data).
  • Based on the available evidence (data), deciding whether to reject or not reject the initial assumption.

Every hypothesis test — regardless of the population parameter involved — requires the above three steps.

Example S.3.1

Is Normal Body Temperature Really 98.6 Degrees F?

Consider the population of many, many adults. A researcher hypothesized that the average adult body temperature is lower than the often-advertised 98.6 degrees F. That is, the researcher wants an answer to the question: "Is the average adult body temperature 98.6 degrees? Or is it lower?" To answer his research question, the researcher starts by assuming that the average adult body temperature is 98.6 degrees F.

Then, the researcher goes out and tries to find evidence that refutes his initial assumption. In doing so, he selects a random sample of 130 adults. The average body temperature of the 130 sampled adults is 98.25 degrees.

Then, the researcher uses the data he collected to make a decision about his initial assumption. It is either likely or unlikely that the researcher would collect the evidence he did given his initial assumption that the average adult body temperature is 98.6 degrees:

  • If it is likely, then the researcher does not reject his initial assumption that the average adult body temperature is 98.6 degrees. There is not enough evidence to do otherwise.
  • If it is unlikely, then either the researcher's initial assumption is correct and he experienced a very unusual event, or the researcher's initial assumption is incorrect.

In statistics, we generally don't make claims that require us to believe that a very unusual event happened. That is, in the practice of statistics, if the evidence (data) we collected is unlikely in light of the initial assumption, then we reject our initial assumption.
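How "unlikely" the sample is can be quantified with a short calculation. The sketch below uses the sample size (130) and sample mean (98.25) from the example, but the standard deviation of 0.73 degrees is an assumption added here because the example does not state one. A normal approximation is used, which is reasonable for a sample this large.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 98.6          # initial assumption (null hypothesis)
sample_mean = 98.25  # from the example
n = 130              # from the example
assumed_sd = 0.73    # ASSUMPTION: not given in the example

standard_error = assumed_sd / sqrt(n)
z = (sample_mean - mu_0) / standard_error

# Probability of a sample mean this low or lower if the true mean were 98.6.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, one-sided p-value = {p_value:.2e}")
```

Under these assumptions, a sample mean of 98.25 would be a very unusual event if the true average were 98.6, so the initial assumption would be rejected.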

Example S.3.2

Criminal Trial Analogy

One place where you can consistently see the general idea of hypothesis testing in action is in criminal trials held in the United States. Our criminal justice system assumes "the defendant is innocent until proven guilty." That is, our initial assumption is that the defendant is innocent.

In the practice of statistics, we make our initial assumption when we state our two competing hypotheses: the null hypothesis (H0) and the alternative hypothesis (HA). Here, our hypotheses are:

  • H0: Defendant is not guilty (innocent)
  • HA: Defendant is guilty

In statistics, we always assume the null hypothesis is true. That is, the null hypothesis is always our initial assumption.

The prosecution team then collects evidence (such as fingerprints, blood spots, hair samples, carpet fibers, shoe prints, ransom notes, and handwriting samples) with the hopes of finding "sufficient evidence" to refute the assumption of innocence.

In statistics, the data are the evidence.

The jury then makes a decision based on the available evidence:

  • If the jury finds sufficient evidence — beyond a reasonable doubt — to make the assumption of innocence refutable, the jury rejects the null hypothesis and deems the defendant guilty. We behave as if the defendant is guilty.
  • If there is insufficient evidence, then the jury does not reject the null hypothesis. We behave as if the defendant is innocent.

In statistics, we always make one of two decisions. We either "reject the null hypothesis" or we "fail to reject the null hypothesis."

Errors in Hypothesis Testing

Did you notice the use of the phrase "behave as if" in the previous discussion? We "behave as if" the defendant is guilty; we do not "prove" that the defendant is guilty. And, we "behave as if" the defendant is innocent; we do not "prove" that the defendant is innocent.

This is a very important distinction! We make our decision based on evidence not on 100% guaranteed proof. Again:

  • If we reject the null hypothesis, we do not prove that the alternative hypothesis is true.
  • If we do not reject the null hypothesis, we do not prove that the null hypothesis is true.

We merely state that there is enough evidence to behave one way or the other. This is always true in statistics! Because of this, whatever the decision, there is always a chance that we made an error.

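Let's review the two types of errors that can be made in criminal trials:

  • The jury finds the defendant guilty when the defendant is in fact not guilty.
  • The jury finds the defendant not guilty when the defendant is in fact guilty.

Table S.3.2 shows how this corresponds to the two types of errors in hypothesis testing.

Table S.3.2
Decision | Truth: null hypothesis is true | Truth: alternative hypothesis is true
Do not reject null | OK | Type II error
Reject null | Type I error | OK

Note that, in statistics, we call the two types of errors by two different names: one is called a "Type I error," and the other is called a "Type II error." Here are the formal definitions of the two types of errors:

  • Type I error: The null hypothesis is rejected when it is true.
  • Type II error: The null hypothesis is not rejected when it is false.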

There is always a chance of making one of these errors. But, a good scientific study will minimize the chance of doing so!
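The two error rates can be estimated by simulation. The sketch below is an illustration under simplifying assumptions: data are normally distributed with a known standard deviation of 1, the null hypothesis is that the mean equals 0, and the "true" mean of 0.5 used for the Type II error is an arbitrary choice.

```python
import random
from math import sqrt
from statistics import NormalDist


def rejection_rate(true_mean, n_trials=4000, n=30, alpha=0.05, seed=42):
    """Share of simulated studies that reject H0: mean = 0 (sd known to be 1)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(n_trials):
        sample_mean = sum(rng.gauss(true_mean, 1.0) for _ in range(n)) / n
        z = sample_mean * sqrt(n)  # (sample_mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_trials


# H0 is true: every rejection is a Type I error.
type_one_rate = rejection_rate(true_mean=0.0)

# H0 is false (true mean 0.5): every failure to reject is a Type II error.
type_two_rate = 1 - rejection_rate(true_mean=0.5)

print(f"estimated Type I error rate:  {type_one_rate:.3f}")
print(f"estimated Type II error rate: {type_two_rate:.3f}")
```

The Type I rate should land near the chosen significance level of 0.05. The Type II rate depends on the sample size and on how far the true mean is from the null value.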

Making the Decision

Recall that it is either likely or unlikely that we would observe the evidence we did given our initial assumption. If it is likely, we do not reject the null hypothesis. If it is unlikely, then we reject the null hypothesis in favor of the alternative hypothesis. Effectively, then, making the decision reduces to determining "likely" or "unlikely."

In statistics, there are two ways to determine whether the evidence is likely or unlikely given the initial assumption:

  • We could take the "critical value approach" (favored in many of the older textbooks).
  • Or, we could take the "P-value approach" (what is used most often in research, journal articles, and statistical software).

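In the next two sections, we review the procedures behind each of these two approaches. To make our review concrete, let's imagine that \(\mu\) is the average grade point average of all American students who major in mathematics. We first review the critical value approach for conducting each of the following three hypothesis tests about the population mean \(\mu\):

  • First test (right-tailed): \(H_0: \mu = 3\) versus \(H_A: \mu > 3\)
  • Second test (left-tailed): \(H_0: \mu = 3\) versus \(H_A: \mu < 3\)
  • Third test (two-tailed): \(H_0: \mu = 3\) versus \(H_A: \mu \ne 3\)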

In Practice

  • We would want to conduct the first hypothesis test if we were interested in concluding that the average grade point average of the group is more than 3.
  • We would want to conduct the second hypothesis test if we were interested in concluding that the average grade point average of the group is less than 3.
  • And, we would want to conduct the third hypothesis test if we were only interested in concluding that the average grade point average of the group differs from 3 (without caring whether it is more or less than 3).

Upon completing the review of the critical value approach, we review the P-value approach for conducting each of the above three hypothesis tests about the population mean \(\mu\). The procedures that we review here for both approaches easily extend to hypothesis tests about any other population parameter.
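As a preview of the critical value approach, the sketch below runs the three kinds of test about a mean grade point average of 3. It is a simplified illustration: it assumes the population standard deviation is known, so that a z statistic can be used, and the sample figures (mean 3.09, standard deviation 0.5, 100 students) are invented.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_test(sample_mean, mu_0, sigma, n, alpha=0.05, tail="two"):
    """Return (z, reject) for a z-test using the critical value approach."""
    std_normal = NormalDist()
    z = (sample_mean - mu_0) / (sigma / sqrt(n))
    if tail == "right":    # HA: mu > mu_0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":   # HA: mu < mu_0
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "two":    # HA: mu != mu_0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return z, reject


# Hypothetical sample of 100 mathematics majors.
results = {
    tail: critical_value_test(3.09, 3.0, 0.5, 100, tail=tail)
    for tail in ("right", "left", "two")
}
for tail, (z, reject) in results.items():
    print(f"{tail:5s}-tailed: z = {z:.2f}, reject H0: {reject}")
```

The same sample leads to rejection in the right-tailed test but not in the two-tailed test, because the two-tailed critical value is larger.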

Shipping Your Product in Iterations: A Guide to Hypothesis Testing

By Kumara Raghavendra

Kumara has successfully delivered high-impact products in industries including eCommerce, healthcare, travel, and ride-hailing.

A look at the App Store on any phone will reveal that most installed apps have had updates released within the last week. A website visit after a few weeks might show some changes in the layout, user experience, or copy.

Today, software is shipped in iterations to validate assumptions and the product hypothesis about what makes a better user experience. At any given time, companies like booking.com (where I worked before) run hundreds of A/B tests on their sites for this very purpose.

For applications delivered over the internet, there is no need to decide on the look of a product 12-18 months in advance, and then build and eventually ship it. Instead, it is perfectly practical to release small changes that deliver value to users as they are being implemented, removing the need to make assumptions about user preferences and ideal solutions—for every assumption and hypothesis can be validated by designing a test to isolate the effect of each change.

In addition to delivering continuous value through improvements, this approach allows a product team to gather continuous feedback from users and then course-correct as needed. Creating and testing hypotheses every couple of weeks is a cheaper and easier way to build a course-correcting and iterative approach to creating product value.

What Is Hypothesis Testing in Product Management?

While shipping a feature to users, it is imperative to validate assumptions about design and features in order to understand their impact in the real world.

This validation is traditionally done through product hypothesis testing, during which the experimenter outlines a hypothesis for a change and then defines success. For instance, if a data product manager at Amazon has a hypothesis that showing bigger product images will raise conversion rates, then success is defined by higher conversion rates.

One of the key aspects of hypothesis testing is the isolation of different variables in the product experience in order to be able to attribute success (or failure) to the changes made. So, if our Amazon product manager had a further hypothesis that showing customer reviews right next to product images would improve conversion, it would not be possible to test both hypotheses at the same time. Doing so would result in failure to properly attribute causes and effects; therefore, the two changes must be isolated and tested individually.

Thus, product decisions on features should be backed by hypothesis testing to validate the performance of features.

Different Types of Hypothesis Testing

A/B Testing

One of the most common use cases to achieve hypothesis validation is randomized A/B testing, in which a change or feature is released at random to one-half of users (A) and withheld from the other half (B). Returning to the hypothesis of bigger product images improving conversion on Amazon, one-half of users will be shown the change, while the other half will see the website as it was before. The conversion will then be measured for each group (A and B) and compared. In case of a significant uplift in conversion for the group shown bigger product images, the conclusion would be that the original hypothesis was correct, and the change can be rolled out to all users.
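Whether an uplift is "significant" can be checked with a two-proportion z-test, one common choice for comparing conversion rates. The sketch below is an illustration only: the visitor and conversion counts are invented, and group A is the group shown the bigger images, following the labeling above.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """One-sided test of HA: group A converts better than group B."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Under H0 both groups share one conversion rate, so pool the data.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / se
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical results: 10,000 users per group.
z, p_value = two_proportion_z_test(conv_a=560, n_a=10_000,
                                   conv_b=500, n_b=10_000)
alpha = 0.05
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
print("roll out the change" if p_value <= alpha else "keep the current version")
```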

Multivariate Testing

Ideally, each variable should be isolated and tested separately so as to conclusively attribute changes. However, such a sequential approach to testing can be very slow, especially when there are several versions to test. To continue with the example, in the hypothesis that bigger product images lead to higher conversion rates on Amazon, “bigger” is subjective, and several versions of “bigger” (e.g., 1.1x, 1.3x, and 1.5x) might need to be tested.

Instead of testing such cases sequentially, a multivariate test can be adopted, in which users are not split in half but into multiple variants. For instance, four groups (A, B, C, D) are made up of 25% of users each, where A-group users will not see any change, whereas those in variants B, C, and D will see images bigger by 1.1x, 1.3x, and 1.5x, respectively. In this test, multiple variants are simultaneously tested against the current version of the product in order to identify the best variant.
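Comparing several variants against one control raises the chance of a false positive, because each comparison is a separate opportunity for a Type I error. A simple, conservative guard is the Bonferroni correction, which divides the significance level by the number of comparisons. The article does not mention this correction, so it is added here only as an illustration, and the counts are invented.

```python
from math import sqrt
from statistics import NormalDist


def one_sided_p(conv_variant, n_variant, conv_control, n_control):
    """p-value for HA: the variant converts better than the control."""
    pooled = (conv_variant + conv_control) / (n_variant + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_variant + 1 / n_control))
    z = (conv_variant / n_variant - conv_control / n_control) / se
    return 1 - NormalDist().cdf(z)


# Hypothetical (conversions, users) per group; A is the unchanged control.
control = (500, 10_000)
variants = {
    "B (1.1x)": (520, 10_000),
    "C (1.3x)": (590, 10_000),
    "D (1.5x)": (540, 10_000),
}

alpha = 0.05
adjusted_alpha = alpha / len(variants)  # Bonferroni correction

winners = []
for name, (conversions, users) in variants.items():
    p = one_sided_p(conversions, users, *control)
    significant = p <= adjusted_alpha
    if significant:
        winners.append(name)
    print(f"{name}: p = {p:.4f}, significant after correction: {significant}")
```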

Before/After Testing

Sometimes, it is not possible to split the users in half (or into multiple variants) as there might be network effects in place. For example, if the test involves determining whether one logic for formulating surge prices on Uber is better than another, the drivers cannot be divided into different variants, as the logic takes into account the demand and supply mismatch of the entire city. In such cases, a test will have to compare the effects before the change and after the change in order to arrive at a conclusion.

However, the constraint here is the inability to isolate the effects of seasonality and externality that can differently affect the test and control periods. Suppose a change to the logic that determines surge pricing on Uber is made at time t, such that logic A is used before and logic B is used after. While the effects before and after time t can be compared, there is no guarantee that the effects are solely due to the change in logic. There could have been a difference in demand or other factors between the two time periods that resulted in a difference between the two.

Time-based On/Off Testing

The downsides of before/after testing can be overcome to a large extent by deploying time-based on/off testing, in which the change is introduced to all users for a certain period of time, turned off for an equal period of time, and then repeated for a longer duration.

For example, in the Uber use case, the change can be shown to drivers on Monday, withdrawn on Tuesday, shown again on Wednesday, and so on.

While this method doesn’t fully remove the effects of seasonality and externality, it does reduce them significantly, making such tests more robust.

Test Design

Choosing the right test for the use case at hand is an essential step in validating a hypothesis in the quickest and most robust way. Once the choice is made, the details of the test design can be outlined.

The test design is simply a coherent outline of:

  • The hypothesis to be tested: Showing users bigger product images will lead them to purchase more products.
  • Success metrics for the test: Customer conversion
  • Decision-making criteria for the test: The hypothesis is validated if users in the variant show a higher conversion rate than those in the control group.
  • Metrics that need to be instrumented to learn from the test: Customer conversion, clicks on product images

In the case of the product hypothesis example that bigger product images will lead to improved conversion on Amazon, the success metric is conversion and the decision criterion is an improvement in conversion.

After the right test is chosen and designed, and the success criteria and metrics are identified, the results must be analyzed. To do that, some statistical concepts are necessary.

When running tests, it is important to ensure that the two variants picked for the test (A and B) do not have a bias with respect to the success metric. For instance, if the variant that sees the bigger images already has a higher conversion than the variant that doesn’t see the change, then the test is biased and can lead to wrong conclusions.

In order to ensure no bias in sampling, one can observe the mean and variance for the success metric before the change is introduced.
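A minimal sketch of such a pre-test check follows. It assumes each user's pre-change outcome is recorded as 1 (converted) or 0 (did not convert); the function name and the sample data are hypothetical.

```python
from statistics import mean, variance

def pre_test_balance(control, variant):
    """Compare the success metric across groups before the change is introduced."""
    return {
        "mean_diff": mean(variant) - mean(control),
        "variance_ratio": variance(variant) / variance(control),
    }

# Made-up pre-change conversions for ten users per group.
control_pre = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
variant_pre = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
balance = pre_test_balance(control_pre, variant_pre)
print(balance)
```

A mean difference near 0 and a variance ratio near 1 suggest the groups started from a comparable baseline; how large a deviation is acceptable is a judgment call that depends on sample size.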

Significance and Power

Once a difference between the two variants is observed, it is important to determine whether the change is an actual effect or a random one. This can be done by computing the significance of the change in the success metric.

In layman’s terms, significance measures the frequency with which the test shows that bigger images lead to higher conversion when they actually don’t. Power measures the frequency with which the test tells us that bigger images lead to higher conversion when they actually do.

So, tests need to have a high value of power and a low value of significance for more accurate results.
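One common way to compute significance for a conversion metric is a two-proportion z-test. The sketch below is an illustration under assumptions, not the method prescribed by the article: the conversion counts are invented, the test is one-sided, and the 0.05 threshold is only a conventional choice.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """One-sided test of H0: p_b <= p_a against H1: p_b > p_a."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)           # conversion rate under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 1 - NormalDist().cdf(z)                  # upper-tail probability
    return z, p_value

# Made-up data: control (A) sees small images, variant (B) sees bigger images.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
significant = p_value < 0.05
print(f"z = {z:.2f}, p-value = {p_value:.4f}, significant = {significant}")
```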

While an in-depth exploration of the statistical concepts involved in product management hypothesis testing is out of scope here, the following actions are recommended to enhance knowledge on this front:

  • Data analysts and data engineers are usually adept at identifying the right test designs and can guide product managers, so make sure to utilize their expertise early in the process.
  • There are numerous online courses on hypothesis testing, A/B testing, and related statistical concepts on platforms such as Udemy, Udacity, and Coursera.
  • Using tools such as Google’s Firebase and Optimizely can make the process easier thanks to a large amount of out-of-the-box capabilities for running the right tests.

Using Hypothesis Testing for Successful Product Management

In order to continuously deliver value to users, it is imperative to test various hypotheses, for the purpose of which several types of product hypothesis testing can be employed. Each hypothesis needs to have an accompanying test design, as described above, in order to conclusively validate or invalidate it.

This approach helps to quantify the value delivered by new changes and features, bring focus to the most valuable features, and deliver incremental iterations.

Understanding the basics

What is a product hypothesis?

A product hypothesis is an assumption that some improvement in the product will bring an increase in important metrics like revenue or product usage statistics.

What are the three required parts of a hypothesis?

The three required parts of a hypothesis are the assumption, the condition, and the prediction.

Why do we do A/B testing?

We do A/B testing to make sure that any improvement in the product increases our tracked metrics.

What is A/B testing used for?

A/B testing is used to check if our product improvements create the desired change in metrics.

What is A/B testing and multivariate testing?

A/B testing and multivariate testing are types of hypothesis testing. A/B testing checks how important metrics change with and without a single change in the product. Multivariate testing can track multiple variations of the same product improvement.

Kumara Raghavendra

Dubai, United Arab Emirates

Member since August 6, 2019


Stratechi.com


“A fact is a simple statement that everyone believes. It is innocent, unless found guilty. A hypothesis is a novel suggestion that no one wants to believe. It is guilty until found effective.”

– Edward Teller, Nuclear Physicist

During my first brainstorming meeting on my first project at McKinsey, this very serious partner, who had a PhD in Physics, looked at me and said, “So, Joe, what are your main hypotheses?” I looked back at him, perplexed, and said, “Ummm, my what?” I was used to people simply asking, “What are your best ideas, opinions, thoughts, etc.?” Over time, I began to understand the importance of hypotheses and how they play an important role in McKinsey’s problem solving by separating ideas and opinions from facts.

What is a Hypothesis?

“Hypothesis” is probably one of the top 5 words used by McKinsey consultants. And, being hypothesis-driven was required to have any success at McKinsey. A hypothesis is an idea or theory, often based on limited data, which is typically the beginning of a thread of further investigation to prove, disprove or improve the hypothesis through facts and empirical data.

The first step in being hypothesis-driven is to focus on the highest potential ideas and theories of how to solve a problem or realize an opportunity.

Let’s go over an example of being hypothesis-driven.

Let’s say you own a website, and you brainstorm ten ideas to improve web traffic, but you don’t have the budget to execute all ten ideas. The first step in being hypothesis-driven is to prioritize the ten ideas based on how much impact you hypothesize they will create.

hypothesis driven example

The second step in being hypothesis-driven is to apply the scientific method to your hypotheses by creating the fact base to prove or disprove your hypothesis, which then allows you to turn your hypothesis into fact and knowledge. Running with our example, you could prove or disprove your hypothesis on the ideas you think will drive the most impact by executing:

1. An analysis of previous research and the performance of the different ideas
2. A survey where customers rank order the ideas
3. An actual test of the ten ideas to create a fact base on click-through rates and cost

While there are many other ways to validate the hypothesis on your prioritization, I find most people do not take this critical step in validating a hypothesis. Instead, they apply bad logic to many important decisions. An idea pops into their head, and then somehow it just becomes a fact.

One of my favorite lousy logic moments was a CEO who stated,

“I’ve never heard our customers talk about price, so the price doesn’t matter with our products, and I’ve decided we’re going to raise prices.”

Luckily, his management team was able to do a survey to dig deeper into the hypothesis that customers weren’t price-sensitive. Well, of course, they were, and through the survey, the team built a fantastic fact base that proved and disproved many other important hypotheses.

business hypothesis example

Why is being hypothesis-driven so important?

Imagine if medicine never actually used the scientific method. We would probably still be living in a world of lobotomies and bleeding people. Many organizations are still stuck in the dark ages, having built a house of cards on opinions disguised as facts, because they don’t prove or disprove their hypotheses. Decisions made on top of decisions, made on top of opinions, steer organizations clear of reality and the facts necessary to objectively evolve their strategic understanding and knowledge. I’ve seen too many leadership teams led solely by gut and opinion. The problem with intuition and gut is if you don’t ever prove or disprove if your gut is right or wrong, you’re never going to improve your intuition. There is a reason why being hypothesis-driven is the cornerstone of problem solving at McKinsey and every other top strategy consulting firm.

How do you become hypothesis-driven?

Most people are idea-driven, and constantly have hypotheses on how the world works and what they or their organization should do to improve. Though, there is often a fatal flaw in that many people turn their hypotheses into false facts, without actually finding or creating the facts to prove or disprove their hypotheses. These people aren’t hypothesis-driven; they are gut-driven.

The conversation typically goes something like “doing this discount promotion will increase our profits” or “our customers need to have this feature” or “morale is in the toilet because we don’t pay well, so we need to increase pay.” These should all be hypotheses that need the appropriate fact base, but instead, they become false facts, often leading to unintended results and consequences. In each of these cases, to become hypothesis-driven necessitates a different framing.

• Instead of “doing this discount promotion will increase our profits,” a hypothesis-driven approach is to ask “what are the best marketing ideas to increase our profits?” and then conduct a marketing experiment to see which ideas increase profits the most.

• Instead of “our customers need to have this feature,” ask the question, “what features would our customers value most?” And, then conduct a simple survey having customers rank order the features based on value to them.

• Instead of “morale is in the toilet because we don’t pay well, so we need to increase pay,” conduct a survey asking, “what is the level of morale?”, “what are potential issues affecting morale?”, and “what are the best ideas to improve morale?”

Beyond watching out for just following your gut, here are some of the other best practices in being hypothesis-driven:

Listen to Your Intuition

Your mind has taken the collision of your experiences and everything you’ve learned over the years to create your intuition, which are those ideas that pop into your head and those hunches that come from your gut. Your intuition is your wellspring of hypotheses. So listen to your intuition, build hypotheses from it, and then prove or disprove those hypotheses, which will, in turn, improve your intuition. Intuition without feedback will over time typically evolve into poor intuition, which leads to poor judgment, thinking, and decisions.

Constantly Be Curious

I’m always curious about cause and effect. At Sports Authority, I had a hypothesis that customers who received service and assistance as they shopped were worth more than customers who didn’t receive assistance from an associate. We figured out how to prove or disprove this hypothesis by tying surveys to transactional data of customers, and we found the hypothesis was true, which led us to a broad initiative around improving service. The key is to always be curious about what you think does or will drive value, create hypotheses, and then prove or disprove those hypotheses.

Validate Hypotheses

You need to validate and prove or disprove hypotheses. Don’t just chalk up an idea as fact. In most cases, you’re going to have to create a fact base utilizing logic, observation, testing (see the section on Experimentation), surveys, and analysis.

Be a Learning Organization

The foundation of learning organizations is the testing of and learning from hypotheses. I remember my first strategy internship at Mercer Management Consulting when I spent a good part of the summer combing through the results, findings, and insights of thousands of experiments that a banking client had conducted. It was fascinating to see the vastness and depth of their collective knowledge base. And, in today’s world of knowledge portals, it is so easy to disseminate, learn from, and build upon the knowledge created by companies.


Statistics LibreTexts

9.1: Introduction to Hypothesis Testing

  • Kyle Siegrist
  • University of Alabama in Huntsville via Random Services

Basic Theory

Preliminaries.

As usual, our starting point is a random experiment with an underlying sample space and a probability measure \(\P\). In the basic statistical model, we have an observable random variable \(\bs{X}\) taking values in a set \(S\). In general, \(\bs{X}\) can have quite a complicated structure. For example, if the experiment is to sample \(n\) objects from a population and record various measurements of interest, then \[ \bs{X} = (X_1, X_2, \ldots, X_n) \] where \(X_i\) is the vector of measurements for the \(i\)th object. The most important special case occurs when \((X_1, X_2, \ldots, X_n)\) are independent and identically distributed. In this case, we have a random sample of size \(n\) from the common distribution.

The purpose of this section is to define and discuss the basic concepts of statistical hypothesis testing. Collectively, these concepts are sometimes referred to as the Neyman-Pearson framework, in honor of Jerzy Neyman and Egon Pearson, who first formalized them.

A statistical hypothesis is a statement about the distribution of \(\bs{X}\). Equivalently, a statistical hypothesis specifies a set of possible distributions of \(\bs{X}\): the set of distributions for which the statement is true. A hypothesis that specifies a single distribution for \(\bs{X}\) is called simple; a hypothesis that specifies more than one distribution for \(\bs{X}\) is called composite.

In hypothesis testing, the goal is to see if there is sufficient statistical evidence to reject a presumed null hypothesis in favor of a conjectured alternative hypothesis. The null hypothesis is usually denoted \(H_0\) while the alternative hypothesis is usually denoted \(H_1\).

An hypothesis test is a statistical decision; the conclusion will either be to reject the null hypothesis in favor of the alternative, or to fail to reject the null hypothesis. The decision that we make must, of course, be based on the observed value \(\bs{x}\) of the data vector \(\bs{X}\). Thus, we will find an appropriate subset \(R\) of the sample space \(S\) and reject \(H_0\) if and only if \(\bs{x} \in R\). The set \(R\) is known as the rejection region or the critical region. Note the asymmetry between the null and alternative hypotheses. This asymmetry is due to the fact that we assume the null hypothesis, in a sense, and then see if there is sufficient evidence in \(\bs{x}\) to overturn this assumption in favor of the alternative.

An hypothesis test is a statistical analogy to proof by contradiction, in a sense. Suppose for a moment that \(H_1\) is a statement in a mathematical theory and that \(H_0\) is its negation. One way that we can prove \(H_1\) is to assume \(H_0\) and work our way logically to a contradiction. In an hypothesis test, we don't prove anything of course, but there are similarities. We assume \(H_0\) and then see if the data \(\bs{x}\) are sufficiently at odds with that assumption that we feel justified in rejecting \(H_0\) in favor of \(H_1\).

Often, the critical region is defined in terms of a statistic \(w(\bs{X})\), known as a test statistic, where \(w\) is a function from \(S\) into another set \(T\). We find an appropriate rejection region \(R_T \subseteq T\) and reject \(H_0\) when the observed value \(w(\bs{x}) \in R_T\). Thus, the rejection region in \(S\) is then \(R = w^{-1}(R_T) = \left\{\bs{x} \in S: w(\bs{x}) \in R_T\right\}\). As usual, the use of a statistic often allows significant data reduction when the dimension of the test statistic is much smaller than the dimension of the data vector.
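The relationship \(R = w^{-1}(R_T)\) can be sketched in code: a decision rule on the data is the composition of a test statistic with a membership check on \(R_T\). The helper name `make_test`, the choice of the sample mean as \(w\), and the cutoff 0.5 are illustrative assumptions only.

```python
from statistics import mean

def make_test(statistic, in_rejection_region):
    """Build a decision rule on data x from a statistic w and a region R_T.

    The returned function answers "is x in R?", where R = w^{-1}(R_T).
    """
    def reject(x):
        return in_rejection_region(statistic(x))
    return reject

# Hypothetical test: w is the sample mean and R_T is the interval (0.5, infinity).
reject_h0 = make_test(mean, lambda w: w > 0.5)

print(reject_h0([0.9, 0.4, 0.8, 0.7]))  # sample mean 0.7 lies in R_T
print(reject_h0([0.1, 0.2, 0.3]))       # sample mean 0.2 does not
```

Here the data vector has several coordinates while the statistic is a single number, which is the data reduction mentioned above.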

The ultimate decision may be correct or may be in error. There are two types of errors, depending on which of the hypotheses is actually true.

Types of errors:

  • A type 1 error is rejecting the null hypothesis \(H_0\) when \(H_0\) is true.
  • A type 2 error is failing to reject the null hypothesis \(H_0\) when the alternative hypothesis \(H_1\) is true.

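Similarly, there are two ways to make a correct decision: we could reject \(H_0\) when \(H_1\) is true or we could fail to reject \(H_0\) when \(H_0\) is true. The four possibilities are:

  • \(H_0\) true, fail to reject \(H_0\): correct decision
  • \(H_0\) true, reject \(H_0\): type 1 error
  • \(H_1\) true, fail to reject \(H_0\): type 2 error
  • \(H_1\) true, reject \(H_0\): correct decision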

Of course, when we observe \(\bs{X} = \bs{x}\) and make our decision, either we will have made the correct decision or we will have committed an error, and usually we will never know which of these events has occurred. Prior to gathering the data, however, we can consider the probabilities of the various errors.

If \(H_0\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_0\)), then \(\P(\bs{X} \in R)\) is the probability of a type 1 error for this distribution. If \(H_0\) is composite, then \(H_0\) specifies a variety of different distributions for \(\bs{X}\) and thus there is a set of type 1 error probabilities.

The maximum probability of a type 1 error, over the set of distributions specified by \( H_0 \), is the significance level of the test or the size of the critical region.

The significance level is often denoted by \(\alpha\). Usually, the rejection region is constructed so that the significance level is a prescribed, small value (typically 0.1, 0.05, 0.01).

If \(H_1\) is true (that is, the distribution of \(\bs{X}\) is specified by \(H_1\)), then \(\P(\bs{X} \notin R)\) is the probability of a type 2 error for this distribution. Again, if \(H_1\) is composite then \(H_1\) specifies a variety of different distributions for \(\bs{X}\), and thus there will be a set of type 2 error probabilities. Generally, there is a tradeoff between the type 1 and type 2 error probabilities. If we reduce the probability of a type 1 error, by making the rejection region \(R\) smaller, we necessarily increase the probability of a type 2 error because the complementary region \(S \setminus R\) is larger.

The extreme cases can give us some insight. First consider the decision rule in which we never reject \(H_0\), regardless of the evidence \(\bs{x}\). This corresponds to the rejection region \(R = \emptyset\). A type 1 error is impossible, so the significance level is 0. On the other hand, the probability of a type 2 error is 1 for any distribution defined by \(H_1\). At the other extreme, consider the decision rule in which we always reject \(H_0\) regardless of the evidence \(\bs{x}\). This corresponds to the rejection region \(R = S\). A type 2 error is impossible, but now the probability of a type 1 error is 1 for any distribution defined by \(H_0\). In between these two worthless tests are meaningful tests that take the evidence \(\bs{x}\) into account.

If \(H_1\) is true, so that the distribution of \(\bs{X}\) is specified by \(H_1\), then \(\P(\bs{X} \in R)\), the probability of rejecting \(H_0\) is the power of the test for that distribution.

Thus the power of the test for a distribution specified by \( H_1 \) is the probability of making the correct decision.
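The type 1 error probability and the power can both be estimated by simulation: generate many samples under a chosen distribution and count how often the test rejects. The sketch below assumes a specific setting that is not part of the text: a normal sample with known standard deviation 1, \(H_0: \mu \le 0\) versus \(H_1: \mu \gt 0\), and rejection when the sample mean exceeds a critical value.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mean, n=25, alpha=0.05, trials=20000, seed=0):
    """Estimate P(X in R) when the data are N(true_mean, 1) and H0 is mu <= 0."""
    rng = random.Random(seed)
    # Reject when the sample mean exceeds z_{1 - alpha} / sqrt(n).
    critical = NormalDist().inv_cdf(1 - alpha) / sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(true_mean, 1) for _ in range(n)) / n
        if sample_mean > critical:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_mean=0.0)   # H0 true: estimates the type 1 error
power = rejection_rate(true_mean=0.5)         # H1 true: estimates the power
print(f"type 1 error ~ {type_1_rate:.3f}, power at mu = 0.5 ~ {power:.3f}")
```

The first estimate should land near the chosen \(\alpha = 0.05\), and one minus the second estimates the type 2 error probability at \(\mu = 0.5\).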

Suppose that we have two tests, corresponding to rejection regions \(R_1\) and \(R_2\), respectively, each having significance level \(\alpha\). The test with region \(R_1\) is uniformly more powerful than the test with region \(R_2\) if \[ \P(\bs{X} \in R_1) \ge \P(\bs{X} \in R_2) \text{ for every distribution of } \bs{X} \text{ specified by } H_1 \]

Naturally, in this case, we would prefer the first test. Often, however, two tests will not be uniformly ordered; one test will be more powerful for some distributions specified by \(H_1\) while the other test will be more powerful for other distributions specified by \(H_1\).

If a test has significance level \(\alpha\) and is uniformly more powerful than any other test with significance level \(\alpha\), then the test is said to be a uniformly most powerful test at level \(\alpha\).

Clearly a uniformly most powerful test is the best we can do.

\(P\)-value

In most cases, we have a general procedure that allows us to construct a test (that is, a rejection region \(R_\alpha\)) for any given significance level \(\alpha \in (0, 1)\). Typically, \(R_\alpha\) decreases (in the subset sense) as \(\alpha\) decreases.

The \(P\)-value of the observed value \(\bs{x}\) of \(\bs{X}\), denoted \(P(\bs{x})\), is defined to be the smallest \(\alpha\) for which \(\bs{x} \in R_\alpha\); that is, the smallest significance level for which \(H_0\) is rejected, given \(\bs{X} = \bs{x}\).

Knowing \(P(\bs{x})\) allows us to test \(H_0\) at any significance level for the given data \(\bs{x}\): If \(P(\bs{x}) \le \alpha\) then we would reject \(H_0\) at significance level \(\alpha\); if \(P(\bs{x}) \gt \alpha\) then we fail to reject \(H_0\) at significance level \(\alpha\). Note that \(P(\bs{X})\) is a statistic. Informally, \(P(\bs{x})\) can often be thought of as the probability of an outcome as or more extreme than the observed value \(\bs{x}\), where extreme is interpreted relative to the null hypothesis \(H_0\).
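To make this concrete, here is a small sketch that computes a \(P\)-value once and then reads off the decision at several significance levels. It assumes a right-tailed z-test for a normal mean with known standard deviation; the data, `mu0`, and `sigma` are made-up values for illustration.

```python
from math import sqrt
from statistics import NormalDist

def p_value_right_tailed(sample, mu0, sigma):
    """P-value for H0: mu <= mu0 versus H1: mu > mu0 with known sigma."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)   # probability of a result at least this extreme

sample = [10.8, 9.9, 11.2, 10.4, 10.7, 11.0, 10.1, 10.9, 10.6]
p = p_value_right_tailed(sample, mu0=10.0, sigma=1.0)

# One P-value settles the decision at every significance level.
decisions = {alpha: p <= alpha for alpha in (0.1, 0.05, 0.01)}
print(f"P-value = {p:.4f}", decisions)
```

For these numbers the \(P\)-value falls between 0.01 and 0.05, so \(H_0\) is rejected at levels 0.1 and 0.05 but not at 0.01.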

Analogy with Justice Systems

There is a helpful analogy between statistical hypothesis testing and the criminal justice system in the US and various other countries. Consider a person charged with a crime. The presumed null hypothesis is that the person is innocent of the crime; the conjectured alternative hypothesis is that the person is guilty of the crime. The test of the hypotheses is a trial with evidence presented by both sides playing the role of the data. After considering the evidence, the jury delivers the decision as either not guilty or guilty. Note that innocent is not a possible verdict of the jury, because it is not the point of the trial to prove the person innocent. Rather, the point of the trial is to see whether there is sufficient evidence to overturn the null hypothesis that the person is innocent in favor of the alternative hypothesis that the person is guilty. A type 1 error is convicting a person who is innocent; a type 2 error is acquitting a person who is guilty. Generally, a type 1 error is considered the more serious of the two possible errors, so in an attempt to hold the chance of a type 1 error to a very low level, the standard for conviction in serious criminal cases is beyond a reasonable doubt.

Tests of an Unknown Parameter

Hypothesis testing is a very general concept, but an important special class occurs when the distribution of the data variable \(\bs{X}\) depends on a parameter \(\theta\) taking values in a parameter space \(\Theta\). The parameter may be vector-valued, so that \(\bs{\theta} = (\theta_1, \theta_2, \ldots, \theta_k)\) and \(\Theta \subseteq \R^k\) for some \(k \in \N_+\). The hypotheses generally take the form \[ H_0: \theta \in \Theta_0 \text{ versus } H_1: \theta \notin \Theta_0 \] where \(\Theta_0\) is a prescribed subset of the parameter space \(\Theta\). In this setting, the probabilities of making an error or a correct decision depend on the true value of \(\theta\). If \(R\) is the rejection region, then the power function \( Q \) is given by \[ Q(\theta) = \P_\theta(\bs{X} \in R), \quad \theta \in \Theta \] The power function gives a lot of information about the test.

The power function satisfies the following properties:

  • \(Q(\theta)\) is the probability of a type 1 error when \(\theta \in \Theta_0\).
  • \(\max\left\{Q(\theta): \theta \in \Theta_0\right\}\) is the significance level of the test.
  • \(1 - Q(\theta)\) is the probability of a type 2 error when \(\theta \notin \Theta_0\).
  • \(Q(\theta)\) is the power of the test when \(\theta \notin \Theta_0\).
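For some tests the power function has a closed form. The sketch below uses an assumed example, not one given in the text: a right-tailed z-test of \(H_0: \mu \le \mu_0\) for a normal sample of size \(n\) with known \(\sigma\), which rejects when the standardized sample mean exceeds the \(1 - \alpha\) standard normal quantile.

```python
from math import sqrt
from statistics import NormalDist

def power_function(mu, mu0=0.0, sigma=1.0, n=25, alpha=0.05):
    """Q(mu) = P_mu(reject H0) for the right-tailed z-test of H0: mu <= mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under mean mu, the z statistic is normal with mean (mu - mu0) * sqrt(n) / sigma.
    return 1 - std_normal.cdf(z_crit - (mu - mu0) * sqrt(n) / sigma)

for mu in (-0.2, 0.0, 0.2, 0.5):
    region = "H0" if mu <= 0.0 else "H1"
    print(f"mu = {mu:+.1f} ({region}): Q = {power_function(mu):.4f}")
```

On the null side \(Q\) is a type 1 error probability and peaks at the boundary \(\mu = \mu_0\), where it equals \(\alpha\); on the alternative side \(Q\) is the power and \(1 - Q\) is the type 2 error probability.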

If we have two tests, we can compare them by means of their power functions.

Suppose that we have two tests, corresponding to rejection regions \(R_1\) and \(R_2\), respectively, each having significance level \(\alpha\). The test with rejection region \(R_1\) is uniformly more powerful than the test with rejection region \(R_2\) if \( Q_1(\theta) \ge Q_2(\theta)\) for all \( \theta \notin \Theta_0 \).

Most hypothesis tests of an unknown real parameter \(\theta\) fall into three special cases:

Suppose that \( \theta \) is a real parameter and \( \theta_0 \in \Theta \) a specified value. The tests below are respectively the two-sided test, the left-tailed test, and the right-tailed test.

  • \(H_0: \theta = \theta_0\) versus \(H_1: \theta \ne \theta_0\)
  • \(H_0: \theta \ge \theta_0\) versus \(H_1: \theta \lt \theta_0\)
  • \(H_0: \theta \le \theta_0\) versus \(H_1: \theta \gt \theta_0\)

Thus the tests are named after the conjectured alternative. Of course, there may be other unknown parameters besides \(\theta\) (known as nuisance parameters).

Equivalence Between Hypothesis Test and Confidence Sets

There is an equivalence between hypothesis tests and confidence sets for a parameter \(\theta\).

Suppose that \(C(\bs{x})\) is a \(1 - \alpha\) level confidence set for \(\theta\). The following test has significance level \(\alpha\) for the hypothesis \( H_0: \theta = \theta_0 \) versus \( H_1: \theta \ne \theta_0 \): Reject \(H_0\) if and only if \(\theta_0 \notin C(\bs{x})\)

By definition, \(\P[\theta \in C(\bs{X})] = 1 - \alpha\). Hence if \(H_0\) is true so that \(\theta = \theta_0\), then the probability of a type 1 error is \(\P[\theta_0 \notin C(\bs{X})] = \alpha\).

Equivalently, we fail to reject \(H_0\) at significance level \(\alpha\) if and only if \(\theta_0\) is in the corresponding \(1 - \alpha\) level confidence set. In particular, this equivalence applies to interval estimates of a real parameter \(\theta\) and the common tests for \(\theta\) given above.

In each case below, the confidence interval has confidence level \(1 - \alpha\) and the test has significance level \(\alpha\).

  • Suppose that \(\left[L(\bs{X}), U(\bs{X})\right]\) is a two-sided confidence interval for \(\theta\). Reject \(H_0: \theta = \theta_0\) versus \(H_1: \theta \ne \theta_0\) if and only if \(\theta_0 \lt L(\bs{X})\) or \(\theta_0 \gt U(\bs{X})\).
  • Suppose that \(L(\bs{X})\) is a confidence lower bound for \(\theta\). Reject \(H_0: \theta \le \theta_0\) versus \(H_1: \theta \gt \theta_0\) if and only if \(\theta_0 \lt L(\bs{X})\).
  • Suppose that \(U(\bs{X})\) is a confidence upper bound for \(\theta\). Reject \(H_0: \theta \ge \theta_0\) versus \(H_1: \theta \lt \theta_0\) if and only if \(\theta_0 \gt U(\bs{X})\).
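The two-sided case can be checked numerically. The sketch below assumes a normal sample with known \(\sigma\) and uses the standard z interval; the function names and data are illustrative. It makes the decision once through the confidence interval and once through the z statistic, then confirms that the two rules agree.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample, sigma, alpha=0.05):
    """Two-sided 1 - alpha confidence interval for the mean with known sigma."""
    n = len(sample)
    center = sum(sample) / n
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return center - half_width, center + half_width

def reject_via_interval(sample, mu0, sigma, alpha=0.05):
    lower, upper = z_interval(sample, sigma, alpha)
    return mu0 < lower or mu0 > upper          # mu0 outside the confidence set

def reject_via_z(sample, mu0, sigma, alpha=0.05):
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)

sample = [10.8, 9.9, 11.2, 10.4, 10.7, 11.0, 10.1, 10.9, 10.6]
for mu0 in (9.5, 10.0, 10.5, 11.5):
    a = reject_via_interval(sample, mu0, sigma=1.0)
    b = reject_via_z(sample, mu0, sigma=1.0)
    print(f"mu0 = {mu0}: interval rule -> {a}, z rule -> {b}")
```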

Pivot Variables and Test Statistics

Recall that confidence sets of an unknown parameter \(\theta\) are often constructed through a pivot variable, that is, a random variable \(W(\bs{X}, \theta)\) that depends on the data vector \(\bs{X}\) and the parameter \(\theta\), but whose distribution does not depend on \(\theta\) and is known. In this case, a natural test statistic for the basic tests given above is \(W(\bs{X}, \theta_0)\).

Statology

Statistics Made Easy

Introduction to Hypothesis Testing

A statistical hypothesis is an assumption about a population parameter.

For example, we may assume that the mean height of a male in the U.S. is 70 inches.

The assumption about the height is the statistical hypothesis and the true mean height of a male in the U.S. is the population parameter.

A hypothesis test is a formal statistical test we use to reject or fail to reject a statistical hypothesis.

The Two Types of Statistical Hypotheses

To test whether a statistical hypothesis about a population parameter is true, we obtain a random sample from the population and perform a hypothesis test on the sample data.

There are two types of statistical hypotheses:

The null hypothesis, denoted as H0, is the hypothesis that the sample data occurs purely from chance.

The alternative hypothesis, denoted as H1 or Ha, is the hypothesis that the sample data is influenced by some non-random cause.

Hypothesis Tests

A hypothesis test consists of five steps:

1. State the hypotheses. 

State the null and alternative hypotheses. These two hypotheses need to be mutually exclusive, so if one is true then the other must be false.

2. Determine a significance level to use for the hypothesis.

Decide on a significance level. Common choices are .01, .05, and .1. 

3. Find the test statistic.

Find the test statistic and the corresponding p-value. Often we are analyzing a population mean or proportion and the general formula to find the test statistic is: (sample statistic – population parameter) / (standard deviation of statistic)

4. Reject or fail to reject the null hypothesis.

Using the test statistic or the p-value, determine if you can reject or fail to reject the null hypothesis based on the significance level.

The p-value tells us the strength of evidence against the null hypothesis: the smaller the p-value, the stronger the evidence. If the p-value is less than the significance level, we reject the null hypothesis.

5. Interpret the results. 

Interpret the results of the hypothesis test in the context of the question being asked. 

The Two Types of Decision Errors

There are two types of decision errors that one can make when doing a hypothesis test:

Type I error: You reject the null hypothesis when it is actually true. The probability of committing a Type I error is equal to the significance level, often called alpha, and denoted as α.

Type II error: You fail to reject the null hypothesis when it is actually false. The probability of committing a Type II error is called beta, denoted as β. The power of the test is 1 – β, the probability of correctly rejecting a false null hypothesis.
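To make the two error probabilities concrete, here is a minimal Python sketch for a right-tailed z-test. It assumes a known population standard deviation; the function name and the numbers (a true mean of 71 inches against a null value of 70) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for the right-tailed z-test of H0: mu <= mu0 vs H1: mu > mu0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the true mean, the test statistic is normal with this mean and sd 1.
    shift = (mu_true - mu0) / (sigma / sqrt(n))
    # We fail to reject whenever the statistic stays at or below z_crit.
    return std_normal.cdf(z_crit - shift)


# Hypothetical setting: H0 mean 70, true mean 71, sigma 3, n = 36.
beta = type_ii_error(mu0=70, mu_true=71, sigma=3, n=36)
power = 1 - beta
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

Raising the sample size or the significance level lowers beta in this sketch, which illustrates the usual trade-off between the two error types.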

One-Tailed and Two-Tailed Tests

A statistical hypothesis can be one-tailed or two-tailed.

A one-tailed hypothesis involves making a “greater than” or “less than” statement.

For example, suppose we assume the mean height of a male in the U.S. is greater than or equal to 70 inches. The null hypothesis would be H0: µ ≥ 70 inches and the alternative hypothesis would be Ha: µ < 70 inches.

A two-tailed hypothesis involves making an “equal to” or “not equal to” statement.

For example, suppose we assume the mean height of a male in the U.S. is equal to 70 inches. The null hypothesis would be H0: µ = 70 inches and the alternative hypothesis would be Ha: µ ≠ 70 inches.

Note: The “equal” sign is always included in the null hypothesis, whether it is =, ≥, or ≤.
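The choice of tail changes how a p-value is read off the standard normal distribution. The sketch below assumes a z-statistic has already been computed; the function name and the `alternative` labels are hypothetical conventions, not part of any particular library.

```python
from statistics import NormalDist


def p_value(z, alternative):
    """p-value of a z-statistic for a left-, right-, or two-tailed test."""
    std_normal = NormalDist()
    if alternative == "less":        # Ha: mu < mu0 (left-tailed)
        return std_normal.cdf(z)
    if alternative == "greater":     # Ha: mu > mu0 (right-tailed)
        return 1 - std_normal.cdf(z)
    if alternative == "two-sided":   # Ha: mu != mu0 (two-tailed)
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


z = -1.5  # hypothetical test statistic
for alt in ("less", "greater", "two-sided"):
    print(alt, round(p_value(z, alt), 4))
```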

Types of Hypothesis Tests

There are many different types of hypothesis tests you can perform depending on the type of data you’re working with and the goal of your analysis.

The following tutorials provide an explanation of the most common types of hypothesis tests:

  • Introduction to the One Sample t-test
  • Introduction to the Two Sample t-test
  • Introduction to the Paired Samples t-test
  • Introduction to the One Proportion Z-Test
  • Introduction to the Two Proportion Z-Test

Hey there. My name is Zach Bobbitt. I have a Masters of Science degree in Applied Statistics and I’ve worked on machine learning algorithms for professional businesses in both healthcare and retail. I’m passionate about statistics, machine learning, and data visualization and I created Statology to be a resource for both students and teachers alike.  My goal with this site is to help you learn statistics through using simple terms, plenty of real-world examples, and helpful illustrations.

Hypothesis Testing in Finance: Concept and Examples

Charlene Rhinehart is a CPA , CFE, chair of an Illinois CPA Society committee, and has a degree in accounting and finance from DePaul University.

Your investment advisor proposes a monthly income investment plan that promises a variable return each month. You will invest in it only if you are assured of an average $180 monthly income. Your advisor also tells you that for the past 300 months, the scheme had investment returns with an average value of $190 and a standard deviation of $75. Should you invest in this scheme? Hypothesis testing can help with such decisions.

Key Takeaways

  • Hypothesis testing is a mathematical tool for confirming a financial or business claim or idea.
  • Hypothesis testing is useful for investors trying to decide what to invest in and whether the instrument is likely to provide a satisfactory return.
  • Despite the existence of different methodologies of hypothesis testing, the same four steps are used: define the hypothesis, set the criteria, calculate the statistic, and reach a conclusion.
  • This mathematical model, like most statistical tools and models, has limitations and is prone to certain errors, so investors should also consider other models in conjunction with this one.

Hypothesis or significance testing is a mathematical model for testing a claim, idea or hypothesis about a parameter of interest in a given population set, using data measured in a sample set. Calculations are performed on selected samples to gather more decisive information about the characteristics of the entire population, which enables a systematic way to test claims or ideas about the entire dataset.

Here is a simple example: A school principal reports that students in their school score an average of 7 out of 10 in exams. To test this “hypothesis,” we record marks of say 30 students (sample) from the entire student population of the school (say 300) and calculate the mean of that sample. We can then compare the (calculated) sample mean to the (reported) population mean and attempt to confirm the hypothesis.

To take another example, suppose the annual return of a particular mutual fund is claimed to be 8%. Assume that the mutual fund has been in existence for 20 years. We take a random sample of annual returns of the mutual fund for, say, five years (sample) and calculate its mean. We then compare the (calculated) sample mean to the (claimed) population mean to verify the hypothesis.

This article assumes readers' familiarity with concepts of a normal distribution table, formula, p-value and related basics of statistics.

Different methodologies exist for hypothesis testing, but the same four basic steps are involved:

Step 1: Define the Hypothesis

Usually, the reported value (or the claim statistics) is stated as the hypothesis and presumed to be true. For the above examples, the hypothesis will be:

  • Example A: Students in the school score an average of 7 out of 10 in exams.
  • Example B: The annual return of the mutual fund is 8% per annum.

This stated description constitutes the “Null Hypothesis (H0)” and is assumed to be true, the way a defendant in a jury trial is presumed innocent until proven guilty by the evidence presented in court. Similarly, hypothesis testing starts by stating and assuming a “null hypothesis,” and then the process determines whether the assumption is likely to be true or false.

The important point to note is that we are testing the null hypothesis because there is an element of doubt about its validity. Whatever information is against the stated null hypothesis is captured in the Alternative Hypothesis (H1). For the above examples, the alternative hypothesis will be:

  • Students score an average that is not equal to 7.
  • The annual return of the mutual fund is not equal to 8% per annum.

In other words, the alternative hypothesis is a direct contradiction of the null hypothesis.

As in a trial, the jury assumes the defendant's innocence (null hypothesis). The prosecutor has to prove otherwise (alternative hypothesis). Similarly, the researcher has to show that the evidence contradicts the null hypothesis. If the prosecutor fails to prove the alternative hypothesis, the jury has to let the defendant go (basing the decision on the null hypothesis). Similarly, if the researcher fails to establish the alternative hypothesis (or simply does nothing), then the null hypothesis is retained.

Step 2: Set the Criteria

The decision-making criteria have to be based on certain parameters of datasets and this is where the connection to normal distribution comes into the picture.

As per the standard statistics postulate about sampling distribution, “For any sample size n, the sampling distribution of X̅ is normal if the population X from which the sample is drawn is normally distributed.” Hence, the sample means of all other possible samples that one could select are normally distributed.

For example, determine if the average daily return of any stock listed on the XYZ stock market around New Year's Day is greater than 2%.

H0: Null Hypothesis: mean = 2%

H1: Alternative Hypothesis: mean > 2% (this is what we want to prove)

Take the sample (say of 50 stocks out of total 500) and compute the mean of the sample.

For a normal distribution, 95% of the values lie within two standard deviations of the population mean. Hence, this normal distribution and central limit assumption for the sample dataset allows us to establish 5% as a significance level. It makes sense as, under this assumption, there is less than a 5% probability (100-95) of getting outliers that are beyond two standard deviations from the population mean. Depending upon the nature of datasets, other significance levels can be taken at 1%, 5% or 10%. For financial calculations (including behavioral finance), 5% is the generally accepted limit. If we find any calculations that go beyond the usual two standard deviations, then we have a strong case of outliers to reject the null hypothesis.  

Graphically, it is represented as follows:

In the above example, if the mean of the sample is much larger than 2% (say 3.5%), then we reject the null hypothesis. The alternative hypothesis (mean >2%) is accepted, which confirms that the average daily return of the stocks is indeed above 2%.

However, if the mean of the sample is not significantly greater than 2% (and remains at, say, around 2.2%), then we CANNOT reject the null hypothesis. The challenge lies in deciding such close-range cases. To draw a conclusion from the selected samples and results, a level of significance has to be determined. The level of significance, together with the direction of the alternative hypothesis, fixes the “critical value” used to decide such close-range cases.

According to the textbook standard definition, “A critical value is a cutoff value that defines the boundaries beyond which less than 5% of sample means can be obtained if the null hypothesis is true. Sample means obtained beyond a critical value will result in a decision to reject the null hypothesis.” In the above example, if we have defined the critical value as 2.1%, and the calculated mean comes to 2.2%, then we reject the null hypothesis. A critical value establishes a clear demarcation between retaining and rejecting the null hypothesis.

Step 3: Calculate the Statistic

This step involves calculating the required figure(s), known as test statistics (like mean, z-score, p-value, etc.), for the selected sample. (We'll get to these in a later section.)

Step 4: Reach a Conclusion

With the computed value(s), decide on the null hypothesis. If the probability of getting a sample mean at least as extreme as the one observed is less than 5%, then the conclusion is to reject the null hypothesis. Otherwise, retain the null hypothesis.

Types of Errors

There can be four possible outcomes in sample-based decision-making, with regard to the correct applicability to the entire population:

  • The null hypothesis is true and is retained: correct decision.
  • The null hypothesis is true but is rejected: Type 1 (alpha) error.
  • The null hypothesis is false but is retained: Type 2 (beta) error.
  • The null hypothesis is false and is rejected: correct decision.

The “Correct” cases are the ones where the decisions taken on the samples are truly applicable to the entire population. The cases of errors arise when one decides to retain (or reject) the null hypothesis based on the sample calculations, but that decision does not really apply for the entire population. These cases constitute Type 1 (alpha) and Type 2 (beta) errors, as listed above.

Selecting the correct critical value allows eliminating the type-1 alpha errors or limiting them to an acceptable range.

Alpha denotes the probability of a Type 1 error, that is, the level of significance, and is determined by the researcher. To maintain the standard 5% significance level (equivalently, a 95% confidence level) for probability calculations, alpha is set at 5%.

According to the applicable decision-making benchmarks and definitions:

  • “This (alpha) criterion is usually set at 0.05 (α = 0.05), and we compare the alpha level to the p-value. When the probability of a Type I error is less than 5% (p < 0.05), we decide to reject the null hypothesis; otherwise, we retain the null hypothesis.”
  • The technical term used for this probability is the p-value . It is defined as “the probability of obtaining a sample outcome, given that the value stated in the null hypothesis is true. The p-value for obtaining a sample outcome is compared to the level of significance."  
  • A Type II error, or beta error, is defined as the probability of incorrectly retaining the null hypothesis, when in fact it is not applicable to the entire population.  

A few more examples will demonstrate this and other calculations.

A monthly income investment scheme exists that promises variable monthly returns. An investor will invest in it only if they are assured of an average $180 monthly income. The investor has a sample of 300 months’ returns which has a mean of $190 and a standard deviation of $75. Should they invest in this scheme?

Let’s set up the problem. The investor will invest in the scheme only if they are assured of the desired $180 average monthly return.

H0: Null Hypothesis: mean = 180

H1: Alternative Hypothesis: mean > 180

Method 1: Critical Value Approach

Identify a critical value X_L for the sample mean that is large enough to reject the null hypothesis, i.e. reject the null hypothesis if the sample mean >= critical value X_L.

P(Type I alpha error) = P(reject H0 given that H0 is true)

This occurs when the sample mean exceeds the critical limit:

= P(sample mean >= X_L given that H0 is true) = alpha

Graphically, it appears as follows:

Taking alpha = 0.05 (i.e. 5% significance level), Z_0.05 = 1.645 (from the Z-table or normal distribution table)

=> X_L = 180 + 1.645 * (75 / sqrt(300)) = 187.12

Since the sample mean (190) is greater than the critical value (187.12), the null hypothesis is rejected, and the conclusion is that the average monthly return is indeed greater than $180, so the investor can consider investing in this scheme.
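The critical value approach can be reproduced in a few lines of Python. This is a minimal sketch using the figures from the example; the variable names are illustrative, and the sample standard deviation is treated as the population value, as the example itself does.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 180          # mean under the null hypothesis
sample_mean = 190  # observed mean of the 300 monthly returns
sample_sd = 75     # standard deviation, treated as known
n = 300
alpha = 0.05

# Upper 5% point of the standard normal distribution (about 1.645).
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Smallest sample mean that leads to rejecting H0 in the right-tailed test.
critical_value = mu0 + z_alpha * sample_sd / sqrt(n)
reject_null = sample_mean >= critical_value

print(round(critical_value, 2), reject_null)
```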

Method 2: Using Standardized Test Statistics

One can also use standardized value z.

Test Statistic, Z = (sample mean – population mean) / (std-dev / sqrt(no. of samples)).

For this example, the test statistic is:

Z = (190 – 180) / (75 / sqrt(300)) = 2.309

Our rejection region at 5% significance level is Z > Z_0.05 = 1.645.

Since Z = 2.309 is greater than 1.645, the null hypothesis can be rejected, with the same conclusion as above.

Method 3: P-value Calculation

We aim to identify P(sample mean >= 190, when mean = 180).

= P(Z >= (190 – 180) / (75 / sqrt(300)))

= P(Z >= 2.309) = 0.0105 = 1.05%

Since this p-value is well below the 5% significance level, there is strong evidence that average monthly returns are higher than $180.
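As a cross-check of Methods 2 and 3, the standardized statistic and its right-tail p-value can be computed directly instead of being read from a Z-table. This is a minimal sketch with illustrative variable names, again treating the sample standard deviation as known.

```python
from math import sqrt
from statistics import NormalDist

mu0, sample_mean, sample_sd, n, alpha = 180, 190, 75, 300, 0.05

# Standardized test statistic for the sample mean.
z = (sample_mean - mu0) / (sample_sd / sqrt(n))

# Right-tail probability: chance of a statistic at least this large under H0.
p = 1 - NormalDist().cdf(z)

reject_null = p < alpha
print(f"z = {z:.3f}, p-value = {p:.4f}, reject H0: {reject_null}")
```

Computing the tail probability from the normal CDF avoids the rounding that comes with looking values up in a printed table.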

A new stockbroker (XYZ) claims that their brokerage fees are lower than those of your current stockbroker (ABC). Data available from an independent research firm indicates that the mean and std-dev of the brokerage charges of all ABC broker clients are $18 and $6, respectively.

A sample of 100 clients of ABC is taken and brokerage charges are calculated with the new rates of XYZ broker. If the mean of the sample is $18.75 and std-dev is the same ($6), can any inference be made about the difference in the average brokerage bill between ABC and XYZ broker?

H0: Null Hypothesis: mean = 18

H1: Alternative Hypothesis: mean ≠ 18 (This is what we want to prove.)

Rejection region: Z <= -Z_0.025 or Z >= Z_0.025 (assuming 5% significance level, split 2.5% on either side).

Z = (sample mean – mean) / (std-dev / sqrt(no. of samples))

= (18.75 – 18) / (6 / sqrt(100)) = 1.25

This calculated Z value falls between the two limits defined by:

-Z_0.025 = -1.96 and Z_0.025 = 1.96.

This concludes that there is insufficient evidence to infer that there is any difference between the rates of your existing broker and the new broker.

Alternatively, the p-value = P(Z < -1.25) + P(Z > 1.25)

= 2 * 0.1056 = 0.2112 = 21.12%, which is greater than 0.05 or 5%, leading to the same conclusion.
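The two-tailed broker example can be checked the same way. The sketch below uses the figures from the example with illustrative variable names; the p-value it prints may differ from the table-based figure in the last decimal place because of rounding.

```python
from math import sqrt
from statistics import NormalDist

mu0, sample_mean, sd, n, alpha = 18.0, 18.75, 6.0, 100, 0.05
std_normal = NormalDist()

z = (sample_mean - mu0) / (sd / sqrt(n))
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # about 1.96

# Two-tailed p-value: probability in both tails beyond |z|.
p = 2 * (1 - std_normal.cdf(abs(z)))

reject_null = abs(z) >= z_crit
print(f"z = {z:.2f}, critical = {z_crit:.2f}, p-value = {p:.4f}, reject H0: {reject_null}")
```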

Graphically, it is represented by the following:

Criticism Points for the Hypothesis Testing Method:

  • A statistical method based on assumptions
  • Error-prone as detailed in terms of alpha and beta errors
  • Interpretation of p-value can be ambiguous, leading to confusing results

The Bottom Line

Hypothesis testing allows a mathematical model to validate a claim or idea with a certain confidence level. However, like the majority of statistical tools and models, it is bound by a few limitations. The use of this model for making financial decisions should be considered with a critical eye, keeping all dependencies in mind. Alternate methods like Bayesian Inference are also worth exploring for similar analysis.

Sage Publications. " Introduction to Hypothesis Testing ," Page 13.

Sage Publications. " Introduction to Hypothesis Testing ," Page 11.

Sage Publications. " Introduction to Hypothesis Testing ," Page 7.

Sage Publications. " Introduction to Hypothesis Testing ," Pages 10-11.


Lesson 10 of 24 By Avijeet Biswal

A Complete Guide on Hypothesis Testing in Statistics

In today’s data-driven world, decisions are based on data all the time. Hypotheses play a crucial role in that process, whether in making business decisions, in the health sector, in academia, or in quality improvement. Without hypotheses and hypothesis tests, you risk drawing the wrong conclusions and making bad decisions. In this tutorial, you will look at Hypothesis Testing in Statistics.

What Is Hypothesis Testing in Statistics?

Hypothesis Testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is often used to assess the relationship between two statistical variables.

Let's discuss a few real-life examples of statistical hypotheses:

  • A teacher assumes that 60% of his college's students come from lower-middle-class families.
  • A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.

Now that you know about hypothesis testing, look at the two types of hypothesis testing in statistics.

Hypothesis Testing Formula

Z = ( x̅ – μ0 ) / (σ /√n)

  • Here, x̅ is the sample mean,
  • μ0 is the hypothesized population mean,
  • σ is the population standard deviation,
  • n is the sample size.

How Does Hypothesis Testing Work?

An analyst performs hypothesis testing on a statistical sample to present evidence of the plausibility of the null hypothesis. Measurements and analyses are conducted on a random sample of the population to test a theory. Analysts use a random population sample to test two hypotheses: the null and alternative hypotheses.

The null hypothesis is typically an equality hypothesis between population parameters; for example, a null hypothesis may claim that the population mean return equals zero. The alternate hypothesis is essentially the inverse of the null hypothesis (e.g., the population mean return is not equal to zero). As a result, they are mutually exclusive, and only one can be correct. One of the two possibilities, however, will always be correct.

Null Hypothesis and Alternate Hypothesis

The Null Hypothesis is the assumption that there is no effect or no difference, that is, that the event of interest will not occur. A null hypothesis has no bearing on the study's outcome unless it is rejected.

H0 is the symbol for it, and it is pronounced H-naught.

The Alternate Hypothesis is the logical opposite of the null hypothesis. The acceptance of the alternative hypothesis follows the rejection of the null hypothesis. H1 is the symbol for it.

Let's understand this with an example.

A sanitizer manufacturer claims that its product kills 95 percent of germs on average. 

To put this company's claim to the test, create a null and alternate hypothesis.

H0 (Null Hypothesis): Average = 95%.

Alternative Hypothesis (H1): The average is less than 95%.

Another straightforward example to understand this concept is determining whether or not a coin is fair and balanced. The null hypothesis states that the probability of a show of heads is equal to the probability of a show of tails. In contrast, the alternate hypothesis states that the probabilities of a show of heads and a show of tails are different.

Hypothesis Testing Calculation With Examples

Let's consider a hypothesis test for the average height of women in the United States. Suppose our null hypothesis is that the average height is 5'4" (64 inches). We gather a sample of 100 women and determine that their average height is 5'5" (65 inches). The population standard deviation is 2 inches.

To calculate the z-score, we would use the following formula:

z = ( x̅ – μ0 ) / (σ /√n)

z = (65 – 64) / (2 / √100)

z = 1 / 0.2 = 5

We will reject the null hypothesis, as the z-score of 5 is very large, and conclude that there is evidence to suggest that the average height of women in the US is greater than 5'4".
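A quick way to check the arithmetic in this example is to convert the heights to inches and evaluate the formula directly. The sketch below assumes that 5'4" and 5'5" correspond to 64 and 65 inches and that the standard deviation of 2 is also in inches; the helper name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_score(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Heights in inches: 5'5" is 65, 5'4" is 64; population sd is 2; n = 100.
z = z_score(sample_mean=65, mu0=64, sigma=2, n=100)

# Right-tail p-value for the alternative "average height is greater than 5'4"".
p = 1 - NormalDist().cdf(z)
print(z, p)
```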

Steps of Hypothesis Testing

Step 1: Specify Your Null and Alternate Hypotheses

It is critical to rephrase your original research hypothesis (the prediction that you wish to study) as a null (H0) and alternative (Ha) hypothesis so that you can test it quantitatively. Your first hypothesis, which predicts a link between variables, is generally your alternate hypothesis. The null hypothesis predicts no link between the variables of interest.

Step 2: Gather Data

For a statistical test to be legitimate, sampling and data collection must be done in a way that is meant to test your hypothesis. You cannot draw statistical conclusions about the population you are interested in if your data is not representative.

Step 3: Conduct a Statistical Test

Many statistical tests are available, but most of them compare within-group variance (how spread out the data are inside a category) against between-group variance (how different the categories are from one another). If the between-group variation is big enough that there is little or no overlap between groups, your statistical test will display a low p-value to represent this. This suggests that the disparities between these groups are unlikely to have occurred by chance. Alternatively, if there is a large within-group variance and a low between-group variance, your statistical test will show a high p-value. Any difference you find across groups is most likely attributable to chance. The types of variables and the level of measurement of your obtained data will influence your statistical test selection.

Step 4: Determine Rejection Of Your Null Hypothesis

Your statistical test results must determine whether your null hypothesis should be rejected or not. In most circumstances, you will base your judgment on the p-value provided by the statistical test, and your preset level of significance for rejecting the null hypothesis will be 0.05, that is, you reject when there is less than a 5% likelihood that these data would be seen if the null hypothesis were true. In other circumstances, researchers use a lower level of significance, such as 0.01 (1%). This reduces the possibility of wrongly rejecting the null hypothesis.

Step 5: Present Your Results 

The findings of hypothesis testing will be discussed in the results and discussion portions of your research paper, dissertation, or thesis. You should include a concise overview of the data and a summary of the findings of your statistical test in the results section. In the discussion section, you can talk about whether your results confirmed your initial hypothesis or not. “Rejecting” or “failing to reject” the null hypothesis is the formal wording used in hypothesis testing. This is likely a must for your statistics assignments.

Types of Hypothesis Testing

Z Test

To determine whether a discovery or relationship is statistically significant, hypothesis testing can use a z-test. It usually checks whether two means are the same (the null hypothesis). A z-test can be applied only when the population standard deviation is known and the sample size is 30 data points or more.

T Test

A t-test is a statistical test employed to compare the means of two groups. It is frequently used in hypothesis testing to determine whether two groups differ or whether a procedure or treatment affects the population of interest.
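As an illustration of comparing two group means, the sketch below computes the t-statistic and approximate degrees of freedom for Welch's unequal-variance t-test, which is one common variant and is swapped in here as an assumption since the text does not name a specific form. The function name and the two small samples are hypothetical; turning the statistic into a p-value additionally requires the t distribution, which is not part of Python's standard library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two sample means.
    se2_a = variance(group_a) / len(group_a)
    se2_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(group_a) - 1) + se2_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical measurements for two groups.
control = [1, 2, 3, 4, 5]
treated = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(control, treated)
print(t_stat, dof)
```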

Chi-Square 

You utilize a Chi-square test for hypothesis testing concerning whether your data is as predicted. To determine if the expected and observed results are well-fitted, the Chi-square test analyzes the differences between categorical variables from a random sample. The test's fundamental premise is that the observed values in your data should be compared to the predicted values that would be present if the null hypothesis were true.
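The comparison of observed and expected counts can be sketched as a goodness-of-fit statistic. The counts below are hypothetical, and the critical value 5.991 is the standard table value for 2 degrees of freedom at the 5% level; it is hard-coded here because the chi-square distribution is not in Python's standard library.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical survey of 100 customers across three categories.
observed = [48, 35, 17]
# Counts implied by null-hypothesis proportions 0.5, 0.3, 0.2.
expected = [50, 30, 20]

stat = chi_square_statistic(observed, expected)

# Table value for df = 3 - 1 = 2 at the 5% significance level.
CRITICAL_5PCT_DF2 = 5.991
reject_null = stat > CRITICAL_5PCT_DF2
print(round(stat, 4), reject_null)
```

A statistic below the critical value means the observed counts are consistent with the expected ones, so the null hypothesis is not rejected in this illustration.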

Hypothesis Testing and Confidence Intervals

Both confidence intervals and hypothesis tests are inferential techniques that depend on approximating the sampling distribution. Data from a sample is used to estimate a population parameter using confidence intervals. Data from a sample is used in hypothesis testing to examine a given hypothesis. We must have a postulated parameter to conduct hypothesis testing.

Bootstrap distributions and randomization distributions are created using comparable simulation techniques. The observed sample statistic is the focal point of a bootstrap distribution, whereas the null hypothesis value is the focal point of a randomization distribution.

A range of plausible population parameter estimates is included in a confidence interval. In this lesson, we created just two-tailed confidence intervals. There is a direct connection between two-tailed confidence intervals and two-tailed hypothesis tests: they typically lead to the same conclusion. In other words, a hypothesis test at the 0.05 level will virtually always fail to reject the null hypothesis if the 95% confidence interval contains the hypothesized parameter value. A hypothesis test at the 0.05 level will nearly certainly reject the null hypothesis if the 95% confidence interval does not include the hypothesized parameter.

Simple and Composite Hypothesis Testing

Depending on the population distribution, you can classify the statistical hypothesis into two types.

Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.

Composite Hypothesis: A composite hypothesis specifies a range of values.

A company is claiming that their average sales for this quarter are 1000 units. This is an example of a simple hypothesis.

Suppose the company claims that the sales are in the range of 900 to 1000 units. Then this is a case of a composite hypothesis.

One-Tailed and Two-Tailed Hypothesis Testing

The One-Tailed test, also called a directional test, considers a critical region of data that would result in the null hypothesis being rejected if the test sample falls into it, which means accepting the alternate hypothesis.

In a one-tailed test, the critical distribution area is one-sided, meaning the test sample is either greater or lesser than a specific value.

In a Two-Tailed test, the test sample is checked to be greater or less than a range of values, implying that the critical distribution area is two-sided.

If the sample falls outside this range, the alternate hypothesis will be accepted, and the null hypothesis will be rejected.

Right Tailed Hypothesis Testing

If the greater than (>) sign appears in your alternate hypothesis, you are using a right-tailed test, also known as an upper-tailed test. In other words, the critical region lies to the right. For instance, you can compare battery life before and after a change in production. If you want to know whether the battery life is longer than the original (let's say 90 hours), your hypothesis statements can be the following:

  • The null hypothesis is no change or a decrease (H0: mean <= 90).
  • The alternate hypothesis is that battery life has risen (H1: mean > 90).

The crucial point in this situation is that the alternate hypothesis (H1), not the null hypothesis, decides whether you get a right-tailed test.

Left Tailed Hypothesis Testing

A left-tailed test is used when the alternative hypothesis asserts that the true value of a parameter is lower than the value stated in the null hypothesis; it is indicated by the less-than sign "<".

Suppose H0: mean = 50 and H1: mean not equal to 50

According to the H1, the mean can be greater than or less than 50. This is an example of a Two-tailed test.

In a similar manner, if H0: mean >=50, then H1: mean <50

Here H1 states that the mean is less than 50, so this is a one-tailed (left-tailed) test.
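The three kinds of test differ only in which tail area is counted as the p-value. Here is a small sketch, assuming the test statistic is a z-score under a standard normal distribution; the helper name `p_value_from_z` is hypothetical.

```python
from statistics import NormalDist

def p_value_from_z(z, tail):
    """Return the p-value for a z statistic under a left, right, or two-tailed test."""
    cdf = NormalDist().cdf
    if tail == "left":      # H1: parameter < hypothesized value
        return cdf(z)
    if tail == "right":     # H1: parameter > hypothesized value
        return 1 - cdf(z)
    if tail == "two":       # H1: parameter != hypothesized value
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")

# The same statistic gives different p-values depending on the alternative
for tail in ("left", "right", "two"):
    print(tail, round(p_value_from_z(1.5, tail), 4))
```

Note that the two-tailed p-value is twice the smaller one-tailed p-value, which is why the direction of H1 should be fixed before looking at the data.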

Type 1 and Type 2 Error

A hypothesis test can result in two types of errors.

Type 1 Error: A Type-I error occurs when the sample results lead you to reject the null hypothesis even though it is true.

Type 2 Error: A Type-II error occurs when the null hypothesis is not rejected even though it is false.

Suppose a teacher evaluates the examination paper to decide whether a student passes or fails.

H0: Student has passed

H1: Student has failed

Type I error will be the teacher failing the student [rejects H0] although the student scored the passing marks [H0 was true]. 

Type II error will be the case where the teacher passes the student [do not reject H0] although the student did not score the passing marks [H1 is true].

Level of Significance

The alpha value is a criterion for determining whether a test statistic is statistically significant. In a statistical test, Alpha represents an acceptable probability of a Type I error. Because alpha is a probability, it can be anywhere between 0 and 1. In practice, the most commonly used alpha values are 0.01, 0.05, and 0.1, which represent a 1%, 5%, and 10% chance of a Type I error, respectively (i.e. rejecting the null hypothesis when it is in fact correct).
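One way to see what alpha means is to simulate many studies in which the null hypothesis is true and count how often it is rejected anyway. This is only a sketch under assumed conditions (normally distributed data with known standard deviation 1, a two-tailed z-test); the function name and parameters are illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Estimate how often a true null hypothesis (mean = 0) is rejected."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]   # H0 is true by construction
        z = (sum(sample) / n) / (1 / sqrt(n))          # z statistic with sigma = 1
        if abs(z) > z_crit:
            rejections += 1                            # a Type I error
    return rejections / trials

rate = type_i_error_rate()
print(rate)   # expected to land near alpha
```

The observed rejection rate should be close to the chosen alpha, since every rejection here is a false positive.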

A p-value is a metric that expresses the probability of observing a difference at least as large as the one in your sample if the null hypothesis were true, that is, by chance alone. As the p-value decreases, the statistical significance of the observed difference increases. If the p-value is below your chosen significance level, you reject the null hypothesis.

Consider an example in which you are trying to test whether a new advertising campaign has changed the product's sales. The null hypothesis states that there is no change in sales due to the campaign. The p-value is not the probability that this null hypothesis is true; it is the probability of seeing a change in sales at least as large as the observed one if the campaign had no effect. If the p-value is 0.30, a change this large would occur about 30% of the time by chance alone, so the data give little reason to doubt the null hypothesis. If the p-value is 0.03, a change this large would occur only about 3% of the time by chance alone. The lower the p-value, the stronger the evidence against the null hypothesis and in favor of the alternate hypothesis that the campaign causes an increase or decrease in sales.
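The idea of a difference arising "by chance" can be made concrete with a permutation test, which is a different technique from the z-test but rests on the same logic: if the campaign had no effect, the before and after labels are interchangeable. The sketch below uses made-up sales figures and a hypothetical helper, `permutation_p_value`.

```python
import random

def permutation_p_value(before, after, shuffles=10000, seed=0):
    """Two-sided permutation p-value for a difference in mean sales."""
    rng = random.Random(seed)
    observed = sum(after) / len(after) - sum(before) / len(before)
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)                         # relabel the data at random
        diff = (sum(pooled[n_before:]) / len(after)
                - sum(pooled[:n_before]) / n_before)
        if abs(diff) >= abs(observed):              # at least as extreme as observed
            extreme += 1
    return extreme / shuffles

# Hypothetical weekly sales before and after the campaign
before = [100, 98, 105, 102, 97, 101, 99, 103]
after = [108, 104, 110, 107, 103, 109, 106, 111]
p_value = permutation_p_value(before, after)
print(p_value)
```

The returned value is the share of random relabelings that produce a difference at least as large as the observed one, assuming no campaign effect. It is not the probability that the null hypothesis is true.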

Why is Hypothesis Testing Important in Research Methodology?

Hypothesis testing is crucial in research methodology for several reasons:

  • Provides evidence-based conclusions: It allows researchers to make objective conclusions based on empirical data, providing evidence to support or refute their research hypotheses.
  • Supports decision-making: It helps make informed decisions, such as accepting or rejecting a new treatment, implementing policy changes, or adopting new practices.
  • Adds rigor and validity: It adds scientific rigor to research using statistical methods to analyze data, ensuring that conclusions are based on sound statistical evidence.
  • Contributes to the advancement of knowledge: By testing hypotheses, researchers contribute to the growth of knowledge in their respective fields by confirming existing theories or discovering new patterns and relationships.

Limitations of Hypothesis Testing

Hypothesis testing has some limitations that researchers should be aware of:

  • It cannot prove or establish the truth: Hypothesis testing provides evidence to support or reject a hypothesis, but it cannot confirm the absolute truth of the research question.
  • Results are sample-specific: Hypothesis testing is based on analyzing a sample from a population, and the conclusions drawn are specific to that particular sample.
  • Possible errors: During hypothesis testing, there is a chance of committing type I error (rejecting a true null hypothesis) or type II error (failing to reject a false null hypothesis).
  • Assumptions and requirements: Different tests have specific assumptions and requirements that must be met to accurately interpret results.

After reading this tutorial, you should have a much better understanding of hypothesis testing, one of the most important concepts in the field of Data Science. The majority of hypotheses are based on speculation about observed behavior, natural phenomena, or established theories.

1. What is hypothesis testing in statistics with example?

Hypothesis testing is a statistical method used to determine if there is enough evidence in a sample data to draw conclusions about a population. It involves formulating two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), and then collecting data to assess the evidence. An example: testing if a new drug improves patient recovery (Ha) compared to the standard treatment (H0) based on collected patient data.

2. What is hypothesis testing and its types?

Hypothesis testing is a statistical method used to make inferences about a population based on sample data. It involves formulating two hypotheses: the null hypothesis (H0), which represents the default assumption, and the alternative hypothesis (Ha), which contradicts H0. The goal is to assess the evidence and determine whether there is enough statistical significance to reject the null hypothesis in favor of the alternative hypothesis.

Types of hypothesis testing:

  • One-sample test: Used to compare a sample to a known value or a hypothesized value.
  • Two-sample test: Compares two independent samples to assess if there is a significant difference between their means or distributions.
  • Paired-sample test: Compares two related samples, such as pre-test and post-test data, to evaluate changes within the same subjects over time or under different conditions.
  • Chi-square test: Used to analyze categorical data and determine if there is a significant association between variables.
  • ANOVA (Analysis of Variance): Compares means across multiple groups to check if there is a significant difference between them.

3. What are the steps of hypothesis testing?

The steps of hypothesis testing are as follows:

  • Formulate the hypotheses: State the null hypothesis (H0) and the alternative hypothesis (Ha) based on the research question.
  • Set the significance level: Determine the acceptable level of error (alpha) for making a decision.
  • Collect and analyze data: Gather and process the sample data.
  • Compute test statistic: Calculate the appropriate statistical test to assess the evidence.
  • Make a decision: Compare the test statistic with critical values or p-values and determine whether to reject H0 in favor of Ha or not.
  • Draw conclusions: Interpret the results and communicate the findings in the context of the research question.
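The steps above can be sketched end to end in a few lines. This example assumes a one-sample z-test for a proportion, which is only one of many possible tests, and the counts are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: formulate the hypotheses (hypothetical example).
# H0: the true proportion equals 0.5; Ha: it differs from 0.5 (two-tailed).
p0 = 0.5

# Step 2: set the significance level.
alpha = 0.05

# Step 3: collect the data (made-up counts).
successes, n = 60, 100
p_hat = successes / n

# Step 4: compute the test statistic (one-sample z-test for a proportion).
se = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se

# Step 5: make a decision by comparing the p-value with alpha.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = p_value < alpha

# Step 6: draw a conclusion in plain language.
print(f"z = {z:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

With these numbers the p-value falls just under 0.05, so H0 is rejected at the 5% level; a stricter alpha of 0.01 would lead to the opposite decision.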

4. What are the 2 types of hypothesis testing?

  • One-tailed (or one-sided) test: Tests for the significance of an effect in only one direction, either positive or negative.
  • Two-tailed (or two-sided) test: Tests for the significance of an effect in both directions, allowing for the possibility of a positive or negative effect.

The choice between one-tailed and two-tailed tests depends on the specific research question and the directionality of the expected effect.

5. What are the 3 major types of hypothesis?

The three major types of hypotheses are:

  • Null Hypothesis (H0): Represents the default assumption, stating that there is no significant effect or relationship in the data.
  • Alternative Hypothesis (Ha): Contradicts the null hypothesis and proposes a specific effect or relationship that researchers want to investigate.
  • Nondirectional Hypothesis: An alternative hypothesis that doesn't specify the direction of the effect, leaving it open for both positive and negative possibilities.

About the author.

Avijeet Biswal

Avijeet is a Senior Research Analyst at Simplilearn. Passionate about Data Analytics, Machine Learning, and Deep Learning, Avijeet is also interested in politics, cricket, and football.

Hypothesis Testing, P Values, Confidence Intervals, and Significance

Definition/introduction.

Medical providers often rely on evidence-based medicine to guide decision-making in practice. Often a research hypothesis is tested with results provided, typically with p values, confidence intervals, or both. Additionally, statistical or research significance is estimated or determined by the investigators. Unfortunately, healthcare providers may have different comfort levels in interpreting these findings, which may affect the adequate application of the data.

Issues of Concern

Without a foundational understanding of hypothesis testing, p values, confidence intervals, and the difference between statistical and clinical significance, healthcare providers may be unable to make clinical decisions without relying purely on the level of significance deemed by the research investigators. Therefore, an overview of these concepts is provided to allow medical professionals to use their expertise to determine if results are reported sufficiently and if the study outcomes are clinically appropriate to be applied in healthcare practice.

Hypothesis Testing

Investigators conducting studies need research questions and hypotheses to guide analyses. Starting with broad research questions (RQs), investigators then identify a gap in current clinical practice or research. Any research problem or statement is grounded in a better understanding of relationships between two or more variables. For this article, we will use the following research question example:

Research Question: Is Drug 23 an effective treatment for Disease A?

Research questions do not directly imply specific guesses or predictions; we must formulate research hypotheses. A hypothesis is a predetermined declaration regarding the research question in which the investigator(s) makes a precise, educated guess about a study outcome. This is sometimes called the alternative hypothesis and ultimately allows the researcher to take a stance based on experience or insight from medical literature. An example of a hypothesis is below.

Research Hypothesis: Drug 23 will significantly reduce symptoms associated with Disease A compared to Drug 22.

The null hypothesis states that there is no statistical difference between groups based on the stated research hypothesis.

Researchers should be aware of journal recommendations when considering how to report p values, and manuscripts should remain internally consistent.

Regarding p values, as the number of individuals enrolled in a study (the sample size) increases, the likelihood of finding a statistically significant effect increases. With very large sample sizes, the p-value can be very low even for small differences.

Null Hypothesis: There are no significant differences in the reduction of symptoms for Disease A between Drug 23 and Drug 22.

The null hypothesis is deemed true until a study presents significant data to support rejecting the null hypothesis. Based on the results, the investigators will either reject the null hypothesis (if they found significant differences or associations) or fail to reject the null hypothesis (they could not provide proof that there were significant differences or associations).

To test a hypothesis, researchers obtain data on a representative sample to determine whether to reject or fail to reject a null hypothesis. In most research studies, it is not feasible to obtain data for an entire population. Using a sampling procedure allows for statistical inference, though this involves a certain possibility of error. [1]  When determining whether to reject or fail to reject the null hypothesis, mistakes can be made: Type I and Type II errors. Though it is impossible to ensure that these errors have not occurred, researchers should limit the possibilities of these faults. [2]

Significance

Significance is a term to describe the substantive importance of medical research. Statistical significance concerns the likelihood that results are due to chance. [3]  Healthcare providers should always delineate statistical significance from clinical significance; conflating the two is a common error when reviewing biomedical research. [4]  When conceptualizing findings reported as either significant or not significant, healthcare providers should not simply accept researchers' results or conclusions without considering the clinical significance. Healthcare professionals should consider the clinical importance of findings and understand both p values and confidence intervals so they do not have to rely on the researchers to determine the level of significance. [5]  One criterion often used to determine statistical significance is the utilization of p values.

P values are used in research to determine whether the sample estimate is significantly different from a hypothesized value. The p-value is the probability that the observed effect within the study would have occurred by chance if, in reality, there was no true effect. Conventionally, data yielding a p<0.05 or p<0.01 is considered statistically significant. While some have debated that the 0.05 level should be lowered, it is still widely practiced. [6]  Hypothesis testing allows us to determine whether an effect is statistically detectable but not, by itself, the size of the effect.

Examples of findings reported with p values are below:

Statement: Drug 23 reduced patients' symptoms compared to Drug 22. Patients who received Drug 23 (n=100) were 2.1 times less likely than patients who received Drug 22 (n = 100) to experience symptoms of Disease A, p<0.05.

Statement: Individuals who were prescribed Drug 23 experienced fewer symptoms (M = 1.3, SD = 0.7) compared to individuals who were prescribed Drug 22 (M = 5.3, SD = 1.9). This finding was statistically significant, p = 0.02.

For either statement, if the threshold had been set at 0.05, the null hypothesis (that there was no relationship) should be rejected, and we should conclude that there are significant differences. Noticeably, as can be seen in the two statements above, some researchers will report findings with < or > and others will provide an exact p-value (e.g., 0.000001) but never zero. [6]  When examining research, readers should understand how p values are reported. The best practice is to report all p values for all variables within a study design, rather than only providing p values for variables with significant findings. [7]  The inclusion of all p values provides evidence for study validity and limits suspicion for selective reporting/data mining.

While researchers have historically used p values, experts who find p values problematic encourage the use of confidence intervals. [8]  P-values alone do not allow us to understand the size or the extent of the differences or associations. [3]  In March 2016, the American Statistical Association (ASA) released a statement on p values, noting that scientific decision-making and conclusions should not be based on a fixed p-value threshold (e.g., 0.05). They recommend focusing on the significance of results in the context of study design, quality of measurements, and validity of data. Ultimately, the ASA statement noted that in isolation, a p-value does not provide strong evidence. [9]

When conceptualizing clinical work, healthcare professionals should consider p values with a concurrent appraisal of study design validity. For example, a p-value from a double-blinded randomized clinical trial (designed to minimize bias) should be weighted higher than one from a retrospective observational study. [7]  The p-value debate has smoldered since the 1950s, [10]  and replacement with confidence intervals has been suggested since the 1980s. [11]

Confidence Intervals

A confidence interval provides a range of values that, at a given confidence level (e.g., 95%), includes the true value of the statistical parameter within a targeted population. [12]  Most research uses a 95% CI, but investigators can set any level (e.g., 90% CI, 99% CI). [13]  A CI provides a range with the lower bound and upper bound limits of a difference or association that would be plausible for a population. [14]  Therefore, a CI of 95% indicates that if a study were to be carried out 100 times, the range would contain the true value in about 95 of them. [15]  Confidence intervals provide more evidence regarding the precision of an estimate compared to p-values. [6]

In consideration of the similar research example provided above, one could make the following statement with 95% CI:

Statement: Individuals who were prescribed Drug 23 had no symptoms after three days, which was significantly faster than those prescribed Drug 22; the mean difference in days to recovery between the two groups was 4.2 days (95% CI: 1.9 – 7.8).

It is important to note that the width of the CI is affected by the standard error and the sample size; reducing a study sample number will result in less precision of the CI (increase the width). [14]  A larger width indicates a smaller sample size or a larger variability. [16]  A researcher would want to increase the precision of the CI. For example, a 95% CI of 1.43 – 1.47 is much more precise than the one provided in the example above. In research and clinical practice, CIs provide valuable information on whether the interval includes or excludes any clinically significant values. [14]
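The dependence of CI width on sample size can be illustrated with a short calculation. This sketch assumes a normal-approximation interval for a mean with a known standard deviation; the function name `ci_width` and the inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(sd, n, confidence=0.95):
    """Width of a normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. about 1.96 for 95%
    standard_error = sd / sqrt(n)
    return 2 * z * standard_error

# Same variability (sd = 10), increasing sample sizes
for n in (25, 100, 400):
    print(n, round(ci_width(10, n), 2))
```

Because the standard error shrinks with the square root of n, quadrupling the sample size halves the width of the interval.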

Null values are sometimes used for differences with CI (zero for differential comparisons and 1 for ratios). However, CIs provide more information than that. [15]  Consider this example: A hospital implements a new protocol that reduced wait time for patients in the emergency department by an average of 25 minutes (95% CI: -2.5 – 41 minutes). Because the range crosses zero, implementing this protocol in different populations could result in longer wait times; however, the range is much higher on the positive side. Thus, while the p-value used to detect statistical significance for this may result in "not significant" findings, individuals should examine this range, consider the study design, and weigh whether or not it is still worth piloting in their workplace.

Similarly to p-values, 95% CIs cannot control for researchers' errors (e.g., study bias or improper data analysis). [14]  In consideration of whether to report p-values or CIs, researchers should examine journal preferences. When in doubt, reporting both may be beneficial. [13]  An example is below:

Reporting both: Individuals who were prescribed Drug 23 had no symptoms after three days, which was significantly faster than those prescribed Drug 22, p = 0.009. The mean difference in days to recovery between the two groups was 4.2 days (95% CI: 1.9 – 7.8).

Clinical Significance

Recall that clinical significance and statistical significance are two different concepts. Healthcare providers should remember that a study with statistically significant differences and large sample size may be of no interest to clinicians, whereas a study with smaller sample size and statistically non-significant results could impact clinical practice. [14]  Additionally, as previously mentioned, a non-significant finding may reflect the study design itself rather than relationships between variables.
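A quick calculation shows how a large sample can make a clinically trivial difference statistically significant. This is a sketch under simple assumptions (two equal-sized groups, a common known standard deviation, a two-tailed z-test); the effect size and sample sizes are invented.

```python
from math import sqrt
from statistics import NormalDist

def two_group_p_value(mean_diff, sd, n_per_group):
    """Two-tailed z-test p-value for a difference between two group means."""
    se = sd * sqrt(2 / n_per_group)          # standard error of the difference
    z = mean_diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# The same small difference (0.1 standard deviations) at two sample sizes
p_small_study = two_group_p_value(0.1, 1.0, 50)
p_large_study = two_group_p_value(0.1, 1.0, 5000)
print(p_small_study, p_large_study)
```

The effect is identical in both cases; only the sample size changes, yet the larger study yields a very small p-value. Whether a difference of that size matters is a clinical judgment that the p-value cannot make.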

Healthcare providers using evidence-based medicine to inform practice should use clinical judgment to determine the practical importance of studies through careful evaluation of the design, sample size, power, likelihood of type I and type II errors, data analysis, and reporting of statistical findings (p values, 95% CI or both). [4]  Interestingly, some experts have called for "statistically significant" or "not significant" to be excluded from work as statistical significance never has and will never be equivalent to clinical significance. [17]

The decision on what is clinically significant can be challenging, depending on the providers' experience and especially the severity of the disease. Providers should use their knowledge and experiences to determine the meaningfulness of study results and make inferences based not only on significant or insignificant results by researchers but through their understanding of study limitations and practical implications.

Nursing, Allied Health, and Interprofessional Team Interventions

All physicians, nurses, pharmacists, and other healthcare professionals should strive to understand the concepts in this chapter. These individuals should maintain the ability to review and incorporate new literature for evidence-based and safe care. 

Jones M, Gebski V, Onslow M, Packman A. Statistical power in stuttering research: a tutorial. Journal of speech, language, and hearing research : JSLHR. 2002 Apr:45(2):243-55     [PubMed PMID: 12003508]

Sedgwick P. Pitfalls of statistical hypothesis testing: type I and type II errors. BMJ (Clinical research ed.). 2014 Jul 3:349():g4287. doi: 10.1136/bmj.g4287. Epub 2014 Jul 3     [PubMed PMID: 24994622]

Fethney J. Statistical and clinical significance, and how to use confidence intervals to help interpret both. Australian critical care : official journal of the Confederation of Australian Critical Care Nurses. 2010 May:23(2):93-7. doi: 10.1016/j.aucc.2010.03.001. Epub 2010 Mar 29     [PubMed PMID: 20347326]

Hayat MJ. Understanding statistical significance. Nursing research. 2010 May-Jun:59(3):219-23. doi: 10.1097/NNR.0b013e3181dbb2cc. Epub     [PubMed PMID: 20445438]

Ferrill MJ, Brown DA, Kyle JA. Clinical versus statistical significance: interpreting P values and confidence intervals related to measures of association to guide decision making. Journal of pharmacy practice. 2010 Aug:23(4):344-51. doi: 10.1177/0897190009358774. Epub 2010 Apr 13     [PubMed PMID: 21507834]

Infanger D, Schmidt-Trucksäss A. P value functions: An underused method to present research results and to promote quantitative reasoning. Statistics in medicine. 2019 Sep 20:38(21):4189-4197. doi: 10.1002/sim.8293. Epub 2019 Jul 3     [PubMed PMID: 31270842]

Dorey F. Statistics in brief: Interpretation and use of p values: all p values are not equal. Clinical orthopaedics and related research. 2011 Nov:469(11):3259-61. doi: 10.1007/s11999-011-2053-1. Epub     [PubMed PMID: 21918804]

Liu XS. Implications of statistical power for confidence intervals. The British journal of mathematical and statistical psychology. 2012 Nov:65(3):427-37. doi: 10.1111/j.2044-8317.2011.02035.x. Epub 2011 Oct 25     [PubMed PMID: 22026811]

Tijssen JG, Kolm P. Demystifying the New Statistical Recommendations: The Use and Reporting of p Values. Journal of the American College of Cardiology. 2016 Jul 12:68(2):231-3. doi: 10.1016/j.jacc.2016.05.026. Epub     [PubMed PMID: 27386779]

Spanos A. Recurring controversies about P values and confidence intervals revisited. Ecology. 2014 Mar:95(3):645-51     [PubMed PMID: 24804448]

Freire APCF, Elkins MR, Ramos EMC, Moseley AM. Use of 95% confidence intervals in the reporting of between-group differences in randomized controlled trials: analysis of a representative sample of 200 physical therapy trials. Brazilian journal of physical therapy. 2019 Jul-Aug:23(4):302-310. doi: 10.1016/j.bjpt.2018.10.004. Epub 2018 Oct 16     [PubMed PMID: 30366845]

Dorey FJ. In brief: statistics in brief: Confidence intervals: what is the real result in the target population? Clinical orthopaedics and related research. 2010 Nov:468(11):3137-8. doi: 10.1007/s11999-010-1407-4. Epub     [PubMed PMID: 20532716]

Porcher R. Reporting results of orthopaedic research: confidence intervals and p values. Clinical orthopaedics and related research. 2009 Oct:467(10):2736-7. doi: 10.1007/s11999-009-0952-1. Epub 2009 Jun 30     [PubMed PMID: 19565303]

Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. British medical journal (Clinical research ed.). 1986 Mar 15:292(6522):746-50     [PubMed PMID: 3082422]

Cooper RJ, Wears RL, Schriger DL. Reporting research results: recommendations for improving communication. Annals of emergency medicine. 2003 Apr:41(4):561-4     [PubMed PMID: 12658257]

Doll H, Carney S. Statistical approaches to uncertainty: P values and confidence intervals unpacked. Equine veterinary journal. 2007 May:39(3):275-6     [PubMed PMID: 17520981]

Colquhoun D. The reproducibility of research and the misinterpretation of p-values. Royal Society open science. 2017 Dec:4(12):171085. doi: 10.1098/rsos.171085. Epub 2017 Dec 6     [PubMed PMID: 29308247]

  • Indian J Crit Care Med
  • v.23(Suppl 3); 2019 Sep

An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors

Priya Ranganathan

1 Department of Anesthesiology, Critical Care and Pain, Tata Memorial Hospital, Mumbai, Maharashtra, India

2 Department of Surgical Oncology, Tata Memorial Centre, Mumbai, Maharashtra, India

The second article in this series on biostatistics covers the concepts of sample, population, research hypotheses and statistical errors.

How to cite this article

Ranganathan P, Pramesh CS. An Introduction to Statistics: Understanding Hypothesis Testing and Statistical Errors. Indian J Crit Care Med 2019;23(Suppl 3):S230–S231.

Two papers quoted in this issue of the Indian Journal of Critical Care Medicine report the results of studies that aim to prove that a new intervention is better than (superior to) an existing treatment. In the ABLE study, the investigators wanted to show that transfusion of fresh red blood cells would be superior to standard-issue red cells in reducing 90-day mortality in ICU patients. 1 The PROPPR study was designed to prove that transfusion of a lower ratio of plasma and platelets to red cells would be superior to a higher ratio in decreasing 24-hour and 30-day mortality in critically ill patients. 2 These studies are known as superiority studies (as opposed to noninferiority or equivalence studies which will be discussed in a subsequent article).

SAMPLE VERSUS POPULATION

A sample represents a group of participants selected from the entire population. Since studies cannot be carried out on entire populations, researchers choose samples, which are representative of the population. This is similar to walking into a grocery store and examining a few grains of rice or wheat before purchasing an entire bag; we assume that the few grains that we select (the sample) are representative of the entire sack of grains (the population).

The results of the study are then extrapolated to generate inferences about the population. We do this using a process known as hypothesis testing. This means that the results of the study may not always be identical to the results we would expect to find in the population; i.e., there is the possibility that the study results may be erroneous.

HYPOTHESIS TESTING

A clinical trial begins with an assumption or belief, and then proceeds to either prove or disprove this assumption. In statistical terms, this belief or assumption is known as a hypothesis. Counterintuitively, what the researcher believes in (or is trying to prove) is called the “alternate” hypothesis, and the opposite is called the “null” hypothesis; every study has a null hypothesis and an alternate hypothesis. For superiority studies, the alternate hypothesis states that one treatment (usually the new or experimental treatment) is superior to the other; the null hypothesis states that there is no difference between the treatments (the treatments are equal). For example, in the ABLE study, we start by stating the null hypothesis—there is no difference in mortality between groups receiving fresh RBCs and standard-issue RBCs. We then state the alternate hypothesis—There is a difference between groups receiving fresh RBCs and standard-issue RBCs. It is important to note that we have stated that the groups are different, without specifying which group will be better than the other. This is known as a two-tailed hypothesis and it allows us to test for superiority on either side (using a two-sided test). This is because, when we start a study, we are not 100% certain that the new treatment can only be better than the standard treatment—it could be worse, and if it is so, the study should pick it up as well. One tailed hypothesis and one-sided statistical testing is done for non-inferiority studies, which will be discussed in a subsequent paper in this series.

STATISTICAL ERRORS

There are two possibilities to consider when interpreting the results of a superiority study. The first possibility is that there is truly no difference between the treatments but the study finds that they are different. This is called a Type-1 error or false-positive error or alpha error. This means falsely rejecting the null hypothesis.

The second possibility is that there is a difference between the treatments and the study does not pick up this difference. This is called a Type 2 error or false-negative error or beta error. This means falsely accepting the null hypothesis.

The power of the study is the ability to detect a difference between groups and is the converse of the beta error; i.e., power = 1-beta error. Alpha and beta errors are finalized when the protocol is written and form the basis for sample size calculation for the study. In an ideal world, we would not like any error in the results of our study; however, we would need to do the study in the entire population (infinite sample size) to be able to get a 0% alpha and beta error. Accepting these two errors enables us to do studies with realistic sample sizes, with the compromise that there is a small possibility that the results may not always reflect the truth. The basis for this will be discussed in a subsequent paper in this series dealing with sample size calculation.

Conventionally, type 1 or alpha error is set at 5%. This means that if there is truly no difference between the groups, we accept only a 5% probability that the study will nevertheless find one by chance (a false positive). Type 2 or beta error is usually set between 10% and 20%; therefore, the power of the study is 90% or 80%. This means that if there is a difference between groups, the study has a 90% (or 80%) probability of detecting it. For example, in the ABLE study, sample size was calculated with a type 1 error of 5% (two-sided) and power of 90% (type 2 error of 10%) (1).
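To show how the chosen alpha and power feed into a sample size, the sketch below uses a common normal-approximation formula for comparing two proportions. It is a simplified illustration, not the method used in the ABLE study; the function name `sample_size_per_group` and the event rates of 25% and 20% are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.90):
    """Approximate patients needed per group to detect p1 versus p2.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1-p2)^2,
    a standard normal approximation for a two-sided test.
    """
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # two-sided type 1 error
    z_beta = norm.inv_cdf(power)           # power = 1 - type 2 error
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical event rates, for illustration only.
for power in (0.80, 0.90):
    n = sample_size_per_group(p1=0.25, p2=0.20, alpha=0.05, power=power)
    print(f"power = {power:.0%}: about {n} patients per group")
```

Raising the power from 80% to 90% lowers the type 2 error, and the required sample size grows accordingly.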

Table 1 gives a summary of the two types of statistical errors with an example.

Table 1: Statistical errors

In the next article in this series, we will look at the meaning and interpretation of the ‘p’ value and confidence intervals for hypothesis testing.

Source of support: Nil

Conflict of interest: None

9.1 Null and Alternative Hypotheses

The actual test begins by considering two hypotheses. They are called the null hypothesis and the alternative hypothesis. These hypotheses contain opposing viewpoints.

H0, the null hypothesis: a statement of no difference between sample means or proportions, or no difference between a sample mean or proportion and a population mean or proportion. In other words, the difference equals 0.

Ha, the alternative hypothesis: a claim about the population that is contradictory to H0 and what we conclude when we reject H0.

Since the null and alternative hypotheses are contradictory, you must examine the evidence to decide whether it is sufficient to reject the null hypothesis. The evidence is in the form of sample data.

After you have determined which hypothesis the sample supports, you make a decision. There are two options: reject H0 if the sample information favors the alternative hypothesis, or do not reject H0 (decline to reject H0) if the sample information is insufficient to reject the null hypothesis.

Mathematical Symbols Used in H0 and Ha:

H0 always has a symbol with an equal sign in it (=, ≤, or ≥). Ha never has a symbol with an equal sign in it (≠, <, or >). The choice of symbol depends on the wording of the hypothesis test. However, be aware that many researchers use = in the null hypothesis, even with > or < as the symbol in the alternative hypothesis. This practice is acceptable because we only make the decision to reject or not reject the null hypothesis.
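The symbol in Ha determines which tail (or tails) of the distribution counts as evidence against H0. The sketch below is a minimal illustration assuming a test statistic that follows the standard normal distribution under H0; the function name `z_test_decision`, the 5% significance level, and the rule "reject when the p-value is at most alpha" are assumptions made for the example.

```python
from statistics import NormalDist


def z_test_decision(z, alternative, alpha=0.05):
    """Return (p_value, decision) for a z statistic.

    `alternative` is the symbol used in Ha: "!=", ">" or "<".
    """
    norm = NormalDist()
    if alternative == "!=":   # two-tailed: evidence in both tails
        p_value = 2 * (1 - norm.cdf(abs(z)))
    elif alternative == ">":  # right-tailed: only large z counts
        p_value = 1 - norm.cdf(z)
    elif alternative == "<":  # left-tailed: only small z counts
        p_value = norm.cdf(z)
    else:
        raise ValueError('alternative must be "!=", ">" or "<"')
    decision = "reject H0" if p_value <= alpha else "do not reject H0"
    return p_value, decision


# The same hypothetical statistic leads to different decisions
# depending on the symbol in Ha.
for symbol in ("!=", ">", "<"):
    p_value, decision = z_test_decision(z=1.8, alternative=symbol)
    print(f"Ha uses {symbol:>2}: p = {p_value:.4f} -> {decision}")
```

Note that the only outcomes are "reject H0" and "do not reject H0"; the code never reports that H0 has been accepted.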

Example 9.1

H0: No more than 30 percent of the registered voters in Santa Clara County voted in the primary election. p ≤ 0.30

Ha: More than 30 percent of the registered voters in Santa Clara County voted in the primary election. p > 0.30

A medical trial is conducted to test whether or not a new medicine reduces cholesterol by 25 percent. State the null and alternative hypotheses.

Example 9.2

We want to test whether the mean GPA of students in American colleges is different from 2.0 (out of 4.0). The null and alternative hypotheses are the following:

  • H0: μ = 2.0
  • Ha: μ ≠ 2.0

We want to test whether the mean height of eighth graders is 66 inches. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H0: μ __ 66
  • Ha: μ __ 66

Example 9.3

We want to test if college students take fewer than five years to graduate from college, on average. The null and alternative hypotheses are the following:

  • H0: μ ≥ 5
  • Ha: μ < 5

We want to test if it takes fewer than 45 minutes to teach a lesson plan. State the null and alternative hypotheses. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H0: μ __ 45
  • Ha: μ __ 45

Example 9.4

An article on school standards stated that about half of all students in France, Germany, and Israel take advanced placement exams and a third of the students pass. The same article stated that 6.6 percent of U.S. students take advanced placement exams and 4.4 percent pass. Test if the percentage of U.S. students who take advanced placement exams is more than 6.6 percent. State the null and alternative hypotheses.

  • H0: p ≤ 0.066
  • Ha: p > 0.066

On a state driver’s test, about 40 percent pass the test on the first try. We want to test if more than 40 percent pass on the first try. Fill in the correct symbol (=, ≠, ≥, <, ≤, >) for the null and alternative hypotheses.

  • H0: p __ 0.40
  • Ha: p __ 0.40

Collaborative Exercise

Bring to class a newspaper, some news magazines, and some internet articles. In groups, find articles from which your group can write null and alternative hypotheses. Discuss your hypotheses with the rest of the class.

This book may not be used in the training of large language models or otherwise be ingested into large language models or generative AI offerings without OpenStax's permission.

Want to cite, share, or modify this book? This book uses the Creative Commons Attribution License and you must attribute Texas Education Agency (TEA). The original material is available at: https://www.texasgateway.org/book/tea-statistics . Changes were made to the original material, including updates to art, structure, and other content updates.

Access for free at https://openstax.org/books/statistics/pages/1-introduction
  • Authors: Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Statistics
  • Publication date: Mar 27, 2020
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/statistics/pages/1-introduction
  • Section URL: https://openstax.org/books/statistics/pages/9-1-null-and-alternative-hypotheses

© Jan 23, 2024 Texas Education Agency (TEA). The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.
