Module 10: Hypothesis Testing With Two Samples

Putting It Together: Hypothesis Testing with Two Samples

Let’s summarize.

  • The steps for performing a hypothesis test for two population means with unknown standard deviations are generally the same as the steps for conducting a hypothesis test for one population mean with unknown standard deviation, using a t-distribution.
  • Because the population standard deviations are not known, the sample standard deviations are used for calculations.
  • When the sum of the sample sizes is more than 30, a normal distribution can be used to approximate the Student’s t-distribution.
  • The difference of two proportions is approximately normal if there are at least five successes and five failures in each sample.
  • When conducting a hypothesis test for a difference of two proportions, the random samples must be independent and the population must be at least ten times the sample size.
  • When calculating the standard error for the difference in sample proportions, the pooled proportion must be used.
  • When two measurements (samples) are drawn from the same pair of individuals or objects, the differences from the sample are used to conduct the hypothesis test.
  • The distribution that is used to conduct the hypothesis test on the differences is a t-distribution.
  • Provided by : Lumen Learning. License : CC BY: Attribution
  • Introductory Statistics. Authored by : Barbara Illowsky, Susan Dean. Provided by : OpenStax. Located at : https://openstax.org/books/introductory-statistics/pages/1-introduction . License : CC BY: Attribution . License Terms : Access for free at https://openstax.org/books/introductory-statistics/pages/1-introduction


Introduction to Data Science I & II

Two sample testing

In many applications there is an interest in comparing two random samples; for example, investigating differences in cholesterol levels between two groups of patients. It is often done using a hypothesis test - hence the name “two sample testing”. This is also called A/B testing.

The natural hypotheses for this situation are:

\(H_0\) : the two samples are generated from the same distribution.

\(H_A\) : the two samples are generated from two different distributions.

The test statistic is normally based on the difference in a specified sample summary; for example, difference in means, or medians, or standard deviations (if we expect the samples to differ in their variability).

We illustrate this with a classic diabetes dataset from the National Institute of Diabetes and Digestive and Kidney Diseases. The subjects of this dataset are females at least 21 years old, and the goal was to predict diabetes status that is summarized in the column called “Outcome”.

We will focus in this example on BMI. Below are boxplots for the two diabetes status groups.

[Figure: boxplots of BMI for the two diabetes status groups]

There are several observations from the above plots:

The distributions of BMI in the two groups seem different; for example, the median BMI is larger in diabetics.

There are some subjects for which the recorded value for BMI is equal to 0; this suggests that missing data were recorded as 0 and we will have to take that into account in our analysis.

Below, we create two arrays that contain the BMI values in the two groups after removing the missing data.
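The filtering step can be sketched as follows. This is a minimal illustration, assuming the data are already available as NumPy arrays; the names `bmi` and `outcome` and the ten toy values are hypothetical and are not the actual dataset, so the resulting numbers will not match the ones quoted in the text.

```python
import numpy as np

# Hypothetical toy values, NOT the real dataset:
# BMI readings and diabetes status (1 = diabetic, 0 = non-diabetic).
bmi = np.array([33.6, 26.6, 0.0, 28.1, 43.1, 25.6, 31.0, 0.0, 30.5, 37.6])
outcome = np.array([1, 0, 1, 0, 1, 0, 1, 0, 0, 1])

# A recorded BMI of 0 is treated as missing and dropped.
valid = bmi > 0

bmi_controls = bmi[valid & (outcome == 0)]
bmi_cases = bmi[valid & (outcome == 1)]

# Test statistic: difference in sample medians (cases minus controls).
observed_diff = np.median(bmi_cases) - np.median(bmi_controls)
print(len(bmi_controls), len(bmi_cases), observed_diff)
```

Boolean masks keep the two conditions (non-missing value, group membership) explicit, which makes it easy to check how many observations were dropped.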

We have two samples here of size \(n_0=491\) and \(n_1=266\), and the null hypothesis we investigate is: the BMI distributions in diabetic and non-diabetic subjects are the same.

The test statistic we will use is the difference in sample medians, with an observed value of 4.2.

The next step is to obtain an approximation for the sampling distribution of our test statistic. The procedure we implement, called a permutation test, uses the following observations:

If the null hypothesis is true: a BMI value is equally likely to be sampled from diabetics and non-diabetics

If the null hypothesis is true: all rearrangements (permutations) of BMI values among the two groups are equally likely

If the null hypothesis is true: the observed test statistic can be viewed as a sample from the distribution of median differences of permuted BMI values in two groups.

It suggests the following simulation to learn the null distribution for the test statistic:

Shuffle (permute) the BMI values

Assign \(n_0=491\) to “Group A” and the rest to “Group B” (to maintain the two sample sizes)

Find the differences between medians of the two shuffled (permuted) groups

The generated distribution and the value of the test statistic are used to calculate a p-value.

We first illustrate how to create shuffled samples and calculate the corresponding test statistic. We use the numpy function random.permutation to create an array that has the same values but with order that is shuffled: the first part of the new array will correspond to the control group.

We then repeat the procedure 5000 times to create an approximation for the distribution of our test statistic, which is saved in the array differences.
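The shuffle-and-repeat procedure described above can be sketched as follows. This is an illustrative version, not the original notebook code: the function name `permutation_test_median`, the use of NumPy's `Generator.permutation` (rather than the legacy `random.permutation`), the fixed seeds, and the synthetic stand-in data are all assumptions made for the example.

```python
import numpy as np

def permutation_test_median(group_a, group_b, n_permutations=5000, seed=0):
    """Approximate the null distribution of the difference in medians (b minus a)."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    observed = np.median(group_b) - np.median(group_a)

    differences = np.empty(n_permutations)
    for i in range(n_permutations):
        # Shuffle all values, then split so the two sample sizes are preserved.
        shuffled = rng.permutation(pooled)
        differences[i] = np.median(shuffled[n_a:]) - np.median(shuffled[:n_a])

    # Two-sided p-value: share of shuffles at least as extreme as the observed value.
    p_value = np.mean(np.abs(differences) >= abs(observed))
    return observed, differences, p_value

# Synthetic stand-in data (an assumption for illustration only).
data_rng = np.random.default_rng(1)
controls = data_rng.normal(30.0, 6.0, size=200)
cases = data_rng.normal(35.0, 6.0, size=100)

observed, differences, p_value = permutation_test_median(controls, cases)
print(observed, p_value)
```

A histogram of `differences` with a vertical line at `observed` gives the kind of picture discussed next; swapping `np.median` for `np.std` would give the standard-deviation version of the test.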

[Figure: histogram of the permuted differences in medians, with the observed test statistic]

From the above histogram, we can see that there is strong evidence against the null hypothesis that the distributions of BMI in cases and controls are the same.

Please note that the choice of test statistic could have a big impact on the conclusions from the test. Below, we repeat the procedure using as test statistic the difference in standard deviations of the two samples. There is no evidence, when using this statistic, that the distributions are different.

[Figure: histogram of the permuted differences in standard deviations, with the observed test statistic]


5.5 - Hypothesis Testing for Two-Sample Proportions

We are now going to develop the hypothesis test for the difference of two proportions for independent samples. The hypothesis test follows the same steps as one group.

These notes are going to go into a little bit of math and formulas to help demonstrate the logic behind hypothesis testing for two groups. If this starts to get a little confusing, just skim over it for a general understanding! Remember we can rely on the software to do the calculations for us, but it is good to have a basic understanding of the logic!

We will use the sampling distribution of \(\hat{p}_1-\hat{p}_2\) as we did for the confidence interval.

For a test for two proportions, we are interested in the difference between two groups. If the difference is zero, then they are not different (i.e., they are equal). Therefore, the null hypothesis will always be:

\(H_0\colon p_1-p_2=0\)

Another way to look at it is \(H_0\colon p_1=p_2\). This is worth stopping to think about. Remember, in hypothesis testing, we assume the null hypothesis is true. In this case, it means that \(p_1\) and \(p_2\) are equal. Under this assumption, then \(\hat{p}_1\) and \(\hat{p}_2\) are both estimating the same proportion. Think of this proportion as \(p^*\).

Therefore, the sampling distribution of both proportions, \(\hat{p}_1\) and \(\hat{p}_2\), will, under certain conditions, be approximately normal centered around \(p^*\), with standard error \(\sqrt{\dfrac{p^*(1-p^*)}{n_i}}\), for \(i=1, 2\).

We take this into account by finding an estimate for this \(p^*\) using the two-sample proportions. We can calculate an estimate of \(p^*\) using the following formula:

\(\hat{p}^*=\dfrac{x_1+x_2}{n_1+n_2}\)

This value is the total number in the desired categories \((x_1+x_2)\) from both samples over the total number of sampling units in the combined sample \((n_1+n_2)\).

Putting everything together, if we assume \(p_1=p_2\), then the sampling distribution of \(\hat{p}_1-\hat{p}_2\) will be approximately normal with mean 0 and standard error of \(\sqrt{p^*(1-p^*)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}\), under certain conditions.

Therefore, replacing \(p^*\) with its estimate \(\hat{p}^*\), the test statistic

\(z^*=\dfrac{(\hat{p}_1-\hat{p}_2)-0}{\sqrt{\hat{p}^*(1-\hat{p}^*)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)

...will follow a standard normal distribution.

Finally, we can develop our hypothesis test for \(p_1-p_2\).

Hypothesis Testing for Two-Sample Proportions

Conditions:

\(n_1\hat{p}_1\), \(n_1(1-\hat{p}_1)\), \(n_2\hat{p}_2\), and \(n_2(1-\hat{p}_2)\) are all greater than five

Test Statistic:

\(z^*=\dfrac{\hat{p}_1-\hat{p}_2-0}{\sqrt{\hat{p}^*(1-\hat{p}^*)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}\)

...where \(\hat{p}^*=\dfrac{x_1+x_2}{n_1+n_2}\).

The critical values, p-values, and decisions will all follow the same steps as those from a hypothesis test for a one-sample proportion.
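The pooled test statistic above can be computed with a few lines of standard-library Python. This is a minimal sketch: the function name `two_proportion_z_test` and the counts in the usage lines are hypothetical, and the p-value shown is for a two-sided alternative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled z test for H0: p1 - p2 = 0, returning (z, two-sided p-value)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled estimate of the common proportion under H0.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Hypothetical counts: 45 successes out of 100 versus 30 out of 100.
# All four success/failure counts exceed five, so the conditions are met.
z, p = two_proportion_z_test(x1=45, n1=100, x2=30, n2=100)
print(round(z, 3), round(p, 3))
```

For a one-sided alternative, the p-value would instead be the single tail area beyond `z` in the direction of the alternative.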


Inference for Comparing 2 Population Means (HT for 2 Means, independent samples)

More of the good stuff! We will need to know how to label the null and alternative hypotheses, calculate the test statistic, and then reach our conclusion using the critical value method or the p-value method.

The Test Statistic for a Test of 2 Means from Independent Samples:

[latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}}[/latex]

What the different symbols mean:

[latex]n_1[/latex] is the sample size for the first group

[latex]n_2[/latex] is the sample size for the second group

[latex]df[/latex], the degrees of freedom, is the smaller of [latex]n_1 - 1[/latex] and [latex]n_2 - 1[/latex]

[latex]\mu_1[/latex] is the population mean from the first group

[latex]\mu_2[/latex] is the population mean from the second group

[latex]\bar{x_1}[/latex] is the sample mean for the first group

[latex]\bar{x_2}[/latex] is the sample mean for the second group

[latex]s_1[/latex] is the sample standard deviation for the first group

[latex]s_2[/latex] is the sample standard deviation for the second group

[latex]\alpha[/latex] is the significance level, usually given within the problem, or if not given, we assume it to be 5% or 0.05

Assumptions when conducting a Test for 2 Means from Independent Samples:

  • We do not know the population standard deviations, and we do not assume they are equal
  • The two samples or groups are independent
  • Both samples are simple random samples
  • Both populations are Normally distributed OR both samples are large ([latex]n_1 > 30[/latex] and [latex]n_2 > 30[/latex])

Steps to conduct the Test for 2 Means from Independent Samples:

  • Identify all the symbols listed above (all the stuff that will go into the formulas). This includes [latex]n_1[/latex] and [latex]n_2[/latex], [latex]df[/latex], [latex]\mu_1[/latex] and [latex]\mu_2[/latex], [latex]\bar{x_1}[/latex] and [latex]\bar{x_2}[/latex], [latex]s_1[/latex] and [latex]s_2[/latex], and [latex]\alpha[/latex]
  • Identify the null and alternative hypotheses
  • Calculate the test statistic, [latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}}[/latex]
  • Find the critical value(s) OR the p-value OR both
  • Apply the Decision Rule
  • Write up a conclusion for the test

Example 1: Study on the effectiveness of stents for stroke patients [1]

In this study, researchers randomly assigned stroke patients to two groups: one received the current standard care (control) and the other received a stent surgery in addition to the standard care (stent treatment). If the stents work, the treatment group should have a lower average disability score. Do the results give convincing statistical evidence that the stent treatment reduces the average disability from stroke?

Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with averages from two samples or groups (the patients with stent treatment and patients receiving the standard care), so we will conduct a Test of 2 Means.

  • [latex]n_1 = 98[/latex] is the sample size for the first group
  • [latex]n_2 = 93[/latex] is the sample size for the second group
  • [latex]df[/latex], the degrees of freedom, is the smaller of [latex]98 - 1 = 97[/latex] and [latex]93 - 1 = 92[/latex], so [latex]df = 92[/latex]
  • [latex]\bar{x_1} = 2.26[/latex] is the sample mean for the first group
  • [latex]\bar{x_2} = 3.23[/latex] is the sample mean for the second group
  • [latex]s_1 = 1.78[/latex] is the sample standard deviation for the first group
  • [latex]s_2 = 1.78[/latex] is the sample standard deviation for the second group
  • [latex]\alpha = 0.05[/latex] (we were not told a specific value in the problem, so we are assuming it is 5%)
  • One additional assumption we extend from the null hypothesis is that [latex]\mu_1 - \mu_2 = 0[/latex]; this means that in our formula, those variables cancel out
  • [latex]H_{0}: \mu_1 = \mu_2[/latex]
  • [latex]H_{A}: \mu_1 < \mu_2[/latex]
  • [latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}} = \displaystyle \frac{(2.26 - 3.23) - 0}{\sqrt{\displaystyle \frac{1.78^2}{98} + \displaystyle \frac{1.78^2}{93}}} = -3.76[/latex]
  • StatDisk : We can conduct this test using StatDisk. The nice thing about StatDisk is that it will also compute the test statistic. From the main menu above we click on Analysis, Hypothesis Testing, and then Mean Two Independent Samples. From there enter the 0.05 significance, along with the specific values as outlined in the picture below in Step 2. Notice the alternative hypothesis is the [latex]<[/latex] option. Enter the sample size, mean, and standard deviation for each group, and make sure that unequal variances is selected. Now we click on Evaluate. If you check the values, the test statistic is reported in the Step 3 display, as well as the P-Value of 0.00011.
  • Applying the Decision Rule: We now compare this to our significance level, which is 0.05. If the p-value is smaller than or equal to the alpha level, we have enough evidence for our claim, otherwise we do not. Here, the p-value is [latex]0.00011[/latex], which is definitely smaller than [latex]\alpha = 0.05[/latex], so we have enough evidence for the alternative hypothesis…but what does this mean?
  • Conclusion: Because our p-value of [latex]0.00011[/latex] is less than our [latex]\alpha[/latex] level of [latex]0.05[/latex], we reject [latex]H_{0}[/latex]. We have convincing statistical evidence that the stent treatment reduces the average disability from stroke.
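As a cross-check on the hand calculation, the test statistic and the conservative degrees of freedom can be reproduced from the summary statistics alone. This is a sketch using only the standard library; the function name `two_sample_t` is hypothetical, and the Welch-Satterthwaite degrees of freedom are included as an assumption about what software typically reports when unequal variances are selected.

```python
from math import sqrt

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """t statistic for two independent samples without assuming equal variances."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = ((mean1 - mean2) - null_diff) / sqrt(v1 + v2)
    # Conservative rule used in the text: the smaller of n1 - 1 and n2 - 1.
    df_conservative = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation (assumed to be what software uses).
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

# Summary statistics from Example 1.
t, df_small, df_welch = two_sample_t(2.26, 1.78, 98, 3.23, 1.78, 93)
print(round(t, 2), df_small)  # -3.76 92
```

The Welch value always falls between the conservative value and n1 + n2 - 2, which is why a software p-value can be slightly smaller than one read from a table using the conservative degrees of freedom.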

Example 2: Home Run Distances

In 1998, Sammy Sosa and Mark McGwire (2 players in Major League Baseball) were on pace to set a new home run record. At the end of the season McGwire ended up with 70 home runs, and Sosa ended up with 66. The home run distances were recorded and compared (sometimes a player’s home run distance is used to measure their “power”). Do the results give convincing statistical evidence that the home run distances are different from each other? Who would you say “hit the ball farther” in this comparison?

Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with averages from two samples or groups (the home run distances), so we will conduct a Test of 2 Means.

  • [latex]n_1 = 70[/latex] is the sample size for the first group
  • [latex]n_2 = 66[/latex] is the sample size for the second group
  • [latex]df[/latex], the degrees of freedom, is the smaller of [latex]70 - 1 = 69[/latex] and [latex]66 - 1 = 65[/latex], so [latex]df = 65[/latex]
  • [latex]\bar{x_1} = 418.5[/latex] is the sample mean for the first group
  • [latex]\bar{x_2} = 404.8[/latex] is the sample mean for the second group
  • [latex]s_1 = 45.5[/latex] is the sample standard deviation for the first group
  • [latex]s_2 = 35.7[/latex] is the sample standard deviation for the second group
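  • [latex]\alpha = 0.05[/latex] (we were not told a specific value in the problem, so we are assuming it is 5%)
  • [latex]H_{0}: \mu_1 = \mu_2[/latex]
  • [latex]H_{A}: \mu_1 \neq \mu_2[/latex]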
  • [latex]t = \displaystyle \frac{(\bar{x_1} - \bar{x_2}) - (\mu_1 - \mu_2)}{\sqrt{\displaystyle \frac{s_1^2}{n_1} + \displaystyle \frac{s_2^2}{n_2}}} = \displaystyle \frac{(418.5 - 404.8) - 0}{\sqrt{\displaystyle \frac{45.5^2}{70} + \displaystyle \frac{35.7^2}{66}}} = 1.96[/latex]
  • StatDisk : We can conduct this test using StatDisk. The nice thing about StatDisk is that it will also compute the test statistic. From the main menu above we click on Analysis, Hypothesis Testing, and then Mean Two Independent Samples. From there enter the 0.05 significance, along with the specific values as outlined in the picture below in Step 2. Notice the alternative hypothesis is the [latex]\neq[/latex] option. Enter the sample size, mean, and standard deviation for each group, and make sure that unequal variances is selected. Now we click on Evaluate. If you check the values, the test statistic is reported in the Step 3 display, as well as the P-Value of 0.05221.
  • Applying the Decision Rule: We now compare this to our significance level, which is 0.05. If the p-value is smaller than or equal to the alpha level, we have enough evidence for our claim, otherwise we do not. Here, the p-value is [latex]0.05221[/latex], which is larger than [latex]\alpha = 0.05[/latex], so we do not have enough evidence for the alternative hypothesis…but what does this mean?
  • Conclusion: Because our p-value of [latex]0.05221[/latex] is larger than our [latex]\alpha[/latex] level of [latex]0.05[/latex], we fail to reject [latex]H_{0}[/latex]. We do not have convincing statistical evidence that the home run distances are different.
  • Follow-up commentary: But what does this mean? There actually was a difference, right? If we take McGwire’s average and subtract Sosa’s average we get a difference of 13.7. What this result indicates is that the difference is not statistically significant; it could be due more to random chance than something meaningful. Other factors, such as sample size, could also be a determining factor (with a larger sample size, the difference may have been more meaningful).
[1] Adapted from the Skew The Script curriculum (skewthescript.org), licensed under CC BY-NC-SA 4.0.

Basic Statistics Copyright © by Allyn Leon is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.




Hypothesis Testing

Hypothesis testing is a tool for making statistical inferences about the population data. It is an analysis tool that tests assumptions and determines how likely something is within a given standard of accuracy. Hypothesis testing provides a way to verify whether the results of an experiment are valid.

A null hypothesis and an alternative hypothesis are set up before performing the hypothesis testing. This helps to arrive at a conclusion regarding the sample obtained from the population. In this article, we will learn more about hypothesis testing, its types, steps to perform the testing, and associated examples.

What is Hypothesis Testing in Statistics?

Hypothesis testing uses sample data from the population to draw useful conclusions regarding the population probability distribution. It tests an assumption made about the data using different types of hypothesis testing methodologies. The hypothesis testing results in either rejecting or not rejecting the null hypothesis.

Hypothesis Testing Definition

Hypothesis testing can be defined as a statistical tool that is used to identify if the results of an experiment are meaningful or not. It involves setting up a null hypothesis and an alternative hypothesis. These two hypotheses will always be mutually exclusive. This means that if the null hypothesis is true then the alternative hypothesis is false and vice versa. An example of hypothesis testing is setting up a test to check if a new medicine works on a disease in a more efficient manner.

Null Hypothesis

The null hypothesis is a concise mathematical statement that is used to indicate that there is no difference between two possibilities. In other words, there is no difference between certain characteristics of data. This hypothesis assumes that the outcomes of an experiment are based on chance alone. It is denoted as \(H_{0}\). Hypothesis testing is used to conclude if the null hypothesis can be rejected or not. Suppose an experiment is conducted to check if girls are shorter than boys at the age of 5. The null hypothesis will say that they are the same height.

Alternative Hypothesis

The alternative hypothesis is an alternative to the null hypothesis. It is used to show that the observations of an experiment are due to some real effect. It indicates that there is a statistically significant difference between two possible outcomes and can be denoted as \(H_{1}\) or \(H_{a}\). For the above-mentioned example, the alternative hypothesis would be that girls are shorter than boys at the age of 5.

Hypothesis Testing P Value

In hypothesis testing, the p value is used to indicate whether the results obtained after conducting a test are statistically significant or not. It is the probability, computed assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed. This value is always a number between 0 and 1. The p value is compared to an alpha level, \(\alpha\), also called the significance level. The alpha level can be defined as the acceptable risk of incorrectly rejecting the null hypothesis. The alpha level is usually chosen between 1% and 5%.

Hypothesis Testing Critical region

All sets of values that lead to rejecting the null hypothesis lie in the critical region. Furthermore, the value that separates the critical region from the non-critical region is known as the critical value.

Hypothesis Testing Formula

Depending upon the type of data available and the size, different types of hypothesis testing are used to determine whether the null hypothesis can be rejected or not. The hypothesis testing formulas for some important test statistics are given below:

  • z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\). \(\overline{x}\) is the sample mean, \(\mu\) is the population mean, \(\sigma\) is the population standard deviation and n is the size of the sample.
  • t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\). s is the sample standard deviation.
  • \(\chi ^{2} = \sum \frac{(O_{i}-E_{i})^{2}}{E_{i}}\). \(O_{i}\) is the observed value and \(E_{i}\) is the expected value.

We will learn more about these test statistics in the upcoming section.

Types of Hypothesis Testing

Selecting the correct test for performing hypothesis testing can be confusing. These tests are used to determine a test statistic on the basis of which the null hypothesis can either be rejected or not rejected. Some of the important tests used for hypothesis testing are given below.

Hypothesis Testing Z Test

A z test is a way of hypothesis testing that is used for a large sample size (n ≥ 30). It is used to determine whether there is a difference between the population mean and the sample mean when the population standard deviation is known. It can also be used to compare the mean of two samples. It is used to compute the z test statistic. The formulas are given as follows:

  • One sample: z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\).
  • Two samples: z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).

Hypothesis Testing t Test

The t test is another method of hypothesis testing that is used for a small sample size (n < 30). It is also used to compare the sample mean and population mean. However, the population standard deviation is not known. Instead, the sample standard deviation is known. The mean of two samples can also be compared using the t test.

  • One sample: t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\).
  • Two samples: t = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}\).

Hypothesis Testing Chi Square

The Chi square test is a hypothesis testing method that is used to check whether the variables in a population are independent or not. It is used when the test statistic is chi-squared distributed.
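The chi-square statistic \(\chi ^{2} = \sum \frac{(O_{i}-E_{i})^{2}}{E_{i}}\) for a test of independence can be sketched as follows. The function name `chi_square_statistic` and the 2×2 table of counts are hypothetical; expected counts are computed in the usual way as row total times column total divided by the grand total.

```python
def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence of the row and column variables.
            e = row_totals[i] * col_totals[j] / total
            stat += (o - e) ** 2 / e

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical 2x2 table of counts (made up for illustration).
table = [[30, 20],
         [20, 30]]
stat, df = chi_square_statistic(table)
print(stat, df)
```

The statistic is then compared with the chi-square critical value for the computed degrees of freedom, for example 3.841 at the 0.05 level with 1 degree of freedom.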

One Tailed Hypothesis Testing

One tailed hypothesis testing is done when the rejection region is only in one direction. It can also be known as directional hypothesis testing because the effects can be tested in one direction only. This type of testing is further classified into the right tailed test and left tailed test.

Right Tailed Hypothesis Testing

The right tail test is also known as the upper tail test. This test is used to check whether the population parameter is greater than some value. The null and alternative hypotheses for this test are given as follows:

\(H_{0}\): The population parameter is ≤ some value

\(H_{1}\): The population parameter is > some value.

If the test statistic has a greater value than the critical value then the null hypothesis is rejected.

[Figure: Right Tail Hypothesis Testing]

Left Tailed Hypothesis Testing

The left tail test is also known as the lower tail test. It is used to check whether the population parameter is less than some value. The hypotheses for this hypothesis testing can be written as follows:

\(H_{0}\): The population parameter is ≥ some value

\(H_{1}\): The population parameter is < some value.

The null hypothesis is rejected if the test statistic has a value lesser than the critical value.

[Figure: Left Tail Hypothesis Testing]

Two Tailed Hypothesis Testing

In this hypothesis testing method, the critical region lies on both sides of the sampling distribution. It is also known as a non-directional hypothesis testing method. The two-tailed test is used to determine whether the population parameter differs from some value. The hypotheses can be set up as follows:

\(H_{0}\): the population parameter = some value

\(H_{1}\): the population parameter ≠ some value

The null hypothesis is rejected if the absolute value of the test statistic is greater than the critical value.

[Figure: Two Tail Hypothesis Testing]
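The rejection rules for right-tailed, left-tailed, and two-tailed tests can be put side by side in a small sketch. It assumes a z statistic (standard normal reference distribution); the function name `reject_null` and its `tail` argument are hypothetical.

```python
from statistics import NormalDist

def reject_null(z, alpha=0.05, tail="two"):
    """Return True if a z statistic falls in the critical region."""
    std_normal = NormalDist()
    if tail == "right":
        # Reject when z is above the upper critical value.
        return z > std_normal.inv_cdf(1 - alpha)
    if tail == "left":
        # Reject when z is below the lower critical value.
        return z < std_normal.inv_cdf(alpha)
    if tail == "two":
        # Alpha is split between both tails; compare |z| with the critical value.
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(reject_null(1.8, tail="right"))  # True: 1.8 > 1.645
print(reject_null(1.8, tail="two"))    # False: 1.8 < 1.96
```

The same statistic can lead to different decisions depending on the direction of the alternative hypothesis, which is why the tail must be chosen before looking at the data.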

Hypothesis Testing Steps

Hypothesis testing can be easily performed in five simple steps. The most important step is to correctly set up the hypotheses and identify the right method for hypothesis testing. The basic steps to perform hypothesis testing are as follows:

  • Step 1: Set up the null hypothesis by correctly identifying whether it is the left-tailed, right-tailed, or two-tailed hypothesis testing.
  • Step 2: Set up the alternative hypothesis.
  • Step 3: Choose the correct significance level, \(\alpha\), and find the critical value.
  • Step 4: Calculate the correct test statistic (z, t or \(\chi^{2}\)) and p-value.
  • Step 5: Compare the test statistic with the critical value or compare the p-value with \(\alpha\) to arrive at a conclusion. In other words, decide if the null hypothesis is to be rejected or not.

Hypothesis Testing Example

The best way to solve a problem on hypothesis testing is by applying the 5 steps mentioned in the previous section. Suppose a researcher claims that the mean weight of men is greater than 100 kg, with a population standard deviation of 15 kg. A sample of 30 men has an average weight of 112.5 kg. Using hypothesis testing, check if there is enough evidence to support the researcher's claim. The confidence level is given as 95%.

Step 1: This is an example of a right-tailed test. Set up the null hypothesis as \(H_{0}\): \(\mu\) = 100.

Step 2: The alternative hypothesis is given by \(H_{1}\): \(\mu\) > 100.

Step 3: The significance level is \(\alpha\) = 100% - 95% = 5%. As this is a one-tailed test, the entire 5% lies in the right tail. This can be used to determine the critical value.

1 - \(\alpha\) = 1 - 0.05 = 0.95

0.95 gives the required area under the curve. Now using a normal distribution table, the area 0.95 is at z = 1.645. A similar process can be followed for a t-test. The only additional requirement is to calculate the degrees of freedom given by n - 1.

Step 4: Calculate the z test statistic. The z test is used because the sample size is 30 and the population standard deviation is known, along with the sample mean and the hypothesized population mean.

z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\).

\(\mu\) = 100, \(\overline{x}\) = 112.5, n = 30, \(\sigma\) = 15

z = \(\frac{112.5-100}{\frac{15}{\sqrt{30}}}\) = 4.56

Step 5: Conclusion. As 4.56 > 1.645, the null hypothesis can be rejected.
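The five steps of this example can be checked numerically with the standard library. This is a sketch; the variable names are chosen for illustration, and the p-value line is an addition that the worked example itself does not compute.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example: H0: mu = 100, H1: mu > 100 (right-tailed test).
mu_0, x_bar, n, sigma, alpha = 100, 112.5, 30, 15, 0.05

z = (x_bar - mu_0) / (sigma / sqrt(n))
critical = NormalDist().inv_cdf(1 - alpha)  # area 0.95 to the left
p_value = 1 - NormalDist().cdf(z)           # right-tail area beyond z

print(round(z, 2), round(critical, 3), z > critical)  # 4.56 1.645 True
```

Comparing `z` with `critical` and comparing `p_value` with `alpha` always give the same decision, so either route can be used in Step 5.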

Hypothesis Testing and Confidence Intervals

Confidence intervals form an important part of hypothesis testing. This is because the alpha level can be determined from a given confidence level. Suppose the confidence level is given as 95%. Subtract the confidence level from 100%. This gives 100 - 95 = 5% or 0.05, which is the alpha value. In a one-tailed test, the entire 0.05 lies in one tail. In a two-tailed test, alpha is split between the two tails, so each tail has an area of 0.05 / 2 = 0.025.


Important Notes on Hypothesis Testing

  • Hypothesis testing is a technique that is used to verify whether the results of an experiment are statistically significant.
  • It involves the setting up of a null hypothesis and an alternate hypothesis.
  • There are three types of tests that can be conducted under hypothesis testing - z test, t test, and chi square test.
  • Hypothesis testing can be classified as right tail, left tail, and two tail tests.

Examples on Hypothesis Testing

  • Example 1: The average weight of a dumbbell in a gym is 90 lbs. However, a physical trainer believes that the average weight might be higher. A random sample of 5 dumbbells has an average weight of 110 lbs and a standard deviation of 18 lbs. Using hypothesis testing, check if the physical trainer's claim can be supported at a 95% confidence level. Solution: As the sample size is less than 30 and the population standard deviation is unknown, the t-test is used. \(H_{0}\): \(\mu\) = 90, \(H_{1}\): \(\mu\) > 90. \(\overline{x}\) = 110, \(\mu\) = 90, n = 5, s = 18, \(\alpha\) = 0.05. Using the t-distribution table with n - 1 = 4 degrees of freedom, the critical value is 2.132. t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = \(\frac{110-90}{\frac{18}{\sqrt{5}}}\) = 2.484. As 2.484 > 2.132, the null hypothesis is rejected. Answer: There is enough evidence that the average weight of the dumbbells is greater than 90 lbs.
  • Example 2: The average score on a test is 80 with a standard deviation of 10. With a new teaching curriculum introduced, it is believed that this score will change. In a random sample of 36 students, the mean score was found to be 88. With a 0.05 significance level, is there any evidence to support this claim? Solution: This is an example of two-tail hypothesis testing. The z test will be used. \(H_{0}\): \(\mu\) = 80, \(H_{1}\): \(\mu\) ≠ 80. \(\overline{x}\) = 88, \(\mu\) = 80, n = 36, \(\sigma\) = 10. \(\alpha\) = 0.05, so each tail has an area of 0.05 / 2 = 0.025. The critical value using the normal distribution table is 1.96. z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) = \(\frac{88-80}{\frac{10}{\sqrt{36}}}\) = 4.8. As 4.8 > 1.96, the null hypothesis is rejected. Answer: There is a difference in the scores after the new curriculum was introduced.
  • Example 3: The average score of a class is 90. However, a teacher believes that the average score might be lower. The scores of 6 students were randomly measured. The mean was 82 with a standard deviation of 18. With a 0.05 significance level, use hypothesis testing to check if this claim is true. Solution: The t test will be used. \(H_{0}\): \(\mu\) = 90, \(H_{1}\): \(\mu\) < 90. \(\overline{x}\) = 82, \(\mu\) = 90, n = 6, s = 18. The critical value from the t table with n - 1 = 5 degrees of freedom is -2.015. t = \(\frac{\overline{x}-\mu}{\frac{s}{\sqrt{n}}}\) = \(\frac{82-90}{\frac{18}{\sqrt{6}}}\) = -1.089. As -1.089 > -2.015, we fail to reject the null hypothesis. Answer: There is not enough evidence to support the claim.
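The three examples share one formula, so they can be checked with a few lines of Python. This is a minimal sketch that uses only the standard library; the helper name `one_sample_statistic` is made up for illustration, and the critical values are copied from the examples rather than computed, because the standard library has no t-distribution table.

```python
import math

def one_sample_statistic(sample_mean, hypothesized_mean, sd, n):
    """Return (sample_mean - hypothesized_mean) / (sd / sqrt(n)).

    The same expression gives the z statistic (sd = known sigma)
    and the t statistic (sd = sample standard deviation s).
    """
    standard_error = sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error

# Example 1: right-tailed t test; critical value 2.132 (df = 4) taken from the text.
t1 = one_sample_statistic(110, 90, 18, 5)
reject1 = t1 > 2.132

# Example 2: two-tailed z test; critical value 1.96 taken from the text.
z2 = one_sample_statistic(88, 80, 10, 36)
reject2 = abs(z2) > 1.96

# Example 3: left-tailed t test; critical value -2.015 (df = 5) taken from the text.
t3 = one_sample_statistic(82, 90, 18, 6)
reject3 = t3 < -2.015

print(round(t1, 3), reject1)
print(round(z2, 3), reject2)
print(round(t3, 3), reject3)
```

The direction of the comparison changes with the tail: greater than the critical value for a right-tailed test, less than it for a left-tailed test, and an absolute-value comparison for a two-tailed test.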

FAQs on Hypothesis Testing

What is Hypothesis Testing?

Hypothesis testing in statistics is a tool that is used to make inferences about the population data. It is also used to check if the results of an experiment are valid.

What is the z Test in Hypothesis Testing?

The z test in hypothesis testing is used to find the z test statistic for normally distributed data. The z test is used when the standard deviation of the population is known and the sample size is greater than or equal to 30.

What is the t Test in Hypothesis Testing?

The t test in hypothesis testing is used when the test statistic follows a Student's t distribution. It is used when the sample size is less than 30 and the standard deviation of the population is not known.

What is the formula for z test in Hypothesis Testing?

The formula for a one sample z test in hypothesis testing is z = \(\frac{\overline{x}-\mu}{\frac{\sigma}{\sqrt{n}}}\) and for two samples is z = \(\frac{(\overline{x_{1}}-\overline{x_{2}})-(\mu_{1}-\mu_{2})}{\sqrt{\frac{\sigma_{1}^{2}}{n_{1}}+\frac{\sigma_{2}^{2}}{n_{2}}}}\).

What is the p Value in Hypothesis Testing?

The p value helps to determine if the test results are statistically significant or not. In hypothesis testing, the null hypothesis can either be rejected or not rejected based on the comparison between the p value and the alpha level.

What is One Tail Hypothesis Testing?

When the rejection region is only on one side of the distribution curve then it is known as one tail hypothesis testing. The right tail test and the left tail test are two types of directional hypothesis testing.

What is the Alpha Level in Two Tail Hypothesis Testing?

In a two-tail hypothesis test, divide \(\alpha\) by 2 to get the area of each tail. This is done because there are two rejection regions in the curve.
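The effect of splitting \(\alpha\) can be seen by looking up the critical values directly. The sketch below assumes a z test and uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative only.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed test: alpha is split evenly between the two rejection regions.
two_tail_critical = standard_normal.inv_cdf(1 - alpha / 2)

# One-tailed (right-tail) test: all of alpha sits in a single tail.
right_tail_critical = standard_normal.inv_cdf(1 - alpha)

print(round(two_tail_critical, 2))    # 1.96
print(round(right_tail_critical, 3))  # 1.645
```

At the same \(\alpha\), the two-tailed critical value is larger in magnitude, so a two-tailed test needs a more extreme statistic to reject the null hypothesis.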

Lecture 14: Hypothesis Test for One Variance

STAT 205: Introduction to Mathematical Statistics

University of British Columbia Okanagan

March 17, 2024

Introduction

We have covered three hypothesis tests for a single sample:

  • Hypothesis test for the mean \(\mu\) with \(\sigma\) known ( \(Z\) - test)
  • Hypothesis tests for the proportion \(p\) ( \(Z\) - test)
  • Hypothesis test for the mean \(\mu\) with \(\sigma\) unknown ( \(t\) -test)

Today we consider hypothesis tests that involve the population variance \(\sigma^2\).

Assumptions: \(X_1, X_2, \dots, X_n\) are i.i.d + assumptions in the rhombuses.

In Lecture 7 we saw how to construct a confidence interval for \(\sigma^2\) based on the sampling distribution derived in Lecture 8 .

For random samples from normal populations , we know:

\[ \dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \]

where \(S^2 = \frac{\sum_{i = 1}^n (X_i - \bar{X})^2}{n-1}\) is the sample variance and \(\chi^2_{n-1}\) is the Chi-squared distribution with \(n-1\) degrees of freedom.

We may wish to test whether there is evidence to suggest that the population variance differs from some hypothesized value \(\sigma_0^2\).

As before, we start with a null hypothesis ( \(H_0\) ) that the population variance equals a specified value ( \(\sigma^2 = \sigma_0^2\) )

We test this against the alternative hypothesis \(H_A\) which can either be one-sided ( \(\sigma^2 < \sigma_0^2\) or \(\sigma^2 > \sigma_0^2\) ) or two-sided ( \(\sigma^2 \neq \sigma_0^2\) ).

Test Statistic

Recall that our test statistic is calculated assuming the null hypothesis is true. Hence, if we are testing \(H_0: \sigma^2 = \sigma_0^2\), the test statistic we use is: \[ \chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2} \] where \(\chi^2 \sim \chi^2_{n-1}\).

Chi-square distribution

Assumptions

For the following inference procedures to be valid we require:

  • A simple random sample from the population
  • A normally distributed population (very important, even for large sample sizes)

It is important to note that if the population is not approximately normally distributed, the chi-squared distribution may not accurately represent the sampling distribution of the test statistic.

Critical Region (upper-tailed)

The rejection region associated with an upper-tailed test for the population variance. Note that the critical value will depend on the chosen significance level ( \(\alpha\) ) and the d.f.

Critical Region (lower-tailed)

Critical Region (two-tailed)

Similarly, we can find \(p\)-values from chi-squared tables or R.

\(p\)-value for lower-tailed: \[\Pr(\chi^2 < \chi^2_{\text{obs}})\]

\(p\)-value for upper-tailed: \[\Pr(\chi^2 > \chi^2_{\text{obs}})\]

\(p\)-value for two-tailed: \[2\cdot \min \{ \Pr(\chi^2 < \chi^2_{\text{obs}}), \Pr(\chi^2 > \chi^2_{\text{obs}})\}\]

Exercise 1: Beyond Burger Fat

Beyond Burgers claim to have 18 grams of fat. A random sample of 6 burgers had a mean of 19.45 grams and a standard deviation of 0.85 grams (a variance of 0.7225 grams \(^2\) ). Suppose that the quality assurance team at the company will only accept a \(\sigma\) of at most 0.5. Use the 0.05 level of significance to test the null hypothesis \(\sigma = 0.5\) against the appropriate alternative.

Distribution of Test Statistic

Under the null hypothesis, the test statistic \(\chi^2 = (n-1)S^2/0.5^2\) follows a chi-square distribution with df = 5.

Critical value

The critical value can be found by determining what value on the chi-square curve with 5 df yields a 5 percent probability in the upper tail (since we are doing an upper-tailed test). In R: qchisq(alpha, df=n-1, lower.tail = FALSE). Verify using a \(\chi^2\) table.

Observed Test Statistic

Compute the observed test statistic, which we denote by \(\chi^2_{\text{obs}}\). Here \(\chi^2_{\text{obs}} = 14.45\).

Since the observed test statistic falls in the rejection region, i.e. \(\chi^2_{\text{obs}} > \chi^2_{\alpha}\), we reject the null hypothesis in favour of the alternative.

P-value in R

Alternatively we could compute the p-value which in this case is 0.013. Since this is smaller than the alpha-level of 0.05, we reject the null hypothesis in favour of the alternative. Verify using \(\chi^2\) table.

P-value from tables

Using the chi-square distribution table, we can see that our observed test statistic falls between two values. We can use the neighbouring values to approximate our p-value.

Approximate P-value

It is clear from the visualization that \[\begin{align} \Pr(\chi^2_{5} > \chi^2_{0.025}) &> \Pr(\chi^2_{5} > \chi^2_{\text{obs}})\\ \Pr(\chi^2_{5} > \chi^2_{\text{obs}}) &> \Pr(\chi^2_{5} > \chi^2_{0.01}) \end{align}\]

The \(p\) -value, \(\Pr(\chi^2_{5} > 14.45)\) can then be expressed as: \[\begin{align} 0.01 < p\text{-value } < 0.025 \end{align}\]

Since either

  • the \(p\)-value (0.013) is less than \(\alpha\) = 0.05 OR
  • the observed test statistic ( \(\chi^2_{\text{obs}}\) = 14.45) is larger than the critical value \(\chi^2_{\alpha}\)

we reject the null hypothesis in favour of the alternative. More specifically, there is very strong evidence to suggest that the population variance \(\sigma^2\) is greater than \(0.5^2\) .
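The exercise can be reproduced without statistical tables. The sketch below is written in Python rather than R and uses only the standard library. The helper `chi2_upper_tail` is a made-up name; it relies on the closed-form finite sums for the chi-square upper tail that hold for integer degrees of freedom. It takes 0.85 as the sample standard deviation, which is the reading that reproduces the quoted statistic of 14.45.

```python
import math

def chi2_upper_tail(x, df):
    """Pr(chi-square with integer df > x), using closed-form finite sums."""
    if x <= 0:
        return 1.0
    term, total = 1.0, 0.0
    if df % 2 == 0:
        # Even df = 2m: exp(-x/2) * sum_{j<m} (x/2)^j / j!
        for j in range(df // 2):
            total += term
            term *= (x / 2) / (j + 1)
        return math.exp(-x / 2) * total
    # Odd df = 2m+1:
    # erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum_{j<m} x^j / (2j+1)!!
    for j in range(df // 2):
        total += term
        term *= x / (2 * j + 3)
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total)

n = 6          # sample size
s = 0.85       # sample standard deviation (assumed reading of the exercise)
sigma0 = 0.5   # hypothesized population standard deviation
alpha = 0.05

chi2_obs = (n - 1) * s**2 / sigma0**2        # test statistic
p_upper = chi2_upper_tail(chi2_obs, n - 1)   # upper-tailed p-value
p_lower = 1 - p_upper                        # lower-tailed p-value
p_two = 2 * min(p_lower, p_upper)            # two-tailed p-value

print(round(chi2_obs, 2), round(p_upper, 3), p_upper < alpha)
```

Only `p_upper` is relevant to this upper-tailed exercise; the other two lines show how the lower-tailed and two-tailed \(p\)-values defined earlier would be obtained from the same function.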

https://irene.vrbik.ok.ubc.ca/quarto/stat205/

Statistics LibreTexts

10.E: Hypothesis Testing with Two Samples (Exercises)

These are homework exercises to accompany the Textmap created for "Introductory Statistics" by OpenStax.

10.1: Introduction

10.2: Two Population Means with Unknown Standard Deviations

Use the following information to answer the next 15 exercises: Indicate if the hypothesis test is for

  • independent group means, population standard deviations, and/or variances known
  • independent group means, population standard deviations, and/or variances unknown
  • matched or paired samples
  • single mean
  • two proportions
  • single proportion

Exercise 10.2.3

It is believed that 70% of males pass their drivers test in the first attempt, while 65% of females pass the test in the first attempt. Of interest is whether the proportions are in fact equal.

Exercise 10.2.4

A new laundry detergent is tested on consumers. Of interest is the proportion of consumers who prefer the new brand over the leading competitor. A study is done to test this.

Exercise 10.2.5

A new windshield treatment claims to repel water more effectively. Ten windshields are tested by simulating rain without the new treatment. The same windshields are then treated, and the experiment is run again. A hypothesis test is conducted.

Exercise 10.2.6

The known standard deviation in salary for all mid-level professionals in the financial industry is $11,000. Company A and Company B are in the financial industry. Suppose samples are taken of mid-level professionals from Company A and from Company B. The sample mean salary for mid-level professionals in Company A is $80,000. The sample mean salary for mid-level professionals in Company B is $96,000. Company A and Company B management want to know if their mid-level professionals are paid differently, on average.

Exercise 10.2.7

The average worker in Germany gets eight weeks of paid vacation.

Exercise 10.2.8

According to a television commercial, 80% of dentists agree that Ultrafresh toothpaste is the best on the market.

Exercise 10.2.9

It is believed that the average grade on an English essay in a particular school system for females is higher than for males. A random sample of 31 females had a mean score of 82 with a standard deviation of three, and a random sample of 25 males had a mean score of 76 with a standard deviation of four.

  • independent group means, population standard deviations and/or variances unknown

Exercise 10.2.10

The league mean batting average is 0.280 with a known standard deviation of 0.06. The Rattlers and the Vikings belong to the league. The mean batting average for a sample of eight Rattlers is 0.210, and the mean batting average for a sample of eight Vikings is 0.260. There are 24 players on the Rattlers and 19 players on the Vikings. Are the batting averages of the Rattlers and Vikings statistically different?

Exercise 10.2.11

In a random sample of 100 forests in the United States, 56 were coniferous or contained conifers. In a random sample of 80 forests in Mexico, 40 were coniferous or contained conifers. Is the proportion of conifers in the United States statistically more than the proportion of conifers in Mexico?

Exercise 10.2.12

A new medicine is said to help improve sleep. Eight subjects are picked at random and given the medicine. The mean hours slept for each person were recorded before starting the medication and after.

Exercise 10.2.13

It is thought that teenagers sleep more than adults on average. A study is done to verify this. A sample of 16 teenagers has a mean of 8.9 hours slept and a standard deviation of 1.2. A sample of 12 adults has a mean of 6.9 hours slept and a standard deviation of 0.6.

Exercise 10.2.14

Varsity athletes practice five times a week, on average.

Exercise 10.2.15

A sample of 12 in-state graduate school programs at school A has a mean tuition of $64,000 with a standard deviation of $8,000. At school B, a sample of 16 in-state graduate programs has a mean of $80,000 with a standard deviation of $6,000. On average, are the mean tuitions different?

Exercise 10.2.16

A new WiFi range booster is being offered to consumers. A researcher tests the native range of 12 different routers under the same conditions. The ranges are recorded. Then the researcher uses the new WiFi range booster and records the new ranges. Does the new WiFi range booster do a better job?

Exercise 10.2.17

A high school principal claims that 30% of student athletes drive themselves to school, while 4% of non-athletes drive themselves to school. In a sample of 20 student athletes, 45% drive themselves to school. In a sample of 35 non-athlete students, 6% drive themselves to school. Is the percent of student athletes who drive themselves to school more than the percent of nonathletes?

Use the following information to answer the next three exercises: A study is done to determine which of two soft drinks has more sugar. There are 13 cans of Beverage A in a sample and six cans of Beverage B. The mean amount of sugar in Beverage A is 36 grams with a standard deviation of 0.6 grams. The mean amount of sugar in Beverage B is 38 grams with a standard deviation of 0.8 grams. The researchers believe that Beverage B has more sugar than Beverage A, on average. Both populations have normal distributions.

Exercise 10.2.18

Are standard deviations known or unknown?

Exercise 10.2.19

What is the random variable?

The random variable is the difference between the mean amounts of sugar in the two soft drinks.

Exercise 10.2.20

Is this a one-tailed or two-tailed test?

Use the following information to answer the next 12 exercises: The U.S. Center for Disease Control reports that the mean life expectancy was 47.6 years for whites born in 1900 and 33.0 years for nonwhites. Suppose that you randomly survey death records for people born in 1900 in a certain county. Of the 124 whites, the mean life span was 45.3 years with a standard deviation of 12.7 years. Of the 82 nonwhites, the mean life span was 34.1 years with a standard deviation of 15.6 years. Conduct a hypothesis test to see if the mean life spans in the county were the same for whites and nonwhites.

Exercise 10.2.21

Is this a test of means or proportions?

Exercise 10.2.22

State the null and alternative hypotheses.

  • \(H_{0}\): __________
  • \(H_{a}\): __________

Exercise 10.2.23

Is this a right-tailed, left-tailed, or two-tailed test?

Exercise 10.2.24

In symbols, what is the random variable of interest for this test?

Exercise 10.2.25

In words, define the random variable of interest for this test.

the difference between the mean life spans of whites and nonwhites

Exercise 10.2.26

Which distribution (normal or Student's t) would you use for this hypothesis test?

Exercise 10.2.27

Explain why you chose the distribution you did for Exercise .

This is a comparison of two population means with unknown population standard deviations.

Exercise 10.2.28

Calculate the test statistic and \(p\text{-value}\).

Exercise 10.2.29

Sketch a graph of the situation. Label the horizontal axis. Mark the hypothesized difference and the sample difference. Shade the area corresponding to the \(p\text{-value}\).

This is a horizontal axis with arrows at each end. The axis is labeled p'N - p'ND

  • Check student’s solution.

Exercise 10.2.30

Find the \(p\text{-value}\).

Exercise 10.2.31

At a pre-conceived \(\alpha = 0.05\), what is your:

  • Decision:
  • Reason for the decision:
  • Conclusion (write out in a complete sentence):

  • Decision: Reject the null hypothesis.
  • Reason for the decision: \(p\text{-value} < 0.05\)
  • Conclusion: There is enough evidence at the 5% level of significance to support the claim that life expectancy in the 1900s is different between whites and nonwhites.

Exercise 10.2.32

Does it appear that the means are the same? Why or why not?

DIRECTIONS: For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in Appendix E . Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.

If you are using a Student's t-distribution for a homework problem in what follows, including for paired data, you may assume that the underlying population is normally distributed. (When using these tests in a real situation, you must first prove that assumption, however.)

The mean number of English courses taken in a two–year time period by male and female college students is believed to be about the same. An experiment is conducted and data are collected from 29 males and 16 females. The males took an average of three English courses with a standard deviation of 0.8. The females took an average of four English courses with a standard deviation of 1.0. Are the means statistically the same?

A student at a four-year college claims that mean enrollment at four–year colleges is higher than at two–year colleges in the United States. Two surveys are conducted. Of the 35 two–year colleges surveyed, the mean enrollment was 5,068 with a standard deviation of 4,777. Of the 35 four-year colleges surveyed, the mean enrollment was 5,466 with a standard deviation of 8,191.

Subscripts: 1: two-year colleges; 2: four-year colleges

  • \(H_{0}: \mu_{1} \geq \mu_{2}\)
  • \(H_{a}: \mu_{1} < \mu_{2}\)
  • \(\bar{X}_{1} - \bar{X}_{2}\) is the difference between the mean enrollments of the two-year colleges and the four-year colleges.
  • Student’s t
  • test statistic: -0.2480
  • \(p\text{-value}: 0.4019\)
  • Alpha: 0.05
  • Decision: Do not reject
  • Reason for Decision: \(p\text{-value} > \alpha\)
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean enrollment at four-year colleges is higher than at two-year colleges.
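The test statistic in this solution can be recomputed from the summary statistics. The following Python sketch assumes the unpooled (Welch) form of the two-sample t statistic with the Welch-Satterthwaite degrees of freedom; the function name `welch_t` is illustrative. The standard library has no t-distribution, so the p-value shown is only a large-sample normal approximation and is labeled as such.

```python
import math
from statistics import NormalDist

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled two-sample t statistic and Welch-Satterthwaite df."""
    v1 = sd1**2 / n1   # squared standard error contributed by sample 1
    v2 = sd2**2 / n2   # squared standard error contributed by sample 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Subscript 1: two-year colleges; subscript 2: four-year colleges.
t_stat, df = welch_t(5068, 4777, 35, 5466, 8191, 35)

# Left-tailed test (H_a: mu_1 < mu_2); normal approximation to the p-value.
p_approx = NormalDist().cdf(t_stat)

print(round(t_stat, 3), round(df, 1), round(p_approx, 3))
```

The statistic comes out at about -0.248 and the approximate p-value is near 0.40, far above 0.05, which agrees with the decision not to reject the null hypothesis.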

At Rachel’s 11th birthday party, eight girls were timed to see how long (in seconds) they could hold their breath in a relaxed position. After a two-minute rest, they timed themselves while jumping. The girls thought that the mean difference between their jumping and relaxed times would be zero. Test their hypothesis.

Mean entry-level salaries for college graduates with mechanical engineering degrees and electrical engineering degrees are believed to be approximately the same. A recruiting office thinks that the mean mechanical engineering salary is actually lower than the mean electrical engineering salary. The recruiting office randomly surveys 50 entry level mechanical engineers and 60 entry level electrical engineers. Their mean salaries were $46,100 and $46,700, respectively. Their standard deviations were $3,450 and $4,210, respectively. Conduct a hypothesis test to determine if you agree that the mean entry-level mechanical engineering salary is lower than the mean entry-level electrical engineering salary.

Subscripts: 1: mechanical engineering; 2: electrical engineering

  • \(\bar{X}_{1} - \bar{X}_{2}\) is the difference between the mean entry level salaries of mechanical engineers and electrical engineers.
  • \(t_{108}\)
  • test statistic: \(t = -0.82\)
  • \(p\text{-value}: 0.2061\)
  • \(\alpha: 0.05\)
  • Decision: Do not reject the null hypothesis.
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the mean entry-level salary of mechanical engineers is lower than that of electrical engineers.

Marketing companies have collected data implying that teenage girls use more ring tones on their cellular phones than teenage boys do. In one particular study of 40 randomly chosen teenage girls and boys (20 of each) with cellular phones, the mean number of ring tones for the girls was 3.2 with a standard deviation of 1.5. The mean for the boys was 1.7 with a standard deviation of 0.8. Conduct a hypothesis test to determine if the means are approximately the same or if the girls’ mean is higher than the boys’ mean.

Use the information from [link] to answer the next four exercises.

Using the data from Lap 1 only, conduct a hypothesis test to determine if the mean time for completing a lap in races is the same as it is in practices.

  • \(H_{0}: \mu_{1} = \mu_{2}\)
  • \(H_{a}: \mu_{1} \neq \mu_{2}\)

  • \(\bar{X}_{1} - \bar{X}_{2}\) is the difference between the mean times for completing a lap in races and in practices.
  • \(t_{20.32}\)
  • test statistic: –4.70
  • \(p\text{-value}: 0.0001\)
  • Decision: Reject the null hypothesis.
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean time for completing a lap in races is different from that in practices.

Repeat the test in Exercise 10.83, but use Lap 5 data this time.

Repeat the test in Exercise 10.83, but this time combine the data from Laps 1 and 5.

  • \(\bar{X}_{1} - \bar{X}_{2}\) is the difference between the mean times for completing a lap in races and in practices.
  • \(t_{40.94}\)
  • test statistic: –5.08
  • \(p\text{-value}: 0\)
  • Reason for Decision: \(p\text{-value} < \alpha\)

In two to three complete sentences, explain in detail how you might use Terri Vogel’s data to answer the following question. “Does Terri Vogel drive faster in races than she does in practices?”

Use the following information to answer the next two exercises. The Eastern and Western Major League Soccer conferences have a new Reserve Division that allows new players to develop their skills. Data for a randomly picked date showed the following annual goals.

Conduct a hypothesis test to answer the next two exercises.

The exact distribution for the hypothesis test is:

  • the normal distribution
  • the Student's t-distribution
  • the uniform distribution
  • the exponential distribution

If the level of significance is 0.05, the conclusion is:

  • There is sufficient evidence to conclude that the W Division teams score fewer goals, on average, than the E teams
  • There is insufficient evidence to conclude that the W Division teams score more goals, on average, than the E teams.
  • There is insufficient evidence to conclude that the W teams score fewer goals, on average, than the E teams score.
  • Unable to determine

Suppose a statistics instructor believes that there is no significant difference between the mean class scores of statistics day students on Exam 2 and statistics night students on Exam 2. She takes random samples from each of the populations. The mean and standard deviation for 35 statistics day students were 75.86 and 16.91. The mean and standard deviation for 37 statistics night students were 75.41 and 19.73. The “day” subscript refers to the statistics day students. The “night” subscript refers to the statistics night students. A concluding statement is:

  • There is sufficient evidence to conclude that statistics night students' mean on Exam 2 is better than the statistics day students' mean on Exam 2.
  • There is insufficient evidence to conclude that the statistics day students' mean on Exam 2 is better than the statistics night students' mean on Exam 2.
  • There is insufficient evidence to conclude that there is a significant difference between the means of the statistics day students and night students on Exam 2.
  • There is sufficient evidence to conclude that there is a significant difference between the means of the statistics day students and night students on Exam 2.

Researchers interviewed street prostitutes in Canada and the United States. The mean age of the 100 Canadian prostitutes upon entering prostitution was 18 with a standard deviation of six. The mean age of the 130 United States prostitutes upon entering prostitution was 20 with a standard deviation of eight. Is the mean age of entering prostitution in Canada lower than the mean age in the United States? Test at a 1% significance level.

Test: two independent sample means, population standard deviations unknown.

Random variable:

\[\bar{X}_{1} - \bar{X}_{2}\]

Hypotheses: \(H_{0}: \mu_{1} = \mu_{2}\), \(H_{a}: \mu_{1} < \mu_{2}\). The alternative hypothesis states that the mean age of entering prostitution in Canada is lower than the mean age in the United States.

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.0157.

Graph: left-tailed

\(p\text{-value}: 0.0151\)

Decision: Do not reject \(H_{0}\).

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that the mean age of entering prostitution in Canada is lower than the mean age in the United States.

A powder diet is tested on 49 people, and a liquid diet is tested on 36 different people. Of interest is whether the liquid diet yields a higher mean weight loss than the powder diet. The powder diet group had a mean weight loss of 42 pounds with a standard deviation of 12 pounds. The liquid diet group had a mean weight loss of 45 pounds with a standard deviation of 14 pounds.

Suppose a statistics instructor believes that there is no significant difference between the mean class scores of statistics day students on Exam 2 and statistics night students on Exam 2. She takes random samples from each of the populations. The mean and standard deviation for 35 statistics day students were 75.86 and 16.91, respectively. The mean and standard deviation for 37 statistics night students were 75.41 and 19.73. The “day” subscript refers to the statistics day students. The “night” subscript refers to the statistics night students. An appropriate alternative hypothesis for the hypothesis test is:

  • \(\mu_{day} > \mu_{night}\)
  • \(\mu_{day} < \mu_{night}\)
  • \(\mu_{day} = \mu_{night}\)
  • \(\mu_{day} \neq \mu_{night}\)

10.3: Two Population Means with Known Standard Deviations

Use the following information to answer the next five exercises. The mean speeds of fastball pitches from two different baseball pitchers are to be compared. A sample of 14 fastball pitches is measured from each pitcher. The populations have normal distributions. Table shows the result. Scouts believe that Rodriguez pitches a speedier fastball.

Exercise 10.3.2

The difference in mean speeds of the fastball pitches of the two pitchers

Exercise 10.3.3

Exercise 10.3.4

What is the test statistic?

Exercise 10.3.5

What is the \(p\text{-value}\)?

Exercise 10.3.6

At the 1% significance level, we can reject the null hypothesis. There is sufficient data to conclude that the mean speed of Rodriguez’s fastball is faster than Wesley’s.

Use the following information to answer the next five exercises. A researcher is testing the effects of plant food on plant growth. Nine plants have been given the plant food. Another nine plants have not been given the plant food. The heights of the plants are recorded after eight weeks. The populations have normal distributions. The following table is the result. The researcher thinks the food makes the plants grow taller.

Exercise 10.3.7

Is the population standard deviation known or unknown?

Exercise 10.3.8

Subscripts: 1 = Food, 2 = No Food

  • \(H_{a}: \mu_{1} > \mu_{2}\)

Exercise 10.3.9

Exercise 10.3.10

Draw the graph of the \(p\text{-value}\).

This is a normal distribution curve with mean equal to zero. The values 0 and 0.1 are labeled on the horizontal axis. A vertical line extends from 0.1 to the curve. The region under the curve to the right of the line is shaded to represent p-value = 0.0198.

Exercise 10.3.11

At the 1% significance level, what is your conclusion?

Use the following information to answer the next five exercises. Two metal alloys are being considered as material for ball bearings. The mean melting point of the two alloys is to be compared. 15 pieces of each metal are being tested. Both populations have normal distributions. The following table is the result. It is believed that Alloy Zeta has a different melting point.

Exercise 10.3.12

Subscripts: 1 = Gamma, 2 = Zeta

Exercise 10.3.13

Is this a right-, left-, or two-tailed test?

Exercise 10.3.14

Exercise 10.3.15

Exercise 10.3.16

There is sufficient evidence to reject the null hypothesis. The data support that the melting point for Alloy Zeta is different from the melting point of Alloy Gamma.

DIRECTIONS: For each of the word problems, use a solution sheet to do the hypothesis test. The solution sheet is found in [link] . Please feel free to make copies of the solution sheets. For the online version of the book, it is suggested that you copy the .doc or the .pdf files.

If you are using a Student's t-distribution for one of the following homework problems, including for paired data, you may assume that the underlying population is normally distributed. (When using these tests in a real situation, you must first prove that assumption, however.)

A study is done to determine if students in the California state university system take longer to graduate, on average, than students enrolled in private universities. One hundred students from both the California state university system and private universities are surveyed. Suppose that from years of research, it is known that the population standard deviations are 1.5811 years and 1 year, respectively. The following data are collected. The California state university system students took on average 4.5 years with a standard deviation of 0.8. The private university students took on average 4.1 years with a standard deviation of 0.3.

Parents of teenage boys often complain that auto insurance costs more, on average, for teenage boys than for teenage girls. A group of concerned parents examines a random sample of insurance bills. The mean annual cost for 36 teenage boys was $679. For 23 teenage girls, it was $559. From past years, it is known that the population standard deviation for each group is $180. Determine whether or not you believe that the mean cost for auto insurance for teenage boys is greater than that for teenage girls.

Subscripts: 1 = boys, 2 = girls

  • \(H_{0}: \mu_{1} \leq \mu_{2}\)
  • The random variable is the difference in the mean auto insurance costs for boys and girls.
  • test statistic: \(z = 2.50\)
  • \(p\text{-value}: 0.0062\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean cost of auto insurance for teenage boys is greater than that for girls.
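The numbers in this solution follow from the two-sample z formula for known population standard deviations. The sketch below is one way to check them in Python with the standard library only; `two_sample_z` is a hypothetical helper name, and the hypothesized difference defaults to zero.

```python
import math
from statistics import NormalDist

def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2, hypothesized_diff=0.0):
    """z statistic for a difference of two means with known sigmas."""
    standard_error = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error

# Subscript 1: boys; subscript 2: girls. Both population sigmas are $180.
z = two_sample_z(679, 180, 36, 559, 180, 23)

# Right-tailed test (H_a: mu_1 > mu_2): p-value is the upper-tail area.
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2), p_value < 0.05)
```

The statistic is about 2.50 and the p-value is about 0.006, so the null hypothesis is rejected at the 5% level, in line with the stated conclusion.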

A group of transfer bound students wondered if they will spend the same mean amount on texts and supplies each year at their four-year university as they have at their community college. They conducted a random survey of 54 students at their community college and 66 students at their local four-year university. The sample means were $947 and $1,011, respectively. The population standard deviations are known to be $254 and $87, respectively. Conduct a hypothesis test to determine if the means are statistically the same.

Some manufacturers claim that non-hybrid sedan cars have a lower mean miles-per-gallon (mpg) than hybrid ones. Suppose that consumers test 21 hybrid sedans and get a mean of 31 mpg with a standard deviation of seven mpg. Thirty-one non-hybrid sedans get a mean of 22 mpg with a standard deviation of four mpg. Suppose that the population standard deviations are known to be six and three, respectively. Conduct a hypothesis test to evaluate the manufacturers' claim.

Subscripts: 1 = non-hybrid sedans, 2 = hybrid sedans

  • The random variable is the difference in the mean miles per gallon of non-hybrid sedans and hybrid sedans.
  • test statistic: \(z = -6.36\)
  • Reason for decision: \(p\text{-value} < \alpha\)
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the mean miles per gallon of non-hybrid sedans is less than that of hybrid sedans.

A baseball fan wanted to know if there is a difference between the number of games played in a World Series when the American League won the series versus when the National League won the series. From 1922 to 2012, the population standard deviation of games won by the American League was 1.14, and the population standard deviation of games won by the National League was 1.11. Of 19 randomly selected World Series games won by the American League, the mean number of games won was 5.76. The mean number of 17 randomly selected games won by the National League was 5.42. Conduct a hypothesis test.

One of the questions in a study of marital satisfaction of dual-career couples was to rate the statement “I’m pleased with the way we divide the responsibilities for childcare.” The ratings went from one (strongly agree) to five (strongly disagree). Table contains ten of the paired responses for husbands and wives. Conduct a hypothesis test to see if the mean difference in the husband’s versus the wife’s satisfaction level is negative (meaning that, within the partnership, the husband is happier than the wife).

  • \(H_{0}: \mu_{d} = 0\)

\(H_{a}: \mu_{d} < 0\)

  • The random variable \(\bar{X}_{d}\) is the average difference between husband’s and wife’s satisfaction level.
  • test statistic: \(t = -1.86\)
  • \(p\text{-value}: 0.0479\)
  • Check student’s solution
  • Decision: Reject the null hypothesis, but run another test.
  • Conclusion: This is a weak test because alpha and the p-value are close. Nevertheless, at the 5% significance level, there is sufficient evidence to conclude that the mean difference is negative.

10.4: Comparing Two Independent Population Proportions

Use the following information for the next five exercises. Two types of phone operating system are being tested to determine if there is a difference in the proportions of system failures (crashes). Fifteen out of a random sample of 150 phones with OS 1 had system failures within the first eight hours of operation. Nine out of another random sample of 150 phones with OS 2 had system failures within the first eight hours of operation. OS 2 is believed to be more stable (have fewer crashes) than OS 1.

Exercise 10.4.2

Exercise 10.4.3

\(P'_{OS_{1}} - P'_{OS_{2}} =\) difference in the proportions of phones that had system failures within the first eight hours of operation with OS 1 and OS 2.

Exercise 10.4.4

Exercise 10.4.5

Exercise 10.4.6

What can you conclude about the two operating systems?

Use the following information to answer the next twelve exercises. In the recent Census, three percent of the U.S. population reported being of two or more races. However, the percent varies tremendously from state to state. Suppose that two random surveys are conducted. In the first random survey, out of 1,000 North Dakotans, only nine people reported being of two or more races. In the second random survey, out of 500 Nevadans, 17 people reported being of two or more races. Conduct a hypothesis test to determine if the population percents are the same for the two states or if the percent for Nevada is statistically higher than for North Dakota.

Exercise 10.4.7

proportions

Exercise 10.4.8

  • \(H_{0}\): _________
  • \(H_{a}\): _________

Exercise 10.4.9

Is this a right-tailed, left-tailed, or two-tailed test? How do you know?

right-tailed

Exercise 10.4.10

What is the random variable of interest for this test?

Exercise 10.4.11

In words, define the random variable for this test.

The random variable is the difference in proportions (percents) of the populations that are of two or more races in Nevada and North Dakota.

Exercise 10.4.12

Exercise 10.4.13

Explain why you chose the distribution you did for Exercise 10.56.

Each sample contains more than five successes and more than five failures, so we use the normal distribution for two proportions for this hypothesis test.

Exercise 10.4.14

Calculate the test statistic.

Exercise 10.4.15

Sketch a graph of the situation. Mark the hypothesized difference and the sample difference. Shade the area corresponding to the \(p\text{-value}\).

This is a horizontal axis with arrows at each end. The axis is labeled p'N - p'ND

Exercise 10.4.16

Exercise 10.4.17

  • Reject the null hypothesis.
  • \(p\text{-value} < \alpha\)
  • At the 5% significance level, there is sufficient evidence to conclude that the proportion (percent) of the population that is of two or more races in Nevada is statistically higher than that in North Dakota.

Exercise 10.4.18

Does it appear that the proportion of Nevadans who are two or more races is higher than the proportion of North Dakotans? Why or why not?

If you are using a Student's t-distribution for one of the following homework problems, including for paired data, you may assume that the underlying population is normally distributed. (In general, you must first prove that assumption, however.)

A recent drug survey showed an increase in the use of drugs and alcohol among local high school seniors as compared to the national percent. Suppose that a survey of 100 local seniors and 100 national seniors is conducted to see if the proportion of drug and alcohol use is higher locally than nationally. Locally, 65 seniors reported using drugs or alcohol within the past month, while 60 national seniors reported using them.

We are interested in whether the proportions of female suicide victims for ages 15 to 24 are the same for white and black females in the United States. We randomly pick one year, 1992, to compare the races. The number of suicides estimated in the United States in 1992 for white females is 4,930. Five hundred eighty were aged 15 to 24. The estimate for black females is 330. Forty were aged 15 to 24. We will let female suicide victims be our population.

  • \(H_{0}: P_{W} = P_{B}\)
  • \(H_{a}: P_{W} \neq P_{B}\)
  • The random variable is the difference in the proportions of white and black suicide victims, aged 15 to 24.
  • normal for two proportions
  • test statistic: –0.1944
  • \(p\text{-value}: 0.8458\)
  • Reason for decision: \(p\text{-value} > \alpha\)
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportions of white and black female suicide victims, aged 15 to 24, are different.

Elizabeth Mjelde, an art history professor, was interested in whether the value from the Golden Ratio formula, \(\left(\frac{\text{larger} + \text{smaller dimension}}{\text{larger dimension}}\right)\), was the same in the Whitney Exhibit for works from 1900 to 1919 as for works from 1920 to 1942. Thirty-seven early works were sampled, averaging 1.74 with a standard deviation of 0.11. Sixty-five of the later works were sampled, averaging 1.746 with a standard deviation of 0.1064. Do you think that there is a significant difference in the Golden Ratio calculation?

A recent year was randomly picked from 1985 to the present. In that year, there were 2,051 Hispanic students at Cabrillo College out of a total of 12,328 students. At Lake Tahoe College, there were 321 Hispanic students out of a total of 2,441 students. In general, do you think that the percent of Hispanic students at the two colleges is basically the same or different?

Subscripts: 1 = Cabrillo College, 2 = Lake Tahoe College

  • \(H_{0}: p_{1} = p_{2}\)
  • \(H_{a}: p_{1} \neq p_{2}\)
  • The random variable is the difference between the proportions of Hispanic students at Cabrillo College and Lake Tahoe College.
  • test statistic: 4.29
  • \(p\text{-value}: 0.00002\)
  • Reason for decision: \(p\text{-value} < \alpha\)
  • Conclusion: There is sufficient evidence to conclude that the proportions of Hispanic students at Cabrillo College and Lake Tahoe College are different.
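
The pooled-proportion calculation behind this solution can be sketched as follows. This is a minimal illustration, assuming a two-tailed z-test for the difference of two independent proportions; the function name `two_proportion_z_test` is hypothetical and not taken from the original text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z-test of p1 = p2 using the pooled proportion.

    Hypothetical helper for illustration; returns (z, p_value).
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: all successes divided by all observations
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    # Two-tailed p-value: probability in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Subscripts: 1 = Cabrillo College, 2 = Lake Tahoe College
z, p_value = two_proportion_z_test(2051, 12328, 321, 2441)
print(f"z = {z:.2f}, p-value = {p_value:.5f}")
```

For a one-tailed test, such as the West Nile virus exercise later in this set, the same standard error applies and only the p-value line changes to a single tail.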

Use the following information to answer the next three exercises. Neuroinvasive West Nile virus is a severe disease that affects a person’s nervous system. It is spread by the Culex species of mosquito. In the United States in 2010 there were 629 reported cases of neuroinvasive West Nile virus out of a total of 1,021 reported cases and there were 486 neuroinvasive reported cases out of a total of 712 cases reported in 2011. Is the 2011 proportion of neuroinvasive West Nile virus cases more than the 2010 proportion of neuroinvasive West Nile virus cases? Using a 1% level of significance, conduct an appropriate hypothesis test.

  • “2011” subscript: 2011 group
  • “2010” subscript: 2010 group

This is:

  • a test of two proportions
  • a test of two independent means
  • a test of a single mean
  • a test of matched pairs.

An appropriate null hypothesis is:

  • \(p_{2011} \leq p_{2010}\)
  • \(p_{2011} \geq p_{2010}\)
  • \(\mu_{2011} \leq \mu_{2010}\)
  • \(p_{2011} > p_{2010}\)

The \(p\text{-value}\) is 0.0022. At a 1% level of significance, the appropriate conclusion is

  • There is sufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is less than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.
  • There is insufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is more than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.
  • There is insufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is less than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.
  • There is sufficient evidence to conclude that the proportion of people in the United States in 2011 who contracted neuroinvasive West Nile disease is more than the proportion of people in the United States in 2010 who contracted neuroinvasive West Nile disease.

Researchers conducted a study to find out if there is a difference in the use of eReaders by different age groups. Randomly selected participants were divided into two age groups. In the 16- to 29-year-old group, 7% of the 628 surveyed use eReaders, while 11% of the 2,309 participants 30 years old and older use eReaders.

Test: two independent sample proportions.

Random variable: \(p'_{1} - p'_{2}\)

Distribution:

The proportion of eReader users is different for the 16- to 29-year-old users from that of the 30 and older users.

Graph: two-tailed

This is a normal distribution curve with mean equal to zero. Both the right and left tails of the curve are shaded. Each tail represents 1/2(p-value) = 0.0017.

\(p\text{-value}: 0.0033\)

Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that the proportion of eReader users 16 to 29 years old is different from the proportion of eReader users 30 and older.

Adults are considered obese if their body mass index (BMI) is at least 30. The researchers wanted to determine if the proportion of women who are obese in the south is less than the proportion of southern men who are obese. The results are shown in Table. Test at the 1% level of significance.

Two computer users were discussing tablet computers. A higher proportion of people ages 16 to 29 use tablets than the proportion of people age 30 and older. Table details the number of tablet owners for each age group. Test at the 1% level of significance.

Test: two independent sample proportions

  • \(H_{a}: p_{1} > p_{2}\)

A higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.

Graph: right-tailed

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.2354.

\(p\text{-value}: 0.2354\)

Decision: Do not reject the \(H_{0}\).

Conclusion: At the 1% level of significance, from the sample data, there is not sufficient evidence to conclude that a higher proportion of tablet owners are aged 16 to 29 years old than are 30 years old and older.

A group of friends debated whether more men use smartphones than women. They consulted a research study of smartphone use among adults. The results of the survey indicate that of the 973 men randomly sampled, 379 use smartphones. For women, 404 of the 1,304 who were randomly sampled use smartphones. Test at the 5% level of significance.

While her husband spent 2½ hours picking out new speakers, a statistician decided to determine whether the percent of men who enjoy shopping for electronic equipment is higher than the percent of women who enjoy shopping for electronic equipment. The population was Saturday afternoon shoppers. Out of 67 men, 24 said they enjoyed the activity. Eight of the 24 women surveyed claimed to enjoy the activity. Interpret the results of the survey.

Subscripts: 1: men; 2: women

  • \(H_{0}: p_{1} \leq p_{2}\)
  • \(P'_{1} - P'_{2}\) is the difference between the proportions of men and women who enjoy shopping for electronic equipment.
  • test statistic: 0.22
  • \(p\text{-value}: 0.4133\)
  • Conclusion: At the 5% significance level, there is insufficient evidence to conclude that the proportion of men who enjoy shopping for electronic equipment is more than the proportion of women.

We are interested in whether children’s educational computer software costs less, on average, than children’s entertainment software. Thirty-six educational software titles were randomly picked from a catalog. The mean cost was $31.14 with a standard deviation of $4.69. Thirty-five entertainment software titles were randomly picked from the same catalog. The mean cost was $33.86 with a standard deviation of $10.87. Decide whether children’s educational software costs less, on average, than children’s entertainment software.

Joan Nguyen recently claimed that the proportion of college-age males with at least one pierced ear is as high as the proportion of college-age females. She conducted a survey in her classes. Out of 107 males, 20 had at least one pierced ear. Out of 92 females, 47 had at least one pierced ear. Do you believe that the proportion of males has reached the proportion of females?

  • \(P'_{1} - P'_{2}\) is the difference between the proportions of men and women that have at least one pierced ear.
  • test statistic: –4.82
  • Conclusion: At the 5% significance level, there is sufficient evidence to conclude that the proportions of males and females with at least one pierced ear are different.

Use the data sets found in [link] to answer this exercise. Is the proportion of race laps Terri completes slower than 130 seconds less than the proportion of practice laps she completes slower than 135 seconds?

"To Breakfast or Not to Breakfast?" by Richard Ayore

In American society, birthdays are one of those days that everyone looks forward to. People of different ages and peer groups gather to mark the 18th, 20th, …, birthdays. During this time, one looks back to see what he or she has achieved for the past year and also focuses ahead for more to come.

If, by any chance, I am invited to one of these parties, my experience is always different. Instead of dancing around with my friends while the music is booming, I get carried away by memories of my family back home in Kenya. I remember the good times I had with my brothers and sister while we did our daily routine.

Every morning, I remember we went to the shamba (garden) to weed our crops. I remember one day arguing with my brother as to why he always remained behind just to join us an hour later. In his defense, he said that he preferred waiting for breakfast before he came to weed. He said, “This is why I always work more hours than you guys!”

And so, to prove him wrong or right, we decided to give it a try. One day we went to work as usual without breakfast, and recorded the time we could work before getting tired and stopping. On the next day, we all ate breakfast before going to work. We recorded how long we worked again before getting tired and stopping. Of interest was our mean increase in work time. Though not sure, my brother insisted that it was more than two hours. Using the data in Table, solve our problem.

  • \(H_{a}: \mu_{d} > 0\)
  • The random variable \(\bar{X}_{d}\) is the mean difference in work times on days when eating breakfast and on days when not eating breakfast.
  • test statistic: 4.8963
  • \(p\text{-value}: 0.0004\)
  • Reason for Decision: \(p\text{-value} < \alpha\)
  • Conclusion: At the 5% level of significance, there is sufficient evidence to conclude that the mean difference in work times on days when eating breakfast and on days when not eating breakfast is greater than zero; that is, mean work time increased on days with breakfast.

10.5: Matched or Paired Samples

Use the following information to answer the next five exercises. A study was conducted to test the effectiveness of a software patch in reducing system failures over a six-month period. Results for randomly selected installations are shown in Table. The “before” value is matched to an “after” value, and the differences are calculated. The differences have a normal distribution. Test at the 1% significance level.

Exercise 10.5.4

the mean difference of the system failures

Exercise 10.5.5

Exercise 10.5.6

Exercise 10.5.7

Exercise 10.5.8

What conclusion can you draw about the software patch?

With a \(p\text{-value}\) of 0.0067, we can reject the null hypothesis. There is enough evidence to support that the software patch is effective in reducing the number of system failures.

Use the following information to answer the next five exercises. A study was conducted to test the effectiveness of a juggling class. Before the class started, six subjects juggled as many balls as they could at once. After the class, the same six subjects juggled as many balls as they could. The differences in the number of balls are calculated. The differences have a normal distribution. Test at the 1% significance level.

Exercise 10.5.9

Exercise 10.5.10

Exercise 10.5.11

What is the sample mean difference?

Exercise 10.5.12

This is a normal distribution curve with mean equal to zero. The values 0 and 1.67 are labeled on the horizontal axis. A vertical line extends from 1.67 to the curve. The region under the curve to the right of the line is shaded to represent p-value = 0.0021.

Exercise 10.5.13

What conclusion can you draw about the juggling class?

Use the following information to answer the next five exercises. A doctor wants to know if a blood pressure medication is effective. Six subjects have their blood pressures recorded. After twelve weeks on the medication, the same six subjects have their blood pressure recorded again. For this test, only systolic pressure is of concern. Test at the 1% significance level.

Exercise 10.5.14

\(H_{0}: \mu_{d} \geq 0\)

Exercise 10.5.15

Exercise 10.5.16

Exercise 10.5.17

Exercise 10.5.18

What is the conclusion?

We decline to reject the null hypothesis. There is not sufficient evidence to support that the medication is effective.

Bringing It Together

Use the following information to answer the next ten exercises. Indicate which of the following choices best identifies the hypothesis test.

  • independent group means, population standard deviations and/or variances known

Exercise 10.5.19

A powder diet is tested on 49 people, and a liquid diet is tested on 36 different people. The population standard deviations are two pounds and three pounds, respectively. Of interest is whether the liquid diet yields a higher mean weight loss than the powder diet.

Exercise 10.5.20

A new chocolate bar is taste-tested on consumers. Of interest is whether the proportion of children who like the new chocolate bar is greater than the proportion of adults who like it.

Exercise 10.5.21

The mean number of English courses taken in a two-year time period by male and female college students is believed to be about the same. An experiment is conducted and data are collected from nine males and 16 females.

Exercise 10.5.22

A football league reported that the mean number of touchdowns per game was five. A study is done to determine if the mean number of touchdowns has decreased.

Exercise 10.5.23

A study is done to determine if students in the California state university system take longer to graduate than students enrolled in private universities. One hundred students from both the California state university system and private universities are surveyed. From years of research, it is known that the population standard deviations are 1.5811 years and one year, respectively.

Exercise 10.5.24

According to a YWCA Rape Crisis Center newsletter, 75% of rape victims know their attackers. A study is done to verify this.

Exercise 10.5.25

According to a recent study, U.S. companies have a mean maternity-leave of six weeks.

Exercise 10.5.26

A recent drug survey showed an increase in use of drugs and alcohol among local high school students as compared to the national percent. Suppose that a survey of 100 local youths and 100 national youths is conducted to see if the proportion of drug and alcohol use is higher locally than nationally.

Exercise 10.5.27

A new SAT study course is tested on 12 individuals. Pre-course and post-course scores are recorded. Of interest is the mean increase in SAT scores. The following data are collected:

Exercise 10.5.28

University of Michigan researchers reported in the Journal of the National Cancer Institute that quitting smoking is especially beneficial for those under age 49. In this American Cancer Society study, the risk (probability) of dying of lung cancer was about the same as for those who had never smoked.

Exercise 10.5.29

Lesley E. Tan investigated the relationship between left-handedness vs. right-handedness and motor competence in preschool children. Random samples of 41 left-handed preschool children and 41 right-handed preschool children were given several tests of motor skills to determine if there is evidence of a difference between the children based on this experiment. The experiment produced the means and standard deviations shown in Table. Determine the appropriate test and best distribution to use for that test.

  • Two independent means, normal distribution
  • Two independent means, Student’s-t distribution
  • Matched or paired samples, Student’s-t distribution
  • Two population proportions, normal distribution

Exercise 10.5.30

A golf instructor is interested in determining if her new technique for improving players’ golf scores is effective. She takes four (4) new students. She records their 18-hole scores before learning the technique and then after having taken her class. She conducts a hypothesis test. The data are shown in Table.

  • a test of two independent means.
  • a test of two proportions.
  • a test of a single mean.
  • a test of a single proportion.

If you are using a Student's t-distribution for the homework problems, including for paired data, you may assume that the underlying population is normally distributed. (When using these tests in a real situation, you must first prove that assumption, however.)

Ten individuals went on a low-fat diet for 12 weeks to lower their cholesterol. The data are recorded in Table. Do you think that their cholesterol levels were significantly lowered?

\(p\text{-value} = 0.1494\)

At the 5% significance level, there is insufficient evidence to conclude that the low-fat diet lowered cholesterol levels after 12 weeks.

Use the following information to answer the next two exercises. A new AIDS prevention drug was tried on a group of 224 HIV positive patients. Forty-five patients developed AIDS after four years. In a control group of 224 HIV positive patients, 68 developed AIDS after four years. We want to test whether the method of treatment reduces the proportion of patients that develop AIDS after four years or if the proportions of the treated group and the untreated group stay the same.

Let the subscript \(t =\) treated patient and \(ut =\) untreated patient.

The appropriate hypotheses are:

  • \(H_{0}: p_{t} < p_{ut}\) and \(H_{a}: p_{t} \geq p_{ut}\)
  • \(H_{0}: p_{t} \leq p_{ut}\) and \(H_{a}: p_{t} > p_{ut}\)
  • \(H_{0}: p_{t} = p_{ut}\) and \(H_{a}: p_{t} \neq p_{ut}\)
  • \(H_{0}: p_{t} = p_{ut}\) and \(H_{a}: p_{t} < p_{ut}\)

If the \(p\text{-value}\) is 0.0062 what is the conclusion (use \(\alpha = 0.05\))?

  • The method has no effect.
  • There is sufficient evidence to conclude that the method reduces the proportion of HIV positive patients who develop AIDS after four years.
  • There is sufficient evidence to conclude that the method increases the proportion of HIV positive patients who develop AIDS after four years.
  • There is insufficient evidence to conclude that the method reduces the proportion of HIV positive patients who develop AIDS after four years.

Use the following information to answer the next two exercises. An experiment is conducted to show that blood pressure can be consciously reduced in people trained in a “biofeedback exercise program.” Six subjects were randomly selected and blood pressure measurements were recorded before and after the training. The difference between blood pressures was calculated (after - before) producing the following results: \(\bar{x}_{d} = -10.2\), \(s_{d} = 8.4\). Using the data, test the hypothesis that the blood pressure has decreased after the training.

The distribution for the test is:

  • \(N(-10.2, 8.4)\)
  • \(N\left(-10.2, \frac{8.4}{\sqrt{6}}\right)\)

If \(\alpha = 0.05\), the \(p\text{-value}\) and the conclusion are

  • 0.0014; There is sufficient evidence to conclude that the blood pressure decreased after the training.
  • 0.0014; There is sufficient evidence to conclude that the blood pressure increased after the training.
  • 0.0155; There is sufficient evidence to conclude that the blood pressure decreased after the training.
  • 0.0155; There is sufficient evidence to conclude that the blood pressure increased after the training.

A golf instructor is interested in determining if her new technique for improving players’ golf scores is effective. She takes four new students. She records their 18-hole scores before learning the technique and then after having taken her class. She conducts a hypothesis test. The data are as follows.

The correct decision is:

  • Reject \(H_{0}\).
  • Do not reject the \(H_{0}\).

A local cancer support group believes that the estimate for new female breast cancer cases in the south is higher in 2013 than in 2012. The group compared the estimates of new female breast cancer cases by southern state in 2012 and in 2013. The results are in Table.

Test: two matched pairs or paired samples (t-test)

Random variable: \(\bar{X}_{d}\)

Distribution: \(t_{12}\)

\(H_{0}: \mu_{d} = 0\)

\(H_{a}: \mu_{d} > 0\)

The mean of the differences of new female breast cancer cases in the south between 2013 and 2012 is greater than zero. The estimate for new female breast cancer cases in the south is higher in 2013 than in 2012.

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the right of zero extends from the axis to the curve. The region under the curve to the right of the line is shaded representing p-value = 0.0004.

Decision: Reject \(H_{0}\)

Conclusion: At the 5% level of significance, from the sample data, there is sufficient evidence to conclude that there was a higher estimate of new female breast cancer cases in 2013 than in 2012.

A traveler wanted to know if the prices of hotels are different in the ten cities that he visits the most often. The list of the cities with the corresponding hotel prices for his two favorite hotel chains is in Table. Test at the 1% level of significance.

A politician asked his staff to determine whether the underemployment rate in the northeast decreased from 2011 to 2012. The results are in Table.

Test: matched or paired samples (t-test)

Difference data: \(\{-0.9, -3.7, -3.2, -0.5, 0.6, -1.9, -0.5, 0.2, 0.6, 0.4, 1.7, -2.4, 1.8\}\)

Random Variable: \(\bar{X}_{d}\)

Distribution: \(t_{12}\)

\(H_{0}: \mu_{d} = 0\)

\(H_{a}: \mu_{d} < 0\)

The mean of the differences of the rate of underemployment in the northeastern states between 2012 and 2011 is less than zero. The underemployment rate went down from 2011 to 2012.

Graph: left-tailed.

This is a normal distribution curve with mean equal to zero. A vertical line near the tail of the curve to the left of zero extends from the axis to the curve. The region under the curve to the left of the line is shaded representing p-value = 0.1207.

\(p\text{-value}: 0.1207\)

Conclusion: At the 5% level of significance, from the sample data, there is not sufficient evidence to conclude that there was a decrease in the underemployment rates of the northeastern states from 2011 to 2012.
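
The p-value in this solution can be checked directly from the 13 listed differences. The sketch below is a standard-library illustration of a left-tailed paired t-test with \(n - 1 = 12\) degrees of freedom. The helper names `t_pdf`, `t_cdf`, and `paired_t_test_left` are hypothetical, and the t-distribution probability is approximated by numerical integration (Simpson's rule) rather than taken from a statistics package.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coefficient = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coefficient * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) as 0.5 plus the signed Simpson integral on [0, t]."""
    h = t / steps  # negative when t < 0, so the integral is subtracted from 0.5
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def paired_t_test_left(differences):
    """Left-tailed paired t-test of mu_d < 0; returns (t, p_value)."""
    n = len(differences)
    # The differences form a single sample, so this is a one-sample t statistic
    t = mean(differences) / (stdev(differences) / sqrt(n))
    return t, t_cdf(t, n - 1)


differences = [-0.9, -3.7, -3.2, -0.5, 0.6, -1.9, -0.5,
               0.2, 0.6, 0.4, 1.7, -2.4, 1.8]
t, p_value = paired_t_test_left(differences)
print(f"t = {t:.4f}, p-value = {p_value:.4f}")
```

The sample mean difference is \(-0.6\), which gives a test statistic of about \(-1.23\) and a left-tail probability close to the 0.1207 reported above.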

10.6: Hypothesis Testing for Two Means and Two Proportions


9: Hypothesis Testing with Two Samples


You have learned to conduct hypothesis tests on single means and single proportions. You will expand upon that in this chapter. You will compare two means or two proportions to each other. The general procedure is still the same, just expanded. To compare two means or two proportions, you work with two groups. The groups are classified either as independent or matched pairs. Independent groups consist of two samples that are independent, that is, sample values selected from one population are not related in any way to sample values selected from the other population. Matched pairs consist of two samples that are dependent. The parameter tested using matched pairs is the population mean. The parameters tested using independent groups are either population means or population proportions.

  • 9.1: Prelude to Hypothesis Testing with Two Samples This chapter deals with the following hypothesis tests: Independent groups (samples are independent): test of two population means; test of two population proportions. Matched or paired samples (samples are dependent): test of the two population means by testing one population mean of differences.
  • 9.2: Comparing Two Independent Population Means (Hypothesis test) The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples.
  • 9.3: Comparing Two Independent Population Proportions (Hypothesis test) Comparing two proportions, like comparing two means, is common. If two estimated proportions are different, it may be due to a difference in the populations or it may be due to chance. A hypothesis test can help determine if a difference in the estimated proportions reflects a difference in the population proportions.
  • 9.4: Matched or Paired Samples When using a hypothesis test for matched or paired samples, the following characteristics should be present: Simple random sampling is used. Sample sizes are often small. Two measurements (samples) are drawn from the same pair of individuals or objects. Differences are calculated from the matched or paired samples. The differences form the sample that is used for the hypothesis test. Either the matched pairs have differences that come from a population that is normal or the number of difference
  • 9.5: Hypothesis Testing for Two Means and Two Proportions (Worksheet) A statistics Worksheet: The student will select the appropriate distributions to use in each case. The student will conduct hypothesis tests and interpret the results.
  • 9.E: Hypothesis Testing with Two Samples (Exercises) These are homework exercises to accompany the Textmap created for "Introductory Statistics" by OpenStax.
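The two independent-group tests outlined above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the textbook's own procedure: the function names `welch_t` and `two_proportion_z` and all of the data are hypothetical. The means test computes the unpooled t statistic and its approximate degrees of freedom; the p-value would then come from a Student's t-distribution, which the standard library does not provide. The proportions test uses the pooled proportion and a normal approximation.

```python
import math
import statistics
from statistics import NormalDist


def welch_t(sample1, sample2):
    """Return (t, df) for two independent samples, variances not pooled."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    # statistics.variance uses the n - 1 (sample) denominator.
    se2_1 = statistics.variance(sample1) / n1
    se2_2 = statistics.variance(sample2) / n2
    t = (m1 - m2) / math.sqrt(se2_1 + se2_2)
    # Approximate degrees of freedom for unequal variances.
    df = (se2_1 + se2_2) ** 2 / (
        se2_1 ** 2 / (n1 - 1) + se2_2 ** 2 / (n2 - 1)
    )
    return t, df


def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two-sided p-value) using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data, for illustration only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(welch_t(group_a, group_b))
print(two_proportion_z(20, 200, 12, 200))
```

A negative t here simply means the first sample mean is smaller than the second; the sign matters only for one-tailed alternatives.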

COMMENTS

  1. 10: Hypothesis Testing with Two Samples

    10.E: Hypothesis Testing with Two Samples (Exercises) These are homework exercises to accompany the Textmap created for "Introductory Statistics" by OpenStax. You have learned to conduct hypothesis tests on single means and single proportions. You will expand upon that in this chapter. You will compare two means or two proportions to each other.

  2. Two Sample t-test: Definition, Formula, and Example

    A two sample t-test is used to determine whether or not two population means are equal. ... 0.05, and 0.01) then you can reject the null hypothesis. Two Sample t-test: Assumptions. For the results of a two sample t-test to be valid, the following assumptions should be met:

  3. PDF Two Samples Hypothesis Testing

    Here, we extend the concept of hypothesis testing to the comparison of two variables \(x_A\) and \(x_B\). Two Samples Hypothesis Testing when n is the same for the two samples. Two-tailed paired samples hypothesis test: In engineering analysis, we often want to test whether some modification to a system causes a statistically significant ...

  4. Two-sample hypothesis testing

    In statistical hypothesis testing, a two-sample test is a test performed on the data of two random samples, each independently obtained from a different given population. The purpose of the test is to determine whether the difference between these two populations is statistically significant. There are a large number of statistical tests that ...

  5. Hypothesis Testing

    Table of contents. Step 1: State your null and alternate hypothesis. Step 2: Collect data. Step 3: Perform a statistical test. Step 4: Decide whether to reject or fail to reject your null hypothesis. Step 5: Present your findings. Other interesting articles. Frequently asked questions about hypothesis testing.

  6. How t-Tests Work: 1-sample, 2-sample, and Paired t-Tests

    A paired t-test takes paired observations (like before and after), subtracts one from the other, and conducts a 1-sample t-test on the differences. Typically, a paired t-test determines whether the paired differences are significantly different from zero. Download the CSV data file to check this yourself: T-testData.

  7. Two-sample t test for difference of means

    And let's assume that we are working with a significance level of 0.05. So pause the video, and conduct the two sample T test here, to see whether there's evidence that the sizes of tomato plants differ between the fields. Alright, now let's work through this together. So like always, let's first construct our null hypothesis.

  8. Hypothesis Testing for 2 Samples: Introduction

    The mean for the last recorded percentage was less than half of the initial score: 30.27 (SD 34.03). The decrease was found to be statistically significant using a paired sample t-test (t = 4.36, 36 df, p < .001)." This is a hypothesis test for matched pairs, sometimes known as 2 means, dependent samples.

  9. Hypotheses for a two-sample t test (video)

    If that's below your significance level, then you would reject your null hypothesis and it would suggest the alternative that might be that, "Hey, maybe this mean is greater than zero." On the other hand, a two-sample T test is where you're thinking about two different populations. For example, you could be thinking about a population of men ...

  10. Putting It Together: Hypothesis Testing with Two Samples

    Let's Summarize. The steps for performing a hypothesis test for two population means with unknown standard deviation are generally the same as the steps for conducting a hypothesis test for one population mean with unknown standard deviation, using a t-distribution. Because the population standard deviations are not known, the sample standard deviations are used for calculations.

  11. Hypothesis Testing: Two Samples

    The Population Mean: This image shows a series of histograms for a large number of sample means taken from a population. Recall that as more sample means are taken, the mean of these means gets closer to the population mean. In this section, we explore hypothesis testing of two independent population means (and proportions) and also tests for paired samples of population means.

  12. Example of hypotheses for paired and two-sample t tests

    First of all, if you have two groups, one testing one placebo, then it's 2 samples. If it is the same group before and after, then paired t-test. I'm trying to run a dependent sample t-test/paired sample t test through using data from a Qualtrics survey measuring two groups of people (one with social anxiety and one without on the effects of ...

  13. Two sample testing

    It is often done using a hypothesis test - hence the name "two sample testing". This is also called A/B testing. The natural hypotheses for this situation are: H 0: the two samples are generated from the same distribution. H A: the two samples are generated from two different distributions. The test statistic is normally based on the ...

  14. 6a.2

    Lesson 6a: Hypothesis Testing for One-Sample Proportion. 6a.1 - Introduction to Hypothesis Testing ; 6a.2 - Steps for Hypothesis Tests; 6a.3 - Set-Up for One-Sample Hypotheses; 6a.4 - Hypothesis Test for One-Sample Proportion. 6a.4.1 - Making a Decision; 6a.4.2 - More on the P-Value and Rejection Region Approach

  15. 5.5

    5.5 - Hypothesis Testing for Two-Sample Proportions. We are now going to develop the hypothesis test for the difference of two proportions for independent samples. The hypothesis test follows the same steps as one group. These notes are going to go into a little bit of math and formulas to help demonstrate the logic behind hypothesis testing ...

  16. 8: Hypothesis Testing with Two Samples

    8.1: Prelude to Hypothesis Testing with Two Samples. This chapter deals with the following hypothesis tests: for independent groups (samples are independent), a test of two population means and a test of two population proportions; for matched or paired samples (samples are dependent), a test of two population means carried out by testing one population mean of ...

  17. 9.2: Comparing Two Independent Population Means (Hypothesis test)

    Distribution for the test: Use \(t_{df}\) where \(df\) is calculated using the \(df\) formula for independent groups, two population means. Using a calculator, \(df\) is approximately 18.8462. Do not pool the variances. Calculate the p-value using a Student's t-distribution: p-value = 0.0054. Graph:

  18. 10: Hypothesis Testing with Two Samples

    10.4: Matched or Paired Samples When using a hypothesis test for matched or paired samples, the following characteristics should be present: Simple random sampling is used. Sample sizes are often small. Two measurements (samples) are drawn from the same pair of individuals or objects. Differences are calculated from the matched or paired samples.

  19. Hypothesis Testing: 2 Means (Independent Samples)

    Since we are being asked for convincing statistical evidence, a hypothesis test should be conducted. In this case, we are dealing with averages from two samples or groups (the home run distances), so we will conduct a Test of 2 Means. \(n_1 = 70\) is the sample size for the first group. \(n_2 = 66\) is the sample size for the second group.

  20. Two Sample T-Test

    1. Two-tailed test example: A factory uses two identical machines to produce plastic plates. You would expect both machines to produce the same number of plates per minute. Let μ1 = average number of plates produced by machine 1 per minute. Let μ2 = average number of plates produced by machine 2 per minute. We would expect μ1 to be equal to μ2.

  21. Two Sample t-test Calculator

    If this is not the case, you should instead use the Welch's t-test calculator. To perform a two sample t-test, simply fill in the information below and then click the "Calculate" button. Enter raw data Enter summary data. Sample 1. 301, 298, 295, 297, 304, 305, 309, 298, 291, 299, 293, 304. Sample 2.

  23. Hypothesis Testing

    Step 1: Set up the null hypothesis and identify whether the test is left-tailed, right-tailed, or two-tailed. Step 2: Set up the alternative hypothesis. Step 3: Choose the significance level, \(\alpha\), and find the critical value. Step 4: Calculate the test statistic (z, t, or \(\chi^2\)) and the p-value.

  24. Stat 205

    Under the null hypothesis, the test statistic \(\chi^2 = (n-1)S^2/0.5^2\) follows a chi-square distribution with df = 5. If the alternative hypothesis is true and \(\sigma^2\) is greater than the hypothesized value, then the sample variance \(S^2\) will tend to be greater than the hypothesized value and the test statistic will tend to be large

  25. 10.E: Hypothesis Testing with Two Samples (Exercises)

    Use the following information to answer the next 15 exercises: Indicate if the hypothesis test is for independent group means, population standard deviations and/or variances known; independent group means, population standard deviations and/or variances unknown; matched or paired samples; single mean.

  26. PDF If testing a 2-sided hypothesis, use a 2-sided test! → for null

    If testing a 2-sided hypothesis, use a 2-sided test! Morals of the sidedness (or tail) tale: + A single, 1-sided test is fine if one has prior information and makes *a* 1-sided hypothesis. + For all other cases, use *a* 2-sided test. + A pair of 1-sided tests with FPR = α is equivalent to one 2-sided test with FPR = 2α, i.e.,

  27. 9: Hypothesis Testing with Two Samples

    The differences form the sample that is used for the hypothesis test. Either the matched pairs have differences that come from a population that is normal or the number of difference; 9.5: Hypothesis Testing for Two Means and Two Proportions (Worksheet) A statistics Worksheet: The student will select the appropriate distributions to use in each ...
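The matched-pairs idea that recurs in the snippets above, that the differences form the sample and a one-sample t-test is run on them, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `paired_t` and the before/after measurements are hypothetical, and only the test statistic and degrees of freedom are computed. The p-value would come from a Student's t-distribution with n - 1 degrees of freedom, which the Python standard library does not provide.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, df) for a paired test of H0: mean difference = 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # The differences are the sample actually used in the test.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical measurements on the same four subjects.
before = [10, 12, 9, 11]
after = [12, 15, 10, 14]
print(paired_t(before, after))
```

Because each subject serves as its own control, the variability between subjects drops out of the differences, which is why the paired design uses a single-sample test rather than the two-sample formulas.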