Normal Hypothesis Testing (AQA A Level Maths: Statistics)


How is a hypothesis test carried out with the normal distribution?

  • The population mean is tested by looking at the mean of a sample taken from the population
  • A hypothesis test is used when the value of the assumed population mean is questioned
  • Make sure you clearly define µ before writing the hypotheses, if it has not been defined in the question
  • The null hypothesis will always be H 0 : µ = ...
  • The alternative hypothesis will depend on if it is a one-tailed or two-tailed test
  • For a one-tailed test the alternative hypothesis will be H 1 : µ > ... or H 1 : µ < ...
  • For a two-tailed test the alternative hypothesis will be H 1 : µ ≠ ...
  • Remember that the variance of the sample mean distribution will be the variance of the population distribution divided by n
  • the mean of the sample mean distribution will be the same as the mean of the population distribution
  • The normal distribution is used to calculate the probability of the test statistic taking the observed value or a more extreme value
  • The test can be carried out either by calculating this probability (the p-value) and comparing it with the significance level, or by finding the critical region and checking whether the observed value lies within it
  • Finding the critical region can be more useful when considering more than one observed value or for further testing

How is the critical value found in a hypothesis test for the mean of a normal distribution?

  • The probability of the observed value being within the critical region, given that the null hypothesis is true, will be the same as the significance level
  • To find the critical value(s) find the distribution of the sample means, assuming H 0 is true, and use the inverse normal function on your calculator
  • For a two-tailed test you will need to find both critical values, one at each end of the distribution
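To make this concrete, here is a minimal Python sketch (standard library only) of finding the two critical values for a two-tailed test on a sample mean. The numbers $\mu_0 = 30$, $\sigma = 4$, $n = 25$ and the $5$% significance level are hypothetical illustrations, not taken from the text.

```python
from statistics import NormalDist

# Hypothetical numbers (not from the text): H0 assumes mu0 = 30,
# known population sigma = 4, sample size n = 25, 5% two-tailed test.
mu0, sigma, n, alpha = 30, 4, 25, 0.05

# Under H0 the sample mean is N(mu0, sigma^2 / n).
xbar_dist = NormalDist(mu0, sigma / n ** 0.5)

# One critical value in each tail, each cutting off alpha / 2.
lower = xbar_dist.inv_cdf(alpha / 2)        # about 28.43
upper = xbar_dist.inv_cdf(1 - alpha / 2)    # about 31.57
```

A sample mean below `lower` or above `upper` would lead to rejecting H 0 at the 5% level.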

What steps should I follow when carrying out a hypothesis test for the mean of a normal distribution?

  • Following these steps will help when carrying out a hypothesis test for the mean of a normal distribution:

Step 1.  Define the population parameter µ in context, if it has not been defined in the question

Step 2.  Write the null and alternative hypotheses clearly using the form

H 0 : μ = ...

H 1 : μ > ... , H 1 : μ < ... or H 1 : μ ≠ ...

Step 3.  State the distribution of the sample mean, assuming H 0 is true

Step 4.  Calculate either the critical value(s) or the p-value (the probability of the test statistic taking the observed or a more extreme value) for the test

Step 5.  Compare the observed value of the test statistic with the critical value(s), or the p-value with the significance level

Step 6.  Decide whether there is sufficient evidence to reject H 0 or not

Step 7.  Write a conclusion in context
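The p-value route through these steps can be sketched in Python (standard library only). The helper function and the sample numbers below are hypothetical illustrations, not from the text.

```python
from statistics import NormalDist

def z_test_p_value(xbar, mu0, sigma, n, tail=">"):
    """p-value for a test of H0: mu = mu0 when sigma is known."""
    z = (xbar - mu0) / (sigma / n ** 0.5)
    if tail == ">":        # H1: mu > mu0
        p = 1 - NormalDist().cdf(z)
    elif tail == "<":      # H1: mu < mu0
        p = NormalDist().cdf(z)
    else:                  # H1: mu != mu0 (two-tailed)
        p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical numbers: a sample of n = 40 with mean 52.1, testing
# H0: mu = 50 against H1: mu > 50 with known sigma = 6.
z, p = z_test_p_value(52.1, 50, 6, 40)
# Reject H0 at the 5% level only if p < 0.05.
```

The final comparison of `p` with the significance level corresponds to Step 5, and the decision and conclusion (Steps 6 and 7) follow from it.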



Author: Amber

Amber gained a first class degree in Mathematics & Meteorology from the University of Reading before training to become a teacher. She is passionate about teaching, having spent 8 years teaching GCSE and A Level Mathematics both in the UK and internationally. Amber loves creating bright and informative resources to help students reach their potential.


8.4.3 Hypothesis Testing for the Mean

$\quad$ $H_0$: $\mu=\mu_0$, $\quad$ $H_1$: $\mu \neq \mu_0$.

$\quad$ $H_0$: $\mu \leq \mu_0$, $\quad$ $H_1$: $\mu > \mu_0$.

$\quad$ $H_0$: $\mu \geq \mu_0$, $\quad$ $H_1$: $\mu \lt \mu_0$.

Two-sided Tests for the Mean:

Therefore, we can suggest the following test. Choose a threshold, and call it $c$. If $|W| \leq c$, accept $H_0$, and if $|W|>c$, accept $H_1$. How do we choose $c$? If $\alpha$ is the required significance level, we must have \begin{align} P(\textrm{type I error}) = P(|W| > c \; | \; H_0) = \alpha. \end{align}

  • As discussed above, we let \begin{align}%\label{} W(X_1,X_2, \cdots,X_n)=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}. \end{align} Note that, assuming $H_0$, $W \sim N(0,1)$. We will choose a threshold, $c$. If $|W| \leq c$, we accept $H_0$, and if $|W|>c$, accept $H_1$. To choose $c$, we let \begin{align} P(|W| > c \; | \; H_0) =\alpha. \end{align} Since the standard normal PDF is symmetric around $0$, we have \begin{align} P(|W| > c \; | \; H_0) = 2 P(W>c | \; H_0). \end{align} Thus, we conclude $P(W>c | \; H_0)=\frac{\alpha}{2}$. Therefore, \begin{align} c=z_{\frac{\alpha}{2}}. \end{align} Therefore, we accept $H_0$ if \begin{align} \left|\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \right| \leq z_{\frac{\alpha}{2}}, \end{align} and reject it otherwise.
  • We have \begin{align} \beta (\mu) &=P(\textrm{type II error}) = P(\textrm{accept }H_0 \; | \; \mu) \\ &= P\left(\left|\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \right| \lt z_{\frac{\alpha}{2}}\; | \; \mu \right). \end{align} If $X_i \sim N(\mu,\sigma^2)$, then $\overline{X} \sim N(\mu, \frac{\sigma^2}{n})$. Thus, \begin{align} \beta (\mu)&=P\left(\left|\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \right| \lt z_{\frac{\alpha}{2}}\; | \; \mu \right)\\ &=P\left(\mu_0- z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}} \leq \overline{X} \leq \mu_0+ z_{\frac{\alpha}{2}} \frac{\sigma}{\sqrt{n}}\right)\\ &=\Phi\left(z_{\frac{\alpha}{2}}+\frac{\mu_0-\mu}{\sigma / \sqrt{n}}\right)-\Phi\left(-z_{\frac{\alpha}{2}}+\frac{\mu_0-\mu}{\sigma / \sqrt{n}}\right). \end{align}
  • Let $S^2$ be the sample variance for this random sample. Then, the random variable $W$ defined as \begin{equation} W(X_1,X_2, \cdots, X_n)=\frac{\overline{X}-\mu_0}{S / \sqrt{n}} \end{equation} has a $t$-distribution with $n-1$ degrees of freedom, i.e., $W \sim T(n-1)$. Thus, we can repeat the analysis of Example 8.24 here. The only difference is that we need to replace $\sigma$ by $S$ and $z_{\frac{\alpha}{2}}$ by $t_{\frac{\alpha}{2},n-1}$. Therefore, we accept $H_0$ if \begin{align} |W| \leq t_{\frac{\alpha}{2},n-1}, \end{align} and reject it otherwise. Let us look at a numerical example of this case.

$\quad$ $H_0$: $\mu=170$, $\quad$ $H_1$: $\mu \neq 170$.

  • Let's first compute the sample mean and the sample standard deviation. The sample mean is \begin{align}%\label{} \overline{X}&=\frac{X_1+X_2+X_3+X_4+X_5+X_6+X_7+X_8+X_9}{9}\\ &=165.8 \end{align} The sample variance is given by \begin{align}%\label{} {S}^2=\frac{1}{9-1} \sum_{k=1}^9 (X_k-\overline{X})^2&=68.01 \end{align} The sample standard deviation is given by \begin{align}%\label{} S&= \sqrt{S^2}=8.25 \end{align} The following MATLAB code can be used to obtain these values:

        x=[176.2,157.9,160.1,180.9,165.1,167.2,162.9,155.7,166.2];
        m=mean(x);
        v=var(x);
        s=std(x);

    Now, our test statistic is \begin{align} W(X_1,X_2, \cdots, X_9)&=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}\\ &=\frac{165.8-170}{8.25 / 3}=-1.52 \end{align} Thus, $|W|=1.52$. Also, we have \begin{align} t_{\frac{\alpha}{2},n-1} = t_{0.025,8} \approx 2.31 \end{align} The above value can be obtained in MATLAB using the command $\mathtt{tinv(0.975,8)}$. Thus, we conclude \begin{align} |W| \leq t_{\frac{\alpha}{2},n-1}. \end{align} Therefore, we accept $H_0$. In other words, we do not have enough evidence to conclude that the average height in the city is different from the average height in the country.
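As a cross-check, the same computation can be reproduced in Python with the standard library. The `2.31` used below is the $t_{0.025,8}$ critical value quoted above (the standard library has no inverse-$t$ function, so it is taken from tables or `tinv(0.975,8)`).

```python
from statistics import mean, stdev

# Height data from the example above.
x = [176.2, 157.9, 160.1, 180.9, 165.1, 167.2, 162.9, 155.7, 166.2]
n, mu0 = len(x), 170

xbar = mean(x)   # 165.8
s = stdev(x)     # sample standard deviation, about 8.25

# Test statistic W = (xbar - mu0) / (s / sqrt(n)), about -1.5
w = (xbar - mu0) / (s / n ** 0.5)

# Compare |W| with t_{0.025, 8}, approximately 2.31.
accept_h0 = abs(w) <= 2.31
```

Since `accept_h0` is true, this agrees with the conclusion above: we do not reject $H_0$.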

Let us summarize what we have obtained for the two-sided test for the mean.

Case Test Statistic Acceptance Region
$X_i \sim N(\mu, \sigma^2)$, $\sigma$ known $W=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}$ $|W| \leq z_{\frac{\alpha}{2}}$
$n$ large, $X_i$ non-normal $W=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}$ $|W| \leq z_{\frac{\alpha}{2}}$
$X_i \sim N(\mu, \sigma^2)$, $\sigma$ unknown $W=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}$ $|W| \leq t_{\frac{\alpha}{2},n-1}$
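The first row of this table (known $\sigma$) can be sketched as a small Python function; the function name and the example numbers are illustrative only.

```python
from statistics import NormalDist

def two_sided_z_test(xbar, mu0, sigma, n, alpha=0.05):
    """Accept H0: mu = mu0 iff |W| <= z_{alpha/2}, with sigma known."""
    w = (xbar - mu0) / (sigma / n ** 0.5)
    c = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}, about 1.96 for alpha = 0.05
    return abs(w) <= c

# Hypothetical example: xbar = 103, testing mu0 = 100 with sigma = 15, n = 50.
accept = two_sided_z_test(103, 100, 15, 50)   # |W| is about 1.41, so H0 is accepted
```

The other two rows differ only in using $S$ in place of $\sigma$ and, for small non-normal-free samples, a $t$ critical value in place of $z_{\frac{\alpha}{2}}$.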

One-sided Tests for the Mean:

  • As before, we define the test statistic as \begin{align}%\label{} W(X_1,X_2, \cdots,X_n)=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}. \end{align} If $H_0$ is true (i.e., $\mu \leq \mu_0$), we expect $\overline{X}$ (and thus $W$) to be relatively small, while if $H_1$ is true, we expect $\overline{X}$ (and thus $W$) to be larger. This suggests the following test: Choose a threshold, and call it $c$. If $W \leq c$, accept $H_0$, and if $W>c$, accept $H_1$. How do we choose $c$? If $\alpha$ is the required significance level, we must have \begin{align} P(\textrm{type I error}) &= P(\textrm{Reject }H_0 \; | \; H_0) \\ &= P(W > c \; | \; \mu \leq \mu_0) \leq \alpha. \end{align} Here, the probability of type I error depends on $\mu$. More specifically, for any $\mu \leq \mu_0$, we can write \begin{align} P(\textrm{type I error} \; | \; \mu) &= P(\textrm{Reject }H_0 \; | \; \mu) \\ &= P(W > c \; | \; \mu)\\ &=P \left(\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}> c \; | \; \mu\right)\\ &=P \left(\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}+\frac{\mu-\mu_0}{\sigma / \sqrt{n}}> c \; | \; \mu\right)\\ &=P \left(\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}> c+\frac{\mu_0-\mu}{\sigma / \sqrt{n}} \; | \; \mu\right)\\ &\leq P \left(\frac{\overline{X}-\mu}{\sigma / \sqrt{n}}> c \; | \; \mu\right) \quad (\textrm{ since }\mu \leq \mu_0)\\ &=1-\Phi(c) \quad \big(\textrm{ since given }\mu, \frac{\overline{X}-\mu}{\sigma / \sqrt{n}} \sim N(0,1) \big). \end{align} Thus, we can choose $\alpha=1-\Phi(c)$, which results in \begin{align} c=z_{\alpha}. \end{align} Therefore, we accept $H_0$ if \begin{align} \frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}} \leq z_{\alpha}, \end{align} and reject it otherwise.
Case Test Statistic Acceptance Region
$X_i \sim N(\mu, \sigma^2)$, $\sigma$ known $W=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}$ $W \leq z_{\alpha}$
$n$ large, $X_i$ non-normal $W=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}$ $W \leq z_{\alpha}$
$X_i \sim N(\mu, \sigma^2)$, $\sigma$ unknown $W=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}$ $W \leq t_{\alpha,n-1}$

$\quad$ $H_0$: $\mu \geq \mu_0$, $\quad$ $H_1$: $\mu \lt \mu_0$,

Case Test Statistic Acceptance Region
$X_i \sim N(\mu, \sigma^2)$, $\sigma$ known $W=\frac{\overline{X}-\mu_0}{\sigma / \sqrt{n}}$ $W \geq -z_{\alpha}$
$n$ large, $X_i$ non-normal $W=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}$ $W \geq -z_{\alpha}$
$X_i \sim N(\mu, \sigma^2)$, $\sigma$ unknown $W=\frac{\overline{X}-\mu_0}{S / \sqrt{n}}$ $W \geq -t_{\alpha,n-1}$



Hypothesis tests about the mean

by Marco Taboga, PhD

This lecture explains how to conduct hypothesis tests about the mean of a normal distribution.

We tackle two different cases:

when we know the variance of the distribution, then we use a z-statistic to conduct the test;

when the variance is unknown, then we use the t-statistic.

In each case we derive the power and the size of the test.

We conclude with two solved exercises on size and power.

Table of contents

  • Known variance: the z-test (the null hypothesis, the test statistic, the critical region, the decision, the power function, the size of the test, how to choose the critical value)
  • Unknown variance: the t-test (how to choose the critical values)
  • Solved exercises

The assumptions are the same as those we made in the lecture on confidence intervals for the mean.

A test of hypothesis based on it is called a z-test.

Otherwise, it is not rejected.


We explain how to do this in the page on critical values .

This case is similar to the previous one. The only difference is that we now relax the assumption that the variance of the distribution is known.

The test of hypothesis based on it is called a t-test.

Otherwise, we do not reject it.


The page on critical values explains how this equation is solved.

Below you can find some exercises with explained solutions.

Suppose that a statistician observes 100 independent realizations of a normal random variable.

The mean and the variance of the random variable, which the statistician does not know, are equal to 1 and 4 respectively.

Find the probability that the statistician will reject the null hypothesis that the mean is equal to zero if:

she runs a t-test based on the 100 observed realizations;


A statistician observes 100 independent realizations of a normal random variable.

She performs a t-test of the null hypothesis that the mean of the variable is equal to zero.


How to cite

Please cite as:

Taboga, Marco (2021). "Hypothesis tests about the mean", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/hypothesis-testing-mean.


Hypothesis Testing with the Normal Distribution


Introduction

When constructing a confidence interval with the standard normal distribution, these are the most important values that will be needed.

Significance Level           $10$%      $5$%       $1$%
$z_{1-\alpha}$               $1.28$     $1.645$    $2.33$
$z_{1-\frac{\alpha}{2} }$    $1.645$    $1.96$     $2.58$

Distribution of Sample Means

Suppose the null hypothesis is $H_0: \mu = \mu_0$, where $\mu$ is the true mean and $\mu_0$ is the currently accepted population mean. Draw samples of size $n$ from the population. When $n$ is large enough and the null hypothesis is true, the sample means approximately follow a normal distribution with mean $\mu_0$ and standard deviation $\frac{\sigma}{\sqrt{n}}$. This is called the distribution of sample means and can be denoted by $\bar{X} \sim \mathrm{N}\left(\mu_0, \frac{\sigma^2}{n}\right)$. This follows from the central limit theorem.

The $z$-score will this time be obtained with the formula \[Z = \dfrac{\bar{X} - \mu_0}{\frac{\sigma}{\sqrt{n}}}.\]

So if $\mu = \mu_0$, then $\bar{X} \sim \mathrm{N}\left(\mu_0, \frac{\sigma^2}{n}\right)$ and $Z \sim \mathrm{N}(0,1)$.

The alternative hypothesis will then take one of the forms $H_1: \mu > \mu_0$, $H_1: \mu < \mu_0$ or $H_1: \mu \neq \mu_0$, depending on what we are testing.

Worked Example

An automobile company is looking for fuel additives that might increase gas mileage. Without additives, their cars are known to average $25$ mpg (miles per gallon) with a standard deviation of $2.4$ mpg on a road trip from London to Edinburgh. The company now asks whether a particular new additive increases this value. In a study, thirty cars are sent on a road trip from London to Edinburgh. Suppose it turns out that the thirty cars averaged $\overline{x}=25.5$ mpg with the additive. Can we conclude from this result that the additive is effective?

We are asked to show if the new additive increases the mean miles per gallon. The current mean $\mu = 25$ so the null hypothesis will be that nothing changes. The alternative hypothesis will be that $\mu > 25$ because this is what we have been asked to test.

\begin{align} &H_0:\mu=25. \\ &H_1:\mu>25. \end{align}

Now we need to calculate the test statistic. We start with the assumption that the normal distribution is still valid, because the null hypothesis states there is no change in $\mu$. Thus, as the value $\sigma=2.4$ mpg is known, we perform a hypothesis test with the standard normal distribution, so the test statistic will be a $z$-score. We compute the $z$-score using the formula \[z=\frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n} } }.\] So \begin{align} z&=\frac{\overline{x}-25}{\frac{2.4}{\sqrt{30} } }\\ &=1.14 \end{align}

We are using a $5$% significance level and a (right-sided) one-tailed test, so $\alpha=0.05$ and from the tables we obtain the critical value $z_{1-\alpha} = 1.645$.

As $1.14<1.645$, the test statistic is not in the critical region so we cannot reject $H_0$. Thus, the observed sample mean $\overline{x}=25.5$ is consistent with the hypothesis $H_0:\mu=25$ on a $5$% significance level.
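This worked example can be checked with a short Python sketch, using the standard library's `NormalDist` for the critical value.

```python
from statistics import NormalDist

# Fuel-additive example: H0: mu = 25 vs H1: mu > 25, sigma = 2.4, n = 30.
xbar, mu0, sigma, n = 25.5, 25, 2.4, 30

z = (xbar - mu0) / (sigma / n ** 0.5)   # about 1.14
critical = NormalDist().inv_cdf(0.95)   # z_{1-alpha} = 1.645 at the 5% level

reject_h0 = z > critical                # False: we cannot reject H0
```

As in the worked example, `z` falls below the critical value, so $H_0$ is not rejected.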

Video Example

In this video, Dr Lee Fawcett explains how to conduct a hypothesis test for the mean of a single distribution whose variance is known, using a one-sample z-test.

Approximation to the Binomial Distribution

A supermarket has come under scrutiny after a number of complaints that its carrier bags fall apart when the load they carry is $5$kg. Out of a random sample of $200$ bags, $185$ do not tear when carrying a load of $5$kg. Can the supermarket claim at a $5$% significance level that more than $90$% of the bags will not fall apart?

Let $X$ represent the number of carrier bags which can hold a load of $5$kg. Then $X \sim \mathrm{Bin}(200,p)$ and \begin{align}H_0&: p = 0.9 \\ H_1&: p > 0.9 \end{align}

We need to calculate the mean $\mu$ and variance $\sigma ^2$.

\[\mu = np = 200 \times 0.9 = 180\text{.}\] \[\sigma ^2= np(1-p) = 18\text{.}\]

Using the normal approximation to the binomial distribution we obtain $Y \sim \mathrm{N}(180, 18)$.

\[\mathrm{P}[X \geq 185] = \mathrm{P}\left[Z \geq \dfrac{184.5 - 180}{4.2426} \right] = \mathrm{P}\left[Z \geq 1.0607\right] \text{.}\]

Because we are using a one-tailed test at a $5$% significance level, the critical value is $Z=1.645$. Now $1.0607 < 1.645$ so we cannot reject the null hypothesis. There is insufficient evidence, at the $5$% significance level, to conclude that more than $90$% of the supermarket's carrier bags can withstand a load of $5$kg.
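This calculation can be verified with a short Python sketch (standard library only), including the continuity correction used above.

```python
from statistics import NormalDist

# Carrier-bag example: X ~ Bin(200, 0.9) under H0, approximated by N(180, 18).
n, p = 200, 0.9
mu = n * p             # 180
var = n * p * (1 - p)  # 18

# P(X >= 185) with a continuity correction of 0.5.
z = (184.5 - mu) / var ** 0.5          # about 1.06
p_value = 1 - NormalDist().cdf(z)      # about 0.14

reject_h0 = z > 1.645                  # False at the 5% level
```

The p-value of roughly 0.14 is well above 0.05, matching the conclusion that $H_0$ cannot be rejected.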

Comparing Two Means

When we test hypotheses with two means, we will look at the difference $\mu_1 - \mu_2$. The null hypothesis will be of the form \[H_0: \mu_1 - \mu_2 = a,\]

where $a$ is a constant. Often $a=0$ is used to test if the two means are the same. Given two continuous random variables $X_1$ and $X_2$ with means $\mu_1$ and $\mu_2$ and variances $\sigma_1^2$ and $\sigma_2^2$ respectively, the sample means $\bar{X_1}$ and $\bar{X_2}$ of independent samples of sizes $n_1$ and $n_2$ satisfy \[\mathrm{E} [\bar{X_1} - \bar{X_2} ] = \mathrm{E} [\bar{X_1}] - \mathrm{E} [\bar{X_2}] = \mu_1 - \mu_2\] and \[\mathrm{Var}[\bar{X_1} - \bar{X_2}] = \mathrm{Var}[\bar{X_1}] + \mathrm{Var}[\bar{X_2}]=\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}\text{.}\] Note this last result: the variance of the difference is calculated by summing the variances.

We then obtain the $z$-score using the formula \[Z = \frac{(\bar{X_1}-\bar{X_2})-(\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}\text{.}\]
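A minimal Python sketch of this two-sample $z$-score; the sample figures are hypothetical, purely to show the formula in use.

```python
from statistics import NormalDist

def two_mean_z(xbar1, xbar2, sigma1, sigma2, n1, n2, a=0):
    """z-score for testing H0: mu1 - mu2 = a with known variances."""
    se = (sigma1 ** 2 / n1 + sigma2 ** 2 / n2) ** 0.5   # variances add
    return (xbar1 - xbar2 - a) / se

# Hypothetical samples: means 12.3 and 11.5, sigmas 2 and 2.5, sizes 40 and 50.
z = two_mean_z(12.3, 11.5, 2, 2.5, 40, 50)              # about 1.69
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))        # about 0.09
```

With these illustrative numbers the two-sided p-value is about 0.09, so the difference would not be significant at the 5% level.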

These workbooks produced by HELM are good revision aids, containing key points for revision and many worked examples.

  • Tests concerning a single sample
  • Tests concerning two samples


Data analysis: hypothesis testing


4.1 The normal distribution

Here, you will look at the concept of normal distribution and the bell-shaped curve. The peak point (the top of the bell) represents the most probable occurrences, while other possible occurrences are distributed symmetrically around the peak point, creating a downward-sloping curve on either side of the peak point.

Cartoon: a bell-shaped curve whose x-axis is titled ‘How high the hill is’ and y-axis ‘Number of hills’. The top of the curve is labelled ‘Average hill’, while the lower right tail is labelled ‘Big hill’.

In order to test hypotheses, you need to calculate the test statistic and compare it with the value in the bell curve. This will be done by using the concept of ‘normal distribution’.

A normal distribution is a probability distribution that is symmetric about the mean, indicating that data near the mean are more likely to occur than data far from it. In graph form, a normal distribution appears as a bell curve. The values in the x-axis of the normal distribution graph represent the z-scores. The test statistic that you wish to use to test the set of hypotheses is the z-score . A z-score is used to measure how far the observation (sample mean) is from the 0 value of the bell curve (population mean). In statistics, this distance is measured by standard deviation. Therefore, when the z-score is equal to 2, the observation is 2 standard deviations away from the value 0 in the normal distribution curve.

Figure: the normal distribution, a symmetrical bell-shaped curve with the top of the curve at $0$ on the x-axis.


An Introduction to Bayesian Thinking

Chapter 5 Hypothesis Testing with Normal Populations

In Section 3.5 , we described how the Bayes factors can be used for hypothesis testing. Now we will use the Bayes factors to compare normal means, i.e., test whether the mean of a population is zero or compare the means of two groups of normally-distributed populations. We divide this mission into three cases: known variance for a single population, unknown variance for a single population using paired data, and unknown variance using two independent groups.

Also note that some of the examples in this section use an updated version of the bayes_inference function. If your local output is different from what is seen in this chapter, or the provided code fails to run for you please make sure that you have the most recent version of the package.

5.1 Bayes Factors for Testing a Normal Mean: variance known

Now we show how to obtain Bayes factors for testing hypotheses about a normal mean, where the variance is known. To start, let’s consider a random sample of observations from a normal population with mean \(\mu\) and pre-specified variance \(\sigma^2\). We consider testing whether the population mean \(\mu\) is equal to \(m_0\) or not.

Therefore, we can formulate the data and hypotheses as below:

Data \[Y_1, \cdots, Y_n \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu, \sigma^2)\]

  • \(H_1: \mu = m_0\)
  • \(H_2: \mu \neq m_0\)

We also need to specify priors for \(\mu\) under both hypotheses. Under \(H_1\), we assume that \(\mu\) is exactly \(m_0\), so this occurs with probability 1 under \(H_1\). Under \(H_2\), \(\mu\) is unspecified, so we describe our prior uncertainty with the conjugate normal distribution centered at \(m_0\) and with variance \(\sigma^2/\mathbf{n_0}\). This prior is centered at the hypothesized value \(m_0\), so the mean is equally likely to be larger or smaller than \(m_0\); the variance is divided by the factor \(n_0\), and this hyperparameter \(n_0\) controls the precision of the prior, as before.

In mathematical terms, the priors are:

  • \(H_1: \mu = m_0 \text{ with probability 1}\)
  • \(H_2: \mu \sim \textsf{Normal}(m_0, \sigma^2/\mathbf{n_0})\)

Bayes Factor

Now the Bayes factor for comparing \(H_1\) to \(H_2\) is the ratio of the distribution of the data under the assumption that \(\mu = m_0\) to the distribution of the data under \(H_2\).

\[\begin{aligned} \textit{BF}[H_1 : H_2] &= \frac{p(\text{data}\mid \mu = m_0, \sigma^2 )} {\int p(\text{data}\mid \mu, \sigma^2) p(\mu \mid m_0, \mathbf{n_0}, \sigma^2)\, d \mu} \\ \textit{BF}[H_1 : H_2] &=\left(\frac{n + \mathbf{n_0}}{\mathbf{n_0}} \right)^{1/2} \exp\left\{-\frac 1 2 \frac{n }{n + \mathbf{n_0}} Z^2 \right\} \\ Z &= \frac{(\bar{Y} - m_0)}{\sigma/\sqrt{n}} \end{aligned}\]

The term in the denominator requires integration to account for the uncertainty in \(\mu\) under \(H_2\). It can be shown that the Bayes factor is a function of the observed sample size \(n\), the prior sample size \(n_0\), and a \(Z\) score.

Let’s explore how the hyperparameter \(n_0\) influences the Bayes factor in Equation (5.1). For illustration we will use a sample size of 100. Recall that for estimation, we interpreted \(n_0\) as a prior sample size and considered the limiting case where \(n_0\) goes to zero as a non-informative or reference prior.

\[\begin{equation} \textsf{BF}[H_1 : H_2] = \left(\frac{n + \mathbf{n_0}}{\mathbf{n_0}}\right)^{1/2} \exp\left\{-\frac{1}{2} \frac{n }{n + \mathbf{n_0}} Z^2 \right\} \tag{5.1} \end{equation}\]

Figure 5.1 shows the Bayes factor for comparing \(H_1\) to \(H_2\) on the y-axis as \(n_0\) changes on the x-axis. The different lines correspond to different values of the \(Z\) score or how many standard errors \(\bar{y}\) is from the hypothesized mean. As expected, larger values of the \(Z\) score favor \(H_2\) .


Figure 5.1: Vague prior for mu: n=100

But as \(n_0\) becomes smaller and approaches 0, the first term in the Bayes factor goes to infinity, while the exponential term involving the data goes to a constant and is ignored. In the limit as \(n_0 \rightarrow 0\) under this noninformative prior, the Bayes factor paradoxically ends up favoring \(H_1\) regardless of the value of \(\bar{y}\) .

The takeaway from this is that we cannot use improper priors with \(n_0 = 0\) if we are going to test the hypothesis that \(\mu = m_0\). Similarly, vague priors that use a small value of \(n_0\) are not recommended due to the sensitivity of the results to the choice of an arbitrarily small value of \(n_0\).

This problem with vague priors – that the Bayes factor favors the null model \(H_1\) even when the data are far away from the value under the null – is known as Bartlett’s paradox or the Jeffreys-Lindley paradox.
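The paradox is easy to see numerically. Below is a Python sketch of Equation (5.1) with an illustrative \(Z\)-score of 3 and \(n = 100\) (these numbers are ours, not from the text): as \(n_0\) shrinks toward zero, the Bayes factor grows without bound in favor of \(H_1\) even though the data sit three standard errors from \(m_0\).

```python
from math import exp, sqrt

def bf_h1_h2(z, n, n0):
    """Bayes factor BF[H1 : H2] from Equation (5.1)."""
    return sqrt((n + n0) / n0) * exp(-0.5 * (n / (n + n0)) * z ** 2)

# Illustrative values: Z = 3 (data well away from m0), n = 100.
n, z = 100, 3.0

# As n0 shrinks, the sqrt term blows up while the exponential term
# stabilizes, so BF[H1 : H2] increases without bound.
bfs = [bf_h1_h2(z, n, n0) for n0 in (1, 0.01, 0.0001)]
```

With \(n_0 = 1\) the Bayes factor is about 0.12 (evidence against \(H_1\)), but by \(n_0 = 10^{-4}\) it exceeds 10, paradoxically favoring \(H_1\).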

Now, one way to understand the effect of prior is through the standard effect size

\[\delta = \frac{\mu - m_0}{\sigma}.\] The prior of the standard effect size is

\[\delta \mid H_2 \sim \textsf{Normal}(0, \frac{1}{\mathbf{n_0}})\]

This allows us to think about a standardized effect independent of the units of the problem. One default choice is the unit information prior, where the prior sample size \(n_0\) is 1, leading to a standard normal prior for the standardized effect size. This is depicted with the blue normal density in Figure 5.2. It suggests that we expect the mean to be within \(\pm 1.96\) standard deviations of the hypothesized mean with probability 0.95. (Note that we can say this only in a Bayesian setting.)

In many fields we expect that the effect will be small relative to \(\sigma\). If we do not expect to see large effects, then we may want to use a more informative prior on the effect size, such as the density in orange with \(n_0 = 4\), under which the mean is expected to be within \(\pm 1/\sqrt{n_0} = \pm 0.5\) standard deviations of the prior mean.


Figure 5.2: Prior on standard effect size

Example 1.1 To illustrate, we give an example from parapsychological research. The case involved the test of the subject’s claim to affect a series of randomly generated 0’s and 1’s by means of extra sensory perception (ESP). The random sequence of 0’s and 1’s are generated by a machine with probability of generating 1 being 0.5. The subject claims that his ESP would make the sample mean differ significantly from 0.5.

Therefore, we are testing \(H_1: \mu = 0.5\) versus \(H_2: \mu \neq 0.5\). Let’s use a prior that suggests we do not expect a large effect, which leads to the following choice of \(n_0\): if we want a standard effect of 0.03, so that there is a 95% chance that the effect lies between \((-0.03/\sigma, 0.03/\sigma)\), then \(n_0 = (1.96\sigma/0.03)^2 = 32.7^2\).

Figure 5.3 shows our informative prior in blue, while the unit information prior is in orange. On this scale, the unit information prior needs to be almost uniform for the range that we are interested.


Figure 5.3: Prior effect in the extra sensory perception test

A very large data set with over 104 million trials was collected to test this hypothesis, so we use a normal distribution to approximate the distribution of the sample mean.

  • Sample size: \(n = 1.0449 \times 10^8\)
  • Sample mean: \(\bar{y} = 0.500177\) , standard deviation \(\sigma = 0.5\)
  • \(Z\) -score: 3.61

Now using our prior and the data, the Bayes factor for \(H_1\) to \(H_2\) is 0.46, implying evidence against the hypothesis \(H_1\) that \(\mu = 0.5\).

  • Informative \(\textit{BF}[H_1:H_2] = 0.46\)
  • \(\textit{BF}[H_2:H_1] = 1/\textit{BF}[H_1:H_2] = 2.19\)
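These numbers can be reproduced from Equation (5.1); the sketch below (plain Python) plugs in the \(n\), \(Z\)-score, \(\sigma\), and informative prior described above.

```python
from math import exp, sqrt

# ESP example: sample size, Z-score, and sigma as given in the text.
n = 1.0449e8
z = 3.61
sigma = 0.5

# Informative prior chosen so a standard effect of 0.03 sits at the
# 95% limit: n0 = (1.96 * sigma / 0.03)^2, about 32.7^2.
n0 = (1.96 * sigma / 0.03) ** 2

# Equation (5.1): BF[H1 : H2].
bf_12 = sqrt((n + n0) / n0) * exp(-0.5 * (n / (n + n0)) * z ** 2)
bf_21 = 1 / bf_12
# bf_12 is about 0.46, so bf_21 is about 2.2 in favor of H2.
```

This matches the Bayes factor of 0.46 quoted above, with the inverse landing close to the quoted 2.19 (small differences come from rounding the \(Z\)-score).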

Now, this can be inverted to provide the evidence in favor of \(H_2\). The evidence suggests that the hypothesis that the machine operates with a probability that is not 0.5 is 2.19 times more likely than the hypothesis that the probability is 0.5. Based on the interpretation of Bayes factors from Table 3.5, this is in the range of “not worth more than a bare mention”.

To recap, we present expressions for calculating Bayes factors for a normal model with a specified variance. We show that the improper reference priors for \(\mu\) when \(n_0 = 0\) , or vague priors where \(n_0\) is arbitrarily small, lead to Bayes factors that favor the null hypothesis regardless of the data, and thus should not be used for hypothesis testing.

Bayes factors with normal priors can be sensitive to the choice of \(n_0\). While the default value of \(n_0 = 1\) is reasonable in many cases, it may be too non-informative if one expects small effects. Wherever possible, think about how large an effect you expect and use that information to help select \(n_0\).

All the ESP examples suggest weak evidence and favored the machine generating random 0’s and 1’s with a probability that is different from 0.5. Note that ESP is not the only explanation – a deviation from 0.5 can also occur if the random number generator is biased. Bias in the stream of random numbers in our pseudorandom numbers has huge implications for numerous fields that depend on simulation. If the context had been about detecting a small bias in random numbers what prior would you use and how would it change the outcome? You can experiment it in R or other software packages that generate random Bernoulli trials.

Next, we will look at Bayes factors in normal models with unknown variances using the Cauchy prior so that results are less sensitive to the choice of \(n_0\) .

5.2 Comparing Two Paired Means using Bayes Factors

We previously learned that we can use a paired t-test to compare means from two paired samples. In this section, we will show how Bayes factors can be expressed as a function of the t-statistic for comparing the means, and provide posterior probabilities of the hypotheses that the means are equal or different.

Example 5.1 Trace metals in drinking water affect the flavor, and unusually high concentrations can pose a health hazard. Ten pairs of data were taken measuring the zinc concentration in bottom and surface water at ten randomly sampled locations, as listed in Table 5.1 .

Water samples collected at the same location, on the surface and the bottom, cannot be assumed to be independent of each other. However, it may be reasonable to assume that the differences in the concentration at the bottom and the surface in randomly sampled locations are independent of each other.

Table 5.1: Zinc in drinking water
location bottom surface difference
1 0.430 0.415 0.015
2 0.266 0.238 0.028
3 0.567 0.390 0.177
4 0.531 0.410 0.121
5 0.707 0.605 0.102
6 0.716 0.609 0.107
7 0.651 0.632 0.019
8 0.589 0.523 0.066
9 0.469 0.411 0.058
10 0.723 0.612 0.111

To start modeling, we will treat the ten differences as a random sample from a normal population, where the parameter of interest is the difference between the average zinc concentration at the bottom and the average zinc concentration at the surface, or the mean difference, \(\mu\) .

In mathematical terms, we have

  • Random sample of \(n= 10\) differences \(Y_1, \ldots, Y_n\)
  • Normal population with mean \(\mu \equiv \mu_B - \mu_S\)

In this case, we have no information about the variability in the data, and we will treat the variance, \(\sigma^2\) , as unknown.

The hypothesis that the mean concentration at the surface and at the bottom are the same is equivalent to saying \(\mu = 0\) . The second hypothesis is that the difference between the mean bottom and surface concentrations is nonzero, or equivalently that the mean difference \(\mu \neq 0\) .

In other words, we are going to compare the following hypotheses:

  • \(H_1: \mu_B = \mu_S \Leftrightarrow \mu = 0\)
  • \(H_2: \mu_B \neq \mu_S \Leftrightarrow \mu \neq 0\)

The Bayes factor is the ratio between the distributions of the data under each hypothesis, which does not depend on any unknown parameters.

\[\textit{BF}[H_1 : H_2] = \frac{p(\text{data}\mid H_1)} {p(\text{data}\mid H_2)}\]

To obtain the Bayes factor, we need to use integration over the prior distributions under each hypothesis to obtain those distributions of the data.

\[\textit{BF}[H_1 : H_2] = \frac{\int p(\text{data}\mid \mu = 0, \sigma^2)\, p(\sigma^2 \mid H_1)\, d\sigma^2} {\iint p(\text{data}\mid \mu, \sigma^2)\, p(\mu \mid \sigma^2, H_2)\, p(\sigma^2 \mid H_2)\, d \mu \, d\sigma^2}\]

This requires specifying the following priors:

  • \(\mu \mid \sigma^2, H_2 \sim \textsf{Normal}(0, \sigma^2/n_0)\)
  • \(p(\sigma^2) \propto 1/\sigma^2\) for both \(H_1\) and \(H_2\)

\(\mu\) is exactly zero under the hypothesis \(H_1\) . For \(\mu\) in \(H_2\) , we start with the same conjugate normal prior as we used in Section 5.1 – testing the normal mean with known variance. Since \(\sigma^2\) is now unknown, we place the prior on \(\mu\) conditionally on \(\sigma^2\) , modeling \(\mu \mid \sigma^2\) instead of \(\mu\) itself.

The \(\sigma^2\) appears in both the numerator and denominator of the Bayes factor. For the default or reference case, we use the Jeffreys prior (a.k.a. reference prior) on \(\sigma^2\) . As long as we have more than two observations, this (improper) prior will lead to a proper posterior.

After integration and rearranging, one can derive a simple expression for the Bayes factor:

\[\textit{BF}[H_1 : H_2] = \left(\frac{n + n_0}{n_0} \right)^{1/2} \left( \frac{ t^2 \frac{n_0}{n + n_0} + \nu } { t^2 + \nu} \right)^{\frac{\nu + 1}{2}}\]

This is a function of the t-statistic

\[t = \frac{|\bar{Y}|}{s/\sqrt{n}},\]

where \(s\) is the sample standard deviation and the degrees of freedom \(\nu = n-1\) (sample size minus one).
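As a rough illustration (a pure-Python sketch; the chapter's own analysis below uses the JZS Cauchy prior, so its reported Bayes factor differs), we can compute the t-statistic for the Table 5.1 differences and plug it into the closed-form Bayes factor above with the default \(n_0 = 1\):

```python
import math

# Differences (bottom - surface) from Table 5.1
d = [0.015, 0.028, 0.177, 0.121, 0.102, 0.107, 0.019, 0.066, 0.058, 0.111]

n = len(d)                        # 10 paired differences
ybar = sum(d) / n                 # sample mean difference
s = math.sqrt(sum((x - ybar) ** 2 for x in d) / (n - 1))  # sample sd
t = abs(ybar) / (s / math.sqrt(n))                        # t-statistic
nu = n - 1                        # degrees of freedom

def bf_h1_h2(t, n, n0=1.0):
    """Closed-form BF[H1:H2] under the conjugate normal prior on mu."""
    nu = n - 1
    return math.sqrt((n + n0) / n0) * (
        (t * t * n0 / (n + n0) + nu) / (t * t + nu)
    ) ** ((nu + 1) / 2)

print(round(t, 3))       # about 4.86
print(bf_h1_h2(t, n))    # well below 1, so the data favor H2
```

Under this normal prior the Bayes factor comes out far below 1, i.e., the evidence points toward \(H_2\), consistent with the direction of the JZS result reported later in the section.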

As we saw in the case of Bayes factors with known variance, we cannot use an improper prior on \(\mu\) : when \(n_0 \to 0\) , \(\textit{BF}[H_1:H_2] \to \infty\) , favoring \(H_1\) regardless of the magnitude of the t-statistic. Arbitrarily vague, small choices of \(n_0\) also lead to arbitrarily large Bayes factors in favor of \(H_1\) . This is another example of Bartlett’s or the Jeffreys-Lindley paradox.

Sir Harold Jeffreys discovered another paradox when testing with the conjugate normal prior, known as the information paradox . His thought experiment held the sample size \(n\) and the prior sample size \(n_0\) fixed. He then considered what would happen to the Bayes factor as the sample mean moved further and further away from the hypothesized mean, measured in terms of standard errors with the t-statistic, i.e., \(|t| \to \infty\) . As the t-statistic, or the information about the mean, moved further and further from zero, the Bayes factor went to a constant depending on \(n, n_0\) rather than providing overwhelming support for \(H_2\) .

The bounded Bayes factor is

\[\textit{BF}[H_1 : H_2] \to \left( \frac{n_0}{n_0 + n} \right)^{\frac{n - 1}{2}}\]
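We can check this bound numerically with the closed-form expression from above (a sketch using the same notation, with \(n = 10\) and the default \(n_0 = 1\)): even an enormous t-statistic cannot push \(\textit{BF}[H_1:H_2]\) below the constant.

```python
import math

def bf_h1_h2(t, n, n0=1.0):
    # Closed-form BF[H1:H2] under the conjugate normal prior (as above)
    nu = n - 1
    return math.sqrt((n + n0) / n0) * (
        (t * t * n0 / (n + n0) + nu) / (t * t + nu)
    ) ** ((nu + 1) / 2)

n, n0 = 10, 1.0
limit = (n0 / (n0 + n)) ** ((n - 1) / 2)   # the bounded Bayes factor

# BF[H1:H2] flattens out at the bound instead of going to zero --
# this is the information paradox.
for t in (10.0, 100.0, 1e4):
    print(t, bf_h1_h2(t, n, n0))
print(limit)
```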

Jeffreys wanted a prior with \(\textit{BF}[H_1 : H_2] \to 0\) (or equivalently, \(\textit{BF}[H_2 : H_1] \to \infty\) ) as the information from the t-statistic grows, so that a sample mean far from the hypothesized mean would favor \(H_2\) .

To resolve the paradox in which the information in the t-statistic favors \(H_2\) but the Bayes factor does not, Jeffreys showed that no normal prior could work.

But a Cauchy prior on \(\mu\) would resolve it: \(\textit{BF}[H_2 : H_1]\) then goes to infinity as the sample mean moves further from the hypothesized mean. Recall that the Cauchy prior is written as \(\textsf{C}(0, r^2 \sigma^2)\) . While Jeffreys used a default of \(r = 1\) , smaller values of \(r\) can be used if smaller effects are expected.

The combination of the Jeffreys prior on \(\sigma^2\) and this Cauchy prior on \(\mu\) under \(H_2\) is sometimes referred to as the Jeffreys-Zellner-Siow prior .

However, there is no closed-form expression for the Bayes factor under the Cauchy prior. To obtain the Bayes factor, we must use numerical integration or simulation methods.

We will use the function from the package to test whether the mean difference is zero in Example 5.1 (zinc), using the JZS (Jeffreys-Zellner-Siow) prior.

(Output: hypothesis test for mean of a normal distribution)

With equal prior probabilities on the two hypotheses, the Bayes factor equals the posterior odds. From the output, we see that the hypothesis \(H_2\) , that the mean difference is different from 0, is almost 51 times more likely than the hypothesis \(H_1\) that the average concentration is the same at the surface and the bottom.

To sum up, we have used the Cauchy prior as a default prior for testing hypotheses about a normal mean when the variance is unknown. This does require numerical integration, but it is available in the function from the package. If you expect the effect sizes to be small, smaller values of \(r\) are recommended.

It is often important to quantify the magnitude of the difference in addition to testing. The Cauchy Prior provides a default prior for both testing and inference; it avoids problems that arise with choosing a value of \(n_0\) (prior sample size) in both cases. In the next section, we will illustrate using the Cauchy prior for comparing two means from independent normal samples.

5.3 Comparing Independent Means: Hypothesis Testing

In the previous section, we described Bayes factors for testing whether the mean difference of paired samples was zero. In this section, we will consider a slightly different problem – we have two independent samples, and we would like to test the hypothesis that the means are different or equal.

Example 5.2 We illustrate the testing of independent groups with data from a 2004 survey of birth records from North Carolina, which are available in the package.

The variable of interest is the weight gain of mothers during pregnancy. We have two groups defined by a categorical variable with two levels, younger mom and older mom.

Question of interest : Do the data provide convincing evidence of a difference between the average weight gain of older moms and the average weight gain of younger moms?

We will view the data as a random sample from two populations, older and younger moms. The two groups are modeled as:

\[\begin{equation} \begin{aligned} Y_{O,i} & \mathrel{\mathop{\sim}\limits^{\rm iid}} \textsf{N}(\mu + \alpha/2, \sigma^2) \\ Y_{Y,i} & \mathrel{\mathop{\sim}\limits^{\rm iid}} \textsf{N}(\mu - \alpha/2, \sigma^2) \end{aligned} \tag{5.2} \end{equation}\]

The model for weight gain for older moms uses the subscript \(O\) , and it assumes that the observations are independent and identically distributed, with mean \(\mu+\alpha/2\) and variance \(\sigma^2\) .

For the younger women, the observations with the subscript \(Y\) are independent and identically distributed with a mean \(\mu-\alpha/2\) and variance \(\sigma^2\) .

Using this representation of the means in the two groups, the difference in means simplifies to \(\alpha\) – the parameter of interest.

\[(\mu + \alpha/2) - (\mu - \alpha/2) = \alpha\]

You may ask, “Why don’t we set the average weight gain of older women to \(\mu+\alpha\) , and the average weight gain of younger women to \(\mu\) ?” We need the parameter \(\alpha\) to be present in both \(Y_{O,i}\) (the group of older women) and \(Y_{Y,i}\) (the group of younger women).

We have the following competing hypotheses:

  • \(H_1: \alpha = 0 \Leftrightarrow\) The means are not different.
  • \(H_2: \alpha \neq 0 \Leftrightarrow\) The means are different.

In this representation, \(\mu\) represents the overall weight gain for all women. (Does the model in Equation (5.2) make more sense now?) To test the hypothesis, we need to specify prior distributions for \(\alpha\) under \(H_2\) (c.f. \(\alpha = 0\) under \(H_1\) ) and priors for \(\mu,\sigma^2\) under both hypotheses.

Recall that the Bayes factor is the ratio of the distribution of the data under the two hypotheses.

\[\begin{aligned} \textit{BF}[H_1 : H_2] &= \frac{p(\text{data}\mid H_1)} {p(\text{data}\mid H_2)} \\ &= \frac{\iint p(\text{data}\mid \alpha = 0,\mu, \sigma^2 )p(\mu, \sigma^2 \mid H_1) \, d\mu \,d\sigma^2} {\int \iint p(\text{data}\mid \alpha, \mu, \sigma^2) p(\alpha \mid \sigma^2) p(\mu, \sigma^2 \mid H_2) \, d \mu \, d\sigma^2 \, d \alpha} \end{aligned}\]

As before, we need to average over the uncertainty in the parameters to obtain the unconditional distribution of the data. And, as in the test about a single mean, we cannot use improper or non-informative priors on \(\alpha\) for testing.

Under \(H_2\) , we use the Cauchy prior for \(\alpha\) , or equivalently, the Cauchy prior on the standardized effect \(\delta\) with scale \(r\) :

\[\delta = \alpha/\sigma \sim \textsf{C}(0, r^2)\]

Now, under both \(H_1\) and \(H_2\) , we use the Jeffreys reference prior on \(\mu\) and \(\sigma^2\) :

\[p(\mu, \sigma^2) \propto 1/\sigma^2\]

While this is an improper prior on \(\mu\) , it does not suffer from the Bartlett-Lindley-Jeffreys paradox, because \(\mu\) is a common parameter in the models under \(H_1\) and \(H_2\) . This is another example of the Jeffreys-Zellner-Siow prior.

As in the single-mean case, we need numerical algorithms to obtain the Bayes factor. The following output illustrates the test, using the Bayes inference function from the package.

(Output: hypothesis test for mean of a normal distribution)

We see that the Bayes factor for \(H_1\) to \(H_2\) is about 5.7, with positive support for \(H_1\) that there is no difference in average weight gain between younger and older women. Using equal prior probabilities, the probability that there is a difference in average weight gain between the two groups is about 0.15 given the data. Based on the interpretation of Bayes factors from Table 3.5 , this is in the range of “positive” (between 3 and 20).

To recap, we have illustrated testing hypotheses about population means with two independent samples, using a Cauchy prior on the difference in the means. One assumption that we have made is that the variances are equal in both groups . The case where the variances are unequal is referred to as the Behrens-Fisher problem, and it is beyond the scope of this course. In the next section, we will look at another example to put everything together with testing and discuss summarizing results.

5.4 Inference after Testing

In this section, we will work through another example for comparing two means using both hypothesis tests and interval estimates, with an informative prior. We will also illustrate how to adjust the credible interval after testing.

Example 5.3 We will use the North Carolina survey data to examine the relationship between infant birth weight and whether the mother smoked during pregnancy. The response variable is the birth weight of the baby in pounds. A categorical variable records the status of the mother as a smoker or non-smoker.

We would like to answer two questions:

Is there a difference in average birth weight between the two groups?

If there is a difference, how large is the effect?

As before, we need to specify models for the data and priors. We treat the data as random samples from the two populations, smokers and non-smokers.

The birth weights of babies born to non-smokers, designated by the subscript \(N\) , are assumed to be independent and identically distributed from a normal distribution with mean \(\mu + \alpha/2\) , as in Section 5.3 .

\[Y_{N,i} \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu + \alpha/2, \sigma^2)\]

The birth weights of the babies born to smokers, designated by the subscript \(S\) , are also assumed to have a normal distribution, but with mean \(\mu - \alpha/2\) .

\[Y_{S,i} \mathrel{\mathop{\sim}\limits^{\rm iid}}\textsf{Normal}(\mu - \alpha/2, \sigma^2)\]

The difference in the average birth weights is the parameter \(\alpha\) , because

\[(\mu + \alpha/2) - (\mu - \alpha/2) = \alpha.\]

The hypotheses that we will test are \(H_1: \alpha = 0\) versus \(H_2: \alpha \ne 0\) .

We will still use the Jeffreys-Zellner-Siow Cauchy prior. However, since we may expect the standardized effect size to not be as strong, we will use a scale of \(r = 0.5\) rather than 1.

Therefore, under \(H_2\) , we have \[\delta = \alpha/\sigma \sim \textsf{C}(0, r^2), \text{ with } r = 0.5.\]

Under both \(H_1\) and \(H_2\) , we will use the reference priors on \(\mu\) and \(\sigma^2\) :

\[\begin{aligned} p(\mu) &\propto 1 \\ p(\sigma^2) &\propto 1/\sigma^2 \end{aligned}\]

The input to the Bayes inference function is similar, but now we will specify that \(r = 0.5\) .

(Output: hypothesis test for mean of a normal distribution)

We see that the Bayes factor is 1.44, which weakly favors there being a difference in average birth weights for babies whose mothers smoked versus mothers who did not smoke. Converting this to a probability, we find that there is about a 60% chance that the average birth weights are different.

While looking at evidence of there being a difference is useful, Bayes factors and posterior probabilities do not convey any information about the magnitude of the effect. Reporting a credible interval or the complete posterior distribution is more relevant for quantifying the magnitude of the effect.

Using the function, we can also generate samples from the posterior distribution under \(H_2\) .

The 2.5 and 97.5 percentiles for the difference in the means provide a 95% credible interval of 0.023 to 0.57 pounds for the difference in average birth weight. The MCMC output shows summaries not only about the difference in the means, \(\alpha\) , but also about the other parameters in the model.

In particular, the Cauchy prior arises from placing a gamma prior on \(n_0\) in the conjugate normal prior. The output therefore also provides quantiles for \(n_0\) after updating with the current data.

The row labeled effect size is the standardized effect size \(\delta\) , indicating that the effects are indeed small relative to the noise in the data.


Figure 5.4: Estimates of effect under H2

Figure 5.4 shows the posterior density for the difference in means, with the 95% credible interval indicated by the shaded area. Under \(H_2\) , there is a 95% chance that the average birth weight of babies born to non-smokers is 0.023 to 0.57 pounds higher than that of babies born to smokers.

The previous statement assumes that \(H_2\) is true and is a conditional probability statement. In mathematical terms, the statement is equivalent to

\[P(0.023 < \alpha < 0.57 \mid \text{data}, H_2) = 0.95\]

However, we still have quite a bit of uncertainty based on the current data, because given the data, the probability of \(H_2\) being true is 0.59.

\[P(H_2 \mid \text{data}) = 0.59\]

Using the law of total probability, we can compute the probability that \(\mu\) is between 0.023 and 0.57 as below:

\[\begin{aligned} & P(0.023 < \alpha < 0.57 \mid \text{data}) \\ = & P(0.023 < \alpha < 0.57 \mid \text{data}, H_1)P(H_1 \mid \text{data}) + P(0.023 < \alpha < 0.57 \mid \text{data}, H_2)P(H_2 \mid \text{data}) \\ = & I( 0 \text{ in CI }) P(H_1 \mid \text{data}) + 0.95 \times P(H_2 \mid \text{data}) \\ = & 0 \times 0.41 + 0.95 \times 0.59 = 0.5605 \end{aligned}\]

Finally, we get that the probability that \(\alpha\) is in the interval, given the data, averaging over both hypotheses, is roughly 0.56. The unconditional statement is: the average birth weight of babies born to non-smokers is 0.023 to 0.57 pounds higher than that of babies born to smokers with probability 0.56. This adjustment reflects our posterior uncertainty about how likely \(H_2\) is.

To recap, we have illustrated testing, followed by reporting credible intervals, and using a Cauchy prior distribution that assumed smaller standardized effects. After testing, it is common to report credible intervals conditional on \(H_2\) . We also have shown how to adjust the probability of the interval to reflect our posterior uncertainty about \(H_2\) . In the next chapter, we will turn to regression models to incorporate continuous explanatory variables.

Test around Mean of Normal Population

Known variance.

Consider $n$ samples $X_{1}, X_{2}, \ldots, X_{n}$ drawn from a normal distribution with unknown mean $\mu$ and known variance $\sigma^{2}$. We have the following hypothesis \begin{align} \text{null hypothesis} \quad H_{0}&: \mu = \mu_{0}\newline \text{alternate hypothesis} \quad H_{1}&: \mu \neq \mu_{0} \end{align}

The sample mean $\overline{X}$ is clearly a natural choice for the estimator of the mean. It is intuitive to define the critical region in such a manner that we reject $H_{0}$ if the estimator is far off from $\mu_{0}$ and vice versa \begin{align} C = \{ X_{1}, X_{2}, \ldots, X_{n}: \vert\overline{X} - \mu_{0}\vert > c\} \end{align} for some suitably chosen $c$. We also know that the mean of a sample from a normal population has a normal distribution. Hence for some significance level $\alpha$ ( type I error ), \begin{align} P_{\mu_{0}}(\vert\overline{X} - \mu_{0}\vert > c) = \alpha \end{align} where the subscript denotes the fact that the probability is calculated under the assumption that $\mu = \mu_{0}$. Under this assumption, $\overline{X}$ is normally distributed with mean $\mu_{0}$ and variance $\sigma^{2}/n$. \begin{align} P_{\mu_{0}}\bigg(\bigg\lvert\frac{\overline{X} - \mu_{0}}{\sigma/\sqrt{n}}\bigg\rvert > \frac{c\sqrt{n}}{\sigma}\bigg) &= \alpha\newline 2P_{\mu_{0}}\bigg(\frac{\overline{X} - \mu_{0}}{\sigma/\sqrt{n}} > \frac{c\sqrt{n}}{\sigma}\bigg) &= \alpha \end{align} but these are tabulated values \begin{align} P(Z > z_{\alpha/2}) &= \alpha/2\newline \text{or,}\quad \frac{c\sqrt{n}}{\sigma} &= z_{\alpha/2}\newline \text{or,}\quad c &= \frac{\sigma z_{\alpha/2}}{\sqrt{n}} \end{align} or simply put in terms of the hypothesis test, \begin{alignat}{4} \text{Reject}\quad &H_{0} \quad\text{if}\quad &\bigg\lvert \frac{\sqrt{n}}{\sigma}(\overline{X} - \mu_{0}) \bigg\rvert &> &z_{\alpha/2}\newline \text{Accept}\quad &H_{0} \quad\text{if}\quad &\bigg\lvert \frac{\sqrt{n}}{\sigma}(\overline{X} - \mu_{0}) \bigg\rvert &\leq &z_{\alpha/2}\newline \end{alignat} where $\alpha$ is the type I error and should ideally be low.

Acceptance Region for Hypothesis $\mu=\mu_{0}$

The p-value denotes the probability of observing a test statistic at least as extreme as the one observed, under the assumption that the null hypothesis is true. Mathematically, for the two-sided test above, \begin{align} \text{p-value} = P\bigg(\vert Z \vert > \frac{\sqrt{n}}{\sigma}\vert\overline{X} - \mu_{0}\vert\bigg) \end{align} where $Z$ is a standard normal random variable and the right-hand side of the inequality is the observed test statistic. For the above formula, we have assumed the test statistic to be derived from a normal distribution; however, the definition is applicable to any distribution.

A very small p-value means that either the null hypothesis is incorrect, or an event as unlikely as the observed test statistic has occurred. We can compare the p-value against predefined significance levels to accept or reject the null hypothesis.
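Continuing with the same hypothetical numbers, the two-sided p-value is twice the upper-tail probability beyond the observed $\vert z \vert$:

```python
import math
from statistics import NormalDist

def p_value_two_sided(xbar, mu0, sigma, n):
    # Probability, under H0, of a test statistic at least as extreme
    # (in absolute value) as the one observed.
    z = math.sqrt(n) * abs(xbar - mu0) / sigma
    return 2 * (1 - NormalDist().cdf(z))

p = p_value_two_sided(xbar=0.55, mu0=0.5, sigma=0.1, n=25)
print(p)   # about 0.0124: reject H0 at alpha = 0.05, but not at alpha = 0.01
```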

*p-value* and significance levels on a standard normal distribution

Furthermore, if we have a predefined value of the significance level, any p-value lower than this level suggests that the mean is very likely different, calling for rejecting $H_{0}$. This is visually represented in the figure above (note that the figure compares different significance levels for a fixed p-value ).

Thus, a small p-value denotes strong evidence in favor of observing an effect. As an analogy, consider the hypothesis test concerning the weights of a linear regression model. The usual $H_{0}$ is that a coefficient equals $0$. Thus, a small p-value strongly rejects the null hypothesis, suggesting that the corresponding regressor has a non-zero impact on the target variable.

Power of a test and Type II error

We have not yet invoked the type II error . Define $\beta(\mu)$ as the probability of accepting $H_{0}$ when the true mean is $\mu$ \begin{align} \beta(\mu) &= P_{\mu}(\text{accepting $H_{0}$ when the mean is $\mu$})\newline &= P\bigg(\bigg\lvert\frac{\sqrt{n}}{\sigma}(\overline{X} - \mu_{0})\bigg\rvert \leq z_{\alpha/2}\bigg)\newline &= P\bigg(-z_{\alpha/2} \leq \frac{\sqrt{n}}{\sigma}(\overline{X} - \mu_{0}) \leq z_{\alpha/2}\bigg) \end{align} We also define the effect size here: \begin{align} \text{effect size} &= \text{true value} - \text{hypothesized value}\newline &= \mu - \mu_{0} \end{align} But, under the premise that $\mu$ is the correct mean (and $H_{0}$ is false), \begin{align} \frac{\overline{X} - \mu}{\sigma/\sqrt{n}} \sim \mathcal{N}(0, 1) \end{align} Thus, subtracting $\frac{\sqrt{n}}{\sigma}(\mu - \mu_{0})$ throughout, \begin{align} \beta(\mu) &= P\bigg(-z_{\alpha/2} - \frac{\sqrt{n}}{\sigma}(\mu - \mu_{0}) \leq \frac{\sqrt{n}}{\sigma}(\overline{X} - \mu) \leq z_{\alpha/2} - \frac{\sqrt{n}}{\sigma}(\mu - \mu_{0})\bigg)\newline &= \Phi\bigg(z_{\alpha/2} - \frac{\sqrt{n}}{\sigma}(\mu - \mu_{0})\bigg) - \Phi\bigg(-z_{\alpha/2} - \frac{\sqrt{n}}{\sigma}(\mu - \mu_{0})\bigg)\newline &= \Phi\bigg(\frac{\sqrt{n}}{\sigma}(\mu_{0} - \mu) + z_{\alpha/2}\bigg) - \Phi\bigg(\frac{\sqrt{n}}{\sigma}(\mu_{0} - \mu) - z_{\alpha/2}\bigg)\newline \end{align} where $\Phi$ is the standard normal cumulative distribution function.

$\beta(\mu)$ is called the Operating Characteristic. Its value depends only on the gap between $\mu_{0}$ and $\mu$. For a fixed $\alpha$, as the gap grows, both arguments of $\Phi$ move away from the centre of the standard normal, so the difference between the two terms of $\beta(\mu)$ keeps decreasing. It is maximum when $\mu = \mu_{0}$, where it equals $1 - \alpha$.
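A sketch of the operating characteristic using the final expression above (the values $\mu_{0} = 0.5$, $\sigma = 0.1$, $n = 25$ are hypothetical):

```python
import math
from statistics import NormalDist

def beta(mu, mu0, sigma, n, alpha=0.05):
    """Operating characteristic: P(accept H0) when the true mean is mu."""
    Phi = NormalDist().cdf
    z_half = NormalDist().inv_cdf(1 - alpha / 2)
    shift = math.sqrt(n) * (mu0 - mu) / sigma
    return Phi(shift + z_half) - Phi(shift - z_half)

# beta is largest at mu = mu0, where it equals 1 - alpha,
# and shrinks as mu moves away from mu0.
print(beta(0.50, 0.5, 0.1, 25))   # 0.95
print(beta(0.55, 0.5, 0.1, 25))   # ~0.29, i.e. power ~0.71 at this alternative
```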

Curve of $\beta(\mu)$ for a fixed $\alpha$

The function $1 - \beta(\mu)$ is called the power function and is the probability of rejecting $H_{0}$ when it is false. This function is useful for choosing the sample size so that the probability of accepting $H_{0}: \mu = \mu_{0}$ when the true mean is $\mu_{1}$ equals a target $\beta$. We solve the equation $\beta(\mu_{1}) = \beta$ for $n$ numerically, because an analytical solution is not possible.

\begin{align} n \approx \frac{(z_{\alpha/2} + z_{\beta})^{2} \sigma^{2}}{(\mu_{1} - \mu_{0})^{2}} \end{align} is the approximate solution, obtained by neglecting the probability in the far tail.
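A sketch of this sample-size calculation (the planning values below are hypothetical; the result is rounded up to a whole number of samples):

```python
import math
from statistics import NormalDist

def sample_size(mu0, mu1, sigma, alpha=0.05, beta_target=0.2):
    # Approximate n so that a level-alpha two-sided test of mu = mu0
    # has type II error beta_target when the true mean is mu1.
    z_a = NormalDist().inv_cdf(1 - alpha / 2)    # z_{alpha/2}
    z_b = NormalDist().inv_cdf(1 - beta_target)  # z_{beta}
    return math.ceil((z_a + z_b) ** 2 * sigma ** 2 / (mu1 - mu0) ** 2)

# Detect a shift of 0.05 when sigma = 0.1, at alpha = 0.05 with 80% power.
print(sample_size(mu0=0.5, mu1=0.55, sigma=0.1))   # 32
```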

The power of a test is dependent on

Sample size $n$: Other things being constant, the power of a test is higher the larger the sample size.

Significance level $\alpha$: The smaller the value of $\alpha$, the larger the acceptance region, making it more likely to accept $H_{0}: \mu = \mu_{0}$ when the true mean is different; that is, smaller $\alpha$ lowers the power.

Effect size: The greater the effect size (difference between the hypothesized value and the true value), the higher the power of the test.

Neyman-Pearson Lemma

Consider a test with two competing simple hypotheses $H_{0}: \theta = \theta_{0}$ and $H_{1}: \theta = \theta_{1}$, where the probability density (or mass) function is given by $f(\mathbf{x} \vert \theta_{i})$ for $i \in \{0, 1\}$. Denoting the critical region (rejection region) by $C$, the Neyman-Pearson Lemma states that the Most Powerful (MP) test satisfies the below for some $\eta \geq 0$

$\mathbf{x} \in C \; \text{if} \; f(\mathbf{x}\vert\theta_{1}) > \eta f(\mathbf{x}\vert\theta_{0})$

$\mathbf{x} \in C^{c} \; \text{if} \; f(\mathbf{x}\vert\theta_{1}) < \eta f(\mathbf{x}\vert\theta_{0})$

$P_{\theta_{0}}(\mathbf{X} \in C) = \alpha$ for some prefixed significance level $\alpha$

In practice, tests are constructed from the likelihood ratio $f(\mathbf{x}\vert\theta_{1})/f(\mathbf{x}\vert\theta_{0})$, rejecting $H_{0}$ when the ratio exceeds $\eta$; this also determines the relation between the effect size and the likelihood ratio.
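A small numerical sketch (illustrative values) of why, for a normal sample with known $\sigma$ and $\theta_{1} > \theta_{0}$, the Neyman-Pearson likelihood ratio reduces to a threshold on the sample mean: the log likelihood ratio depends on the data only through $\overline{X}$ and is increasing in it.

```python
def log_lr(xbar, n, theta0, theta1, sigma):
    """Log likelihood ratio log f(x|theta1) - log f(x|theta0) for n iid
    N(theta, sigma^2) observations; depends on the data only via xbar."""
    return (n / sigma ** 2) * (
        (theta1 - theta0) * xbar - (theta1 ** 2 - theta0 ** 2) / 2
    )

# With theta1 > theta0 the log LR grows with xbar, so "LR > eta" is
# equivalent to "xbar > c" -- the one-sided test of the next section.
vals = [log_lr(x / 10, n=25, theta0=0.0, theta1=0.5, sigma=1.0)
        for x in range(-5, 6)]
print(vals[0], vals[-1])                           # -9.375 3.125
assert all(a < b for a, b in zip(vals, vals[1:]))  # strictly increasing
```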

One Sided Test

In a one-sided test, we test the equality of the mean with a fixed value against a one-sided inequality of the mean being larger than, or smaller than, that fixed value. \begin{align} \text{null hypothesis}\quad H_{0}&: \mu = \mu_{0}\newline \text{alternate hypothesis}\quad H_{1}&: \mu > \mu_{0} \end{align}

Note that the variance of the distribution is known in this case. Clearly, the critical region (rejection region) is one where large values of $\overline{X}$ are unlikely under $H_{0}$ \begin{align} C = \{X_{1}, \ldots, X_{n}: \overline{X} - \mu_{0} > c\} \end{align} for some constant $c$ chosen based on the significance level $\alpha$. Equivalently, \begin{align} P\bigg(\frac{\overline{X} - \mu_{0}}{\sigma/\sqrt{n}} > z_{\alpha}\bigg) &= \alpha\newline \text{or,}\quad \overline{X} &> z_{\alpha}\frac{\sigma}{\sqrt{n}} + \mu_{0} \end{align}

is the rejection region based on the sample mean.

\begin{alignat}{4} \text{Reject}\quad &H_{0} \quad\text{if}\quad &\frac{\sqrt{n}}{\sigma}(\overline{X} - \mu_{0}) &> &z_{\alpha}\newline \text{Accept}\quad &H_{0} \quad\text{if}\quad &\frac{\sqrt{n}}{\sigma}(\overline{X} - \mu_{0}) &\leq &z_{\alpha}\newline \end{alignat}

The p-value is similarly calculated as the probability that a standard normal is at least as large as the observed test statistic. As in the two-sided test, the operating characteristic curve can be defined \begin{align} \beta(\mu) &= P(\text{Accepting}\quad H_{0})\newline &= P(\overline{X} \leq z_{\alpha}\frac{\sigma}{\sqrt{n}} + \mu_{0})\newline &= P(\frac{\overline{X} - \mu}{\sigma/\sqrt{n}} \leq z_{\alpha} + \frac{\mu_{0} - \mu}{\sigma/\sqrt{n}})\newline &= P(Z \leq z_{\alpha} + \frac{\mu_{0} - \mu}{\sigma/\sqrt{n}}) \end{align} where $Z$ is a standard normal random variable.

Special Note

The tests discussed above have been derived under the assumption that the sample mean has a normal distribution. But, by the central limit theorem, the sample mean from almost any population tends towards a normal distribution as the sample grows. Hence, the hypothesis tests remain approximately valid for large samples, provided the population variance $\sigma^{2}$ is known.

Unknown Variance

We proceed in a manner similar to the known-variance case but use the sample standard deviation instead. Recall \begin{align} \sqrt{n} \frac{\overline{X} - \mu_{0}}{S} \sim T_{n-1} \end{align} which is a t-distributed random variable with $n-1$ degrees of freedom. Since the t-distribution also has tabulated critical values $t_{\alpha, n-1}$, analogous to $z_{\alpha}$, we can simply use the following two-sided test at significance level $\alpha$ \begin{alignat}{4} \text{Reject}\quad &H_{0} \quad\text{if}\quad &\bigg \lvert\frac{\sqrt{n}}{S}(\overline{X} - \mu_{0}) \bigg\rvert &> &t_{\alpha/2, n-1}\newline \text{Accept}\quad &H_{0} \quad\text{if}\quad &\bigg \lvert\frac{\sqrt{n}}{S}(\overline{X} - \mu_{0}) \bigg\rvert &\leq &t_{\alpha/2, n-1}\newline \end{alignat}
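As a sketch, we can apply this two-sided t-test to the ten zinc differences from Table 5.1 earlier in the document. Python's standard library has no t-distribution quantile, so the tabulated value $t_{0.025, 9} \approx 2.262$ is hard-coded here as an assumption from standard t-tables:

```python
import math
from statistics import mean, stdev

# Differences (bottom - surface) from Table 5.1
d = [0.015, 0.028, 0.177, 0.121, 0.102, 0.107, 0.019, 0.066, 0.058, 0.111]
n = len(d)

t_stat = math.sqrt(n) * (mean(d) - 0.0) / stdev(d)   # test H0: mu = 0
t_crit = 2.262   # tabulated t_{alpha/2, n-1} for alpha = 0.05, n - 1 = 9

print(round(t_stat, 2), abs(t_stat) > t_crit)   # 4.86 True: reject H0
```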

Further, the p-value is defined using the same statistic $\sqrt{n}(\overline{X} - \mu_{0})/S$: it is the probability that a t-distributed random variable with $n-1$ degrees of freedom is at least as extreme as the observed statistic. We reject $H_{0}: \mu = \mu_{0}$ at any significance level larger than the p-value , and we accept the null hypothesis when the significance level is smaller than the p-value .

Similar to the known variance case, we have the one sided tests defined as below \begin{align} H_{0}: \mu \leq \mu_{0} \quad \text{versus} \quad H_{1}: \mu > \mu_{0} \end{align} \begin{alignat}{4} \text{Reject}\quad &H_{0} \quad\text{if}\quad &\frac{\sqrt{n}}{S}(\overline{X} - \mu_{0}) &> &t_{\alpha, n-1}\newline \text{Accept}\quad &H_{0} \quad\text{if}\quad &\frac{\sqrt{n}}{S}(\overline{X} - \mu_{0}) &\leq &t_{\alpha, n-1}\newline \end{alignat} and the other side \begin{align} H_{0}: \mu \geq \mu_{0} \quad \text{versus} \quad H_{1}: \mu < \mu_{0} \end{align} \begin{alignat}{4} \text{Reject}\quad &H_{0} \quad\text{if}\quad &\frac{\sqrt{n}}{S}(\overline{X} - \mu_{0}) &< &-t_{\alpha, n-1}\newline \text{Accept}\quad &H_{0} \quad\text{if}\quad &\frac{\sqrt{n}}{S}(\overline{X} - \mu_{0}) &\geq &-t_{\alpha, n-1}\newline \end{alignat} and we can calculate the p-value as well in the above cases using the test statistic.

9.3 Probability Distribution Needed for Hypothesis Testing

Earlier in the course, we discussed sampling distributions. Particular distributions are associated with various types of hypothesis testing.

The following table summarizes various hypothesis tests and corresponding probability distributions that will be used to conduct the test (based on the assumptions shown below):

  • Hypothesis test for the mean, when the population standard deviation is known: the population parameter is the mean μ, the point estimate is the sample mean, and the normal distribution is used.
  • Hypothesis test for the mean, when the population standard deviation is unknown and the distribution of the sample mean is approximately normal: the population parameter is the mean μ, the point estimate is the sample mean, and the Student’s t-distribution is used.
  • Hypothesis test for proportions: the population parameter is the proportion p, the point estimate is the sample proportion, and the normal distribution is used.

Assumptions

When you perform a hypothesis test of a single population mean μ using a normal distribution (often called a z-test), you take a simple random sample from the population. The population you are testing must be normally distributed, or your sample size must be sufficiently large. You must also know the value of the population standard deviation, which, in reality, is rarely known.

When you perform a hypothesis test of a single population mean μ using a Student's t-distribution (often called a t -test), there are fundamental assumptions that need to be met in order for the test to work properly. Your data should be a simple random sample that comes from a population that is approximately normally distributed. You use the sample standard deviation to approximate the population standard deviation. (Note that if the sample size is sufficiently large, a t -test will work even if the population is not approximately normally distributed).

When you perform a hypothesis test of a single population proportion p , you take a simple random sample from the population. You must meet the conditions for a binomial distribution : there are a certain number n of independent trials, the outcomes of any trial are success or failure, and each trial has the same probability of success p . The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities np and nq must both be greater than five ($np > 5$ and $nq > 5$). Then the binomial distribution of a sample (estimated) proportion can be approximated by the normal distribution with $\mu = p$ and $\sigma = \sqrt{\frac{pq}{n}}$. Remember that $q = 1 - p$.
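A sketch of the proportion test under these conditions (the counts are hypothetical: 60 successes in n = 100 trials, testing p = 0.5):

```python
import math
from statistics import NormalDist

def prop_z_test(x, n, p0):
    """Two-sided z-test for a proportion; requires np0 > 5 and n(1-p0) > 5."""
    assert n * p0 > 5 and n * (1 - p0) > 5, "normal approximation not justified"
    phat = x / n
    z = (phat - p0) / math.sqrt(p0 * (1 - p0) / n)   # standard error sqrt(pq/n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = prop_z_test(60, 100, 0.5)
print(z, p)   # z = 2.0, p about 0.0455: reject H0 at the 5% level
```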

Hypothesis Test for the Mean

Going back to the standardizing formula we can derive the test statistic for testing hypotheses concerning means.

The standardizing formula cannot be applied as it stands because we do not know μ, the population mean. However, if we substitute the hypothesized value of the mean, μ₀, into the formula, we can compute a Z value. This is the test statistic for a test of hypothesis for a mean and is presented in Figure 9.3 . We interpret this Z value as a measure of how plausible it is that a sample with sample mean $\overline{X}$ could have come from a distribution with population mean μ₀, and we call this Z value $Z_c$ for “calculated”. Figure 9.3 and Figure 9.4 show this process.

In Figure 9.3 two of the three possible outcomes are presented. X̄1 and X̄3 are in the tails of the hypothesized distribution of H0. Notice that the horizontal axis in the top panel is labeled X̄. This is the theoretical sampling distribution of X̄ that the Central Limit Theorem tells us is normally distributed, which is why we can draw it with this shape. The horizontal axis of the bottom panel is labeled Z and is the standard normal distribution. Zα/2 and -Zα/2, called the critical values, are marked on the bottom panel as the Z values associated with the probability the analyst has set as the level of significance of the test, α. The probabilities in the tails of both panels are therefore the same.

Notice that for each X̄ there is an associated Zc, called the calculated Z, that comes from solving the equation above. This calculated Z is nothing more than the number of standard deviations that the hypothesized mean is from the sample mean. If the sample mean falls "too many" standard deviations from the hypothesized mean, we conclude that the sample mean could not have come from the distribution with the hypothesized mean, given our pre-set required level of significance. It could have come from H0, but it is deemed just too unlikely. In Figure 9.3 both X̄1 and X̄3 are in the tails of the distribution. They are deemed "too far" from the hypothesized value of the mean given the chosen level of alpha. If in fact this sample mean did come from H0, but from its tail, we have made a Type I error: we have rejected a good null. Our only real comfort is that we know the probability of making such an error, α, and we can control its size.

Figure 9.4 shows the third possibility for the location of the sample mean, X̄. Here the sample mean is within the two critical values, that is, within the central (1-α) probability, and we cannot reject the null hypothesis.

This gives us the decision rule for testing a hypothesis for a two-tailed test:

Decision rule: two-tail test
If |Zc| < Zα/2: do not reject H0
If |Zc| ≥ Zα/2: reject H0

This rule will always be the same no matter what hypothesis we are testing or what formulas we are using to make the test. The only change will be to change the Z c to the appropriate symbol for the test statistic for the parameter being tested. Stating the decision rule another way: if the sample mean is unlikely to have come from the distribution with the hypothesized mean we cannot accept the null hypothesis. Here we define "unlikely" as having a probability less than alpha of occurring.
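As an illustration (not part of the original text), the two-tail decision rule can be sketched in Python with scipy; all sample numbers here are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def two_tail_z_test(x_bar, mu_0, sigma, n, alpha=0.05):
    """Return (Z_c, Z_crit, reject) for a two-tailed z-test of H0: mu = mu_0."""
    z_c = (x_bar - mu_0) / (sigma / sqrt(n))   # calculated test statistic
    z_crit = norm.ppf(1 - alpha / 2)           # critical value Z_{alpha/2}
    return z_c, z_crit, abs(z_c) > z_crit      # reject H0 when |Z_c| > Z_{alpha/2}

# Hypothetical sample: x-bar = 103, mu_0 = 100, sigma = 10, n = 36
z_c, z_crit, reject = two_tail_z_test(103, 100, 10, 36)
```

With these numbers Zc = 1.8, which lies inside the critical values of about ±1.96, so the null is not rejected.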

P-Value Approach

An alternative decision rule can be developed by calculating the probability that a sample mean could be found that would give a test statistic larger than the test statistic found from the current sample data assuming that the null hypothesis is true. Here the notion of "likely" and "unlikely" is defined by the probability of drawing a sample with a mean from a population with the hypothesized mean that is either larger or smaller than that found in the sample data. Simply stated, the p-value approach compares the desired significance level, α, to the p-value which is the probability of drawing a sample mean further from the hypothesized value than the actual sample mean. A large p -value calculated from the data indicates that we should not reject the null hypothesis . The smaller the p -value, the more unlikely the outcome, and the stronger the evidence is against the null hypothesis. We would reject the null hypothesis if the evidence is strongly against it. The relationship between the decision rule of comparing the calculated test statistics, Z c , and the Critical Value, Z α , and using the p -value can be seen in Figure 9.5 .

The calculated value of the test statistic is Zc in this example, and it is marked on the bottom graph of the standard normal distribution because it is a Z value. In this case the calculated value is in the tail and thus we cannot accept the null hypothesis; the associated X̄ is just too unusually large to believe that it came from the distribution with a mean of µ0 at a significance level of α.

If we use the p-value decision rule we need one more step. We need to find in the standard normal table the probability associated with the calculated test statistic, Zc. We then compare that to the α associated with our selected level of confidence. In Figure 9.5 we see that the p-value is less than α, and therefore we cannot accept the null. We know that the p-value is less than α because the tail area beyond Zc is smaller than α/2. It is important to note that two researchers drawing randomly from the same population may find two different p-values from their samples. This occurs because the p-value is calculated as the probability in the tail beyond the sample mean, assuming that the null hypothesis is correct. Because the sample means will in all likelihood be different, this will create two different p-values. Nevertheless, the two researchers should reach the same conclusion about the null hypothesis, differing only with a probability related to α.

Here is a systematic way to make a decision of whether you cannot accept or cannot reject a null hypothesis if using the p -value and a preset or preconceived α (the " significance level "). A preset α is the probability of a Type I error (rejecting the null hypothesis when the null hypothesis is true). It may or may not be given to you at the beginning of the problem. In any case, the value of α is the decision of the analyst. When you make a decision to reject or not reject H 0 , do as follows:

  • If α > p -value, cannot accept H 0 . The results of the sample data are significant. There is sufficient evidence to conclude that H 0 is an incorrect belief and that the alternative hypothesis , H a , may be correct.
  • If α ≤ p -value, cannot reject H 0 . The results of the sample data are not significant. There is not sufficient evidence to conclude that the alternative hypothesis, H a , may be correct. In this case the status quo stands.
  • When you "cannot reject H 0 ", it does not mean that you should believe that H 0 is true. It simply means that the sample data have failed to provide sufficient evidence to cast serious doubt about the truthfulness of H 0 . Remember that the null is the status quo and it takes high probability to overthrow the status quo. This bias in favor of the null hypothesis is what gives rise to the statement "tyranny of the status quo" when discussing hypothesis testing and the scientific method.

Both decision rules will result in the same decision and it is a matter of preference which one is used.
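As a sketch (not from the original text), the p-value decision rule can be coded directly; the sample numbers are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def two_tail_p_value(x_bar, mu_0, sigma, n):
    """Two-tailed p-value for a z-test of H0: mu = mu_0."""
    z_c = (x_bar - mu_0) / (sigma / sqrt(n))
    return 2 * norm.sf(abs(z_c))    # total area in both tails beyond |Z_c|

# Hypothetical sample: x-bar = 103, mu_0 = 100, sigma = 10, n = 36 (Z_c = 1.8)
p = two_tail_p_value(103, 100, 10, 36)
reject = p < 0.05                   # compare the p-value with alpha
```

Here the p-value is about 0.072, larger than α = 0.05, so the null is not rejected, matching the critical-value rule on the same numbers.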

One and Two-tailed Tests

The discussion of Figure 9.3 - Figure 9.5 was based on the null and alternative hypotheses presented in Figure 9.3. This was called a two-tailed test because the alternative hypothesis allowed that the mean could have come from a population which was either larger or smaller than the hypothesized mean in the null hypothesis. This could be seen by the statement of the alternative hypothesis as μ ≠ 100, in this example.

It may be that the analyst is concerned only with the value being "too" high, or only with it being "too" low, relative to the hypothesized value. If this is the case, it becomes a one-tailed test and all of the alpha probability is placed in just one tail, not split into α/2 as in the case of a two-tailed test. Any test of a claim will be a one-tailed test. For example, a car manufacturer claims that their Model 17B provides gas mileage of greater than 25 miles per gallon. The null and alternative hypotheses would be:

  • H 0 : µ ≤ 25
  • H a : µ > 25

The claim is placed in the alternative hypothesis because the burden of proof in hypothesis testing is carried by the alternative. Overthrowing the null, the status quo, must be accomplished with 90 or 95 percent confidence. Said another way, we want only a 5 or 10 percent probability of making a Type I error, rejecting a good null and overthrowing the status quo.

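A sketch of the one-tailed test for the gas-mileage claim above. The sample values (x̄ = 26.2, s = 3.0, n = 40) are hypothetical, not from the text; since n ≥ 30 and σ is unknown, the text's rule substitutes s and uses the normal distribution.

```python
from math import sqrt
from scipy.stats import norm

# H0: mu <= 25, Ha: mu > 25 (the manufacturer's claim); alpha = 0.05.
x_bar, s, n, mu_0, alpha = 26.2, 3.0, 40, 25.0, 0.05
z_c = (x_bar - mu_0) / (s / sqrt(n))   # calculated test statistic
z_crit = norm.ppf(1 - alpha)           # all of alpha placed in the upper tail
reject = z_c > z_crit                  # reject H0 only for large positive Z_c
```

With these numbers Zc ≈ 2.53 exceeds the one-tailed critical value of about 1.645, so H0 is rejected in favor of the claim.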

Figure 9.6 shows the two possible cases and the form of the null and alternative hypothesis that give rise to them.

where μ 0 is the hypothesized value of the population mean.

Sample size          | Test statistic
n < 30 (σ unknown)   | tc = (x̄ - μ0) / (s/√n)
n < 30 (σ known)     | Zc = (x̄ - μ0) / (σ/√n)
n ≥ 30 (σ unknown)   | Zc = (x̄ - μ0) / (s/√n)
n ≥ 30 (σ known)     | Zc = (x̄ - μ0) / (σ/√n)

Effects of Sample Size on Test Statistic

In developing the confidence intervals for the mean from a sample, we found that most often we would not have the population standard deviation, σ. If the sample size were less than 30, we could simply substitute the point estimate for σ, the sample standard deviation, s, and use the student's t -distribution to correct for this lack of information.

When testing hypotheses we are faced with this same problem and the solution is exactly the same. Namely: if the population standard deviation is unknown and the sample size is less than 30, substitute s, the point estimate for the population standard deviation, σ, in the formula for the test statistic and use the Student's t-distribution. All the formulas and figures above are unchanged except for this substitution and changing the Z distribution to the Student's t-distribution on the graph. Remember that the Student's t-distribution can only be computed knowing the proper degrees of freedom for the problem. In this case, the degrees of freedom is computed as before with confidence intervals: df = (n-1). The calculated t-value is compared to the t-value associated with the pre-set level of confidence required in the test, tα,df, found in the Student's t tables. If we do not know σ but the sample size is 30 or more, we simply substitute s for σ and use the normal distribution.

Table 9.5 summarizes these rules.
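The rules in Table 9.5 can be sketched as a small helper (a hypothetical function, not from the text; the sample values passed in are also hypothetical):

```python
from math import sqrt

def mean_test_statistic(x_bar, mu_0, sd, n, sigma_known):
    """Pick the test statistic per the text's rules: use Student's t only
    when sigma is unknown and n < 30; otherwise use the normal (Z)."""
    stat = (x_bar - mu_0) / (sd / sqrt(n))
    if not sigma_known and n < 30:
        return stat, "t"    # compare with the t table, df = n - 1
    return stat, "z"        # compare with the standard normal table

# Hypothetical small sample with sigma unknown -> t statistic
stat, which = mean_test_statistic(8.3, 8.5, 1.2, 25, sigma_known=False)
```

Note that the formula itself is the same in every cell of the table; only the reference distribution changes.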

A Systematic Approach for Testing a Hypothesis

A systematic approach to hypothesis testing consists of the following steps, in this order. This template will work for all hypotheses that you will ever test.

  • Set up the null and alternative hypotheses. This is typically the hardest part of the process. Here the question being asked is reviewed: what parameter is being tested (a mean, a proportion, differences in means, etc.), and is this a one-tailed or a two-tailed test?
  • Decide the level of significance required for this particular case and determine the critical value, which can be found in the appropriate statistical table. The levels of confidence typical for business are 80, 90, 95, 98, and 99 percent. However, the level of significance is a policy decision and should be based upon the risk of making a Type I error, rejecting a good null. Consider the consequences of making a Type I error.
  • On the basis of the hypotheses and sample size, select the appropriate test statistic and find the relevant critical value: Zα, tα, etc. Drawing the relevant probability distribution and marking the critical value is always a big help. Be sure to match the graph with the hypothesis, especially if it is a one-tailed test.
  • Take a sample and calculate the relevant statistics: sample mean, standard deviation, or proportion. Using the formula for the test statistic selected in the previous step, calculate the test statistic for this particular case.
  • Apply the decision rule. If the test statistic is in the tail: cannot accept the null; the probability that this sample mean (proportion) came from the hypothesized distribution is too small to believe that it is the real home of these sample data. If the test statistic is not in the tail: cannot reject the null; the sample data are compatible with the hypothesized population parameter.
  • Reach a conclusion. It is best to articulate the conclusion two different ways. First, a formal statistical conclusion such as "With a 5% level of significance we cannot accept the null hypothesis that the population mean is equal to XX (units of measurement)". The second statement is less formal and states the action, or lack of action, required. If the formal conclusion was that above, the informal one might be, "The machine is broken and we need to shut it down and call for repairs".

All hypotheses tested will go through this same process. The only changes are the relevant formulas and those are determined by the hypothesis required to answer the original question.
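The systematic approach above can be sketched end to end in Python; all numbers here are hypothetical, chosen only to show the mechanics.

```python
from math import sqrt
from scipy.stats import norm

# Step 1: hypotheses. H0: mu = 100 vs Ha: mu != 100 (two-tailed).
mu_0 = 100.0
# Step 2: significance level and critical value.
alpha = 0.05
z_crit = norm.ppf(1 - alpha / 2)
# Step 3: sample results (hypothetical) and the test statistic.
x_bar, sigma, n = 96.0, 9.0, 49
z_c = (x_bar - mu_0) / (sigma / sqrt(n))
# Step 4: apply the decision rule.
reject = abs(z_c) > z_crit
# Step 5: state the conclusion both ways.
conclusion = ("Cannot accept H0: the mean differs from 100 at the 5% level."
              if reject else
              "Cannot reject H0: the data are compatible with a mean of 100.")
```

Here Zc ≈ -3.11 is well into the lower tail, so the formal conclusion is that we cannot accept H0.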



Access for free at https://openstax.org/books/introductory-business-statistics-2e/pages/1-introduction
  • Authors: Alexander Holmes, Barbara Illowsky, Susan Dean
  • Publisher/website: OpenStax
  • Book title: Introductory Business Statistics 2e
  • Publication date: Dec 13, 2023
  • Location: Houston, Texas
  • Book URL: https://openstax.org/books/introductory-business-statistics-2e/pages/1-introduction
  • Section URL: https://openstax.org/books/introductory-business-statistics-2e/pages/9-3-probability-distribution-needed-for-hypothesis-testing

© Jul 18, 2024 OpenStax. Textbook content produced by OpenStax is licensed under a Creative Commons Attribution License . The OpenStax name, OpenStax logo, OpenStax book covers, OpenStax CNX name, and OpenStax CNX logo are not subject to the Creative Commons license and may not be reproduced without the prior and express written consent of Rice University.


5.3 - Hypothesis Testing for One-Sample Mean

In the previous section, we learned how to perform a hypothesis test for one proportion. The concepts of hypothesis testing remain constant for any hypothesis test. In these next few sections, we will present the hypothesis test for one mean. We start with our knowledge of the sampling distribution of the sample mean.

Hypothesis Test for One-Sample Mean

Recall that under certain conditions, the sampling distribution of the sample mean, \(\bar{x} \), is approximately normal with mean, \(\mu \), standard error \(\dfrac{\sigma}{\sqrt{n}} \), and estimated standard error \(\dfrac{s}{\sqrt{n}} \).

\(H_0\colon \mu=\mu_0\)

\(H_a\colon \mu\neq\mu_0\), \(\mu>\mu_0\), or \(\mu<\mu_0\)

Conditions:

  • The distribution of the population is Normal
  • The sample size is large \( n>30 \).

Test Statistic:

If at least one of these conditions is satisfied, then...

\( t=\dfrac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}} \)

will follow a t-distribution with \(n-1 \) degrees of freedom.

Notice that when working with continuous data we use a t statistic rather than a z statistic. This is because the sample size affects the sampling distribution and must be taken into account, which we do by recognizing "degrees of freedom". We will not go into much detail about degrees of freedom in this course.

Let’s look at an example.

Example 5-1

This depends on the standard deviation of \(\bar{x} \) . 

\begin{align} t^*&=\dfrac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}}\\&=\dfrac{8.3-8.5}{\frac{1.2}{\sqrt{61}}}\\&=-1.3 \end{align} 

Thus, we are asking if \(-1.3\) is very far away from zero, since that corresponds to the case when \(\bar{x}\) is equal to \(\mu_0 \). If it is far away, then it is unlikely that the null hypothesis is true and one rejects it. Otherwise, one cannot reject the null hypothesis. 
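As a check (not part of the original lesson), the calculation can be reproduced with scipy; the values 8.3, 8.5, 1.2, and 61 come from the example, while the two-sided p-value line is an added illustration.

```python
from math import sqrt
from scipy.stats import t

# Values from the example: x-bar = 8.3, mu_0 = 8.5, s = 1.2, n = 61.
x_bar, mu_0, s, n = 8.3, 8.5, 1.2, 61
t_star = (x_bar - mu_0) / (s / sqrt(n))         # matches the hand computation, about -1.3
p_two_sided = 2 * t.sf(abs(t_star), df=n - 1)   # added illustration: area in both tails
```

The p-value here is roughly 0.2, which is why -1.3 is not considered "very far away from zero" at the usual significance levels.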

Module 9: Hypothesis Testing With One Sample

Distribution Needed for Hypothesis Testing

Learning Outcomes

  • Conduct and interpret hypothesis tests for a single population mean, population standard deviation known
  • Conduct and interpret hypothesis tests for a single population mean, population standard deviation unknown

Earlier in the course, we discussed sampling distributions. Particular distributions are associated with hypothesis testing. Perform tests of a population mean using a normal distribution or a Student's t-distribution. (Remember, use a Student's t-distribution when the population standard deviation is unknown and the distribution of the sample mean is approximately normal.) We perform tests of a population proportion using a normal distribution (usually n is large).

If you are testing a  single population mean , the distribution for the test is for means :

[latex]\displaystyle\overline{{X}}\text{~}{N}{\left(\mu_{{X}}\text{ , }\frac{{\sigma_{{X}}}}{\sqrt{{n}}}\right)}{\quad\text{or}\quad}{t}_{{{d}{f}}}[/latex]

The population parameter is [latex]\mu[/latex]. The estimated value (point estimate) for [latex]\mu[/latex] is [latex]\displaystyle\overline{{x}}[/latex], the sample mean.

If you are testing a  single population proportion , the distribution for the test is for proportions or percentages:

[latex]\displaystyle{P}^{\prime}\text{~}{N}{\left({p}\text{ , }\sqrt{{\frac{{{p}{q}}}{{n}}}}\right)}[/latex]

The population parameter is [latex]p[/latex]. The estimated value (point estimate) for [latex]p[/latex] is p′ . [latex]\displaystyle{p}\prime=\frac{{x}}{{n}}[/latex] where [latex]x[/latex] is the number of successes and [latex]n[/latex] is the sample size.

Assumptions


When you perform a  hypothesis test of a single population mean μ using a normal distribution (often called a z -test), you take a simple random sample from the population. The population you are testing is normally distributed or your sample size is sufficiently large. You know the value of the population standard deviation which, in reality, is rarely known.

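A hedged sketch of the z-test for a single population proportion described above, valid when np > 5 and nq > 5; the sample counts used here are hypothetical.

```python
from math import sqrt
from scipy.stats import norm

def proportion_z_test(x, n, p_0):
    """Z statistic and two-sided p-value for H0: p = p_0.
    Valid when n*p_0 > 5 and n*(1 - p_0) > 5."""
    p_hat = x / n                      # sample proportion p'
    se = sqrt(p_0 * (1 - p_0) / n)     # sigma = sqrt(pq/n) under H0
    z = (p_hat - p_0) / se
    return z, 2 * norm.sf(abs(z))

# Hypothetical sample: 62 successes in 100 trials, testing p = 0.5.
z, p = proportion_z_test(62, 100, 0.5)
```

Note that the standard error uses the hypothesized p0, not the sample proportion, because the test is carried out assuming H0 is true.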

Concept Review

In order for a hypothesis test’s results to be generalized to a population, certain requirements must be satisfied.

When testing for a single population mean:

  • A Student’s t -test should be used if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with an unknown standard deviation.
  • The normal test will work if the data come from a simple, random sample and the population is approximately normally distributed, or the sample size is large, with a known standard deviation.

When testing a single population proportion, use a normal test if the data come from a simple random sample, meet the requirements for a binomial distribution, and the mean number of successes and the mean number of failures satisfy the conditions np > 5 and nq > 5, where n is the sample size, p is the probability of a success, and q is the probability of a failure.

Formula Review

If there is no given preconceived  α , then use α = 0.05.

Types of Hypothesis Tests

  • Single population mean, known population variance (or standard deviation): Normal test .
  • Single population mean, unknown population variance (or standard deviation): Student’s t -test .
  • Single population proportion: Normal test .
  • For a single population mean , we may use a normal distribution with the following mean and standard deviation. Means: [latex]\displaystyle\mu=\mu_{{\overline{{x}}}}{\quad\text{and}\quad}\sigma_{{\overline{{x}}}}=\frac{{\sigma_{{x}}}}{\sqrt{{n}}}[/latex]
  • For a single population proportion , we may use a normal distribution with the following mean and standard deviation. Proportions: [latex]\displaystyle\mu={p}{\quad\text{and}\quad}\sigma=\sqrt{{\frac{{{p}{q}}}{{n}}}}[/latex].
  • Distribution Needed for Hypothesis Testing. Provided by : OpenStax. Located at : . License : CC BY: Attribution
  • Introductory Statistics . Authored by : Barbara Illowski, Susan Dean. Provided by : Open Stax. Located at : http://cnx.org/contents/[email protected] . License : CC BY: Attribution . License Terms : Download for free at http://cnx.org/contents/[email protected]


How to test for differences between two group means when the data is not normally distributed?

I'll eliminate all the biological details and experiments and quote just the problem at hand and what I have done statistically. I would like to know if its right, and if not, how to proceed. If the data (or my explanation) isn't clear enough, I'll try to explain better by editing.

Suppose I have two groups/observations, X and Y, with size $N_x=215$ and $N_y=40$. I would like to know if the means of these two observations are equal. My first question is:

If the assumptions are satisfied, is it relevant to use a parametric two-sample t-test here? I ask because, from my understanding, it's usually applied when the sample size is small.

I plotted histograms of both X and Y and they were not normally distributed, which violates one of the assumptions of a two-sample t-test. My confusion is that I consider them to be two populations, and that's why I checked for normality. But then I am about to perform a two-SAMPLE t-test... Is this right?

From the central limit theorem, I understand that if you sample (with or without replacement, depending on your population size) multiple times and compute the average of the samples each time, those averages will be approximately normally distributed, and their mean will be a good estimate of the population mean. So I did this on both X and Y, 1000 times each, and assigned a random variable to the mean of each sample. The plots were very much normally distributed. The means of X and Y were 4.2 and 15.8 (the same as the population values ± 0.15) and the variances were 0.95 and 12.11. I performed a t-test on these two sets of observations (1000 data points each) with unequal variances, because the variances are very different (0.95 and 12.11), and the null hypothesis was rejected. Does this make sense at all? Is this a correct/meaningful approach, or is a two-sample z-test sufficient, or is it totally wrong?

I also performed a non-parametric Wilcoxon test just to be sure (on original X and Y) and the null hypothesis was convincingly rejected there as well. In the event that my previous method was utterly wrong, I suppose doing a non-parametric test is good, except for statistical power maybe?

In both cases, the means were significantly different. However, I would like to know if either or both the approaches are faulty/totally wrong and if so, what is the alternative?

  • hypothesis-testing
  • normality-assumption
  • wilcoxon-mann-whitney-test
  • central-limit-theorem


2 Answers

The idea that the t-test is only for small samples is a historical holdover. Yes, it was originally developed for small samples, but there is nothing in the theory that distinguishes small from large. In the days before computers were common for doing statistics, the t-tables often only went up to around 30 degrees of freedom, and the normal was used beyond that as a close approximation of the t-distribution; this was for convenience, to keep the t-table's size reasonable. Now, with computers, we can do t-tests for any sample size (though for very large samples the difference between the results of a z-test and a t-test is very small). The main idea is to use a t-test when using the sample to estimate the standard deviations and the z-test if the population standard deviations are known (very rare).

The Central Limit Theorem lets us use normal-theory inference (t-tests in this case) even if the population is not normally distributed, as long as the sample sizes are large enough. This does mean that your test is approximate (but with your sample sizes, the approximation should be very good).

The Wilcoxon test is not a test of means (unless you know that the populations are perfectly symmetric and other unlikely assumptions hold). If the means are the main point of interest then the t-test is probably the better one to quote.

Given that your standard deviations are so different, and the shapes are non-normal and possibly different from each other, the difference in the means may not be the most interesting thing going on here. Think about the science and what you want to do with your results. Are decisions being made at the population level or the individual level? Think of this example: you are comparing 2 drugs for a given disease. On drug A, half the sample died immediately and the other half recovered in about a week; on drug B, all survived and recovered, but the time to recovery was longer than a week. In this case would you really care about which mean recovery time was shorter? Or replace the half dying in A with just taking a really long time to recover (longer than anyone in the B group). When deciding which drug I would want to take, I would want the full information, not just which was quicker on average.
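A sketch of the advice in this answer using scipy on synthetic skewed data (the data are simulated only to mimic the question's setup, with means near 4.2 and 15.8): Welch's t-test (equal_var=False) compares means without assuming equal variances, and the Mann-Whitney/Wilcoxon rank-sum test is the non-parametric companion.

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

# Synthetic skewed samples mimicking the question (N_x = 215, N_y = 40).
rng = np.random.default_rng(0)
x = rng.exponential(scale=4.2, size=215)
y = rng.exponential(scale=15.8, size=40)

t_stat, t_p = ttest_ind(x, y, equal_var=False)             # Welch's t-test on raw data
u_stat, u_p = mannwhitneyu(x, y, alternative="two-sided")  # rank-based comparison
```

Both tests are run on the original observations, with no resampling step; the CLT justification is built into the t-test itself.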


  • $\begingroup$ Thank you Greg. I assume there's nothing wrong with the procedure per-se? I understand that I might not be asking the right question, but my concern is equally about the statistical test/procedure and understanding itself given two samples. I'll check if I am asking the right question and come back with questions, if any. Maybe if I explain the biological problem, it would help with more suggestions. Thanks again. $\endgroup$ –  Arun Commented Sep 16, 2011 at 20:08

One addition to Greg's already very comprehensive answer.

If I understand you the right way, your point 3 states the following procedure:

  • Observe $n$ samples of a distribution $X$.
  • Then, draw $m$ of those $n$ values and compute their mean.
  • Repeat this 1000 times, save the corresponding means
  • Finally, compute the mean of those means and assume that the mean of $X$ equals the mean computed that way.

Now your assumption is, that for this mean the central limit theorem holds and the corresponding random variable will be normally distributed.

Maybe let's have a look at the math behind your computation to identify the error:

We will call your samples of $X$ $X_1,\ldots,X_n$; in statistical terminology, you have $X_1,\ldots, X_n\sim X$. Now we draw samples of size $m$ and compute their mean. The $k$-th of those means looks something like this:

$$ Y_k=\frac{1}{m}\sum_{i=1}^m X_{\mu^k_{i}} $$

where $\mu^k_i$ denotes the value between 1 and $n$ that has been drawn at draw $i$. Computing the mean of all those means thus results in

$$ \frac{1}{1000}\sum_{k=1}^{1000} \frac{1}{m}\sum_{i=1}^m X_{\mu^k_{i}} $$

To spare you the exact mathematical terminology just take a look at this sum. What happens is that the $X_i$ are just added multiple times to the sum. All in all, you add up $1000m$ numbers and divide them by $1000m$. In fact, you are computing a weighted mean of the $X_i$ with random weights.

Now, however, the Central Limit Theorem states that the sum of many independent random variables is approximately normal (which implies that their mean is also approximately normal).

Your sum above does not produce independent samples. You may have random weights, but that does not make your samples independent. Thus, the procedure described in point 3 is not valid.

However, as Greg already stated, using a $t$-test on your original data may be approximately correct - if you are really interested at the mean.


  • $\begingroup$ Thank you. It seems t-test already takes care of the problem using CLT (from greg's reply which I overlooked). Thanks for pointing that out and for the clear explanation of 3) which is what I actually wanted to know. I'll have to invest more time to grasp these concepts. $\endgroup$ –  Arun Commented Sep 17, 2011 at 12:37
  • 2 $\begingroup$ Keep in mind that the CLT performs differently well depending on the distribution at hand (or, even worse, the expected value or the variance of the distribution do not exist - then CLT is not even valid). If in doubt it is always a good idea to generate a distribution that looks similar to the one you observed and then simulate your test using this distribution a few hundred times. You will get a feeling on the quality of the approximation CLT supplies. $\endgroup$ –  Thilo Commented Sep 17, 2011 at 18:12


