Hypothesis Testing

Claims and Suspicions

The "claim" is called the null hypothesis and denoted H0H_0.

The "suspicion" is called alternative hypothesis and denoted H1H_1 or HAH_A.

The (random) quantity we use to measure evidence is called a test statistic. We need to know its sampling distribution when $H_0$ is true.

The $p$-value quantifies the evidence against $H_0$. A smaller $p$-value means more evidence against the claim.

How small does the $p$-value need to be?

  • if the $p$-value $\leq \alpha$, then we reject $H_0$
  • if the $p$-value $> \alpha$, then there is not enough evidence to reject $H_0$

By convention, $\alpha = 0.01$ or $\alpha = 0.05$.

Hypothesis Testing

A hypothesis is a conjecture concerning the value of a population parameter.

Requires two competing hypotheses:

  • a null hypothesis, denoted by $H_0$
  • an alternative hypothesis, denoted by $H_1$ or $H_A$

The hypothesis is tested by evaluating experimental evidence:

  • we reject $H_0$ if the evidence against $H_0$ is strong
  • we fail to reject $H_0$ if the evidence against $H_0$ is insufficient

Errors in Hypothesis Testing

Two types of errors can be committed when testing $H_0$ against $H_1$.

| | Decision: reject $H_0$ | Decision: fail to reject $H_0$ |
| --- | --- | --- |
| Reality: $H_0$ is true | Type I error | No error |
| Reality: $H_0$ is false | No error | Type II error |

If we reject $H_0$ when $H_0$ is true, we have committed a type I error.

If we fail to reject $H_0$ when $H_0$ is false, we have committed a type II error.

Probability of Committing Errors and Power

The probability of committing a type I error is usually denoted by

$\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$.

The probability of committing a type II error is

$\beta = P(\text{fail to reject } H_0 \mid H_0 \text{ is false})$.

The power of a test is the probability of correctly rejecting $H_0$:

$\text{Power} = P(\text{reject } H_0 \mid H_0 \text{ is false}) = 1 - \beta$.

Conventional values of $\alpha$, $\beta$, and power are 0.05, 0.2, and 0.8, respectively.
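
To make these quantities concrete, the sketch below computes the power of a right-sided $z$-test (developed later in these notes) against a specific alternative mean. The values of $\mu_0$, $\mu_1$, $\sigma$, and $n$ are made-up assumptions, not taken from any example here.

```python
# Sketch: power of a right-sided z-test against a specific alternative
# (all numbers below are illustrative assumptions).
import numpy as np
from scipy.stats import norm

mu0, mu1 = 50.0, 52.0        # null value and assumed true mean
sigma, n = 5.0, 25           # known SD and sample size
alpha = 0.05                 # P(type I error)

z_alpha = norm.ppf(1 - alpha)                 # critical value
shift = (mu1 - mu0) / (sigma / np.sqrt(n))    # standardized distance from mu0 to mu1
power = 1 - norm.cdf(z_alpha - shift)         # P(reject H0 | mu = mu1)
print(f"power = {power:.3f}, beta = {1 - power:.3f}")
```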

Types of Null and Alternative Hypotheses

Let $\mu$ be the population parameter of interest. The hypotheses are expressed in terms of the values of this parameter.

The null hypothesis is a simple hypothesis, that is, it is of the form:

$H_0: \mu = \mu_0$

where $\mu_0$ is some candidate value ("simple" means that it is assumed to be a single value).

The alternative hypothesis $H_1$ is a composite hypothesis, i.e. it contains more than one candidate value.

Depending on the context, hypothesis testing takes on one of the following three forms:

We test $H_0 : \mu = \mu_0$, where $\mu_0$ is a number,

  • against a two-sided alternative: $H_1 : \mu \neq \mu_0$,
  • against a left-sided alternative: $H_1 : \mu < \mu_0$, or
  • against a right-sided alternative: $H_1 : \mu > \mu_0$.

The formulation of the alternative hypothesis depends on our research hypothesis and is determined prior to the experiment or study.

Test Statistic and Critical Region

To test a statistical hypothesis, we use a test statistic. A test statistic is a function of the random sample and the population parameter of interest.

We reject $H_0$ if the value of the test statistic falls in the critical region (also called the rejection region). The critical region is a subset of the real numbers.

The critical region is obtained using the definition of the errors in hypothesis testing: we select the critical region so that $\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$ is equal to some pre-determined value, like 0.05 or 0.01.

Test for a Mean with Known Variance

Suppose $X_1, \ldots, X_n$ is a random sample from a population with mean $\mu$ and variance $\sigma^2$, and let $\overline{X} = \frac{1}{n}\sum_{i=1}^n X_i$ denote the sample mean:

  • if the population is normal, then $\overline{X}$ is exactly $N(\mu, \sigma^2/n)$;
  • if the population is not normal, then as long as $n$ is large enough, $\overline{X}$ is approximately $N(\mu, \sigma^2/n)$, according to the CLT.

In this section, we assume that the population variance $\sigma^2$ is known, and that the hypothesis concerns the unknown population mean $\mu$.
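
A quick simulation (illustrative only, not part of the notes) confirms the second bullet above: for a non-normal population, the sample mean still has mean $\mu$ and variance $\sigma^2/n$, and its distribution is close to normal for moderate $n$.

```python
# Sketch: sampling distribution of the sample mean for an exponential
# population; the parameter values are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
mu, n, reps = 2.0, 50, 10_000                      # population mean, sample size, replications
samples = rng.exponential(scale=mu, size=(reps, n))
xbars = samples.mean(axis=1)                       # one sample mean per replication

# For the exponential, sigma = mu, so Var(Xbar) should be mu**2 / n = 0.08.
print(xbars.mean())          # close to 2.0
print(xbars.var(ddof=1))     # close to 0.08
```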

Explanation: Left-Sided Alternative

Consider the unknown population mean $\mu$. Suppose that we would like to test

$H_0 : \mu = \mu_0$ against $H_1 : \mu < \mu_0$, where $\mu_0$ is some candidate value for $\mu$.

To evaluate the evidence against $H_0$, we compare $\overline{X}$ to $\mu_0$: under $H_0$,

$$Z_0 = \frac{\overline{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0,1)$$

We say that $z_0$ is the observed value of the $Z$-test statistic $Z_0$. If $z_0 < 0$, we have evidence that $\mu < \mu_0$. However, we only reject $H_0$ in favour of $H_1$ if the evidence is significant.

Critical Region: Let $\alpha$ be the level of significance. We reject $H_0$ in favour of $H_1$ only if $z_0 \leq -z_\alpha$.

The corresponding $p$-value for the test is the probability of observing evidence as extreme or more extreme than our current evidence in favour of $H_1$, assuming that $H_0$ is true (that is, simply by chance). More extreme in this case means further to the left, so the $p$-value is $P(Z \leq z_0) = \Phi(z_0)$, where $z_0$ is the observed value of the $Z$-test statistic.

Decision Rule: if the $p$-value $\leq \alpha$, then we reject $H_0$ in favour of $H_1$. If the $p$-value $> \alpha$, we fail to reject $H_0$.
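
As a small worked example (the data and the values of $\mu_0$, $\sigma$, $\alpha$ below are made up), the left-sided test amounts to computing $z_0$ and $\Phi(z_0)$:

```python
# Sketch: left-sided z-test with known sigma (hypothetical numbers).
import numpy as np
from scipy.stats import norm

mu0, sigma, alpha = 100.0, 15.0, 0.05
x = np.array([92, 97, 105, 88, 101, 95, 90, 99, 94, 96], dtype=float)

z0 = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))
p_value = norm.cdf(z0)       # P(Z <= z0) = Phi(z0) for the left-sided alternative

print(f"z0 = {z0:.3f}, p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```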

Left-Sided Test

$H_0 : \mu = \mu_0$ against $H_1 : \mu < \mu_0$

At significance level $\alpha$, if $z_0 \leq -z_\alpha$, we reject $H_0$ in favour of $H_1$.

Right-Sided Test

$H_0 : \mu = \mu_0$ against $H_1 : \mu > \mu_0$

At significance level $\alpha$, if $z_0 \geq z_\alpha$, we reject $H_0$ in favour of $H_1$.

Two-Sided Test

$H_0 : \mu = \mu_0$ against $H_1 : \mu \neq \mu_0$

At significance level $\alpha$, if $|z_0| \geq z_{\alpha/2}$, we reject $H_0$ in favour of $H_1$.

Procedure

To test $H_0 : \mu = \mu_0$, where $\mu_0$ is a constant:

  1. set $H_0 : \mu = \mu_0$

  2. select an alternative hypothesis $H_1$ (what we are trying to show using the data). Depending on the context, we choose one of these alternatives:

    • $H_1 : \mu < \mu_0$ (left-sided test)
    • $H_1 : \mu > \mu_0$ (right-sided test)
    • $H_1 : \mu \neq \mu_0$ (two-sided test)
  3. choose $\alpha = P(\text{type I error})$: typically $\alpha = 0.01$ or $0.05$.

  4. for the observed sample $x_1, \ldots, x_n$, compute the observed value of the test statistic $z_0$

  5. determine the critical region as follows:

    | Alternative Hypothesis | Critical Region |
    | --- | --- |
    | $H_1 : \mu > \mu_0$ | $z_0 > z_\alpha$ |
    | $H_1 : \mu < \mu_0$ | $z_0 < -z_\alpha$ |
    | $H_1 : \mu \neq \mu_0$ | $\lvert z_0 \rvert > z_{\alpha/2}$ |
  6. compute the associated $p$-value as follows:

    | Alternative Hypothesis | $p$-Value |
    | --- | --- |
    | $H_1 : \mu > \mu_0$ | $P(Z > z_0)$ |
    | $H_1 : \mu < \mu_0$ | $P(Z < z_0)$ |
    | $H_1 : \mu \neq \mu_0$ | $2\min\{P(Z > z_0),\, P(Z < z_0)\}$ |

Decision Rule: if the $p$-value $\leq \alpha$, then we reject $H_0$ in favour of $H_1$. If the $p$-value $> \alpha$, we fail to reject $H_0$.
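
The whole procedure fits in a few lines of code; the sketch below covers all three alternatives (the function name, its `alternative` argument, and the example data are my own illustrative choices):

```python
# Sketch: one-sample z-test with known sigma, following the procedure above.
import numpy as np
from scipy.stats import norm

def z_test(x, mu0, sigma, alternative="two-sided"):
    """Return (z0, p-value) for H0: mu = mu0 against the chosen alternative."""
    x = np.asarray(x, dtype=float)
    z0 = (x.mean() - mu0) / (sigma / np.sqrt(len(x)))
    if alternative == "greater":            # H1: mu > mu0
        p = 1 - norm.cdf(z0)
    elif alternative == "less":             # H1: mu < mu0
        p = norm.cdf(z0)
    else:                                   # H1: mu != mu0
        p = 2 * min(norm.cdf(z0), 1 - norm.cdf(z0))
    return z0, p

z0, p = z_test([4.8, 5.3, 5.1, 4.9, 5.4, 5.0], mu0=5.0, sigma=0.2)
print(f"z0 = {z0:.3f}, p-value = {p:.4f}")  # reject H0 if p <= alpha
```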

Test for a Mean with Unknown Variance

If the data is normal and $\sigma$ is unknown, we can estimate it with the sample standard deviation:

$$S = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (X_i - \overline{X})^2}$$

As we have seen for confidence intervals, the test statistic follows a Student's $t$-distribution with $n-1$ degrees of freedom:

$$T = \frac{\overline{X} - \mu}{S/\sqrt{n}} \sim t(n-1)$$

We can follow the same steps as for the test with known variance, with the modified critical regions and p-values:

| Alternative Hypothesis | Critical Region |
| --- | --- |
| $H_1 : \mu > \mu_0$ | $t_0 > t_\alpha(n-1)$ |
| $H_1 : \mu < \mu_0$ | $t_0 < -t_\alpha(n-1)$ |
| $H_1 : \mu \neq \mu_0$ | $\lvert t_0 \rvert > t_{\alpha/2}(n-1)$ |

where $t_0$ is the observed value of the test statistic, $t_\alpha(n-1)$ is the $t$-value satisfying $P(T > t_\alpha(n-1)) = \alpha$, and $T$ follows a Student's $t$-distribution with $n-1$ degrees of freedom.

| Alternative Hypothesis | $p$-Value |
| --- | --- |
| $H_1 : \mu > \mu_0$ | $P(T > t_0)$ |
| $H_1 : \mu < \mu_0$ | $P(T < t_0)$ |
| $H_1 : \mu \neq \mu_0$ | $2\min\{P(T > t_0),\, P(T < t_0)\}$ |
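
This is the one-sample $t$-test; here is a sketch with made-up data, checking the manual computation against `scipy.stats.ttest_1samp`:

```python
# Sketch: one-sample t-test with unknown sigma (made-up data).
import numpy as np
from scipy import stats

x = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9])
mu0 = 12.0

# Manual computation, mirroring the formulas above.
t0 = (x.mean() - mu0) / (x.std(ddof=1) / np.sqrt(len(x)))
p = 2 * min(stats.t.cdf(t0, df=len(x) - 1), 1 - stats.t.cdf(t0, df=len(x) - 1))

# Library equivalent (two-sided by default).
res = stats.ttest_1samp(x, popmean=mu0)
print(t0, p)
print(res.statistic, res.pvalue)   # should match the manual values
```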

Two-Sample Tests

Unpaired

Let $X_1, \ldots, X_n$ be a random sample from a normal population with unknown mean $\mu_1$ and variance $\sigma^2_1$; let $Y_1, \ldots, Y_m$ be a random sample from a normal population with unknown mean $\mu_2$ and variance $\sigma^2_2$, with the two samples independent of one another. We want to test:

$H_0 : \mu_1 = \mu_2$ against $H_1 : \mu_1 \neq \mu_2$

Let $\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $\overline{Y} = \frac{1}{m}\sum_{i=1}^{m} Y_i$. The observed values are again denoted by lower-case letters: $\overline{x}, \overline{y}$.

Case 1: $\sigma_1$ and $\sigma_2$ are Known

| Alternative Hypothesis | Critical Region |
| --- | --- |
| $H_1 : \mu_1 > \mu_2$ | $z_0 > z_\alpha$ |
| $H_1 : \mu_1 < \mu_2$ | $z_0 < -z_\alpha$ |
| $H_1 : \mu_1 \neq \mu_2$ | $\lvert z_0 \rvert > z_{\alpha/2}$ |

where $z_0 = \frac{\overline{x} - \overline{y}}{\sqrt{\sigma^2_1/n + \sigma^2_2/m}}$, $z_\alpha$ satisfies $P(Z > z_\alpha) = \alpha$, and $Z \sim N(0,1)$.

| Alternative Hypothesis | $p$-Value |
| --- | --- |
| $H_1 : \mu_1 > \mu_2$ | $P(Z > z_0)$ |
| $H_1 : \mu_1 < \mu_2$ | $P(Z < z_0)$ |
| $H_1 : \mu_1 \neq \mu_2$ | $2\min\{P(Z > z_0),\, P(Z < z_0)\}$ |
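
A minimal sketch of Case 1 (the data and the "known" standard deviations are invented for illustration):

```python
# Sketch: two-sample z-test with known population SDs (hypothetical values).
import numpy as np
from scipy.stats import norm

x = np.array([21.3, 20.8, 22.1, 21.7, 20.9])          # sample 1
y = np.array([20.1, 19.8, 20.7, 20.3, 19.9, 20.5])    # sample 2
sigma1, sigma2 = 0.8, 0.7                              # assumed known SDs

z0 = (x.mean() - y.mean()) / np.sqrt(sigma1**2 / len(x) + sigma2**2 / len(y))
p = 2 * min(norm.cdf(z0), 1 - norm.cdf(z0))            # two-sided p-value
print(f"z0 = {z0:.3f}, p-value = {p:.4f}")
```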

Case 2: $\sigma_1$ and $\sigma_2$ are Unknown (Small Samples)

| Alternative Hypothesis | Critical Region |
| --- | --- |
| $H_1 : \mu_1 > \mu_2$ | $t_0 > t_\alpha(n+m-2)$ |
| $H_1 : \mu_1 < \mu_2$ | $t_0 < -t_\alpha(n+m-2)$ |
| $H_1 : \mu_1 \neq \mu_2$ | $\lvert t_0 \rvert > t_{\alpha/2}(n+m-2)$ |

where $t_0 = \frac{\overline{x} - \overline{y}}{\sqrt{S^2_p/n + S^2_p/m}}$, $S^2_p = \frac{(n-1)S^2_1 + (m-1)S^2_2}{n+m-2}$, $t_\alpha(n+m-2)$ satisfies $P(T > t_\alpha(n+m-2)) = \alpha$, and $T \sim t(n+m-2)$.

| Alternative Hypothesis | $p$-Value |
| --- | --- |
| $H_1 : \mu_1 > \mu_2$ | $P(T > t_0)$ |
| $H_1 : \mu_1 < \mu_2$ | $P(T < t_0)$ |
| $H_1 : \mu_1 \neq \mu_2$ | $2\min\{P(T > t_0),\, P(T < t_0)\}$ |
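
Case 2 is the pooled two-sample $t$-test (the pooled variance $S^2_p$ treats the two population variances as equal). A sketch with made-up data, checked against `scipy.stats.ttest_ind` with `equal_var=True`:

```python
# Sketch: pooled two-sample t-test (Case 2), made-up data.
import numpy as np
from scipy import stats

x = np.array([5.1, 4.9, 5.6, 5.2, 4.8])
y = np.array([4.4, 4.7, 4.5, 4.9, 4.3, 4.6])
n, m = len(x), len(y)

# Manual computation, mirroring the formulas above.
sp2 = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2)
t0 = (x.mean() - y.mean()) / np.sqrt(sp2 / n + sp2 / m)
p = 2 * min(stats.t.cdf(t0, df=n + m - 2), 1 - stats.t.cdf(t0, df=n + m - 2))

# Library equivalent: the pooled test corresponds to equal_var=True.
res = stats.ttest_ind(x, y, equal_var=True)
print(t0, p)
print(res.statistic, res.pvalue)   # should match
```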

Case 3: $\sigma_1$ and $\sigma_2$ are Unknown (Large Samples)

| Alternative Hypothesis | Critical Region |
| --- | --- |
| $H_1 : \mu_1 > \mu_2$ | $z_0 > z_\alpha$ |
| $H_1 : \mu_1 < \mu_2$ | $z_0 < -z_\alpha$ |
| $H_1 : \mu_1 \neq \mu_2$ | $\lvert z_0 \rvert > z_{\alpha/2}$ |

where $z_0 = \frac{\overline{x} - \overline{y}}{\sqrt{S^2_1/n + S^2_2/m}}$, $z_\alpha$ satisfies $P(Z > z_\alpha) = \alpha$, and $Z \sim N(0,1)$.

| Alternative Hypothesis | $p$-Value |
| --- | --- |
| $H_1 : \mu_1 > \mu_2$ | $P(Z > z_0)$ |
| $H_1 : \mu_1 < \mu_2$ | $P(Z < z_0)$ |
| $H_1 : \mu_1 \neq \mu_2$ | $2\min\{P(Z > z_0),\, P(Z < z_0)\}$ |
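
In Case 3, the sample variances replace the unknown population variances and the statistic is treated as approximately standard normal; a simulated sketch (all parameters are arbitrary):

```python
# Sketch: large-sample two-sample test (Case 3) with simulated data.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
x = rng.normal(loc=10.0, scale=2.0, size=200)   # large sample 1
y = rng.normal(loc=10.4, scale=2.5, size=250)   # large sample 2

z0 = (x.mean() - y.mean()) / np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
p = 2 * min(norm.cdf(z0), 1 - norm.cdf(z0))      # two-sided p-value
print(f"z0 = {z0:.3f}, p-value = {p:.4f}")
```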