MathCS.org - Statistics

8.1 Statistical Testing

In this chapter we will introduce hypothesis testing to enable us to answer questions such as the following:

In general:

We are interested in testing a particular hypothesis and we want to decide whether it is true or not. Moreover, we want to associate a probability with our decision so that we know how certain (or uncertain) we are that our decision is correct.

We will approach this problem like a trial. Recall that in a standard trial in front of a judge or jury there are two mutually exclusive hypotheses:

The defendant is either guilty or not guilty

During the trial, evidence is collected and weighed either in favor of the defendant being guilty (the job of the DA) or in favor of the defendant being not guilty (the job of the defense lawyer). At the end of the trial the judge (or jury) decides between the two alternatives and either convicts the defendant (if guilt was proven beyond a reasonable doubt) or lets the defendant go (if there was sufficient doubt about the defendant's guilt).

Note that a defendant is "innocent until proven guilty". If the judge (or jury) decides a defendant is not guilty, that does not necessarily mean he/she is innocent. It simply means there was not enough evidence for a conviction.

In general, a statistical test involves four elements:

Please note that our final conclusion is always one of two options: we either reject the null hypothesis or we declare the test invalid. We never conclude anything else, such as accepting the null hypothesis.

Example: A new antihypertensive drug is tested. It is supposed to lower blood pressure more than other drugs. Other drugs have been found to lower the pressure by 10 mmHg on average, so we suspect (or hope) that our drug will lower blood pressure by more than 10 mmHg. To collect evidence, we select a random sample of size n = 62 (say), which was found to have a sample mean of 11.3 and a sample standard deviation of 5.1. Is the new drug better than the old drugs, i.e. does the new drug lower blood pressure more than other drugs?

Since the sample mean is 11.3, which is more than the average of 10 for other drugs, it looks like this sample supports the claim. But since we can never be 100% certain, we must compute a probability and associate it with our conclusion, if indeed we want to draw that conclusion.

In other words, we need to set up the four components of a statistical test. Here the population is the amount of decrease in blood pressure in people who have been given the new drug.

So, if we decide to reject the null hypothesis, that decision is incorrect with a probability of about 4% (or correct with a probability of about 96%). That is good enough for us, so we do decide to reject the null hypothesis. Since we reject the null hypothesis we automatically accept the alternative, and thus we think there is sufficient evidence that the new drug is better than the existing drugs at lowering blood pressure.

So, how do we compute the above numbers to arrive at this decision? Read the next section :-)
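
As a small preview, here is a minimal sketch in Python of one way the numbers in the drug example could be computed. It rests on assumptions that are not stated in this section: that the null hypothesis is a mean decrease of 10 mmHg, and that with n = 62 the sample mean is approximately normally distributed, so that a z statistic can be used. The function name z_test_summary is ours, purely for illustration. Under these assumptions the two-sided probability comes out at roughly 0.045, in line with the "about 4%" quoted above, while the one-sided probability is about half of that. Which of the two the example intends is itself an assumption on our part.

```python
import math


def z_test_summary(sample_mean, sample_sd, n, null_mean):
    """Large-sample z statistic and p-values for a test about a mean.

    Assumption (not from the text): the sample is large enough that the
    sample mean is approximately normal, so a z test is reasonable.
    """
    standard_error = sample_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error

    # P(Z > z) for a standard normal Z, via the complementary error function.
    p_greater = 0.5 * math.erfc(z / math.sqrt(2))
    # P(|Z| > |z|): evidence against the null in either direction.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))

    return {"z": z, "p_greater": p_greater, "p_two_sided": p_two_sided}


# Numbers from the drug example; null_mean = 10 is the assumed null hypothesis.
result = z_test_summary(sample_mean=11.3, sample_sd=5.1, n=62, null_mean=10.0)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The z statistic measures how many standard errors the sample mean lies above the assumed mean of 10, and the probabilities say how likely a result at least that extreme would be if the null hypothesis were true.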