MathCS.org - Statistics


8.3 Statistical Test for Population Mean (Small Sample)

In this section we will adjust our statistical test for the population mean to apply to small sample situations. Fortunately, this will be easy: once you understand one statistical test, additional tests are easy, since they all follow a similar procedure.

The only difference in performing a "small sample" statistical test for the mean, as opposed to a "large sample" test, is that we do not use the normal distribution as prescribed by the Central Limit Theorem, but instead a more conservative distribution called the T-Distribution. The Central Limit Theorem applies best when sample sizes are large, so we need to make an adjustment when computing probabilities for small sample sizes. The appropriate function in Excel is the TDIST function, defined as follows:
TDIST(T, N-1, TAILS), where
  • T is the value for which we want to compute the probability
  • N is the sample size (and N-1 is frequently called the "degrees of freedom")
  • TAILS is either 1 (for a 1-tail test) or 2 (for a 2-tail test). Since we again consider 2-tailed tests only, we always use 2
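For readers working outside Excel, the same tail probability can be computed in Python. Here is a minimal sketch of a TDIST-style helper, assuming the scipy library is available (the function name `tdist` is ours, not Excel's):

```python
from scipy import stats

def tdist(t_value, dof, tails=2):
    """Tail probability of the T-distribution, mirroring Excel's TDIST."""
    return tails * stats.t.sf(abs(t_value), dof)
```

For example, `tdist(2.093, 19, 2)` comes out very close to 0.05, matching the familiar table value t = 2.093 for 19 degrees of freedom at the 5% level.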
With that new Excel function our test procedure for a sample mean, small sample size, is as follows:

Statistical Test for the Mean (small sample size N < 30):

Fix an error level you are comfortable with (something like 10%, 5%, or 1% is most common). Denote that "comfortable error level" by the letter "A". Then set up the test as follows:
Null Hypothesis H0:
mean = M, i.e. The mean is a known number M
Alternative Hypothesis Ha:
mean ≠ M, i.e. mean is different from M (2-tailed test)
Test Statistic:
Select a random sample of size N, compute its sample mean X and the standard deviation S. Then compute the corresponding t-score as follows:
  T= (X - M) / ( S / sqrt(N) )
Rejection Region (Conclusion)

Compute p = 2*P(t > |T|) = TDIST(ABS(T), N-1, 2)

If the probability p computed in the above step is less than A (the error level you were comfortable with initially), you reject the null hypothesis H0 and accept the alternative hypothesis. Otherwise you declare your test inconclusive.
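The whole procedure above can be collected into one short Python function. This is a sketch under the assumption that scipy is available; the function name and the returned triple are our own choices, not part of the standard test:

```python
from math import sqrt
from statistics import mean, stdev
from scipy import stats

def small_sample_mean_test(sample, M, alpha):
    """Two-tailed t-test of H0: mean = M, following the steps above."""
    N = len(sample)
    X = mean(sample)                    # sample mean
    S = stdev(sample)                   # sample standard deviation
    T = (X - M) / (S / sqrt(N))         # t-score
    p = 2 * stats.t.sf(abs(T), N - 1)   # same as TDIST(ABS(T), N-1, 2)
    verdict = "reject H0" if p < alpha else "inconclusive"
    return T, p, verdict
```

If p comes out below alpha the function reports "reject H0"; otherwise the test is inconclusive, exactly as in the rejection-region step above.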


Example 1: A group of secondary education student teachers was given 2 1/2 days of training in interpersonal communication group work. The effect of such a training session on the dogmatic nature of the student teachers was measured by the difference of scores on the "Rokeach Dogmatism Test" given before and after the training session. The difference "post minus pre score" was recorded as follows:

-16, -5, 4, 19, -40, -16, -29, 15, -2, 0, 5, -23, -3, 16, -8, 9, -14, -33, -64, -33
Can we conclude from this evidence that the training session makes student teachers less dogmatic (at the 5% level of significance)?

This is of course the same example as before, where we incorrectly used the normal distribution to compute the probability in the last step. This time, we will do it correctly, which is fortunately almost identical to the previous case (except that we use TDIST instead of NORMDIST):
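If you prefer to check the computation outside Excel, scipy's built-in one-sample t-test gives the same answer (assuming scipy is available):

```python
from scipy import stats

# "post minus pre" score differences from the example above
diffs = [-16, -5, 4, 19, -40, -16, -29, 15, -2, 0, 5, -23, -3, 16, -8, 9,
         -14, -33, -64, -33]

t_stat, p_value = stats.ttest_1samp(diffs, popmean=0)  # H0: mean difference = 0
# p_value comes out near 0.034, i.e. about 3.4%
```

Since p is below 0.05, we reject the null hypothesis at the 5% level.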

Note that in the previous section we (incorrectly) computed the probability p to be 2.2%; now it is 3.4%. The difference is small, but can be significant in special situations. Thus, to be safe, use the T-distribution whenever the sample size is small (N < 30).
Example 2: Suppose GAP, the clothing store, wants to introduce their line of clothing for women to another country. But their clothing sizes are based on the assumption that the average size of a woman is 162 cm. To determine whether they can simply ship the clothes to the new country they select 5 women at random in the target country and determine their heights as follows:

149, 165, 150, 158, 153

Should they adjust their line of clothing, or can they ship it without change? Make sure to decide at the 0.05 level.

By now statistical testing is second nature (I hope :-)
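As a check on the arithmetic, here is the same test in Python (again assuming scipy is available):

```python
from scipy import stats

heights = [149, 165, 150, 158, 153]  # heights of the 5 sampled women, in cm

t_stat, p_value = stats.ttest_1samp(heights, popmean=162)  # H0: mean height = 162
# p_value exceeds 0.05, so at the 0.05 level the test is inconclusive
```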
Note that our test is inconclusive, which does not mean that we accept the null hypothesis. Thus, we do not recommend anything to GAP. Using common sense, however, we recommend that GAP conduct a new study, this time selecting a random sample of (much) larger size, something like 100 or more. Hopefully the new study will provide statistically significant evidence.

If you compare the T values in examples 1 and 2, you see that they are very similar. Yet the associated probabilities are quite different. That is because in example 1 we had a relatively large number of degrees of freedom (19), while in example 2 we had very few (4); with few degrees of freedom the T-distribution has much heavier tails, so the same t-score corresponds to a larger probability.
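A quick computation illustrates how strongly the degrees of freedom matter; the t-scores below are the (rounded) values from the two examples:

```python
from scipy import stats

# Two-tailed p-values for nearly identical t-scores
p_example1 = 2 * stats.t.sf(2.29, 19)  # example 1: N = 20, so 19 degrees of freedom
p_example2 = 2 * stats.t.sf(2.37, 4)   # example 2: N = 5, so 4 degrees of freedom
# p_example1 falls below 0.05 while p_example2 lies well above it
```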