Chapter Summary

This chapter provides an introduction to statistical analysis.

Statistical inference involves both hypothesis testing (establishing whether an observed relationship is likely to be real or likely to have occurred by chance) and point and interval estimation (estimating unknown population parameters from known sample data).

Null hypotheses have two important characteristics:

  • They are succinct and precise assertions about population parameters.
  • They are stated in such a manner that data plus statistical theory allow us to reject them with a known degree of confidence that we are not making a mistake.
     

In addition to stating a null hypothesis, researchers state another hypothesis called the research or alternative hypothesis.

Hypotheses are accepted or rejected on the basis of statistical likelihood. A type 1 error occurs when a true null hypothesis is mistakenly rejected. A type 2 error occurs when a false null hypothesis is mistakenly accepted, that is, when we fail to reject it.

By selecting an alpha level, we indicate how sure we want to be that we are not committing a type 1 error: alpha is the probability of a type 1 error that we are willing to tolerate.
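
The link between the alpha level and type 1 errors can be illustrated with a small simulation. The sketch below is not from the chapter: the function name, sample size, number of trials, and random seed are all assumptions chosen for illustration. It draws many samples from a population in which the null hypothesis (mean of 0) is true, tests each at alpha = 0.05, and reports the share of tests that wrongly reject the null. That share should land near alpha.

```python
import math
import random
from statistics import NormalDist, fmean


def type1_error_rate(alpha=0.05, n=50, trials=2000, seed=0):
    """Estimate how often a true null hypothesis (mean = 0) is rejected."""
    rng = random.Random(seed)
    # Two-tailed critical value from the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the null is true (mean 0, sd 1).
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # The population sd is known to be 1, so the standard error is 1 / sqrt(n).
        z = fmean(sample) / (1.0 / math.sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a type 1 error: the null is true but was rejected
    return rejections / trials


rate = type1_error_rate()
print(f"Share of true nulls rejected: {rate:.3f}")
```

Choosing a smaller alpha in this sketch lowers the share of mistaken rejections, at the cost of making it harder to reject a null hypothesis that is in fact false.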

A sampling distribution is a mathematical function that indicates the probability of different values of the estimator occurring.

To test for statistical significance, we compare observed test statistics to critical values from a distribution. With a small sample, one can calculate a t-value (divide the difference of the means by the standard error of the mean) and compare it with a critical value drawn from the known probabilities of Student’s t-distribution. The critical value is determined by the degrees of freedom, the confidence level, and whether you use a one- or two-tailed test. If the absolute value of the calculated t-value is greater than the critical value, you reject the null hypothesis.
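
The t-test procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the chapter's own example: the data are made up, the function name is hypothetical, and the test is a one-sample test in which the "difference of the means" is the sample mean minus the mean stated in the null hypothesis. The critical value of 2.262 is the standard t-table entry for 9 degrees of freedom, a two-tailed test, and a 95% confidence level.

```python
import math
from statistics import fmean, stdev


def t_value(sample, null_mean):
    """t = (sample mean - hypothesized mean) / standard error of the mean."""
    n = len(sample)
    # stdev() divides by n - 1, giving the sample standard deviation.
    standard_error = stdev(sample) / math.sqrt(n)
    return (fmean(sample) - null_mean) / standard_error


# Made-up sample of 10 observations; null hypothesis: population mean is 50.
sample = [52, 55, 49, 58, 53, 51, 56, 54, 50, 57]
t = t_value(sample, null_mean=50)
df = len(sample) - 1  # degrees of freedom for a one-sample t-test

# Critical value read from a t-table: df = 9, two-tailed, 95% confidence.
T_CRITICAL = 2.262

reject_null = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, reject null: {reject_null}")
```

With a different sample size, test direction, or confidence level, the critical value would have to be looked up again, since it depends on all three.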

To find a critical value on the t-table, one must know the degrees of freedom and choose a one- or two-tailed test and a confidence level.

With large samples or population data, one can test hypotheses with the standard normal (z) distribution and its associated table, using a z-score.
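
A z test follows the same pattern as the t-test, with the critical value taken from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a printed z table; the function name and all of the numbers are assumptions made up for illustration.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, null_mean, population_sd, n, alpha=0.05):
    """Two-tailed z test; returns (z-score, critical value, reject?)."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - null_mean) / standard_error
    # inv_cdf gives the z value that leaves alpha / 2 in the upper tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# Hypothetical large sample: 400 observations with a mean of 102,
# tested against a null mean of 100 with a population sd of 15.
z, z_crit, reject = z_test(sample_mean=102.0, null_mean=100.0,
                           population_sd=15.0, n=400)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject null: {reject}")
```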

The standard error of the mean is the standard deviation of the sampling distribution of the mean and is a measure of the imprecision of the estimator.
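
One way to see this is to build a sampling distribution by brute force and compare its standard deviation with the usual formula, the population standard deviation divided by the square root of the sample size. The sketch below does so under assumed values (a normal population with mean 100 and standard deviation 15, samples of 25, 5,000 repeated draws, and a fixed random seed); none of these figures come from the chapter.

```python
import math
import random
from statistics import fmean, stdev

rng = random.Random(42)
POP_MEAN, POP_SD, N, DRAWS = 100.0, 15.0, 25, 5000

# Approximate the sampling distribution of the mean by repeated sampling.
sample_means = [
    fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(DRAWS)
]

# Formula: standard error = population sd / sqrt(sample size).
theoretical_se = POP_SD / math.sqrt(N)
# Simulation: standard deviation of the simulated sample means.
empirical_se = stdev(sample_means)

print(f"formula: {theoretical_se:.3f}, simulated: {empirical_se:.3f}")
```

The two figures should be close, and both shrink as the sample size grows, which is why larger samples yield more precise estimates.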

The confidence interval refers to the range of likely values associated with a given probability or confidence level. Thus for every confidence level, a particular confidence interval exists.

Construction of a confidence interval gives the range of values of an estimated population parameter for a given level of statistical significance or confidence level.
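
To illustrate how each confidence level produces its own interval, the sketch below computes 90%, 95%, and 99% intervals for the mean of one made-up sample. It assumes a large sample, so it uses a z critical value; with a small sample, a t critical value would be used instead. The function name, the simulated data, and the seed are assumptions for illustration only.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for the population mean using a z critical value."""
    n = len(sample)
    mean = fmean(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    # Two-tailed critical value for the chosen confidence level.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * standard_error
    return mean - margin, mean + margin


# Made-up large sample of 200 observations.
rng = random.Random(7)
sample = [rng.gauss(50.0, 10.0) for _ in range(200)]

ci90 = confidence_interval(sample, 0.90)
ci95 = confidence_interval(sample, 0.95)
ci99 = confidence_interval(sample, 0.99)

for level, (low, high) in (("90%", ci90), ("95%", ci95), ("99%", ci99)):
    print(f"{level} confidence interval: {low:.2f} to {high:.2f}")
```

The intervals widen as the confidence level rises: being more certain that the interval contains the population parameter requires accepting a wider range of values.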