Chapter Summary

Chapter Objectives

13.1: Describe how to begin exploring the relationship between categorical variables.
13.2: Describe the shapes or types of bivariate relationships. 
13.3: Explain how to construct and interpret bivariate contingency tables.
13.4: Understand how measures of association measure the strength of relationships between variables.
13.5: Understand how to choose an appropriate measure of association.     
13.6: Introduce the chi-square statistic to test hypotheses involving categorical data.
13.7: Explain how to interpret contingency tables once a third (control) variable is introduced.
13.8: Describe how to analyze differences among more than two means.

  • This chapter presents methods of statistical analysis of relationships between two variables. The relationship between variables lies at the heart of empirical social science inquiry.
  • Determining whether and how two variables are related touches on several matters that are discussed further in the chapter: the level or scale of measurement of the variables, the form of the relationship, the strength of the relationship, and numerical summaries of relationships.
    • The level of measurement is important when selecting the appropriate method for investigating relationships between variables.
    • The relationship between two variables can take several forms: general association, monotonic correlation, and linear correlation.
  • Cross-tabulations are for use with nominal- and ordinal-level variables.
  • The strength of an association refers to how different the observed values of the dependent variable are in the categories of the independent variable. The direction of a relationship shows which values of the independent variable are associated with which values of the dependent variable.
  • Measures of association indicate the extent to which two variables are related. There are many measures of association; each is designed for variables at particular levels of measurement (nominal, ordinal, interval, or ratio), and usually not for all of them.
  • A correlation coefficient is also a measure of the strength of a relationship, but the term is generally reserved for use with linear relationships. 
  • When one or both of the variables in a cross-tabulation are nominal, Goodman and Kruskal’s lambda, λ, is commonly used.
  • Phi (Φ) is another measure of association that can be used instead of lambda.
  • Lambda is a proportional-reduction-in-error (PRE) coefficient. 
  • Kendall’s tau-b, Kendall’s tau-c, Somers’ D, and Goodman and Kruskal’s gamma are for use with ordinal-level variables. These statistics are computed from the numbers of concordant pairs, discordant pairs, and ties in a contingency table.
  • The chi-square (χ²) statistic can be used to determine statistical significance for categorical data.
  • Phi (Φ), which is based on chi-square, is calculated by dividing the observed chi-square statistic by n and taking the square root of the quotient.
  • In a multivariate cross-tabulation, researchers control for a third variable by grouping. They group the observations according to their values on the third variable and then observe the relationship between the original two variables (in the chapter’s example, opinions on spending and voting) within each of these groups.
  • Analysis of variance (ANOVA) is appropriate when one variable is nominal or ordinal and the other is interval or ratio. In ANOVA, two types of variation, between-group and within-group, add up to the overall variation, or total variance.
  • Eta-squared is a measure of association used with the analysis of variance. It indicates what proportion of the variance in the dependent variable is explained by the independent variable.
  • You should use an F test to determine statistical significance with ANOVA.
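
The summary points on chi-square, phi, and lambda can be made concrete with a small worked example. The sketch below is illustrative only: the function names and the counts in `table` are invented for this example and do not come from the chapter. It assumes the rows hold the categories of the dependent variable and the columns hold the categories of the independent variable.

```python
from math import sqrt

def chi_square(table):
    """Pearson chi-square for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def phi(table):
    """Phi: square root of (chi-square divided by n)."""
    n = sum(sum(row) for row in table)
    return sqrt(chi_square(table) / n)

def goodman_kruskal_lambda(table):
    """Lambda as a proportional-reduction-in-error measure.

    Rows are the dependent variable, columns the independent variable.
    Undefined (division by zero) if every case falls in one row.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    # Errors when always guessing the modal category of the dependent variable.
    errors_without = n - max(row_totals)
    # Errors when guessing the modal category within each column.
    errors_with = sum(sum(col) - max(col) for col in zip(*table))
    return (errors_without - errors_with) / errors_without

# Hypothetical 2 x 2 table of counts (made up for illustration).
table = [[30, 10],
         [20, 40]]

print(round(chi_square(table), 3))
print(round(phi(table), 3))
print(goodman_kruskal_lambda(table))
```

Whether the chi-square value is statistically significant still depends on comparing it with a critical value for the table's degrees of freedom, which this sketch leaves out.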
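
For the ordinal measures, the counting of concordant and discordant pairs can be sketched as follows. This is a minimal illustration, assuming that both the rows and the columns of the table are ordered from low to high; the function names and the counts are hypothetical. Only gamma is computed here because it ignores tied pairs; tau-b, tau-c, and Somers’ D adjust for ties in different ways that are not shown.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered contingency table."""
    concordant = 0
    discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a lower row.
            for k in range(i + 1, n_rows):
                for m in range(n_cols):
                    if m > j:
                        # Higher on both variables: concordant.
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        # Higher on one variable, lower on the other: discordant.
                        discordant += table[i][j] * table[k][m]
                    # m == j is a tie on the column variable; gamma ignores it.
    return concordant, discordant

def gamma(table):
    """Goodman and Kruskal's gamma: (C - D) / (C + D)."""
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

# Hypothetical 3 x 3 table of counts (made up for illustration).
ordinal_table = [[20, 10, 5],
                 [10, 20, 10],
                 [5, 10, 20]]

print(concordant_discordant(ordinal_table))
print(round(gamma(ordinal_table), 3))
```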
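
The ANOVA points can also be illustrated with a short calculation. The sketch below assumes a one-way design with made-up scores for three groups; the function name and the data are hypothetical. It shows the between-group and within-group sums of squares adding up to the total, eta-squared as the between-group share of the total, and the F ratio.

```python
def one_way_anova(groups):
    """Return sums of squares, eta-squared, and the F ratio for a list of groups."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    # Variation of all scores around the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    eta_squared = ss_between / ss_total
    # F: between-group mean square over within-group mean square.
    f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "eta_squared": eta_squared,
        "f_ratio": f_ratio,
    }

# Hypothetical scores for three groups (made up for illustration).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
result = one_way_anova(scores)
print(result)
```

The F ratio would then be compared with a critical value for k - 1 and n - k degrees of freedom to judge statistical significance; that lookup is not part of this sketch.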