# Political Science Research Methods

Ninth Edition

# Chapter Summary

**Chapter Objectives**

14.1: Understand the logic behind an ordinary least squares (OLS) regression.

14.2: Describe how to calculate a bivariate regression.

14.3: Explain how to interpret bivariate regression results and test hypotheses.

14.4: Describe why one would include multiple independent variables in a regression to control for other sources of variation.

14.5: Explain how to interpret multivariate regression results and test hypotheses.

14.6: Understand the logic behind a maximum likelihood analysis.

14.7: Explain how to interpret logistic regression results and test hypotheses.

- A **regression analysis** is a technique for measuring the relationship between two interval- or ratio-level variables. Regression is used to make causal assertions, rather than assertions about correlation. To make those assertions, researchers rely on **regression coefficients**, which are estimates of the unobserved population parameters.
- There are ten classical assumptions for linear regression models.
- Graphs provide the first step when conducting a regression analysis. A commonly used graph is a **scatterplot**, which shows at a glance the form and strength of a relationship.
- Establishing causal relationships is about the mean and variation from the mean.
- In a bivariate regression, researchers can plot a regression line that represents the relationship between the independent and dependent variables.
- The ordinary least squares regression formula is *Y* = *a* + *bX* and describes a line. *Y* is the dependent variable, *a* is the *y*-intercept (or constant), *b* is the slope, and *X* is the independent variable.
- If *b* is positive, the relationship is positive, and if *b* is negative, the relationship is negative.
- Regression provides the best-fit line by minimizing the squared distances from each data point to the line, that is, by minimizing the squared errors.
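The least squares formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's own example; the data values below are invented:

```python
# Bivariate OLS computed by hand on illustrative (made-up) data.
from statistics import mean

x = [1, 2, 3, 4, 5]             # independent variable X
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # dependent variable Y

x_bar, y_bar = mean(x), mean(y)

# Slope b = sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2):
# the value that minimizes the sum of squared errors.
num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
den = sum((xi - x_bar) ** 2 for xi in x)
b = num / den
a = y_bar - b * x_bar  # intercept: the line passes through the point of means

print(f"Y-hat = {a:.2f} + {b:.2f}X")  # prints "Y-hat = 0.05 + 1.99X"
```

A positive `b` here confirms the positive relationship visible in the raw data.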

- In a regression analysis, the dependent variable is a continuous ratio-level variable. Various equations can be used in a regression analysis.
- A statistic related to regression is **Pearson’s *r***, the correlation coefficient. Pearson’s *r* indicates the level of association between two variables.
- You can further use Pearson’s *r* to calculate another statistic called *R*-squared, or *R*^{2}. *R*-squared is a commonly reported statistic interpreted as the percentage of variation in *Y* that is explained by the variation in the independent variable.
- **Multiple regression analysis** extends the bivariate regression analysis presented in Chapter 13 to include additional independent variables. *Both types of regression* involve finding an equation that best fits or approximates the data and describes the relationship between the independent and dependent variables.
- A **multivariate regression coefficient** is a number that tells how much *Y* will change for a one-unit change in a particular independent variable, if all the other variables in the model are held constant.
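Pearson’s *r* and *R*-squared can likewise be computed by hand. A hedged sketch on invented data (the values are illustrative only):

```python
# Pearson's r as covariance over the product of standard deviations,
# and R-squared as its square. Data are made up for illustration.
from statistics import mean, pstdev

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / len(x)
r = cov / (pstdev(x) * pstdev(y))   # level of association between X and Y
r_squared = r ** 2                  # share of variation in Y explained by X

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}")
```

Here *R*² = 0.6, read as "60 percent of the variation in *Y* is explained by the variation in *X*."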

- A **dummy variable** has two categories, generally coded 1 for the presence of a characteristic and 0 otherwise. Recoding a nominal-level variable as a dummy variable allows the variable to be used in numerical analysis.
- One can measure an interaction to determine whether variables behave differently in the presence of a third.
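Dummy coding and an interaction term can be shown in a short sketch. The category names and data below are hypothetical, not from the text:

```python
# Recoding a nominal variable as a dummy, then forming an interaction term.
# "South" and the income values are invented for illustration.
regions = ["South", "North", "South", "North"]
income = [40, 55, 38, 60]

# Dummy: 1 if the characteristic (here, "South") is present, 0 otherwise.
south = [1 if region == "South" else 0 for region in regions]

# Interaction: the product of two variables lets the effect of income
# differ in the presence of the third variable (region).
south_x_income = [d * inc for d, inc in zip(south, income)]

print(south)           # [1, 0, 1, 0]
print(south_x_income)  # [40, 0, 38, 0]
```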
- There are different maximum likelihood models analysts can use to analyze dichotomous dependent variables; we will only discuss results from one: logistic regression, also known as logit.

- **Maximum likelihood estimation** is a class of estimators that chooses the set of parameters which provides the highest probability of observing a particular outcome.
- Maximum likelihood models work differently than regression to account for the limited range of the dependent variable.
- You cannot interpret a maximum likelihood model in the same way that you interpret regression.
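The maximum likelihood logic can be made concrete with a tiny sketch: pick the parameter value that makes the observed data most probable. The binary outcomes below are invented for illustration:

```python
# Maximum likelihood for a single probability p, by grid search over
# candidate values. Outcomes (1 = event occurred) are made-up data.
import math

outcomes = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # 7 ones out of 10

def log_likelihood(p):
    # log L(p) = sum of log(p) for the 1s and log(1 - p) for the 0s
    return sum(math.log(p) if y == 1 else math.log(1 - p) for y in outcomes)

grid = [i / 100 for i in range(1, 100)]     # candidate values of p
p_hat = max(grid, key=log_likelihood)       # value with highest likelihood

print(p_hat)  # 0.7 -- the sample proportion maximizes the likelihood
```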

- A (nonlinear) **logistic regression** is usually a better choice for a binary dependent variable.
- A logistic regression is interpreted differently than a multiple regression. The effect implied by a logistic regression coefficient changes depending on the value at which each independent variable is set (like the mean or one standard deviation above the mean).
- The difference between interpreting logit results and OLS regression results is that we cannot interpret the magnitude of the coefficient.
- Researchers cannot interpret the magnitude of the coefficients in maximum likelihood models, so political scientists turn to various tools for additional interpretation beyond the coefficients.
- Those tools rely on predicted probabilities.
- They can be reported in a table or used to generate various graphical representations.
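A short sketch shows how predicted probabilities are generated from logit coefficients; the coefficient values below are hypothetical, not estimates from any real model:

```python
# Predicted probabilities from a logit model: Pr(Y=1) = 1/(1 + exp(-(a + bX))).
# The intercept and slope are made up for illustration.
import math

intercept, slope = -4.0, 0.8  # hypothetical logit coefficients

def predicted_probability(x):
    return 1 / (1 + math.exp(-(intercept + slope * x)))

# The same one-unit change in X shifts the probability by different
# amounts depending on where X starts -- which is why logit coefficient
# magnitudes cannot be read as constant effects the way OLS slopes can.
for x in [2, 5, 8]:
    print(x, round(predicted_probability(x), 3))  # 0.083, 0.5, 0.917
```

Values like these are what get reported in tables or plotted as probability curves.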

- Regression is an important tool used by political science researchers and students.