Discovering Statistics Using IBM SPSS Statistics
Intermediate Questions
The owner of the large chain of coffee shops called ‘MoonBucks’ decided to calculate how much revenue was gained from lattes each month in a nationwide sample of 2445 cafés. To measure the variance of revenue gained from lattes, he computes SS = 351,936 for this sample.
 What are the degrees of freedom for variance?
 Compute the variance.
 Compute the standard deviation.
 144
 12
 2444
Answer: 144
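As a worked check, the three quantities asked for above can be computed from the figures in the question. This is only a minimal sketch: the variable names are illustrative, and it assumes the usual sample formulas (degrees of freedom N − 1, variance SS / (N − 1), standard deviation as the square root of the variance).

```python
import math

# Figures taken from the MoonBucks question above.
sum_of_squares = 351_936   # SS reported for the sample
n_cafes = 2445             # number of cafes in the sample

degrees_of_freedom = n_cafes - 1                  # N - 1 for a sample
variance = sum_of_squares / degrees_of_freedom    # SS / (N - 1)
standard_deviation = math.sqrt(variance)          # square root of the variance

print(degrees_of_freedom, variance, standard_deviation)
```

The listed options 2444, 144 and 12 correspond to the degrees of freedom, the variance and the standard deviation respectively.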
If we calculated an effect size and found it was r = .42 which expression would best describe the size of effect?
 Small
 Small to medium
 Medium to large
 Large
Answer: Medium to large
If we use the mean as a model, what does the variance represent?
 The average error between the model and the observed data.
 The total error between the model and the observed data.
 The squared total error between the model and the observed data.
 The square-rooted average error between the model and the observed data.
Answer: The average error between the model and the observed data
Which of the following best describes the variable ‘Gender’?
 A between-group variable.
 A coding variable.
 All of the possible answers are correct.
 A grouping variable.
Answer: All of the possible answers are correct
Which of the numbers below might IBM SPSS report as 10.574 E−05?
 0.00010574
 10.569
 1057400.0
 0000.10574
Answer: 0.00010574
Which of the numbers below might IBM SPSS report as 8.96 E+03?
 89.60
 8960.0
 0.008960
 8.960
Answer: 8960.0
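The two E-notation questions above can be checked by letting Python parse the values. This is a rough sketch: `parse_e_notation` is a hypothetical helper written for this example, not an SPSS function, and it assumes the notation means "multiply by 10 to the stated power".

```python
def parse_e_notation(text: str) -> float:
    """Convert a string such as '8.96 E+03' to an ordinary number."""
    # Remove the space before the exponent and normalise a typographic
    # minus sign so that Python's float() accepts the string.
    cleaned = text.replace(" ", "").replace("\u2212", "-")
    return float(cleaned)

small = parse_e_notation("10.574 E\u221205")  # move the decimal point 5 places left
large = parse_e_notation("8.96 E+03")         # move the decimal point 3 places right

print(small, large)
```

A negative exponent shrinks the number and a positive exponent enlarges it, which is why the answers are 0.00010574 and 8960.0.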
Which of the following does a box–whisker plot not display?
 The mean
 The median
 Outliers
 The lower quartile
Answer: The mean
What type of graph can we use to compare frequency distributions of several groups simultaneously?
 A histogram
 A bar chart
 A population pyramid
 A boxplot.
Answer: A population pyramid
Out of the following options, which type of bar chart would we produce to look at the mean ratings of two new varieties of Sauvignon Blanc (wine)?
 Clustered bar chart
 Stacked bar chart
 Simple 3D bar chart
 Simple bar chart
Answer: Simple bar chart
Out of the following options, which type of bar chart would we produce to look at the mean ratings of ‘taste’ and ‘value for money’ for two new varieties of Sauvignon Blanc wine?
 Clustered bar chart
 All of the options are possible.
 Stacked bar chart
 3D bar chart
Answer: All of the options are possible
Out of the following options, which type of graph could we use to compare frequency distributions of several groups simultaneously?
 Population pyramid
 Simple histogram
 Frequency polygon
 Simple 3D bar chart
Answer: Population pyramid
Which of the following are assumptions underlying the use of parametric tests (based on the normal distribution)?
 All of the options are true.
 Some feature of the data should be normally distributed.
 The samples being tested should have approximately equal variances.
 The data should be at least interval level.
Answer: All of the options are true
Approximately what percentage of people would have scores lower than an individual with a z-score of 1.65 in a normally distributed sample?
 95%
 98%
 It is not possible to calculate this unless the mean and standard deviation are given.
 1%
Answer: 95%
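The proportion below a z-score can be read from a table of the standard normal distribution, or computed. A minimal sketch using Python's standard library, assuming a standard normal distribution (mean 0, standard deviation 1):

```python
from statistics import NormalDist

z_score = 1.65

# Cumulative probability of the standard normal distribution up to z.
proportion_below = NormalDist().cdf(z_score)

print(f"{proportion_below:.1%} of scores fall below z = {z_score}")
```

No sample mean or standard deviation is needed, because a z-score is already expressed in standard deviation units.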
Assuming the assumptions of parametric tests are met, nonparametric tests, compared to their parametric counterparts:
 Are all of these.
 Are more conservative.
 Are less likely to accept the alternative hypothesis.
 Have less statistical power.
Answer: Are all of these
How much variance has been explained by a correlation of .9?
 18%
 9%
 81%
 None of these
Answer: 81%
A correlation of .7 was found between time spent studying and percentage on an exam. What is the proportion of variance in exam scores that can be explained by time spent studying?
 .70
 .49
 .30
 .7
Answer: .49
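Both correlation questions above rest on the same step: squaring r gives the proportion of variance the two variables share. A short sketch, with `variance_explained` as an illustrative helper name:

```python
def variance_explained(r: float) -> float:
    """Return r squared, the proportion of variance shared by two variables."""
    return r ** 2

for r in (0.9, 0.7):
    print(f"r = {r}: R squared = {variance_explained(r):.2f}")
```

Multiplying the result by 100 expresses it as a percentage, so r = .9 corresponds to 81% and r = .7 to a proportion of .49.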
Which of the following statistical tests allows causal inferences to be made?
 Analysis of variance
 Regression
 None of these, it’s the design of the research that determines whether causal inferences can be made.
 t-test
Answer: None of these, it’s the design of the research that determines whether causal inferences can be made
Which of the following statements about outliers is not true?
 Outliers are values very different from the rest of the data.
 Influential cases will always show up as outliers.
 Outliers have an effect on the mean.
 Outliers have an effect on regression parameters.
Answer: Influential cases will always show up as outliers
For which regression assumption does the Durbin–Watson statistic test?
 Linearity
 Homoscedasticity
 Multicollinearity
 Independence of errors
Answer: Independence of errors
A researcher was interested in stress levels of lecturers during lectures. She took the same group of 8 lecturers and measured their anxiety (out of 15) during a normal lecture and again in a lecture in which she had paid students to be disruptive and misbehave. What test is best used to compare the mean level of anxiety in the two lectures?
 Independent samples t-test
 Paired-samples t-test
 One-way independent ANOVA
 Mann–Whitney test
Answer: Paired-samples t-test
What does the error bar on an error bar chart represent?
 The confidence interval around the mean.
 The standard error of the mean.
 The standard deviation of the mean.
 It can represent any of these.
Answer: It can represent any of these
Differences between group means can be characterized as a regression (linear) model if:
 The experimental groups are represented by a binary variable (i.e. coded 0 and 1).
 The outcome variable is categorical.
 The groups have equal sample sizes.
 Differences between group means cannot be characterized as a linear model, they must be analysed with an independent t-test.
Answer: The experimental groups are represented by a binary variable (i.e. coded 0 and 1)
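To see why dummy coding works, the sketch below fits a one-predictor least-squares line to two groups coded 0 and 1. The scores are invented purely for illustration, and the variable names are not from the source.

```python
from statistics import mean

# Hypothetical data: 0 = first group, 1 = second group.
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [3.0, 4.0, 5.0, 4.0, 6.0, 7.0, 8.0, 7.0]

# Ordinary least squares with one predictor:
# slope = sum of cross-products / sum of squares of the predictor.
x_bar, y_bar = mean(group), mean(score)
cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(group, score))
predictor_ss = sum((x - x_bar) ** 2 for x in group)
b1 = cross_products / predictor_ss
b0 = y_bar - b1 * x_bar

# Group means, for comparison with the regression coefficients.
mean_0 = mean(y for x, y in zip(group, score) if x == 0)
mean_1 = mean(y for x, y in zip(group, score) if x == 1)

print(f"b0 = {b0}, mean of group 0 = {mean_0}")
print(f"b1 = {b1}, difference between means = {mean_1 - mean_0}")
```

With 0/1 coding the intercept equals the mean of the group coded 0 and the slope equals the difference between the two group means, so comparing two means is a special case of the linear model.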
A researcher measured people’s physiological reactions to horror films. He split the data into two groups: males and females. The resulting data were normally distributed and men and women had equal variances. What test should be used to analyse the data?
 Dependent
 Independent t-test
 Mann–Whitney test
 Wilcoxon signed-rank test
Answer: Independent t-test
The combined effect of two variables on another is known conceptually as _______, and in statistical terms as _________
 Mediation, an interaction effect
 Moderation, a direct effect
 Moderation, an interaction effect
 Mediation, a direct effect
Answer: Moderation, an interaction effect
Imagine we wanted to look at the relationship between the number of hours spent practising the guitar per week and skill level. If we had reason to believe that the strength or direction of the relationship between these variables will be affected by level of enjoyment, what type of analysis should we conduct on these data?
 Moderation analysis
 Mediation analysis
 Two-way repeated-measures ANOVA
 ANCOVA
Answer: Moderation analysis
Grand mean centring for a given variable is achieved by:
 Taking the mean of all scores (ignoring from which variable they come) and subtracting each score from it.
 Taking each score and subtracting from it the mean of all scores (for that variable).
 Taking each score and dividing it by the mean of all scores (for that variable).
 Taking each score, subtracting the mean and then dividing by the standard deviation.
Answer: Taking each score and subtracting from it the mean of all scores (for that variable)
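Grand mean centring as described in the answer can be sketched in a few lines. The scores below are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

scores = [2.0, 4.0, 6.0, 8.0]   # hypothetical scores on one variable

grand_mean = mean(scores)                      # mean of all scores for that variable
centred = [s - grand_mean for s in scores]     # each score minus the grand mean

print(grand_mean, centred)
```

The centred scores keep their original spread but now have a mean of zero. Dividing by the standard deviation as well would give z-scores, which is standardizing rather than centring.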
Imagine we wanted to look at whether enjoyment of guitar playing influences the relationship between time spent practising and skill level. How would we know if we had a significant moderation effect?
 If the interaction of time spent practising and skill level was a significant predictor of enjoyment.
 If the relationship between time spent practising and skill level does not change when enjoyment is entered into the model.
 If the relationship between time spent practising and skill level disappears when enjoyment is entered into the model.
 If the interaction of time spent practising and enjoyment was a significant predictor of skill level.
Answer: If the interaction of time spent practising and enjoyment was a significant predictor of skill level.
A researcher measured people’s physiological reactions while watching a horror film and compared them to when watching a comedy film, and a documentary about wildlife. Different people viewed each type of film. The resulting data were normally distributed and the variances across groups were similar. What test should be used to analyse the data?
 Repeated-measures analysis of variance
 Kruskal–Wallis test
 Friedman’s ANOVA
 Independent analysis of variance
Answer: Independent analysis of variance
Levene's test tests whether:
 Data are normally distributed.
 The variances in different groups are equal.
 The assumption of sphericity has been met.
 Group means differ.
Answer: The variances in different groups are equal
What assumption does ANCOVA have that ANOVA does not?
 Homogeneity of variance
 Homoscedasticity
 Homogeneity of sample size
 Homogeneity of regression slopes
Answer: Homogeneity of regression slopes
Which of the following are affected by including a covariate in an analysis of variance?
 All of these.
 The error mean square
 The between-subjects mean square
 The F-ratio.
Answer: All of these
What would the levels of the independent variables be for a two-way ANOVA investigating the effect of four different treatments for depression and gender?
 4 and 1
 2
 4 and 2
 6
Answer: 4 and 2
How many independent variables were used and how were they measured in a three-way independent ANOVA?
 Three independent variables all measured using the same entities
 Three independent variables all measured using different entities
 One independent variable (with three levels) measured using the same entities
 One independent variable (with three levels) measured using different entities
Answer: Three independent variables all measured using different entities
A nutritionist conducted an experiment on memory for dreams. She wanted to test whether it really was true that eating cheese before going to bed made you have bad dreams. Over three nights, the nutritionist fed people different foods before bed. On one night they had nothing to eat, a second night they had a big plate of cheese, and the third night they had another dairy product, milk, before bed. All people were given all foods at some point over the three nights. The nutritionist measured heart rate during dreams as an index of distress. How should these data be analysed?
 One-way independent ANOVA
 One-way repeated-measures ANOVA
 Three-way repeated-measures ANOVA
 Three-way independent ANOVA
Answer: One-way repeated-measures ANOVA
In repeated-measures ANOVA, the assumption of independence is:
 Always met
 Unimportant
 Tested using Levene’s test
 Always violated
Answer: Always violated
A researcher tested 40 children aged 6 years. Each child engaged in a task where they had to use two dolls (one representing themselves and one representing a teacher) and they had to enact a time when their teacher had been angry with them. All children were videotaped and 20 children were told that their teacher would see the tape and 20 were not. What experimental design has been used?
 A repeated-measures design
 A matched design
 A mixed design
 A between-subjects design
Answer: A between-subjects design
A researcher tested 40 adults. Each adult had to rate their mood after listening to a tape of people being sick, and then again after a tape of people laughing. What experimental design has been used?
 A matched design
 A repeated-measures design
 A mixed design
 A between-subjects design
Answer: A repeated-measures design
The power of MANOVA to detect an effect depends on:
 A combination of the correlation between dependent variables and the effect size to be detected.
 A combination of the correlation between independent variables and the effect size to be detected.
 A combination of the correlation between independent and dependent variables.
 None of these
Answer: A combination of the correlation between dependent variables and the effect size to be detected
What would you use Box’s test for?
 To test for multivariate normality.
 To test for independence of residuals
 To test for homogeneity of variance
 To test the assumption of homogeneity of covariance matrices.
Answer: To test the assumption of homogeneity of covariance matrices
If your MANOVA is statistically significant:
 You could conduct separate Bonferroni-corrected ANOVAs on each dependent variable.
 There is no added value in performing discriminant function analysis.
 You could conclude that all groups differ significantly.
 None of these are correct.
Answer: You could conduct separate Bonferroni-corrected ANOVAs on each dependent variable
A square matrix in which the diagonal elements are equal to 1 and the off-diagonal elements are equal to 0 is known as:
 A variance–covariance matrix
 A column vector
 An identity matrix
 The error sum of squares and cross-products matrix (or error SSCP)
Answer: An identity matrix
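The definition in the question can be turned directly into code. A minimal sketch, with `identity_matrix` as an illustrative helper that stores the matrix as a list of rows:

```python
def identity_matrix(size: int) -> list[list[int]]:
    """Build a size-by-size matrix with 1 on the diagonal and 0 elsewhere."""
    return [[1 if row == col else 0 for col in range(size)] for row in range(size)]

for row in identity_matrix(3):
    print(row)
```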
Varimax rotation should be used when:
 You believe that the underlying factors will be correlated.
 You believe that the underlying factors are non-orthogonal.
 You believe that the underlying factors are independent.
 Kaiser’s criterion is met.
Answer: You believe that the underlying factors are independent
Kaiser's criterion for retaining factors is:
 Retain any factor with an eigenvalue greater than 1.
 Retain any factor with an eigenvalue greater than 0.3.
 Retain factors before the point of inflexion on a scree plot.
 Retain factors with communalities greater than 0.7.
Answer: Retain any factor with an eigenvalue greater than 1
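Applying Kaiser's criterion amounts to a simple filter on the eigenvalues. The eigenvalues below are hypothetical, invented only to illustrate the rule; in practice they would come from the factor analysis output.

```python
# Hypothetical eigenvalues for five extracted factors.
eigenvalues = [3.2, 1.4, 0.9, 0.3, 0.2]

# Kaiser's criterion: retain factors whose eigenvalue is greater than 1.
retained = [
    (factor_number, value)
    for factor_number, value in enumerate(eigenvalues, start=1)
    if value > 1
]

print(retained)
```

In this made-up example the first two factors would be retained and the remaining three dropped.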
What is a latent variable?
 It is a variable that cannot be measured directly.
 It is another name for a factor.
 Latent variables represent clusters of variables that correlate highly with each other.
 All of these are correct.
Answer: All of these are correct
On which of the following does the critical value for a chi-square statistic rely?
 The degrees of freedom
 The sum of the frequencies
 The row totals
 The number of variables
Answer: The degrees of freedom
Imagine you conducted a study to look at the association between whether expectant mothers in two different age groups (18–30 and 31–43 years) eat breakfast (or not) and the gender of the firstborn child. Which of the following options would be the most appropriate method of analysing these data?
 Chi-square test
 Three-way repeated-measures ANOVA
 Loglinear analysis
 Two-way mixed analysis of covariance
Answer: Loglinear analysis
The odds ratio is:
 The ratio of the probability of an event not happening to the probability of the event happening.
 The probability of an event occurring.
 The ratio of the odds after a unit change in the predictor to the original odds.
 The ratio of the probability of an event happening to the probability of the event not happening.
Answer: The ratio of the odds after a unit change in the predictor to the original odds
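The difference between odds and the odds ratio is easier to see with numbers. In this sketch the two probabilities are hypothetical, standing in for the probability of the event before and after a one-unit change in the predictor.

```python
def odds(probability: float) -> float:
    """Odds = probability of the event / probability of no event."""
    return probability / (1 - probability)

# Hypothetical probabilities, chosen only for illustration.
p_before = 0.2   # before the predictor changes by one unit
p_after = 0.5    # after the predictor changes by one unit

odds_ratio = odds(p_after) / odds(p_before)

print(odds(p_before), odds(p_after), odds_ratio)
```

The last answer option describes the odds themselves, whereas the odds ratio compares the odds at two values of the predictor.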
Larger values of the log-likelihood statistic indicate:
 That there are a greater number of explained vs. unexplained observations.
 That the statistical model fits the data well.
 That as the predictor variable increases, the likelihood of the outcome occurring decreases.
 That the statistical model is a poor fit of the data.
Answer: That the statistical model is a poor fit of the data
Logistic regression assumes a:
 Linear relationship between continuous predictor variables and the outcome variable.
 Linear relationship between continuous predictor variables and the logit of the outcome variable.
 Linear relationship between continuous predictor variables.
 Linear relationship between observations
Answer: Linear relationship between continuous predictor variables and the logit of the outcome variable
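The logit referred to in the answer is the natural logarithm of the odds of the outcome. A brief sketch, with `logit` as an illustrative helper name:

```python
import math

def logit(probability: float) -> float:
    """Natural log of the odds: ln(p / (1 - p))."""
    return math.log(probability / (1 - probability))

for p in (0.2, 0.5, 0.8):
    print(f"p = {p}: logit = {logit(p):.3f}")
```

The logit is zero when the outcome is as likely as not, and it is symmetric around that point. It is this transformed quantity, not the raw probability, that logistic regression assumes is linearly related to continuous predictors.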
A researcher was interested in the effects of information about exercises that relieve back pain delivered in two different ways by doctors. Doctors were recruited from different hospitals and each gave several patients the information. How many levels are there in this hierarchical data structure?
 3
 1
 2
 4
Answer: 3
A researcher was interested in the effects of information about exercises that relieve back pain delivered in two different ways. Several doctors within the same hospital delivered the information to multiple patients. How many levels are there in this hierarchical data structure?
 1
 3
 2
 4
Answer: 2
You've done an ANCOVA but found that the assumption of homogeneity of regression slopes has been violated. Which of these is a potential solution to overcome this problem?
 Multiple regression
 Discriminant function analysis
 Multilevel model
 Factorial ANOVA
Answer: Multilevel model
Missing data pose the least problem for:
 Analysis of variance
 Multiple regression
 Principal component analysis
 Multilevel linear models
Answer: Multilevel linear models
Answer choices
 CIC
 AIC
 CAIC
 BIC