Chapter Summary

Chapter Objectives

5.1: Describe how sampling works.
5.2: Explain what can be learned from a population sample and its limitations.
5.3: Explain why the sampling distribution is key to using inferential statistics.
5.4: Explain how probability samples create a representative sample for use in inference about the population.
5.5: Explore why nonprobability samples can be a satisfactory choice for some projects. 

  • A population is any well-defined set of units of analysis.
  • A sample is any subset of units collected in some manner from a population. Once a sample has been collected, one can derive sample statistics that measure characteristics of the sample to estimate the value of population parameters that describe the characteristics of a population.
  • Samples provide only estimates or approximations of population attributes. Sometimes an estimate may be exactly right; most of the time, however, it will differ from the true value of the population parameter.
  • When researchers report a sample statistic, they assume there will be a margin of error, or a difference between the reported sample statistic and the actual population parameter value.
  • Researchers sacrifice some precision whenever they rely on samples instead of enumerating and measuring the entire population.
  • The goal of statistical inference is to make supportable conjectures about the unknown characteristics of a population based on sample statistics.
    • The key to making inferences from a sample statistic about a population parameter is the sampling distribution.
  • The difference between the sample statistic and the population parameter is called the sampling error, and it arises because only a portion of a population is observed.
  • A sampling distribution is a theoretical frequency distribution of a statistic generated from an infinite number of samples drawn from a population.
  • If a statistic such as the sample mean is calculated for each of many independently and randomly chosen samples, the average of those values will, in the long run, equal the corresponding true, or population, value, no matter what the sample size. Statisticians refer to this long-run mean as the expected value.
  • Larger samples are more likely than smaller samples to be representative of the population from which they are drawn. However, larger samples also cost more to collect.
  • An element, also referred to as a unit of analysis, is a single occurrence, realization, or instance of the objects or entities being studied.
  • A sampling frame is a list from which sampling units are drawn into a sample, and it must be specified clearly.
  • In simple sampling designs, the sampling unit is the same as an element. In more complicated sampling designs, it may be a collection of elements.
  • A stratum or layer is a subgroup of a population that shares one or more characteristics.
  • Any systematic difference between a population and a sample is defined as bias. An unrepresentative sample will lead to inaccurate conclusions about the population.
  • The chapter discusses probability samples, defined as samples for which each element in the population has a known probability of inclusion in the sample. The particular population from which a sample is actually drawn is called a sampling frame, and it must be specified clearly.
  • The following probability sampling methods are the first choice because they tend to produce the most representative samples.
    • In a simple random sample, each element and combination of elements in a population has an equal chance of selection.
    • A systematic sample is generated by selecting elements from a list of the population at a predetermined interval, for example, every fiftieth element on the list.
    • A stratified sample is drawn from a population that has been subdivided into two or more strata based on a single characteristic, and elements are selected from each stratum in proportion to each stratum’s representation in the entire population. A disproportionate stratified sample can also be useful when a researcher wishes to overrepresent a group that, because of its small size in the population, would not otherwise make up a large enough share of the sample to support quality inferences.
    • Cluster samples use groups of elements as an initial sampling frame (for example, the fifty states in the union). Samples are then drawn from increasingly narrow groups (counties, then cities, then blocks) until the final sample of elements is drawn from the smallest group (individuals living in each household).
  • The chapter discusses nonprobability samples, defined as samples for which each element in the population has an unknown probability of inclusion in the sample. These sampling techniques, while generally less representative, are used to collect data when probability samples are not feasible.
    • A judgmental sample is typically used to study a diverse and usually limited number of observations rather than to analyze a sample representative of a larger target population. Observations are often hand selected.
    • A quota sample is a sample in which elements are chosen for inclusion in a nonprobabilistic manner (usually in a purposive or convenient manner) in proportion to their representation in the population.
    • A snowball sample relies on elements in the target population to identify other elements in the population for inclusion in the sample and is particularly useful when studying hard-to-locate or hard-to-identify populations such as drug users, the homeless, or illegal immigrants.
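
The probability sampling methods and the sampling distribution summarized above can be made concrete with a short simulation. The sketch below is illustrative only and is not taken from the chapter. The population of 1,000 units, the "region" stratum, the "score" variable, and all function names are hypothetical assumptions. It also approximates the sampling distribution with a finite number of draws, whereas the theoretical distribution rests on an infinite number of samples.

```python
import random
import statistics


def simple_random_sample(population, n, rng):
    """Each element, and each combination of n elements, is equally likely."""
    return rng.sample(population, n)


def systematic_sample(population, interval, rng):
    """Random start within the first interval, then every interval-th element."""
    start = rng.randrange(interval)
    return population[start::interval]


def proportionate_stratified_sample(population, stratum_of, n, rng):
    """Sample each stratum in proportion to its share of the population.

    Rounding can make the total differ slightly from n for awkward shares.
    """
    strata = {}
    for element in population:
        strata.setdefault(stratum_of(element), []).append(element)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))
        sample.extend(rng.sample(members, k))
    return sample


def sample_means(values, n, draws, rng):
    """Approximate the sampling distribution of the mean with `draws` samples."""
    return [statistics.fmean(rng.sample(values, n)) for _ in range(draws)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 750 urban and 250 rural units with a numeric score.
population = [
    {"id": i, "region": "rural" if i % 4 == 0 else "urban", "score": rng.gauss(50, 10)}
    for i in range(1000)
]
scores = [unit["score"] for unit in population]
pop_mean = statistics.fmean(scores)  # the population parameter

srs = simple_random_sample(population, 100, rng)
sys_sample = systematic_sample(population, 50, rng)  # every fiftieth element
strat = proportionate_stratified_sample(population, lambda u: u["region"], 100, rng)

# Sampling distributions of the mean for a small and a larger sample size.
means_small = sample_means(scores, 25, 2000, rng)
means_large = sample_means(scores, 100, 2000, rng)

print("population mean:", round(pop_mean, 2))
print("mean of sample means (n=25):", round(statistics.fmean(means_small), 2))
print("spread of sample means (n=25):", round(statistics.stdev(means_small), 2))
print("spread of sample means (n=100):", round(statistics.stdev(means_large), 2))
```

Under these assumptions, the average of the sample means should sit close to the population mean at both sample sizes, while the spread of the sample means should be narrower for the larger samples. That is one way to see why larger samples tend to be more representative even though the expected value does not depend on sample size.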