SSRS Encipher Hybrid
A Validated Methodology for Blending Probability and Nonprobability Samples
Encipher Hybrid is a calibration method that uses study-specific outcomes, advanced modeling techniques, and customized non-demographic measures to produce weighting margins optimized for reducing selection bias in key study outcomes.
In a validation study whose results are reported here, Encipher Hybrid:
- Reduced bias in topline estimates by nearly 60% relative to a nonprobability-only sample.
- Reduced bias in subgroup estimates (including breakouts by gender, age, educational attainment, and race) by similar amounts.
- Substantially increased effective sample sizes relative to a probability-only sample.
Reasons to Consider SSRS Encipher Hybrid Samples
The SSRS Encipher Hybrid Approach
Encipher Hybrid is an ideal solution for researchers who need a “middle ground” between the greater accuracy of probability samples and the lower cost of nonprobability samples.
In a hybrid design, we administer a survey to side-by-side probability and nonprobability samples, and then blend the two sets of responses. The probability sample acts as an “anchor” to allow generalizability to the population, while the nonprobability sample provides a cost-effective source of additional respondents, allowing a larger total sample than could feasibly be obtained from probability sources alone. We apply SSRS’s specialized calibration methodology that matches the nonprobability respondents to the probability respondents on non-demographic characteristics that are related to key study outcomes. This corrects for known selection biases and allows the hybrid sample, as a whole, to provide a reasonable snapshot of the target population.
Encipher Hybrid offers solutions throughout the survey lifecycle—before, during, and after data collection—to allow probability and nonprobability completes to be analyzed as a single sample that can accurately be generalized to the population of interest.
Before Data Collection
We select a handful of topic-customized items from the SSRS Encipher Calibration Item Bank for inclusion on the questionnaire. Including these items allows us to go beyond simple demographic weighting and one-size-fits-all solutions to develop a weighting model that is well-tailored to study-specific measures.
Our Calibration Item Bank includes about 40 (and growing) non-demographic items that have been experimentally validated as being strong predictors of differences between probability and nonprobability samples. The items cover multiple topic areas, including Internet and Technology Use, Consumer Behavior, Political Attitudes, Health Behavior, Social and Institutional Trust, Privacy Attitudes, Science Attitudes and Knowledge, and Sports and Leisure Activities. This allows us to select items that are customized to the topic of a given study. Topic customization is critical because, to meaningfully reduce the risk of bias, weighting variables must be predictive of the substantive measures for which population estimates are desired*.
*Little and Vartivarian 2005.
During Data Collection
We administer the full questionnaire to side-by-side probability and nonprobability samples. The size of the probability sample is customized to study needs, but typically comprises 25% to 50% of the total completes. For many target populations, our SSRS Opinion Panel is available as a cost-effective source of probability completes. Of course, SSRS also offers custom probability-based designs.
After Data Collection
After data collection is complete, we follow the process illustrated in the figure below, beginning with the unweighted probability and nonprobability samples, to produce a calibrated hybrid weight.
We begin by weighting the probability sample to an extended set of external demographic benchmarks. We use this weighted probability sample to produce internal benchmarks for the non-demographic items from the Calibration Item Bank.
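As a rough illustration of these two steps, the sketch below rakes a probability sample to external demographic benchmarks and then uses the resulting weights to estimate an internal benchmark for one calibration item. It is not SSRS's production code: raking (iterative proportional fitting) is assumed here as the weighting technique, and the data, the benchmark shares, the `rake` helper, and the `trust_item` variable are all hypothetical.

```python
import numpy as np

def rake(codes_by_margin, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking), shown as an assumed technique.

    codes_by_margin: dict of margin name -> integer category code per respondent
    targets:         dict of margin name -> target population share per category
    Assumes every category of every margin is observed at least once.
    """
    n = len(next(iter(codes_by_margin.values())))
    w = np.ones(n)
    for _ in range(max_iter):
        worst = 0.0
        for name, codes in codes_by_margin.items():
            # Current weighted share of each category on this margin
            share = np.bincount(codes, weights=w, minlength=len(targets[name])) / w.sum()
            factors = np.asarray(targets[name]) / share
            w *= factors[codes]
            worst = max(worst, np.abs(factors - 1).max())
        if worst < tol:  # all margins already match their targets
            break
    return w * n / w.sum()  # rescale so the weights sum to the sample size

# Hypothetical probability sample with two demographic margins
rng = np.random.default_rng(0)
n = 500
prob_margins = {
    "age": rng.integers(0, 3, n),   # three age groups, coded 0..2
    "educ": rng.integers(0, 2, n),  # two education groups, coded 0..1
}
# Hypothetical external benchmarks (population shares per category)
benchmarks = {"age": np.array([0.30, 0.35, 0.35]), "educ": np.array([0.60, 0.40])}

w_prob = rake(prob_margins, benchmarks)

# Internal benchmark: weighted share endorsing a hypothetical binary calibration item
trust_item = rng.integers(0, 2, n)
internal_benchmark = np.average(trust_item, weights=w_prob)
print(round(internal_benchmark, 3))
```

The weighted estimate in `internal_benchmark` would then serve as the target for that item when the combined sample is calibrated.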
We then apply SSRS’s Stepwise Calibration methodology to develop calibrated hybrid weights adjusted both to the external demographic benchmarks and the most useful internal calibration benchmarks. Stepwise Calibration adapts a guided-search algorithm** to identify a final set of weighting margins that is optimal for minimizing bias across a pre-identified set of key substantive measures from the survey. The algorithm begins with a large set of potential weighting margins that includes the items from the Calibration Item Bank along with key demographics. It then progressively narrows this “search space” to a much more parsimonious weighting model, retaining only those margins that meaningfully contribute to reducing selection bias. Using Stepwise Calibration, SSRS data scientists can develop study-tailored weighting models that balance bias reduction and other data quality measures across multiple study outcomes, all while keeping budget and turnaround time under control.
**For examples of similar approaches, see Schouten 2007 and Särndal and Lundström 2010.
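The narrowing idea can be sketched with a simple greedy backward elimination. This is only an illustration under stated assumptions, not the Stepwise Calibration algorithm itself, whose search rule and criterion are not described here. The sketch assumes raking as the weighting step, uses the mean absolute gap between weighted outcome estimates and reference values as the criterion, and drops a margin whenever doing so costs less than a hypothetical `min_gain` threshold. All data, names, and targets are synthetic.

```python
import numpy as np

def rake(codes_by_margin, targets, max_iter=200, tol=1e-8):
    """Rake unit weights so each weighted margin matches its target shares."""
    n = len(next(iter(codes_by_margin.values())))
    w = np.ones(n)
    for _ in range(max_iter):
        worst = 0.0
        for name, codes in codes_by_margin.items():
            share = np.bincount(codes, weights=w, minlength=len(targets[name])) / w.sum()
            factors = np.asarray(targets[name]) / share
            w *= factors[codes]
            worst = max(worst, np.abs(factors - 1).max())
        if worst < tol:
            break
    return w * n / w.sum()

def mean_abs_gap(w, outcomes, reference):
    """Average absolute gap between weighted outcome means and reference values."""
    return float(np.mean([abs(np.average(y, weights=w) - reference[k])
                          for k, y in outcomes.items()]))

def backward_select(margins, targets, outcomes, reference, forced=(), min_gain=0.005):
    """Drop margins one at a time while each drop costs less than min_gain."""
    kept = list(margins)

    def score(names):
        w = rake({m: margins[m] for m in names}, targets)
        return mean_abs_gap(w, outcomes, reference)

    current = score(kept)
    while len(kept) > 1:
        droppable = [m for m in kept if m not in forced]
        if not droppable:
            break
        # Criterion value after removing each candidate margin in turn
        trials = {m: score([k for k in kept if k != m]) for m in droppable}
        cheapest = min(trials, key=trials.get)
        if trials[cheapest] - current > min_gain:
            break  # every remaining margin contributes meaningfully
        kept.remove(cheapest)
        current = trials[cheapest]
    return kept, current

# Synthetic hybrid sample: "early_adopter" is over-represented and drives the outcome
rng = np.random.default_rng(1)
n = 4000
early = (rng.random(n) < 0.7).astype(int)  # 70% in sample vs. 40% target share
margins = {
    "age": rng.integers(0, 3, n),
    "early_adopter": early,
    "noise_item": (rng.random(n) < 0.5).astype(int),  # unrelated to the outcome
}
targets = {"age": [1/3, 1/3, 1/3], "early_adopter": [0.6, 0.4], "noise_item": [0.5, 0.5]}
outcomes = {"uses_app": (rng.random(n) < (0.2 + 0.5 * early)).astype(float)}
reference = {"uses_app": 0.4}  # stand-in for a probability-sample estimate

kept, final_gap = backward_select(margins, targets, outcomes, reference, forced=("age",))
print(kept, round(final_gap, 3))
```

In this toy setup the search discards `noise_item`, which does nothing to reduce the gap, and retains `early_adopter`, which is both over-represented in the sample and predictive of the outcome. A forward search that adds margins one at a time would be an equally plausible variant.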
To validate the Encipher Hybrid methodology, we fielded several surveys covering a broad range of topic areas. Each survey included (1) relevant items from the Encipher Calibration Item Bank and (2) several benchmarkable outcome items that we used to evaluate the success of calibration at reducing selection bias.
Each survey was fielded to:
- A probability sample selected from the SSRS Opinion Panel. Most completes from this sample were by Web, with some phone completes to represent non-Web users.
- A nonprobability sample purchased from an opt-in Web panel vendor. All completes from this sample were by Web.
We combined the resulting completes to create a hybrid sample in which 25% of the completes were from the probability-based SSRS Opinion Panel and the remaining 75% were from nonprobability sources. We applied the Encipher methodology to calibrate on non-demographic items from the Calibration Item Bank, in addition to standard demographics. We refer to this design as Hybrid – calibrated.
We then compared estimates from the Hybrid – calibrated design to estimates from three other designs, all of which were weighted only on standard demographics: Probability, using only the SSRS Opinion Panel; Nonprobability – demo, using only the nonprobability completes; and Hybrid – demo, using the same 25%-75% split but omitting Encipher calibration.
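For readers who want to run this kind of comparison on their own data, the sketch below shows two summary measures that are commonly used for it: the mean absolute bias of a design's estimates against benchmarks, and Kish's approximate effective sample size. The exact metrics used in the validation study are not stated here, so both are assumptions, and every number in the example is a made-up toy value rather than a result from the study.

```python
import numpy as np

def mean_abs_bias(estimates, benchmarks):
    """Average absolute difference between estimates and benchmark values."""
    return float(np.mean(np.abs(np.asarray(estimates) - np.asarray(benchmarks))))

def bias_reduction(bias_design, bias_baseline):
    """Proportional reduction in bias relative to a baseline design."""
    return 1.0 - bias_design / bias_baseline

def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / sum of squared weights."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / (w ** 2).sum())

# Toy values only (percentage points), not results from the validation study
benchmarks        = [50.0, 20.0, 70.0]
nonprob_demo      = [58.0, 26.0, 60.0]  # absolute errors 8, 6, 10
hybrid_calibrated = [54.0, 22.0, 64.0]  # absolute errors 4, 2, 6

bias_nonprob = mean_abs_bias(nonprob_demo, benchmarks)      # 8.0
bias_hybrid = mean_abs_bias(hybrid_calibrated, benchmarks)  # 4.0
print(bias_reduction(bias_hybrid, bias_nonprob))            # 0.5

# Equal weights keep the full sample size; unequal weights shrink it
print(kish_effective_n([1, 1, 1, 1]))  # 4.0
print(kish_effective_n([1, 1, 2, 2]))  # 3.6
```

The same functions can be applied within subgroups, such as gender or age breakouts, by restricting the estimates and weights to the relevant respondents.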