SSRS Encipher Hybrid

A Validated Methodology for Blending Probability and Nonprobability Samples

Encipher Hybrid uses study-specific outcomes, advanced modeling techniques, and customized non-demographic measures to produce weighting margins optimized to reduce selection bias in key study outcomes.

In a validation study, described under Study Design below, Encipher Hybrid:

  • Reduced bias in topline estimates by nearly 60% relative to a nonprobability-only sample.
  • Reduced bias in subgroup estimates (including breakouts by gender, age, educational attainment, and race) by similar amounts.
  • Substantially increased effective sample sizes relative to a probability-only sample.

Reasons to Consider SSRS Encipher Hybrid Samples

Research consistently finds that probability-based samples remain the “gold standard” for projecting estimates to a larger population but can be cost-prohibitive for many clients.

Nonprobability samples offer a much lower cost-per-complete but yield biased population estimates even when weighted on demographics. Self-selection into online samples is driven by characteristics, attitudes, and behaviors that traditional weighting demographics do not capture.

SSRS has developed an approach that uses data science to create a fully customized and affordable solution for each application of hybridized data. This allows us to take advantage of both the low cost of nonprobability sampling (to obtain larger sample sizes) and the statistical rigor of probability sampling (to reduce the risk that estimates will be biased).

The SSRS Encipher Hybrid Approach

Encipher Hybrid is an ideal solution for researchers who need a “middle ground” between the greater accuracy of probability samples and the lower cost of nonprobability samples.

In a hybrid design, we administer a survey to side-by-side probability and nonprobability samples, and then blend the two sets of responses. The probability sample acts as an “anchor” that supports generalization to the population, while the nonprobability sample provides a cost-effective source of additional respondents, allowing a larger total sample than could feasibly be obtained from probability sources alone. We then apply SSRS’s specialized calibration methodology, which matches the nonprobability respondents to the probability respondents on non-demographic characteristics related to key study outcomes. This corrects for known selection biases and allows the hybrid sample, as a whole, to provide a reasonable snapshot of the target population.

Encipher Hybrid offers solutions throughout the survey lifecycle (before, during, and after data collection) so that probability and nonprobability completes can be analyzed as a single sample that generalizes accurately to the population of interest.

Before Data Collection

We select a handful of topic-customized items from the SSRS Encipher Calibration Item Bank for inclusion on the questionnaire. Including these items allows us to go beyond simple demographic weighting and one-size-fits-all solutions to develop a weighting model that is well-tailored to study-specific measures.

Our Calibration Item Bank includes about 40 (and growing) non-demographic items that have been experimentally validated as being strong predictors of differences between probability and nonprobability samples. The items cover multiple topic areas, including Internet and Technology Use, Consumer Behavior, Political Attitudes, Health Behavior, Social and Institutional Trust, Privacy Attitudes, Science Attitudes and Knowledge, and Sports and Leisure Activities. This allows us to select items that are customized to the topic of a given study. Topic customization is critical because, to meaningfully reduce the risk of bias, weighting variables must be predictive of the substantive measures for which population estimates are desired*.

*Little and Vartivarian 2005.

During Data Collection

We administer the full questionnaire to side-by-side probability and nonprobability samples. The probability sample is sized to study needs and typically accounts for 25% to 50% of the total completes. For many target populations, our SSRS Opinion Panel is available as a cost-effective source of probability completes. SSRS also offers custom probability-based designs.

After Data Collection

After data collection is complete, we follow the process illustrated in the figure below, beginning with the unweighted probability and nonprobability samples, to produce a calibrated hybrid weight.

Figure: SSRS Encipher Hybrid calibration weighting procedure

We begin by weighting the probability sample to an extended set of external demographic benchmarks. We use this weighted probability sample to produce internal benchmarks for the non-demographic items from the Calibration Item Bank.
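As a rough illustration of this first step (not SSRS's production code), the sketch below weights a probability sample to external demographic margins and then derives an internal benchmark for one non-demographic item. Raking (iterative proportional fitting) is assumed here as the calibration technique, since the text does not name one. The function `rake`, the variables `age` and `educ`, the target shares, and the simulated data are all hypothetical.

```python
import numpy as np

def rake(categories, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking), assumed for illustration.

    categories: dict of margin name -> integer-coded array, one code per respondent
    targets:    dict of margin name -> array of target population shares per code
    Assumes every category is observed at least once in the sample.
    Returns weights scaled to sum to the number of respondents.
    """
    n = len(next(iter(categories.values())))
    w = np.ones(n)
    for _ in range(max_iter):
        max_gap = 0.0
        for name, codes in categories.items():
            # Current weighted share of each category on this margin
            shares = np.bincount(codes, weights=w, minlength=len(targets[name])) / w.sum()
            max_gap = max(max_gap, np.abs(shares - targets[name]).max())
            # Scale each respondent's weight by target share / current share
            w *= (targets[name] / shares)[codes]
        if max_gap < tol:
            break
    return w * n / w.sum()

# Simulated probability sample (made-up data, for illustration only)
rng = np.random.default_rng(0)
n = 2000
prob = {"age": rng.integers(0, 3, n), "educ": rng.integers(0, 2, n)}

# Hypothetical external demographic benchmarks
external = {"age": np.array([0.30, 0.35, 0.35]), "educ": np.array([0.62, 0.38])}
w = rake(prob, external)

# Hypothetical yes/no calibration item; its weighted distribution in the
# probability sample serves as the internal benchmark
item = rng.integers(0, 2, n)
internal_benchmark = np.bincount(item, weights=w, minlength=2) / w.sum()
print(internal_benchmark)
```

The internal benchmark is simply the weighted distribution of the item in the probability sample, which can then be treated as a target margin when the combined sample is weighted.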

We then apply SSRS’s Stepwise Calibration methodology to develop calibrated hybrid weights adjusted to both the external demographic benchmarks and the most useful internal calibration benchmarks. Stepwise Calibration adapts a guided-search algorithm** to identify a final set of weighting margins that is optimal for minimizing bias across a pre-identified set of key substantive measures from the survey. The algorithm begins with a large set of potential weighting margins that includes the items from the Calibration Item Bank along with key demographics. It then progressively narrows this “search space” to a much more parsimonious weighting model, retaining only those margins that meaningfully contribute to reducing selection bias. Using Stepwise Calibration, SSRS data scientists can develop study-tailored weighting models that balance bias reduction and other data quality measures across multiple study outcomes, all while keeping budget and turnaround time under control.

**For examples of similar approaches, see Schouten 2007 and Särndal and Lundström 2010.
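The general idea of searching over candidate weighting margins can be sketched with a generic greedy forward selection. This is a stand-in technique, not the Stepwise Calibration algorithm itself, whose details are not given here. The sketch assumes raking as the calibration method and uses a hypothetical bias proxy: the mean absolute gap between weighted outcome estimates and reference values (for example, estimates from the weighted probability sample). The names `rake`, `proxy_bias`, `forward_select`, and `min_gain`, and all data, are illustrative assumptions.

```python
import numpy as np

def rake(categories, targets, max_iter=200, tol=1e-8):
    """Raking (iterative proportional fitting); returns weights summing to n."""
    n = len(next(iter(categories.values())))
    w = np.ones(n)
    for _ in range(max_iter):
        gap = 0.0
        for name, codes in categories.items():
            shares = np.bincount(codes, weights=w, minlength=len(targets[name])) / w.sum()
            gap = max(gap, np.abs(shares - targets[name]).max())
            w *= (targets[name] / shares)[codes]
        if gap < tol:
            break
    return w * n / w.sum()

def proxy_bias(w, outcomes, reference):
    """Mean absolute gap between weighted outcome means and reference values."""
    est = (w[:, None] * outcomes).sum(axis=0) / w.sum()
    return float(np.abs(est - reference).mean())

def forward_select(candidates, targets, forced, outcomes, reference, min_gain=0.005):
    """Greedily add the margin that most reduces the bias proxy.

    forced must be non-empty (e.g. core demographics that are always kept).
    Stops when the best remaining margin improves the proxy by less than min_gain.
    """
    selected = list(forced)
    best = proxy_bias(rake({m: candidates[m] for m in selected}, targets), outcomes, reference)
    remaining = [m for m in candidates if m not in selected]
    while remaining:
        trials = {}
        for m in remaining:
            w_try = rake({k: candidates[k] for k in selected + [m]}, targets)
            trials[m] = proxy_bias(w_try, outcomes, reference)
        m_best = min(trials, key=trials.get)
        if best - trials[m_best] < min_gain:
            break
        selected.append(m_best)
        remaining.remove(m_best)
        best = trials[m_best]
    weights = rake({m: candidates[m] for m in selected}, targets)
    return selected, weights, best

# Simulated sample (made-up data): "tech" users are over-represented
# (about 70% of the sample versus a 50% benchmark) and drive the outcome;
# "noise" is unrelated to both selection and the outcome.
rng = np.random.default_rng(1)
n = 4000
age = rng.integers(0, 3, n)
tech = (rng.random(n) < 0.7).astype(int)
noise = rng.integers(0, 2, n)
outcome = (rng.random(n) < np.where(tech == 1, 0.8, 0.3)).astype(float)

candidates = {"age": age, "tech": tech, "noise": noise}
targets = {
    "age": np.array([1 / 3, 1 / 3, 1 / 3]),
    "tech": np.array([0.5, 0.5]),
    "noise": np.array([0.5, 0.5]),
}
reference = np.array([0.55])  # outcome mean implied by the 50/50 "tech" benchmark

selected, weights, bias = forward_select(
    candidates, targets, forced=["age"], outcomes=outcome[:, None], reference=reference
)
print(selected, round(bias, 4))
```

In this toy setup the search keeps the margin that explains the selection bias and drops the uninformative one. A production procedure would also need to weigh other quality measures, such as the variance added by each extra margin.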

Study Design

To validate the Encipher Hybrid methodology, we fielded several surveys covering a broad range of topic areas. Each survey included (1) relevant items from the Encipher Calibration Item Bank and (2) several benchmarkable outcome items that we used to evaluate the success of calibration at reducing selection bias.

Each survey was fielded to:

  • A probability sample selected from the SSRS Opinion Panel. Most completes from this sample were by Web, with some phone completes to represent non-Web users.
  • A nonprobability sample purchased from an opt-in Web panel vendor. All completes from this sample were by Web.

We combined the resulting completes to create a hybrid sample in which 25% of the completes were from the probability-based SSRS Opinion Panel and the remaining 75% were from nonprobability sources. We applied the Encipher methodology to calibrate on non-demographic items from the Calibration Item Bank, in addition to standard demographics. We refer to this design as Hybrid – calibrated.

We then compared estimates from the Hybrid – calibrated design to estimates from three other designs, all of which were weighted only on standard demographics: Probability, using only the SSRS Opinion Panel; Nonprobability – demo, using only the nonprobability completes; and Hybrid – demo, using the same 25%-75% split but omitting Encipher calibration.
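A comparison of this kind rests on two summary measures: how far a design's estimates fall from external benchmarks, and how much usable sample remains after weighting. The sketch below shows one common way to compute each. It assumes mean absolute bias across benchmarked items and Kish's approximation for effective sample size; the exact metrics used in the validation study are not stated here. All numbers are invented for illustration and are not results from the study.

```python
import numpy as np

def mean_abs_bias(estimates, benchmarks):
    """Average absolute difference between estimates and benchmark values."""
    return float(np.mean(np.abs(np.asarray(estimates) - np.asarray(benchmarks))))

def relative_reduction(bias_new, bias_ref):
    """Share of the reference design's bias removed by the new design."""
    return 1.0 - bias_new / bias_ref

def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = np.asarray(weights, dtype=float)
    return float(w.sum() ** 2 / (w ** 2).sum())

# Invented benchmark values and estimates (percentage points), illustration only
benchmarks = [40.0, 25.0, 60.0]
nonprob_demo = [46.0, 31.0, 52.0]      # hypothetical demographics-only estimates
hybrid_calibrated = [43.0, 28.0, 56.0]  # hypothetical calibrated hybrid estimates

bias_nonprob = mean_abs_bias(nonprob_demo, benchmarks)
bias_hybrid = mean_abs_bias(hybrid_calibrated, benchmarks)
reduction = relative_reduction(bias_hybrid, bias_nonprob)

# Equal weights keep the full sample size; unequal weights shrink it
n_eff_equal = kish_effective_n([1, 1, 1, 1])
n_eff_unequal = kish_effective_n([1, 1, 2, 2])
print(bias_nonprob, bias_hybrid, reduction, n_eff_equal, n_eff_unequal)
```

Reporting both measures together matters because a weighting model can lower bias while inflating weight variability, which reduces the effective sample size.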

Read the findings in our full report.

Meet Encipher Hybrid

This paper introduces the new SSRS Encipher Hybrid methodology for blending probability and nonprobability samples.

Get the Facts

Just want the highlights?
Here’s a one-page spec sheet you can download and share.

Another Option: Encipher Nonprob

Improving Weighting for Nonprobability-only Samples

Encipher in Action: Current Work, Case Studies & Conference Presentations

Cornell 2022 Collaborative Midterm Survey

  • The NSF-funded Collaborative Midterm Survey is a multimode survey (including probability and nonprobability samples) conducted by three data collection teams (SSRS; Gradient and Survey 160; and the University of Iowa) selected from an open call for data collection proposals.
  • For its sample, SSRS used a combination of the probability-based SSRS Opinion Panel, a supplemental address-based sample, and multiple nonprobability panels. The SSRS sample was calibrated using our Encipher Hybrid methodology for blending probability and nonprobability data.
  • The SSRS sample presented a highly accurate picture of the 2022 midterm electorate. The estimated House generic ballot from the pre-election SSRS sample was within 1 percentage point of the actual results.

SSRS Encipher Hybrid: A Case Study in Adaptation and Using Hybrid Samples to Produce Estimates for Subgroups

SSRS Encipher Hybrid: A Case Study in Using Our Hybrid Approach to Better Understand Perceptions of For-Profit Colleges

Can one weight fit all? Adjusting Hybrid Samples for Subgroup Estimation

MAPOR 2022 Conference Presentation

SSRS is a full-service research organization. Our highly experienced SSRS Methods, Analytics, and Data Science (MADS) group conceptualizes and manages survey data collection from design to dissemination. The group has extensive expertise in methodological experimentation, sampling, weighting, data collection planning, monitoring, adaptive and tailored design, data editing and imputation, documentation, reporting, and quantitative analysis.

Contact the MADS Team for More Information about Encipher
