Non-probability Samples:

Emerging Methods and Models for High Quality Research

In our “Status of Telephone Interviewing” white paper, we documented how deep empirical dives into the quality of telephone research find that, despite significant declines in participation, telephone research is of the same high quality that it was 20 and even 30 years ago.

Election polling research has found that error rates in predicting the popular vote have been unchanged for a generation [1], and likely since the advent of political polling in the late 1940s.  Failures in certain states aside, most national pollsters predicted the popular vote in 2016 with greater accuracy than in 2012 or 2008.  And modern-day research finds that, if anything, bias due to errors in telephone surveys is not increasing but declining [2].

That said, it is undeniable that declining participation rates, FCC legislation mandating how cell phones are dialed, and declining working number rates on landlines have all contributed to a stark increase in the cost of telephone research.  While precise data has not been made public by survey research companies, it is fair to say that the cost of a telephone survey has likely doubled in the past decade.

While many survey consumers have absorbed the increased cost, others have had much greater difficulty in doing so, and have had to move to lower-cost alternatives.

Probability-based Internet panels are one viable alternative.  However, they tend to be smaller and thus are poorly suited to studies of low-incidence populations or small geographies, studies that require large sample sizes, and tracking studies that field monthly, weekly, or continuously.  That leaves quite a lot of research for which probability-based panels are not a good fit.

The second alternative is, of course, a nonprobability (convenience, or opt-in) Internet panel.  And while much of market research has leveraged this approach for more than a decade, social science researchers and pollsters have avoided such panels nearly universally.  And for good reason: Every first-tier blind-review research project dedicated to assessing the quality and accuracy of data obtained via nonprobability panels finds serious errors, bias, and variance [3].  While nonprobability panels can arrive at accurate estimates, relying on such an outcome is akin to rolling the dice and hoping for double sixes [4].

So if, for a given project, telephone is unaffordable, probability panels are not a great fit, and nonprobability panels are too risky to rely upon, what is a well-intentioned survey researcher to do?

There are options.  SSRS, in partnership with the Data Science team at the University of Massachusetts-Boston, has conducted over two years of research and development on the use of nonprobability samples in social science research.  In one approach, available companion data is used to model and calibrate a nonprobability sample.  A second option is a true hybrid approach that blends probability and nonprobability samples to attain a complete, representative, unified sample.

As a number of researchers have found, there is no “cookie cutter” approach to making nonprobability samples representative [5].  Indeed, a range of papers presented over the last few years at the conference of the American Association for Public Opinion Research have found that “universal solutions,” regardless of approach (which can include sample matching, propensity weighting, and/or traditional calibration/raking), fare poorly in reducing bias for many surveys.
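For readers less familiar with the techniques named above, the following is a minimal sketch of raking (iterative proportional fitting), the traditional calibration step.  It is illustrative only: the function names `rake` and `weighted_share`, the toy respondents, and the target shares are all assumptions made for this example, not SSRS data or code.

```python
def rake(rows, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking).

    rows    : list of dicts, one per respondent, e.g. {"sex": "F", "age": "50+"}
    margins : {variable: {category: target population share}}
    Returns one weight per respondent; weights sum to len(rows).
    """
    n = len(rows)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            # Current weighted total for each category of this variable.
            totals = {}
            for row, w in zip(rows, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            # Rescale so the weighted share of each category hits its target.
            for i, row in enumerate(rows):
                factor = targets[row[var]] * n / totals[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # every margin already matched in this pass
            break
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted share of respondents falling in one category."""
    hit = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return hit / sum(weights)


# Hypothetical opt-in sample that skews male and young.
sample = (
    [{"sex": "M", "age": "18-49"}] * 4
    + [{"sex": "M", "age": "50+"}] * 2
    + [{"sex": "F", "age": "18-49"}] * 3
    + [{"sex": "F", "age": "50+"}] * 1
)
# Hypothetical population targets (not real benchmarks).
targets = {"sex": {"M": 0.5, "F": 0.5}, "age": {"18-49": 0.4, "50+": 0.6}}

weights = rake(sample, targets)
print(round(weighted_share(sample, weights, "sex", "F"), 3))
print(round(weighted_share(sample, weights, "age", "50+"), 3))
```

Raking of this kind matches each margin separately and says nothing about how the variables combine, which is one reason a single universal recipe can leave bias in place.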

There is a need for a solution that is relatively universal in its approach, but highly customized for any given specific study.

SSRS/UMass has developed four different methods to reduce error and bias in nonprobability samples.  The first uses a machine learning technique to drive sample matching, selecting the nonprobability sample against gold-standard probability benchmarks.  The second is a model that reduces the bias nonprobability samples carry from coverage error, specifically their exclusion of persons who do not access the Internet.  These are overarching, preventative solutions, but they typically do not erase bias completely.  While these first two methods are useful, the third and fourth approaches are our principal solutions and have been used quite successfully for a number of projects.
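To make the idea of sample matching concrete, here is a deliberately simple sketch: a greedy nearest-neighbor match that, for each case in a probability reference sample, selects the most similar unused panelist.  This is a generic textbook-style illustration, not the SSRS/UMass machine learning method; the function name `match_sample`, the distance measure, and the toy records are all assumptions.

```python
def match_sample(reference, panel, keys):
    """Greedy nearest-neighbor matching without replacement.

    reference : list of dicts from a probability (benchmark) sample
    panel     : list of dicts for available nonprobability panelists
    keys      : covariates to match on
    Returns the panel indexes selected, one per reference case.
    Assumes len(panel) >= len(reference).
    """
    def distance(a, b):
        # Number of covariates on which two cases disagree.
        return sum(a[k] != b[k] for k in keys)

    available = list(range(len(panel)))
    selected = []
    for ref in reference:
        # Closest remaining panelist; ties go to the earliest index.
        best = min(available, key=lambda i: distance(ref, panel[i]))
        selected.append(best)
        available.remove(best)
    return selected


# Hypothetical reference cases and panelists.
reference = [{"sex": "F", "age": "65+"}, {"sex": "M", "age": "18-34"}]
panel = [
    {"sex": "M", "age": "18-34"},
    {"sex": "M", "age": "18-34"},
    {"sex": "F", "age": "18-34"},
    {"sex": "F", "age": "65+"},
]
print(match_sample(reference, panel, keys=("sex", "age")))
```

The selected panelists mirror the reference sample only on the covariates used for matching, which is consistent with the point above that such preventative steps reduce bias without fully removing it.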

The first of these principal solutions is used for studies that gather all of their data via a nonprobability panel.  In this approach, we leverage the SSRS Omnibus [6] to gather key benchmarks in a simple random sample telephone survey, and use those benchmarks to model and calibrate the nonprobability survey.  The details of how this approach is executed depend on the existence of prior data, but typically we take whatever information is available prior to a study and combine it with Omnibus data to run a machine learning technique that identifies key interactions among variables; we then use those interactions to recalibrate the nonprobability sample.  This approach can be quite effective, but its effectiveness depends on a number of features of the data.  Given that the study is 100% nonprobability data, this is considered the lower-cost but also lower-quality solution of the two presented here.
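As a rough illustration of calibrating on an interaction rather than on separate margins, the sketch below post-stratifies a nonprobability sample to benchmark shares for crossed age-by-education cells.  It assumes the interaction has already been chosen; the function names, the cell shares, and the toy outcome `y` are hypothetical and stand in for benchmarks that would come from a probability survey.

```python
from collections import Counter


def poststratify(rows, benchmark_shares, keys):
    """Weight = benchmark cell share / sample cell share.

    rows             : list of dicts, one per nonprobability respondent
    benchmark_shares : {(value of keys[0], value of keys[1], ...): share}
    keys             : variables whose crossing defines the cells
    """
    n = len(rows)

    def cell(row):
        return tuple(row[k] for k in keys)

    counts = Counter(cell(r) for r in rows)
    return [benchmark_shares[cell(r)] / (counts[cell(r)] / n) for r in rows]


def weighted_mean(rows, weights, var):
    """Weighted mean of a numeric variable."""
    return sum(w * r[var] for r, w in zip(rows, weights)) / sum(weights)


# Hypothetical opt-in sample: over-represents younger college graduates.
panel = (
    [{"age": "18-49", "educ": "college", "y": 1}] * 5
    + [{"age": "18-49", "educ": "no college", "y": 0}] * 2
    + [{"age": "50+", "educ": "college", "y": 1}] * 2
    + [{"age": "50+", "educ": "no college", "y": 0}] * 1
)
# Hypothetical benchmark shares for the crossed cells (sum to 1).
benchmarks = {
    ("18-49", "college"): 0.15,
    ("18-49", "no college"): 0.25,
    ("50+", "college"): 0.20,
    ("50+", "no college"): 0.40,
}

weights = poststratify(panel, benchmarks, keys=("age", "educ"))
print(weighted_mean(panel, [1.0] * len(panel), "y"))  # unweighted estimate
print(weighted_mean(panel, weights, "y"))             # calibrated estimate
```

Because the weights are set cell by cell, the calibrated sample reproduces the joint age-by-education distribution, not just each margin on its own.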

The final, higher-quality approach is the true hybrid study.  In short, this is a study where somewhere between 20% and 80% of the data are gathered via a nonprobability panel and the remainder via telephone sample.  At one end of this spectrum, where 80% are gathered via RDD telephone (or a probability panel), we have found almost no situation in which our modeling technique cannot make the data essentially as representative and accurate as a 100% RDD telephone survey, at a modest but significant reduction in cost in nearly all cases.  At the other end, where 80% is gathered by nonprobability sample, the modeling has to work much harder to attain representativeness, but this can be quite successful given a number of key features of the data, such as a relatively large sample size and modeling variables that relate strongly to sample type and to other variables in the data.  This approach substantially reduces costs, by as much as 50% compared with an all-RDD telephone survey.

The hybrid approach uses machine learning to identify not just key main variables but also key interactions between variables, and to assess how effective those variables and interactions are for advanced calibration and modeling.

This approach leverages state-of-the-art Big Data analytical techniques to provide the best solution possible for combining the probability and nonprobability data into a unified sample.  The goal is not to get the nonprobability sample to be representative, but rather for it to “contribute” to representativeness in concert with the probability sample.  In studies where this approach has been executed, we have found that the differences between gold-standard probability data and the hybridized data fall to insignificance, with only a few minor exceptions in the worst of cases.  Again, the efficacy of this procedure will depend upon a number of features of the approach and the data, but overall it is a viable solution for those needing to reduce cost compared to a telephone survey while maintaining data quality.
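One simple way to picture a nonprobability sample “contributing” to a probability sample is a cell-based composite weight, sketched below.  This is an assumed, simplified stand-in and not the SSRS/UMass model: within each cell, nonprobability cases share a fixed fraction of the probability sample's weighted total, so the blended sample keeps the probability sample's cell distribution.  The function name `blend_weights`, the mixing share `lam`, and the toy data are hypothetical.

```python
from collections import defaultdict


def blend_weights(prob, nonprob, keys, lam=0.5):
    """Composite weights for a stacked probability + nonprobability sample.

    prob    : list of dicts with a design weight under "weight"
    nonprob : list of dicts (no design weights)
    keys    : variables whose crossing defines the cells
    lam     : share of each cell's weight kept by probability cases
    Returns (probability weights, nonprobability weights).
    """
    def cell(row):
        return tuple(row[k] for k in keys)

    prob_total = defaultdict(float)    # weighted probability total per cell
    for r in prob:
        prob_total[cell(r)] += r["weight"]
    nonprob_count = defaultdict(int)   # nonprobability case count per cell
    for r in nonprob:
        nonprob_count[cell(r)] += 1

    prob_w = []
    for r in prob:
        # Cells with no nonprobability cases rely on probability cases alone.
        share = lam if nonprob_count[cell(r)] > 0 else 1.0
        prob_w.append(share * r["weight"])

    nonprob_w = []
    for r in nonprob:
        c = cell(r)
        # Nonprobability cases split the remaining share of the cell total.
        # A cell with no probability cases gets weight 0 in this sketch.
        nonprob_w.append((1.0 - lam) * prob_total[c] / nonprob_count[c])
    return prob_w, nonprob_w


# Hypothetical data: 4 weighted telephone cases, 4 opt-in panel cases.
prob = [
    {"age": "18-49", "weight": 1.0},
    {"age": "18-49", "weight": 1.0},
    {"age": "50+", "weight": 2.0},
    {"age": "50+", "weight": 2.0},
]
nonprob = [{"age": "18-49"}] * 3 + [{"age": "50+"}]

prob_w, nonprob_w = blend_weights(prob, nonprob, keys=("age",), lam=0.5)
print(prob_w)
print(nonprob_w)
```

In this sketch the opt-in cases add sample size within cells while the probability sample alone determines how much each cell counts; a real hybrid model would choose cells, interactions, and mixing shares from the data.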

Have we piqued your curiosity? Good! We are always happy to talk further about our cutting-edge approaches. Let us help you find the best solution for your research project.



[2] David Dutwin and Trent Buskirk (2017). Telephone Sample Surveys: Dearly Beloved or Nearly Departed? Trends in Survey Errors in the Age of Declining Response Rates.  Under Peer Review.

[3] See Chiang and Krosnick, 2009; Dutwin and Buskirk, 2017; Malhotra and Krosnick, 2007; Walker et al., 2010; Yeager et al., 2011.

[4] While this might seem a stretch, consider that Walker et al. found that only two of 17 web panels were able to estimate the smoking rate in the U.S. within a few percentage points, while others produced a smoking rate nearly double the true national prevalence (33% compared to 17.5%).

[5] Andrew W. Mercer; Frauke Kreuter; Scott Keeter; Elizabeth A. Stuart (2017). Theory and Practice in Nonprobability Surveys: Parallels between Causal Inference and Survey Inference. Public Opinion Quarterly, 81 (S1), 250–271.

