On the morning of Nov. 8, 2016, many Americans woke up confident that Hillary Clinton would be elected the nation’s first female president.
Their confidence was driven, in no small part, by a pervasive message that Clinton was ahead in the polls and forecasts leading up to the election. Polling aggregation sites, such as Huffington Post’s Pollster and The New York Times Upshot blog, reported that Clinton was virtually certain to win. It soon became clear that these models were off the mark.
Since then, forecasters and media prognosticators have dissected what went wrong. The finger-pointing almost inevitably landed on public opinion polling, especially at the state level. The polls, critics argued, led modelers and the public to vastly overestimate the likelihood of a Clinton win.
With the 2018 elections coming up, many in the public have expressed skepticism that public opinion polls can be trusted this time around. Indeed, at a time when a majority of American adults no longer even have landline telephones, when many people answer only calls that originate from a known number, and when pollsters’ calls are sometimes flagged as likely spam, there are lots of reasons to worry.
But polling firms seem to be going about their business as usual, and those of us who do research on the quality of public opinion research are not particularly alarmed about what’s going on.
One might be tempted to think that those of us in the polling community are simply out to lunch. But the data from 2016 tell a distinctly different story.
The national polls were fairly accurate, both in their estimate of the 2016 popular vote and by historical standards. In the average preelection national poll, Clinton was ahead of Donald Trump by 3.3 percentage points. She proceeded to win the popular vote by 2.1 percentage points. Pollsters missed the mark by a mere 1.2 percentage points on average.
The polls in the Upper Midwest states missed by larger margins. These polls were conducted in ways that pollsters widely know to be suboptimal. They relied heavily on robocalls; on surveys of people who volunteer to take surveys on the internet; and on samples of respondents from voter files with incomplete information.
So why was the 2016 election so shocking? The big reason wasn’t the polls; it was our expectations.
In the last few years, members of the public have come to expect that a series of highly confident models can tell us exactly what is going to happen in the future. But in the runup to the 2016 election, these models made a few big, problematic assumptions.
For one, they largely assumed that the different errors that different polls had were independent of one another. But the challenges that face contemporary polling, such as the difficulty of reaching potential respondents, can induce small but consistent errors across almost all polls.
When modelers treat errors as independent of one another, they draw conclusions that are far more precise than they should be. The average poll is indeed the best guess at the outcome of an election, but national polling averages are often off by around 2 percentage points. State polls can be off by even more at times.
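To see why the independence assumption matters, here is a minimal simulation sketch. It is not any forecaster's actual model. The number of polls (20), the size of each poll's own random error (3 points) and the size of the error shared by all polls (2 points) are hypothetical values chosen only for illustration, and the function name is ours.

```python
import random
import statistics

def spread_of_polling_average(n_polls, independent_sd, shared_sd,
                              trials=5000, seed=0):
    """Simulate many elections and return the standard deviation
    of the error in the polling average, in percentage points.

    Each poll's error = one error shared by every poll in that election
    + that poll's own independent error. All inputs are hypothetical.
    """
    rng = random.Random(seed)
    average_errors = []
    for _ in range(trials):
        # One draw per election: hits every poll in the same direction.
        shared_error = rng.gauss(0, shared_sd)
        poll_errors = [shared_error + rng.gauss(0, independent_sd)
                       for _ in range(n_polls)]
        average_errors.append(statistics.fmean(poll_errors))
    return statistics.pstdev(average_errors)

# Errors assumed fully independent: averaging shrinks them a lot.
independent_only = spread_of_polling_average(20, independent_sd=3.0, shared_sd=0.0)

# Same polls, plus a 2-point error common to all of them.
with_shared = spread_of_polling_average(20, independent_sd=3.0, shared_sd=2.0)

print(f"Independent errors only: {independent_only:.2f} points")
print(f"With a shared error:     {with_shared:.2f} points")
```

Under these assumptions, averaging 20 polls cancels most of the independent error but none of the shared error, so a model that ignores the shared part will look far more certain than it should.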
In addition, polling aggregators and public polling information have been flooded with lower-quality surveys based on suboptimal methods. These methods can sometimes produce accurate estimates, but the processes by which they do so are not well understood on theoretical grounds. There are lots of reasons to think that these methods may not produce consistently accurate results in the future. Unfortunately, there will likely continue to be lots of low-quality polls, because they are so much less expensive to conduct.
Research out of our lab suggests yet another reason that the polls were shocking to so many: When ordinary people look at the evidence from polling, just as with other sources of information, they tend to see the results they desire.
During the 2016 election campaign, we asked Americans to compare two preelection polls – one where Clinton was leading and one where Trump was ahead. Across the board, Clinton supporters told us that the Clinton-leading poll was more accurate than the Trump-leading poll. Trump supporters reported exactly the opposite perceptions. In other studies, we saw the same phenomenon when people were exposed to poll results showing majorities in favor of or opposed to their own views on policy issues such as gun control or abortion.
So, what does this all mean for someone reading the polls in 2018?
You don’t have to ignore the results – just recognize that all polling has some error. While even the experts may not know quite which way that error is going to point, we do have a sense of the size of that error. Error is likely to be smaller when considering a polling average instead of an individual poll.
It’s also a good bet that the actual result will be within 3 percentage points of an average of high-quality national polls. For similarly high-quality state polls, it will likely be within about 5 percentage points, because these polls usually have smaller sample sizes.
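As a rough illustration, the rule of thumb above can be turned into a quick check. This sketch assumes the 3-point and 5-point bands apply to the lead reported in a polling average, which is one reading of the guidance rather than a formal margin of error. The function names are ours, and the example reuses the 2016 national figures cited earlier (an average lead of 3.3 points and an actual margin of 2.1).

```python
# Rule-of-thumb error bands in percentage points, taken from the text.
# Applying them to the lead in a polling average is an assumption.
ERROR_BAND = {"national": 3.0, "state": 5.0}

def plausible_lead_range(average_lead, level):
    """Return the (low, high) range of leads consistent with the band."""
    band = ERROR_BAND[level]
    return (round(average_lead - band, 1), round(average_lead + band, 1))

def within_band(average_lead, actual_lead, level):
    """Check whether the actual lead fell inside the rule-of-thumb range."""
    low, high = plausible_lead_range(average_lead, level)
    return low <= actual_lead <= high

low, high = plausible_lead_range(3.3, "national")
print(f"Plausible national lead: {low} to {high} points")
print("2016 national result inside band:", within_band(3.3, 2.1, "national"))
```

Read this way, a 3.3-point lead in a national average is consistent with anything from a near tie to a comfortable win, which is a much wider range than a single headline number suggests.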
What makes a high-quality poll? It will either use live interviewers calling both landlines and cellphones or recruit respondents using offline methods to take surveys online. Look for polls conducted around the same time to see whether they got the same result. If not, see whether they sampled the same kind of people, used the same interviewing technique or used similar question wording. Differences in these choices often explain differences in reported results.
The good news is that news consumers can easily find out about a poll’s quality. This information is regularly included in news stories and is shown by many poll aggregators. What’s more, pollsters are increasingly transparent about the methods they use.
Polls that don’t use these methods should be taken with a big grain of salt. We simply don’t know enough about when they will succeed and when they will fail.