Opinion Polls: How accurate are they?

I've been watching the presidential polls and wondering just how accurate they are, or will be, in the upcoming 2020 election. In 2016, the majority of the polls nailed the popular vote to well within their customary 3% margin of error, but there were problems in some of the critical battleground states, namely PA, MI, and WI, that led to Donald Trump's upset win after many pundits had given Hillary an 80-90% chance of winning. So what went wrong?
One answer to that question is what pollsters call non-response bias. Basically, it says that certain groups, such as uneducated males over 50, will not answer the phone or, if they do, are not inclined to engage in a conversation with a stranger or a robocaller, or may not own a phone at all.
There's also the possibility that respondents don't answer honestly: they may be embarrassed to support one candidate or the other, and only in the privacy of the voting booth do they express their true preference.
A third problem is identifying likely voters. With 40% or so of those eligible not casting a vote, the results can be skewed if pollsters aren't able to identify those who will actually show up at the polls.
So why are state polls so difficult to get right? First off, there are fewer polls taken in individual states than nationwide, and many of the pollsters doing them are new to the game, so their results can be very volatile. Many state polls are run by local newspapers, which have come under intense budgetary pressure. They want results to publish, but they don't want to pay a fortune to do it right. Another issue is that the data pollsters use to weight their results is unreliable. Most use census data, which is collected only every 10 years and can vary from county to county. Cell phone coverage is generally more reliable in the big cities than in rural areas. The nation is more divided along rural vs. urban lines than it has ever been, which magnifies the errors these biases produce, especially when polling is done on a smaller scale, as it is statewide.
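To put a rough number on the "smaller scale" point: under simple random sampling, the 95% margin of error for a candidate's share near 50% is about 1.96 * sqrt(0.25 / n), where n is the number of respondents. The sketch below is only an illustration. The function name and the sample sizes are my own assumptions, not figures from any actual poll, and it covers pure sampling error only, not the biases discussed here.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a share p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes, chosen only for illustration
for n in (400, 600, 1000):
    print(f"n = {n}: +/- {margin_of_error(n) * 100:.1f} points")
```

A sample of about 1,000 gives roughly the customary 3%, while a sample of 400 gives closer to 5%, so a poll with fewer respondents starts out with a wider error band before any bias is added on top.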
The biases, if quantified, can be accounted for in a poll result. For example, if I know that Trump voters are 50% less likely to answer the phone than Biden voters, I can factor that information into a prediction model and adjust for non-response bias. If I know that a 70-year-old retired person is 50% more likely to show up and cast their vote, and that older people tend to favor Trump, I can factor a likely-voter component into the equation.
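Here is a minimal sketch of that kind of adjustment, assuming the response rates and turnout rates are already known (in practice they are the hard part). Each group's raw count is divided by its relative response rate and, optionally, multiplied by its turnout probability. The function name, the raw counts, and all the rates are made-up numbers for illustration only, not real polling data or any pollster's actual method.

```python
def adjusted_shares(raw_counts, response_rate, turnout=None):
    """Reweight raw respondent counts by inverse response rate and, optionally, turnout probability."""
    weighted = {}
    for group, count in raw_counts.items():
        weight = 1.0 / response_rate[group]   # undo under-representation from non-response
        if turnout is not None:
            weight *= turnout[group]          # scale by the chance the group actually votes
        weighted[group] = count * weight
    total = sum(weighted.values())
    return {group: value / total for group, value in weighted.items()}

# Hypothetical raw poll of 1,000 respondents
raw = {"Biden": 600, "Trump": 400}
# Assumption: Trump voters are 50% less likely to answer the phone
rates = {"Biden": 1.0, "Trump": 0.5}

for candidate, share in adjusted_shares(raw, rates).items():
    print(f"{candidate}: {share:.1%}")
```

With these invented inputs, a raw 60-40 lead flips once the non-response adjustment is applied, which shows how sensitive the final number is to the assumed response rates. Get those assumptions wrong and the correction can do more harm than good.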
This election cycle is complicated even further by the coronavirus crisis. Will Biden voters, generally more concerned about contracting or spreading the disease, be less likely to turn out than the disbelieving Trump voters? Or will older voters, who lean toward Trump and are more susceptible to the disease, stay home? Will the sudden increase in awareness of racial discrimination motivate Black and minority voters, generally pro-Biden, to turn out in November, or will they stay home like they did in 2016? Will certain voters tend to take advantage of the mail-in balloting measures that are being adopted?
Anyhow, these are just some thoughts I've had on the subject. The poll numbers look horrible for Trump, but how accurate are they in predicting election results, especially in this election, which is unlike any in living memory? We haven't held an election during a pandemic in about a century.