15 Comments

Chris,

Electoral predictions based on polling are a very cloudy area of "data science". Polls purport to show who will win and by what margin. However, to assess the validity of any given poll, we must understand who's included in the participant pool and in what proportions.

Polling results can be produced to lean "right/GOP/MAGA" or lean "left/DEM/LIBERAL" simply by overweighting or underweighting certain demographic groups.

For instance, the US population is approximately 75/76% Urban and 24/25% Rural. US Urban voters tend to vote Democratic by +10%-15% compared to Rural voters who are a majority GOP conservative.

Therefore, if a pollster wants to bias the poll in favor of the GOP, they can overweight the number of rural participants.
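
To make that arithmetic concrete, here is a minimal sketch in Python. The within-group support levels are made-up numbers chosen only to give a 13-point urban/rural gap; they are not taken from any poll. It shows how the raw, unweighted topline moves when the rural share of the sample goes up.

```python
# Hypothetical Democratic support within each group.
# Illustrative assumption only: a 13-point urban/rural gap, not real poll data.
DEM_SUPPORT = {"urban": 0.55, "rural": 0.42}


def unweighted_topline(rural_share: float) -> float:
    """Democratic share in a raw sample that has the given rural share."""
    urban_share = 1.0 - rural_share
    return urban_share * DEM_SUPPORT["urban"] + rural_share * DEM_SUPPORT["rural"]


# Sample that mirrors a 75/25 urban/rural population.
representative = unweighted_topline(0.25)
# Sample that over-represents rural respondents (60/40).
rural_heavy = unweighted_topline(0.40)

print(f"representative sample: {representative:.1%}")
print(f"rural-heavy sample:    {rural_heavy:.1%}")
```

With these assumed numbers, the same electorate reads about two points less Democratic just because the raw sample has more rural respondents.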

The same is true for other demographics like education level, income or age.

The damage created by biased polling can be significant when the press reports poll results as "facts". The most heinous example is the NYT/Siena poll. The participant pool in the NYT poll is significantly GOP biased. It has been since last year.

Using the Times data, political reporters all across MSM report that Biden is behind Trump.

And Biden is behind Trump among less educated, rural voters. However, those are not the 81 million voters who elected Biden in 2020.

The current presidential race would be better served if reporters would qualify candidate predictions based on the inherent bias in the polling data they reference.

This rule would certainly clarify the pissing match between Simon Rosenberg and Isaac Chotiner.


This is wrong. The NYT polling is not biased. This is a falsehood pushed by the left. Again, we need to deal in facts, not what we WANT to be true.

Please read this: https://www.nytimes.com/article/times-siena-poll-methodology.html


No, Chris, you are wrong. I suggest you scroll down to the bottom of this month's NYT poll to see the demographics in each of the 6 swing states highlighted. You will find an overweighting of rural voters in each state. Nothing wrong with this if the pollster wants to show what a big rural vote might mean in Nov. However, as a neutral poll suggesting how America will vote, it's biased as hell.


So based on Merrill’s comment I went into each of the demographics of the participant groups for the swing states and found the following (urban/rural): PA (65/35); WI (48/52); AZ (75/25); GA (61/39); MI (62/38); and NV (82/18). The only state that really looks out of whack is WI, which is actually 70% urban vs 30% rural. I didn’t look up the distributions in the other states. Regardless, the poll states: “To further ensure that the results reflect the entire voting population, not just those willing to take a poll, we give more weight to respondents from demographic groups underrepresented among survey respondents, like people without a college degree.” So, even if the demographics of the respondents didn’t match the overall population, the poll accounted for this in its results through its weightings.


According to the Census Bureau, 80% of the U.S. population lives in urban areas. The remaining 20% lives in areas classified as rural.


Jim

Thanks for looking into the urban/rural split. If you check census stats you'll see that truly rural population in the US is about 24% so I'm not sure how you determined the Times poll isn't biased to rural voters who are far more conservative/MAGA than urban voters.


That’s on the population as a whole. The breakout varies by state, which is why the pollsters take into account the actual geographic breakout by state. And IF their participant pool differs from that breakout, they adjust their weights accordingly so that the disparity is accounted for. You will never see the actual participant breakout exactly match the state geographic breakout. You cannot look at the pure participant breakout to determine a poll is biased. You must dig much deeper, to the actual weights they use to adjust the results to account for the disparity, which you will not typically see. It’s impossible to look at the published results and determine if the poll is biased, UNLESS they also published their weights.


The rural and urban population of Pennsylvania:

In 2020, rural municipalities had a total of 2.9 million residents or 22 percent of the state's population. Urban municipalities had 10.1 million residents or 78 percent of the state's population.


I'm not sure where your 65/35 urban/rural split came from for PA.

Several sites I checked showed results in the 22-25% rural residents for PA. If 35% of respondents included in the Times data are from rural areas, then the Times survey data is over counting rural voters by ~17%. Since rural voters are more Republican these days, this is one key factor in why the Times' polls Aren't biased


My numbers were from the poll. I was not describing the actual demographics. So the PA participants in the poll broke 65/35. As I’ve said before the poll participants will almost never match the actual geographic distribution. That’s why they use weights to adjust the numbers. You again are misinterpreting the results. You are correct that the rural population was over represented but you are not correct in saying the results are biased because the weights will offset the over representation. It’s basic polling statistical methodology.
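
A minimal sketch of that weighting step in Python, using the PA figures quoted in this thread (65/35 among respondents, 78/22 in the state). Two loud assumptions: the candidate support numbers are hypothetical, and weighting on a single urban/rural variable is a simplification. Nothing here confirms which variables the Times/Siena poll actually weights on.

```python
# Urban/rural split among PA poll respondents, as reported in this thread.
sample_share = {"urban": 0.65, "rural": 0.35}
# Urban/rural split of the PA population (2020), as quoted in this thread.
population_share = {"urban": 0.78, "rural": 0.22}
# Hypothetical candidate support within each group (illustrative only).
support = {"urban": 0.55, "rural": 0.42}

# Post-stratification weight: how much each respondent in a group should
# count so that the weighted sample matches the population split.
weights = {g: population_share[g] / sample_share[g] for g in sample_share}

# Topline straight from the raw respondents (rural over-represented).
raw = sum(sample_share[g] * support[g] for g in sample_share)

# Topline after applying the weights to each group's share of the sample.
weighted = sum(sample_share[g] * weights[g] * support[g] for g in sample_share)

# Topline we would get from a perfectly representative sample.
target = sum(population_share[g] * support[g] for g in population_share)

print(f"weights:  {weights}")
print(f"raw:      {raw:.1%}")
print(f"weighted: {weighted:.1%}")
print(f"target:   {target:.1%}")
```

Rural respondents get a weight below 1 and urban respondents a weight above 1, so the weighted topline lands on the representative figure even though the raw sample is rural-heavy. Whether a real poll is biased then depends on the weights and targets it uses, not on the raw respondent counts.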


I meant why Times' polls are GOP biased.


Chris, I will not just dismiss Merrill's scientific analysis of these polls as wrong just because it's the NYT. Merrill's statistical/scientific analysis makes more sense to me than the NYT analysis of their polls. I guess everyone can pick the analysis he or she wants to believe.


Thanks Dr Abubakar

I think mine is really a simple point. Political poll results are really indicative of the attitudes and opinions of the participant group.

How broadly any given poll's results can be applied to the broad voter population depends on how closely the participant group matches that population. When we see and read Nate Cohn's polls, we must take him at his word that the Times poll intentionally leans GOP.


The basic principle of Statistics (Statistics 101) is that for you to be able to generalize a study (poll) to a population, your sample demographics must be representative of the population you're trying to apply your study (poll) to. This NYT poll and many other polls do not represent the general electorate. These pundits just like using the polls to drive their parochial narratives.


Polls do not purport to tell anyone who will win. They are a snapshot in time of a representative opinion sample. Actual voting and counting is the only thing that tells us who won, and includes a bunch of factors that pollsters have an incredibly hard time measuring. Voting habits change by state laws allowing mail-in and early voting, the number of polling places and wait times, and even the weather in certain places. Especially in very close elections, polling has little way of capturing those factors that are election-day specific.
