A causal Bayesian explanation and method for handling conflicts
Update 8 Sept 2024: An extensively updated version of this article appears here.
A recent article by Daniel Jupp noted an astonishing difference between what most polls on the US Election were reporting and the results of a Rasmussen poll:
In the latest polling, every single major polling company and organisation has Queen Kamala and Trump tied, or even Queen K slightly ahead.
Except for one.
Rasmussen has Trump at 62% and Kamala at 35%.
So all the others say even, and Rasmussen says Trump has a huge lead. A giant 27% lead. So what we are seeing here is an outlier, right? The majority of polls must be right and Rasmussen has got it wildly wrong.
No.
For a start a 27% difference in polling results on the same question is not an outlier. An outlier would be maybe a 5 or at its greatest 10% difference from the rest. The Rasmussen difference is too massive to be dismissed as a mere anomaly.
First of all, we can be much more explicit in answering the question that Daniel poses by modelling the problem as a generic causal Bayesian network. We will assume there are just two candidates, X (Trump) and Y (Harris). For simplicity, we will assume that there are two polls, with X and Y tied in poll 1 (so X is at 50%) and Y slightly ahead in poll 2 (so X is at 49%), plus another, conflicting poll (Rasmussen) reporting X at 62%.
We will also assume that all polls have a sample size of 1000 and are unbiased.
Then using standard Bayesian assumptions**, we can ‘learn’ the true (but unknown) percentage p who favour X (as a probability distribution) when we observe poll results. The more results we observe the more accurate the prediction of p. And we can also predict what the results will be in an (as yet) unobserved poll.
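To make that learning step concrete, here is a minimal Python sketch of the conjugate Beta-Binomial calculation, assuming a uniform Beta(1, 1) prior on p. (The article's model is built as a causal Bayesian network in agena.ai, so this is an illustration of the same underlying idea, not the network itself.)

```python
# Minimal sketch: learning the true percentage p from unbiased polls,
# assuming a uniform Beta(1, 1) prior and Binomial(n, p) sampling.
from scipy import stats

n1, x1 = 1000, 500   # poll 1: X at 50% of 1000
n2, x2 = 1000, 490   # poll 2: X at 49% of 1000

# Beta-Binomial conjugacy: add successes to alpha, failures to beta
a = 1 + x1 + x2
b = 1 + (n1 - x1) + (n2 - x2)
posterior = stats.beta(a, b)

print(f"posterior mean = {posterior.mean():.3f}")   # ~0.495
lo, hi = posterior.interval(0.95)
print(f"95% interval   = ({lo:.3f}, {hi:.3f})")     # ~(0.473, 0.517)
```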
So, if we first observe poll 1 showing X at 50% and poll 2 showing X at 49%, then (as shown in Figure 1) we learn that the true percentage for X is a distribution with mean 49.5% and 95% confidence interval 47.3% to 51.7%. In other words, we would conclude that there is a 95% probability that the true percentage for X is between 47.3% and 51.7%. If the sample sizes were higher (respectively lower), the confidence interval would be narrower (respectively wider).
But note that the model also gives us an explicit answer to the question Daniel posed:
The probability that a different poll would result in at least 62% for X is 0.00004%.
That is a 1 in 2.5 million chance.
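As a rough check on the order of magnitude, the posterior predictive distribution for a fresh unbiased poll of 1,000 under the simple Beta-Binomial sketch above has a directly computable upper tail. (The exact value depends on the priors and bias structure of the full network, so this will not reproduce the 0.00004% figure exactly, but it makes the same point: 62% is far beyond anything sampling error can explain.)

```python
# Posterior predictive tail: chance a new unbiased poll of 1,000 shows
# at least 620 (i.e. 62%) for X, given the Beta(991, 1011) posterior above.
from scipy import stats

predictive = stats.betabinom(1000, 991, 1011)
print(predictive.sf(619))   # P(count >= 620): vanishingly small
```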
So, as Daniel correctly implied in his conclusion, that is not a feasible outlier.
It means there would have to be a causal explanation for such a result.
Daniel suggests that the contradictory poll (Rasmussen) may have posed a different question than the others. While that is something that could be checked, he also raises the possibility of systemic bias as another explanation. If the Rasmussen poll, for example, oversamples Republicans (i.e. those more likely to vote for X), then obviously it will record higher percentages for X than the other polls. Similarly, if there is systemic oversampling of Democrats (i.e. those more likely to vote for Y) in the other polls, then obviously they will record lower percentages for X. Because the model explicitly incorporates the possibility of such biases, we can revise our conclusions by removing our assumption of no bias.
So, if we know nothing about the biases in the polls, then observing 62% for the contradictory poll results in the revised probabilities shown in Figure 2.
Note that the Bayesian inference enables us to conclude that both sets of polls are biased. There is a 91% probability that the contradictory poll is biased in favour of X and a 97% probability that polls 1 and 2 are biased in favour of Y. The probability that polls 1 and 2 are biased is higher than for the contradictory poll because we are assuming these two polls suffer from the same common bias. The overall impact is that the revised distribution for the true percentage has mean 56.2% with 95% confidence interval 51% to 61.8%. So, even though we now have results from 3 rather than 2 samples, the confidence interval is much wider because of the conflicting results.
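For readers who want to experiment, here is a hypothetical reconstruction of the bias-augmented model in plain Python. The bias structure, a 50/50 prior on each poll group being biased with a uniform offset d when it is, is my own assumption for illustration; the agena.ai network may differ in detail, so the exact probabilities will not match Figure 2.

```python
# Hypothetical sketch of the bias-augmented model: each poll group is either
# unbiased, reporting Binomial(1000, p), or biased, reporting
# Binomial(1000, p + d) for an unknown offset d. Inference is by grid search.
import numpy as np
from scipy import stats

n = 1000
p_grid = np.linspace(0.01, 0.99, 197)    # grid over the true percentage p
d_grid = np.linspace(-0.15, 0.15, 61)    # grid over possible bias offsets

def group_likelihood(counts, p, biased):
    """Likelihood of a group's counts given p, averaging over d if biased."""
    if not biased:
        return np.prod([stats.binom.pmf(c, n, p) for c in counts])
    per_d = [np.prod([stats.binom.pmf(c, n, np.clip(p + d, 0, 1)) for c in counts])
             for d in d_grid]
    return np.mean(per_d)                # uniform prior over the offset d

polls_12 = [500, 490]    # polls 1 and 2 (assumed to share a common bias)
poll_c   = [620]         # the conflicting poll

# Joint posterior over (polls 1&2 biased?, conflicting poll biased?, p),
# with a 0.5 prior probability of bias for each group.
post = np.zeros((2, 2, len(p_grid)))
for b12 in (0, 1):
    for bc in (0, 1):
        for i, p in enumerate(p_grid):
            post[b12, bc, i] = (0.25 * group_likelihood(polls_12, p, b12)
                                     * group_likelihood(poll_c, p, bc))
post /= post.sum()

print("P(polls 1&2 biased)        =", post[1].sum())
print("P(conflicting poll biased) =", post[:, 1].sum())
print("posterior mean of p        =", (post.sum(axis=(0, 1)) * p_grid).sum())
```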
But what if we know that the conflicting poll is unbiased and know nothing about any bias in polls 1 and 2? Then, as shown in Figure 3, we learn that there is a 99.98% probability that the other polls are biased in favour of Y, and the revised distribution for the true percentage has mean 60.5% with 95% confidence interval 57.5% to 63.5%.
As Daniel notes, there is indeed evidence that, unlike the Rasmussen poll, the other polls have systematically oversampled Democrats.
This model can also be used to take account of any known biases. For example, if it was known that polls 1 and 2 systematically oversampled Democrats, but only by a small amount, say 2%, then we get the results shown in Figure 4. Note that it is now certain there is bias, but the revised probability distribution is no longer so dominated by the (unbiased) contradictory poll result. The revised distribution for the true percentage has mean 55.2% with 95% confidence interval 53.3% to 57.1%.
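In the sketch above, these last two scenarios correspond to fixing the bias variables rather than leaving them uncertain: the Figure 3 case sets the conflicting poll's bias indicator to 0 (unbiased), and the Figure 4 case fixes polls 1 and 2 as biased with the offset pinned at roughly d = -0.02 instead of left uniform.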
Here is a video demonstration of the above Bayesian network argument (software by agena.ai):
This article by John Ward is also interesting and relevant.
4 Sept 2024 Update: Interestingly, it does appear that the polls showing there was little to separate Harris and Trump were indeed likely to have been the result of oversampling Democrat supporters.
** whereby the number of people in a sample supporting X has a Binomial(n, p) distribution, where n is the sample size and p is the true (but unknown) percentage for X
Independent variables and an assumption of normal distribution are needed to make your statements on confidence intervals. Try as hard as they can, pollsters fail to find samples of people that are not correlated in their opinions. Interesting article though, showing how a single outlier can completely skew any statistical conclusion.
Very interesting and well explained. Once you point out the implications they seem intuitively reasonable.
Thanks Norman, that’s extremely interesting and obviously a lot more detailed than my effort. The Rasmussen polling seems consistently different to the rest, from what I’ve since checked. Rasmussen supplied the figures I quote (I checked that), but I have seen talk that he was referencing the bookies (which I haven’t confirmed either way). It seems very odd to me that a pollster would put bookies’ figures alongside actual poll figures, though. Your analysis techniques would also apply, of course, to any Rasmussen/other polls disparities, of which there have been multiple examples.