I say, I say, I say, when is a poll not a poll? When it’s a multilevel regression with post-stratification (MRP) model!
I won’t be joining the comedy circuit with that one. But this truly awful gag has a point. MRPs have become a fixture of modern election campaigns, alongside conventional opinion polling. They do use polling data – a lot of it, as it happens – but, importantly, they are not the same thing as opinion polls.
Election obsessives will remember where they were when the first ever MRP landed for a UK election. For everyone else, here’s the story.
A week before the 2017 general election, YouGov broke with just about everyone else by predicting a hung parliament. Its MRP method was new and hardly anyone knew what to make of it. The newspaper that had commissioned it led with “shock projection” and “controversial estimate”.
YouGov updated the model as election day approached, reiterating the hung parliament prediction, but few commentators endorsed it – until it turned out to be remarkably accurate. YouGov’s central estimate was that there would be 302 Conservative and 269 Labour MPs. The electorate returned 318 and 261, respectively.
YouGov’s MRP stood out in 2017 because every conventional pre-election poll pointed to a clear majority for Theresa May’s Conservatives. In 2019, YouGov’s MRP correctly predicted a clear Conservative majority, although it underestimated the scale of that majority.
In 2024, amid speculation that the Conservatives could be reduced to fewer than 100 MPs, the appeal of MRPs is obvious. The seat-by-seat breakdown they provide also enables media outlets to identify specific senior politicians at risk of losing their seats. Standard opinion polling cannot match this granular focus on likely constituency outcomes.
What makes an MRP different?
This central focus on seats rather than votes is the first way that MRPs differ from opinion polling.
Conventional polls seek to estimate the national vote share each party would secure if there were a general election at that moment. By contrast, the key purpose of MRP models is to estimate the likely winner in each of Great Britain’s 632 parliamentary constituencies (Northern Ireland is typically excluded) and to use that to propose the overall outcome of the election.
Because of their complexity, MRPs appear less frequently than opinion polls. In the first weeks of the 2024 election campaign, four MRP models were released: one produced jointly by Find Out Now and Electoral Calculus, and one each by More in Common, YouGov and Survation.
The common conclusion across all four models is that Labour is headed for a comfortable majority and the Conservatives are set to lose hundreds of seats. However, there is enormous variation in the estimates produced by the four models.
Labour’s haul is put at between 382 and 487 MPs. Estimated Conservative representation ranges from 66 to 180. For the Liberal Democrats, the models point to a parliamentary party of 30 to 59.
How is an MRP produced?
The second way that MRPs differ from opinion polls is in how they use polling data. MRPs are really a byproduct of online opinion polling. The internet has made polling cheaper, enabling agencies to poll many more people and at regular intervals.
While each standard opinion poll typically draws on a sample of about 1,000 people, pollsters can easily collect data from tens of thousands of people in a few days.
By aggregating all these survey responses, pollsters can build models that predict how different groups of people will vote. If you have enough responses from middle-aged, university-educated, home-owning males, you can estimate the probability of anyone with those characteristics voting for each party.
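This analysis covers every distinct social group identified. The method, called multilevel regression, estimates all the groups together, so that groups with only a few respondents can borrow information from similar groups. It forms the first stage of the modelling – the MR part of the MRP.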
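To give a feel for this first stage, here is a minimal sketch in Python. It is not a full multilevel regression, which would include many predictors and is usually fitted with specialist statistical software. It only illustrates one idea such models rely on, often called partial pooling: the estimate for a small group is pulled towards the overall average, while a large group's estimate barely moves. The function name, the survey counts and the `prior_strength` setting are all hypothetical, chosen purely for illustration.

```python
def partial_pool(counts, prior_strength=20.0):
    """Shrink each group's raw support rate towards the overall rate.

    counts maps a group name to (supporters, respondents).
    prior_strength is a hypothetical tuning value: it acts like that many
    extra respondents who vote in line with the overall average.
    """
    total_yes = sum(yes for yes, n in counts.values())
    total_n = sum(n for yes, n in counts.values())
    overall = total_yes / total_n
    return {
        group: (yes + prior_strength * overall) / (n + prior_strength)
        for group, (yes, n) in counts.items()
    }


# Made-up survey counts: (respondents backing a party, respondents in group).
survey = {"large group": (450, 1000), "small group": (4, 5)}
pooled = partial_pool(survey)

for group, (yes, n) in survey.items():
    print(f"{group}: raw {yes / n:.2f}, pooled {pooled[group]:.2f}")
```

With these invented numbers, the large group's estimate stays close to its raw 45%, while the five-person group is pulled well below its raw 80%, because five responses are too few to trust on their own.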
The second stage uses this first set of calculations to estimate the outcome in each constituency. Population data from the census and other sources is used to calculate the exact mix of the different social groups in each constituency.
The models of how each group will vote are now placed in their geographical context. So, if a constituency has a particularly high concentration of middle-aged, university-educated, home-owning males, we have information that can help us understand how that constituency might be expected to vote.
This stage of the modelling, known as post-stratification, is the P part of the MRP. It also typically adds other variables, such as past general and local election results, to estimate turnout and to adjust for the particular characteristics of each constituency.
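The arithmetic behind post-stratification can be sketched in a few lines of Python. The example below assumes we already have an estimated level of support for each party within each social group, and a head count of each group in one constituency. It then weights the group estimates by the constituency's mix. The group names, party names and every number are invented, and the sketch leaves out the turnout and past-result adjustments that real models add.

```python
def poststratify(group_support, constituency_mix):
    """Weight group-level support by how common each group is locally.

    group_support: group -> {party: estimated share of that group's vote}
    constituency_mix: group -> number of residents in that group
    """
    total = sum(constituency_mix.values())
    shares = {}
    for group, count in constituency_mix.items():
        weight = count / total
        for party, support in group_support[group].items():
            shares[party] = shares.get(party, 0.0) + weight * support
    return shares


# Hypothetical first-stage estimates for two social groups.
group_support = {
    "graduate home-owners": {"Party A": 0.50, "Party B": 0.30, "Other": 0.20},
    "non-graduate renters": {"Party A": 0.25, "Party B": 0.55, "Other": 0.20},
}

# Hypothetical census-style counts for a single constituency.
constituency_mix = {"graduate home-owners": 30000, "non-graduate renters": 10000}

shares = poststratify(group_support, constituency_mix)
winner = max(shares, key=shares.get)
print(shares, winner)
```

Because three-quarters of this imaginary constituency are graduate home-owners, their preferences dominate the projected result. Swap the two head counts and the projected winner changes, even though the group-level estimates stay the same.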
The limitations of MRPs
As statistical models, MRPs have limitations and need careful interpretation. They are based on probabilities and therefore produce a range of possible outcomes. While a central estimate of each party’s representation is reported, there is a margin of error around each of these.
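One way to see why a seat projection comes with a range is to simulate it. The Python sketch below assumes a model has given a party a probability of winning each of 100 seats, then replays the election many times to see how the seat total varies. The probabilities are invented, and the sketch treats seats as independent of one another. That is a simplification, since a national polling error would affect many seats at once, so this approach is likely to understate the true uncertainty.

```python
import random


def simulate_seat_totals(win_probs, n_sims=10000, seed=0):
    """Replay the election n_sims times and return the sorted seat totals."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sorted(
        sum(rng.random() < p for p in win_probs) for _ in range(n_sims)
    )


def summarise(totals):
    """Return (low, central, high): a 95% range and the mean seat total."""
    n = len(totals)
    low = totals[int(0.025 * n)]
    high = totals[int(0.975 * n) - 1]
    return low, sum(totals) / n, high


# Hypothetical win probabilities: 50 safe seats, 30 toss-ups, 20 long shots.
win_probs = [0.9] * 50 + [0.5] * 30 + [0.1] * 20

low, central, high = summarise(simulate_seat_totals(win_probs))
print(f"central estimate {central:.0f} seats, range {low} to {high}")
```

The headline figure is the central estimate, but the range either side of it is just as much a part of what the model is saying.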
There are also factors MRPs cannot take into account. Some constituency results will be influenced by locally specific issues that statistical models can’t capture. For example, adjustments cannot be made for who the candidates are in each constituency and whether any have a profile that could enhance, or detract from, their party’s chance of winning. Think Nigel Farage standing for Reform UK in Clacton.
And competing MRPs produce contrasting outcomes because each model uses specific assumptions and data. Different inputs produce different outputs. This election has seen more MRPs produced than ever before, and from various organisations. In the early hours of July 5, we’ll know which one came closest to getting the result correct.
Stuart Wilks-Heeg does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.
This article was originally published on The Conversation.