All these polls saying this side or that side will win in any political situation… I have never been asked by these poll-makers. I don’t know anyone who has been asked.
Do they ask the same people over and over again? If so, are the results skewed by being produced only from people who volunteer for polls? Rather like the antismokers only asking the opinions of nonsmokers, or the antibooze lot only asking the opinions of teetotallers?
If the polls are based on a volunteer pool then they are not truly random. They require a character trait – being the sort of person who answers polls – which might be linked to other character traits, including one which could be the subject of a particular poll.
For example, if the trait ‘volunteers for polls’ is linked to the trait ‘hates cats’ then a poll asking ‘should cat ownership be banned’ would get a resounding ‘yes’. But a lot of people like cats. I don’t have one but there’s one who’s been trying to move in for the past few weeks. It practically lives in the garden and mews at the back door. I have tried explaining that it doesn’t live here and I have no cat food but so far, to no avail. It looks healthy so it’s getting fed somewhere.
Back to polls. The big-news one at the moment is the Scottish Independence Vote. It keeps coming out 50/50, and any swing of one or two percent is hailed as victory by the appropriate side.
The result we’ve been seeing is an average of four polls. Three are each linked to a newspaper, the fourth to the ‘No’ campaign, and they currently look like this:
Panelbase/Sunday Times – Yes 49.4% No 50.6%
Opinium/Observer – Yes 47% No 53%
ICM/Sunday Telegraph – Yes 54% No 46%
Survation/Better Together – Yes 46% No 54%
If you simply average the percentages you get Yes 49.1% and No 50.9%. The overall poll is claiming 49% Yes and 51% No, so it looks like that’s what they’ve been doing.
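As a sanity check, here’s that straight average reproduced in a few lines of Python, using the four published percentages above. This is exactly the calculation I’m about to pick apart:

```python
# Straight (unweighted) average of the four published percentages.
# Note that this ignores how many people each poll actually asked.
yes = [49.4, 47, 54, 46]
no = [50.6, 53, 46, 54]

print(f"{sum(yes) / len(yes):.1f}% Yes")  # 49.1% Yes
print(f"{sum(no) / len(no):.1f}% No")     # 50.9% No
```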
The man who produces the overall prediction said this –
‘We are dependent on a pot of people which is defined, but we don’t know how big it is and in my view it won’t be big enough.’
Size matters, but not overall size. The size of each of those four polls matters a lot.
Let’s take an extreme example. Poll A reports 80% Yes and 20% No. Poll B reports 20% Yes and 80% No.
Averaging the percentages gives you a 50/50 split vote.
However, suppose Poll A asked 500 people and Poll B asked only 100. In that case, Poll A reported 400 ‘Yes’ votes and 100 ‘No’ votes, while Poll B reported 20 ‘Yes’ votes and 80 ‘No’ votes.
Which means that the real pool of data is 420 ‘Yes’ and 180 ‘No’ – 600 responses in all. The real percentages are 70% ‘Yes’ and 30% ‘No’.
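Here’s the same example as a short Python sketch, with both calculations side by side. The sample sizes (500 and 100) are the made-up figures from the example above, not anything a real pollster has published:

```python
# Two made-up polls: A asked 500 people (80% Yes), B asked 100 (20% Yes).
polls = [
    {"n": 500, "yes_pct": 80},  # Poll A
    {"n": 100, "yes_pct": 20},  # Poll B
]

# Naive method: average the percentages, ignoring sample size.
naive = sum(p["yes_pct"] for p in polls) / len(polls)

# Proper method: pool the actual responses, then take the percentage.
yes_count = sum(p["n"] * p["yes_pct"] / 100 for p in polls)  # 420
total = sum(p["n"] for p in polls)                           # 600
pooled = 100 * yes_count / total

print(f"Naive average: {naive:.0f}% Yes")   # 50% Yes
print(f"Pooled result: {pooled:.0f}% Yes")  # 70% Yes
```

Twenty percentage points of difference, from nothing more than ignoring the sample sizes.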
Averaging percentages is not going to work unless each poll asks exactly the same number of people every time. If you don’t know how many people are in each of the four datasets then averaging the percentage results is meaningless. If Poll B’s 100 respondents had all already been surveyed by Poll A then it’s even worse – Poll B adds no new information at all! Any overlap between polls (and people who answer polls might well answer more than one) further complicates the calculation.
This is not complex maths. It’s well within the ability required for any number-crunching job, and a high-ranking pollster should certainly know it.
To work out the average of four polls you need to know a) how many people each poll asked, and b) how many people answered more than one poll – and what their answers were. That last bit might not be so easy if the polls are to be kept confidential, but even knowing how much overlap there is between the polls would let you work out a plus-or-minus for the overall result.
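To show what that plus-or-minus might look like, here’s a sketch with invented numbers. It assumes the simplest possible case: two polls, a known count of people who answered both, and each of those people giving the same answer both times. Every figure in it is hypothetical:

```python
def pooled_yes_range(n_a, yes_a, n_b, yes_b, overlap):
    """Best- and worst-case pooled Yes %, given 'overlap' people who
    answered both polls (assumed to answer consistently both times)."""
    distinct = n_a + n_b - overlap      # de-duplicated pool size
    raw_yes = yes_a + yes_b             # Yes count, duplicates included
    low = 100 * (raw_yes - overlap) / distinct   # all overlappers said Yes
    high = 100 * raw_yes / distinct              # all overlappers said No
    return low, high

# Invented figures: Poll A 500 asked / 400 Yes, Poll B 100 asked / 20 Yes,
# and 50 people who happened to answer both.
low, high = pooled_yes_range(500, 400, 100, 20, 50)
print(f"Pooled Yes is somewhere between {low:.0f}% and {high:.0f}%")
# Pooled Yes is somewhere between 67% and 76%
```

Even with only 50 shared respondents out of 600 responses, the honest answer is a nine-point range, not one tidy percentage.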
It seems that the 50/50 we have been hearing about might be way, way off target. The result could be a slam-dunk one way or the other.
The final result is anybody’s guess. Even the pollsters now admit it.