The poll on pollsters

Why they can go terribly wrong, and why that can hurt democracy

By Ashmita Gupta
  • Published 19.05.19, 12:13 AM
  • Updated 19.05.19, 12:13 AM
A woman casts her vote inside a polling station in Allahabad, Uttar Pradesh on Sunday, May 12, 2019. (AP)

As elections engulf our country, much time on television is spent analysing various opinion polls. Different TV channels compete with each other over the predictions they make. Until the election results are declared, these predictions dominate discussions from drawing rooms in Lutyens’ Delhi to panchayats in the remotest villages. Very often people forget that these are mere “predictions” and not the actual results. If the predictions are right, the channels take credit; if they are wrong, people question the validity of the methodology. Several questions need to be understood: What is the mathematics behind analysing and predicting poll trends? How is the data collected? Does the collected data represent the true preferences of the population?

The first thing to ask is whether there are scientific models that can give us accurate predictions. A common misconception is that interviewing a small number of people can never truly reflect the preferences of over 550 million voters. In fact, it is a standard statistical result that with uniform random sampling, a sample of about 1,000 people is enough to estimate the population’s preferences to within about three percentage points at a 95 per cent confidence level (or about four points at 99 per cent), however large the population is.
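The sample-size claim can be checked with the textbook margin-of-error formula for a proportion. The sketch below is illustrative only: the function names are hypothetical, and it assumes a simple random sample and the worst case of an evenly split electorate (p = 0.5).

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Half-width of the confidence interval for a proportion.

    Assumes a simple random sample of size n. z = 1.96 corresponds to
    95 per cent confidence; p = 0.5 is the worst case (largest error).
    """
    return z * math.sqrt(p * (1 - p) / n)

def required_sample_size(margin, z=1.96, p=0.5):
    """Smallest sample size whose margin of error is at most `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Margin of error, in percentage points, for a few sample sizes.
for n in (100, 1000, 10000):
    print(n, round(100 * margin_of_error(n), 1))

# Sample needed for a 3-point margin at 95 per cent confidence.
print(required_sample_size(0.03))
```

Note that the size of the population does not appear anywhere in the formula, which is why a sample of about 1,000 can work for an electorate of hundreds of millions, provided the sample is truly random.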

However, because seats in India are decided constituency by constituency, the most accurate predictions would require a sample of about 1,000 voters in each of the 543 Lok Sabha constituencies. It is practically impossible to collect such a large sample. The polls work around this problem by dividing the population into regional and demographic subgroups and drawing a sample from each of them.

Even so, given the considerable diversity in India, obtaining a sufficient number of uniform random samples involves enormous time and cost. For example, if 3 per cent of the voters in a region live in difficult terrain or Maoist areas, a sample of 1,000 people should include, on average, 30 people drawn at random from these areas. However, surveyors do not have the resources to do that, and these people are often excluded; this introduces distortion and bias. Similarly, other socio-economic biases creep in depending on economic class, caste and gender. For example, some polls are conducted by multinational companies that do market research. They tend to use the same technique to gauge the preference between the Congress and the BJP as they do to gauge the preference between Coke and Pepsi. Such polls can be heavily biased, because they are likely to reflect the opinions of people who belong to a particular socio-economic group and live in big cities.
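A rough sketch of how leaving out a hard-to-reach group distorts a poll is given below. The support figures (10 per cent in the excluded areas, 40 per cent elsewhere) are made up purely for illustration, and the function names are hypothetical; only the 3 per cent share and the sample of 1,000 come from the example above.

```python
def expected_respondents(sample_size, group_share):
    """Average number of respondents a random sample draws from a group."""
    return sample_size * group_share

def true_share(share_excluded, support_excluded, support_covered):
    """Party's actual vote share across both groups of voters."""
    return (share_excluded * support_excluded
            + (1 - share_excluded) * support_covered)

def coverage_bias(share_excluded, support_excluded, support_covered):
    """Error of a poll that never reaches the excluded group.

    Sampling noise is ignored: this is the error that remains no matter
    how many people are interviewed in the covered areas.
    """
    actual = true_share(share_excluded, support_excluded, support_covered)
    return support_covered - actual

# 3 per cent of voters live in areas the surveyors skip.
print(expected_respondents(1000, 0.03))

# Hypothetical support levels: 10% in excluded areas, 40% elsewhere.
bias = coverage_bias(0.03, 0.10, 0.40)
print(round(100 * bias, 2), "percentage points")
```

Unlike random sampling error, this kind of bias does not shrink as the sample grows; it depends only on how large the excluded group is and how differently it votes.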

In an attempt to bridge the gap between the requirements of the scientific model and the realities of data collection, different polls try to optimise time, cost and accuracy by adopting various techniques. For example, surveyors make educated guesses about voters based on historical trends and on an understanding of local politics obtained from experts. They may assume, for instance, that the RJD will get the majority of Yadav votes in Bihar, that the BSP will get the Jatav votes in UP, and that the BJP will not get a substantial proportion of the Muslim votes.

However, people can have multiple identities and may choose to vote according to one identity rather than another. Even within a caste, class differences, such as differences in wealth, can divide voters. Hence, these guesses sometimes go wrong, adding significant error to the predictions. Also, obtaining honest answers from people is an art in itself and involves many techniques, including psychology.

Even if we assume that data collection is done with the minimum possible error, the polls can only predict the vote share in each region or group, not the number of seats. In 2014, the BJP got just 31 per cent of the total vote share but still formed the government with an overwhelming majority. On the other hand, the CPM in West Bengal had over 30 per cent vote share but won only two out of the 42 Lok Sabha seats. There is no scientific formula to convert the vote share into the number of seats. Hence, seat predictions are informed guesses made with limited knowledge, and they are prone to errors.
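The gap between vote share and seats can be illustrated with a toy first-past-the-post calculation. Everything in the sketch below is hypothetical: three parties A, B and C, ten equal-sized constituencies, and invented vote shares chosen so that party A gets 31 per cent overall in both scenarios.

```python
def seats_won(constituencies):
    """Count seats per party; the party with the most votes wins each seat."""
    seats = {}
    for shares in constituencies:
        winner = max(shares, key=shares.get)
        seats[winner] = seats.get(winner, 0) + 1
    return seats

def overall_share(constituencies, party):
    """Average share of a party, assuming equal-sized constituencies."""
    return sum(c[party] for c in constituencies) / len(constituencies)

# Scenario 1: party A's 31 per cent is spread evenly, so it is
# narrowly beaten in every constituency.
spread = [{"A": 31, "B": 35, "C": 34}] * 10

# Scenario 2: the same 31 per cent is concentrated in six constituencies.
concentrated = ([{"A": 45, "B": 30, "C": 25}] * 6
                + [{"A": 10, "B": 50, "C": 40}] * 4)

for name, scenario in (("spread", spread), ("concentrated", concentrated)):
    print(name, overall_share(scenario, "A"), seats_won(scenario))
```

With an identical overall vote share, party A wins no seats in the first scenario and a majority in the second, which is why a vote-share estimate alone cannot fix the seat count.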

Assuming that opinion polls are honest and correct, they should be able to accurately predict the vote percentage in a region on the date of the data collection. However, historically, as many as 30 per cent of voters in India have been “swing” voters. These are voters who do not have any fixed allegiance and can swing depending on circumstances. Sometimes they make up their minds just before election day or on the day of the election itself. Hence, opinion polls taken earlier cannot capture this last-minute swing.

Predictions based on opinion polls are a combination of science and art. The science part tells pollsters what the best possible sampling methodology is, and the art part enables them to make educated guesses based on historical data.

However, these opinion polls can themselves form an opinion which can affect the swing voters. There is a feedback effect of these opinion polls. People tend to vote for the “winners”. So if a certain party is declared the winner by these polls, people can be swayed into voting for that party. And if there is collusion between the media and some political party, these polls could be politically motivated. In these situations, the polls could actually “form” opinions rather than reflect what people’s opinions are.

Gupta is a Calcutta-based independent researcher and academic