Harnessing the power of big data

Data is everywhere, yet there is a shortage of information when it comes to determining effective public policy. The National Sample Survey holds a glimmer of hope for policymakers.

By Sabir Ahamed
  • Published 14.08.18
Photo by Quinn Dombrowski / Flickr

India observes Statistics Day on June 29 to mark the birth anniversary of P.C. Mahalanobis, the architect of the most credible official statistical system in post-Independence India. The National Sample Survey Directorate was first set up under the ministry of finance in 1950; it aimed to collect data in the areas that are vital for developmental planning.

There has been an incredible boom in big data - data is everywhere - yet there is a shortage of relevant and timely information when it comes to determining effective public policy. The National Sample Survey holds a glimmer of hope for policymakers and researchers. Studies show that reliable data form the bedrock of better governance; they help citizens measure how much of the political rhetoric is translated into reality. For instance, an ongoing survey on sanitation will definitely shed light on the progress of the Swachh Bharat Mission.

The statistical findings of official bodies also make the government of the day more accountable. One NSSO survey indicates that digital literacy is inextricably linked to educational attainment. The focus of the Digital India initiative should thus be connected with an investment in the education sector, aimed at improving the quality of education.

The NSS has been providing critical data on education, employment, health, poverty and so on since its inception. More important, NSSO's stratified survey design is very effective in capturing diversities among various groups and spatial locations. For example, the recent education round-up shows how some social groups are still lagging behind more than 70 years after Independence, calling for the immediate attention of policymakers.

With advances in data-analysis software, researchers can use datasets covering more than one lakh households or six lakh individuals to produce estimates by social group, economic status and region. An analysis can also be done, with some caution, at the district level.
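To make the idea of group-wise estimates concrete, here is a minimal sketch in Python using pandas. It assumes each surveyed person carries a sampling weight, and it computes a weighted share within each group. The column names (`social_group`, `literate`, `weight`), the helper `weighted_share` and all the values are invented for illustration; they are not the layout of any actual NSS file.

```python
import pandas as pd

# Hypothetical person-level records; names and values are invented,
# not taken from any NSS round.
persons = pd.DataFrame({
    "social_group": ["A", "A", "B", "B"],
    "literate": [1, 0, 1, 1],                # 1 = yes, 0 = no
    "weight": [100.0, 300.0, 200.0, 200.0],  # assumed sampling weights
})

def weighted_share(df, value_col, weight_col, by):
    """Weighted mean of value_col within each group of `by`."""
    weighted = df[value_col] * df[weight_col]
    numerator = weighted.groupby(df[by]).sum()
    denominator = df[weight_col].groupby(df[by]).sum()
    return numerator / denominator

rates = weighted_share(persons, "literate", "weight", "social_group")
print(rates)
```

The same function could be grouped by a district column instead, though with few sampled households per district such figures would need the caution mentioned above.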

Untapped potential

Given the importance of the data, the NSS could be far better utilized than it is at present. Very few researchers can actually exploit the entire scope of the data. Besides the government, NSS data is largely used only by students of economics or statistics, and that too in a handful of specialized academic institutes.

What are the barriers in the way of accessing NSS data? The report of every round of survey is published along with a summary of the key findings. Yet using the raw data remains a challenge owing to multiple inherent bottlenecks. First, most social science courses have failed to educate students about the national statistical systems, including NSSO's surveys. Students are taught to master sophisticated quantitative techniques such as time-series econometrics and panel data analysis. But rarely is anything taught about data management, handling techniques and so on.

Then, the documentation of NSS data poses challenges for users owing to the use of old-fashioned guidelines that are exhaustive but not user-friendly. As NSS follows a highly technical sampling procedure, researchers without an understanding of statistics and economics find it difficult to comprehend the data. In addition, dealing with the data requires a high level of digital literacy, which many Indian social scientists lack. Among the problems plaguing the handling of NSS data are extracting the data, combining different data sets (appending and merging) and reshaping the data (changing the units of the data).
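For readers unfamiliar with these operations, the following is a rough sketch of appending, merging and reshaping in Python with pandas. The tables, column names (`hh_id`, `person_no`, `state`, `age`) and values are all hypothetical and do not reflect the structure of real NSS extracts. Reshaping is read here as moving from one row per person to one row per household, which is one interpretation of "changing the units".

```python
import pandas as pd

# Two hypothetical household-level extracts with the same columns.
extract_a = pd.DataFrame({"hh_id": [1, 2], "state": ["WB", "KL"]})
extract_b = pd.DataFrame({"hh_id": [3], "state": ["UP"]})

# Appending: stack the rows of one extract under the other.
households = pd.concat([extract_a, extract_b], ignore_index=True)

# A hypothetical person-level file keyed by household id.
persons = pd.DataFrame({
    "hh_id": [1, 1, 2, 3],
    "person_no": [1, 2, 1, 1],
    "age": [34, 8, 51, 27],
})

# Merging: attach household attributes to every person.
# validate= raises an error if a household id is duplicated.
merged = persons.merge(
    households, on="hh_id", how="left", validate="many_to_one"
)

# Reshaping: one row per household, one age column per member number.
wide = merged.pivot(index="hh_id", columns="person_no", values="age")
print(wide)
```

Households with fewer members simply show missing values in the extra columns, which is worth checking before any averages are computed.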

In order to make NSS data more accessible, the government should consider increasing the allocation of funds for training young scholars and engaging students from non-economics and non-statistics backgrounds. Finally, special arrangements can be made with credible news organizations to promote data journalism using NSS data.