Dig for gold

Mousum Dutta is a miner who digs out precious resources. But his form of mining doesn’t involve drilling machines or gigantic trucks, just smart software, immense patience and some intellect. His work is all about crunching numbers and finding patterns: mining for data in an ocean of information.

“We try to trace patterns from a database, which help companies take business decisions,” says Dutta, 34, a senior software specialist at SAS, a leading data analysis firm in Pune. “For instance, retail majors ask us to analyse their billing database to gain an insight into consumers’ purchasing behaviour — such as how often they shop or what they buy,” he explains. The information helps the retailer choose stocks or organise the store to boost sales.

The retail business is just one of the sectors he looks at. Industries such as banking, insurance and even movie production are competing on the back of predictive models of consumer behaviour built through mined data. Big data is big business.

“It’s a virtual goldmine, the biggest industry that people have started talking about,” says C.A. Murthy of the Indian Statistical Institute (ISI), Calcutta.

As a seasoned data miner, he’s been involved in several mammoth central government projects such as the decennial census and the Five Year Plans, but storing the data collected had been a problem until recently. Now supercomputers and connected networks have made storage easier. “Yesterday’s storage problem is today’s strategic asset,” he says.

Rajeeva Karandikar, director of the Chennai Mathematical Institute and a top data scientist, stresses that data mining is the “new name for age-old” analytics or statistical analysis. “You need basic mathematical skills and should be able to handle software programs to get into this field.”

People with experience of statistical modelling have an advantage, but they are not the only ones who can delve into the data mine. “A solid background in soft computing or a penchant for working with huge data sets is an advantage,” says Bhamidipati Narayan, a data miner working with Yahoo Labs, Bangalore. He holds a PhD from ISI, Calcutta, and analyses the consumer behaviour of those who surf the Internet using Yahoo. “By mining terabytes of data culled out of zillion ‘search’ terms, we try to track probable consumers of a particular product so that the company can reach out to them with pinpoint precision,” says Narayan. In other words, the approach maximises the impact of advertisements, pushes up sales figures and spares viewers or readers from being bombarded with irrelevant ads.

Thanks to the Internet and new mobile devices, the world has become a swelling ocean of data. Social media networks are also throwing up personal information that is up for grabs. “Cleaning up raw data — like looking for a needle in a digital haystack — is not an easy task,” says Narayan.

Indeed, this part of the job is tedious. According to Dutta, basic mining (extraction, transformation and loading or ETL) is 70 per cent of the task. “It’s like harvesting ore from a mine. A good software programmer with a graduate degree in engineering, technology or computer applications can do this.”

But finding patterns (akin to cutting a raw diamond) needs a higher degree in statistics, mathematics, physics, economics, computer science or bioinformatics. The final task, which can be likened to displaying the diamond in a showroom, is to present the findings drawn from bits and pieces of information. This is done by user interface programmers with impeccable communication skills.

Even though data mining is finding diverse applications, ranging from the prevention of disease to gauging gambler satisfaction in casinos, there are not enough qualified people. “There’s a huge dearth of big data skills in the country. A report by [consultancy firm] McKinsey suggests that with organisations increasingly depending on data crunching, the demand will shoot up,” says Gautam Munshi, who’s been running an analytics training institute in Bangalore since 2007.

Munshi, whose classes have been attracting more and more vice-presidents and product managers of top companies, recently teamed up with specialist analyst Pavan Bhat to set up a research and analytical firm called Redwood Associates. Among other things, the firm helps companies hire recruits who are “best-fit” in an organisation as consultants. “We are using data mining techniques to enable employers pick and choose the best of the lot.”

According to Dutta, many youngsters are launching such startups. “It needs a little investment (around Rs 10-15 lakh) and a few good professionals. I have seen hundreds of such firms grow in India in the last five years. Most of these clean up basic data sets (ETL phase) for foreign firms,” he says. Karandikar points out that the next outsourcing boom is happening in this field. “Top Indian IT firms are now in a rush to recruit data analysts.”

Data mining is still in its infancy, but it’s going to change the way we do business, just as IT transformed the face of global industry a decade ago.

So are you ready to dig for gold?

Where the jobs are


  • Applied to identify potential customers of a product or service, the size of the market and its profitability


  • Designed to help banks in their operational activities like customer acquisition, operations and minimising default risks

Human resource management

  • To predict the effectiveness of hiring, the performance of employees and attrition decisions

Manufacturing industry

  • For continuous process improvement, quick detection of defects and identifying customer trends early

Pharmaceutical industry

  • Measuring the effectiveness of strategies, picking up customer trends immediately and cutting the downtime of clinical trials