
Data is the new oil; analytics, the new refinery; and a data breach, the new oil spill!

We millennials believe that social media platforms and messaging services are essential to our daily lives. These platforms process many terabytes of data every second. Data sets of this size are termed Big Data.

Just as the countries with the largest oil reserves were the wealthiest in the 1900s, in the 21st century the companies and countries that have access to, and control of, the most vital data are the wealthiest. The titans of the 21st century (Alphabet, Amazon, Apple, Facebook and Microsoft) look as invincible as Standard Oil and Exxon did in the 1900s. Data now gives dominance. It provides companies with a ‘bird’s eye view’ of activities and an ability to improve customer satisfaction. Information technology has not only become an inseparable part of our lives but has also created a huge number of jobs and significant revenues for entirely new industries and places (Silicon Valley, Bangalore), just as oil did in the 1900s. Thus data, like oil, helps create an economic moat for its owners and ensures the prosperity of those who control it.

Big Data – An Introduction

‘Big Data’ refers to large and complex data sets that cannot be processed by traditional data-processing software. The early 2000s saw the emergence of online start-ups (such as Facebook and Google) whose core business was designed around big data. Before long, big data found applications in other fields for storage and analysis. Some of the applications are shown in figure 1. In this article, we focus on big data applications in finance. Finance is concerned with estimating the amount of money required and tapping the sources from which it can be obtained. It encompasses a multitude of sub-groups. Though big data is a game changer for the entire field of finance, in our view its impact is greatest in banking and insurance.

Features of Big Data

The three Vs of big data account for its popularity. These are:

1. Volume

A huge quantity of data can be generated, processed, transferred and stored using big data applications. According to one estimate, Facebook takes in about 600 terabytes of new data per day and stores about 300 petabytes in total. In the Indian context, the best example is the Aadhaar data of Indian citizens handled by UIDAI.

2. Variety

Quantitative data as well as data from videos, graphs, images and audio can be stored, processed and transferred through big data applications. Facebook users upload more than 350 million photos per day and watch about 100 million hours of video daily.

3. Velocity

Big data applications enable the processing, storage and transfer of data in real time. We can share photographs on WhatsApp and Instagram within seconds. High-frequency trading algorithms on the BSE and NSE process transactions almost instantly.

Real-life applications of Big Data in Banking

With the help of Big Data analytics, financial services firms and banks not only store data but also use it to generate business insights and add value. This supports immediate decision-making. We explain below five ways in which banks and financial services firms use Big Data analytics.

Fraud Detection

The Reserve Bank of India (RBI), in its Financial Stability Report, 2017, called frauds in banks and financial institutions ‘one of the emerging risks to the financial sector’ and strongly recommended the use of technology to detect them. Big data applications, such as analytics and machine learning, help banks to differentiate a fraudulent transaction from a legitimate one. These platforms can analyse any transaction in less than 300 milliseconds. Every transaction is compared with the customer’s normal activity, as per recorded history. Any aberration may indicate that a fraud is being perpetrated. The customer is sent an email and an SMS, and the bank may block irregular transactions. Thus, a fraud may be nipped in the bud. The credibility of digital banking transactions, in particular, depends on the efficiency of big data.
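The comparison of a transaction with recorded history can be illustrated with a minimal sketch. It assumes a simple z-score rule, which is only one of many possible techniques and not necessarily what any bank uses; the function name `is_suspicious`, the threshold of 3.0 and the sample amounts are all hypothetical.

```python
from statistics import mean, stdev

def is_suspicious(history, amount, z_threshold=3.0):
    """Flag `amount` if it lies far from the customer's past transaction amounts.

    Assumption: "normal activity" is summarised by the mean and sample
    standard deviation of past amounts. Real systems use far richer features.
    """
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # any change from a constant history is unusual
    return abs(amount - mu) / sigma > z_threshold

# Hypothetical transaction history for one customer (amounts in rupees).
past_amounts = [1200.0, 950.0, 1100.0, 1300.0, 1050.0]

print(is_suspicious(past_amounts, 1150.0))   # close to the usual range
print(is_suspicious(past_amounts, 25000.0))  # far outside the usual range
```

A flagged transaction would then trigger the email and SMS alert, or a block, as described above.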

Compliance and Regulatory Requirements

According to a research study by Corlytics Analytics, the top 10 investment banks paid $43 billion in fines between 2009 and 2015 because of failures in customer reporting. Big data analytics can save investment banks from such non-compliance fines by preparing timely compliance reports and helping banks to meet other reporting requirements.

Moreover, commercial banking is one of the most heavily regulated sectors in most countries. It is subject to continuous monitoring and has to adhere to the reporting requirements of the country’s central bank. Banks have to aggregate data that is scattered across locations and servers, and big data analytics helps them to ingest such scattered data easily. The central bank may use this data to identify abnormal trading patterns.

Customer Segmentation

Banks have changed their approach from product-centric to customer-centric. By dividing customers into segments based on their requirements, banks understand them at a granular level and can serve them better. Big data helps banks to group customers into distinct segments based on customer value (low, mid or high, depending on the revenue the relationship has generated); demographics (e.g. age, geography, gender, income level, marital status); banking pattern (online, offline or mobile banking); and other criteria. Based on the customer’s requirements, the bank can customise solutions. Promotion and marketing campaigns should likewise be designed with the desired customer segment in view.
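The grouping described above can be sketched as a simple rule-based segmentation. This is an illustration only: the `Customer` fields, the revenue cut-offs and the age bands are hypothetical, and a real bank would more likely derive segments from data (for example, by clustering).

```python
from dataclasses import dataclass

@dataclass
class Customer:
    name: str
    annual_revenue: float  # revenue the relationship generated for the bank
    age: int
    channel: str           # "online", "offline" or "mobile"

def value_tier(revenue, low_cutoff=10_000, high_cutoff=100_000):
    """Map revenue to a value tier; the cut-offs are hypothetical."""
    if revenue >= high_cutoff:
        return "high"
    if revenue >= low_cutoff:
        return "mid"
    return "low"

def segment(customer):
    """Combine value, a demographic band and banking pattern into one key."""
    age_band = "under-35" if customer.age < 35 else "35-plus"
    return (value_tier(customer.annual_revenue), age_band, customer.channel)

def group_by_segment(customers):
    """Return a dict mapping each segment key to the names in that segment."""
    groups = {}
    for c in customers:
        groups.setdefault(segment(c), []).append(c.name)
    return groups

customers = [
    Customer("Asha", 150_000, 28, "mobile"),
    Customer("Ravi", 40_000, 52, "offline"),
    Customer("Meera", 120_000, 31, "mobile"),
]
for key, names in group_by_segment(customers).items():
    print(key, names)
```

Each segment key can then drive a customised product offer or marketing campaign.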

Personalised Marketing

Accenture, in its Global Consumer Pulse Research Report, 2017, reveals that about 48% of surveyed consumers in the US expect specialised treatment for their loyalty. Interestingly, 33% of the consumers who switched loyalty during the previous year did so because personalisation was lacking. Personalised marketing, rooted in customer segmentation, is increasingly regarded as a panacea for winning loyal customers.

According to the World Retail Banking Report issued by Capgemini, banks would be able to retain only 55% of their retail customers over the next two quarters. Big data forms the backbone of personalised marketing. Data can be captured from merchant records, customers’ social media profiles or even the customers themselves. A research study by CGI found that banking customers are willing to share personal information with their bank so that it can better understand their financial goals and requirements.

Risk Management

Stress testing helps banks to manage liquidity risk. For example, UOB, a bank from Singapore, used a big data application to reduce the time required to calculate its total-bank risk from about eighteen hours to a few minutes. The risks of algorithmic trading are managed by backtesting strategies against historical data. Big data analysis can also support real-time alerting when a risk threshold is surpassed.
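Threshold-based alerting can be sketched in a few lines. The sketch below assumes that a risk engine emits (timestamp, exposure) readings; the function name `risk_alerts`, the limit of 100.0 and the sample readings are hypothetical and do not describe any bank's actual system.

```python
from typing import Iterable, Iterator, Tuple

def risk_alerts(readings: Iterable[Tuple[str, float]], limit: float) -> Iterator[str]:
    """Yield an alert message for each (timestamp, exposure) reading above `limit`."""
    for timestamp, exposure in readings:
        if exposure > limit:
            yield f"{timestamp}: exposure {exposure:.1f} exceeds limit {limit:.1f}"

# Hypothetical stream of exposure readings from a risk engine.
stream = [("09:00", 82.0), ("09:01", 97.5), ("09:02", 104.2), ("09:03", 99.0)]

for alert in risk_alerts(stream, limit=100.0):
    print(alert)
```

Because the function is a generator, it processes readings one at a time as they arrive rather than waiting for a complete batch.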

Case Studies from India

1. HDFC Bank

HDFC Bank Ltd. uses Big Data to learn not only about a customer’s financial habits but also about their personal habits. This helps to reduce the possibility of money laundering by identifying suspicious activity. Analytics also helps the bank to keep a record of customers’ credit histories so that it can disburse loans accordingly.

2. ICICI Bank

ICICI Bank adopted a debt collection process that helped it to improve customer satisfaction. The bank implemented a Business Intelligence (BI) solution that was developed in-house. Multiple channels were examined for collecting debt and evaluating performance. This application by ICICI is a trendsetter.

3. Axis Bank

The bank uses analytics in almost every sphere. Analytics helps it to obtain information about a customer’s background and plays an important role in building customer loyalty. Axis Bank uses Statistical Analysis System (SAS) to provide customer intelligence and to improve risk management throughout the organisation through early warning signals.

Real-life applications of Big Data in Insurance

The insurance industry is founded on estimating future events and measuring the risk or value of these events. Big data improves customer satisfaction, the accuracy of risk prediction and fraud detection, as explained below.

1. Client Satisfaction

Big data assists insurance companies not only in collecting vast amounts of data about their clients but also in analysing that data to create tailor-made insurance policies. Companies can now acquire a comprehensive understanding of customer behaviours, habits and needs, and so build a unique customer profile. This information can be used to determine the suitable policy and its premium. Thus, by offering policies with flexible options and terms (for example, family floaters), insurers increase customer satisfaction.

Big data also helps in other ways. For example, marine insurance companies can use geospatial data to find the exact location of ships and track their routes. Insurers can give their clients timely warnings about weather and foreseeable calamities (if any) and thus strengthen customer relations.

2. Risk Mitigation for Insurers

The vast amounts of data collected by companies help them to assess, to a great extent, the risk they take on by accepting a client’s application.

For example, an insurer can assess the relevant dangers at the location for which home insurance is sought. Accordingly, both the premium and the decision to underwrite the policy are better informed nowadays.

3. Fraud Detection

Insurance providers today are looking to use algorithmic fraud detection techniques that focus more on the individual who files the claim than on the claim itself, which was the traditional focus. Earlier, with less data available, insurers would scan through innumerable claims to identify danger signals. Today, insurers are trying to devise algorithms that help identify red flags in the individuals who file claims. This has become possible with text mining, a technique that scans large amounts of data for keywords. Within minutes, insurers can now find out the entire background of a claimant: how many claims they have submitted, their social category, and their past connections with, or claims submitted to, rival insurance companies.
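Keyword scanning of the kind attributed to text mining above can be sketched as follows. This is a deliberately simple illustration: the keyword list and the sample claim text are hypothetical, and production text mining involves far more than whole-word matching.

```python
import re
from collections import Counter

# Hypothetical red-flag keywords; a real insurer would maintain its own list.
RED_FLAG_KEYWORDS = ("cash", "urgent", "lost receipt", "no witnesses")

def keyword_hits(text, keywords=RED_FLAG_KEYWORDS):
    """Count case-insensitive whole-word occurrences of each keyword in `text`."""
    lowered = text.lower()
    hits = Counter()
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        count = len(re.findall(pattern, lowered))
        if count:
            hits[keyword] = count
    return hits

claim_text = "Urgent claim: lost receipt, paid in cash. Cash only."
print(keyword_hits(claim_text))
```

Counts like these could be combined with a claimant's claim history to decide which claims deserve a closer look.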

4. Loan Underwriting

Loan underwriting is the process which determines whether or not the borrower’s loan application is of acceptable risk. Underwriters use the 3C’s model to assess the borrower’s ability to repay the loan. This includes analysis of the borrower’s credit reputation, capacity to repay (calculated using debt ratios and cash reserves), and the collateral pledged.

Intelligent automation can help insurers mine vast amounts of data and highlight anomalies, patterns and red flags. Further, with the help of machine learning (which uses data to train and improve algorithms) and predictive analysis, insurers can get detailed and accurate information about the chances of repayment of the loan and hence about the risk involved in underwriting it.
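The 3C’s assessment can be illustrated with a small rule-based sketch. All thresholds here (a minimum credit score of 700, a maximum debt-to-income ratio of 0.40 and a maximum loan-to-value ratio of 0.80) are hypothetical values chosen for illustration, and real underwriting models are considerably more nuanced.

```python
def debt_to_income(monthly_debt, monthly_income):
    """Debt ratio: share of monthly income already committed to debt payments."""
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    return monthly_debt / monthly_income

def assess_3c(credit_score, monthly_debt, monthly_income,
              loan_amount, collateral_value,
              min_score=700, max_dti=0.40, max_ltv=0.80):
    """Return a pass/fail check for each C: credit, capacity and collateral.

    The three threshold parameters are hypothetical defaults.
    """
    return {
        "credit": credit_score >= min_score,
        "capacity": debt_to_income(monthly_debt, monthly_income) <= max_dti,
        "collateral": collateral_value > 0
                      and loan_amount / collateral_value <= max_ltv,
    }

# Hypothetical applicant: debt ratio 0.25, loan-to-value 0.60.
result = assess_3c(credit_score=750, monthly_debt=20_000, monthly_income=80_000,
                   loan_amount=1_500_000, collateral_value=2_500_000)
print(result)
print("acceptable risk:", all(result.values()))
```

Returning one flag per criterion, rather than a single verdict, keeps the reason for a rejection visible to the underwriter.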

Health Insurance

Big data analytics contributes significantly in this area in the following ways:

1. IoT (Internet of Things) Devices

The availability of remote IoT healthcare devices, which measure and communicate a person’s health statistics, is greatly benefitting healthcare. Patients no longer need to extend their hospital stay for mere checks and subsequent adjustment of doses or medicines. Their virtual presence, along with information sent from the IoT device to various doctors at regular intervals, can achieve the same results. This significantly reduces hospital stays, which benefits both the client and the insurance company.

2. Reference of Doctors

Insurance companies have access to information about all their clients: what ailments they suffer from, where they were treated and how their experience was. By linking members in cohorts, insurers can refer clients to doctors and hospitals that benefitted others suffering from the same disease. In fact, they can do much more now because of the availability of big data. For example, for a client with a back problem, an insurance company can suggest not only orthopaedic physicians but also orthopaedic surgeons with expertise in sports medicine, doctors dealing with orthopaedic trauma, hand surgery or foot and ankle surgery, or a doctor handling paediatric orthopaedics.

Everything has its own set of pros and cons. Big Data is no exception. The application of Big Data, particularly in the Health Insurance sector, has some negatives, which are discussed below.

1. Data Leak

Though IoT devices make life easier by providing data to doctors, hospitals and insurers even from remote areas, the chance of a data leak is very high. Through a very small mistake or a wrong intention, the health data of millions can be exposed.

2. Risk pool

The foundation on which health insurance works is risk pooling. An insurer covers both healthy and sick people, so that insurance is affordable for the sick and also profitable for the insurer.

If only the sick took insurance, they would be forced to pay very high premiums (based on the insurance company’s past data and the high risk of insuring them); otherwise, the insurance companies would make a loss. On the other hand, if only healthy people took health insurance, the whole purpose of insuring people against ailments would be defeated.
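The arithmetic behind risk pooling can be made concrete with a short sketch. The group sizes and expected claim amounts below are hypothetical, and the calculation ignores the insurer's costs and profit margin; it shows only the break-even premium.

```python
def break_even_premium(groups):
    """Premium per member at which total premiums equal total expected claims.

    `groups` is a list of (number_of_members, expected_annual_claim_per_member).
    """
    total_members = sum(members for members, _ in groups)
    if total_members == 0:
        raise ValueError("the pool has no members")
    total_claims = sum(members * claim for members, claim in groups)
    return total_claims / total_members

# Hypothetical pool: 900 healthy members and 100 sick members.
healthy = (900, 2_000)   # expected claim of 2,000 per healthy member
sick = (100, 50_000)     # expected claim of 50,000 per sick member

print(break_even_premium([healthy, sick]))  # pooled premium: 6800.0
print(break_even_premium([sick]))           # sick-only premium: 50000.0
```

With these assumed figures, pooling lowers the break-even premium for the sick from 50,000 to 6,800, which is the affordability effect described above.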

In fact, companies could use Big Data to separate the healthy from the sick. Artificial intelligence and algorithms can predict, from genetics and behavioural patterns, the diseases a person is likely to develop. Using this technique, insurers could identify the people at higher risk of disease and refuse to insure them, which would leave those people even more exposed.

Alternatively, insurers would raise premiums according to risk profiles unless forbidden by law. Such targeted increases would make insurance too expensive for those considered riskiest, forcing them to drop out. Consequently, only the relatively healthy would be covered, there would be no pooling of risk between healthy and sick people, and the whole purpose of insurance would be defeated.

Authors’ Point of View

From a time when one needed huge supercomputers to store and process small amounts of data to an era in which an unimaginable amount of data is produced, stored and processed every day, big data has certainly changed the world. And it does not stop here. We believe big data is a tool that will give dominance to its owners and will fuel growth and development in future. In a nutshell, big data has a monumental impact on every field, be it banking, insurance, stock markets or simply everyday life. We believe that the world is moving towards a situation where jobs, education and investment will be centred around Big Data.

The Cambridge Analytica scandal, instances of leakage of Aadhaar data, and the WhatsApp fake news controversy have brought the problems associated with Big Data to the forefront. The underlying issues are the localisation of data and respect for the privacy of individuals. Big companies like MasterCard and Visa are against data localisation, ostensibly because of the cost involved. The Supreme Court of India has recently recognised the Right to Privacy of every individual as a Fundamental Right under the Constitution of India. To prevent the misuse of personal data, we must be aware that our consent matters whenever the government or private players share our data. This requires that we read the terms and conditions carefully before registering on any website that asks for our personal details. Corporates should regard data privacy as a matter of good business ethics and should sensitise their employees about it. They will need to set up an entirely new function, one with a specialised workforce and a separate department within the company. Moreover, the Government will need cybercrime cells and stricter laws and regulations to ensure that vested interests do not mine the ‘new oil’ of the 21st century to their advantage.

To conclude, data is a double-edged sword. On the one hand, it can revitalise economies by creating more jobs, facilitating quick decisions and ensuring better governance, among other things. On the other hand, in the wrong hands it can destroy individuals, companies and entire economies.

By Deepti Mahajan and Manya Manushi

