You are on page 1of 2

Name: Dev Deepak Pathak

Practical No: 4
Roll No: 27

Hadoop Distributed File System (HDFS)

The storage system for Hadoop is known as HDFS. HDFS system breaks the incoming
data into multiple packets and distributes it among different servers connected in the clusters.
That way every server, stores a fragment of the entire data set and all such fragments are
replicated on more than one server to achieve fault tolerance.

Hadoop Use Case

Hadoop in the Financial Sector


The Finance sector is one of the major users of Hadoop. One of the primary use cases
for Hadoop, was in risk modelling, to solve the question for banks to evaluate customers and
markets better than legacy systems.

These days we notice that many banks compile separate data warehouses into a single
repository backed by Hadoop for quick and easy analysis. Hadoop clusters are used by banks
to create more accurate risk analysis models for the customers in its portfolio. Such risk
analysis helps banks to manage their financial security and offer customized products and
services.

Hadoop has helped the financial sector, maintain a better risk record in the aftermath of
2008 economic downturn. Before that, every regional branch of the bank maintained a
legacy data warehouse framework isolated from a global entity. Data such as checking and
saving transactions, home mortgage details, credit card transactions and other financial
details of every customer was restricted to local database systems, due to which, banks failed
to paint a comprehensive risk portfolio of their customers.

After the economic recession, most of the financial institutions and national monetary
associations started maintaining a single Hadoop Cluster containing more than petabytes of
financial data aggregated from multiple enterprise and legacy database systems. Along with
aggregating, banks and financial institutions started pulling in other data sources - such as
customer call records, chat and web logs, email correspondence and others. When such
unprecedented scale of data is analysed with the assistance of Hadoop, MapReduce and
techniques like sentiment analysis, text processing, pattern matching, graph creating; banks
were able to identify same customers across different sources along with accurate risk
assessment.

This proved that not just banks but any company which has fragmented and incomplete
picture of its customers, would improve its customer satisfaction and revenue by creating a
global data warehouse. Currently banks as well as government financial institutions use
HDFS and MapReduce to commence anti money laundering practices, asset valuation and
portfolio analysis.

Predicting market behaviour is another classic problem in the financial sector. Different
market sources such as, stock exchanges, banks, revenue and proceeds department, securities
market - hold massive volume of data themselves but their performance is interdependent.
Hadoop provides the technological backbone by creating a platform where data from such
sources can be compiled and processed in real-time.

Morgan Stanley with assets over 350 billion is one of the world’s biggest financial
services organizations. It relies on the Hadoop framework to make industry critical
investment decisions. Hadoop provides scalability and better results through it administrator
and can manage petabytes of data which is not possible with traditional database systems.

JPMorgan Chase is another financial giant which provides services in more than 100
countries. Such large commercial banks can leverage big data analytics more effectively by
using frameworks like Hadoop on massive volumes of structured and unstructured data.
JPMorgan Chase has mentioned it on various channels that they prefer to use HDFS to
support the exponentially growing size of data as well as for low latency processing of
complex unstructured data.

Larry Feinsmith, Managing Director of Information Technology at JPMorgan Chase said


that, “Hadoop's ability to store vast volumes of unstructured data allows the company to
collect and store web logs, transaction data and social media data. Hadoop allows us to store
data that we never stored before. The data is aggregated into a common platform for use in a
range of customer-focused data mining and data analytics tools”.

Financial institutions use Big Data Analytics and Hadoop to analyse world economy,
fraud detection, derive value based products for customers, analyse credit market data,
effective cash and credit management and for providing an enriching customer experience.

You might also like