
The History, Evolution, & Technologies of Big Data [with use cases]
With the rise of Big Data, companies are adopting Big Data tools and
technologies, and many readers want to know where the term came from. In this
article, we trace the history behind the current buzz around “Big Data”.

The article also covers use cases of Big Data in different domains and
explores the big data technologies that companies have adopted for handling
Big Data.

Let us start with the history of Big Data.

History of Big Data


The history of big data starts many years before the present buzz around Big
Data. The first attempts to quantify the growth rate of data volume were made
about seventy years ago. That growth has popularly been known as the
“information explosion”.
Below we cover some major milestones in the evolution of “big data”.

1944:
Based on his observations, Fremont Rider speculated that the Yale Library in
2040 would have “approximately 200,000,000 volumes, which will occupy over
6,000 miles of shelves… [requiring] a cataloging staff of over six thousand
persons.”
He did not predict the digitization of libraries, but he did predict the
information explosion.
From 1944 to 1980, many articles and presentations observed the ‘information
explosion’ and the resulting need for storage capacity.
1980:
In 1980, the sociologist Charles Tilly used the term big data in the sentence
“none of the big questions has actually yielded to the bludgeoning of the
big-data people” in his article “The old-new social history and the new old
social history”.
However, the term as used in this sentence does not carry the meaning that
Big Data has today.

Now we jump ahead to 1997-1998, where we see the term big data used in its
present context.

1997:
In 1997, Michael Cox and David Ellsworth published the article
“Application-controlled demand paging for out-of-core visualization” in the
Proceedings of the IEEE 8th conference on Visualization.
The article uses the term big data in the passage “Visualization provides an
interesting challenge for computer systems: data sets are generally quite large,
taxing the capacities of main memory, local disk, and even remote disk. We
call this the problem of big data. When data sets do not fit in main memory (in
core), or when they do not fit even on local disk, the most common solution is
to acquire more resources.”.

It was the first article in the ACM digital library to use the term big data
in its modern context.
1998:
In 1998, John Mashey, then Chief Scientist at SGI, presented a paper titled
“Big Data… and the Next Wave of Infrastress” at a USENIX meeting. Mashey used
the term in various talks, which is why he is often credited with coining the
term Big Data.
2000:
In 2000, Francis Diebold presented a paper titled “‘Big Data’ Dynamic Factor
Models for Macroeconomic Measurement and Forecasting” to the Eighth
World Congress of the Econometric Society.
In the paper, he stated that “Recently, much good science, whether physical,
biological, or social, has been forced to confront—and has often benefited from
—the “Big Data” phenomenon.

Big Data refers to the explosion in the quantity (and sometimes, quality) of
available and potentially relevant data, largely the result of recent and
unprecedented advancements in data recording and storage technology.”

Diebold thus explicitly linked the term big data to the way we understand it
today.
2001:
In 2001, Doug Laney, then an analyst with the Meta Group (later acquired by
Gartner), published a research note titled “3D Data Management: Controlling
Data Volume, Velocity, and Variety.” The 3 V’s have become the most widely
accepted dimensions for defining big data.
2005:
In 2005, Tim O’Reilly published his groundbreaking article “What is Web
2.0?”. In this article, he states that “data is the next Intel inside”.
O’Reilly Media explicitly used the term ‘Big Data’ to refer to large sets of
data that are almost impossible to handle and process using traditional
business intelligence tools.

This is the definition of Big Data that is most widely understood today.
Hadoop, which Yahoo used to process petabytes of data, also originated around
2005 and is now an open-source project of the Apache Software Foundation.
Many companies now use Hadoop to crunch Big Data.

So we can say that 2005 is the year the Big Data revolution truly began, and
the rest, as they say, is history.
Big Data Use Cases
It’s time to see some big data use cases. Many organizations use big data tools
such as Apache Hadoop, Spark, Hive, Pig, etc. to handle big data and gain
insights from it.
Below we list some major big data use cases in different domains.

1. Financial Sectors

Big Data has several applications in the finance and banking sectors.
Financial services organizations use big data for:

a. Fraud Detection
Banks and financial firms use big data analytics to differentiate legitimate
business transactions from fraudulent ones. Using machine learning and big
data analysis, they can distinguish normal activity from unusual behavior
that indicates fraud, based on the customer’s history.

If unusual behavior is observed, the analysis systems suggest immediate
actions, such as blocking irregular transactions, which can stop fraud before
it occurs.
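
As a rough illustration of flagging unusual behavior based on a customer’s history, the sketch below marks a transaction as unusual when its amount lies far from the customer’s past amounts. The function name `is_unusual`, the z-score rule, the threshold of 3.0, and the sample amounts are all assumptions made for this example; real fraud systems use far richer features and trained models.

```python
from statistics import mean, stdev


def is_unusual(history, amount, z_threshold=3.0):
    """Return True when `amount` deviates strongly from past amounts.

    `history` is a list of the customer's earlier transaction amounts.
    The z-score rule and threshold are illustrative assumptions.
    """
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return amount != mu  # all past amounts identical
    return abs(amount - mu) / sigma > z_threshold


# Hypothetical past card transactions for one customer.
past_amounts = [42.0, 55.5, 38.2, 61.0, 47.3]

print(is_unusual(past_amounts, 52.0))   # close to the usual range
print(is_unusual(past_amounts, 900.0))  # far outside the usual range
```

A system built along these lines could hold or block the flagged transaction and ask the customer for confirmation.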

b. Risk assessment
Financial firms manage their customers’ risk by analyzing customer portfolios
with big data tools. The analysis supports real-time alerting, so if a risk
threshold is exceeded, the system alerts the firm.

c. Customer Segmentation
Customer segmentation helps transform banks from product-centric to
customer-centric businesses. Big Data enables banks to group customers into
distinct segments defined by data sets that include daily transactions,
demographics, etc.

Marketing campaigns and promotions are then targeted to customers based on
their segments.
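
The grouping step can be sketched with simple rules. In the example below, the field names (`monthly_spend`, `age`), the band boundaries, and the sample customers are hypothetical; in practice, segments are usually derived from much larger data sets, often with clustering algorithms rather than fixed thresholds.

```python
from collections import defaultdict


def segment(customer):
    """Assign a customer to a segment from spend and age bands (assumed rules)."""
    spend = customer["monthly_spend"]
    age = customer["age"]
    spend_band = "high" if spend >= 2000 else "medium" if spend >= 500 else "low"
    age_band = "under_30" if age < 30 else "30_to_59" if age < 60 else "60_plus"
    return f"{spend_band}_spend/{age_band}"


# Hypothetical customer records combining transaction and demographic data.
customers = [
    {"id": "c1", "monthly_spend": 2500, "age": 45},
    {"id": "c2", "monthly_spend": 300, "age": 24},
    {"id": "c3", "monthly_spend": 800, "age": 67},
    {"id": "c4", "monthly_spend": 2100, "age": 52},
]

# Group customer ids by segment so campaigns can target each group.
segments = defaultdict(list)
for customer in customers:
    segments[segment(customer)].append(customer["id"])

for name, ids in sorted(segments.items()):
    print(name, ids)
```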

Big Data in JPMorgan Chase


JPMorgan Chase is a leading global financial services firm and is among the
largest banking institutions in the US. It generates massive amounts of data
about its US-based customers, such as credit card information and other
transactional data.

Along with the publicly available economic statistics, JPMorgan Chase uses
new big data analytics to develop insights into consumers’ trends and offers
those reports to the bank’s clients.

JPMorgan Chase analyzes phone calls, emails, and transaction data to detect
possible fraud. It also uses analytics software developed by Palantir to
monitor employee communications and identify any risk of internal fraud.

2. Health Care sectors


Many industries use big data, but healthcare is one of the most prominent
areas where big data is successfully reshaping established practices.

a. Patient predictions
Healthcare organizations use Big Data analysis to predict the number of
upcoming visits, the frequency of skipped appointments, and the total
duration of surgeries.

Using big data analysis, they can also predict whether doctors have enough
medical supplies. Consequently, these processes improve the quality of care
for patients, which helps them recover faster.

b. Real-Time Health Monitoring


With the advancement of IoT, there are many wearable devices, such as fitness
trackers and wristbands, that monitor the health of their users. The data
generated by these devices must be analyzed in real time to monitor user
health and provide the information to doctors.

Data from all these devices is analyzed instantly and, if something is wrong,
an alert is sent automatically to the doctor or another specialist. As a
result, the doctor can contact the patient without delay and provide all the
necessary instructions.
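
The alerting logic described above can be sketched as a small monitor that inspects readings as they arrive. The class name `HeartRateMonitor`, the heart-rate limits, and the window size are assumptions chosen for illustration only and are not medical guidance; requiring several consecutive out-of-range readings is one simple way to avoid alerting on a single noisy sample.

```python
from collections import deque


class HeartRateMonitor:
    """Raise an alert when every reading in a full window is out of range."""

    def __init__(self, low=40, high=120, window=3):
        # Limits and window size are illustrative assumptions.
        self.low = low
        self.high = high
        self.recent = deque(maxlen=window)

    def add_reading(self, bpm):
        """Record one reading and return True if an alert should be sent."""
        self.recent.append(bpm)
        window_full = len(self.recent) == self.recent.maxlen
        out_of_range = all(r < self.low or r > self.high for r in self.recent)
        return window_full and out_of_range


monitor = HeartRateMonitor()
readings = [72, 75, 130, 135, 140, 80]  # hypothetical wearable data
alerts = [monitor.add_reading(bpm) for bpm in readings]
print(alerts)
```

In this run, only the fifth reading triggers an alert, because it completes three consecutive readings above the upper limit.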

c. Predictions of Mass outbreaks


With big data analysis, scientists build social models of the health of the
population, and doctors can create predictive models of outbreaks. By
analyzing the data with suitable algorithms, they can predict disease
outbreaks.

So, before a disease spreads, doctors have the opportunity to develop
targeted vaccines faster, which can help prevent the outbreak. This is a
significant benefit for the world’s population.

3. Big Data in Transportation industry


Big data has proven profitable not only for banking and healthcare but also
for the transportation industry, where it is used to make transportation more
efficient and easy.

1. Route planning: Transportation firms use big data to understand and
estimate users’ needs on different routes and different modes of
transportation. They then plan routes to reduce users’ waiting time.
2. Congestion management and traffic control: Big data helps in combining
real-time traffic data collected from road sensors, video cameras, and GPS
devices. Thus, traffic problems in dense areas can be resolved by adjusting
public transportation routes in real time.
For example, people use Google Maps to find the least congested routes.
3. Traffic safety: Real-time processing of big data and predictive analysis
can be used to identify accident-prone areas, which can help reduce accidents
and increase traffic safety.
4. Big Data in Government sector
Big data plays a vital role in the government sector. Big Data technologies
play significant roles in fields like public services, defense, national
security, cybersecurity, crime prediction, etc.

• In public services, Big Data tools have a wide range of applications such
as financial market analysis, health-related research, fraud detection,
environmental protection, and many more.
• The Social Security Administration (SSA) uses Big Data to analyze large
volumes of social disability claims that arrive in unstructured formats.
These analytics help the SSA process medical information quickly, make
faster decisions, and detect fraudulent claims.
• The Food and Drug Administration (FDA) uses big data to detect and study
patterns of food-related diseases and illnesses. This enables faster
responses, leading to more rapid treatment and fewer deaths.
• The Department of Homeland Security also uses big data for a variety of
use cases.
5. Big Data in Retail
Big Data analytics is playing a major role in shaping the future of the retail
industry.

Retailers, both offline and online, are adopting data analysis strategies to
understand the buying behavior of their customers, map them to different
products, and plan marketing strategies that sell more products and increase
profits.

They are using big data analysis for:

1. Generating Recommendations: Based on their customers’ purchase history,
retailers predict what customers are likely to purchase next. They use
machine learning models trained on historical data to make these
predictions.
2. Making Strategic Decisions: Retailers collect data from various sources and
analyze it to make profitable decisions.
3. Market Basket Analysis: Retailers use Market Basket Analysis techniques to
figure out which products a customer is most likely to purchase together.
Using Apache Hadoop, retailers can now analyze vast amounts of data.
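
The core idea behind Market Basket Analysis can be sketched by counting how often pairs of products appear in the same basket. The function name `pair_counts` and the sample baskets are made up for this example; production systems typically run association-rule algorithms such as Apriori or FP-Growth over far larger transaction logs.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one fixed order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical purchase baskets.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Here bread and butter appear together in three of the four baskets, so a retailer might place them near each other or promote them together.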
There are many other use cases of Big Data in sectors like Education,
Telecom, and Media and Entertainment. Refer to the Big Data Use Cases article
for more examples.

Big Data Technologies


Big Data technologies are software utilities designed for analyzing,
processing, and extracting information from vast amounts of unstructured or
semi-structured data that can’t be handled by relational databases or
traditional processing systems.

The top big data technologies are:

1. Apache Hadoop
Hadoop addresses many of the core problems of storing and processing big
data, and it is often described as the backbone of the Big Data industry. It
is an open-source software framework that stores and processes big data in a
distributed manner.

HDFS, MapReduce, and YARN are the core components of Hadoop.
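
To give a feel for the MapReduce programming model, here is a minimal single-process sketch of a word count in plain Python. It does not use the Hadoop API; the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample lines are assumptions for illustration. In a real cluster, the map and reduce steps run in parallel on many machines over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single result."""
    return key, sum(values)


lines = ["big data tools", "big data history", "data everywhere"]

mapped = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(word_counts)
```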

2. Apache Spark
Apache Spark is another leading Big Data tool. Spark is a fast cluster
computing engine; its developers have reported it running workloads up to 100
times faster than Hadoop MapReduce in memory and up to 10 times faster on
disk.

Apache Spark is best known for its in-memory computing capabilities, which
deliver high-speed processing.

3. Apache Flink
Apache Flink is sometimes called the 4G of Big Data. Flink is an open-source scalable data
analytics framework that can handle stream processing as well as batch
processing easily. It is a streaming data flow engine designed for stateful
computations.
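
A stateful computation over a stream means that the result for each new event depends on state built up from earlier events. The plain-Python sketch below illustrates only that idea and does not use the Flink API; the generator name `running_totals` and the sample events are hypothetical.

```python
def running_totals(events):
    """Yield (key, running total) for each event, keeping state per key."""
    state = {}  # per-key state carried across events
    for key, value in events:
        state[key] = state.get(key, 0) + value
        yield key, state[key]


# Hypothetical stream of (sensor id, measurement) events.
events = [("sensor_a", 2), ("sensor_b", 5), ("sensor_a", 3)]

for key, total in running_totals(events):
    print(key, total)
```

An engine such as Flink manages this kind of per-key state across a cluster and keeps it consistent when machines fail, which a single-process generator cannot do.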

4. Tableau
Tableau is a BI tool for data visualization that transforms raw data into an
understandable format. It visualizes data in the form of interactive dashboards
that can be easily understood by any technical or non-technical user.

A person without any coding knowledge can learn Tableau. It is one of the
most widely used data visualization tools in the analytics industry.

5. QlikView
QlikView is another leading Big Data visualization tool for transforming raw
data into knowledge. It has a simple, clean, and straightforward user
interface for exploring and analyzing data.

6. Hive
Hive is an open-source tool that lets developers use SQL-like queries,
written in the Hive Query Language (HiveQL), to process Big Data. It is a
data warehousing tool built on top of Hadoop.

These are some of the top big data technologies that a large number of
companies use to deal with Big Data and to profit from the growing Big Data
market.
