
A Brief History of Analytics

Historically speaking, a simple definition of Analytics is “the study of analysis.” A more
useful, more modern description would suggest “Data Analytics” is an important tool for
gaining business insights and providing tailored responses to customers. Data
Analytics, sometimes abbreviated to “Analytics,” has become increasingly important for
organizations of all sizes. The practice of Data Analytics has gradually evolved and
broadened over time, providing many benefits.

The use of Analytics by business can be found as far back as the 19th century, when
Frederick Winslow Taylor initiated time management exercises. Another example is
Henry Ford measuring the speed of his assembly lines. In the late 1960s, Analytics
began receiving more attention as computers came into use as decision-support
systems. With the development of Big Data, Data Warehouses, the Cloud, and a variety
of software and hardware, Data Analytics has evolved significantly. Data Analytics
involves the research, discovery, and interpretation of patterns within data.

Relational Databases

Relational Databases were invented by Edgar F. Codd in the 1970s and became quite
popular in the 1980s. Relational Database Management Systems (RDBMSs), in turn,
allowed users to write queries in SQL (originally called SEQUEL) and retrieve data from
their databases. Relational Databases and SQL provided the advantage of being able to
analyze data on demand, and they are still used extensively. They are easy to work with
and very useful for maintaining accurate records. On the negative side, RDBMSs are
generally quite rigid and were not designed to handle unstructured data.
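
To make the idea of querying data on demand concrete, here is a minimal sketch using Python’s built-in sqlite3 module; the table, columns, and values are invented for illustration and do not come from any particular system.

    import sqlite3

    # Create an in-memory relational database with a small, structured table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)")
    conn.executemany(
        "INSERT INTO customers (name, region) VALUES (?, ?)",
        [("Ada", "North"), ("Grace", "South"), ("Alan", "North")],
    )

    # SQL lets an analyst ask an ad hoc question of the data on demand.
    rows = conn.execute("SELECT name FROM customers WHERE region = 'North'").fetchall()
    print(rows)  # [('Ada',), ('Alan',)]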

Data Warehouses

In the late 1980s, the amount of data being collected continued to grow significantly, in
part due to the lower costs of hard disk drives. During this time, the architecture of Data
Warehouses was developed to help transform data flowing in from operational
systems into a form suited to decision support. Unlike an operational relational
database, a Data Warehouse is normally optimized for quick responses to analytical queries. In a data
warehouse, data is often stored using a timestamp, and operation commands, such as
DELETE or UPDATE, are used less frequently. If all sales transactions were stored
using timestamps, an organization could use a Data Warehouse to compare the sales
trends of each month.
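
As a rough sketch of that monthly comparison, the query below groups timestamped sales rows by month; it again uses Python’s sqlite3, and the table name, columns, and figures are hypothetical.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (ts TEXT, amount REAL)")  # timestamps stored as ISO-8601 text
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("2024-01-15", 120.0), ("2024-01-28", 80.0), ("2024-02-03", 200.0)],
    )

    # Roll the timestamped transactions up to one row per month to compare sales trends.
    query = """
        SELECT strftime('%Y-%m', ts) AS month, SUM(amount)
        FROM sales
        GROUP BY month
        ORDER BY month
    """
    for month, total in conn.execute(query):
        print(month, total)  # 2024-01 200.0, then 2024-02 200.0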

Business Intelligence

The term Business Intelligence (BI) was first used in 1865, and was later adapted by
Howard Dresner at Gartner in 1989, to describe making better business decisions
through searching, gathering, and analyzing the accumulated data saved by an
organization. Using the term “Business Intelligence” as a description of decision-making
based on data technologies was both novel and far-sighted. Large companies first
embraced BI in the form of analyzing customer data systematically, as a necessary step
in making business decisions.

Data Mining

Data Mining began in the 1990s and is the process of discovering patterns within large
data sets. Analyzing data in non-traditional ways provided results that were both
surprising and beneficial. The use of Data Mining came about directly from the evolution
of database and Data Warehouse technologies. These new technologies allowed
organizations to store more data, while still analyzing it quickly and efficiently. As a
result, businesses started predicting the potential needs of customers, based on an
analysis of their historical purchasing patterns.

Big Data

As the internet became extremely popular, relational databases could not keep up.
The immense flow of information, combined with the variety of data types coming from
many different sources, led to non-relational databases, also referred to as NoSQL. A
NoSQL database can quickly store and process data arriving in many different formats,
trading the rigid, predefined schemas of relational systems for greater flexibility.
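
To illustrate that flexibility with a toy example (plain Python, not tied to any particular NoSQL product), a document-style store can accept records whose fields differ from one record to the next, something a fixed relational schema would reject:

    import json

    # Each "document" can have its own shape; no table schema is declared up front.
    documents = [
        {"user": "ada", "purchases": ["book", "lamp"]},
        {"user": "grace", "location": {"city": "Boston"}, "clicks": 42},
        {"user": "alan", "raw_log": "GET /index.html 200"},
    ]

    # Documents are typically stored and exchanged as JSON.
    print(json.dumps(documents, indent=2))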

In 2005, Roger Magoulas coined the term “Big Data” to describe an amount of data
that seemed almost impossible to cope with using the Business Intelligence tools
available at the time. In the same year, Hadoop, which could process Big Data, was
developed. Hadoop grew out of another open-source software framework called
Nutch and drew on the ideas behind Google’s MapReduce.

Apache Hadoop is an open-source software framework, which can process both
structured and unstructured data, streaming in from almost all digital sources. This
flexibility allows Hadoop (and its sibling open-source frameworks) to process Big Data.
During the late 2000s, several other open-source projects, such as Apache Spark and
Apache Cassandra, came about to deal with this challenge.
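
The processing pattern that Hadoop popularized, MapReduce, can be sketched in a few lines of plain Python. This toy word count only mimics the map, shuffle, and reduce phases; it is not Hadoop’s actual API.

    from collections import defaultdict

    lines = ["big data big insights", "data warehouses and data mining"]

    # Map phase: emit a (word, 1) pair for every word in every input line.
    mapped = [(word, 1) for line in lines for word in line.split()]

    # Shuffle phase: group all emitted values by their key.
    grouped = defaultdict(list)
    for word, count in mapped:
        grouped[word].append(count)

    # Reduce phase: combine the values for each key into a final result.
    totals = {word: sum(counts) for word, counts in grouped.items()}
    print(totals)  # e.g. {'big': 2, 'data': 3, ...}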
