
Points to be Covered

 Evolution of Data
 What is Data? And what makes it big?
 5 V’s of Big Data / Characteristics of Big Data
 Big Data as an Opportunity
 Case Study – IBM Big Data Analytics
 Problem with Big Data
 Hadoop as a Solution
 Conclusion
 References

06-11-2019 Big Data 2 out of 22


Figure 1 – Evolution of Technology
Figure 2 – IoT



Social Media
Figure 3 – Usage of Social Media

Other Factors
Figure 4 – Data Evolution



 Data is a set of values of qualitative or quantitative variables. In computing, data is information that has been translated into a form that is efficient for movement or processing.
 The concept was first imagined in the 1940s through Hari Seldon, the fictional professor of mathematics in Isaac Asimov's Foundation stories.
 Roger Magoulas is reputed to have coined the term ‘big data’.



 Big data generates value from the storage and processing
of very large quantities of digital information that cannot
be analysed with traditional computing techniques.



Figure 5 – 5 V’s



Volume – How much data
 Tremendously large
 Rising exponentially
 E.g. on Facebook alone, 10 billion messages are sent, the “like” button is pressed 4.5 billion times, and over 350 million new pictures are uploaded every day.
Figure 6 – Volume



Variety – The various types of data
 Structured – Data with a proper schema
 Semi-structured – E.g. JSON, XML, CSV, TSV and many more file types where the schema is not strictly defined
 Unstructured – Log, audio, video and image files
Figure 7 – Different kinds of data generated from various sources
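The three categories can be told apart by how much structure a program can rely on. The sketch below is only an illustration: the column names in SCHEMA, the sample records and the log line are all made up for this example and do not come from any real system.

```python
import json

# Structured: every row follows the same fixed schema (hypothetical columns).
SCHEMA = ("id", "name", "age")
structured_rows = [(1, "Asha", 31), (2, "Ravi", 27)]
structured_ok = all(len(row) == len(SCHEMA) for row in structured_rows)

# Semi-structured: each JSON record names its own fields, and the set of
# fields may differ from one record to the next.
json_text = '[{"id": 1, "name": "Asha"}, {"id": 2, "tags": ["new"]}]'
records = json.loads(json_text)
semi_fields = [sorted(record.keys()) for record in records]

# Unstructured: free text with no field structure; the program can only
# treat it as raw content (here, split into words).
log_line = "2019-11-06 10:15:02 user logged in from mobile app"
words = log_line.split()

print("structured rows match schema:", structured_ok)
print("fields per JSON record:", semi_fields)
print("tokens in log line:", len(words))
```

The structured rows can be validated against one schema, while the JSON records have to be inspected one by one and the log line has to be interpreted by the reader of the data.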



Velocity – How fast the data is processed
 Mainframe → Client / Server → Internet → Mobile, Social media → Cloud Storage
 The amount of data generated is so large that it is daunting to calculate.

Figure 8 – Data is being generated at an alarming rate



Veracity – Uncertainty and inconsistencies in the data
 Some data packets are bound to be lost in the process
 Such incomplete or inconsistent data is problematic

Figure 9 - Veracity



 Mechanism to bring the correct meaning out of the data.
 Make sure that whatever big data is generated makes sense, i.e. it helps your business grow and has some value to it.
Figure 10 – Value of Big Data



 Handle larger datasets
 Insights everywhere for everyone
 Data quality
 Better education at the grassroots level
 Security
 Cost Management
 Actionable insights
 Lower barrier to entry for the majority of organizations, and many more …



(IBM Case study)
 Earlier meter: data was collected once a month
 New smart meter: data is collected every 15 minutes
 Big Data generated by smart meters: 96 million reads per day for every million meters

Figure 12 – Managing the large volume and velocity of information generated by short-interval reads of smart meter data
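The figure of 96 million reads per day follows directly from the 15-minute read interval. A minimal sketch of the arithmetic, where the function name reads_per_day is hypothetical and chosen only for this illustration:

```python
MINUTES_PER_DAY = 24 * 60  # 1440 minutes in a day

def reads_per_day(interval_minutes, meters=1):
    """Number of meter reads per day for a given read interval."""
    return (MINUTES_PER_DAY // interval_minutes) * meters

per_meter = reads_per_day(15)                      # one smart meter
per_million = reads_per_day(15, meters=1_000_000)  # one million smart meters

print(per_meter)    # 96
print(per_million)  # 96000000
```

Each smart meter produces 96 reads a day (1440 / 15), so one million meters produce 96 million reads a day, compared with one read per meter per month for the earlier meters.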



 To manage and use this information to gain insight,
utility companies must be capable of high-volume data
management and advanced analytics designed to
transform data into actionable insights.

Figure 13 – Time-of-use pricing encourages cost-savvy users, such as operators of heavy industrial machines, to run them at off-peak times.



Storing exponentially growing huge datasets
 More data was generated in the past 2 years than in all previous history combined
 By 2020, total digital data will grow to approximately 44 zettabytes
 By 2020, about 1.7 MB of new information will be created every second for every person



Processing data having complex structure
 Data is not only huge but is also present in various formats
 You need to make sure that a system is present to store the variety of data generated from various sources

Figure 14 – Different types of data



Processing data faster
 Data is growing at a much faster rate than disk read/write speed
 Bringing a huge amount of data to the computation unit becomes a bottleneck
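A rough calculation shows why a single disk becomes the bottleneck. The numbers below are assumptions picked only for illustration (a 1 TB dataset, 100 MB/s sustained read speed per disk, 100 disks); they are not measurements from the slides.

```python
def scan_seconds(dataset_mb, mb_per_second, disks=1):
    """Time to read the whole dataset when it is split evenly across disks
    that are all read at the same time."""
    return dataset_mb / (mb_per_second * disks)

DATASET_MB = 1_000_000   # assumed dataset size: 1 TB
DISK_MB_PER_S = 100      # assumed read speed of one disk

single = scan_seconds(DATASET_MB, DISK_MB_PER_S)               # one disk
parallel = scan_seconds(DATASET_MB, DISK_MB_PER_S, disks=100)  # 100 disks

print(f"one disk:  {single:.0f} s (about {single / 3600:.1f} hours)")
print(f"100 disks: {parallel:.0f} s")
```

Under these assumptions one disk needs 10,000 seconds to read the data once, while 100 disks read in parallel need 100 seconds, which is the motivation for spreading both storage and computation across many machines.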



HDFS (Storage)
 Allows us to dump any kind of data across the cluster

MapReduce (Processing)
 Allows parallel processing of the data stored in HDFS

Figure 15 – Hadoop is a framework that allows us to store and process large data sets in a parallel and distributed fashion
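The MapReduce idea can be sketched in plain Python with a word count. This is not the Hadoop API: the functions map_phase, shuffle and reduce_phase are hypothetical names for this illustration, and the two "blocks" are in-memory lists standing in for chunks of a file stored on different nodes.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)

def word_count(blocks):
    # Each block could be mapped on a different machine; here the blocks
    # are simply processed one after another.
    mapped = [pair
              for block in blocks
              for line in block
              for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Two stand-in blocks, as if the input were split across two nodes.
blocks = [["big data is big"], ["hadoop stores big data"]]
counts = word_count(blocks)
print(counts)
```

Because the map step looks at each line independently and the reduce step looks at each key independently, both steps can run in parallel on the nodes that already hold the data.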
 The availability of Big Data, low-cost commodity hardware, and new information management and analytic software has produced a unique moment in the history of data analysis.
 Big Data brings new and exciting opportunities to companies that utilize the available platforms.
 In this information era, Big Data technology has become important for business in its own right.
 It offers many opportunities in the coming days.



 Big Data for Dummies - Judith Hurwitz, Alan Nugent, Dr. Fern Halper, Marcia Kaufman
 Big Data: Does Size Matter? - Timandra Harkness
 https://m.youtube.com/watch?list=PL1c9thWg6dPfl5iOGQFe62z1BpRqpZfrK&params=OAFIAVgC&v=3SK9iJNYehg&mode=NORMAL
 https://www.odbms.org/2015/04/who-invented-big-data-and-why-should-we-care/
 https://www.xsnet.com/blog/bid/205405/the-v-s-of-big-data-velocity-volume-value-variety-and-veracity
 https://www.wired.com/insights/2013/05/the-missing-vs-in-big-data-viability-and-value/
 http://www.tutorialspoint.com/hadoop/hadoop_introduction.htm

