
Big Data

Outline
 Introduction
 Similar Topics
 What Does Big Data Look Like?
 Why Use Big Data?
 How Is It Useful?
 What Companies Rely On Big Data?
 Summary
 Questions
 References

[Diagram boxes: Existence, changes / Events, observation / Recording / Post-process]

Where (big) data starts to play a role


What exactly is Big Data
 Extremely large data sets that may be analyzed computationally to reveal patterns, trends, and
associations, especially relating to human behavior and interactions.

 Q: What is big data, and why does the University need an institute?


 A: Currently, many domains, including science, engineering, health care, environmental science,
e-commerce and, increasingly, the humanities, generate massive amounts of data. These data are
accumulating to the point where making sense of them is a huge challenge. And it’s not just about
size, speed and variety, but also the complexity of the data sets, the enormous numbers of variables
and the uncertainty in measurements from global environmental monitoring systems, studies of gene
expression and others. Because data can come from multiple sources, they also must be integrated,
creating additional complications.
 So we need a Big Data Institute at the University because many of the problems we face in science,
engineering, health care and the humanities require very powerful tools for answering questions
across the domains. And because U.Va. is a complete university, we can combine our efforts to solve
some of the most challenging problems.
History of data
 Data has been recorded by human beings since ancient times
• Experts generated data
• Average people consumed data
 In the past century, recording became easier with computers
• Experts or specialized software still generated the data
• Average people consumed data
 Nowadays, data is generated by people using various devices and software
• By everybody
“Big” data
 Data on paper
 The use of computers creates the big data problem, with the following characteristics:
• Faster: high speed, the rapidity with which data comes in
• Easier
• Variety
• Large amount
• Various resources
Multi-faceted
 “Big data” is something that has multiple definitions,
Resources
 Sensor network
 High precision devices
 Surveillance
…
 data fusion, and information analysis.
 More accurate analyses may lead to more confident
decision making. And better decisions can mean greater
operational efficiencies, cost reductions and reduced risk.
(Predictive Analytics, Big Data)
 What is Data Mining
Who is in the jungle?
 SAS
Introduction to Big Data
 Extremely large data sets which grow exponentially
 Difficult to process with traditional methods
 Reveals patterns, trends & associations
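
To make "grow exponentially" concrete, here is a minimal Python sketch of exponential growth. The starting volume (1 exabyte) and the two-year doubling time are illustrative assumptions, not figures from these slides.

```python
def projected_volume(initial, doubling_time_years, years):
    """Volume after `years` if it doubles every `doubling_time_years`."""
    return initial * 2 ** (years / doubling_time_years)

# Hypothetical inputs: 1 exabyte today, doubling every 2 years.
for year in range(0, 11, 2):
    volume = projected_volume(1.0, 2.0, year)
    print(f"year {year:2d}: {volume:6.1f} EB")
```

Under these assumed inputs the volume grows 32-fold in ten years, which is why methods sized for today's data stop being adequate quickly.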

Introduction (continued)
 Unimportant: Size of data
 Important: Ability to analyze such a data set

Graphical Representation
 “There were 5 exabytes of information created between the dawn of civilization
through 2003, but that much information is now created every 2 days.”
 Eric Schmidt
• Google CEO (2010)

Confusion on Big Data
 Not usually confused with another concept
 Confusion occurs when trying to understand why

Components to Big Data: Multi-faceted
Relationships among data

Data Representation: Table

Measure            In words          In figures
Pictures/Day       55 million        55,000,000
Tweets/Day         340 million       340,000,000
Documents/Day      1 billion         1,000,000,000
Total Bytes/Day    2.5 quintillion   2,500,000,000,000,000,000
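
The daily totals above are easier to grasp as per-second rates. The following Python sketch does that arithmetic using only the figures in the table; the variable names are made up for illustration, and an exabyte is taken as 10^18 bytes (the decimal definition).

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Daily totals from the table above.
daily_counts = {
    "pictures": 55_000_000,
    "tweets": 340_000_000,
    "documents": 1_000_000_000,
    "bytes": 2_500_000_000_000_000_000,  # 2.5 quintillion
}

def per_second(count_per_day):
    """Average rate per second for a given daily total."""
    return count_per_day / SECONDS_PER_DAY

for name, count in daily_counts.items():
    print(f"{name}: about {per_second(count):,.0f} per second")

EXABYTE = 10 ** 18  # decimal exabyte, in bytes
exabytes_per_day = daily_counts["bytes"] / EXABYTE
print(f"total: {exabytes_per_day} EB per day")
```

2.5 quintillion bytes per day is 2.5 exabytes per day, or 5 exabytes every two days, which matches the Eric Schmidt quote on the earlier slide.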

Process of Big Data Creation

Mainframe → Client/Server → The Internet → Social Media/The Cloud

Why Use Big Data?
 Better management of data
 Speed, capacity & scalability benefits
 Better visualization of data
 Data analysis capabilities will evolve

Big Data Analyzing Software
 Amazon
 Amazon DynamoDB
 Amazon Redshift
 DataStax
 Cassandra
• Developed at Facebook, inspired by Amazon's Dynamo

Who Relies on Big Data?
 Majority of companies
 Retail
• Amazon
 Entertainment
• Netflix
 Health
• MyFitnessPal

Amazon
 Predicts what customers want before they start their
search
 “Frequently Bought Together”
 “Customers Who Bought This Item Also Bought”
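
One simple way to produce an "also bought" list is to count how often two items appear in the same order. The Python sketch below illustrates that idea only; it is not Amazon's actual method, and the function name `also_bought` and the order data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def also_bought(orders):
    """Map each item to a Counter of items that appeared in the same order."""
    co_counts = defaultdict(Counter)
    for order in orders:
        # Count each unordered pair of distinct items once per order.
        for a, b in combinations(sorted(set(order)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts

# Hypothetical order history.
orders = [
    ["camera", "memory card", "tripod"],
    ["camera", "memory card"],
    ["camera", "camera bag"],
    ["tripod", "camera bag"],
]

counts = also_bought(orders)
# The item most often bought together with a camera in this toy data.
print(counts["camera"].most_common(1))
```

At real scale the counting would be spread across many machines, but the principle of ranking items by how often they are bought together stays the same.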

Netflix
 Analyzes viewing habits
 Improves suggestions
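
As a rough illustration of turning viewing habits into suggestions, the Python sketch below finds the viewer whose history overlaps most with yours (Jaccard similarity) and suggests what they watched that you have not. This is a textbook simplification, not Netflix's actual system; the function names, viewer names, and titles are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def suggest(user, histories):
    """Titles watched by the most similar other viewer but not by `user`."""
    watched = histories[user]
    others = [u for u in histories if u != user]
    if not others:
        return set()
    nearest = max(others, key=lambda u: jaccard(watched, histories[u]))
    return histories[nearest] - watched

# Hypothetical viewing histories.
histories = {
    "ana": {"Drama A", "Thriller B", "Comedy C"},
    "ben": {"Drama A", "Thriller B", "Documentary D"},
    "cy": {"Cartoon E"},
}

print(suggest("ana", histories))
```

Here "ben" shares two of four distinct titles with "ana" and "cy" shares none, so the suggestion for "ana" comes from "ben".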

MyFitnessPal
 Extensive database with information immediately
available

Big Data’s Influence on the Future
 Healthcare industry could save $300 billion a year by
using big data analytics
 Big Data has helped predict crimes three times more
accurately than current forecasting
 Companies involved in retail could increase profit by more
than 60%

Fun Facts
 1.9 million IT jobs will be created by 2015 to work with Big
Data projects
 Data transferred via mobile networks increased by 81%
every month between 2012 and 2014
 NSA is only capable of analyzing 1.6% of all internet
traffic per day

Summary
Big Data has a much larger impact on daily life than the
majority of society realizes. Whether people are actively on the
internet or shopping at the grocery store, their data is being
recorded and analyzed almost instantaneously. Without Big Data,
the technology available to us today would not be as
reliable or successful.

Questions
 What is more important, the amount of data or the
process by which the data is analyzed?
 The process of analyzing the data

 What is the pattern of growth for Big Data?


 Exponential growth
