
CBTU presents a course on Big data and Hadoop


All logos and trademarks are the property of their respective companies.


Audience and Prerequisites
• Audience
– Anyone interested in learning Big data,
Analytics and the Hadoop framework.
• Prerequisites
– Basic computer knowledge, a bit of Java, database
concepts and the Linux operating system.

Last updated on 6 November 2013


Unique binary prefixes

IEC prefix      Representations                                       Customary prefix
Name  Symbol    Base 2  Value                        Base 10          Name   Symbol
kibi  Ki        2^10    1024                         = 1.024×10^3     kilo   k or K
mebi  Mi        2^20    1048576                      ≈ 1.049×10^6     mega   M
gibi  Gi        2^30    1073741824                   ≈ 1.074×10^9     giga   G
tebi  Ti        2^40    1099511627776                ≈ 1.100×10^12    tera   T
pebi  Pi        2^50    1125899906842624             ≈ 1.126×10^15    peta   P
exbi  Ei        2^60    1152921504606846976          ≈ 1.153×10^18    exa    E
zebi  Zi        2^70    1180591620717411303424       ≈ 1.181×10^21    zetta  Z
yobi  Yi        2^80    1208925819614629174706176    ≈ 1.209×10^24    yotta  Y

Specific units of IEC 60027-2 A.2 and ISO/IEC 80000
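The relationship in the table can be checked with a few lines of Python. This is only an illustrative sketch: the names `PREFIXES`, `binary_value`, `decimal_value` and `excess_percent` are made up for this example and do not come from the course material.

```python
# IEC binary prefixes paired with the customary decimal prefixes from the table.
PREFIXES = [
    ("kibi", "Ki", "kilo"),
    ("mebi", "Mi", "mega"),
    ("gibi", "Gi", "giga"),
    ("tebi", "Ti", "tera"),
    ("pebi", "Pi", "peta"),
    ("exbi", "Ei", "exa"),
    ("zebi", "Zi", "zetta"),
    ("yobi", "Yi", "yotta"),
]


def binary_value(n):
    """Value of the n-th IEC prefix (1 = kibi, 2 = mebi, ...): 2^(10n)."""
    return 2 ** (10 * n)


def decimal_value(n):
    """Value of the n-th customary prefix (1 = kilo, 2 = mega, ...): 10^(3n)."""
    return 10 ** (3 * n)


def excess_percent(n):
    """How much larger the binary prefix is than the decimal one, in percent."""
    return (binary_value(n) / decimal_value(n) - 1) * 100


for n, (iec, symbol, customary) in enumerate(PREFIXES, start=1):
    print(f"{iec} ({symbol}) = 2^{10 * n} = {binary_value(n)}, "
          f"{excess_percent(n):.1f}% larger than {customary}")
```

The gap between the two conventions grows with each step, which is why the distinction matters more for tera- and peta-scale storage than for kilobytes.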
Big Data?
Big data is a term for data sets that are so large or
complex that traditional data processing applications
are inadequate to deal with them.
Challenges include:
• Data capturing, storage, data analysis
• Search, sharing, transfer, visualization
• Querying, updating and information privacy.
Growth of Big data
Data is very rapidly generated by
• R&D
• Satellites
• Aviation
• Social media
• Mobile devices, cameras
• Remote sensing
• Software logs
• IoT devices
• Stock exchange and finance
• Online shopping/ ecommerce
• RFID readers, Sensors
• Manufacturing, Healthcare, etc.
Data is of
• Huge volume
• High velocity
• Extensible variety
Applications of Big data
• Google search, translate, business
• Amazon, Walmart, eBay, Best Buy
• Facebook, Twitter
• Public services
• Election forecasting
• Medical and health
• Weather predictions
• Science, Engineering
Most common users of big data and analytics apps

Source: Evans Data Corporation, Big Data and Analytics survey 2015
Big data ecosystem
• The big data ecosystem helps a business make more
accurate analyses and better decisions, achieve greater
operational efficiency, cut costs, and reduce risk.
Data Analytics
• Predictive analytics
• User behavior analytics
• Find results, correlations, trends and reports
Useful for
– Scientists, businesses, medical research, crime
investigation, marketing research, governments
and practically every other field.
Types of Data
Big data is characterized by huge volume, high velocity,
and an extensible variety of data forms.
• Structured data: Relational data.
• Semi-structured data: XML data.
• Unstructured data: Photos, videos, documents,
PDF, text, etc.
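The three types listed above can be illustrated with a small Python sketch that uses only the standard library. The table name `orders`, the XML snippet and the sample sentence are invented for illustration and are not part of the course material.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: rows with a fixed schema in a relational table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT, qty INTEGER)")
conn.execute("INSERT INTO orders VALUES (1, 'book', 2)")
structured = conn.execute(
    "SELECT item, qty FROM orders WHERE id = 1"
).fetchone()
conn.close()

# Semi-structured: XML carries its own tags, but the set of
# elements can differ from one record to the next.
xml_text = "<order id='1'><item>book</item><note>gift wrap</note></order>"
root = ET.fromstring(xml_text)
semi_structured = {child.tag: child.text for child in root}

# Unstructured: free text has no schema; even a simple word
# count requires us to impose our own interpretation.
unstructured = "Customer called to say the book arrived late but in good shape."
word_count = len(unstructured.split())

print(structured)
print(semi_structured)
print(word_count)
```

The structured row can be queried by column name, the XML record must be walked tag by tag, and the free text needs processing before any field can be extracted from it.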
Growth of Big data
Data generation
• Doubles every two years since 1980
• ~2.5 exabytes (2.5×10^18 bytes) per day
IDC predicts
• by 2020 there will be 44 Zettabytes of data
• by 2025 there will be 163 Zettabytes of data

1 Gig = 1000 Megs
1 Tera = 1000 Gigs
1 Peta = 1000 Tera = 1 million Gigs
1 Exa = 1000 Peta = 1 billion Gigs
1 Zetta = 1000 Exa = 1000 billion Gigs
1 Yotta = 1000 Zetta = 1 million Exa
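The figures on this slide can be worked through with a short Python sketch. The helper `convert` and the variable names are hypothetical, and the growth rate is only what the two quoted IDC figures imply under an assumption of steady compound growth; it is not a number stated in the source.

```python
import math

# Decimal units from the slide, each 1000 times the previous one.
UNITS = ["mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def convert(value, from_unit, to_unit):
    """Convert between decimal units, e.g. convert(1, 'peta', 'giga')."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * 1000.0 ** steps


# ~2.5 exabytes per day, expressed as zettabytes per year.
zettabytes_per_year = convert(2.5, "exa", "zetta") * 365

# Annual growth implied by 44 ZB (2020) and 163 ZB (2025),
# assuming steady compound growth over the five years.
annual_growth = (163 / 44) ** (1 / 5) - 1
doubling_time_years = math.log(2) / math.log(1 + annual_growth)

print(f"{zettabytes_per_year:.4f} ZB generated per year at 2.5 EB/day")
print(f"Implied growth: {annual_growth:.1%} per year")
print(f"Implied doubling time: {doubling_time_years:.2f} years")
```

Under that assumption the two forecasts imply growth of roughly 30% per year, which corresponds to a doubling time of a little over two and a half years.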
