
BAFBAN1: Fundamentals of Business Analytics

Week 12

IBM Global Center for Smarter Analytics  © 2013 IBM Corporation


© Copyright IBM Corporation 2013. All rights reserved.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS FOR INFORMATIONAL PURPOSES ONLY. IBM SHALL
NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO,
THIS PRESENTATION OR ANY OTHER DOCUMENTATION.

IBM, the IBM logo, ibm.com, Cognos, SPSS and iLog are trademarks or registered trademarks of International
Business Machines Corporation in the United States, other countries, or both. If these and other IBM
trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™),
these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this
information was published. Such trademarks may also be registered or common law trademarks in other
countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark
information” at http://www.ibm.com/legal/copytrade.html. The IBM logo must not be moved, added to
or altered in any way.

Other company, product, or service names may be trademarks or service marks of others.



BAFBAN1: Fundamentals of Business Analytics
Big Data



Big Data

Definition: datasets that grow so large that they become awkward to work with using on-hand database management tools

Challenge: capturing, storing, searching, sharing, analyzing, and visualizing.



Big Data - Characteristics
Volume
–the large amount of data we are getting from many sources as the world becomes more and more instrumented

Variety
–the data we receive can be structured, semi-structured, or unstructured

Velocity
–the data arrives continuously and at very high speed



Big Data - Characteristics - Volume

2.5 quintillion (or 2.5 million trillion) bytes of data are created each day
90% of the world’s data was created in the last two years alone.



Big Data - Characteristics - Variety

80% of the world’s data is unstructured

It may be data we’ve collected before but could not process



Big Data - Characteristics - Velocity



Big Data - Business Challenges
How to use data to learn about customers

Tasks to learn about this data

Know where to look for the data



Big Data - Business Challenges
How to use data to learn about customers
•What data the business has about the customer
–database (structured)
–flat files, data in documents such as Word files, web blogs, comments from web sites (unstructured)

•What the data says about customers
–“hidden information”
–sentiment
–buying trend

•Where the data about the customers is



Big Data - Business Challenges
Tasks to learn about this data

•Identify the data

•Locate data sources

•Load the data

•Convert into useable format

•Analyze the data

•Convert the data into useable knowledge
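The tasks above can be read as a small pipeline. The following is a minimal sketch, not part of the original slides: the source names, sample records, and function names are all hypothetical, and the "sources" are in-memory strings standing in for real files or databases.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical in-memory "sources"; a real pipeline would read files or databases.
SOURCES = {
    "purchases.csv": "customer,item\nana,laptop\nben,phone\nana,mouse\n",
    "survey.json": '[{"customer": "ana", "rating": 4}, {"customer": "ben", "rating": 2}]',
}


def locate(extension):
    """Locate data sources: return the names of sources with the given extension."""
    return [name for name in SOURCES if name.endswith(extension)]


def load(name):
    """Load the data: return the raw text of one source."""
    return SOURCES[name]


def convert(name, raw):
    """Convert into a useable format: a list of dicts, whatever the source type."""
    if name.endswith(".csv"):
        return list(csv.DictReader(io.StringIO(raw)))
    if name.endswith(".json"):
        return json.loads(raw)
    raise ValueError(f"unsupported source: {name}")


def analyze(records):
    """Analyze the data: count how many records mention each customer."""
    return Counter(record["customer"] for record in records)


def run_pipeline():
    """Run locate -> load -> convert -> analyze over every known source."""
    totals = Counter()
    for extension in (".csv", ".json"):
        for name in locate(extension):
            totals += analyze(convert(name, load(name)))
    return dict(totals)


if __name__ == "__main__":
    print(run_pipeline())
```

Each function maps to one task on the slide, so a new source type would only require a new branch in `convert`.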


Big Data - Business Challenges
Know where to look for the data
•Warehouse data about purchase history of a customer: structured
•Site browsing history of the customer: structured
•Comments on websites: unstructured
•Customer surveys: semi-structured
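To make the three categories concrete, here is a small illustrative sketch. The sample records are made up for this example and are not from the slides; the point is only how differently each kind of data has to be handled before it can be analyzed.

```python
import csv
import io
import json
import re

# Structured: fixed columns, e.g. a warehouse export (made-up sample).
purchases = list(
    csv.DictReader(io.StringIO("customer,amount\nana,120.50\nben,35.00\n"))
)

# Semi-structured: self-describing fields that may vary per record (made-up sample).
survey = json.loads(
    '{"customer": "ana", "answers": {"q1": 5, "comment": "fast delivery"}}'
)

# Unstructured: free text with no schema; here we can only split it into words.
comment = "Love the laptop, but the charger died after a week!"
words = re.findall(r"[a-z']+", comment.lower())


def total_spent(rows):
    """Sum the 'amount' column; possible only because the columns are fixed."""
    return sum(float(row["amount"]) for row in rows)


if __name__ == "__main__":
    print(total_spent(purchases))
    print(survey["answers"]["q1"])
    print(words)
```

Structured data can be queried by column name right away, semi-structured data needs its nesting navigated, and unstructured text needs further processing before it says anything about the customer.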



Big Data – Business Motivation

Launch customized advertising campaigns

Gather data from different sources to get the complete picture

Integrate data into meaningful records

Use this information to provide rich and useable results

Use data for ongoing analytic queries



Big Data - Technical Challenges

Integrate from various sources

Do not compromise existing data structures and infrastructure

Locate appropriate data

Get data into a useable format

Analyze petabytes of data



Big Data - Technical Challenges
Integrate from various sources

•Different Data Source Platforms



Big Data - Technical Challenges
Integrate from various sources

•Structured Data Sources

•Semi-Structured Data Sources

•Unstructured Data Sources



Big Data - Technical Challenges
Locate appropriate data

•Where is the data?
–Relational Databases
–Flat files
–Web Sites, Web Logs
–Written Notes

•What does the data look like?
–Structured
–Semi-Structured
–Unstructured



Big Data - Technical Challenges
Get data into a useable format

•Understand the data you are dealing with



Big Data - Technical Challenges
Analyze petabytes of data
•The amount of data that may need to be analyzed will be in petabytes

1 kilobyte is approx. 1,000 bytes

1 megabyte is approx. 1,000,000 bytes

1 gigabyte is approx. 1,000,000,000 bytes

1 terabyte is approx. 1,000,000,000,000 bytes

1 petabyte is approx. 1,000,000,000,000,000 bytes
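The scale above can be checked with a few lines of arithmetic. This sketch assumes decimal units (each step is a factor of 1,000); binary units, which step by 1,024, give slightly different values, which may be why the slide says "approx." The function names are illustrative only.

```python
# Decimal (power-of-1000) units, smallest to largest.
UNITS = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]


def to_unit(num_bytes, unit):
    """Convert a byte count to the given decimal unit."""
    return num_bytes / 1000 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value, index = float(num_bytes), 0
    while value >= 1000 and index < len(UNITS) - 1:
        value /= 1000
        index += 1
    return f"{value:g} {UNITS[index]}"


# The "2.5 quintillion bytes created each day" figure from the Volume slide.
DAILY_BYTES = 2.5e18

if __name__ == "__main__":
    print(humanize(DAILY_BYTES))
    print(to_unit(DAILY_BYTES, "PB"))
```

Under these assumptions, 2.5 quintillion bytes per day works out to 2,500 petabytes, or 2.5 exabytes.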



Big Data – Sample Solutions

All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.



Big Data – Sample Solutions
•Analyze the 12 TB of tweets being created each day to figure out what people are saying about your products
•Figure out who the key influencers are within your target demographics
•Mine this data to identify new market opportunities

Characteristics: volume, variety
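As a toy illustration of figuring out what people are saying about a product, the sketch below counts positive and negative words per message. The word lists and sample messages are hypothetical, and this keyword approach is a stand-in chosen for brevity; it is not the method used by any particular product, and real sentiment analysis at this volume would need distributed processing and much richer language models.

```python
import re
from collections import Counter

# Hypothetical, tiny word lists; real systems use far richer models.
POSITIVE = {"love", "great", "fast"}
NEGATIVE = {"hate", "broken", "slow"}


def score(text):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def summarize(messages):
    """Bucket messages into positive / negative / neutral by the sign of the score."""
    buckets = Counter()
    for message in messages:
        s = score(message)
        label = "positive" if s > 0 else "negative" if s < 0 else "neutral"
        buckets[label] += 1
    return dict(buckets)


# Made-up sample messages.
tweets = [
    "Love my new phone, great camera",
    "Screen arrived broken and support is slow",
    "Just bought a phone",
]

if __name__ == "__main__":
    print(summarize(tweets))
```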



Big Data – Sample Solutions
•Hospitals could take the thousands of sensor readings collected every hour per patient in ICUs
•Identify subtle indications that the patient is becoming unwell, days earlier than traditional techniques allow

Characteristics: variety, volume



Big Data – Sample Solutions
•Make risk decisions, such as whether or not someone qualifies for a mortgage, in minutes, by analyzing many sources of data, including real-time transactional data, while the client is still on the phone or in the office

Characteristics: volume, velocity



Big Data – Sample Solutions
•Law enforcement agencies could analyze audio and video feeds in real-time without human intervention to identify suspicious activity

Characteristics: variety, velocity



Big Data – Sample Solutions
•A green energy company could use petabytes of weather data along with massive volumes of operational data to optimize asset location and utilization, making these environmentally friendly energy sources more cost competitive with traditional sources

Characteristics: volume, variety



Big Data and Analytics
Watson competed on Jeopardy! in 2011

200 million pages of text were loaded in memory using Big Data technology (i.e., IBM InfoSphere BigInsights)

Analytics was used to process the loaded data.



Sources
IBM Academic Initiative (ibm.com/academicinitiative)
•(BD001) Hadoop Fundamentals
•(DW640) InfoSphere BigInsights Analytics for Business Analysts

