
BUSINESS INFORMATION SYSTEMS
(UGBS 655)
Week 5

Data Analytics and Business Intelligence

Dr. Eric Afful-Dadzie


(OMIS, Dept.)
Data → Information
• Information has been important for a wide variety of
reasons, and for many centuries.
• The location of the nearest
water hole, warm cave, food
in times of famine was a
carefully guarded secret in
the past.

Data is the New Oil
❑ Extract it
❑ Commodify it
❑ Protect it

World Economic Forum 2011

What is Data Analytics?
• Data analytics is the process of using modern computational
tools to collect, organize, model and analyze data with the goal
of discovering useful information to support decision-making.
This may include searching for:
– Trends
– Patterns
– Natural groups (Clusters)
– Associations (relationships between variables, not necessarily causal)
– Anomalies (Outliers)
Data Analytics
• Data analytics as a discipline is largely invisible to
the world. Most of the time, we never even notice that
it’s happening.
• But whenever we sign up for a grocery store shopping
card, make a purchase using a credit card, or surf the
Web, we are creating data.
• Many industry-specific types of analytics have been
created, such as marketing analytics, health analytics,
HR analytics, sports analytics and risk analytics.
WHAT IF YOU COULD …
. . . predict the buying behavior and decision criteria of your
prospects weeks before your competition?
. . . gain first-mover advantage by introducing new products and
services to micro-segments that haven't been identified by
competitors?
. . . evaluate the impact of your marketing campaigns hourly and
make adjustments in real-time?
. . . improve customer experience scores that grow products per
customer, reduce attrition, and leverage the power of customer
recommendations for new business?
. . . predict likely failures of critical equipment and processes?
Where does data come from?
Facebook users share nearly 2.5 million pieces of content every minute.

Twitter users tweet nearly 300,000 times every minute.

Instagram users post nearly 220,000 new photos every minute.

Apple users download nearly 50,000 apps every minute.

Amazon generates over $80,000 in online sales every minute.

205 billion emails are sent per day.

Enterprise systems are churning out thousands of data sets for organizations.
Where does data come from?
• There are over 1 billion websites on the
World Wide Web today. This
milestone was first reached in
September 2014.

• 8.3 trillion text messages sent in a year

• 205 billion emails sent per day (Radicati Group, 2015)
Explosion of Data Points
• Think about the many data points about you as a person alone:
– Name
– Gender
– Weight
– Height
– Colour
– Race
– Tribe
– Example: (Kojo, Male, 70.5, 5’’, Black, African, Fante, Adenta)
5 Vs of Big Data

• Raw Data: Volume


• Change over time: Velocity
• Data types: Variety
• Data Quality: Veracity
• Information for Decision Making: Value
Types of Analytics
Scope of Analytics

❑ Descriptive analytics
- Uses data to understand the past and present (“What happened?”)
❑ Diagnostic analytics
- Answers the question “Why did it happen?”
❑ Predictive analytics
- Analyzes past performance to predict the future (“What is likely to happen?”)
❑ Prescriptive analytics
- Uses optimization techniques to recommend the best course of action (“What should we do?”)
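
A minimal sketch of how three of the analytics types above might look on one small data set. The monthly sales figures, the prices and the candidate stock levels are all hypothetical and chosen only for illustration. Diagnostic analytics is left out because it needs explanatory variables that this toy data set does not have.

```python
from statistics import mean

# Hypothetical monthly sales (units sold); not taken from the lecture.
sales = [100, 110, 120, 130, 140, 150]
months = list(range(1, len(sales) + 1))

# Descriptive: summarize what happened.
average_sales = mean(sales)

# Predictive: fit a least-squares straight line and extrapolate one month ahead.
mx, my = mean(months), mean(sales)
slope = sum((x - mx) * (y - my) for x, y in zip(months, sales)) / sum(
    (x - mx) ** 2 for x in months
)
intercept = my - slope * mx
forecast = intercept + slope * (len(sales) + 1)


# Prescriptive: choose the stock level with the highest profit for that forecast.
# The price and cost per unit are assumptions.
def profit(stock, demand, price=5.0, cost=3.0):
    return price * min(stock, demand) - cost * stock


candidate_stock_levels = range(100, 201, 10)
best_stock = max(candidate_stock_levels, key=lambda s: profit(s, forecast))

print(average_sales, forecast, best_stock)
```

The prescriptive step here is a brute-force search over a handful of options; real prescriptive models typically use dedicated optimization methods.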
ANALYTICS – EXAMPLE APPLICATIONS

Association Rules Mining
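
A small sketch of the three measures commonly used to score an association rule: support, confidence and lift. The shopping baskets below are hypothetical, and the rule {bread} → {butter} is only an example.

```python
# Hypothetical shopping baskets; each set is one transaction.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    support_both = support(antecedent | consequent)
    confidence = support_both / support(antecedent)
    lift = confidence / support(consequent)
    return support_both, confidence, lift


rule_support, rule_confidence, rule_lift = rule_metrics({"bread"}, {"butter"})
print(rule_support, rule_confidence, rule_lift)
```

A lift below 1 means the two items appear together less often than they would if they were independent, so a rule can have high confidence and still be uninteresting.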


ANALYTICS – Recommender Systems
Examples of Marketing Analytics
• Which customers are likely to “churn”?
– Concentrate on these customers
• What kind of customers like a specific offer?
– Up-lift modeling
• Which promotions to offer to a customer?
– Products he/she likes
– Products people with a similar profile like
Examples of HR Analytics

• Employee Attrition: reduction of staff for voluntary or involuntary reasons.
• Cost per Hire: costs per hire for recruiting, hiring and onboarding.
– Example: the average cost per hire was $22,127.
• Hiring Cycle Time: time from hiring requisition date to start date.
– Example: the average hiring cycle time was 5.8 months.
• Hiring Fill Rate: how many positions are filled after 6 months?
– Example: the 6-month hiring fill rate was 58%.
• Offer Acceptance Rate: how often does the preferred candidate
accept the position? Often broken down by job level.
– Example: the offer acceptance rate was 55.5% for level 3 executives.
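
The HR metrics above can be sketched as simple ratios. Every record below is hypothetical, and the exact definitions (for example, counting the recruiting cost of unfilled positions, or treating 182 days as 6 months) are assumptions; organizations define these metrics differently.

```python
from datetime import date

# Hypothetical requisition records; all values are invented for illustration.
positions = [
    {"opened": date(2023, 1, 1), "started": date(2023, 3, 2), "cost": 20000, "offers_made": 1},
    {"opened": date(2023, 2, 1), "started": date(2023, 6, 1), "cost": 25000, "offers_made": 2},
    {"opened": date(2023, 3, 1), "started": None, "cost": 5000, "offers_made": 1},
    {"opened": date(2023, 4, 1), "started": date(2023, 5, 1), "cost": 10000, "offers_made": 1},
]

filled = [p for p in positions if p["started"] is not None]

# Cost per hire: total recruiting spend divided by the number of hires.
cost_per_hire = sum(p["cost"] for p in positions) / len(filled)

# Hiring cycle time: days from requisition date to start date, averaged over hires.
cycle_days = [(p["started"] - p["opened"]).days for p in filled]
avg_cycle_days = sum(cycle_days) / len(cycle_days)

# Hiring fill rate: share of all positions filled within about 6 months (182 days).
fill_rate = sum(days <= 182 for days in cycle_days) / len(positions)

# Offer acceptance rate: accepted offers (one per hire) divided by offers made.
offer_acceptance_rate = len(filled) / sum(p["offers_made"] for p in positions)

print(cost_per_hire, avg_cycle_days, fill_rate, offer_acceptance_rate)
```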
ANALYTICS – Human in vitro fertilization (IVF)
ANALYTICS – Policing

• Predict crimes: type, location, time, time of the year.
• Learn characteristics of people who carry concealed weapons.
• Find patterns in crimes; e.g., a sudden increase in burglaries
in one particular area.
ANALYTICS – Precision Farming
ANALYTICS – Healthcare

• Can we predict the length of stay for certain diseases?
• Can we save the limited beds in hospitals for trauma wards?
• Can we accurately classify patients with a hospital length of
stay (LOS) of 2 days or less versus those who require longer
stays?
ANALYTICS – Spam Detection

Machine learning helps to solve this problem. The email
client can be trained to learn where to put each email.
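
One way an email client could be "trained" is with a naive Bayes classifier. The sketch below is a toy version written from scratch: the four training emails and the two folder names are hypothetical, and a real filter would use far more data and features. It is not a description of how any particular email client works.

```python
import math
from collections import Counter

# Hypothetical labelled emails used for training.
training = [
    ("win cash prize now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "inbox"),
    ("lecture notes for week five", "inbox"),
]


def train(examples):
    """Count words per folder and emails per folder."""
    word_counts = {"spam": Counter(), "inbox": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the folder with the highest log-probability (add-one smoothing)."""
    vocabulary = set().union(*word_counts.values())
    total_emails = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_emails)
        words_in_label = sum(counts.values())
        for word in text.split():
            score += math.log((counts[word] + 1) / (words_in_label + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(training)
print(classify("free cash prize", word_counts, label_counts))
print(classify("notes for monday meeting", word_counts, label_counts))
```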
ANALYTICS – Loan Application
• When you apply for a loan, you have to fill out a questionnaire
asking for relevant financial and personal information.
• This information is used by the loan company as the basis for
its decision on whether to accept or reject your loan
application.
ANALYTICS – Disease Outbreak

E.g., Google Flu Trends: detecting outbreaks two weeks ahead of CDC data.

New models are estimating which cities are most at risk
for spread of the Ebola virus.

The prediction model is built on various data sources, data types and
analyses.

ANALYTICS – Predicting election Outcome

Analytics: Traffic Prediction and Earthquake Warning

Crowdsourcing + physical modeling + sensing + data assimilation

to produce:

ANALYTICS – Tweet sentiment

Tweet Demonstration
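
A minimal sketch of tweet sentiment scoring using a word list (lexicon). The two word lists and the example tweets are hypothetical; this is a stand-in for illustration, not the demonstration used in the lecture, and production systems usually rely on trained models.

```python
import re

# Tiny hypothetical sentiment lexicons.
POSITIVE = {"good", "great", "love", "happy", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "sad", "awful"}


def sentiment(tweet):
    """Label a tweet by counting positive and negative words."""
    words = re.findall(r"[a-z']+", tweet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for tweet in ["I love this phone, great battery!",
              "Terrible service, I hate waiting.",
              "The lecture starts at nine."]:
    print(sentiment(tweet), "|", tweet)
```

A word-count approach like this misses negation ("not good") and sarcasm, which is one reason real sentiment tools are more elaborate.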
THE ANALYTICS GAP
Many organizations:
▪ Can’t always generate the information they need.
▪ Can’t generate insight fast enough to act upon it.
▪ Continue to incur huge costs due to uninformed
decisions and misguided strategies.

The opportunities afforded by analytics have never been greater.


Dangers of Data mining
• One of the key issues raised by data mining is not a
business or technological one, but a social one.

• It is the issue of individual privacy.

• Data mining makes it possible to analyze routine business
transactions and glean a significant amount of information
about individuals' buying habits and preferences (e.g., target
marketing).
Danger 1 : Correlation ≠ Causality
• Diet Coke → Obesity
• Intensive Care → Death
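
The "Intensive Care → Death" example can be reproduced with a small simulation. In the sketch below, how sick a patient is (a hypothetical severity score) drives both ICU admission and death, while the ICU itself has no effect at all on survival. All numbers are invented for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Each simulated patient: severity drives BOTH ICU admission and death.
# By construction, being in the ICU has no effect on the chance of dying.
patients = []
for _ in range(20000):
    severity = rng.random()
    in_icu = rng.random() < severity
    died = rng.random() < 0.5 * severity
    patients.append((severity, in_icu, died))


def death_rate(rows):
    rows = list(rows)
    return sum(died for _, _, died in rows) / len(rows)


# Naive comparison: ICU patients die far more often.
icu_rate = death_rate(p for p in patients if p[1])
ward_rate = death_rate(p for p in patients if not p[1])

# Comparing patients of similar severity makes the gap largely disappear.
band = [p for p in patients if 0.45 <= p[0] < 0.55]
band_icu_rate = death_rate(p for p in band if p[1])
band_ward_rate = death_rate(p for p in band if not p[1])

print(f"all patients:     ICU {icu_rate:.2f} vs ward {ward_rate:.2f}")
print(f"similar severity: ICU {band_icu_rate:.2f} vs ward {band_ward_rate:.2f}")
```

The overall correlation between ICU and death comes entirely from the hidden third variable (severity), which is the pattern behind many misleading correlations.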
Danger 2: Privacy Issues
• First step in data mining:
• Collect data
− Buying history (loyalty cards)
− Blog posts, Twitter messages
− Browsing behavior
− Contests
• Buy the data from a company
• Survey data; e.g. CBS (average income/postal code)
• Combining different sources may lead to privacy issues
• Information asymmetry
Why is Facebook Worth $350bn?
Source: Facebook’s Privacy Policy:
• Information we collect when you interact with
Facebook:
• Site activity information, …
• “We may ask advertisers to tell us how our users
responded to the ads we showed them”

• “We allow advertisers to choose the characteristics of users
who will see their advertisements and we may use any of the
non-personally identifiable attributes we have collected
(including information you may have decided not to show to
other users)”
Danger 3: Discriminating Models
• Often we observe classifiers learn undesirable properties
from data …
Danger 4: Classification
• Often we observe classifiers learn undesirable
properties from data …

Candidate                                             Required
Gender  Age  Diploma    Nationality  …  City          Diploma   Invite?
M       29   MSc. Math  Belgian      …  Antwerp       MSc. CS   y
F       49   BSc. CS    Turkish      …  Eindhoven     BSc. CS   n
M       32   HBO        Dutch        …  The Hague     -         y
…       …    …          …            …  …             …         …

(Gender = F) and (Job_type = “full professor”) → (Invite = No)

(Nationality = “Moroccan”) or (Nationality = “Turkish”) → (Invite = No)
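
A simple first check for rules like these is to compare invitation rates across groups in the historical data a classifier would learn from. The eight decisions below are hypothetical, and the rate gap is just one of several possible fairness measures.

```python
from collections import defaultdict

# Hypothetical past decisions: (gender, invited?). Not real data.
history = [
    ("M", "y"), ("M", "y"), ("M", "n"), ("M", "y"),
    ("F", "n"), ("F", "y"), ("F", "n"), ("F", "n"),
]


def invite_rates(rows):
    """Return the share of candidates invited, per group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [invited, total]
    for group, invited in rows:
        counts[group][0] += invited == "y"
        counts[group][1] += 1
    return {group: inv / total for group, (inv, total) in counts.items()}


rates = invite_rates(history)
gap = rates["M"] - rates["F"]
print(rates, gap)
```

A large gap does not prove discrimination by itself, but a model trained on such data can pick the pattern up and repeat it.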
BUSINESS INTELLIGENCE
(VISUAL ANALYTICS)
Data Visualization - Defined
• Data visualization refers to techniques used to
communicate insights from data through visual
representation.
• Its main goal is to distill large datasets into visual
graphics to allow for easy understanding of complex
relationships within the data.
• It is often used interchangeably with terms such as
information graphics, statistical graphics, and
information visualization.
Tracing the History of Data Visualizations

William Playfair invented four types of graphs: the line graph and bar chart of
economic data (1786), and the pie chart and circle graph (1801). Joseph Priestley created
the first timeline charts.
Charles Joseph Minard's Figurative Map of the 1812 Russian campaign (published 1869)
Benefits of BI
Faster Decision Making
• Companies that gather and quickly act on their data
are more competitive in the marketplace because
they make informed decisions sooner than the
competition.
• Speed is key, and data visualization aids the
understanding of vast quantities of data by applying
visual representations to the data.
• Visualization not only spurs creativity, but also
reduces the need for IT to allocate resources to
continually build new models.
Basic Example
• Let’s say you’re a retailer and you want to compare sales of jackets
to sales of socks over the course of the previous year.
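
A rough sketch of the jackets-versus-socks comparison as a text bar chart, using only the Python standard library. The quarterly sales figures are hypothetical; a BI tool would draw the same comparison graphically.

```python
# Hypothetical quarterly unit sales for the previous year.
sales = {
    "jackets": [120, 40, 30, 150],
    "socks": [80, 85, 90, 95],
}


def bar(value, scale=10, symbol="#"):
    """One symbol per `scale` units sold (rounded down)."""
    return symbol * (value // scale)


def render(sales_by_product):
    lines = []
    for quarter in range(4):
        for product, values in sales_by_product.items():
            lines.append(f"Q{quarter + 1} {product:<8}{bar(values[quarter])} {values[quarter]}")
    return "\n".join(lines)


chart = render(sales)
print(chart)
```

Even in this crude form, the seasonal swing in jackets and the steady sales of socks are easier to see than in the raw numbers.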
Common Types of Data Visualizations
• Time-series
• Ranking
• Part to Whole
• Deviation
• Correlation
• Frequency Distribution
• Geographical Comparison
• Relationships
Line charts
These are one of the most basic and commonly used
visualizations. They show a change in one or more variables over
time.

When to use: You need to show how a variable changes over time.
Area charts
• Area charts are primarily used when the summation of quantitative data (the dependent variable) is to be communicated, rather than
individual data values. The area underneath the line(s) helps to depict quantitative progression over time graphically.

When to use: You need to show cumulative changes in multiple variables over time.
Bar charts
• These charts are like line charts, but they use bars to represent
each data point.
Bar charts
Stacked Bars
Scatter Plot
