
CHAPTER 1

Decision Making
Responsibility of managers
-plan
-coordinate
-organize
-lead their organizations to better performance

Ultimately, managers’ responsibilities require that they make:

1. Strategic decisions
-involve higher-level issues
-concern the organization’s overall goals and aspirations for the future
-usually the domain of higher-level executives
-have a time horizon of three to five years.

2. Tactical decisions
-concern how the organization should achieve the goals and objectives
-responsibility of midlevel management
-span a year and thus are revisited annually or even every six months.

3. Operational decisions
-affect how the firm is run from day to day
-the domain of operations managers, who are the closest to the customer.

Regardless of the level within the firm, decision making can be defined as the following process:

1. Identify and define the problem.
-most critical
2. Determine the criteria that will be used to evaluate alternative solutions.
3. Determine the set of alternative solutions.
4. Evaluate the alternatives.
5. Choose an alternative.

Approaches to making decisions:
-tradition (“We’ve always done it this way”)
-intuition (“gut feeling”)
-rules of thumb (“As the restaurant owner, I schedule twice the number of waiters and cooks on holidays”)

Business Analytics Defined

CHALLENGES
-uncertainty
-an enormous number of alternatives, so many that we cannot evaluate them all

Business analytics
-scientific process of transforming data into insight for making better decisions
-used for data-driven or fact-based decision making; more objective than other alternatives for decision making.
-can involve anything from simple reports to the most advanced optimization techniques

3 broad categories of techniques:

1. Descriptive analytics
-encompasses the set of techniques that describes what has happened in the past.
-Examples:
-data queries (a data query is a request for information with certain characteristics from a database)
-reports, descriptive statistics, data visualization including data dashboards (collections of tables, charts, maps, and summary statistics that are updated as new data become available), some data-mining techniques (use of analytical techniques for better understanding patterns and relationships that exist in large data sets), and basic what-if spreadsheet models.

2. Predictive analytics
-consists of techniques that use models constructed from past data to predict the future or ascertain the impact of one variable on another.
-Linear regression, time series analysis, some data-mining (often used in predictive
analytics), and simulation (use of probability and statistics to study the impact of uncertainty on a decision), often referred to as risk analysis.

3. Prescriptive analytics
-indicates a course of action to take; the output is a decision.
-prediction + a rule = prescriptive model

Examples:
* optimization models (give the best decision subject to the constraints of the situation)
* rule-based models
-rely on a rule or set of rules
* portfolio models in finance
-use historical investment return data to determine which mix of investments will yield the highest expected return while controlling or limiting exposure to risk.
* supply network design models in operations
-provide plant and distribution center locations that will minimize costs while still meeting customer service requirements
* price-markdown models in retailing
* simulation optimization
-combines the use of probability and statistics to model uncertainty with optimization techniques
* decision analysis
-used to develop an optimal strategy
* utility theory
-assigns values to outcomes based on the decision maker’s attitude

* big data
-no universally accepted definition
-any set of data that is too large or too complex to be handled by standard data-processing techniques and typical desktop software

Volume
-data must be stored, and this storage has led to vast quantities of data.
-a terabyte = 1,024 gigabytes
Velocity
-Real-time capture and analysis of data present unique challenges both in how data are stored and the speed with which those data can be analyzed for decision making.
Variety
Text data - collected by monitoring what is being said about a company’s products or services on social media
Audio data - collected from service calls
Video data - collected by in-store video cameras; used to analyze shopping behavior.
Veracity - has to do with how much uncertainty is in the data.

Hadoop - an open-source programming environment that supports big data
-provides a divide-and-conquer approach to handling massive amounts of data
MapReduce - a programming model used within Hadoop that performs two major steps:
1. map step - divides the data into manageable subsets and distributes them to the computers in the cluster (often termed nodes) for storing and processing.
2. reduce step - collects answers from the nodes and combines them into an answer to the original problem.
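
The two MapReduce steps above can be sketched on a single machine. This is only an illustration of the divide-and-conquer idea, not Hadoop's actual API: the "nodes" are simulated with plain Python lists, and the function names (`split_for_nodes`, `node_work`, `map_step`, `reduce_step`) and the sample comments are assumptions made up for this example.

```python
from collections import Counter
from functools import reduce


def split_for_nodes(records, n_nodes):
    """Divide the data into n_nodes manageable subsets (one per simulated node)."""
    return [records[i::n_nodes] for i in range(n_nodes)]


def node_work(subset):
    """Work done on a single node: count the words in its own subset only."""
    counts = Counter()
    for line in subset:
        counts.update(line.lower().split())
    return counts


def map_step(records, n_nodes):
    """Map step: distribute the subsets and let each node produce a partial answer."""
    return [node_work(subset) for subset in split_for_nodes(records, n_nodes)]


def reduce_step(partial_answers):
    """Reduce step: combine the per-node answers into one overall answer."""
    return reduce(lambda a, b: a + b, partial_answers, Counter())


# Hypothetical social-media comments (illustrative data only).
comments = [
    "great service great price",
    "slow service",
    "great product",
    "price too high",
]

word_counts = reduce_step(map_step(comments, n_nodes=2))
print(word_counts.most_common(3))
```

Each simulated node sees only its own subset, so no single step has to handle the full data set; the reduce step is the only place where the partial answers meet.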
Data security - the protection of stored data from destructive forces

Business Analytics in Practice
-involves tools as simple as reports and graphs to those that are as sophisticated as optimization, data mining, and simulation.
In practice, companies that apply analytics often follow a trajectory.
advanced analytics
-predictive and prescriptive

Financial Analytics
-applications of analytics in finance are numerous and pervasive.
Predictive models - used to forecast financial performance, to assess the risk of investment portfolios and projects, and to construct financial instruments such as derivatives.
Prescriptive models - used to construct optimal portfolios of investments, to allocate assets, and to create optimal capital budgeting plans.
Simulation - often used to assess risk in the financial sector

Human Resource (HR) Analytics
-relatively new area of application for analytics
The HR function is charged with ensuring that the organization
1. has the mix of skill sets necessary to meet its needs,
2. is hiring the highest-quality talent and providing an environment that retains it,
3. achieves its organizational diversity goals.
Google refers to its HR Analytics function as “people analytics.”

Marketing Analytics
-one of the fastest-growing areas for the application of analytics
-better understanding of consumer behavior through the use of scanner data and data generated from social media has led to an increased interest in marketing analytics.

Health Care Analytics
-The use of analytics in health care is on the increase because of pressure to simultaneously control costs and provide more effective treatment.
-Descriptive, predictive, and prescriptive analytics are used to improve patient, staff, and facility scheduling; patient flow; purchasing; and inventory control
-The use of prescriptive analytics for diagnosis and treatment is relatively new, but it may prove to be the most important application of analytics in health care.

Supply-Chain Analytics
-The optimal sorting of goods, vehicle and staff scheduling, and vehicle routing are all key to profitability for logistics
-Companies can benefit from better inventory and processing control and more efficient supply chains. Analytic tools used in this area span the entire spectrum of analytics.

Analytics for Government and Nonprofits
-Government agencies and other nonprofits have used analytics to drive out inefficiencies and increase the effectiveness and accountability of programs.
-Likewise, nonprofit agencies have used analytics to ensure their effectiveness and accountability to their donors and clients.

Sports Analytics
-has gained considerable notoriety since 2003 when renowned author Michael Lewis published Moneyball.

Web Analytics
-analysis of online activity, which includes visits to web sites and social media

CHAPTER 2

Overview of Using Data: Definitions and Goals
Data - facts and figures collected, analyzed, and summarized for presentation and interpretation
variable - quantity of interest that can take on different values
observation - set of values corresponding to a set of variables
variation - difference in a variable measured over observations (time, customers, items, etc.)
decision variables - under direct control of the decision maker
random variable, or uncertain variable - quantity whose values are not known with certainty

Types of Data

Population and Sample Data
-Data can be categorized in several ways based on how they are collected and the type collected.
Collection of Data
*population - collecting data from the entire population is often not feasible
*sample - collect data from a subset of the population

Quantitative data - numeric; arithmetic operations, such as addition, subtraction, multiplication, and division, can be performed on the data
categorical data - arithmetic operations cannot be performed on the data
-We can summarize categorical data by counting the number of observations or computing the proportions of observations in each category

Cross-sectional data - collected from several entities at the same, or approximately the same, point in time.
Time series - collected over several time periods.
-frequently found in business and economic publications.
-help analysts understand what happened in the past, identify trends over time, and project future levels for the time series.

Sources of Data
statistical studies
*experimental study - a variable of interest is first identified
-Then one or more other variables are identified and controlled or manipulated to obtain data about how these variables influence the variable of interest
Example: learn about how a new drug (another variable) affects blood pressure (variable of interest)

*non-experimental study / observational study
-makes no attempt to control the variables of interest.
Example: survey

Sorting and Filtering Data in Excel
-easily identify patterns

To sort based on Sales (March 2010):
Step 1: Click the Data tab
Step 2: Click Sort
Step 3: Select all the data
Step 4: In the first Sort by dropdown menu, select Sales (March 2010)
Step 5: In the Order dropdown menu, select Largest to Smallest

To view only the sales of Toyota:
Step 1: Click the Data tab
Step 2: Select all the data
Step 3: Click AutoFilter
Step 4: Click on the Filter Arrow in column B, next to Manufacturer
Step 5: Click Select All until all the check marks are cleared
Step 6: Check only Toyota

Conditional Formatting of Data in Excel
-can make it easy to identify data that satisfy certain conditions in a data set.
To highlight the negative values:
Step 1: Click the Home tab
Step 2: Select all the data (the sales data only)
Step 3: Click Conditional Formatting
Step 4: Click Highlight Cells Rules
Step 5: Click Less Than
Step 6: Then type 0% in the box

Creating Distributions from Data
-help summarize many characteristics of a data set by describing how often certain values for a variable appear in that data set. Distributions can be created for both categorical and quantitative data, and they assist the analyst in gauging variation.

frequency distribution - a summary of data that shows the number (frequency) of observations in each of several nonoverlapping classes (bins)

(Just count them; it is tedious, haha.)

Relative Frequency and Percent Frequency Distributions

frequency distribution - number (frequency) of items in each of several nonoverlapping bins
relative frequency = the fraction or proportion of items belonging to a class
relative frequency distribution - a tabular summary of data showing the relative frequency for each bin.
percent frequency distribution
-summarizes the percent frequency of the data for each bin.
-can be used to provide estimates of the relative likelihoods of different values for a random variable

Frequency Distributions for Quantitative Data

The three steps necessary to define the classes for a frequency distribution with quantitative data are as follows:
1. Determine the number of nonoverlapping bins.
2. Determine the width of each bin.
3. Determine the bin limits.

Number of Bins
Bins - formed by specifying the ranges used to group the data.
RECOMMENDED = 5-20 BINS
SMALL NO. OF DATA = 5-6 BINS
LARGE NO. OF DATA = more bins are usually required
-The goal is to use enough bins to show the variation in the data, but not so many that some contain only a few data items
Width of the Bins
larger number of bins = smaller bin width
Bin Limits
must be chosen so that each data item belongs to one and only one class
lower bin limit - smallest possible data value assigned to the class.
upper bin limit - largest possible data value assigned to the class.

Example: four values (12, 14, 14, and 13) belong to the 10–14 bin

histogram - a graphical summary that can be prepared for data previously summarized in either a frequency, a relative frequency, or a percent frequency distribution.
horizontal axis = variable of interest
vertical axis = frequency
Skewness - lack of symmetry; an important characteristic of the shape of a distribution

cumulative frequency distribution
-uses the number of classes, class widths, and class limits developed for
the frequency distribution
-shows the number of data items with values less than or equal to the upper class limit of each class

Measures of Location
*Mean (Arithmetic Mean)
-a measure of central location for the data; the average
*Median - value in the middle when the data are arranged in ascending order (smallest to largest value)
*Mode - value that occurs most frequently in a data set
*geometric mean - a measure of location that is calculated by finding the nth root of the product of n values.
-often used in analyzing growth rates in financial data. In these types of situations, the arithmetic mean or average value will provide misleading results.

Measures of Variability
*Range - simplest measure of variability
*Variance - measure of variability that utilizes all the data.
-based on the deviation about the mean, which is the difference between the value of each observation and the mean.
*Standard Deviation - defined to be the positive square root of the variance
*coefficient of variation - how large the standard deviation is relative to the mean.
-usually expressed as a percentage.

Analyzing Distributions
*percentile - value of a variable at which a specified (approximate) percentage of observations are below that value. The pth percentile tells us the point in the data where approximately p% of the observations have values less than the pth percentile; hence, approximately (100 − p)% of the observations have values greater than the pth percentile.

*Quartiles
-It is often desirable to divide data into four parts, with each part containing approximately one-fourth, or 25 percent, of the observations. These division points are referred to as the quartiles and are defined as follows:
Q1 = first quartile, or 25th percentile
Q2 = second quartile, or 50th percentile (also the median)
Q3 = third quartile, or 75th percentile
interquartile range, or IQR - difference between the third and first quartiles
*z-score - allows us to measure the relative location of a value in the data set. More specifically, a z-score helps us determine how far a particular value is from the mean relative to the data set’s standard deviation.
*Empirical Rule - used to determine the percentage of data values that are within a specified number of standard deviations of the mean. Many, but not all, distributions of data found in practice exhibit a symmetric bell-shaped distribution.

Identifying Outliers - one or more observations with unusually large or unusually small values
-Standardized values (z-scores) can be used to identify outliers. Recall that the empirical rule allows us to conclude that for data with a bell-shaped distribution, almost all the data values will be within 3 standard deviations of the mean. Hence, in using z-scores to identify outliers, we recommend treating any data value with a z-score less than −3 or greater than +3 as an outlier.

Box Plots
-a graphical summary of the distribution of data. A box plot is developed from the quartiles for a data set

Measures of Association Between Two Variables

scatter chart - a useful graph for analyzing the relationship between two variables
*Covariance is a descriptive measure of the linear association between two variables.
*correlation coefficient measures the relationship between two variables, and, unlike covariance, the relationship between two variables is not affected by the units of measurement for x and y.

CHAPTER 3
Effective Design Techniques

*data-ink ratio - One of the most helpful ideas for creating effective tables and charts for data visualization
*Tables - The first decision in displaying data is whether a table or a chart will be more effective. In general, charts can often convey information faster and easier to readers, but in some cases a table is more appropriate. Tables should be used when:
1. reader needs to refer to specific numerical values.
2. reader needs to make precise comparisons between different values and not just relative comparisons.
3. values being displayed have different units or very different magnitudes.
*crosstabulation, which provides a tabular summary of data for two variables.
*Charts (or graphs) are visual methods for displaying data.
*A scatter chart is a graphical presentation of the relationship between two quantitative variables. As an illustration, consider the advertising/sales relationship for an electronics store in San Francisco. On 10 occasions during the past three months, the store used weekend television commercials to promote sales at its stores.
*Line charts are similar to scatter charts, but a line connects the points in the chart. Line charts are very useful for time series data collected over a period of time (minutes, hours, days, years, etc.)
*Bar and column charts
Bar charts use horizontal bars to display the magnitude of the quantitative variable.
Column charts use vertical bars to display the magnitude of the quantitative variable.
-Bar and column charts are very helpful in making comparisons between categorical variables.
*Pie charts are another common form of chart used to compare categorical data. However, many experts argue that pie charts are inferior to bar charts for comparing data.
*bubble chart is a graphical means of visualizing three variables in a two-dimensional graph and is therefore sometimes a preferred alternative to a 3-D graph.
*heat map is a two-dimensional graphical representation of data that uses different shades of color to indicate magnitude.

Advanced Data Visualization

Advanced Charts
*parallel-coordinates plot, includes a different vertical axis for each variable. Each observation in the data set is represented by drawing a line on the parallel-coordinates plot connecting each vertical axis. The height of the line on each vertical axis represents the value taken by that observation for the variable corresponding to the vertical axis.
*A treemap is useful for visualizing hierarchical data along multiple dimensions.
*geographic information system (GIS), which merges maps and statistics to present data collected over different geographic areas. Displaying geographic data on a map can often help in interpreting data and observing patterns.

data dashboard is a data-visualization tool that illustrates multiple metrics and automatically updates these metrics as new data become available.
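
As a closing illustration for this chapter, the crosstabulation described under Tables (a tabular summary of data for two variables) can be sketched in a few lines of code. The records below are hypothetical, made up only for this example, and the `crosstab` helper is an assumed name rather than anything from the notes.

```python
from collections import Counter

# Hypothetical observations of two categorical variables:
# (quality rating, meal price band). Illustrative data only.
records = [
    ("Good", "$10-19"), ("Very Good", "$20-29"), ("Good", "$20-29"),
    ("Excellent", "$30-39"), ("Very Good", "$10-19"), ("Good", "$10-19"),
    ("Excellent", "$20-29"), ("Very Good", "$20-29"),
]


def crosstab(pairs):
    """Count how many observations fall in each (row, column) combination."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    return rows, cols, table


rows, cols, table = crosstab(records)

# Print the table with row totals so specific values can be read off directly.
print(f"{'Quality':<10}" + "".join(f"{c:>8}" for c in cols) + f"{'Total':>8}")
for r in rows:
    cells = "".join(f"{table[r][c]:>8}" for c in cols)
    print(f"{r:<10}{cells}{sum(table[r].values()):>8}")
```

A table like this suits the case the notes describe: the reader needs specific numerical values and precise comparisons rather than a general visual impression.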
