
# Data, Models and Decisions

PGP 13-15

## Why Study Statistics?

Decision makers use statistics to:
Present and describe business data and information properly
Draw conclusions about large populations, using information collected from samples

## Why Collect Data?

A marketing research analyst needs to assess the ...
... process to find out whether the quality of product being manufactured is conforming to company standards.
... company in order to determine whether the company is in compliance with generally accepted accounting principles.

Types of Statistics
Statistics
The branch of mathematics that transforms data into useful information for decision makers.

Descriptive Statistics
Describing data

Inferential Statistics
Drawing conclusions and/or making decisions concerning a population based only on sample data

Descriptive Statistics
Collect data
ex. Survey

Present data
ex. Tables and graphs

Characterize data
ex. Sample mean = $\bar{X} = \dfrac{\sum X_i}{n}$

Inferential Statistics
Estimation
ex. Estimate the population mean weight using the sample average weight
Hypothesis testing
ex. Test the claim that the population average weight is 65 kg
Drawing conclusions and/or making decisions concerning a population based on sample results.

## Basic Vocabulary of Statistics

VARIABLE
A variable is a characteristic of an item or individual.
DATA
Data are the different values associated with a variable.
POPULATION
A population consists of all the items or individuals about which you want to draw a
conclusion.
SAMPLE
A sample is the portion of a population selected for analysis.
PARAMETER
A parameter is a numerical measure that describes a characteristic of a population.
STATISTIC
A statistic is a numerical measure that describes a characteristic of a sample.

Population
Measures used to describe the population are called parameters

Sample
Measures computed from sample data are called statistics

Sources of Data
Primary Sources: The data collector is the one using the data for analysis
Data from a political survey
Data collected from an experiment
Observed data

Secondary Sources: The person performing data analysis is not the data collector
Analyzing census data
Examining data from print journals or data published on the internet.

Types of Variables
Categorical (qualitative) variables have values that can only be placed into categories, such as yes and no.
Numerical (quantitative) variables have values that represent quantities.

Types of Data
Data
Categorical (defined categories). Examples: Marital Status, Political Party, Eye Color
Numerical, Discrete (counted items). Examples: Number of Children, Defects per hour
Numerical, Continuous (measured characteristics). Examples: Weight, Voltage

Probability
Empirical classic probability
Based on historical data
Computed after performing the experiment
Number of times an event occurred divided by the number of trials
Objective -- everyone correctly using the method assigns an identical probability

Subjective probability
Different individuals may (correctly) assign different numeric probabilities

Mutually Exclusive event
Collectively Exhaustive event
Equally Likely event

Random Variable
A random variable x takes on a defined set of values with different probabilities.

For example, if you roll a die, the outcome is random (not fixed) and there are 6 possible outcomes, each of which occurs with probability one-sixth.
For example, if you poll people about their voting preferences, the percentage of the sample that responds Yes on Proposition 100 is also a random variable (the percentage will be slightly different every time you poll).

Roughly, probability is how frequently we expect different outcomes to occur if we repeat the experiment over and over (frequentist view)
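The frequentist idea can be illustrated with a small simulation. This is only a sketch: the seed and the number of rolls are arbitrary choices made for the illustration, not values from the slides.

```python
import random
from collections import Counter

# Fixed seed (arbitrary) so that repeated runs give the same numbers.
rng = random.Random(42)
n_rolls = 60_000  # arbitrary, just "large"

# Repeat the experiment "roll a fair die" many times and count each face.
counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))

# Relative frequency of each face; each should settle near 1/6 = 0.1667.
rel_freq = {face: counts[face] / n_rolls for face in range(1, 7)}

for face, freq in rel_freq.items():
    print(face, round(freq, 3))
```

With more rolls the relative frequencies drift closer to one-sixth, which is the sense in which probability describes long-run frequency.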

Random variables can be discrete or continuous
Discrete random variables have a countable number of outcomes
Continuous random variables have an infinite continuum of possible values.
Examples: blood pressure, weight, the speed of a car, the real numbers from 1 to 6.

Probability functions
A probability function maps the possible values of x against their respective probabilities of occurrence, p(x)
p(x) is a number from 0 to 1.0.
The area under a probability function is always 1.

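Discrete example: roll of a die (each outcome x = 1, ..., 6 has p(x) = 1/6)

$$\sum_{\text{all } x} p(x) = 1$$

| x | p(x) |
|---|---|
| 1 | p(x=1) = 1/6 |
| 2 | p(x=2) = 1/6 |
| 3 | p(x=3) = 1/6 |
| 4 | p(x=4) = 1/6 |
| 5 | p(x=5) = 1/6 |
| 6 | p(x=6) = 1/6 |
| Total | 1.0 |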

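## Cumulative distribution function (CDF)

For the roll of a die, the cumulative probability P(x ≤ A) rises in steps of 1/6 from 1/6 to 1.0:

| x | P(x ≤ A) |
|---|---|
| 1 | P(x ≤ 1) = 1/6 |
| 2 | P(x ≤ 2) = 2/6 |
| 3 | P(x ≤ 3) = 3/6 |
| 4 | P(x ≤ 4) = 4/6 |
| 5 | P(x ≤ 5) = 5/6 |
| 6 | P(x ≤ 6) = 6/6 |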
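As a minimal sketch, the probability function and the cumulative distribution function of a fair die can be tabulated with exact fractions. The names `pmf` and `cdf` are chosen for this illustration only.

```python
from fractions import Fraction
from itertools import accumulate

# Probability function of a fair six-sided die: every face has p(x) = 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Cumulative distribution function: running total of p(x) up to each face.
cdf = dict(zip(pmf, accumulate(pmf.values())))

for x in pmf:
    print(f"p(x={x}) = {pmf[x]}   P(x<={x}) = {cdf[x]}")
```

`Fraction` prints values in lowest terms, so 2/6 appears as 1/3 and 3/6 as 1/2.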

Practice Problem:
The number of patients seen in the ER in any given hour is a random variable represented by x. The probability distribution for x is:

| x | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|
| P(x) | .4 | .2 | .2 | .1 | .1 |

Find the probability that in a given hour:

a. exactly 14 patients arrive

p(x=14) = .1

p(x ≤ 11) = (.4 + .2) = .6

Review Question 1
If you toss a die, what's the probability that you roll a 3 or less?
a. 1/6
b. 1/3
c. 1/2
d. 5/6
e. 1.0


Review Question 2
Two dice are rolled and the sum of the face values is six. What is the probability that at least one of the dice came up a 3?
a. 1/5
b. 2/3
c. 1/2
d. 5/6
e. 1.0


How can you get a 6 on two dice?
1-5, 5-1, 2-4, 4-2, 3-3
One of these five has a 3.
So the probability is 1/5 (answer a).
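The same counting argument can be checked by brute-force enumeration. This is a sketch of the reasoning, with variable names invented for the illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

# Condition on the event "the face values sum to six".
sum_is_six = [r for r in rolls if sum(r) == 6]

# Within that reduced sample space, keep outcomes showing at least one 3.
with_a_three = [r for r in sum_is_six if 3 in r]

prob = Fraction(len(with_a_three), len(sum_is_six))
print(sum_is_six)  # [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
print(prob)        # 1/5
```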

Example: Two coins are tossed simultaneously. What is the probability of obtaining a head on the first coin (call event A) and a head on the second coin (call event B)?
Example: A card is drawn from a well shuffled pack of playing cards. What is the probability that it will be either a ...
Example: In a DMD class there are 123 students of which 93 students are males and 30 are females. Of these, 36 males and 18 females plan to major in Marketing. A student is selected at random from this class and it is found that this student plans to be a Marketing major. What is the probability that the student is a male?
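For the DMD class example, one way to sketch the calculation is to restrict the sample space to the students who plan to major in Marketing. The slides do not show a worked solution, so the steps below are an illustration of the standard conditional-probability reasoning rather than the author's own working.

```python
from fractions import Fraction

# Counts taken from the class example.
marketing_males = 36
marketing_females = 18

# Knowing the student is a Marketing major shrinks the sample space
# to the 36 + 18 = 54 students who plan to major in Marketing.
marketing_total = marketing_males + marketing_females

# P(male | Marketing) = P(male and Marketing) / P(Marketing)
p_male_given_marketing = Fraction(marketing_males, marketing_total)
print(p_male_given_marketing)  # 2/3
```

The class size of 123 cancels out of the ratio, which is why it does not appear in the code.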

Continuous case
The probability function that accompanies a
continuous random variable is a continuous
mathematical function that integrates to 1.

For example, recall the negative exponential function (in probability, this is called an exponential distribution):

$$f(x) = e^{-x}$$

This function integrates to 1:

$$\int_0^{\infty} e^{-x}\,dx = \left. -e^{-x} \right|_0^{\infty} = 0 + 1 = 1$$

For example, the probability of x falling within 1 to 2:

$$P(1 \le x \le 2) = \int_1^2 e^{-x}\,dx = \left. -e^{-x} \right|_1^2 = -e^{-2} + e^{-1} = -0.135 + 0.368 = 0.23$$

Clinical example: Survival times after lung transplant may follow an exponential function. Then, the probability that a patient will die in the second year after surgery (between years 1 and 2) is 23%.
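The 23% figure can be checked numerically. The sketch below assumes the density p(x) = e^(-x) used in this example and compares the closed-form answer with a simple midpoint Riemann sum; the step count is an arbitrary choice.

```python
import math

def exp_density(x: float) -> float:
    """Density p(x) = e^(-x) for x >= 0."""
    return math.exp(-x)

def exp_cdf(x: float) -> float:
    """P(X <= x) = 1 - e^(-x), the integral of the density from 0 to x."""
    return 1.0 - math.exp(-x)

# Closed form: P(1 <= X <= 2) = e^(-1) - e^(-2)
closed_form = exp_cdf(2.0) - exp_cdf(1.0)

# Numerical check: midpoint Riemann sum of the density over [1, 2].
steps = 10_000
width = 1.0 / steps
riemann = sum(exp_density(1.0 + (i + 0.5) * width) for i in range(steps)) * width

print(round(closed_form, 4))  # 0.2325
print(round(riemann, 4))
```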

## Expected Value and Variance

All probability distributions are
characterized by an expected value
(mean) and a variance (standard
deviation squared).

## Expected value, formally

Discrete case:

$$E(X) = \sum_{\text{all } x} x_i \, p(x_i)$$

Continuous case:

$$E(X) = \int_{\text{all } x} x \, p(x) \, dx$$
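As an illustration of the discrete formula, the sketch below applies it to the ER-patients distribution from the practice problem earlier in these slides (x = 10 to 14). The variance line uses the usual definition, the probability-weighted squared deviation from the mean, which is not written out on the slides.

```python
# Probability distribution of patients per hour, from the practice problem.
er_pmf = {10: 0.4, 11: 0.2, 12: 0.2, 13: 0.1, 14: 0.1}

# Expected value: sum of each value times its probability.
mean = sum(x * p for x, p in er_pmf.items())

# Variance: probability-weighted squared deviation from the mean.
variance = sum((x - mean) ** 2 * p for x, p in er_pmf.items())

print(round(mean, 2), round(variance, 2))  # 11.3 1.81
```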

A Situation
Acme Fruit and Vegetable Wholesalers buys tomatoes, then sells them to retailers. Acme currently pays ₹ 2000 per container. Tomatoes sold on the same day bring ₹ 5000 per container. Tomatoes are extremely perishable: any container not sold on the same day is worthless and must be disposed of (assume at no cost). The distribution manager's problem is to determine the optimum number of containers he should order each day. On days when he stocks more than he sells, his profit is reduced by the cost of the unsold containers. On the other hand, when retailers request more containers than he has in stock, he loses sales and makes a smaller profit than he could have.

## Developing Pay-off table

Acme currently pays ₹ 2000 per container. Tomatoes sold on the same day bring ₹ 5000 per container. Profit = ₹ 3000 per container.

Pay-off table (in ₹ '00). Rows are EVENTS (Demand D); columns are ACTIONS (Quantity ordered Q).

| EVENTS (Demand) | Q1 = 10 | Q2 = 11 | Q3 = 12 | Q4 = 13 |
|---|---|---|---|---|
| D1 = 10 | 300 | 280 | 260 | 240 |
| D2 = 11 | 300 | 330 | 310 | 290 |
| D3 = 12 | 300 | 330 | 360 | 340 |
| D4 = 13 | 300 | 330 | 360 | 390 |

When D ≥ Q, P = 30 Q and when D < Q, P = 30 D - 20 (Q - D)
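The pay-off rule can be sketched as a small function that regenerates the table. The function and variable names are invented for this illustration; the amounts are in units of 100 rupees, as in the table.

```python
def payoff(demand: int, ordered: int) -> int:
    """Conditional profit in units of 100 rupees.

    Each container sold earns 30 (5000 - 2000 = 3000 rupees);
    each unsold container loses its purchase cost of 20 (2000 rupees).
    """
    sold = min(demand, ordered)
    unsold = ordered - sold
    return 30 * sold - 20 * unsold

quantities = [10, 11, 12, 13]

# One row per demand level, one column per order quantity.
table = {d: [payoff(d, q) for q in quantities] for d in quantities}

for d, row in table.items():
    print(f"D = {d}: {row}")
```

Writing the rule with `min(demand, ordered)` covers both branches of the formula: when demand meets or exceeds the order nothing is left unsold, so the profit reduces to 30 Q.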

## Probability of Occurrence principle

Let us suppose the manager kept a record of his sales for the past 100 days.

| Daily Sales | Number of days sold | Probability of each number being sold |
|---|---|---|
| D1 = 10 | 15 | 0.15 |
| D2 = 11 | 20 | 0.20 |
| D3 = 12 | 40 | 0.40 |
| D4 = 13 | 25 | 0.25 |

## The expected value (EV)

The expected value of decision alternative $d_i$ is defined as:

$$EV(d_i) = \sum_{j=1}^{N} P(s_j)\, V_{ij}$$

where:
N = the number of states of nature
P(s_j) = the probability of state of nature s_j
V_ij = the payoff corresponding to decision alternative d_i and state of nature s_j

## Expected profit from stocking 10 containers

ACTION (Quantity ordered is 10)

| EVENTS (Demand) | Conditional profit (1) | Probability of selling (2) | Expected profit = (1) x (2) |
|---|---|---|---|
| D1 = 10 | 300 | 0.15 | 45 |
| D2 = 11 | 300 | 0.20 | 60 |
| D3 = 12 | 300 | 0.40 | 120 |
| D4 = 13 | 300 | 0.25 | 75 |
| Total EV | | | 300 |

## Expected profit from stocking 11 containers

ACTION (Quantity ordered is 11)

| EVENTS (Demand) | Conditional profit (1) | Probability of selling (2) | Expected profit = (1) x (2) |
|---|---|---|---|
| D1 = 10 | 280 | 0.15 | 42 |
| D2 = 11 | 330 | 0.20 | 66 |
| D3 = 12 | 330 | 0.40 | 132 |
| D4 = 13 | 330 | 0.25 | 82.5 |
| Total EV | | | 322.50 |

## Expected profit from stocking 12 containers

ACTION (Quantity ordered is 12)

| EVENTS (Demand) | Conditional profit (1) | Probability of selling (2) | Expected profit = (1) x (2) |
|---|---|---|---|
| D1 = 10 | 260 | 0.15 | 39 |
| D2 = 11 | 310 | 0.20 | 62 |
| D3 = 12 | 360 | 0.40 | 144 |
| D4 = 13 | 360 | 0.25 | 90 |
| Total EV | | | 335 |

## Expected profit from stocking 13 containers

ACTION (Quantity ordered is 13)

| EVENTS (Demand) | Conditional profit (1) | Probability of selling (2) | Expected profit = (1) x (2) |
|---|---|---|---|
| D1 = 10 | 240 | 0.15 | 36 |
| D2 = 11 | 290 | 0.20 | 58 |
| D3 = 12 | 340 | 0.40 | 136 |
| D4 = 13 | 390 | 0.25 | 97.5 |
| Total EV | | | 327.50 |
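The four expected-profit tables can be reproduced in a few lines. This is a sketch under the assumptions of the Acme example (margin of 30 and cost of 20 per container, in units of 100 rupees, and the 100-day sales record); the names are invented for the illustration. Day counts are used instead of decimal probabilities so the arithmetic stays exact.

```python
# Number of days (out of 100) on which each demand level was observed.
DAYS_OBSERVED = {10: 15, 11: 20, 12: 40, 13: 25}
TOTAL_DAYS = sum(DAYS_OBSERVED.values())

def payoff(demand: int, ordered: int) -> int:
    """Conditional profit in units of 100 rupees: 30 per sale, -20 per unsold."""
    sold = min(demand, ordered)
    return 30 * sold - 20 * (ordered - sold)

def expected_value(ordered: int) -> float:
    """EV of an order quantity: probability-weighted sum of conditional profits."""
    weighted = sum(days * payoff(demand, ordered)
                   for demand, days in DAYS_OBSERVED.items())
    return weighted / TOTAL_DAYS

ev = {q: expected_value(q) for q in DAYS_OBSERVED}
best_q = max(ev, key=ev.get)

print(ev)      # {10: 300.0, 11: 322.5, 12: 335.0, 13: 327.5}
print(best_q)  # 12
```

Under the expected-value criterion, ordering 12 containers gives the highest expected profit of the four alternatives.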

## Important discrete probability distribution: The binomial

## The Binomial Distribution: Properties

A fixed number of observations, n
ex. 15 tosses of a coin; ten light bulbs taken from a warehouse
Two mutually exclusive and collectively exhaustive categories
ex. head or tail in each toss of a coin; defective or not defective light bulb; having a boy or girl
Generally called success and failure
Probability of success is p, probability of failure is 1 - p
Constant probability for each observation
The outcome of one observation does not affect the outcome of the other

Two sampling methods:
Infinite population without replacement
Finite population with replacement

Binomial distribution
Take the example of 5 coin tosses. What's the probability that you flip exactly 3 heads in 5 coin tosses?

## Binomial distribution, generally

Note the general pattern emerging: if you have only two possible outcomes (call them 1/0 or yes/no or success/failure) in n independent trials, then the probability of exactly X successes =

$$\binom{n}{X} p^{X} (1-p)^{n-X}$$

n = number of trials
X = # successes out of n trials
p = probability of success
1-p = probability of failure
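A minimal sketch of the general formula, applied to the earlier question about exactly 3 heads in 5 coin tosses. The function name `binomial_pmf` is invented for this illustration.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) counts the ways to place x successes among n trials.
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Exactly 3 heads in 5 tosses of a fair coin: 10 arrangements out of 32.
p_three_heads = binomial_pmf(3, 5, 0.5)
print(p_three_heads)  # 0.3125
```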

## Binomial distribution: example

If I toss a coin 20 times, what's the probability of getting exactly 10 heads?

$$\binom{20}{10} (.5)^{10} (.5)^{10} = .176$$

## Binomial distribution: example

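If I toss a coin 20 times, what's the probability of getting 2 or fewer heads?

$$\binom{20}{0}(.5)^{0}(.5)^{20} = \frac{20!}{20!\,0!}(.5)^{20} = 9.5 \times 10^{-7}$$

$$\binom{20}{1}(.5)^{1}(.5)^{19} = \frac{20!}{19!\,1!}(.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$$

$$\binom{20}{2}(.5)^{2}(.5)^{18} = \frac{20!}{18!\,2!}(.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$$

Adding the three terms gives approximately $2.0 \times 10^{-4}$.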
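Both 20-toss answers can be checked with a short script. It is a sketch that re-defines a small helper (a name invented here) so that it runs on its own.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Exactly 10 heads in 20 tosses of a fair coin.
p_ten = binomial_pmf(10, 20, 0.5)

# 2 or fewer heads: add the probabilities of 0, 1 and 2 heads.
terms = [binomial_pmf(x, 20, 0.5) for x in range(3)]
p_two_or_fewer = sum(terms)

print(round(p_ten, 3))          # 0.176
print(f"{p_two_or_fewer:.2e}")  # 2.01e-04
```

The three terms share the factor (.5)^20, so the sum is (1 + 20 + 190) / 2^20 = 211 / 1048576.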

All probability distributions are characterized by an expected value and a variance:
If X follows a binomial distribution with parameters n and p: X ~ Bin (n, p)

Mean: $\mu = E(x) = np$
Variance: $\sigma^2 = np(1 - p)$
Standard deviation: $\sigma = \sqrt{np(1 - p)}$

Where n = sample size
p = probability of success
(1 - p) = probability of failure

Applications
A manufacturing plant labels items as either defective or acceptable
A firm bidding for contracts will either get a contract or not
A marketing research firm receives survey responses of "yes I will buy" or "no I will not"
New job applicants either accept the offer or reject it
Your team either wins or loses the football game at the company picnic

## The Hypergeometric Distribution

The binomial distribution is applicable when selecting from a finite population with replacement or from an infinite population without replacement.
The hypergeometric distribution is applicable when selecting from a finite population without replacement.

## The Hypergeometric Distribution

$$P(X) = \frac{\binom{A}{X}\binom{N-A}{n-X}}{\binom{N}{n}}$$

Where
N = population size
A = number of successes in the population
N - A = number of failures in the population
n = sample size
X = number of successes in the sample
n - X = number of failures in the sample

## The Hypergeometric Distribution

Example
3 different computers are checked from 10 in the department. 4 of the 10 computers have illegal software loaded. What is the probability that 2 of the 3 selected computers have illegal software loaded?
So, N = 10, n = 3, A = 4, X = 2

$$P(X = 2) = \frac{\binom{A}{X}\binom{N-A}{n-X}}{\binom{N}{n}} = \frac{\binom{4}{2}\binom{6}{1}}{\binom{10}{3}} = \frac{(6)(6)}{120} = 0.3$$
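The computer-audit example can be verified with a short sketch of the hypergeometric formula. The function name and its argument order are choices made for this illustration.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x: int, pop: int, successes: int, sample: int) -> Fraction:
    """P(X = x) when drawing `sample` items without replacement from a
    population of `pop` items, `successes` of which count as successes."""
    ways_successes = comb(successes, x)                # choose x of the A successes
    ways_failures = comb(pop - successes, sample - x)  # choose n - x of the N - A failures
    return Fraction(ways_successes * ways_failures, comb(pop, sample))

# N = 10 computers, A = 4 with illegal software, n = 3 checked, X = 2 found.
p_two = hypergeom_pmf(2, pop=10, successes=4, sample=3)
print(p_two, float(p_two))  # 3/10 0.3
```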

## The Hypergeometric Distribution

Characteristics
The mean of the hypergeometric distribution is:

$$\mu = E(x) = \frac{nA}{N}$$

The standard deviation is:

$$\sigma = \sqrt{\frac{nA(N - A)}{N^2}} \cdot \sqrt{\frac{N - n}{N - 1}}$$

Where $\sqrt{\dfrac{N - n}{N - 1}}$ is called the Finite Population Correction Factor

## The Poisson Distribution: Definitions

An area of opportunity is a continuous unit or interval of time, volume, or such area in which more than one occurrence of an event can occur.
ex. The number of scratches in a car's paint
ex. The number of mosquito bites on a person
ex. The number of computer crashes in a day

## The Poisson Distribution: Properties

Apply the Poisson Distribution when:
You wish to count the number of times an event occurs in a given area of opportunity
The probability that an event occurs in one area of opportunity is the same for all areas of opportunity
The number of events that occur in one area of opportunity is independent of the number of events that occur in the other areas of opportunity
The probability that two or more events occur in an area of opportunity approaches zero as the area of opportunity becomes smaller
The average number of events per unit is λ (lambda)

## The Poisson Distribution: Formula

$$P(X) = \frac{e^{-\lambda}\lambda^{X}}{X!}$$

where:
P(X) = the probability of X events in an area of opportunity
λ = expected number of events
e = mathematical constant approximated by 2.71828

An example
Suppose that, on average, 5 cars enter a parking lot per minute. What is the probability that in a given minute, 7 cars will enter?

$$P(7) = \frac{e^{-\lambda}\lambda^{X}}{X!} = \frac{e^{-5}\,5^{7}}{7!} = 0.104$$

So, there is a 10.4% chance 7 cars will enter the parking lot in a given minute.

Mean = Variance = λ
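The parking-lot calculation can be reproduced with a small sketch of the Poisson formula. The function name `poisson_pmf` is invented for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the expected number is lam."""
    return exp(-lam) * lam**x / factorial(x)

# On average 5 cars per minute; probability that exactly 7 enter in a minute.
p_seven = poisson_pmf(7, 5.0)
print(round(p_seven, 3))  # 0.104
```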