
Lamb, MBA612, Lecture 1, Fall 2016, On campus
Wednesday, August 24, 2016

Reading assignment from the text associated with Lecture 1


Those topics are found in Business Statistics: A Decision-Making Approach
I will only review those that have the double star.
Please be familiar with all the topics.
Topic                                                      Chapter(s)
Rules of Summation & Double Summation Notation **          Instructor
Data Collection                                            1
Describing Your Data, Graphs, Tables, Charts               2
Numerical Measures                                         3
Measures of Central Tendency **                            3
Variation and Shape                                        3
Population Measures
  Mean **                                                  3
  Variance, Standard Deviation **                          3
Sample measurements (statistics)
  Sample Mean **                                           3
  Sample variance and sample standard deviation **         3
Basic Probability                                          4
Discrete Probability Distributions                         5
Mean, Standard Deviation & Variance of a
  Discrete Random Variable **                              5
Binomial Distribution **                                   5
The Continuous Distributions **                            Instructor, 6
Normal Distribution **                                     6
Normal Approximation to the Binomial **                    Instructor
Sampling Distributions for the mean and proportion **      7

Lecture 1
Four Rules of Summation

i      Xi     5Xi
1)     20     100
2)     18      90
3)     22     110
4)     25     125
5)     15      75
Sum   100     500

$$\sum_{i=1}^{5} X_i = 100$$

$\sum_{i=1}^{5} 5X_i = 500$. The above approach uses the case-by-case method. That is, for each value
of i, the value of $5X_i$ is found and then these values are summed.


Rule #1

$$\sum_{i=1}^{n} K X_i = K \sum_{i=1}^{n} X_i$$

500 = 5(100) = 500

Here the expression is simplified by pulling the constant out of the expression.

Rule #2

$$\sum (X_i + Y_i) = \sum X_i + \sum Y_i$$

A summation expression can be broken down into its parts.

Likewise, $\sum (X_i - Y_i) = \sum X_i - \sum Y_i$.

i      Xi     Yi     Xi + Yi
1)     20      2      22
2)     18      1      19
3)     22      0      22
4)     25      2      27
5)     15      1      16
Sum   100      6     106

Here, again, we have found the value of the expression case by case, then summed.
Also, $\sum (X_i + Y_i) = \sum X_i + \sum Y_i = 100 + 6 = 106$. Here we have simplified the expression
and found the same answer.
Rule #3

The sum of a constant n times is

$$\sum_{i=1}^{n} K = nK$$

Rule #4

$$\sum_{i=1}^{n} X_i Y_i$$

i      Xi     Yi     XiYi
1)     20      2      40
2)     18      1      18
3)     22      0       0
4)     25      2      50
5)     15      1      15
Sum   100      6     123

$\sum X_i Y_i = X_1Y_1 + X_2Y_2 + \dots + X_5Y_5 = 123$

$$\sum_{i=1}^{n} X_i Y_i \neq \left(\sum X_i\right)\left(\sum Y_i\right)$$

$123 \neq 100 \times 6$
That is, one cannot find the value of the quantity on the left-hand side of the
expression by using the expression on the right-hand side, and vice versa.
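The four rules can be checked numerically. The sketch below is only an illustration, not part of the original notes: it reuses the Xi and Yi columns from the tables above and takes K = 5 as in the Rule #1 example; the variable names are assumptions made here.

```python
# Data from the lecture tables (cases 1 through 5).
X = [20, 18, 22, 25, 15]
Y = [2, 1, 0, 2, 1]
K = 5          # the constant used in the Rule #1 example
n = len(X)

# Rule #1: a constant factor can be pulled out of the sum.
rule1_lhs = sum(K * x for x in X)
rule1_rhs = K * sum(X)

# Rule #2: the sum of (Xi + Yi) splits into two separate sums.
rule2_lhs = sum(x + y for x, y in zip(X, Y))
rule2_rhs = sum(X) + sum(Y)

# Rule #3: summing a constant n times gives n * K.
rule3_lhs = sum(K for _ in range(n))
rule3_rhs = n * K

# Rule #4: the sum of products is NOT the product of the sums.
sum_of_products = sum(x * y for x, y in zip(X, Y))
product_of_sums = sum(X) * sum(Y)

print(rule1_lhs, rule1_rhs)              # 500 500
print(rule2_lhs, rule2_rhs)              # 106 106
print(rule3_lhs, rule3_rhs)              # 25 25
print(sum_of_products, product_of_sums)  # 123 600
```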
The following is a formula for sample variance. As one moves from the first formula
to the second formula, one uses all the rules of summation.

$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$$

$$S^2 = \frac{n \sum X_i^2 - \left(\sum X_i\right)^2}{n(n-1)}$$
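As a quick numerical check that the two variance formulas agree, here is a minimal sketch. It borrows the Xi column from the summation tables purely as illustrative data, and the function names are hypothetical. Exact fractions are used so that rounding does not hide a mismatch.

```python
from fractions import Fraction

def sample_variance_definition(data):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(data)
    mean = Fraction(sum(data), n)
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def sample_variance_shortcut(data):
    """[n * sum(x^2) - (sum x)^2] / [n (n - 1)]."""
    n = len(data)
    return Fraction(n * sum(x * x for x in data) - sum(data) ** 2, n * (n - 1))

X = [20, 18, 22, 25, 15]  # illustrative data taken from the tables above
v1 = sample_variance_definition(X)
v2 = sample_variance_shortcut(X)
print(v1, v2)  # 29/2 29/2
```

Both routes give 14.5 for this data set; the second form avoids computing the mean first.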

Statistics

$\bar{X}$ is the sample mean:

$$\bar{X} = \frac{\sum X_i}{n}$$

This is used for sample data.

Property of this Statistic

$E(\bar{X}) = \mu$. This means the expected value of the sample mean is the population
mean.

Parameter

Definition of Parameter: $\mu$ is the average value of all X.
This formula is used when you have all the data and not a sample of data.

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$

$\mu = E(X)$ is again defined as the population average value of x.

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

The standard error of the mean

[Figure: distribution of sample means centered on the population mean]

Central Limit Theorem: Sample means are normally distributed about the
population mean. The standard deviation of sample means is called the standard
error. More about this later. Note the picture above.

Double Summation Notation

An example: calculate $\sum_{j=1}^{2} \sum_{i=1}^{5} X_{ij}$ using the following data.

i/j    j = 1   j = 2
1)      20      30
2)      22      32
3)      18      28
4)      25      35
5)      15      25
Sum    100     150

$$\sum_{j=1}^{2} \sum_{i=1}^{5} X_{ij} = \sum_{i=1}^{5} X_{i1} + \sum_{i=1}^{5} X_{i2}$$

= (X11 + X21 + X31 + X41 + X51) + (X12 + X22 + X32 + X42 + X52)
= 20 + 22 + 18 + 25 + 15 + 30 + 32 + 28 + 35 + 25
= 100 + 150 = 250
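The same double sum can be written as a nested loop. This is a small sketch using the table above; storing the data as a list of rows (one row per i, one column per j) is just one convenient layout chosen here.

```python
# Rows are i = 1..5, columns are j = 1..2, as in the table above.
X = [
    [20, 30],
    [22, 32],
    [18, 28],
    [25, 35],
    [15, 25],
]

# Inner sum over i for each fixed j, then outer sum over j.
column_totals = [sum(row[j] for row in X) for j in range(2)]
double_sum = sum(column_totals)

print(column_totals)  # [100, 150]
print(double_sum)     # 250
```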


Measures of Central Tendency

Mean      $\bar{X} = \sum X / n$
Mode      value that occurs most frequently
Median    value that is in the middle once you arrange them in order

Median: arrange the data in order from smallest to largest.

5, 8, 12, 16, 30, 3,000,000

Position of the median: (n + 1)/2. With n = 6 the position is 3.5, so the median is
the average of the two middle values, 12 and 16, which is 14.

The median is not affected by extreme values; however, the mean is affected by
extreme values.

[Figure: right-skewed distribution with the mode, median, and mean marked]

The mean is affected by extreme values. Assume that in this example the data is
nonsymmetrical and skewed to the right.
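To see how strongly one extreme value pulls the mean while leaving the median nearly alone, here is a short sketch using the six values from the median example and Python's standard `statistics` module. Dropping the last value is an extra comparison added here for illustration.

```python
import statistics

data = [5, 8, 12, 16, 30, 3_000_000]  # already sorted, as in the example

median_position = (len(data) + 1) / 2   # 3.5, so average the 3rd and 4th values
median = statistics.median(data)        # (12 + 16) / 2
mean = statistics.mean(data)

# Same data without the extreme value, for comparison.
trimmed = data[:-1]
trimmed_median = statistics.median(trimmed)
trimmed_mean = statistics.mean(trimmed)

print(median_position, median, round(mean, 2))
print(trimmed_median, trimmed_mean)
```

The median only moves from 12 to 14 when the extreme value is included, while the mean jumps from about 14 to about half a million.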

Variance
A statistic is always from a sample.

Statistic

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$

Property of this Statistic

$E(s^2) = \sigma^2$; this statistic is unbiased.

Parameter

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

Definition of Parameter

$\sigma^2 = E(X - \mu)^2$

Standard Deviation

$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

This is the picture of a variable that is normally distributed with mean $\mu$ and
standard deviation $\sigma$.

[Figure: normal curve centered at $\mu$]

Note, the picture should be symmetrical about $\mu$.

The point one standard deviation from the mean ($\mu \pm \sigma$) is the inflection point: that
point on the curve at which the shape of the curve changes from concave to convex.

Suppose a variable were normally distributed, such as student heights with
$\mu = 57$ and $\sigma = 4$ inches, so that $\mu - \sigma = 53$ and $\mu + \sigma = 61$.
If heights are normally distributed, then about two-thirds of the students will fall
within this range (53 to 61 inches). Recall that about 2/3 of the observations fall within one
standard deviation of the mean, if the variable is normally distributed.

This is the picture of the distribution of sample means; they are normally distributed
about $\mu$, with standard deviation $\sigma/\sqrt{n}$. The central limit theorem is in effect.

[Figure: normal curve of sample means centered at $\mu$]

Sample means are normally distributed. Again, the curve should be symmetrical
(sorry, I flunked art).


$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

The standard error of the sample mean

A larger sample used to obtain the sample mean will tend to bring the sample
means closer to $\mu$.

The Central Limit Theorem is in effect.

Area under the curve equals 1.

Discrete Random Variable: a variable that takes on a limited number of values.

Balanced Die

Roll       You get X    P(x)
1, 2       $10          2/6
3, 4, 5    $20          3/6
6          $100         1/6

Average winnings per roll = $\mu = E(x)$

To find the mean of a discrete random variable, use the following formula:

$$\mu = E(X) = \sum X_i P(X_i)$$

Play the game 6 times. What game? Rolling a balanced die.

Roll       X       P(X)    X × #    X P(x)
1, 2       $10     2/6      20      20/6
3, 4, 5    $20     3/6      60      60/6
6          $100    1/6     100     100/6
Total                      180     180/6

180/6 = $30 average winnings (Mean)
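The expected-value calculation can be reproduced in a few lines. This is a sketch only; the dictionary layout mapping each payoff to its probability is an assumption made for the illustration.

```python
from fractions import Fraction

# Payoff in dollars -> probability, from the balanced-die table above.
payoffs = {
    10: Fraction(2, 6),    # roll 1 or 2
    20: Fraction(3, 6),    # roll 3, 4, or 5
    100: Fraction(1, 6),   # roll 6
}

total_probability = sum(payoffs.values())
expected_winnings = sum(x * p for x, p in payoffs.items())

print(total_probability)   # 1
print(expected_winnings)   # 30
```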


To find the variance of a discrete random variable, use the following formula.

$$\sigma^2 = E(X - \mu)^2 = \sum (X_i - \mu)^2 P(X_i)$$

Roll       X       P(X)    (Xi - mu)^2    (Xi - mu)^2 P(Xi)
1, 2       $10     2/6        400            800/6
3, 4, 5    $20     3/6        100            300/6
6          $100    1/6      4,900          4,900/6
Total                                      6,000/6

$\sigma^2 = 1000$
$\sigma = \$31.62$
If we were to play the game 16 times, the expected average winnings per roll would be $30
(the population mean, or expected winnings per roll), and the standard deviation of that
mean would be

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{31.62}{\sqrt{16}} = \frac{31.62}{4} \approx 7.91$$

This is the standard deviation of the mean (called the standard error).

[Figure: distribution of sample means centered at 30]
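A short sketch of the variance and standard-error arithmetic for the die game follows. It assumes the same payoff table as above and n = 16 plays; the variable names are illustrative.

```python
import math
from fractions import Fraction

# Payoff in dollars -> probability, from the balanced-die game.
payoffs = {10: Fraction(2, 6), 20: Fraction(3, 6), 100: Fraction(1, 6)}

mu = sum(x * p for x, p in payoffs.items())                    # 30
variance = sum((x - mu) ** 2 * p for x, p in payoffs.items())  # 1000
sigma = math.sqrt(variance)                                    # about 31.62

n = 16                                     # number of plays averaged together
standard_error = sigma / math.sqrt(n)      # about 7.91

print(mu, variance, round(sigma, 2), round(standard_error, 2))
```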

Binomial Distribution
1) Successive independent trials with only 2 outcomes per trial:
   Yes or No
   Heads or Tails
   Buy or Not Buy
2) Probability of success remains constant throughout the experiment.


Random variable
If the distribution follows a binomial, then this is the formula:

$$P(X = x) = \frac{n!}{(n-x)!\,x!}\, p^x q^{n-x}$$

n = # of trials
p = probability of success on any single trial
q = (1 - p)
x = specific # of successes

For example:
n = 5, this is the number of trials
p = 2/6 = 1/3
x = 2

$$P(X = 2) = \frac{5!}{3!\,2!} \left(\frac{1}{3}\right)^2 \left(\frac{2}{3}\right)^3 = 10 \times \frac{1}{9} \times \frac{8}{27} = \frac{80}{243}$$

$\mu = \sum X_i P(X_i)$
Shortcut for mean: $\mu = np$ (number of trials × probability of success)
Variance shortcut for a binomial: $\sigma^2 = npq$
$5 \times 2/6 \times 4/6 = 10/9 \approx 1.11$
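The binomial formula and both shortcuts can be checked with exact fractions. The sketch below assumes n = 5 and p = 1/3 as in the example; `binomial_pmf` is a hypothetical helper name, and `math.comb` supplies n!/((n-x)! x!).

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 5, Fraction(1, 3)
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# Mean and variance from the general discrete formulas.
mean = sum(x * prob for x, prob in pmf.items())
variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())

print(pmf[2])                     # 80/243
print(mean, n * p)                # 5/3 5/3
print(variance, n * p * (1 - p))  # 10/9 10/9
```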
In order for us to recognize that a problem can be modeled by the binomial
distribution, we must have n independent trials with only 2 relevant outcomes per
trial. It is also the case that the binomial distribution can be approximated by the
normal distribution if certain conditions are met: np > 5 and nq > 5. In the following
example the conditions for the normal approximation would be met.

n = 400
p = 1/2
And therefore np = 200 (the mean).
The variance is equal to npq, or 400 × 1/2 × 1/2 = 100, and the standard
deviation would equal $\sigma = \sqrt{100} = 10$ (with $\sigma^2 = 100$).

[Figure: normal curve centered at 200 with 170 and 230 marked]

Rarely will you be outside the range 170 to 230, which is three standard deviations
on either side of the mean.
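To put a number on "rarely", the exact binomial probability of landing between 170 and 230 can be computed directly. This is a sketch assuming n = 400 and p = 1/2 as in the example; because p = 1/2, every sequence of outcomes has probability 1/2^400, so integer counts are enough.

```python
from math import comb, sqrt

n, p = 400, 0.5
mean = n * p                       # 200
sd = sqrt(n * p * (1 - p))         # 10
low, high = mean - 3 * sd, mean + 3 * sd   # 170 and 230

# Count the outcomes with 170 <= X <= 230 and divide by the 2**n total.
favorable = sum(comb(n, k) for k in range(170, 231))
prob_inside = favorable / 2 ** n

print(mean, sd, low, high)
print(prob_inside)   # a little above 0.99
```

The conditions np > 5 and nq > 5 hold comfortably here (both equal 200).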

Discrete Distribution

X      P(X)
0      1/16
1      4/16
2      6/16
3      4/16
4      1/16

$0 \le P(X_i) \le 1$

For a formula to be a discrete probability distribution, the probabilities calculated
have to be between 0 and 1 and the sum of the probabilities has to be 1.

$\sum P(X_i) = 1$
Minimum sum of two dice = 2
Maximum sum of two dice = 12

$$P(x) = \frac{6 - |7 - x|}{36}, \quad \text{where } x = 2, 3, 4, \dots, 12$$

P(2) = 1/36

x      p(x)
2      1/36
3      2/36
4      3/36
5      4/36
6      5/36
7      6/36
8      5/36
9      4/36
10     3/36
11     2/36
12     1/36

Sum of the two dice

Green/Red    1    2    3    4    5    6
1            2    3    4    5    6    7
2            3    4    5    6    7    8
3            4    5    6    7    8    9
4            5    6    7    8    9   10
5            6    7    8    9   10   11
6            7    8    9   10   11   12
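The formula for the sum of two dice can be compared with a brute-force count over all 36 green/red combinations. This sketch is illustrative only; the names `enumerated` and `formula` are chosen here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how often each sum occurs over the 36 equally likely (green, red) pairs.
counts = Counter(green + red for green, red in product(range(1, 7), repeat=2))
enumerated = {x: Fraction(c, 36) for x, c in counts.items()}

# The closed-form rule P(x) = (6 - |7 - x|) / 36 for x = 2, ..., 12.
formula = {x: Fraction(6 - abs(7 - x), 36) for x in range(2, 13)}

print(enumerated == formula)   # True
print(sum(formula.values()))   # 1
```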

Normal Distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$$

The only inputs into this equation are $\mu$ and $\sigma$.

$(x - \mu)$ is squared, so whether the deviation is positive or negative doesn't matter.
$\mu$ determines the central location.
Area under the curve = 1

$\mu = 100$
$\sigma = 10$
Find the area between 90 and 103.
$X \sim N(\mu, \sigma)$; change into a z statistic: $Z \sim N(0, 1)$
Z transformation (Z scores)


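Z is the number of standard deviations that an observation is away from its mean.

$$Z = \frac{X - \mu}{\sigma}$$

For X = 90, Z = (90 - 100)/10 = -1; for X = 103, Z = (103 - 100)/10 = .3.

[Figure: normal curve marked at 90, 100, and 103 (Z = -1, 0, .3), with areas .3413, .1179, and .0228 labeled]

Answer = .3413 + .1179 = .4592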
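The table lookup can be cross-checked with the standard library's `statistics.NormalDist` (available in Python 3.8 and later). This is a sketch assuming $\mu = 100$ and $\sigma = 10$ as in the example.

```python
from statistics import NormalDist

mu, sigma = 100, 10
X = NormalDist(mu=mu, sigma=sigma)

# Z scores for the two boundaries.
z_low = (90 - mu) / sigma     # -1.0
z_high = (103 - mu) / sigma   # 0.3

# Area between 90 and 103, computed on the original scale...
area = X.cdf(103) - X.cdf(90)
# ...and again on the standard normal scale; the two should match.
area_z = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

print(z_low, z_high)
print(round(area, 4), round(area_z, 4))
```

The result agrees with the table answer of .4592 to within rounding of the table entries.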
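Prove that $\mu_z = 0$.

Z is defined as $Z = \dfrac{X - \mu_x}{\sigma_x}$

$$\mu_z = E\left(\frac{x - \mu_x}{\sigma_x}\right) = \frac{E(x) - \mu_x}{\sigma_x} = \frac{\mu_x - \mu_x}{\sigma_x} = 0$$

Variance of Z

$$\sigma_z^2 = E(z - \mu_z)^2 = E(z^2) = E\left(\frac{X_i - \mu_x}{\sigma_x}\right)^2 = \frac{E(X_i - \mu_x)^2}{\sigma_x^2} = \frac{\sigma_x^2}{\sigma_x^2} = 1$$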

Central Limit Theorem: Sample means are normally distributed. If the distribution of the
original variable is symmetric, even a small sample size will generate sample means that are
approximately normally distributed. If the distribution of the original variable is nonsymmetric,
a larger sample size will be necessary before the distribution of sample means becomes normal.
An Exercise to demonstrate the Central Limit Theorem:

X      P(X)    (X - mu_x)^2 P(X)
1      1/3         4/3
3      1/3         0
5      1/3         4/3

$\mu_x = \sum X_i P(X_i) = 3$
$\sigma_x^2 = E(X - \mu_x)^2 = \sum (X_i - \mu_x)^2 P(X_i) = 8/3$

Sample Size 2

[Figure: distribution of the sample mean over the values 1, 2, 3, 4, 5 with probabilities 1/9, 2/9, 3/9, 2/9, 1/9]

Notice that with a sample size of only two, the distribution of sample means can be
approximated by the normal distribution. That is to say, sample means quickly take on the shape
of the normal even when the sample size is very small. Notice, however, that the original
distribution of the variable was symmetrical. If the underlying distribution is highly skewed, then
it would take a much larger sample size for the distribution of sample means to become normal.

Sample    Xbar
1, 1       1
1, 3       2
1, 5       3
3, 1       2
3, 3       3
3, 5       4
5, 1       3
5, 3       4
5, 5       5

Xbar    P(Xbar)    Xbar P(Xbar)    (Xbar - mu)^2    (Xbar - mu)^2 P(Xbar)
1        1/9          1/9               4                 4/9
2        2/9          4/9               1                 2/9
3        3/9          9/9               0                 0
4        2/9          8/9               1                 2/9
5        1/9          5/9               4                 4/9
Total                 27/9 = 3                            12/9 = 4/3

$E(\bar{X}) = \mu_x$ (the population mean, regardless of sample size)

$E(\bar{X}) = \sum \bar{X} P(\bar{X}) = 3$

$\sigma_{\bar{X}}^2 = E(\bar{X} - \mu)^2 = \sum (\bar{X} - \mu)^2 P(\bar{X}) = 4/3$

$\sigma_{\bar{X}}^2 = \sigma_x^2 / n = (8/3)/2 = 4/3$

If you increase the sample size, the variance of $\bar{X}$ will decrease, even though the number
of possible values that $\bar{X}$ may take on will increase. There will be more possible
values, but the extreme values will have a low probability.

$$\sigma_{\bar{X}}^2 = \frac{\sigma_x^2}{n} \quad \text{or} \quad \sigma_{\bar{X}} = \frac{\sigma_x}{\sqrt{n}}$$
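The sample-size-2 exercise can be reproduced by listing every possible sample. The sketch below assumes sampling with replacement from the equally likely values 1, 3, 5, as in the exercise; the helper names `sampling_distribution` and `mean_and_variance` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def sampling_distribution(values, n):
    """Distribution of the sample mean over all ordered samples of size n,
    drawn with replacement from equally likely values."""
    samples = list(product(values, repeat=n))
    counts = Counter(Fraction(sum(s), n) for s in samples)
    return {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

def mean_and_variance(dist):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

values = [1, 3, 5]
population = {x: Fraction(1, 3) for x in values}
pop_mu, pop_var = mean_and_variance(population)   # 3 and 8/3

dist2 = sampling_distribution(values, 2)
mu2, var2 = mean_and_variance(dist2)              # 3 and 4/3

print(pop_mu, pop_var)
print({int(m): str(p) for m, p in dist2.items()})
print(mu2, var2, pop_var / 2)
```

The variance of the sample means equals the population variance divided by the sample size, matching the table above.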

Lamb's constructed problems, Lecture 1, Part A

Problem #1

$$\sum_{i=1}^{5} (X_i + 6Y_i)$$

Work
a.) case by case
b.) simplify

i      Xi     Yi
1)     20      2
2)     18      1
3)     22      0
4)     25      2
5)     15      1
Sum   100      6

Problem #2

$$\sum_{i=1}^{5} (X_i - 20)$$

Work
a.) case by case
b.) simplify

i      Xi     Yi
1)     20      2
2)     18      1
3)     22      0
4)     25      2
5)     15      1
Sum   100      6

Problem #3

$$\sum_{i=1}^{5} (3X_iY_i - 4X_i)$$

Work
a.) case by case
b.) simplify

i      Xi     Yi
1)     20      2
2)     18      1
3)     22      0
4)     25      2
5)     15      1
Sum   100      6

Problem #4
Prove, using the rules of summation, that:

$$\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1} = \frac{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)}$$

Problem #5
Find $s^2$ using both formulas and the data below. You should get the same answer for
each.

Formula #1

$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$$

Formula #2

$$S^2 = \frac{n \sum X_i^2 - \left(\sum X_i\right)^2}{n(n-1)}$$

Data: 30, 30, 10, 15, 25

Problem #6
Flip a coin 4 times. Xi = the number of heads.

[Figure: tree diagram of heads (H) and tails (T) outcomes]

Xi     P(Xi)
0      1/16
1      4/16
2      6/16
3      4/16
4      1/16

Find the mean: $\mu = \sum X_i P(X_i)$
Find the variance: $\sigma^2 = \sum (X_i - \mu)^2 P(X_i)$

Problem #7
Binomial Distribution
a) Determine the probabilities for all possible values of x, given n = 5,
p = 1/3, q = 2/3.
b) Then find the mean using $\mu = \sum X_i P(X_i)$.
c) Then find the variance using $\sigma^2 = \sum (X_i - \mu)^2 P(X_i)$.
d) Use the shortcut formula for the mean of a binomial, $\mu = np$, and compare
your answer to the answer found in b.
e) Use the shortcut formula for the variance of a binomial, $\sigma^2 = npq$, and
compare your answer to the answer found in c.

X      P(X)
0
1
2      80/243
3
4
5

Problem #8
Redo the exercise that I did in class to demonstrate the Central Limit
Theorem, using a sample size of 3 instead of 2.

A) Graph the distribution of the variable.
B) Find the population mean, the variance, and the standard
deviation of the variable.
C) Find and list all possible samples of size three, and calculate
the mean for each sample, as was done above for a sample of
size two.
D) Write down the probability distribution of those means, and
then find the mean of those means using $\sum \bar{X} P(\bar{X})$.
E) Also find the variance of those means, sample of size 3, using
$\sigma_{\bar{X}}^2 = \sum (\bar{X} - \mu)^2 P(\bar{X})$.
F) Now, also calculate $\sigma_{\bar{X}}^2 = \sigma_x^2 / n$.

Problems from the textbook, Lecture 1, Part A

1) Page 110-111. These problems emphasize the concept of
variance (population as well as sample). Do problems 27, 28
and 30.
2) Page 187. Discrete random variables. Do problems 2, 6 and 7.
3) Page 200-201. Binomial distribution. Do problems 22, 23, 33
and 34.
4) Page 237. Normal distribution. Do problems 11, 13, 14 and 15.
5) Page 277. The standard error of the mean. Do problems 26, 29,
and 30.
6) Page 285. The standard error of the proportion. This may be
new material for you. Do problems 48 and 50.