Pre-Tutor MSM 2018 PDF
Decision making
What is the best choice?
Market conditions
High demand Average demand Low demand
Data Types
Data
  Qualitative (Categorical): defined categories
    Examples: Marital Status, Political Party, Eye Color
  Quantitative (Numerical)
    Discrete: counted items
      Examples: Number of Children, Defects per hour
    Continuous: measured characteristics
      Examples: Weight, Voltage
Data, Data Sets, Elements, Variables, and Observations
Variables: Stock Exchange, Annual Sales, Earnings per Share

Company        Stock Exchange   Annual Sales ($M)   Earnings per Share ($)
Dataram        AMEX              73.10              0.86
EnergySouth    OTC               74.00              1.67
Keystone       NYSE             365.70              0.86
LandCare       NYSE             111.40              0.33
Psychemedics   AMEX              17.60              0.13
Data Types
Cross-Sectional Data
DATA TYPE AND LEVELS
Populations and Samples
Population vs. Sample
[Figure: a population containing the elements a through z, beside a sample drawn from it containing the elements b, c, g, i, n, o, r, u, y]
Inferential Procedures
Making statements about a population by
examining sample results
Sample statistics (known) -> inference -> Population parameters (unknown, but can be estimated from sample evidence)
Present data
e.g., Charts and graphs
Characterize data
e.g., Sample mean = \sum x_i / n
Example: Hudson Auto Repair
Tabular Summary (Frequencies and Percent Frequencies)

Parts Cost ($)   Frequency   Percent Frequency
50-59                2               4
60-69               13              26
70-79               16              32
80-89                7              14
90-99                7              14
100-109              5              10
Total               50             100
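
The percent frequencies in the table follow directly from the class frequencies: each is 100 times the class frequency divided by the total. A minimal Python sketch of that arithmetic is shown here; the variable names are illustrative only, and the running total it also builds is the cumulative percent frequency.

```python
# Class frequencies from the Hudson Auto Repair table (parts cost in dollars).
frequencies = {
    "50-59": 2,
    "60-69": 13,
    "70-79": 16,
    "80-89": 7,
    "90-99": 7,
    "100-109": 5,
}

total = sum(frequencies.values())  # total number of observations

# Percent frequency = 100 * (class frequency / total frequency).
percent = {cls: 100 * f / total for cls, f in frequencies.items()}

# Cumulative percent frequency: running sum of the percent frequencies.
cumulative = {}
running = 0.0
for cls, p in percent.items():
    running += p
    cumulative[cls] = running

for cls in frequencies:
    print(f"{cls:>8} {frequencies[cls]:>4} {percent[cls]:>6.0f} {cumulative[cls]:>6.0f}")
```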
Example: Hudson Auto Repair
Graphical Summary (Histogram)
[Figure: histogram of the parts-cost data. Vertical axis: Frequency, ticks from 2 to 18. Horizontal axis: Parts Cost ($), class boundaries at 50, 60, 70, 80, 90, 100, 110.]
Shape of a Distribution
Describes how data is distributed
Symmetric or skewed
The greater the difference between the mean and the median,
the more skewed the distribution
Example: Hudson Auto Repair
Ogive with Cumulative Percent Frequencies
[Figure: ogive. Vertical axis: Cumulative Percent Frequency, ticks from 20 to 100. Horizontal axis: Parts Cost ($), from 50 to 110.]
Descriptive Statistics: Numerical Methods
Measures of Location
Measures of Variability
Measures of Relative Location and Detecting Outliers
Measures of Location
Mean
Median
Mode
Percentiles
Quartiles
Example: Apartment Rents
Mean
\[ \bar{x} = \frac{\sum x_i}{n} = \frac{34{,}356}{70} = 490.80 \]
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
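
As a check on the arithmetic, the mean can be recomputed from the 70 rents listed above. This is a small Python sketch; the variable names are chosen for illustration.

```python
# The 70 apartment rents from the slide, already in ascending order.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)       # sample size
total = sum(rents)   # sum of all observations
x_bar = total / n    # sample mean

print(n, total, round(x_bar, 2))
```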
Example: Apartment Rents
Median
Median = 50th percentile
i = (p/100)n = (50/100)70 = 35
Averaging the 35th and 36th data values:
Median = (475 + 475)/2 = 475
Example: Apartment Rents
Mode
450 occurred most frequently (7 times)
Mode = 450
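
The mode can be found by counting how often each rent occurs. The sketch below uses Python's `collections.Counter` on the same 70 rents; the variable names are illustrative.

```python
from collections import Counter

# The 70 apartment rents from the slide.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

counts = Counter(rents)  # maps each rent to its number of occurrences

# most_common(1) returns a one-item list holding the most frequent value
# together with its count.
mode_value, mode_count = counts.most_common(1)[0]

print(mode_value, mode_count)
```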
Example: Apartment Rents
90th Percentile
i = (p/100)n = (90/100)70 = 63
Averaging the 63rd and 64th data values:
90th Percentile = (580 + 590)/2 = 585
Example: Apartment Rents
Third Quartile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which is not an integer, so round up to 53
Third quartile = 525
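
The median, 90th percentile, and third quartile all use the same rule: compute i = (p/100)n; if i is an integer, average the values in positions i and i + 1, otherwise round i up and take the value in that position. A Python sketch of that rule follows. The function name `percentile` is hypothetical, and it assumes a whole-number p strictly between 0 and 100 so the position can be computed with exact integer arithmetic.

```python
def percentile(data, p):
    """p-th percentile by the slide's rule; p is an integer with 0 < p < 100."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    values = sorted(data)
    n = len(values)
    # i = (p/100) * n, split into whole part and remainder to avoid
    # floating-point rounding when testing whether i is an integer.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # i is an integer: average positions i and i + 1 (1-based).
        return (values[whole - 1] + values[whole]) / 2
    # i is not an integer: round up and take that position (1-based).
    return values[whole]


rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(percentile(rents, 50))  # median
print(percentile(rents, 90))  # 90th percentile
print(percentile(rents, 75))  # third quartile
```

Other software may use a different interpolation rule and return slightly different percentiles for the same data.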
Measures of Variability
Range
Interquartile Range
Variance
Standard Deviation
Coefficient of Variation
Variance
The variance is the average of the squared differences
between each data value and the mean.
If the data set is a sample, the variance is denoted by s^2.
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} \]
If the data set is a population, the variance is denoted
by \sigma^2.
\[ \sigma^2 = \frac{\sum (x_i - \mu)^2}{N} \]
Example: Apartment Rents
Variance
\[ s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1} = 2{,}996.16 \]
Standard Deviation
\[ s = \sqrt{s^2} = \sqrt{2996.16} = 54.74 \]
Coefficient of Variation
\[ \frac{s}{\bar{x}} \times 100 = \frac{54.74}{490.80} \times 100 = 11.15 \]
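
The three measures can be recomputed from the raw rents. The Python sketch below applies the sample formulas (divisor n - 1); computed directly from the 70 values, the sample variance comes to about 2,996.16, the standard deviation to about 54.74, and the coefficient of variation to about 11.15. Variable names are illustrative.

```python
import math

# The 70 apartment rents from the slides.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

n = len(rents)
x_bar = sum(rents) / n

# Sum of squared deviations from the sample mean.
sum_sq_dev = sum((x - x_bar) ** 2 for x in rents)

s2 = sum_sq_dev / (n - 1)   # sample variance
s = math.sqrt(s2)           # sample standard deviation
cv = s / x_bar * 100        # coefficient of variation, in percent

print(round(s2, 2), round(s, 2), round(cv, 2))
```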
Variation
Smaller value
Less variation
Larger value
More variation
Same center,
different variation
Constructing the
[Figure: box plot on an axis from 375 to 625, marking the lower limit, first quartile, median, third quartile, and upper limit, with outliers shown as asterisks (*)]
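
The slide does not state how the lower and upper limits are computed. A common convention, assumed here and not taken from the slides, places them 1.5 interquartile ranges below the first quartile and above the third quartile, and treats values outside the limits as outliers. The Python sketch below applies that assumed rule to the apartment rents, locating the quartiles by the round-up position rule used for the third quartile earlier.

```python
# The 70 apartment rents from the slides.
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

values = sorted(rents)

# Quartile positions: i = (25/100)70 = 17.5 rounds up to the 18th value,
# i = (75/100)70 = 52.5 rounds up to the 53rd value (1-based positions).
q1 = values[18 - 1]
q3 = values[53 - 1]
iqr = q3 - q1  # interquartile range

# ASSUMPTION: limits at 1.5 * IQR beyond the quartiles.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Observations outside the limits are flagged as outliers.
outliers = [x for x in values if x < lower_limit or x > upper_limit]

print(q1, q3, iqr, lower_limit, upper_limit, outliers)
```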
Introduction to Probability
Experiments, Counting Rules, and
Assigning Probabilities
Events and Their Probability
Some Basic Relationships of Probability
Conditional Probability
Assigning Probabilities
Classical Method
Assigning probabilities based on the assumption of
equally likely outcomes.
Relative Frequency Method
Assigning probabilities based on experimentation or
historical data.
Subjective Method
Assigning probabilities based on the assignor’s
judgment.
Classical Method
Example
Experiment: Rolling a die
Sample Space: S = {1, 2, 3, 4, 5, 6}
Probabilities: Each sample point has a 1/6 chance of
occurring.
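
Under the classical method, an event's probability is the sum of the equal probabilities of its sample points. A brief Python sketch for the die example follows; the helper `event_probability` and the "even number" event are illustrative additions, not part of the slide.

```python
from fractions import Fraction

# Sample space for rolling a die.
sample_space = [1, 2, 3, 4, 5, 6]

# Classical method: every sample point is equally likely.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}


def event_probability(event):
    """Probability of an event, given as a collection of sample points."""
    return sum(prob[outcome] for outcome in event)


even = {2, 4, 6}  # hypothetical event: an even number is rolled
print(prob[1], event_probability(even))
```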
Relative Frequency Method
\[ C_n^N = \binom{N}{n} = \frac{N!}{n!\,(N - n)!} \]
\[ P(x) = \frac{C_x^X \cdot C_{n-x}^{N-X}}{C_n^N} \]
Where
N = population size
X = number of successes in the population
n = sample size
x = number of successes in the sample
n – x = number of failures in the sample
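
With the symbols defined above, P(x) is a ratio of combination counts, which is the form of the hypergeometric probability function. The Python sketch below evaluates it with `math.comb`; the function name and the numbers in the example (N = 10, X = 4, n = 3) are hypothetical and chosen only to illustrate the calculation.

```python
from math import comb


def hypergeom_pmf(x, N, X, n):
    """P(x): probability of x successes in a sample of n drawn without
    replacement from a population of N that contains X successes."""
    ways_successes = comb(X, x)           # choose x of the X successes
    ways_failures = comb(N - X, n - x)    # choose n - x of the N - X failures
    ways_samples = comb(N, n)             # all possible samples of size n
    return ways_successes * ways_failures / ways_samples


# Hypothetical example: 10 items, 4 of them successes, sample of 3.
N, X, n = 10, 4, 3
for x in range(n + 1):
    print(x, round(hypergeom_pmf(x, N, X, n), 4))
```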
Sampling and Sampling Distributions
n = 30
Properties of a Sampling Distribution
Theorem 2
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]
Also called the standard error
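
The standard error shrinks with the square root of the sample size. The short Python sketch below evaluates it for n = 30; the population standard deviation of 12 is a hypothetical value used only for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma divided by sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 12.0  # hypothetical population standard deviation
print(round(standard_error(sigma, 30), 4))
# Quadrupling the sample size halves the standard error.
print(round(standard_error(sigma, 120), 4))
```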
z-value for Sampling Distribution of \bar{x}
[Figure: three intervals, each centered on a sample mean \bar{x}]
Interval Estimate of a Population Mean:
With \sigma Known
\[ \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]
where: 1 - \alpha is the confidence coefficient
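
A minimal Python sketch of this interval follows, using `statistics.NormalDist` from the standard library to obtain the z value. The function name and the numbers in the example (sample mean 82, population standard deviation 20, n = 100) are hypothetical.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Interval estimate of a population mean when sigma is known."""
    alpha = 1 - confidence
    # z value that leaves an upper-tail area of alpha/2.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical inputs, for illustration only.
low, high = z_interval(x_bar=82, sigma=20, n=100, confidence=0.95)
print(round(low, 2), round(high, 2))
```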
With \sigma Unknown
The sample standard deviation, s, is used as the point
estimate of the population standard deviation.
\[ \bar{x} \pm t_{\alpha/2;\,df} \frac{s}{\sqrt{n}} \]
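
The same calculation with s in place of the population standard deviation can be sketched as below. Python's standard library has no t distribution, so this sketch takes the critical value as an argument looked up from a t table; the function name and the sample figures (mean 50, s = 8, n = 16) are hypothetical, and 2.131 is the tabled t value for an upper-tail area of 0.025 with 15 degrees of freedom.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Interval estimate of a population mean when sigma is unknown.

    t_crit is the t value for upper-tail area alpha/2 with the
    appropriate degrees of freedom, supplied by the caller.
    """
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample: n = 16, so 15 degrees of freedom; 95% confidence.
low, high = t_interval(x_bar=50, s=8, n=16, t_crit=2.131)
print(round(low, 3), round(high, 3))
```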
Summary of Test Statistics to be Used in a
Hypothesis Test about a Population Mean
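n > 30?
  Yes: \sigma known?
    Yes: \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
    No: use s to estimate \sigma: \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}
  No: population approximately normal?
    Yes: \sigma known?
      Yes: \bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}
      No: use s to estimate \sigma: \bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}}
    No: increase n to > 30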
Hypothesis Testing
Developing Null and Alternative Hypotheses
Type I and Type II Errors
One-Tailed Tests About a Population Mean:
Large-Sample Case
Two-Tailed Tests About a Population Mean:
Large-Sample Case
Tests About a Population Mean:
Small-Sample Case
Developing Null and Alternative Hypotheses
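Level of significance = \alpha
[Figure: rejection regions under the standard normal curve. One-tailed tests place area \alpha in a single tail, beyond -z_{\alpha} or z_{\alpha}; the two-tailed test places area \alpha/2 in each tail, beyond -z_{\alpha/2} and z_{\alpha/2}.]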
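
The critical values that bound the rejection regions depend only on the level of significance. The Python sketch below computes them with `statistics.NormalDist`; the level of 0.05 is a hypothetical choice for illustration.

```python
from statistics import NormalDist

alpha = 0.05  # hypothetical level of significance

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# One-tailed test: the whole area alpha lies in a single tail.
z_alpha = std_normal.inv_cdf(1 - alpha)

# Two-tailed test: area alpha/2 lies in each tail.
z_half_alpha = std_normal.inv_cdf(1 - alpha / 2)

print(round(z_alpha, 3), round(z_half_alpha, 3))
```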
Population Condition
H0 True Ha True
Conclusion ( ) ( )
Calculating the Correlation Coefficient
Sample correlation coefficient:
\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}} \]
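
The formula can be evaluated term by term, as in the Python sketch below. The function name and the five (x, y) pairs are hypothetical and serve only to show the steps.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient r for paired observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of cross-products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = sample_correlation(xs, ys)
print(round(r, 4))
```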
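Population regression model:
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
where
y = dependent variable
\beta_0 = population y intercept
\beta_1 = population slope coefficient
x = independent variable
\varepsilon = random error term, or residual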
ANOVA

             df   SS           MS           F         Significance F
Regression    1   18934.9348   18934.9348   11.0848   0.01039
Residual      8   13665.5652    1708.1957
Total         9   32600.5000
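
The MS and F columns follow from the df and SS columns: each mean square is a sum of squares divided by its degrees of freedom, and F is the ratio of the regression mean square to the residual mean square. A short Python sketch of that arithmetic, with illustrative variable names, is given below. Significance F would additionally require the F distribution, which is outside Python's standard library, so it is not recomputed here.

```python
# Degrees of freedom and sums of squares taken from the ANOVA table.
df_regression, df_residual = 1, 8
ss_regression, ss_residual = 18934.9348, 13665.5652

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic = regression mean square / residual mean square.
f_stat = ms_regression / ms_residual

# The Total row is the sum of the Regression and Residual rows.
df_total = df_regression + df_residual
ss_total = ss_regression + ss_residual

print(round(ms_residual, 4), round(f_stat, 4), df_total, round(ss_total, 4))
```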