
1
A contingency table is a way to display frequency distributions between 2 or more categorical variables.
It is a method to compare categorical variables.
2
Joint dist: important for probabilities
  e.g. % of total people in sample that are female
Marginal dist: look to margins of table
  e.g. marginal proportion of people accepted
Conditional dist: one instance is dependent on another
  see words like "given", "out of", "for X" (e.g. "for males")
  e.g. conditional proportion of rejected
3
Bar charts and pie charts display frequencies from contingency tables
4
[Table residue: counts of people who like ice cream, rides, toffees (4, 10, 4, 2; total 20)]
1
notated Aitkin gig
Σ = sum (sigma)
Centre
  mean: average of obs, $\bar{y} = \frac{\sum y_i}{n}$
  median: midpoint of obs ordered by size (small to large)
    odd n: median = centre obs
    even n: median = avg of the two centre obs
  mode: most frequent obs
Spread
  variance: $s^2 = \frac{\sum (y_i - \bar{y})^2}{n - 1}$
  std dev: $s = \sqrt{s^2}$
  range, IQR
8
e
g 3 7891
2 569 10 5s

Stop and think: Which measures of centre and spread are sensitive to outliers / skewness?
2
Sample of # of days experiencing COVID-19 symptoms: 1 2 5 5 10
mean: $\bar{y}$ = (1 + 2 + 5 + 5 + 10)/5 = 4.6 days
Variance: $s^2$ = [(1 - 4.6)² + (2 - 4.6)² + (5 - 4.6)² + (5 - 4.6)² + (10 - 4.6)²] / (5 - 1)
  = 49.2 / 4    (49.2 = sum of squares, SS)
  = 12.3
Std dev: s = √12.3 = 3.5 days
mean = avg = centre
Std dev = distance from mean = spread
In this context: understand severity of virus and its effects
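The arithmetic in this example can be checked with a short Python sketch. The function name `sample_summary` is made up for illustration; the data are the five values from the notes, and the variance uses the n - 1 divisor as in the worked solution.

```python
from math import sqrt

def sample_summary(obs):
    """Return (mean, sum of squares, sample variance, sample SD)."""
    n = len(obs)
    mean = sum(obs) / n
    # Sum of squared deviations from the mean (SS)
    ss = sum((y - mean) ** 2 for y in obs)
    # Sample variance divides SS by n - 1
    variance = ss / (n - 1)
    return mean, ss, variance, sqrt(variance)

days = [1, 2, 5, 5, 10]  # days with symptoms, from the notes
mean, ss, variance, sd = sample_summary(days)
print(f"mean={mean:.1f}, SS={ss:.1f}, variance={variance:.1f}, SD={sd:.1f}")
```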
3 Dist of a quant variable
graph displays frequency
[Sketches: right-skewed and left-skewed distributions]
right skew: median < mean
left skew: mean < median
3 17 17 17 22 25 25 29    n = 8
mean = 19.375
median = 19.5

Maish
1
Z score = standardized value
cannot compare apples to oranges, or °C to °F, w/o standardizing
tells us how many SD an observation is from the mean, i.e. z = (x - μ)/σ
  where x = obs, μ = mean, σ = SD
gives a basis to compare values from distributions with different means and SDs
2
Standard normal tables give us Z scores and the corresponding area under the normal curve
e.g. if z = 1.52, proportion under curve?
ANSWER: 0.9357 or 93.57%
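As a rough cross-check of the table lookup, the area to the left of a z score can be computed from the error function in Python's standard library. The helper names `z_score` and `standard_normal_cdf` are illustrative, not from the course.

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Number of SDs that observation x lies from the mean."""
    return (x - mu) / sigma

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

area = standard_normal_cdf(1.52)
print(round(area, 4))
```

The printed value should agree with the Z table entry for z = 1.52 to four decimal places.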

0
a Tx
alwaysunder
III stand
3
aka empirical rule
based on normal distribution with mean μ and SD σ
68% of obs w/in 1 SD of μ
95% of obs w/in 2 SD of μ
99.7% of obs w/in 3 SD of μ
[Sketch: normal curve, axis ticks 2, 4, 6, 8, 10, 12]

4
multiplying each obs in a data set by value b
  multiplies mean by b and SD by |b|
addition of value a to each obs in a data set
  adds a to mean but not to SD
e.g. conversion of temperature from °C to °F
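A small sketch of the shift-and-scale rule, using the °C to °F conversion (multiply by 1.8, add 32). The temperature values below are hypothetical and only there to make the example runnable.

```python
from statistics import mean, stdev

# Hypothetical temperatures in degrees Celsius (not from the notes)
celsius = [36.0, 36.5, 37.0, 37.5, 38.0]
fahrenheit = [1.8 * c + 32 for c in celsius]

mean_c, sd_c = mean(celsius), stdev(celsius)
mean_f, sd_f = mean(fahrenheit), stdev(fahrenheit)

# The mean is scaled and shifted; the SD is only scaled
print(mean_f, 1.8 * mean_c + 32)
print(sd_f, 1.8 * sd_c)
```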
REVIEW
1
μ = 3.5, σ = 0.6
a) z = (x - μ)/σ = (3.7 - 3.5)/0.6 = 0.333
b) z = (2.9 - 3.5)/0.6 = -1
c) First is above mean, second is below mean

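2
μ = 15.5, σ = 0.7
a) z = (x - μ)/σ = (13 - 15.5)/0.7 = -3.57
Using Z table: P = 0.0002 = 0.02%
very small number
b) z = -3.57 for x = 13; z = (16 - 15.5)/0.7 = 0.71 for x = 16
P = 0.0002 (z = -3.57)
Using Z table: P = 0.7611 (z = 0.71)
P(13 < x < 16) = P(x < 16) - P(x < 13)
  = 0.7611 - 0.0002
  = 0.7609
OR
1 - P(x > 16) - P(x < 13)
  = 1 - (1 - 0.7611) - 0.0002
  = 0.7609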
3
mean: multiply by 1.8 and add 32 = 96.6
SD: only multiply by 1.8 = 0.9
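The Z table steps in review question 2 (μ = 15.5, σ = 0.7, area between 13 and 16) can be sketched with `statistics.NormalDist` from the standard library. The z scores are rounded to two decimals here on purpose, to mimic a table lookup; that rounding is a choice made for this sketch.

```python
from statistics import NormalDist

mu, sigma = 15.5, 0.7
std_normal = NormalDist()  # mean 0, SD 1

# Round z to two decimals, as one would before using a Z table
z_low = round((13 - mu) / sigma, 2)
z_high = round((16 - mu) / sigma, 2)

p_low = std_normal.cdf(z_low)    # area to the left of x = 13
p_high = std_normal.cdf(z_high)  # area to the left of x = 16
p_between = p_high - p_low

print(z_low, z_high, round(p_between, 4))
```

Without the rounding step the answer differs slightly in the third decimal place, since the table only resolves z to two decimals.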
1

2
Direction: positive, negative or neither
  +ve: x and y increase together; -ve: y decreases as x increases
Strength: weak, moderate or strong
Linearity: linear or non-linear
[Sketches: example scatterplots for each description]
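Scatterplots, correlation, linear regression
Scatterplots: relationship of 2 quant variables, visualized
Correlation: want to understand correlation of x and y
  r = Pearson's correlation coefficient
  direction and strength of linear relationship
  $r = \frac{1}{n - 1} \sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$
  where
    x, y = obs
    $\bar{x}$, $\bar{y}$ = mean
    n = sample size
    $s_x$, $s_y$ = sample SD

PROPERTIES of r
  needs 2 quantitative variables
  no units
  +ve or -ve
  between -1 and +1
  strength increases as r approaches -1 or +1
  not resistant to outliers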
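A minimal sketch of Pearson's r computed from standardized values, using sample SDs with the n - 1 divisor. The function name `pearson_r` and the five (x, y) pairs are hypothetical and only serve as an illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: average product of standardized x and y values."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample SDs (n - 1 divisor)
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical data, not from the notes
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```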
x y
Linear regression = least squares regression
LOBF (line of best fit) = smallest residuals line
how does y change with x?
allows us to predict response
$\hat{y} = b_0 + b_1 x$
  $b_1$ = slope = $r \frac{s_y}{s_x}$    ($s_y$, $s_x$ = sample SD; r = correlation coefficient)
  $b_0$ = intercept = $\bar{y} - b_1 \bar{x}$    ($\bar{y}$, $\bar{x}$ = mean)

Coefficient of determination $R^2$
  literally correlation squared
  how much variation is explained by regression model
Residuals
  residual = $y - \hat{y}$
  difference btw obs and pred for response
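The slope, intercept, R² and residuals can be sketched together as follows. This assumes the usual least squares formulas (slope = r × s_y / s_x, intercept = ȳ - slope × x̄); the function name `least_squares` and the data are hypothetical.

```python
from statistics import mean, stdev

def least_squares(xs, ys):
    """Return (intercept b0, slope b1, correlation r) for the least squares line."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample SDs
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    return b0, b1, r

# Hypothetical data, not from the notes
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1, r = least_squares(xs, ys)

predicted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # obs minus predicted
r_squared = r ** 2  # share of variation explained by the model

print(round(b0, 2), round(b1, 2), round(r_squared, 2))
```

For a least squares line with an intercept, the residuals sum to zero up to floating-point error, which is a handy sanity check.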
Sample Surveys
Question: How many hours, on average, do STAB22H3 students spend studying for this course per week?
Population: parameters (unknown)
Sample: statistics (known)
use statistics to infer parameters
Action: Ask all students physically present at lecture this week
Is this sample representative? What can go wrong?
key tips
  Part of the whole = sample
  Randomize: reduces bias
  Appropriate sample size: size matters
sampling types
  SRS: equal chance of randomly being selected
  Stratified Random Sampling: reduces bias and variability, increases accuracy
    Divide popln into strata
    Perform SRS within each stratum
  Cluster / Multistage: hierarchy of clusters
  Convenience sample
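To make the difference between an SRS and a stratified random sample concrete, here is a small sketch with Python's `random` module. The population of 100 students, the two strata and the sample sizes are all hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    """SRS: every member has an equal chance of being selected."""
    return rng.sample(population, k)

def stratified_sample(strata, k_per_stratum, rng):
    """Stratified sampling: perform an SRS within each stratum."""
    return {name: rng.sample(members, k_per_stratum) for name, members in strata.items()}

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population and strata
students = [f"student_{i}" for i in range(100)]
strata = {"year_1": students[:60], "year_2": students[60:]}

srs = simple_random_sample(students, 10, rng)
stratified = stratified_sample(strata, 5, rng)

print(srs)
print(stratified)
```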
Observation vs Experiment
Observation: passively watching a situation and drawing conclusions
  retrospective or prospective
Experiment: actively manipulating variables (aka treatments) and recording outcomes
  determine causal relationships
Experimental Design
Effect of water and fertilizer levels on plants
Factors: fertilizer, water (each has levels)
Levels: high, med, low fertilizer (3)
        high, med, low water (3)
Treatments: combinations of levels
  high f + high w, high f + med w, high f + low w,
  med f + high w, med f + med w, med f + low w,
  low f + high w, low f + med w, low f + low w
  # of treatments = levels × levels (or levels^factors) = 3 × 3 = 9
Experimental units: plants
Response variable: height
Identify all components in a situation
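The treatment count can be checked by listing every combination of factor levels. A minimal sketch with `itertools.product`, using the two factors and three levels from the plant example (the variable names are illustrative):

```python
from itertools import product

levels = ["high", "med", "low"]
factors = {"fertilizer": levels, "water": levels}

# Each treatment is one (fertilizer level, water level) combination
treatments = list(product(*factors.values()))

for fertilizer, water in treatments:
    print(f"{fertilizer} fertilizer + {water} water")
print(len(treatments))
```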
Basic Probability
P(outcome or set of outcomes)
  likelihood of a specific event or events occurring in a specific sample space
  sample space S: aka all possible outcomes
  event: subset of the sample space
e.g. S = {Y, Y, Y, R}
  P(R) = (# of possible R outcomes) / (total # of outcomes) = 1/4
random phenomenon
drawing a ball = a trial


Independence
outcome of one trial does not change outcome of the other
e.g. coin toss is independent
Question: Is drawing 2 cards from a deck without replacement independent?

Theoretical Probability
e P A 2 or
15
g

1 = P(S)
Disjoint events: no overlap, no outcomes in common, cannot occur simultaneously
Non-disjoint events: outcomes overlap, can occur simultaneously, aka P(A∩B)
Important to distinguish b/w the two
Multiplication rule (independent events): P(A and B) = P(A) × P(B)
Quiz 7 Q2 b
Probability that student chosen at random is neither smart nor funny
Correct answer:
  P(S or F) = P(S) + P(F) - P(S and F) = 0.36 + 0.26 - 0.13 = 0.49
  P(neither) = 1 - 0.49 = 0.51
Common mistake:
  1 - 0.36 - 0.26 - 0.13 = 0.25
  Calculating this way doubles the amount of people you take into account:
  people in the smart category also fall into the S and F category
[Venn diagram: S and F overlapping; outside both = not smart or funny]
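The two calculations in Quiz 7 Q2 b can be compared side by side. This sketch reads the three numbers in the notes as P(S) = 0.36, P(F) = 0.26 and P(S and F) = 0.13, which is an interpretation consistent with the stated answers.

```python
# Assumed reading of the quiz values: P(S), P(F), P(S and F)
p_smart, p_funny, p_both = 0.36, 0.26, 0.13

# Addition rule: subtract the overlap so it is not counted twice
p_either = p_smart + p_funny - p_both
p_neither = 1 - p_either

# Common mistake: subtracting the overlap a third time
p_mistake = 1 - p_smart - p_funny - p_both

print(round(p_either, 2), round(p_neither, 2), round(p_mistake, 2))
```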
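P(B | A) = probability of B given A
A and B can be any events
order is important: what would P(A | B) be?
P(A | B) = P(B and A) / P(B), with P(B) greater than 0
  (multiplication rule: P(B and A) = P(A | B) × P(B))
if A and B are independent: P(A | B) = P(A)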
"At least one" type questions
P(at least one) is equal to 1 - P(none)
e.g. albinism example from lecture: 3 children, at least one albino
  calculate all scenarios OR 1 - P(none)
  All combos (Aa × Aa): AA, Aa, Aa, aa
  P(albino, aa) = 1/4; P(not albino) = 3/4
  1 - P(none) = 1 - (3/4 × 3/4 × 3/4) = 1 - 27/64 = 37/64
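Both routes for an "at least one" question (the complement shortcut and summing every scenario) can be sketched with exact fractions. The value P(albino) = 1/4 per child is an assumption here, taken from a cross of two Aa carriers; the notes' own numbers are partly illegible.

```python
from fractions import Fraction
from itertools import product

p_albino = Fraction(1, 4)  # assumed: Aa x Aa cross, aa = albino
n_children = 3

# Shortcut: 1 - P(none)
p_none = (1 - p_albino) ** n_children
p_at_least_one = 1 - p_none

# Long way: add up every scenario with at least one albino child
p_all_scenarios = Fraction(0)
for family in product([True, False], repeat=n_children):
    if any(family):
        prob = Fraction(1)
        for is_albino in family:
            prob *= p_albino if is_albino else 1 - p_albino
        p_all_scenarios += prob

print(p_at_least_one, p_all_scenarios)
```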
focus on discrete variables
Examples: tossing a coin, rolling a die, number of students in a class at UofT
r.v.: probability of each outcome of an event = probability distribution
special r.v.: boolean outcomes (only yes/no, 0/1), denoted as numbers
distribution of r.v. X = # of heads
[Sketch: all possible outcomes of the coin tosses]
same table, diff orientation:
  X    P
  0    1/8
  1    3/8
  2    3/8
  3    1/8
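The distribution table for the number of heads can be rebuilt by listing all equally likely outcomes. The sketch assumes 3 fair coin tosses, which is what probabilities in eighths imply.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

n_tosses = 3  # assumed from the 1/8, 3/8, 3/8, 1/8 pattern

# All equally likely sequences, e.g. ('H', 'T', 'H')
outcomes = list(product("HT", repeat=n_tosses))

# Count how many sequences give each number of heads
counts = Counter(outcome.count("H") for outcome in outcomes)
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(x, p)
```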
Mean (expected value) of r.v.
aka weighted average of X
  X    1    2    3
  P    0.2  0.3  0.5
  $\mu_X$ = 1(0.2) + 2(0.3) + 3(0.5) = 2.3
Median of r.v.
aka midpoint: value of X at which cumulative P passes 0.5
Variance of r.v.
  Var(X) = $\sum (x - \mu_X)^2 P(x)$
  SD(X) = $\sqrt{Var(X)}$ = σ (sigma)
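A short sketch of the mean, variance and SD of a discrete random variable, applied to the X = 1, 2, 3 table above. The helper names `rv_mean` and `rv_variance` are made up for illustration.

```python
from math import sqrt

def rv_mean(dist):
    """Expected value: each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())

def rv_variance(dist):
    """Variance: probability-weighted squared distance from the mean."""
    mu = rv_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

dist = {1: 0.2, 2: 0.3, 3: 0.5}  # values and probabilities from the notes
mu = rv_mean(dist)
var = rv_variance(dist)
sd = sqrt(var)

print(round(mu, 2), round(var, 2), round(sd, 2))
```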
Binomial Model
type of probability model / dist
outcome: either success or failure
X ~ B(n, p): Binomial distribution in n and p
  parameters: p = prob of success
              n = # of observations
Solving questions: Formula or Binomial Table
Example: 25% of people believe in astrology; you select 4 people at random
  a) P(none believe)
  b) P(at least one believes)
Formulas
X ~ B(n, p)
where
  n = # trials
  x = # successes (# of people who believe)
  p = prob of success
  q = 1 - p = prob of failure
a) P(X = 0) = 4C0 × 0.25^0 × (1 - 0.25)^(4 - 0)
  = 1 × 1 × 0.75^4
  = 0.31640625
Note: nCx (nCr key, where r = x) can also be written as combination "n choose x" (or k):
  $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, or calculated manually
b) P(X ≥ 1) = 1 - P(X = 0)
  = 1 - 0.31640625
  = 0.68359375
Binomial table (entries 0.4096, 0.2401):
a) P(X = 0) ≈ 0.32485
   large diff due to rounding
b) P(X ≥ 1) = 0.67515
   = 1 - P(X = 0), or sum up values ≥ 1
Note: Make sure to use correct table, not z or t tables
Also x is often interchangeable with k as # of successes, but not always
When choosing b/w formula and table, consider which method you feel comfortable using AND/OR which is more suited to the question
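The formula route for the astrology example (n = 4, p = 0.25) can be sketched with `math.comb`, which plays the role of the calculator's nCr key. The function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): (n choose k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 4, 0.25
p_none = binomial_pmf(0, n, p)   # a) nobody believes
p_at_least_one = 1 - p_none      # b) complement of "none"

print(p_none, p_at_least_one)
```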
Quiz 9 answers
1
a) $\mu_X$ = 2(0.3) + 5(0.7)
  = 0.6 + 3.5
  = 4.1
b) Var(X) = $\sum (x - \mu)^2 P(x)$
  = (2 - 4.1)²(0.3) + (5 - 4.1)²(0.7)
  = 1.89
   SD(X) = √Var(X)
  = √1.89
  = 1.37
p = 0.1 = prob of event happening, i.e. drop out before graduation
n = 10
a) k = 3
   0.0574
b) k < 3 (not including 3)
   0.9298
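For "fewer than k" questions like part b), the table step of adding up several entries can be sketched as a sum of binomial probabilities. This assumes the dropout question is a binomial setting with n = 10 and p = 0.1; the helper names are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_fewer_than(k, n, p):
    """P(X < k): sum the probabilities for 0, 1, ..., k - 1 (k itself excluded)."""
    return sum(binomial_pmf(i, n, p) for i in range(k))

n, p = 10, 0.1
p_exactly_3 = binomial_pmf(3, n, p)
p_fewer_than_3 = prob_fewer_than(3, n, p)

print(round(p_exactly_3, 4), round(p_fewer_than_3, 4))
```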
Review
Sampling distribution of a proportion and of a count: each has a mean μ and SD σ; use Z table
SRS = simple random sample
n = size of sample
p = popln probability of success
X = count of successes in sample
p̂ = sample proportion of successes, i.e. X/n
Rules
  np ≥ 10 AND n(1 - p) ≥ 10
  when n is large, sampling dist approaches normal
  mean of sample proportions approaches popln prob of success p (popln proportion)
e.g. StatKey: Sampling distribution for prop
  Dataset: Coin flip
  Sample size: n = 10 vs n = 100
  Generate 1, 10, 100, 1000 samples
p = 0.21
n = 300
a) X ~ N(μ, σ)
   μ = np = 300(0.21) = 63
   SD = √(np(1 - p)) = √(63(1 - 0.21)) = 7.0548
Z score for P(X ≥ 80), x val = 80:
   z = (x - μ)/σ = (80 - 63)/7.0548 = 2.41
P(Z ≥ 2.41)
   Using Z table: 0.9920
   P(Z ≥ 2.41) = 1 - 0.9920 = 0.008
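The normal approximation steps for this count (n = 300, p = 0.21, P(X ≥ 80)) can be sketched as below. `statistics.NormalDist` stands in for the Z table, and no continuity correction is applied, matching the worked answer.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.21

# Check the large-sample rules before using the normal approximation
assert n * p >= 10 and n * (1 - p) >= 10

mu = n * p                     # mean of the count
sd = sqrt(n * p * (1 - p))     # SD of the count
z = (80 - mu) / sd
p_at_least_80 = 1 - NormalDist().cdf(z)  # upper-tail area

print(round(mu, 1), round(sd, 4), round(z, 2), round(p_at_least_80, 3))
```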
All the best for the final exam!
Am and
