SAMPLING AND OUTLIERS
When interpreting data, it's important to consider how it's gathered, and how it's presented.
Discrete data: Exact numbers (usually from counting)
Ex
number of plums on my tree
number of students in a class
Continuous data: Any value in a certain range (can be decimal)
Ex
heights of students
Population: We sample the entire population (have all data, representative)
SL: data sets are representative of the entire population
Sample: We have a smaller part of the population
- may end up making false conclusions/interpretations
Reliable data: Repeatable
What if you have missing data?
Can affect reliability (fix: monitor collection of data carefully, get good data samples)
Bias?
You have results favouring one outcome over another
Ex: sample only males (not females)
We try to minimise bias
Sampling techniques:
A) Simple random: Equal chance of choosing. Choose out of a hat, number generator, etc.
Ex: Poll students from school: # assigned to students, choose with a random # generator
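A minimal sketch of the school poll in Python, assuming a hypothetical school of 200 students numbered 1 to 200 and a sample of 20 (both numbers are made up for illustration). `random.sample` picks without repeats, so every student has the same chance of being chosen.

```python
import random

# Hypothetical school: 200 students, each assigned a number 1..200.
student_numbers = list(range(1, 201))

# Seeded generator so the draw can be repeated; drop the seed for a fresh draw.
rng = random.Random(42)

# Draw 20 distinct student numbers, each equally likely to be picked.
sample = rng.sample(student_numbers, k=20)
print(sorted(sample))
```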
B) Convenience: Choose easiest people to sample: ask your friends, etc.
Problems? May not be representative of population
C) Systematic: Choose random starting point, use fixed interval
Ex: make a list of all students in a class, choose every 3rd student.
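The "every 3rd student" idea could be sketched like this, assuming a hypothetical class list of 30 names. The helper name `systematic_sample` is made up for this example; it picks a random start within the first interval, then steps through the list at the fixed interval.

```python
import random

def systematic_sample(items, interval, rng=random):
    """Pick a random start in the first interval, then take every interval-th item."""
    start = rng.randrange(interval)  # 0, 1, ..., interval - 1
    return items[start::interval]

# Hypothetical class list of 30 students.
class_list = [f"student_{i}" for i in range(1, 31)]

chosen = systematic_sample(class_list, 3, random.Random(0))
print(chosen)
```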
D) Quota: Sample sizes in proportion to who you're polling
Ex: 55% girls, 45% boys in school, so sample should have those same %
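As a rough illustration of the quota idea, the sketch below turns the 55% / 45% split into head counts for a sample of 40 (the sample size of 40 and the helper name `quota_sizes` are assumptions, not from the notes).

```python
def quota_sizes(proportions, sample_size):
    """Number of people to sample from each group, in proportion to the population."""
    # Rounding can make the total differ slightly from sample_size for awkward splits.
    return {group: round(share * sample_size) for group, share in proportions.items()}

sizes = quota_sizes({"girls": 0.55, "boys": 0.45}, 40)
print(sizes)  # {'girls': 22, 'boys': 18}
```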
E) Stratified: Split into strata (smaller groups)
Ex: Choose half DP1, half DP2 students
Outliers: We might want to remove values from a data set if they fall too far outside criteria: outliers
[Sketch: frequency curve, labelled "if normally distributed"]
Ex: Data set: 16, 14, 3, 12, 15, 17, 22, 15, 52
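Show that 52 is an outlier
Ordered: 3, 12, 14, 15, 15, 16, 17, 22, 52
Q1 = 13, median = 15, Q3 = 19.5
IQR = Q3 - Q1 = 19.5 - 13 = 6.5
Outlier: less than Q1 - 1.5 × IQR, or greater than Q3 + 1.5 × IQR
Q3 + 1.5 × IQR = 19.5 + 9.75 = 29.25
52 is greater than 29.25, so 52 is an outlier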
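The same check can be sketched in Python. This assumes the 1.5 × IQR rule, with quartiles taken as the medians of the lower and upper halves of the ordered data (middle value left out when the count is odd). Other quartile conventions exist and can give slightly different fences. The function names are made up for this example.

```python
from statistics import median

def quartiles(values):
    """Q1, median, Q3 using the median-of-halves method."""
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]
    # Leave out the middle value when the number of data points is odd.
    upper = s[half + 1:] if n % 2 else s[half:]
    return median(lower), median(s), median(upper)

def find_outliers(values, k=1.5):
    """Values below Q1 - k*IQR or above Q3 + k*IQR, plus the two fences."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high], (low, high)

data = [16, 14, 3, 12, 15, 17, 22, 15, 52]
found, (low, high) = find_outliers(data)
print(low, high, found)  # 3.25 29.25 [3, 52]
```

With this data the upper fence is 29.25, so 52 is flagged. Under the same rule, 3 also falls just below the lower fence of 3.25.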
Why should you care?
Knowing how to take a good sample, and when to remove data points (outliers), is important to stats
Plus, might help in your IA!