Statistics Review

Mean, Median, Mode


● The mean is the average of a data set, found by dividing the sum of all
the values by the number of values

● The median is the middle value when the data is in order of size

● The mode is the value that occurs most often in a data set
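As a quick check of the three definitions, here is a minimal sketch using Python's standard statistics module. The data values are made up for illustration and do not come from these notes.

```python
from statistics import mean, median, mode

# Made-up data set, for illustration only.
data = [2, 3, 3, 5, 7, 10]

data_mean = mean(data)      # sum of the values (30) divided by the count (6)
data_median = median(data)  # even count, so the average of the two middle values (3 and 5)
data_mode = mode(data)      # the value that occurs most often

print(data_mean, data_median, data_mode)
```

For this data the mean is 5, the median is 4 and the mode is 3.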
Quartiles, Interquartile Range, Box & Whisker
A box-and-whisker plot (box plot) shows the
five-number summary of a set of data:

● Minimum
● Lower Quartile
● Median
● Upper Quartile
● Maximum

● Minimum: The minimum value in a data set


● Lower Quartile (Q1): The value below which 25% of the data set lies
● Median: The value in the middle of the data set
● Upper Quartile (Q3): The value below which 75% of the data set lies
● Maximum: The maximum value in a data set
● Range: The difference between the maximum value and the minimum value
● Interquartile Range (IQR): The difference between the Upper Quartile and the Lower Quartile (Q3 – Q1)
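The definitions above can be sketched in Python. This is an illustration under one assumption: the quartiles are taken as the medians of the lower and upper halves of the sorted data, leaving out the middle value when the count is odd. Other quartile conventions exist and can give slightly different answers. The function name five_number_summary and the data are made up for this example.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of at least two numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    excluding the middle value when the number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]                  # values before the median position
    upper_half = ordered[len(ordered) - half:]   # values after the median position
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

# Made-up data set, for illustration only.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 20]
minimum, q1, med, q3, maximum = five_number_summary(data)

data_range = maximum - minimum  # range = maximum - minimum
iqr = q3 - q1                   # IQR = Q3 - Q1

print(minimum, q1, med, q3, maximum, data_range, iqr)
```

For this data the summary is 1, 4, 8, 15, 20, giving a range of 19 and an IQR of 11.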
Outliers
An outlier is a data point that lies outside the overall pattern in a
distribution

Spotting outliers with interquartile range method

1. Sort data from low to high


2. Identify the first quartile (Q1), the median, and the third quartile (Q3).
3. Calculate your IQR = Q3 – Q1
4. Calculate your upper fence = Q3 + (1.5 * IQR)
5. Calculate your lower fence = Q1 – (1.5 * IQR)
6. Use your fences to identify outliers: any value below the lower fence or above the
upper fence is an outlier.
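The six steps above can be sketched in Python as follows. The function name iqr_outliers and the data are made up for illustration, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data (other quartile conventions may move the fences slightly).

```python
from statistics import median

def iqr_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the IQR method."""
    ordered = sorted(values)                     # step 1: sort low to high
    half = len(ordered) // 2
    q1 = median(ordered[:half])                  # step 2: first quartile
    q3 = median(ordered[len(ordered) - half:])   # step 2: third quartile
    iqr = q3 - q1                                # step 3: IQR = Q3 - Q1
    upper_fence = q3 + 1.5 * iqr                 # step 4
    lower_fence = q1 - 1.5 * iqr                 # step 5
    # step 6: anything outside the fences is an outlier
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers

# Made-up data set with one unusually large value.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15, 18, 60]
lower_fence, upper_fence, outliers = iqr_outliers(data)

print(lower_fence, upper_fence, outliers)
```

Here Q1 = 4 and Q3 = 15, so the IQR is 11, the fences are -12.5 and 31.5, and the only outlier is 60.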
Data sampling methods
Simple random sampling
→ Selecting a sample completely at random. For instance, using a random number
generator

Systematic sampling
→ Researchers select members of the population at a regular interval
Convenience sampling
→ Getting data by selecting people who are easy to reach. The participants are not
chosen at random
Quota sampling
→ Setting certain quotas for the sample. Example: choosing 8 boys and 8 girls
Stratified sampling
→ Selecting a random sample where numbers in certain categories are proportional
to their numbers in the population
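Three of the methods above (simple random, systematic and stratified) can be sketched with Python's random module. Everything here is hypothetical: the function names, the list of 100 students and the two year groups are invented for illustration. The stratified version rounds each group's share, so in general the total may differ slightly from the requested size.

```python
import random

def simple_random_sample(population, size, rng):
    """Pick `size` members completely at random, without repeats."""
    return rng.sample(population, size)

def systematic_sample(population, interval, start=0):
    """Pick every `interval`-th member, beginning at index `start`."""
    return population[start::interval]

def stratified_sample(groups, size, rng):
    """Pick randomly from each group in proportion to the group's size.

    `groups` maps a category name to the list of its members.
    """
    total = sum(len(members) for members in groups.values())
    sample = []
    for members in groups.values():
        share = round(size * len(members) / total)  # this group's part of the sample
        sample.extend(rng.sample(members, share))
    return sample

rng = random.Random(0)  # fixed seed so the example is repeatable
students = [f"student_{i}" for i in range(1, 101)]
groups = {"year_10": students[:60], "year_11": students[60:]}

random_pick = simple_random_sample(students, 10, rng)
every_tenth = systematic_sample(students, 10)
proportional = stratified_sample(groups, 10, rng)

print(random_pick)
print(every_tenth)
print(proportional)
```

With 60 students in one group and 40 in the other, a stratified sample of 10 takes 6 from the first group and 4 from the second.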
Scatter plots
Positive correlation
→ This is where one quantity increases as the other
quantity increases
→ Example: As the temperature increases, the sales
of cold drinks increase

Negative correlation
→ This is where one quantity decreases as the other
quantity increases
→ Example: As the age of a car increases, its value decreases

No correlation
→ This is where there is no apparent relationship
between the two quantities
→ Example: The amount of time spent playing computer
games and the weight of an elephant
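One common way to put a number on the direction of a correlation is Pearson's correlation coefficient r, which these notes do not define. The sketch below is an illustration only: r close to +1 suggests a positive correlation, r close to -1 a negative one, and r close to 0 no linear correlation. The function name pearson_r and all data values are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data, for illustration only.
temperature = [18, 21, 24, 27, 30]
drink_sales = [20, 26, 31, 38, 45]   # rises with temperature
car_age = [1, 2, 3, 4, 5]
car_value = [20, 17, 15, 12, 10]     # falls as the car gets older

r_drinks = pearson_r(temperature, drink_sales)
r_cars = pearson_r(car_age, car_value)

print(r_drinks, r_cars)
```

For this data r_drinks is close to +1 (positive correlation) and r_cars is close to -1 (negative correlation).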
Line of best fit

If a scatter graph suggests there is a correlation, a line of best fit can be drawn on the scatter graph.
This can then be used to predict one data value from the other.

Finding the line of best fit with a GDC (graphic display calculator)


1. Enter data in table
2. Stat → Calc
3. LinReg(ax+b)
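For comparison, the same kind of line y = ax + b can be worked out by hand with the least-squares formulas. The sketch below assumes the calculator's LinReg(ax+b) is an ordinary least-squares fit. The function name line_of_best_fit and the data are made up for illustration.

```python
def line_of_best_fit(xs, ys):
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx             # gradient of the line
    b = mean_y - a * mean_x   # the line passes through (mean_x, mean_y)
    return a, b

# Made-up data lying exactly on y = 2x + 1, so the answer is easy to check.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

a, b = line_of_best_fit(xs, ys)
predicted = a * 6 + b   # use the line to predict y when x = 6

print(a, b, predicted)
```

Here a = 2 and b = 1, so the predicted value at x = 6 is 13. With real data the points will not lie exactly on the line, and predictions are most reliable inside the range of the data.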
