EDX S1 Lecture Note: 1 Basic Statistics
Yan Jiaqi
1 Basic Statistics
1. Type of Data.
Type of data: Numerical
• Continuous: measured (heights, weights, …)
• Discrete: counted (number of pets, …)
• Raw data is data that has not been processed for use.
• Grouped data is data that is organized into a number of groups.
3. Frequency table.
For Table 1, gap = 350 − 349 = 1. For Table 2, gap = 0 (hence the table is ungapped).
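As a rough illustration of the gap calculation, the sketch below computes the gap between neighbouring classes and then widens each class by half a gap to obtain class boundaries. The class limits are those of the t (min) example in these notes. The function name `class_boundaries` and the half-gap rule are assumptions for illustration, chosen because they reproduce the boundaries 399.5 and 449.5 used in that example.

```python
def class_boundaries(classes):
    """Return (gap, boundaries) for a list of (lower, upper) class limits.

    Assumes the classes are sorted and every pair of neighbours
    is separated by the same gap.
    """
    # Gap = lower limit of the second class minus upper limit of the first.
    gap = classes[1][0] - classes[0][1]
    half = gap / 2
    # Move each limit half a gap outward so neighbouring classes meet.
    boundaries = [(lower - half, upper + half) for lower, upper in classes]
    return gap, boundaries


classes = [(300, 349), (350, 399), (400, 449), (450, 499), (500, 549)]
gap, boundaries = class_boundaries(classes)
print(gap)            # 1
print(boundaries[2])  # (399.5, 449.5)
```

For an ungapped table (gap = 0) the boundaries coincide with the class limits.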
          Raw data                             Grouped data
Mode      The value that occurs most often.    Modal class.
Median    The $\frac{n+1}{2}$-th value.        The $\frac{n}{2}$-th value (interpolation).
Mean      $\mu = \frac{\sum x}{n}$             $\mu = \frac{\sum x \cdot f}{\sum f}$
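The formulas in the table can be checked with a short sketch. The data values below are hypothetical (they do not come from the notes), and the grouped data is assumed to be given as (x, f) pairs, where x stands for the value or class midpoint and f for its frequency.

```python
from statistics import mean, median, multimode

# Hypothetical raw data, for illustration only.
raw = [2, 3, 3, 5, 7]

raw_mode = multimode(raw)    # list of the value(s) that occur most often
raw_median = median(raw)     # the (n+1)/2-th value of the sorted data (n odd here)
raw_mean = mean(raw)         # sum(x) / n

# Hypothetical grouped data as (x, f) pairs.
grouped = [(1, 2), (2, 5), (3, 3)]

total_f = sum(f for _, f in grouped)
grouped_mean = sum(x * f for x, f in grouped) / total_f   # sum(x*f) / sum(f)
modal_x = max(grouped, key=lambda pair: pair[1])[0]       # x with the largest f

print(raw_mode, raw_median, raw_mean)   # [3] 3 4
print(grouped_mean, modal_x)            # 2.1 2
```

`multimode` returns a list because a data set can have more than one mode.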
• Interpolation.
Interpolation is used to find some measures of location for grouped data. It assumes that,
within each class, the data values are evenly distributed.
t (min)    Frequency
300-349    3
350-399    6
400-449    10
450-499    7
500-549    5

Median position: $\frac{3 + 6 + 10 + 7 + 5}{2} = 15.5$

Position   9       15.5     19
Value      399.5   Median   449.5

$$\frac{15.5 - 9}{\text{Median} - 399.5} = \frac{19 - 9}{449.5 - 399.5}$$
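The proportion above can be solved numerically. The following is a minimal sketch of linear interpolation for the median, assuming the class boundaries 299.5, 349.5, …, 549.5 implied by the example; the function name `median_by_interpolation` is hypothetical.

```python
def median_by_interpolation(boundaries, freqs):
    """Estimate the median of grouped data by linear interpolation.

    boundaries: list of (lower, upper) class boundaries
    freqs:      frequency of each class, in the same order
    """
    target = sum(freqs) / 2   # median position n/2
    cumulative = 0            # cumulative frequency below the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            # Values are assumed evenly spread within the class, so the
            # median sits (target - cumulative) / f of the way through it.
            return lower + (target - cumulative) * (upper - lower) / f
        cumulative += f
    raise ValueError("frequencies must contain at least one positive value")


boundaries = [(299.5, 349.5), (349.5, 399.5), (399.5, 449.5),
              (449.5, 499.5), (499.5, 549.5)]
freqs = [3, 6, 10, 7, 5]
print(median_by_interpolation(boundaries, freqs))  # 432.0
```

This agrees with solving the proportion by hand: Median = 399.5 + 6.5 × 50 / 10 = 432.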
5. Measure of other locations.
A value that describes other positions in a set of data.
Min Q1 Q2 Q3 Max
6. Measures of Spread.
A value that describes how spread out a set of data is.
• Sxx symbols: $S_{xx} = \sum (x - \bar{x})^2$.
The subscript xx means the product of $(x - \bar{x})$ and $(x - \bar{x})$.
Similarly, we have $S_{xy} = \sum (x - \bar{x})(y - \bar{y})$ and $S_{yy} = \sum (y - \bar{y})^2$.
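A small sketch of these three sums, computed directly from their definitions. The paired data below is hypothetical and the function name `s_products` is an assumption for illustration.

```python
def s_products(xs, ys):
    """Return (S_xx, S_xy, S_yy) for paired data xs, ys of equal length."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Each sum multiplies the deviations named in its subscript.
    s_xx = sum((x - x_bar) * (x - x_bar) for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_yy = sum((y - y_bar) * (y - y_bar) for y in ys)
    return s_xx, s_xy, s_yy


# Hypothetical paired data with means 3 and 4.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(s_products(xs, ys))  # (10.0, 6.0, 6.0)
```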
7. Coded data.
2 Representation of data
1. Histogram.
2. Box plot.
4. Scatter diagram.
3 Analysis of data
1. Outliers.
2. Skewness.
3. Comparing data.
4 Probability
6 Normal distributions