Descriptive Statistics III
1
Example
2
Hand anthropometry of non-disabled individuals
(Sources: DTI, 2002; Ergonomics for Schools, 2008; RoyMech, 2008)
Dimension             Gender   5th percentile (mm)   50th percentile (mm)   95th percentile (mm)
Hand length           Male     173-175               178-189                205-209
Hand length           Female   159-160               167-174                189-191
Palm length           Male     98                    107                    116
Palm length           Female   89                    97                     105
Thumb length          Male     44                    51                     58
Thumb length          Female   40                    47                     53
Thumb breadth         Male     11-12                 23                     26-27
Thumb breadth         Female   10-14                 20-21                  24
Index finger length   Male     64                    72                     79
Index finger length   Female   60                    67                     74
Hand breadth          Male     78                    87                     95
Hand breadth          Female   69                    76                     83-85
3
X3: the third observation obtained (in the order the data were collected)
Example: for the data 2, 6, -1, 8, 0, -1, 8, 6, find the order statistics 2 and 5.
Ordered data: -1 -1 0 2 6 6 8 8
X(2) = -1, X(5) = 6
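The example above can be checked with a few lines of Python. This is a minimal sketch; the helper name `order_statistic` is a hypothetical choice for illustration and does not come from the slides.

```python
def order_statistic(data, i):
    """Return X(i), the i-th smallest value of data (i counts from 1)."""
    if not 1 <= i <= len(data):
        raise ValueError("i must be between 1 and len(data)")
    return sorted(data)[i - 1]


data = [2, 6, -1, 8, 0, -1, 8, 6]

third_observation = data[2]      # X3: third value in collection order
x2 = order_statistic(data, 2)    # X(2): second smallest value
x5 = order_statistic(data, 5)    # X(5): fifth smallest value

print(sorted(data))
print(third_observation, x2, x5)
```

Note the difference between X3, which depends on the order in which the data were collected, and X(3), which depends only on the sorted values.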
4
Quartiles
[Figure: ordered data divided by Q1, Q2 (the median) and Q3]
5
Quartiles
$$Q_k = x_{\frac{k(n+1)}{4}}$$
with k = 1, 2, 3
6
Finding Quartiles
$$Q_k = x_{\frac{k(n+1)}{4}}$$
Example:
The quiz scores for 15 students are listed below. Find the first,
second and third quartiles of the scores.
28 43 48 51 43 30 55 44 48 33 45 37 37 42 38
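Order the data.
Lower half: 28 30 33 37 37 38 42
Median: 43
Upper half: 43 44 45 48 48 51 55
With n = 15, the positions k(n+1)/4 in the ordered data are 4, 8 and 12, so Q1 = 37, Q2 = 43 and Q3 = 48.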
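The position rule k(n+1)/4 can be sketched in Python as follows. The function name `quartile` is hypothetical, and the linear interpolation used when the position is not a whole number is an assumption: the slides only show a case where the positions are integers.

```python
def quartile(data, k):
    """Q_k from the position k(n+1)/4 in the ordered data, for k = 1, 2, 3."""
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2 or 3")
    xs = sorted(data)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 values")
    pos = k * (n + 1) / 4      # 1-based position in the ordered data
    lower = int(pos)           # whole part of the position
    frac = pos - lower         # fractional part (0 when pos is a whole number)
    if frac == 0:
        return xs[lower - 1]
    # Assumption: interpolate linearly between the two neighbouring values.
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


scores = [28, 43, 48, 51, 43, 30, 55, 44, 48, 33, 45, 37, 37, 42, 38]
q1, q2, q3 = (quartile(scores, k) for k in (1, 2, 3))
print(q1, q2, q3)
```

For the 15 quiz scores the positions are 4, 8 and 12, so no interpolation is needed.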
8
Box and Whisker Plot (boxplot)
A box-and-whisker plot is an exploratory data analysis tool
that highlights the important features of a data set.
The five-number summary is used to draw the graph.
• The minimum entry
• Q1
• Q2 (median)
• Q3
• The maximum entry
Example:
Use the data from the 15 quiz scores to draw a box-and-
whisker plot.
28 30 33 37 37 38 42 43 43 44 45 48 48 51 55
Continued. 9
Box and Whisker Plot
Five-number summary
• The minimum entry: 28
• Q1: 37
• Q2 (median): 43
• Q3: 48
• The maximum entry: 55
IQR = Q3 - Q1 = 11
Max. whisker length = 1.5 IQR = 16.5
[Figure: boxplot of the quiz scores with minimum 28, Q1 = 37, Q2 = 43, Q3 = 48 and maximum 55, drawn on an axis from 28 to 56]
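The five-number summary for the quiz scores can be reproduced with Python's standard library. This is a sketch that assumes `statistics.quantiles` with `method="exclusive"`, which places the quartiles at positions based on n + 1 and so agrees with the rule used in these slides.

```python
import statistics

scores = [28, 30, 33, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]

# Three cut points that split the data into four parts (Q1, Q2, Q3).
q1, q2, q3 = statistics.quantiles(scores, n=4, method="exclusive")

five_number_summary = (min(scores), q1, q2, q3, max(scores))
iqr = q3 - q1                 # interquartile range
max_whisker = 1.5 * iqr       # maximum whisker length

print(five_number_summary)
print(iqr, max_whisker)
```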
10
Parts of a boxplot
The maximum whisker length is 1.5 IQR. If the end of the whisker does NOT coincide with a
data point, the whisker shrinks until it coincides with one.
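The whisker rule can be sketched as follows: each whisker ends at the most extreme data point that lies within 1.5 IQR of the box, and anything beyond that is reported separately. The function name `whisker_ends` and the extra score of 80 are hypothetical, added only to illustrate an outlier.

```python
import statistics


def whisker_ends(data):
    """Return (lower whisker end, upper whisker end, points beyond the whiskers)."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr     # furthest a lower whisker may reach
    high_limit = q3 + 1.5 * iqr    # furthest an upper whisker may reach
    inside = [x for x in data if low_limit <= x <= high_limit]
    beyond = [x for x in data if x < low_limit or x > high_limit]
    # The whiskers shrink to the nearest actual data points inside the limits.
    return min(inside), max(inside), beyond


scores = [28, 30, 33, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]
print(whisker_ends(scores))

# Hypothetical extra score, far above the rest, to show an outlier.
print(whisker_ends(scores + [80]))
```

With the original scores both whiskers reach the minimum and maximum; with the hypothetical score of 80 the upper whisker still ends at 55 and 80 is flagged as lying beyond it.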
11
Uses of boxplots
1. Variability
2. Outliers
3. Symmetry
4. Comparison
12
Recommended video
13
Example
14
Example
15
Deciles and Percentiles
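Deciles divide an ordered data set into 10 equal parts. There
are 9 deciles: D1, D2, D3, …, D9.
Percentiles divide an ordered data set into 100 equal parts. There
are 99 percentiles: P1, P2, P3, …, P99.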
16
Deciles and Percentiles
$$D_k = x_{\frac{k(n+1)}{10}}$$
with k = 1, 2, …, 9
$$P_k = x_{\frac{k(n+1)}{100}}$$
with k = 1, 2, …, 99
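Quartiles, deciles and percentiles share the same position rule, k(n+1)/q with q = 4, 10 or 100, so one function can cover all three. The sketch below uses a hypothetical helper `quantile`; the linear interpolation for non-integer positions and the clamping of positions outside 1..n are assumptions, since the slides do not say how to handle those cases.

```python
def quantile(data, k, parts):
    """k-th cut point when the ordered data are split into `parts` parts."""
    if not 1 <= k < parts:
        raise ValueError("k must satisfy 1 <= k < parts")
    xs = sorted(data)
    n = len(xs)
    pos = k * (n + 1) / parts      # 1-based position in the ordered data
    # Assumption: positions outside 1..n are clamped to the extremes.
    pos = min(max(pos, 1), n)
    lower = int(pos)
    frac = pos - lower
    if frac == 0:
        return xs[lower - 1]
    # Assumption: interpolate linearly between neighbouring values.
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


scores = [28, 30, 33, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]

d5 = quantile(scores, 5, 10)      # fifth decile
p25 = quantile(scores, 25, 100)   # 25th percentile
p75 = quantile(scores, 75, 100)   # 75th percentile
d1 = quantile(scores, 1, 10)      # first decile, needs interpolation

print(d5, p25, p75, d1)
```

For these scores D5 coincides with the median, and P25 and P75 coincide with Q1 and Q3.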
17
Note: quartiles, deciles and
percentiles are called QUANTILES.
18
Distribution shapes
[Figure: histogram with a density curve; horizontal axis: Variable; total area: 100%]
19
Galton board
20
Distribution shapes
21
Distribution shapes
22
Distribution shapes
23
Summary of distribution shapes
Symmetric: Mean = Median
Right-skewed (positive skew): Mean > Median > Mode
Left-skewed (negative skew): Mean < Median < Mode
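The relationships above suggest a rough check of shape from the mean and the median. The sketch below is only a heuristic: the function name `shape_hint` and all three data sets are hypothetical, and equality of mean and median does not by itself prove that a distribution is symmetric.

```python
import statistics


def shape_hint(data):
    """Rough shape label from comparing the mean with the median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "symmetric"


# Hypothetical data sets, chosen only to illustrate the three cases.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 10]
left_skewed = [1, 7, 8, 9, 9, 10, 10, 10]

for sample in (symmetric, right_skewed, left_skewed):
    print(
        statistics.mean(sample),
        statistics.median(sample),
        statistics.mode(sample),
        shape_hint(sample),
    )
```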
24
Distribution Shape and Boxplot
[Figure: three boxplots, each marked with Q1, Q2 and Q3, one for each distribution shape]
25
Pearson Correlation Coefficient 𝑟
Measures the strength of the linear relationship
between two quantitative variables.
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
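The formula for r translates directly into code. This is a minimal sketch in plain Python; the function name `pearson_r` and the small data sets are hypothetical, and the function will fail with a division by zero if either variable is constant.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y doubles x, then y decreases as x increases.
x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))
print(pearson_r(x, [8, 6, 4, 2]))

# Changing the units of x (for example mm to m) leaves r unchanged.
print(pearson_r([v / 1000 for v in x], [2, 4, 6, 8]))
```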
26
Features of Correlation Coefficient
• Unit free
• Ranges between –1 and 1
• The closer to –1, the stronger the negative linear
relationship
• The closer to 1, the stronger the positive linear
relationship
• The closer to 0, the weaker the linear
relationship
27
Correlation coefficient
28
Scatter Plots of Data with Various
Correlation Coefficients
[Figure: five scatter plots of Y against X, with r = -1, r = -0.6, r = 0, r = 0.6 and r = 1]
29
Spearman Correlation Coefficient
𝜌
The Spearman correlation coefficient, 𝜌 (rho),
is a measure of the correlation between two continuous random
variables 𝑥 and 𝑦. It is the Pearson correlation computed on the ranks of the data.
30
The Spearman correlation coefficient is less sensitive than Pearson's
to values that lie far from what is expected (outliers). In this example:
Pearson = 0.30706 Spearman = 0.76270
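The difference in sensitivity can be illustrated by computing Spearman's coefficient as the Pearson coefficient of the ranks. The sketch below uses hypothetical data, not the data behind the figures quoted above, and hypothetical helper names; tied values receive the average of their ranks, which is a common convention assumed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1   # average of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y increases with x, but the last value is an outlier.
x = [1, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5, 100]

print(pearson_r(x, y))
print(spearman_rho(x, y))
```

Because the outlier keeps the same rank it would have had without being extreme, Spearman's coefficient is unaffected, while Pearson's coefficient drops well below 1.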
31
Tips
32
What your scientific figure looks like,
vs. what the audience sees
33
Chartjunk
34
An example of a chart containing
gratuitous chartjunk. This chart uses a
large area and a lot of "ink" (many symbols
and lines) to show only five hard-to-read
numbers, 1, 2, 4, 8, and 16.
35
A map with chartjunk: the gradients inside
each province
36
37
38
Data-ink ratio formula
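Assuming the slide follows Edward Tufte's standard definition of the data-ink ratio, the formula can be written as:

$$\text{Data-ink ratio} = \frac{\text{data-ink}}{\text{total ink used to print the graphic}}$$

Equivalently, it is 1 minus the proportion of a graphic that can be erased without loss of data-information. A higher ratio means less chartjunk.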
39
Data-ink ratio formula
40
The value of telling stories with data as
opposed to merely displaying it.
41
What do you think about the picture?
42