
Position measurement

1
Example

2
Hand anthropometry of non-disabled individuals
(Sources: DTI, 2002; Ergonomics for Schools, 2008; RoyMech, 2008)

Dimension             Gender   5th percentile (mm)   50th percentile (mm)   95th percentile (mm)
Hand length           Male     173-175               178-189                205-209
                      Female   159-160               167-174                189-191
Palm length           Male     98                    107                    116
                      Female   89                    97                     105
Thumb length          Male     44                    51                     58
                      Female   40                    47                     53
Thumb breadth         Male     11-12                 23                     26-27
                      Female   10-14                 20-21                  24
Index finger length   Male     64                    72                     79
                      Female   60                    67                     74
Hand breadth          Male     78                    87                     95
                      Female   69                    76                     83-85

3
X3: represents the third observation obtained.

X(3): represents the third observation in the data set sorted from smallest to largest. It is the third order statistic.

Example: for the data 2, 6, -1, 8, 0, -1, 8, 6, find order statistics 2 and 5.

Ordered data: -1 -1 0 2 6 6 8 8

X(2) = -1 and X(5) = 6
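
The difference between the i-th observation and the i-th order statistic can be checked with a short Python sketch. The function name `order_statistic` is an assumption made for illustration; only the data come from the example.

```python
def order_statistic(data, i):
    """Return X(i): the i-th smallest observation, counting from 1."""
    if not 1 <= i <= len(data):
        raise ValueError("i must be between 1 and the number of observations")
    return sorted(data)[i - 1]


observations = [2, 6, -1, 8, 0, -1, 8, 6]

third_observed = observations[2]                 # X3: third value as collected
third_ordered = order_statistic(observations, 3)  # X(3): third value after sorting

print(third_observed)                    # -1
print(third_ordered)                     # 0
print(order_statistic(observations, 2))  # -1
print(order_statistic(observations, 5))  # 6
```

Sorting a copy with `sorted` leaves the original collection order intact, so both X3 and X(3) stay available.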

4
Quartiles

The three quartiles, Q1, Q2, and Q3, approximately divide an ordered data set into four equal parts. Q2 is the median of the data set.

Q1 is the median of the data below Q2.
Q3 is the median of the data above Q2.

5
Quartiles

$$Q_k = x_{\left(\frac{k(n+1)}{4}\right)}$$

with k = 1, 2, 3

Note: quartiles are useful when one has a large number of observations.

6
Finding Quartiles

$$Q_k = x_{\left(\frac{k(n+1)}{4}\right)}$$

Example:
The quiz scores for 15 students are listed below. Find the first,
second, and third quartiles of the scores.
28 43 48 51 43 30 55 44 48 33 45 37 37 42 38
Order the data.
28 30 33 37 37 38 42 43 43 44 45 48 48 51 55

Lower half: 28 30 33 37 37 38 42, so Q1 = X(4) = 37
Q2 = X(8) = 43
Upper half: 43 44 45 48 48 51 55, so Q3 = X(12) = 48


About one fourth of the students score 37 or less; about one half
score 43 or less; and about three fourths score 48 or less.
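
As a rough check of the position formula, the sketch below computes Q_k as the order statistic at position k(n+1)/4. The function name `quartile` is hypothetical, and the linear interpolation used when the position is not a whole number is an assumption; the slides only show a case where the positions are integers.

```python
def quartile(data, k):
    """Return Q_k as the order statistic at position k*(n+1)/4 (1-based)."""
    if k not in (1, 2, 3):
        raise ValueError("k must be 1, 2, or 3")
    ordered = sorted(data)
    n = len(ordered)
    position = k * (n + 1) / 4
    whole = int(position)        # integer part of the position
    fraction = position - whole  # 0 when the position is a whole number
    if whole < 1 or whole > n or (fraction > 0 and whole == n):
        raise ValueError("too few observations for this quartile")
    if fraction == 0:
        return ordered[whole - 1]
    # Assumption (not stated in the slides): interpolate between neighbours.
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])


scores = [28, 43, 48, 51, 43, 30, 55, 44, 48, 33, 45, 37, 37, 42, 38]
quartiles = [quartile(scores, k) for k in (1, 2, 3)]
print(quartiles)  # [37, 43, 48]
```

With n = 15 the positions are 4, 8, and 12, so no interpolation is needed and the result agrees with X(4), X(8), and X(12).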
7
Interquartile Range
The interquartile range (IQR) of a data set is the difference
between the third and first quartiles.
Interquartile range (IQR) = Q3 – Q1.
Example:
The quartiles for 15 quiz scores are listed below. Find the
interquartile range.
Q1 = 37 Q2 = 43 Q3 = 48

IQR = Q3 – Q1
= 48 – 37
= 11

The quiz scores in the middle portion of the data set vary by at most 11 points.

8
Box and Whisker Plot (boxplot)
A box-and-whisker plot is an exploratory data analysis tool
that highlights the important features of a data set.
The five-number summary is used to draw the graph.
• The minimum entry
• Q1
• Q2 (median)
• Q3
• The maximum entry
Example:
Use the data from the 15 quiz scores to draw a box-and-
whisker plot.
28 30 33 37 37 38 42 43 43 44 45 48 48 51 55
Continued.
9
Box and Whisker Plot
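Five-number summary
• The minimum entry: 28
• Q1: 37
• Q2 (median): 43
• Q3: 48
• The maximum entry: 55

IQR = Q3 – Q1 = 11
Maximum whisker length = 1.5 × IQR = 16.5

[Figure: box-and-whisker plot of the quiz scores with minimum 28, Q1 = 37, Q2 = 43, Q3 = 48, and maximum 55, drawn on an axis from 28 to 56.]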
10
Parts of a boxplot

The maximum whisker length is 1.5 × IQR. If the end of a whisker does NOT coincide with a data
point, the whisker is shortened until it does.
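
The whisker rule can be sketched in Python as follows. The function name `boxplot_summary` and its dictionary keys are hypothetical. The quartiles come from the standard library's `statistics.quantiles`, whose default "exclusive" method uses positions k(n+1)/4; treating points beyond the whiskers as outliers is the usual convention, assumed here.

```python
import statistics


def boxplot_summary(data, whisker_factor=1.5):
    """Quartiles, IQR, whisker ends, and points lying beyond the whiskers."""
    ordered = sorted(data)
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
    iqr = q3 - q1
    max_whisker = whisker_factor * iqr
    # Each whisker shrinks to the most extreme data point within reach.
    lower_whisker = min(x for x in ordered if x >= q1 - max_whisker)
    upper_whisker = max(x for x in ordered if x <= q3 + max_whisker)
    outliers = [x for x in ordered if x < lower_whisker or x > upper_whisker]
    return {
        "q1": q1, "q2": q2, "q3": q3, "iqr": iqr,
        "lower_whisker": lower_whisker, "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


scores = [28, 30, 33, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]
summary = boxplot_summary(scores)
print(summary["lower_whisker"], summary["upper_whisker"])  # 28 55

# Hypothetical extra score of 90, added only to show an outlier.
with_outlier = boxplot_summary(scores + [90])
print(with_outlier["outliers"])  # [90]
```

For the quiz scores the whiskers could reach from 20.5 to 64.5, but they stop at the observed minimum 28 and maximum 55, so no point is flagged.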
11
Uses of boxplots

Boxplots make it possible to see:

1. Variability
2. Outliers
3. Symmetry
4. Comparisons between data sets

12
Recommended video

13
Example

14
Example

15
Deciles and Percentiles
Deciles divide an ordered data set into 10 parts. There
are 9 deciles: D1, D2, D3, …, D9.

Percentiles divide an ordered data set into 100 parts.
There are 99 percentiles: P1, P2, P3, …, P99.

Example: A test score at the 80th percentile (P80, which equals D8) is
greater than about 80% of all other test scores
and less than or equal to about 20% of the scores.

16
Deciles and Percentiles

$$D_k = x_{\left(\frac{k(n+1)}{10}\right)}$$ with k = 1, 2, …, 9

$$P_k = x_{\left(\frac{k(n+1)}{100}\right)}$$ with k = 1, 2, …, 99
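
One way to try these formulas is Python's standard `statistics.quantiles`. Its "exclusive" method places the cut points at positions k(n+1)/parts and interpolates linearly between neighbouring order statistics, so it agrees with the formulas whenever the position is a whole number. How the slides handle fractional positions is not stated, so the interpolated values should be read as one possible convention. The data are the 15 quiz scores used earlier.

```python
import statistics

scores = [28, 43, 48, 51, 43, 30, 55, 44, 48, 33, 45, 37, 37, 42, 38]

# 9 deciles D1..D9 and 99 percentiles P1..P99 (list index k-1 holds D_k or P_k).
deciles = statistics.quantiles(scores, n=10, method="exclusive")
percentiles = statistics.quantiles(scores, n=100, method="exclusive")

print(len(deciles), len(percentiles))  # 9 99
print(deciles[4])                      # D5  = 43.0
print(percentiles[24])                 # P25 = 37.0
print(percentiles[49])                 # P50 = 43.0
print(percentiles[74])                 # P75 = 48.0
```

P25, P50, and P75 coincide with Q1, Q2, and Q3, and D5 coincides with the median, which is why all of these measures share the single name quantile.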

17
Note: quartiles, deciles and
percentiles are called QUANTILES.

18
Distribution shapes

[Figure: histogram and density curve of a variable; the density is annotated with 100%.]

19
Galton board

Recommended video and applet:


http://www.youtube.com/watch?v=6YDHBFVIvIs
http://www.disfrutalasmatematicas.com/datos/quincunce.html

20
Distribution shapes

A frequency distribution is symmetric when a vertical line can be drawn
through the middle of a graph of the distribution and the resulting halves
are approximately mirror images.

21
Distribution shapes

A frequency distribution is uniform (or rectangular) when all entries, or
classes, in the distribution have equal frequencies. A uniform distribution
is also symmetric.

22
Distribution shapes

A frequency distribution is skewed if the “tail” of the graph elongates more
to one side than to the other. A distribution is skewed left (negatively
skewed) if its tail extends to the left. A distribution is skewed right
(positively skewed) if its tail extends to the right.

23
Summary of distribution shapes

Symmetric: Mean = Median
Right-skewed (positively skewed): Mean > Median > Mode
Left-skewed (negatively skewed): Mean < Median < Mode
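
The ordering of mean, median, and mode can be checked on small samples. The data below are made up for illustration, and the helper name `center_measures` is hypothetical. The ordering is a rule of thumb for single-peaked data, not a guarantee for every data set.

```python
import statistics


def center_measures(data):
    """Return (mean, median, mode) of a sample."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)


# Hypothetical sample with a long tail to the right.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 12]
# Mirroring the sample moves the tail to the left.
left_skewed = [-x for x in right_skewed]

mean_r, median_r, mode_r = center_measures(right_skewed)
mean_l, median_l, mode_l = center_measures(left_skewed)

print(mean_r > median_r > mode_r)  # True: 4.3 > 3 > 2
print(mean_l < median_l < mode_l)  # True: -4.3 < -3 < -2
```

The mean is pulled toward the tail because it uses every value, while the median depends only on the middle of the ordered data.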
24
Distribution Shape and Boxplot

[Figure: three boxplots marking Q1, Q2, and Q3 for a left-skewed, a symmetric, and a right-skewed distribution.]

25
Pearson Correlation Coefficient 𝑟
Measures the strength of the linear relationship
between two quantitative variables.

$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
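
A minimal Python sketch of this computation follows, written directly from the sums of deviations from the means. The function name `pearson_r` and the sample data are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator terms: sums of squared deviations.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cross / math.sqrt(ss_x * ss_y)


xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2, 4, 6, 8, 10])    # points on a rising line
r_down = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a falling line
print(round(r_up, 6), round(r_down, 6))   # 1.0 -1.0
```

Because both numerator and denominator carry the units of x times y, the ratio has no units, whatever the scales of the two variables.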

26
Features of Correlation Coefficient

• Unit free
• Ranges between –1 and 1
• The closer to –1, the stronger the negative linear
relationship
• The closer to 1, the stronger the positive linear
relationship
• The closer to 0, the weaker any linear
relationship

27
Correlation coefficient

Shows strength and direction of correlation

[Figure: scale from -1.0 to 1.0. Negative correlation runs from strong at -1.0 to weak near 0; positive correlation runs from weak near 0 to strong at 1.0.]

28
Scatter Plots of Data with Various
Correlation Coefficients
[Figure: five scatter plots of Y against X, with r = -1, r = -0.6, r = 0, r = 0.6, and r = 1.]
29
Spearman Correlation Coefficient 𝜌
The Spearman correlation coefficient, 𝜌 (rho),
is a measure of the correlation between two continuous
random variables 𝑥 and 𝑦.

$$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$$

where D is the difference between the corresponding
order statistics (ranks) of 𝑥 and 𝑦, and 𝑁 is the number of
pairs.

30
The Spearman correlation coefficient is less sensitive than Pearson's
to values that lie far from what is expected. In this example:
Pearson = 0.30706 Spearman = 0.76270
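
The rank-based computation can be sketched in Python with the usual rank-difference form, ρ = 1 − 6ΣD² / (N(N² − 1)), which is assumed here and holds only when there are no tied values. The function names and the sample data are hypothetical; they are not the data behind the figures quoted above.

```python
def ranks(values):
    """Rank each value from 1 (smallest) to N. Assumes no tied values."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch assumes no tied values")
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


def spearman_rho(xs, ys):
    """Spearman's rho from the squared rank differences D."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 pairs")
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


xs = [1, 2, 3, 4, 5, 6]
ys_extreme = [1, 2, 3, 4, 5, 100]  # last value lies far from the others
ys_swapped = [2, 1, 3, 4, 5, 6]    # first two values exchanged

rho_extreme = spearman_rho(xs, ys_extreme)
rho_swapped = spearman_rho(xs, ys_swapped)
print(rho_extreme)  # 1.0
```

Only the ordering of the values enters the formula, so replacing 100 by any other number larger than 5 leaves 𝜌 unchanged. That is the sense in which the coefficient is less sensitive to extreme values.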

31
Tips

32
What your scientific figure looks like,
vs. what the audience sees

33
Chartjunk

Chartjunk refers to all visual elements
in charts and graphs that are not necessary to
comprehend the information represented on
the graph, or that distract the viewer from this
information.


34
An example of a chart containing
gratuitous chartjunk. This chart uses a
large area and a lot of "ink" (many symbols
and lines) to show only five hard-to-read
numbers, 1, 2, 4, 8, and 16.

35
A map with chartjunk: the gradients inside
each province

36
37
38
Data-ink ratio formula

39
Data-ink ratio formula

40
The value of telling stories with data as
opposed to merely displaying it.

41
What do you think about the picture?

42
