
Basic Statistics

part 2

Siswanto & Harun Al Rasyid

Metodologi 1 - 2014
What did you learn yesterday?

• Curiosity is in our genes
• To find answers to our questions, we collect data
• Variables: categorical variables, numeric variables
• Central tendency: mean, median, mode
• Outliers
At the end of this session, students will be able
to understand:
• What measures of dispersion are
– Range
– Interquartile range
– Standard deviation
• Types of statistical analysis
– Univariate
– Bivariate
– Multivariate
• Examples of bivariate analysis

Why do I need to know measures of dispersion?

• Measures of central tendency alone are not adequate to describe data.
• Two data sets can have the same mean and still be entirely different.
• Thus, to describe data, one needs to know the extent of variability. This is given by the measures of dispersion.

Dangdut Singing Contest!

VS

The judges

Sule, Andre, Nunung, Azis, Inul

Both groups must sing
“Sakitnya Tuh Di Sini”!

Dangdut Singing Contest!

First group
Scores: 8 7 8 8 7
Mean score: 7.6

Second group
Scores: 10 6 10 6 6
Mean score: 7.6

Measure of dispersion

The 3 most common measures of dispersion:

• Range
• Interquartile range
• Standard deviation

RANGE

The range is the difference between the largest (maximum) and the smallest
(minimum) observation in the data.

Advantage:
It is easy to calculate.

Limitation:
It is very sensitive to outliers and does not use
all the observations in a data set.

RANGE

Example: ages of 10 students

16 16 17 17 17 18 18 18 18 19
Minimum = 16, Maximum = 19
Range = 19 – 16 = 3
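
As a minimal sketch, the range calculation above can be reproduced in Python. The variable names are illustrative assumptions, not part of the slides.

```python
# Ages of the 10 students from the example above.
ages = [16, 16, 17, 17, 17, 18, 18, 18, 18, 19]

minimum = min(ages)
maximum = max(ages)
age_range = maximum - minimum  # range = largest minus smallest observation

print(f"Minimum = {minimum}, Maximum = {maximum}, Range = {age_range}")
```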

INTERQUARTILE RANGE

• The interquartile range is defined as the difference between the 25th and 75th percentiles (also called
the first and third quartiles).
• The interquartile range describes the spread of the middle 50%
of observations.
• Because it uses the middle 50%, it is not affected
by outliers or extreme values.
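
A minimal sketch of the calculation, reusing the ages of 10 students from the range example. Quartiles can be computed under several conventions that give slightly different values on small data sets; the "inclusive" method of Python's `statistics.quantiles` is chosen here only as an assumption for illustration.

```python
import statistics

# Ages of 10 students (same data as the range example).
ages = [16, 16, 17, 17, 17, 18, 18, 18, 18, 19]

# With n=4, quantiles() returns the three cut points Q1, Q2 (median), Q3.
q1, q2, q3 = statistics.quantiles(ages, n=4, method="inclusive")

iqr = q3 - q1  # interquartile range = 75th percentile minus 25th percentile

print(f"Q1 = {q1}, Median = {q2}, Q3 = {q3}, IQR = {iqr}")
```

Changing `method` to `"exclusive"` gives a slightly different IQR for these data, which is why reports should state the convention used.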

INTERQUARTILE RANGE

[Figure: a line from Min to Max (the range), marking Q1 (25th percentile), Q2 (median, 50th percentile), and Q3 (75th percentile); the IQR spans Q1 to Q3.]

INTERQUARTILE RANGE

Advantages:
• It can be used as a measure of variability if the
extreme values are not recorded exactly (as
in the case of open-ended class intervals in a
frequency distribution).
• It is not affected by extreme values.

Limitations:
• It is not amenable to mathematical manipulation.

STANDARD DEVIATION

• Standard deviation (SD) is the most commonly used measure of dispersion.
• It is a measure of the spread of data about the
mean.
• SD is the square root of the sum of squared
deviations from the mean divided by the
number of observations (n) for a population, or by n − 1 for a sample.
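
The verbal definition above can be written as a formula. The symbols are notation assumed here, not taken from the slides: x_i is an observation, \bar{x} is the mean, and n is the number of observations. The first form divides by n; the second is the common sample version, which divides by n − 1.

```latex
SD = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}}
\qquad
SD_{\text{sample}} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```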

STANDARD DEVIATION

Advantages:
• If the observations are from a normal
distribution, then approximately 68% of observations lie
between mean ± 1 SD, 95% lie
between mean ± 2 SD, and 99.7% lie
between mean ± 3 SD.
• Along with the mean, it can be used to detect
skewness.
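
The 68 / 95 / 99.7 percentages can be checked with a short sketch using the standard normal distribution from Python's standard library. The helper name `share_within` is an assumption made for this illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, SD 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of a normal distribution lying within mean ± k SD."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"mean ± {k} SD: {share_within(k):.1%}")
```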

Disadvantage:
It is an inappropriate measure of dispersion for
skewed data.

Nominal variable

Frequency table:
• count, %, valid %, cumulative %
Measure of central tendency:
• mode
Measure of dispersion:
• no measure

Ordinal variable

Frequency table:
• count, %, valid %, cumulative %
Measure of central tendency:
• mode, median
Measure of dispersion:
• no measure

Ratio/interval variable

Frequency table:
• count, %, valid %, cumulative %
Measure of central tendency:
• mode, median, mean
Measure of dispersion:
• range, standard deviation

Dangdut Singing Contest!

First group
Scores: 8 7 8 8 7
Mean score: 7.6
Range = 1, Standard deviation = 0.548

Second group
Scores: 10 6 10 6 6
Mean score: 7.6
Range = 4, Standard deviation = 2.191
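
A minimal sketch that recomputes the figures on this slide. The group labels and the `describe` helper are assumptions for illustration. `statistics.stdev` divides by n − 1 (sample SD), and this version reproduces the SD values shown on the slide.

```python
import statistics

# Judges' scores for the two groups, as shown on the slide.
scores = {
    "first group": [8, 7, 8, 8, 7],
    "second group": [10, 6, 10, 6, 6],
}


def describe(values):
    """Return the mean, range, and sample standard deviation of a list."""
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),  # sample SD: divides by n - 1
    }


summary = {name: describe(values) for name, values in scores.items()}

for name, stats in summary.items():
    print(f"{name}: mean = {stats['mean']:.1f}, "
          f"range = {stats['range']}, SD = {stats['sd']:.3f}")
```

Both groups have the same mean, but the second group's scores are far more spread out, which only the range and SD reveal.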

The standard deviation measures how
concentrated the data are around the
mean; the more concentrated, the smaller
the standard deviation.

Table: History of breastfeeding after delivery

Duration of breastfeeding

                        Frequency   Percent   Valid Percent   Cumulative Percent
Valid     < 7 days           3         5.1          5.3               5.3
          7 - 30 days        2         3.4          3.5               8.8
          31 - 60 days       1         1.7          1.8              10.5
          > 60 days         51        86.4         89.5             100.0
          Total             57        96.6        100.0
Missing   System             2         3.4
Total                       59       100.0
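
The percentage columns of this table can be derived from the frequencies alone. The sketch below is a minimal illustration, with category labels given in English and variable names that are assumptions. "Percent" divides by all cases including missing ones, "valid percent" divides by non-missing cases only, and "cumulative percent" is the running sum of valid percent.

```python
# Frequencies from the table above (duration of breastfeeding).
counts = {
    "< 7 days": 3,
    "7 - 30 days": 2,
    "31 - 60 days": 1,
    "> 60 days": 51,
}
missing = 2

valid_total = sum(counts.values())   # cases with a recorded answer
grand_total = valid_total + missing  # all cases, including missing

rows = []
cumulative = 0.0
for category, count in counts.items():
    percent = 100 * count / grand_total
    valid_percent = 100 * count / valid_total
    cumulative += valid_percent
    rows.append((category, count, round(percent, 1),
                 round(valid_percent, 1), round(cumulative, 1)))

for row in rows:
    print(row)
```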
A summary of what we have learned

How do I do research?
• Define a question
• Generate a hypothesis & identify variables
• Collect data
• Describe data

What is a variable?
• Categorical
• Numeric

How do I describe the data I collect?
• Frequency distribution
• Central tendency
• Dispersion

How many types of statistical analysis are there?

Based on the number of variables involved in the analysis at a time, we
have three types of analysis:

• Univariate analysis
• Bivariate analysis
• Multivariate analysis

What is a univariate analysis?

• The examination of the distribution of cases on only one variable at a time
• For example: frequency distribution, mean, median, mode
• Purpose: description

What is a bivariate analysis?

• The examination of two variables simultaneously
• For example: the relationship between gender and favorite music
• Examples of tests: t-test, one-way ANOVA, chi-square, correlation test
• Purpose: determining the empirical relationship between the two variables

Bivariate analysis – statistical test: Comparison

Categorical variable vs continuous variable

• Two groups: t-test
• More than two groups: one-way ANOVA
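
As a rough sketch of what these two tests compute, the functions below calculate the test statistics by hand using only the standard library. The height data are hypothetical, and the function names are assumptions for illustration. The t statistic shown is the equal-variance (pooled) form. Turning either statistic into a p-value needs the t or F distribution, which statistical software normally provides and which is not covered here.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Two-sample t statistic, assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def one_way_anova_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_values)
    k, n = len(groups), len(all_values)
    means = [statistics.mean(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical body heights in cm (not data from the slides).
male_heights = [170, 172, 168, 175, 171]
female_heights = [160, 162, 158, 165, 161]

t_stat = pooled_t_statistic(male_heights, female_heights)
f_stat = one_way_anova_f(male_heights, female_heights)
print(f"t = {t_stat:.3f}, F = {f_stat:.3f}")
```

With exactly two groups the F statistic equals the square of the pooled t statistic, which is why one-way ANOVA is described as the extension of the t-test to more than two groups.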

Bivariate analysis – statistical test: Comparison

Examples:

- Is there any difference in body height between male and female students?

- Is there any difference in body weight by blood type (A, B, AB, and O)?

Bivariate analysis – statistical test: Correlation

Continuous variable vs continuous variable

• Both variables have a symmetrical (normal) distribution of data: Pearson correlation test
• One or both variables have a skewed (not normal) distribution of data: Spearman correlation test
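
A minimal sketch of the two coefficients, written by hand with the standard library. The data and function names are hypothetical. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the data, with tied values given the average of their ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank values from 1 to n; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of 1-based positions
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(f"Pearson r = {pearson(x, y):.3f}")
print(f"Spearman rho = {spearman(x, y):.3f}")
```

Because Spearman's coefficient only uses the ordering of the values, it is less affected by skewness and extreme values than Pearson's coefficient.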
Bivariate analysis – statistical test: Correlation

Examples:

- Is there any correlation between the number of children in a family and the salary of the parents?

- Is there any correlation between IQ score and academic performance (grade point average, IPK)?

Bivariate analysis – statistical test: Association

Categorical variable vs categorical variable

• Chi-square test
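
A minimal sketch of the chi-square statistic for a contingency table, computed by hand. The counts and the function name are hypothetical. The statistic compares each observed count with the count expected if the two variables were unrelated. Obtaining a p-value requires the chi-square distribution, which is not covered here.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a 2-D table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables are independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof


# Hypothetical counts: rows are two groups, columns are two categories.
observed = [
    [10, 20],
    [20, 10],
]

chi2, dof = chi_square_statistic(observed)
print(f"chi-square = {chi2:.3f}, degrees of freedom = {dof}")
```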

Bivariate analysis – statistical test: Association

Examples:

- Is there any association between blood type and personality type?

- Is there any association between gender and fashion style?

What is a multivariate analysis?

• The examination of more than two variables simultaneously
• For example: the relationship between gender, race, and blood pressure
• Examples of tests: multiple linear regression, multinomial logistic regression
• Purpose: determining the empirical relationship among variables

Any questions?
