CHAPTER III

  

Prof. Ir. A. Caroline Sutandi, S.T., M.T., Ph.D., IPU

 Civil Engineering Department 



Parahyangan Catholic University
2023
INTRODUCTION [1]

Two major parts of statistics work with a set of measurements
to be organized, summarised, and described:

❑ Descriptive Statistics
❑ Inferential Statistics

❑ Descriptive Statistics
make sense of the data by reducing
a large set of measurements to a few
summary measures that provide a good,
rough picture of the original measurements.

❑ Inferential Statistics
use the sample to draw conclusions about
the population from which the sample was drawn.
DESCRIBING DATA ON A SINGLE
VARIABLE [1]

There are several ways to describe the data, including:

❑ Graphical Methods
❑ Measures of Central Tendency
❑ Measures of Variability

❑ Graphical Methods

[Figure: chart with category shares of 25%, 45%, and 30%; annotations: each category at least 5%, categories in ascending or descending order]

❑ Measures of Central Tendency

Graphical descriptive measures are followed by numerical
descriptive measures for two reasons:
• Graphical descriptive measures are
inappropriate for statistical inference
• Numerical descriptive measures
convey a picture of the frequency distribution

The two most common types of numerical
descriptive measures are:

❑ measures of Central Tendency
(the centre of the distribution of measurements)
❑ measures of Variability
(how the measurements vary about the centre of the distribution)

❑ for a population → Parameters
❑ for a sample → Statistics

Measures of Central Tendency

• Mode – the measurement that occurs most often
(highest frequency)
• Median – the middle value when the
measurements are arranged from lowest
to highest
• Mean – the sum of the measurements divided by
the total number of measurements
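
The three definitions above can be checked on a small data set. The sketch below uses Python's standard `statistics` module; the measurements are hypothetical and chosen only so that the three measures differ.

```python
from statistics import mean, median, multimode

# Hypothetical measurements, chosen only for illustration.
measurements = [3, 5, 5, 6, 7, 8, 8, 8, 40]

modes = multimode(measurements)   # every value with the highest frequency
middle = median(measurements)     # middle value of the sorted measurements
average = mean(measurements)      # sum of measurements / number of measurements

print("mode(s):", modes)
print("median:", middle)
print("mean:", average)
```

`multimode` returns a list because a data set can have more than one mode.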

Major characteristics of Mode:

- It is the most frequent or probable measurement of
the data set;
- There can be more than one mode for a data set;
- It is not influenced by extreme measurements;
- Modes of subsets cannot be combined to
determine the mode of the complete data set;
- For grouped data, its value can change depending
on the categories used;
- It is applicable to both qualitative and quantitative
data.

Major characteristics of Median:

- It is the central value: 50% of the measurements lie
above it and 50% fall below it;
- There is only one median for a data set;
- It is not influenced by extreme measurements;
- Medians of subsets cannot be combined to
determine the median of the complete data set;
- For grouped data, its value is rather stable even
when the data are organized into different
categories;
- It is applicable to quantitative data only.

Major characteristics of Mean:

- It is the arithmetic average of the measurements in
a data set;
- There is only one mean for a data set;
- Its value is influenced by extreme measurements;
trimming can help to reduce the degree of
influence;
- Means of subsets can be combined to determine
the mean of the complete data set;
- It is applicable to quantitative data only.
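
Several of the characteristics listed above can be illustrated numerically. The sketch below is not taken from the slides: the data are hypothetical, and `trimmed_mean` is an illustrative helper that assumes trimming means dropping the same proportion of values from each end of the sorted data.

```python
from math import isclose
from statistics import mean, median

def trimmed_mean(data, proportion):
    """Mean after dropping `proportion` of the values from each end (assumed convention)."""
    ordered = sorted(data)
    k = int(len(ordered) * proportion)  # number of values dropped per end
    return mean(ordered[k:len(ordered) - k])

# Hypothetical data with one extreme measurement (40).
data = [3, 5, 5, 6, 7, 8, 8, 8, 10, 40]

overall_mean = mean(data)            # pulled upward by the outlier
overall_median = median(data)        # not influenced by the outlier
tm = trimmed_mean(data, 0.10)        # drops 3 and 40 before averaging

# Means of subsets combine through a weighted average of subset sizes.
subset_a, subset_b = data[:4], data[4:]
combined_mean = (
    len(subset_a) * mean(subset_a) + len(subset_b) * mean(subset_b)
) / len(data)

# Medians of subsets do not combine in the same way.
subset_medians = (median(subset_a), median(subset_b))

print("mean:", overall_mean, "trimmed mean:", tm, "median:", overall_median)
print("combined mean from subsets:", combined_mean)
print("subset medians:", subset_medians)
```

The combined mean reproduces the overall mean exactly, while neither subset median (nor their average) equals the overall median.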

Example of the median of grouped data
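
The worked example from the original slide is not recoverable here. As a stand-in, the sketch below applies a common textbook approximation for the median of grouped data, median ≈ L + (w / f) × (0.5n − cf), where L is the lower limit of the class containing the median, w its width, f its frequency, and cf the cumulative frequency of the classes before it. The class intervals and the function name `grouped_median` are hypothetical.

```python
def grouped_median(classes):
    """Approximate the median from (lower, upper, frequency) class intervals."""
    n = sum(freq for _, _, freq in classes)
    if n == 0:
        raise ValueError("no measurements")
    cumulative = 0  # frequency of all classes before the current one
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= n / 2:
            width = upper - lower
            # Interpolate linearly inside the class that contains the median.
            return lower + width / freq * (n / 2 - cumulative)
        cumulative += freq

# Hypothetical frequency table: (lower limit, upper limit, frequency).
classes = [(10, 20, 4), (20, 30, 10), (30, 40, 16), (40, 50, 8), (50, 60, 2)]
print(grouped_median(classes))
```

With these classes, n = 40, the median class is 30 to 40, and the approximation gives 30 + (10 / 16) × (20 − 14) = 33.75.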



Example of the mean of grouped sample data
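
The original worked example is not recoverable here. A minimal sketch of the usual approximation follows, in which every measurement in a class is treated as if it sat at the class midpoint, so that the mean ≈ Σ(frequency × midpoint) / n. The frequency table and the name `grouped_mean` are hypothetical.

```python
def grouped_mean(classes):
    """Approximate the mean from (lower, upper, frequency) class intervals."""
    n = sum(freq for _, _, freq in classes)
    # Each class contributes frequency × midpoint to the total.
    total = sum(freq * (lower + upper) / 2 for lower, upper, freq in classes)
    return total / n

# Hypothetical frequency table: (lower limit, upper limit, frequency).
classes = [(10, 20, 4), (20, 30, 10), (30, 40, 16), (40, 50, 8), (50, 60, 2)]
print(grouped_mean(classes))
```

Here the weighted total is 1340 over n = 40 measurements, giving 33.5.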



Relation among mean, trimmed mean (TM), median (Md), and mode (Mo)

[Figure annotation: extreme values (outliers)]


❑ Measures of Variability

These measure the spread of the data:

▪ range
▪ pth percentile
▪ variance
▪ standard deviation

▪ Range
is the difference between the largest and the
smallest measurements of a data set.

▪ pth percentile
is the value that has at most p% of the
measurements below it and at most (100 − p)% above it.

IQR (interquartile range) is the difference between
the 75th and 25th percentiles.
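
The percentile definition and the IQR can be tried on a small data set, with the range shown for comparison. The sketch below is illustrative only: the data are hypothetical, `fits_percentile_definition` is a helper written for this example, and `statistics.quantiles` with `method="inclusive"` is just one of several percentile conventions, so other software may return slightly different quartiles.

```python
from statistics import quantiles

def fits_percentile_definition(data, value, p):
    """True if at most p% of data lie below value and at most (100 - p)% above it."""
    n = len(data)
    below = sum(1 for y in data if y < value)
    above = sum(1 for y in data if y > value)
    return below <= p / 100 * n and above <= (100 - p) / 100 * n

# Hypothetical measurements, e.g. spot speeds in km/h.
speeds = [42, 45, 47, 50, 52, 55, 58, 60, 75]

data_range = max(speeds) - min(speeds)

# Quartiles are the 25th, 50th, and 75th percentiles.
q1, q2, q3 = quantiles(speeds, n=4, method="inclusive")
iqr = q3 - q1

print("range:", data_range)
print("Q1, median, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("Q1 fits the definition:", fits_percentile_definition(speeds, q1, 25))
```

The single large value (75) widens the range considerably but leaves the IQR unchanged.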

▪ Variance
of n measurements y₁, y₂, ..., yₙ is the sum of the squared
deviations from the mean divided by n − 1.
σ² denotes the variance of a population
s² denotes the variance of a sample

Standard deviation (σ, s) is the positive
square root of the variance.
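
The definition above translates directly into a few lines of code. The sketch below computes the sample variance and standard deviation by hand and compares them with Python's `statistics` module. The data are hypothetical, and the population variance is computed with the usual convention of dividing by the number of measurements instead of n − 1.

```python
from math import isclose, sqrt
from statistics import pvariance, stdev, variance

# Hypothetical sample of n = 8 measurements.
y = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(y)
y_bar = sum(y) / n                                   # sample mean
sum_sq_dev = sum((yi - y_bar) ** 2 for yi in y)      # sum of squared deviations

s2 = sum_sq_dev / (n - 1)     # sample variance, divisor n - 1
s = sqrt(s2)                  # sample standard deviation (positive root)
sigma2 = sum_sq_dev / n       # variance if y were a whole population

print("sample variance:", s2, "library:", variance(y))
print("sample standard deviation:", s, "library:", stdev(y))
print("population variance:", sigma2, "library:", pvariance(y))
```

For these values the mean is 5 and the sum of squared deviations is 32, so s² = 32 / 7 and the population variance is 4.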

Example:

Example for grouped data:

Midpoint m=586
Interval width w=69
SUMMARIZING DATA FROM MORE
THAN ONE VARIABLE [1]

Besides the graphical and numerical methods above,
there are other methods for summarising data on
more than one variable. They serve as an introduction to
chi-square methods, analysis of variance, and
regression methods.

Example of contingency table:



Example of percentage comparison from contingency table:
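
The tables from the original slides are not recoverable here. As a stand-in, the sketch below builds a contingency table from paired categorical observations and then converts each row to percentages so the rows can be compared. The two variables (vehicle type and time period) and all counts are hypothetical.

```python
from collections import Counter

# Hypothetical paired observations: (vehicle type, time period).
observations = (
    [("car", "peak")] * 30
    + [("car", "off-peak")] * 20
    + [("motorcycle", "peak")] * 40
    + [("motorcycle", "off-peak")] * 10
)

counts = Counter(observations)
rows = sorted({r for r, _ in observations})
cols = sorted({c for _, c in observations})

# Contingency table: one cell count per (row category, column category).
table = {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Percentage comparison: each cell as a share of its row total.
row_percent = {
    r: {c: 100 * table[r][c] / sum(table[r].values()) for c in cols}
    for r in rows
}

for r in rows:
    print(r, table[r], row_percent[r])
```

Row percentages make the two vehicle types comparable even though their row totals could differ.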

Example of relationship between productivity
and released time (R), bonus pay (B), and
profit sharing (P)

Example of scatter plot:


KEY FORMULAS [1]