
BEKA 2453 Statistics & Numerical Methods

CHAPTER 1: INTRODUCTION TO STATISTICS
- Qualitative data and quantitative data
- Measures of central tendency and dispersion

BENG 2142 Statistics

WHAT IS STATISTICS?
Statistics is the mathematical science involved in the application of quantitative
principles to the collection, analysis, and presentation of numerical data. The
practice of statistics utilizes data from some population in order to describe it
meaningfully, to draw conclusions from it, and make informed decisions.

- Your company has created a new drug that may cure arthritis. How would you conduct a test to confirm the drug's effectiveness?
- You want to conduct a poll on whether your school should use its funding to build a new athletic complex or a new library. How many people do you have to poll? How do you ensure that your poll is free of bias? How do you interpret your results?


WHAT DO ENGINEERS DO?

An engineer is someone who solves problems of interest to society with the efficient application of scientific principles by:
- Refining existing products
- Designing new products or processes

THE CREATIVE PROCESS

THE ENGINEERING PROCESS


STATISTICS SUPPORTS THE CREATIVE PROCESS


1.1 Qualitative data and Quantitative data

1.1.1 Introduction
Statistics - the science of collecting, organizing, summarizing and analyzing information in order to draw conclusions.
Two types of statistics:
- Descriptive statistics
- Inferential statistics


Descriptive statistics
consists of organizing and summarizing the information collected. It describes that information through numerical measurements, charts, graphs and tables.
Inferential statistics
generalizes results obtained from a sample to the population and measures their reliability.


1.1.2 Basic Terms


Population - consists of all items or elements of interest for a particular decision or investigation.
Example: all FKE students in UTeM.
Sample - a certain number of elements that have been chosen from a population. A sample is a subset of the population.
Example: a list of students of 2BEKG would be a sample from the population of all FKE students in UTeM.


POPULATION VS. SAMPLE


1.1.2 Basic Terms (cont.)


Random sample is a sample drawn in such a
way that each element of the population has a
chance of being selected.
Simple random sample implies that any
particular sample of a specified sample size
has the same chance of being selected as any
other sample.
Element / member is a specific subject or
individual about which the information is
collected.
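
A simple random sample can be sketched with Python's standard `random.sample`, which draws without replacement so that every subset of the requested size is equally likely. The population of 200 student ID numbers and the sample size of 30 are hypothetical values chosen only for illustration.

```python
import random

# Hypothetical population: ID numbers for 200 students (illustrative only).
population = list(range(1, 201))


def simple_random_sample(items, n, seed=None):
    """Draw n distinct elements so that every subset of size n is equally likely."""
    rng = random.Random(seed)  # passing a seed makes the draw repeatable
    return rng.sample(items, n)


sample = simple_random_sample(population, 30, seed=1)
print(len(sample), "elements drawn; smallest five:", sorted(sample)[:5])
```

Each element here is one student, and the sample is a subset of the population, matching the definitions above.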


1.1.2 Basic Terms (cont.)

Variable is a characteristic of the individuals within the sample or population.
Observation / Measurement is the value of a variable for an element.
Data set is a collection of values of one or more variables.
Grouped data set is a collection of data that are grouped in classes.
Population parameter is a descriptive measure computed from population data.
Sample statistic is a descriptive measure computed from sample data.
Outliers / Extreme Values are values that are very small or very large relative to the majority of the values in a data set.


Outlier


PRACTICE PROBLEM
A random sample of 30 middle school students averages 1.8 hours spent on homework each night. It is believed that middle school students spend 2 hours each night on homework. Identify the sample, the population, the sample statistic, and the population parameter.
The sample is the 30 middle school students who were randomly selected. The population is all middle school students.
The sample statistic is $\bar{x} = 1.8$ hours, and the population parameter is $\mu = 2$ hours.
Remember that sample statistics are values that represent a sample, while population parameters are values that represent a population.


1.1.3 Variables
Qualitative variables allow for classification of individuals based on some attribute or characteristic.
Example: the gender of newborn babies; the marital status of people; types of cars.
Quantitative variables provide numerical measures of individuals (values that can be counted or measured).
Example: the weight of children; the number of cars owned.


1.1.3 Variables (cont.)


Quantitative variables can be further classified into two groups:
(a) Discrete Variables.
finite / countable number of possible values.
Example:
The number of heads obtained by flipping a coin five times.
The number of cars that arrive at KFC's drive-through between 1.00 p.m. and 2.00 p.m.


1.1.3 Variables (cont.)


(b) Continuous Variables.
infinite number of possible values that are not
countable. They are obtained by measuring;
include fractions and decimals.
Example
Time spent studying for your first statistics
exam.
The height of volleyball players.


PRACTICE PROBLEM
Determine whether the following variables are qualitative or quantitative.
1. Postal code
2. Salary
3. PTPTN allowances
4. Gender
5. Marital status

Determine whether the following variables are discrete or continuous.
1. Heights of 2BEKG students in FKE.
2. Number of books borrowed from the library by FKE students each day.
3. Number of 2BEKG students who attend the Statistics class every Tuesday.
4. The time taken for 2BEKG students to get to class at 8 o'clock in the morning.


1.1.4 Graphical Methods


Qualitative data can be displayed by using
Bar graph
Pie chart
Example:


Solution:
[Figure: bar graph of Frequency (0 to 300) against Rating, and a pie chart of the same ratings with slices of 65%, 23%, 9% and 3%]


1.1.4 Graphical Methods


Grouped data can be graphed using:
Histogram
Polygon
A cumulative frequency distribution is graphed using:
Ogive (cumulative histogram)


Example:


Solution:
(a) & (b)


Solution:
(c) Histogram

Polygon


Solution:
(d)

PRACTICE PROBLEM
The following scores represent the final examination grades for the Statistics subject:

23 60 79 32 57 74 52 70 82 36 80 77 81 95 41 65 92 85 55 76 52
10 64 75 78 25 80 98 81 67 41 71 83 54 64 72 88 62 74 43 60 78
89 76 84 48 84 90 15 79 34 67 17 82 69 74 63 80 85 61

a) Construct a frequency distribution table with a class width of 10.
b) Determine the class boundaries and class midpoints.
c) Calculate the relative frequencies and percentages for all classes.
d) Construct a frequency histogram for the data.
e) Prepare the cumulative frequency distribution table.
f) Construct an ogive for the cumulative frequency.

MARKS     No. of Students   Relative frequency   Percentage
1-10             1                0.017               2
11-20            2                0.033               3
21-30            2                0.033               3
31-40            3                0.050               5
41-50            4                0.067               7
51-60            7                0.117              12
61-70           10                0.167              17
71-80           16                0.267              27
81-90           12                0.200              20
91-100           3                0.050               5
Total           60                1.000             100

Class boundaries   No. of Students   Class midpoint
0.5-10.5                  1                5.5
10.5-20.5                 2               15.5
20.5-30.5                 2               25.5
30.5-40.5                 3               35.5
40.5-50.5                 4               45.5
50.5-60.5                 7               55.5
60.5-70.5                10               65.5
70.5-80.5                16               75.5
80.5-90.5                12               85.5
90.5-100.5                3               95.5
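
The tabulation for the examination scores can be reproduced with a short Python sketch. It assumes the class limits 1-10, 11-20, ..., 91-100 used in the solution table; the variable names are illustrative only.

```python
scores = [
    23, 60, 79, 32, 57, 74, 52, 70, 82, 36, 80, 77, 81, 95, 41, 65, 92, 85, 55, 76, 52,
    10, 64, 75, 78, 25, 80, 98, 81, 67, 41, 71, 83, 54, 64, 72, 88, 62, 74, 43, 60, 78,
    89, 76, 84, 48, 84, 90, 15, 79, 34, 67, 17, 82, 69, 74, 63, 80, 85, 61,
]

width = 10
# Class limits 1-10, 11-20, ..., 91-100 (assumed to match the solution table).
classes = [(low, low + width - 1) for low in range(1, 101, width)]

n = len(scores)
rows = []
cumulative = 0
for low, high in classes:
    freq = sum(low <= s <= high for s in scores)  # count scores inside the class
    cumulative += freq
    rows.append({
        "class": f"{low}-{high}",
        "boundaries": (low - 0.5, high + 0.5),  # halfway to the neighbouring class
        "midpoint": (low + high) / 2,
        "freq": freq,
        "rel_freq": freq / n,
        "cum_freq": cumulative,
    })

for r in rows:
    print(f"{r['class']:>6}  f={r['freq']:>2}  rel={r['rel_freq']:.3f}  "
          f"mid={r['midpoint']:>4}  cum={r['cum_freq']:>2}")
```

The `cum_freq` column is what parts (e) and (f) of the problem ask for: the ogive plots these running totals against the upper class boundaries.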

BAR GRAPH
[Figure: bar graph of No. of Students (0 to 18) against Marks for the classes 1-10 through 91-100]
HOW ABOUT A HISTOGRAM (NO GAP)?


1.2 Measures of central tendency and dispersion
Measures of central tendency
- Indicate the central point around which observations tend to cluster
- Mean, Mode, Median
Measures of dispersion
- Measures that help us know about the spread of a data set
- Range, Variance, Standard deviation


1.2 Measures of central tendency and dispersion
1.2.1 Numerical Measures

Measures of Central Tendency   Measures of Dispersion
Mean                           Range
Median                         Variance
Mode                           Standard deviation

Skewness:


1.2.2 Measures of Central Tendency


Mean is the arithmetic mean or average.
Median of a variable is the value that lies in the
middle of the data when arranged in ascending
order.
Mode of a variable is the most frequent observation
of the variable that occurs in the data set.


RELATIONSHIP BETWEEN MEAN, MEDIAN AND MODE

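MEAN
POPULATION
Mean
$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N}$
where N is the number of observations in the population

SAMPLE
Mean
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$
where n is the number of observations in the sample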

MEDIAN
Steps in Computing the Median of a Data Set
Step 1: Arrange the data in ascending order.
Step 2: Determine the number of observations, n.
Step 3: Determine the observation in the middle of the data set.
- If the number of observations is odd, then the median is the data value that is exactly in the middle of the data set. That is, the median is the observation that lies in the $\frac{n+1}{2}$ position.
- If the number of observations is even, then the median is the mean of the two middle observations in the data set. That is, the median is the mean of the data values that lie in the $\frac{n}{2}$ and $\frac{n}{2} + 1$ positions.
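
The three steps can be sketched as a small Python function. The function name and the test values are illustrative; note that positions in the steps count from 1, while Python list indices count from 0.

```python
def median(values):
    """Median by the three-step procedure: sort, count, pick the middle."""
    data = sorted(values)              # Step 1: ascending order
    n = len(data)                      # Step 2: number of observations
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2, which is index n // 2 when counting from 0.
        return data[mid]
    # Even n: mean of positions n / 2 and n / 2 + 1 (indices mid - 1 and mid).
    return (data[mid - 1] + data[mid]) / 2


print(median([7, 1, 5]))       # 5
print(median([7, 1, 5, 3]))    # 4.0
```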

MODE
To compute the mode, tally the number of times each data value occurs. The data value that occurs most often is the mode. A set of data can have no mode, one mode or more than one mode. If no data value occurs more often than the others, we say the data has no mode.
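
The tally can be sketched with `collections.Counter` from the Python standard library. The function name `modes` is illustrative, and the sketch adopts the convention stated above: when every value occurs equally often, the data has no mode and an empty list is returned.

```python
from collections import Counter


def modes(values):
    """Return the sorted list of modes; an empty list means 'no mode'."""
    counts = Counter(values)           # tally how often each value occurs
    if not counts:
        return []
    top = max(counts.values())
    all_tied = all(c == top for c in counts.values())
    if top == 1 or (len(counts) > 1 and all_tied):
        return []                      # no value occurs more often than the rest
    return sorted(v for v, c in counts.items() if c == top)


print(modes([1, 2, 2, 3]))        # [2]
print(modes([1, 1, 2, 2, 3]))     # [1, 2]
print(modes([4, 8, 15]))          # []
```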

EXAMPLE
The following data represent the monthly phone bill for six randomly
selected months (in RM).
35.34 42.09 39.43 38.93 43.39 49.26
Calculate the mean, median and mode for the monthly phone bill.

Solutions:
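
The slide's own worked solution did not survive extraction. The arithmetic below is a sketch that applies the definitions of the mean, median and mode directly to the six phone bills.

```latex
\bar{x} = \frac{35.34 + 42.09 + 39.43 + 38.93 + 43.39 + 49.26}{6}
        = \frac{248.44}{6} \approx 41.41

\text{Ordered data: } 35.34,\ 38.93,\ 39.43,\ 42.09,\ 43.39,\ 49.26

\text{Median} = \frac{39.43 + 42.09}{2} = 40.76
\quad (n = 6 \text{ is even, so average the 3rd and 4th values})

\text{Mode: none, since every value occurs exactly once}
```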


1.2.3 Measures of Dispersion


Range of a variable is the difference between the largest data value and the smallest data value.
Variance is based upon the difference between each observation and the mean; that is, it is based upon the deviations about the mean.
Standard deviation tells us how closely the values of a data set are clustered around the mean.


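POPULATION
Mean
$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N}$

Variance
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\left[\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right]$

Standard deviation
$\sigma = \sqrt{\frac{1}{N}\left[\sum_{i=1}^{N} x_i^2 - \frac{\left(\sum_{i=1}^{N} x_i\right)^2}{N}\right]}$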


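SAMPLE
Mean
$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$

Variance
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$

Standard deviation
$s = \sqrt{\frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]}$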
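
The difference between the population and sample formulas is the divisor: N for a population, n - 1 for a sample. The sketch below writes out both, plus the shortcut form of the sample variance that uses the sum of squares; the function names and the eight-value data set are hypothetical and chosen so the arithmetic is easy to check by hand.

```python
import math


def population_variance(values):
    """Mean squared deviation about the population mean (divisor N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Squared deviations about the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Same quantity computed from the sum of x^2 and the square of the sum of x."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]                 # hypothetical data set
print(population_variance(data))                # 4.0
print(math.sqrt(population_variance(data)))     # 2.0
print(sample_variance(data), sample_variance_shortcut(data))
```

The two sample-variance functions agree up to floating-point rounding; the shortcut form is convenient for hand calculation, while the deviation form is usually the safer choice numerically.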

EXAMPLE
The following data represent the monthly phone bill for six randomly
selected months (in RM).
35.34 42.09 39.43 38.93 43.39 49.26
Compute the range, sample variance and sample standard deviation.

Range = Largest Data Value - Smallest Data Value
= 49.26 - 35.34
= 13.92

Sample variance, $s^2$

Sample standard deviation, $s$
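
The numerical working for the variance and standard deviation did not survive extraction. The sketch below applies the shortcut form of the sample variance, with divisor n - 1, to the six phone bills; intermediate values are rounded to four decimal places.

```latex
\sum_{i=1}^{6} x_i = 248.44, \qquad \sum_{i=1}^{6} x_i^2 = 10399.9932

\frac{\left(\sum_{i=1}^{6} x_i\right)^2}{6} = \frac{(248.44)^2}{6} \approx 10287.0723

s^2 = \frac{10399.9932 - 10287.0723}{6 - 1} = \frac{112.9209}{5} \approx 22.58

s = \sqrt{s^2} \approx 4.75
```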

PRACTICE PROBLEM
An engineer is interested in testing the bias in a pH meter. Data are
collected on the meter by measuring the pH of a neutral substance
(pH=7.0). A sample of size 10 is taken with results given by
7.07 7.00 7.10 6.97 7.00 7.03 7.01 7.01 6.98 7.08
Compute the range, sample variance and sample standard deviation.
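
One way to check a hand calculation for this problem is Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample divisor n - 1. This is a sketch for verification, not the course's prescribed method.

```python
import statistics

ph = [7.07, 7.00, 7.10, 6.97, 7.00, 7.03, 7.01, 7.01, 6.98, 7.08]

data_range = max(ph) - min(ph)      # largest value minus smallest value
s2 = statistics.variance(ph)        # sample variance (divisor n - 1)
s = statistics.stdev(ph)            # sample standard deviation

print(f"range = {data_range:.2f}, s^2 = {s2:.5f}, s = {s:.4f}")
```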

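MEASURES OF CENTRAL TENDENCY AND DISPERSION FOR GROUPED DATA
(m = class midpoint, f = class frequency; N or n = sum of the frequencies)

MEASUREMENT          POPULATION                                                                  SAMPLE
MEAN                 $\mu = \frac{\sum mf}{N}$                                                   $\bar{x} = \frac{\sum mf}{n}$
VARIANCE             $\sigma^2 = \frac{1}{N}\left[\sum m^2 f - \frac{(\sum mf)^2}{N}\right]$     $s^2 = \frac{1}{n-1}\left[\sum m^2 f - \frac{(\sum mf)^2}{n}\right]$
STANDARD DEVIATION   $\sigma = \sqrt{\sigma^2}$                                                  $s = \sqrt{s^2}$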
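
For grouped data, each observation is approximated by its class midpoint m and weighted by the class frequency f, so the sample mean is the sum of m times f divided by n, and the sample variance uses the sum of m squared times f in the shortcut form. The sketch below applies these standard grouped-data formulas to the examination marks table from Section 1.1.4 (midpoints 5.5 to 95.5); treating that table as a sample is an assumption made here for illustration.

```python
import math

# Class midpoints and frequencies from the examination marks table.
midpoints = [5.5, 15.5, 25.5, 35.5, 45.5, 55.5, 65.5, 75.5, 85.5, 95.5]
freqs = [1, 2, 2, 3, 4, 7, 10, 16, 12, 3]

n = sum(freqs)                                               # total observations
sum_mf = sum(m * f for m, f in zip(midpoints, freqs))        # sum of m * f
sum_m2f = sum(m * m * f for m, f in zip(midpoints, freqs))   # sum of m^2 * f

mean = sum_mf / n
variance = (sum_m2f - sum_mf ** 2 / n) / (n - 1)             # sample form, divisor n - 1
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean:.2f}, s^2 = {variance:.2f}, s = {std_dev:.2f}")
```

Because midpoints stand in for the raw scores, these values approximate the mean and variance that would be obtained from the ungrouped data.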

EXAMPLE
The following data give the monthly expenditures (in hundred RM) on
food for 30 households randomly selected from the households who
incurred such expenses.
4.57 3.95 6.95 3.80 1.50 3.99 7.84 5.05 8.00 14.75 9.33 1.05
5.08 7.00 9.60 18.99 9.15 11.32 4.75 9.95 3.63 1.99 1.39 13.09
19.31 11.15 7.73 12.00 7.58 16.35
Find the sample mean for the monthly expenditures on food for 30
households

Sample Variance

Sample standard deviation


1.2.4 Reasons for sampling

To contact the whole population would be time consuming.
The cost of studying all the items in a population may be prohibitive.
The physical impossibility of checking all items in the population.
The destructive nature of some tests.
The sample results are adequate.

END OF CHAPTER 1
