
INTRODUCTION TO STATISTICS

Mrs Namakau Monde

Lecture 1
Contents

Expected Outcomes

Types of Data

Data Presentation: Frequency Distribution


Expected Outcomes

After completing this lecture you should be able to:

- Define statistics
- Describe key data collection methods
- Know the types of data
- Know key definitions
- Distinguish between a population and a sample
Branches of Statistics
Statistics is a branch of mathematics that examines ways to
process and analyze data. It is divided into two branches:
- Descriptive statistics, which focuses on collecting, summarizing
  and presenting a set of data
- Inferential statistics, which uses sample data to draw conclusions
  about a population
Frequently Used Terms
- A variable is a characteristic of an item or individual.
- A population consists of all the items or individuals about
  which you want to reach conclusions.
- A sample is the portion of a population selected for analysis.
- A parameter is a measure that describes a characteristic of a
  population.
- A statistic is a measure that describes a characteristic of a
  sample.
Examples

Populations include all full-time students in BBA 240, all likely
voters in the next election, and all the lecturers at UNILUS.

Samples could be 100 full-time students selected for a research
study, 5000 voters in Lusaka, and 20 lecturers at Pioneer Campus.

The average number of courses taken by all full-time students in
BBA 240 is a parameter, while the average number of courses taken
by 100 full-time students selected for a research study is a statistic.
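
The distinction between a parameter and a statistic can be sketched in a few lines of Python. The population below is made up purely for illustration (400 students, each taking between 3 and 7 courses); neither the size nor the values come from the lecture.

```python
import random
import statistics

# Hypothetical population: number of courses taken by each of 400
# full-time students (made-up values, generated only for illustration).
random.seed(1)
population = [random.randint(3, 7) for _ in range(400)]

# Parameter: a measure computed from every member of the population.
parameter = statistics.mean(population)

# Statistic: the same measure computed from a sample of 100 students
# drawn at random (without replacement) from that population.
sample = random.sample(population, 100)
statistic = statistics.mean(sample)

print(f"Population mean (parameter): {parameter:.2f}")
print(f"Sample mean (statistic):     {statistic:.2f}")
```

The two means are usually close but not identical, which is why a statistic is treated as an estimate of the corresponding parameter.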
Sources of Data
Types of Data
- Categorical variables (also known as qualitative variables)
  have values that can only be placed into categories, such as yes
  and no.
- Numerical variables (also known as quantitative variables)
  have values that represent quantities. Numerical variables are
  further identified as being either discrete or continuous
  variables.
- Discrete variables have numerical values that arise from a
  counting process.
- Continuous variables produce numerical responses that arise
  from a measuring process.
Measurement Scales
Statisticians use the terms nominal scale and ordinal scale to
describe the values for a categorical variable and use the terms
interval scale and ratio scale to describe numerical values.
- A nominal scale classifies data into distinct categories in
  which no ranking is implied.
- An ordinal scale classifies values into distinct categories in
  which ranking is implied.

For numerical values:
- An interval scale is an ordered scale in which the difference
  between measurements is a meaningful quantity but does not
  involve a true zero point. For example, shoes for adults sold in
  Zambia are often marked with sizes based on the US or UK
  system. The size below an adult size 1 is a child's size 13, so
  the numbering has no true zero. However, in each system the
  intervals between sizes are equal.
- A ratio scale is an ordered scale in which the difference
  between the measurements involves a true zero point, as in
  height, weight, age, or salary measurements.
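
The practical consequence of a true zero can be illustrated with a short Python sketch. Temperature in degrees Celsius is used here as a standard example of an interval scale; it is not taken from the lecture, and all the numbers are made up for illustration.

```python
import math

def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Ratio scale: height has a true zero, so the ratio of two heights is
# the same whether we measure in centimetres or in inches.
height_a_cm, height_b_cm = 180.0, 90.0
ratio_cm = height_a_cm / height_b_cm
ratio_in = (height_a_cm / 2.54) / (height_b_cm / 2.54)

# Interval scale: the zero of the Celsius scale is arbitrary, so the
# ratio of two temperatures changes when the unit changes.
temp_a_c, temp_b_c = 20.0, 10.0
ratio_c = temp_a_c / temp_b_c
ratio_f = c_to_f(temp_a_c) / c_to_f(temp_b_c)

# Differences, however, remain meaningful: they only get rescaled.
diff_c = temp_a_c - temp_b_c
diff_f = c_to_f(temp_a_c) - c_to_f(temp_b_c)

print("height ratio (cm, in):", ratio_cm, round(ratio_in, 6))
print("temperature ratio (C, F):", ratio_c, round(ratio_f, 2))
print("temperature difference (C, F):", diff_c, diff_f)
```

On a ratio scale a statement such as "twice as tall" is meaningful; on an interval scale only differences are.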
Practice Questions
Frequency Distribution

A frequency distribution is a summary table in which the data are
arranged into numerically ordered classes. Classes are groups that
represent a range of values, called a class interval. Each value can
be in only one class and every value must be contained in one of
the classes.
- To create a useful frequency distribution, you must think
  about how many classes are appropriate for your data and also
  determine a suitable width for each class interval. In general, a
  frequency distribution should have at least 5 classes but no
  more than 15 classes, because having too few or too many
  classes provides little new information. To determine the class
  interval width, you subtract the lowest value from the highest
  value and divide that result by the number of classes you want
  your frequency distribution to have.
Steps in constructing a frequency table

- Step 1: Sort the raw data in ascending order.
- Step 2: Find the range = maximum value − minimum value.
- Step 3: Find the class width (w) = range / (number of class
  intervals). The class width is rounded up, not rounded off; e.g.
  rounding off 2.2 gives 2 but rounding up 2.2 gives 3.
- Step 4: Pick a suitable starting point less than or equal to the
  minimum value. This starting point is the lower limit of the
  first class; add the class width to this lower limit to get the rest
  of the lower limits.
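
The four steps can be sketched as a small Python function. The name `class_limits`, its parameters, and the six-value data set in the usage lines are hypothetical, introduced only for illustration. The sketch assumes the class width is rounded up to a whole number. When the range divides exactly by the number of classes, the maximum value falls on the last upper limit, so a wider class or a lower starting point would be needed.

```python
import math

def class_limits(data, num_classes, start=None):
    """Return (width, lower_limits, upper_limits) for a frequency table."""
    ordered = sorted(data)                         # Step 1: sort ascending
    value_range = ordered[-1] - ordered[0]         # Step 2: max - min
    width = math.ceil(value_range / num_classes)   # Step 3: round UP
    if start is None:
        start = ordered[0]                         # Step 4: starting point
    lower_limits = [start + i * width for i in range(num_classes)]
    upper_limits = [low + width for low in lower_limits]
    return width, lower_limits, upper_limits

# Hypothetical data, chosen only to exercise the function.
width, lowers, uppers = class_limits([2, 9, 4, 7, 12, 5], num_classes=3)
for low, high in zip(lowers, uppers):
    print(f"{low} but less than {high}")
```

Here the range is 12 − 2 = 10, and 10/3 is rounded up to a class width of 4.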
An example

The time it takes 30 laptops to download a movie; use 5 class
intervals.

5     1.5   1     10    11.9  4     2     3     15    2.5
4     4.5   2.5   8     1.5   2     12    9     8     7.5
1.3   3.3   4.1   6.4   7     3     14    10.5  3.4   5.8

Step 2: We first find the range: 15 − 1 = 14.

Step 3: Class width (w) = 14/5 = 2.8, which is rounded up to w = 3.

Step 4: We can use 1 as the lower limit of the first class.
To find the rest of the lower limits, we repeatedly add the class
width, starting from the first lower limit:
1 + 3 = 4
4 + 3 = 7
7 + 3 = 10
10 + 3 = 13
To find the first upper limit, we add the class width to the first
lower limit, i.e.
1 + 3 = 4
To find the rest of the upper limits, we repeatedly add the class
width, starting from the first upper limit:
4 + 3 = 7
7 + 3 = 10
10 + 3 = 13
13 + 3 = 16
Notice that the upper limit of one class is also the lower limit of
the next class.
The table is displayed below.

Time in seconds (s)      Frequency (f)
1 but less than 4        12
4 but less than 7        7
7 but less than 10       5
10 but less than 13      4
13 but less than 16      2
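
The tally in this table can be checked with a short Python sketch. The variable names are illustrative. The tenth value is entered as 2.5, the reading consistent with the stated range of 15 − 1 = 14 and with the frequencies adding up to 30.

```python
# Download times for the 30 laptops (tenth value taken as 2.5; see above).
times = [
    5, 1.5, 1, 10, 11.9, 4, 2, 3, 15, 2.5,
    4, 4.5, 2.5, 8, 1.5, 2, 12, 9, 8, 7.5,
    1.3, 3.3, 4.1, 6.4, 7, 3, 14, 10.5, 3.4, 5.8,
]

lower_limits = [1, 4, 7, 10, 13]  # from Step 4 of the worked example
width = 3                         # from Step 3 of the worked example

frequencies = []
for low in lower_limits:
    high = low + width
    # "low but less than high": the lower limit is included and the
    # upper limit is excluded, so every value lands in exactly one class.
    count = sum(1 for t in times if low <= t < high)
    frequencies.append(count)
    print(f"{low} but less than {high}: {count}")

print("Total:", sum(frequencies))
```

The counts come out as 12, 7, 5, 4 and 2, matching the table, and they sum to the 30 observations.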
Example 2

The following data refer to a certain type of chemical impurity,
measured in parts per million, in 25 drinking water samples
randomly collected from different areas. Make a frequency table
with 5 class intervals displaying the frequencies.

30  12  20  18  27
29  15  21  19  24
24  31  16  32  11
23  25  26  24  25
17  22  26  35  18
Solution

Sorting: 11, 12, 15, 16, ..., 35.

Range = 35 − 11 = 24
Class width (w) = 24/5 = 4.8, which is rounded up to 5.
We will choose the first lower limit to be 11. The second lower
limit will be 11 + 5 = 16. The first upper limit is also 11 + 5 = 16.
Thus the table will be as follows.
Chemical impurity (parts per million)      Frequency (f)
11 but less than 16                        3
16 but less than 21                        6
21 but less than 26                        8
26 but less than 31                        5
31 but less than 36                        3
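
As a cross-check, the same table can be built by computing each value's class index directly rather than testing it against every class. This is a sketch that assumes all values are whole numbers and that the first lower limit is the minimum value; the variable names are illustrative.

```python
import math

# Chemical impurity (parts per million) in the 25 water samples.
impurity = [
    30, 12, 20, 18, 27,
    29, 15, 21, 19, 24,
    24, 31, 16, 32, 11,
    23, 25, 26, 24, 25,
    17, 22, 26, 35, 18,
]

num_classes = 5
start = min(impurity)                          # first lower limit: 11
value_range = max(impurity) - start            # 35 - 11 = 24
width = math.ceil(value_range / num_classes)   # 24 / 5 = 4.8, rounded up to 5

# A value x belongs to class number (x - start) // width, counting from 0.
counts = [0] * num_classes
for x in impurity:
    counts[(x - start) // width] += 1

for i, count in enumerate(counts):
    low = start + i * width
    print(f"{low} but less than {low + width}: {count}")
```

The counts agree with the table (3, 6, 8, 5, 3) and add up to the 25 samples.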
Practice Questions

Construct a frequency distribution table with 7 class intervals for
the marks obtained by 50 students in BBA 240.

23 50 38 42 63 75 12 33 26 39
35 47 43 52 56 59 64 77 15 21
51 54 72 68 36 65 52 60 27 34
47 48 55 58 59 62 51 48 50 41
57 65 54 43 56 44 30 46 67 53
