Introduction to Statistics Lectures by Neelam Younas


Introduction
Statistics is used in almost all fields of human activity, for
example: business, the humanities, computer science,
agriculture, medicine, etc.

Definition of Statistics

Definition
Statistics is the science of collecting, organizing, presenting,
analyzing, and interpreting numerical data to assist in
making more effective decisions.


Descriptive Statistics
Definition
Tools for summarizing, organizing & presenting data in an
informative way.

The tools are:


- Tables & Graphs
- Measures of Central Tendency
- Measures of Variability

Examples:
- Average rainfall in Oxford last year
- Number of car thefts in Cincinnati last semester
- Percentage of seniors in this class
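As a minimal sketch of the tools listed above, the following computes two measures of central tendency and two measures of variability. The rainfall figures are hypothetical values chosen only for illustration; they are not real data for any place.

```python
import statistics

# Hypothetical monthly rainfall totals in millimetres (illustrative values only).
rainfall_mm = [52, 41, 38, 45, 60, 48, 55, 62, 58, 70, 66, 59]

# Measures of central tendency.
mean_rainfall = statistics.mean(rainfall_mm)
median_rainfall = statistics.median(rainfall_mm)

# Measures of variability.
spread = statistics.pstdev(rainfall_mm)            # standard deviation of the 12 values
value_range = max(rainfall_mm) - min(rainfall_mm)  # largest minus smallest value

print(f"mean={mean_rainfall:.1f} median={median_rainfall} "
      f"stdev={spread:.1f} range={value_range}")
```

Each number summarizes the twelve values without making any claim beyond the data themselves, which is what distinguishes descriptive from inferential statistics.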
Inferential Statistics
Definition
The methods used to estimate a property of a population
on the basis of a sample.

What is the difference between a population and a sample?
Population:
A population is a collection of all possible individuals, objects, or
measurements of interest.
The number of entities in a population, called the population
size, is denoted by “N”.
Examples:
- All the students enrolled at King Saud University.
- All the students in the IR class.


Sample:
A portion, or part, of the population of interest.

The number of entities in a sample, called the sample size,
is denoted by “n”.
The Parameter & Statistic
Parameter:
A numerical measure that is calculated from the population

Statistic:
A numerical measure that is calculated from the Sample
Examples of Parameters & Statistics
Characteristic            Population parameter   Sample statistic
Mean                      µ (mu)                 x̄ (x bar)
Standard deviation        σ (sigma)              s
Proportion                π (pi)                 p
Correlation coefficient   ρ (rho)                r
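The distinction between a parameter and a statistic can be sketched in code. The population below is hypothetical (randomly generated exam marks), and the sizes N = 200 and n = 25 are assumptions made only for this illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: exam marks of N = 200 students.
population = [random.randint(40, 100) for _ in range(200)]

# A sample of n = 25 students drawn without replacement.
sample = random.sample(population, 25)

mu = statistics.mean(population)       # parameter: population mean
sigma = statistics.pstdev(population)  # parameter: population standard deviation
x_bar = statistics.mean(sample)        # statistic: sample mean
s = statistics.stdev(sample)           # statistic: sample standard deviation (n - 1 divisor)

print(f"N={len(population)} mu={mu:.2f} sigma={sigma:.2f}")
print(f"n={len(sample)} x_bar={x_bar:.2f} s={s:.2f}")
```

The parameters are fixed for a given population, while the statistics change whenever a different sample is drawn.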


Stages of a Statistical Study

Data collection
Data collection is the foundation of statistics. There are different
methods of collecting data and different sources of data.
We study this topic under three points:
(1) Types of variables & levels of measurement.
(2) Sources of data.
(3) Methods of data collection.
(1) Types of Variables & Levels of Measurement
Types of variables:


1- Qualitative Variables:
Variables which assume non-numerical values.
(e.g.: Gender, Educational level, Student grade, blood
group)
2- Quantitative Variables:
Variables which assume numerical values.
(e.g.: height, weight, number of floors of building)

(2-1) Discrete Variables
Variables which assume a finite or countable number
of possible values. Usually obtained by counting.
There are usually “gaps” between values.
(e.g.: number of traffic accidents, number of days absent in a
work week for an employee, size of family, ...)
(2-2) Continuous Variables
Variables which assume an infinite (uncountable) number of possible
values. Usually obtained by measurement.
The variable can assume any value within a specified range.
(e.g.: age, income, expenditures, temperature, ...)
Note:
Variables can be considered independent or
dependent according to their effect on other variables.
Independent variable: a variable that
has an effect on another variable under certain
circumstances.
Dependent variable: a variable that is
affected by one or more other variables under certain
circumstances.


Levels of Measurement:

Nominal-level Data:
For the nominal level of measurement, observations of a qualitative
variable can only be classified and counted. There is no particular
order to the labels.
Here, the data is divided into groups or categories.
- Each category has a name.
- No units of measurement are given.
This is the lowest level of measurement.
Examples:
- Marital status (Divorced, Married, Single, Widowed).

- Class subject in the College of Business Administration.
- Area code.
- Blood group (A, B, AB, O).
- Gender (male, female).
- Place of birth.
Ordinal-level Data:
This is data which cannot be given as measured values; it is given in orders
or ranks. This level applies to qualitative variables and is the next higher
level of measurement above nominal.
Examples:
- Level of school (elementary school, secondary school, high school)
- Educational level of an adult (Bachelor degree, Master degree,
Doctorate degree)
- Rank in the military (Field Marshal, General, Brigadier, Colonel, Captain,
Lieutenant).
- Position in a committee (President, Vice president, Member)
- Health condition for new patient received by a hospital (Critical,
Serious, Moderate, Minor).
- A student's class standing over the four years of college (freshman,
sophomore, junior, senior).
- The rank of the faculty members in the university (professor, associate
professor, assistant professor, lecturer, teaching assistant).
- Academic grade in the university (A, B, C, D, F).
- Clothing size (Large, Medium, Small).
- Data on opinion:
Answers (I strongly agree, I agree, I do not agree, I strongly oppose).
These answers can only be given orders such as 1, 2, 3, 4, consecutively.
Scale data
Scale data is data given in terms of units of
measurement, i.e. values (e.g. age in years, weight in
kilograms).
Scale data is sub-divided into two levels:
Interval-level data:
This level applies to quantitative variables and is higher than the
ordinal level.
Examples:
- Temperature.
If in the morning the temperature was 20 °C, and at noon it
was 40 °C, we cannot say that the temperature of the
atmosphere has doubled at noon. Ratio comparisons
between the two values have no meaning, although their
difference (20 degrees) does.
In interval data, zero is an arbitrary point, not a true zero. If the
temperature is zero, this does not mean that there is no
temperature in the atmosphere.

Ratio-level data:
This level applies to quantitative variables and is the highest level of
measurement.
Examples:
- Height.
- Weight.
- Time spent to complete a certain task.
- Family income.
In this case, different values can be expressed as ratios.
If (X) has a height of 6 ft, (Y) of 3 ft, and (Z) of 2 ft, we can
say that the height of (X) is twice that of (Y), and that of (Z) is
one-third (1/3) of that of (X).
Thus, the distances between different values can be
measured and known exactly, and zero means a true absence of the
quantity.
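The difference between interval and ratio data can be checked with a little arithmetic. The sketch below is illustrative only: it uses the 20 °C and 40 °C readings from the temperature example and converts them to Kelvin, a scale with a true zero, which is an addition made here to show why the Celsius ratio is misleading.

```python
def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to Kelvin (ratio scale with a true zero)."""
    return c + 273.15

morning_c, noon_c = 20.0, 40.0

# Dividing Celsius readings gives 2.0, but the result depends on where the
# arbitrary zero of the scale was placed, so it carries no physical meaning.
naive_ratio = noon_c / morning_c

# On a scale with a true zero the ratio is meaningful, and it is far from 2.
true_ratio = celsius_to_kelvin(noon_c) / celsius_to_kelvin(morning_c)

print(f"Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:  {true_ratio:.3f}")

# Heights are ratio-level data, so their ratios can be read directly.
x_ft, y_ft, z_ft = 6, 3, 2
print(f"X is {x_ft / y_ft:.0f} times Y; Z is {z_ft / x_ft:.3f} of X")
```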


(2) Sources of Data
Primary sources:
These are sources of data collected by a particular
organization from its own resources for its own use.
Examples:
- Questionnaire.
- Interviews.
- Focus Group Discussion (FGD).
- Observation.
- Check List.
Secondary sources:
These are sources of data that were collected, organized,
presented, and possibly described by some organization for its
own use, and then published so that they can be used by other
organizations.
Examples:
Published data, newspapers, periodicals, trade associations,
research centers.

(3) Methods of Data Collection


1- Complete enumeration (population)
Collect measurements from the entire population, e.g.:
- Determine the average grade on a Statistics exam.
- Measure the salaries of all 50 state governors.

(i) A finite population:
Is one whose entities can be counted from the first
to the last element (N is fixed).
Examples:
- All students enrolled in a course (finite and small).
- The population of all school teachers in the Kingdom of Saudi
Arabia (finite and large).
(ii) An infinite population:
Is one whose elements cannot be counted, that is,
there is no last element in the population (N is unlimited).
In practice, populations too large to count are treated as infinite.
Examples:
- The population of all fish in the Red Sea.
- The population of all red blood cells in the blood of a patient.

2- A sample
A portion, or part, of the population of interest.

Reasons to sample:
1. To contact the whole population would often be time-consuming.
2. The cost of studying all the items in a population may be
prohibitive.
3. The physical impossibility of checking all items in the population
(if the population is infinite, finite but very large, or dynamic).
4. The destructive nature of certain tests.
5. The sample results are usually adequate.


What Errors Are Present in a Sample?

Sampling errors:
Sampling error is inherent in the method of sampling and refers to the
heterogeneity or chance differences from sample to sample.
Leaving a part of the population out of the
sample results in a loss of information.
Different samples selected from the same population give
different measurements (results).
Therefore, the mere fact that we use a sample for data
collection causes sampling error.
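A small simulation can make sampling error concrete. This is only a sketch under assumed values: the population (the integers 1 to 100), the sample size of 10, and the number of repetitions are all hypothetical choices.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

population = list(range(1, 101))  # hypothetical population, N = 100
mu = statistics.mean(population)  # population mean (the parameter)

errors = []
for draw in range(1, 6):
    sample = random.sample(population, 10)  # a fresh sample of n = 10
    x_bar = statistics.mean(sample)         # its sample mean (the statistic)
    errors.append(x_bar - mu)               # sampling error of this sample
    print(f"sample {draw}: x_bar={x_bar:.1f} error={x_bar - mu:+.1f}")
```

Each sample is drawn from the same population in the same way, yet the sample means differ from one another and from the population mean purely by chance.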
Non-sampling errors (Bias):
Bias is an error committed by somebody, consciously or
unconsciously, during any stage of the survey.
That somebody may be:
- The interviewer.
- The respondent.
- The typist (computer processing errors).
- The organizer of the survey.
- Any other person involved in the survey.


Non-sampling errors also arise due to:
- Inadequate frames.
- Unsatisfactory questionnaires.
- Incomplete coverage of sample units.

Types of Samples

Probability Sampling:
Each unit in the population has a known
probability of being selected in the sample.
In the simplest designs, each ultimate unit in the population has an
equal chance (probability) of being selected in the sample.


(1) Simple Random Sample (SRS):
Each unit in the population has the same chance of being
selected in the sample.
This can only be done by using a random numbers table
or any other randomization device, e.g. a computer or a
lottery.
When the population is homogeneous, the best sampling
method to follow is simple random sampling.
SRS is not a good design if the population under study is
very large or heterogeneous.
How do we use the random numbers table?
See the following example:
Example:
We want to select an SRS where the
population size (N) = 75 and the sample size (n) = 15.
Step (1): Assign a number to each item in the frame from
1 to N (1 to 75).
Step (2): Select the starting point in the random numbers table.
Suppose we start at the intersection of row 3 and
column 2 and move horizontally, reading two digits at a time
because N = 75 has two digits (we may move horizontally or vertically).
Step (3): We select the item numbers:
71, 57, 18, 37, 22, 75, 65, 17, 63, 11, 31, 30, 19, 66, 46
Note: We disregard any number greater than N and any number that
has already been selected.


Part of a table of random numbers

39634 62349 74088 65564 16379 19713 39153 69459 17986 24537
14595 35050 40469 27478 44526 67331 93365 54526 22356 93208
30734 71571 83722 79712 25775 65178 07763 82928 31131 30196
64628 89126 91254 24090 25752 03091 39411 73146 06089 15630
42831 95113 43511 42082 15140 34733 68076 18292 69486 80468
80583 70361 41047 26792 78466 03395 17635 09697 82447 31405
00209 90404 99457 72570 42194 49043 24330 14939 09865 45906

If we move vertically from the same starting point, the item numbers
selected for the sample are:
71, 57, 18, 26, 11, 37, 3, 61, 40, 47, 46, 22, 25, 44, 35
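The horizontal reading procedure can be sketched as a short function. This is an illustrative sketch, not a standard routine: the function name `read_table` is hypothetical, and the digit string is copied from row 3 (from column 2 onward) and row 4 of the table above. It reads two digits at a time and skips zero, any number greater than N, and any number already chosen.

```python
def read_table(digits, population_size, sample_size):
    """Read fixed-width numbers from a digit stream, skipping invalid ones."""
    width = len(str(population_size))  # two digits for N = 75
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        number = int(digits[i:i + width])
        # Keep the number only if it labels a unit (1..N) not yet selected.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
        if len(chosen) == sample_size:
            break
    return chosen

# Row 3 from column 2 onward, then row 4, read left to right.
row3 = "71571 83722 79712 25775 65178 07763 82928 31131 30196"
row4 = "64628 89126 91254 24090 25752 03091 39411 73146 06089 15630"
digits = (row3 + row4).replace(" ", "")

sample = read_table(digits, population_size=75, sample_size=15)
print(sample)
```

On a computer the table is usually unnecessary: `random.sample(range(1, 76), 15)` from the Python standard library draws 15 distinct unit numbers between 1 and 75 directly.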


(2) Systematic Random Sample
Here are the steps you need to follow in order to achieve
a systematic random sample:
1- Number the units in the population from 1 to N.
2- Decide on the n (sample size) that you want or need.
3- Compute the interval size: k = N/n.
4- Randomly select an integer between 1 and k.
5- Starting from that unit, take every kth unit.
Example: Selection of a Systematic Sample

- If the first unit selected from the random numbers table is number 4
and the interval is k = 5, the units of the sample will be:
4, 9, 14, 19, 24, 29, ..., 99.
- A systematic sample has the advantage over SRS of good
coverage of the population.
- However, it has the same disadvantages.
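The steps for a systematic random sample can be sketched as follows. The function name `systematic_sample` is hypothetical, and the values N = 100 and n = 20 are assumptions chosen because they give the interval k = 5 implied by the units 4, 9, 14, ..., 99. The sketch also assumes that N is an exact multiple of n.

```python
import random

def systematic_sample(population_size, sample_size, start=None):
    """Return the unit numbers of a systematic sample (assumes n divides N)."""
    k = population_size // sample_size  # interval size k = N / n
    if start is None:
        start = random.randint(1, k)    # random start between 1 and k inclusive
    # Take the starting unit, then every kth unit after it.
    return [start + i * k for i in range(sample_size)]

# Reproduce the example: the first unit is 4 and the interval is 5.
units = systematic_sample(100, 20, start=4)
print(units)

# With no start given, the first unit is chosen at random from 1..k.
print(systematic_sample(100, 20))
```

Only the starting unit is random; every later unit is fixed by it, which is why the method spreads the sample evenly over the frame.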
(3) Stratified Sample
Stratified random sampling is used when we have a heterogeneous
population regarding some characteristics.
- The population is sub-divided into strata. Each stratum is
homogeneous internally.
- A sub-sample is drawn from each stratum, and the totality
of the sub-samples constitutes the stratified sample.
- How are the sub-samples drawn? How many units are taken from
each stratum?
- The sub-sample can be drawn using SRS or systematic
random sampling.
- The size of the sub-sample from each stratum can be
determined through one of the following alternatives:
(1) Equal Allocation:
The sub-sample size is equal for all strata irrespective of the
size of the stratum (useful only if all strata have approximately
the same size).
n1 = n2 = ... = nh = n / (number of strata)
Where: nh = sub-sample size from stratum (h).
n = total sample size.
(2) Proportional Allocation:
Sub-sample size = (Size of stratum / Size of population) * sample size
nh = (Nh / N) * n
The sub-sample is taken proportional to the size of the
stratum
(larger strata will have larger sub-samples).
Example: We want to select a random sample from the
following table.
Sample size n = 240
Stratum (Students)   Freshman   Sophomore   Junior   Senior
Size Nh              200        600         100      300

The Sub-Samples:
(1) Equal Allocation:
n1=n2=n3=n4= 240/4 =60
nh = 60.
(2) Proportional Allocation:
n1 = (200/1200) *240 = 40
n2 = (600/1200) *240 = 120
n3= (100/1200) *240 = 20
n4 = (300/1200) *240 = 60
n = 40 + 120 + 20 + 60 = 240
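Both allocation rules can be sketched in a few lines using the strata sizes from the example. The function names are hypothetical. Rounding is used for proportional allocation, so for other inputs the rounded sub-samples may not add up exactly to n and would need a manual adjustment.

```python
def equal_allocation(strata_sizes, n):
    """Give every stratum the same sub-sample size (assumes n divides evenly)."""
    share = n // len(strata_sizes)
    return {name: share for name in strata_sizes}

def proportional_allocation(strata_sizes, n):
    """Give each stratum a sub-sample proportional to its size: nh = (Nh / N) * n."""
    total = sum(strata_sizes.values())  # population size N
    return {name: round(size * n / total) for name, size in strata_sizes.items()}

strata = {"Freshman": 200, "Sophomore": 600, "Junior": 100, "Senior": 300}

equal = equal_allocation(strata, 240)
proportional = proportional_allocation(strata, 240)

print("equal:       ", equal)
print("proportional:", proportional)
print("total:       ", sum(proportional.values()))
```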


(4) Multi-Stage or Cluster Sample:
The population is sub-divided into sub-divisions called
clusters, according to geographical locations or
administrative units.
The sample is selected in stages; each stage is composed
of two steps:
- Sub-division into smaller clusters.
- Selection of some clusters.
This goes on in stages until the ultimate unit is reached, which is
indivisible, e.g. a household or an individual.
An example: a study of the number of traffic accidents
across the whole country:
Stage One:
1- Sub-divide the country into cities (25).
2- Select some cities (say 10).
Stage Two:
1- Sub-divide each selected city into streets.
2- Select some streets from each selected city.
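The two stages can be sketched as two rounds of random selection. Everything in the frame below is hypothetical: the city and street labels, the figure of 40 streets per city, and the choice of 5 streets per selected city are assumptions made only for illustration; the 25 cities and the 10 selected cities follow the example.

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical frame: 25 cities, each sub-divided into 40 streets.
cities = {
    f"City {c}": [f"City {c} / Street {s}" for s in range(1, 41)]
    for c in range(1, 26)
}

# Stage one: select 10 of the 25 cities at random.
selected_cities = random.sample(sorted(cities), 10)

# Stage two: select 5 streets at random from each selected city.
selected_streets = {
    city: random.sample(cities[city], 5) for city in selected_cities
}

for city, streets in selected_streets.items():
    print(city, "->", streets)
```

Only the selected cities need a list of streets, which is the practical appeal of the design: a complete frame of every street in the country is never required.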
