
Statistics

Dr Malinga Geoffrey Maxwell


Course outline
 Overview of data collection and analysis
 Bivariate correlation
 Linear regression
 Multiple regression
 Logistic regression
 Chi-square
 Analysis of variance-covariates, interactions
 Multivariate analysis
 Repeated measures ANOVA
Introduction

 Statistics-derived from the Latin for state, indicating the
historical importance of governmental data gathering, which
related principally to census taking and tax collecting. In
essence, the word statistics refers to the analysis and
interpretation of data with a view toward objective
evaluation of the reliability of the conclusions based on the
data.
 What is data (sing. datum)? Information, statistics,
numerical facts, figures, numbers or records.
Introduction
 Depending on the source of information, data is
classified as primary (collected afresh and for the
first time, and so original in character) or secondary
(already collected by someone else and already
passed through the statistical process). Secondary
sources include newspapers, other research materials,
the internet and publications.
 A characteristic that varies from one biological
entity to another is termed a variable (or variate).
Data collection methods/
techniques (primary data)
1. Observation-data collected through direct observation
without asking respondents anything e.g., the behaviour of
buyers of a certain product can be studied through
observation.
• Advantages-if done accurately, subjective bias is eliminated,
the information relates to what is happening at the time of data
collection, it demands less active cooperation from respondents,
and it is suitable for respondents with no verbal ability.
• Disadvantages-expensive, provides limited information, unforeseen
factors may interfere with the observation task, and some phenomena
(facts or experiences) are unobservable.
Data collection techniques
cont’d
2. Interview methods-these involve presentation of oral
verbal stimuli seeking oral verbal responses. There are four
main types of interviews:
• Structured-involve use of structured questionnaires.
• Semi-structured interviews-these involve the use of checklists of
terms or topics to be discussed with respondents. As they
respond to listed items, more questions emerge and are asked
besides the prepared ones, e.g., in focus group discussions (FGDs).
• Unstructured-this type of interview involves talking casually with
respondents, usually key informants on particular issues, without
prior appointments or telling them the purpose of the interview.
• Interview schedules-these are interviews in which questionnaires
are used and filled in while interviewing respondents.
Data collection techniques
cont’d
3. Questionnaire formulation-this method makes use of
questionnaires which are distributed to respondents.
• Questionnaires can have questions that are open-ended, closed-ended or
tabular. Open-ended questions invite free responses, closed-ended
questions only allow respondents to choose from the alternative responses
provided, and tabular questions are answered by filling in tables.
4. Diaries-a diary is a way of gathering information about the
way individuals spend their time on professional activities.
Diaries in this sense are not records of engagements or personal
journals of thought. They can record either qualitative or
quantitative data and, in management research, can provide
information about work patterns and activities.
Data collection techniques
5. Case studies
A case study is a fairly intensive examination of a single unit
such as a person, a small group of people or a single company.
Case studies involve measuring what is there. In this sense,
they are historical. They can enable the researcher to explore,
unravel and understand problems, issues and relationships.
They cannot, however, allow the researcher to generalize, i.e., to
argue that the results, findings or theory developed from one
case study apply to other similar cases.
Data collection techniques
6. Critical incidents
This technique is an attempt to identify the more noteworthy
aspects of job behaviour and is based on the assumption that
jobs are composed of critical and non-critical tasks. A critical
task is one that makes the difference between success and failure
in carrying out important parts of the job. The incidents are
scaled in order of difficulty, frequency and importance to the
job as a whole.
Data collection techniques
7. Portfolios
A measure of a manager’s ability may be expressed in
terms of the number and duration of issues or problems
being tackled at any one time. Compiling a problem
portfolio means recording information about how each
problem arose, the methods used to solve it, the difficulties
encountered, etc.
Sampling
 Sampling is the process of selecting a sufficient number of
elements from the population so that, by studying the sample
and understanding the properties or characteristics of the
sample subjects, it would be possible to generalize those
properties or characteristics to the population.
 Population-refers to the entire group of people, events or
things of interest that one wishes to investigate.
 Sample-a subset of a population. It comprises members
selected from the population. By studying a sample, one
would be able to draw conclusions that are generalizable to
the entire population.
Sampling
 An element- a single member of the population.
 Subject-single member of a sample.
 Parameters of the population-are characteristics that are
general to the entire population e.g., population mean,
population standard deviation and population variance.
 A population may be homogeneous or heterogeneous. A
population is said to be homogeneous when all its elements
are similar to one another in all aspects.
 A population is said to be heterogeneous when its elements
are not similar to one another in all aspects.
• Common variables that make a population heterogeneous are gender,
age, ethnicity and socioeconomic status.
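The difference between population parameters and the values computed from a sample can be sketched in a few lines of Python. The ages below are hypothetical, and the sample is simply the first five subjects so that the numbers are reproducible; it is not a recommended way of drawing a sample.

```python
import statistics

# Hypothetical population: ages of all 10 members of a small group.
population = [21, 23, 25, 25, 27, 30, 31, 34, 36, 38]

# Parameters describe the entire population.
pop_mean = statistics.mean(population)
pop_variance = statistics.pvariance(population)  # divides by N
pop_sd = statistics.pstdev(population)

# Values computed on a sample only estimate the parameters.
sample = population[:5]  # illustrative only, not a random draw
sample_mean = statistics.mean(sample)
sample_variance = statistics.variance(sample)  # divides by n - 1

print("population:", pop_mean, pop_variance, round(pop_sd, 2))
print("sample:    ", sample_mean, round(sample_variance, 2))
```

Here the sample mean differs from the population mean, which is why the way the sample is chosen matters.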
Sampling methods/techniques/designs
There are two broad types of sampling designs or methods:
1. Probability sampling
2. Non-probability sampling
Probability sampling
 A probability sampling scheme is one in which every unit in
the population has a chance (greater than zero) of being
selected in the sample, and this probability can be
accurately determined.
 Can either be unrestricted (simple random sampling) or
restricted (complex probability sampling).
 The choice depends on the nature of the research problem, the
availability of money and time, the desired level of accuracy
in the sample, etc.
Simple random sampling
 Under this technique, each person has the same chance as any
other of being selected into the sample. It forms the standard
against which the other methods are evaluated.
 The technique is suitable when the population is relatively
small and where the sampling frame is complete and up to
date.
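A minimal sketch of simple random sampling, assuming the sampling frame is a complete list of employee IDs (hypothetical data). The helper name simple_random_sample and the seed parameter are illustrative choices; the seed is there only to make the draw repeatable.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # sampling without replacement

# Hypothetical sampling frame: employee IDs 1..600.
frame = list(range(1, 601))
sample = simple_random_sample(frame, 120, seed=42)

print(len(sample), "employees selected, first five:", sample[:5])
```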
Restricted (complex) probability
sampling
 Systematic random sampling techniques
 Stratified random sampling techniques
 Random route sampling techniques
 Multi-stage cluster sampling
Systematic random sampling
 This is similar to simple random sampling but instead of
selecting random numbers from tables, you move through a
list (sampling frame) picking every nth name.
 To choose the nth name, one must work out the appropriate
sampling fraction by dividing the required sample size by the
population size. E.g., for a population of 600 employees and
a required sample size of 120, the sampling fraction R would
be 120/600 = 1/5.
 This implies that the person choosing the sample must build
up the sample by selecting one person out of every five in
the population, i.e., every 5th name on the list.
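The 600-employee example can be sketched as follows. The random starting point within the first interval is an assumption added here (a common convention, not stated on the slide), and the helper name systematic_sample is hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every k-th element of the frame after a random start."""
    k = len(frame) // n                        # interval: 600 // 120 = 5
    start = random.Random(seed).randrange(k)   # assumed random start, 0..k-1
    return frame[start::k][:n]

# Hypothetical sampling frame: employee IDs 1..600.
frame = list(range(1, 601))
sample = systematic_sample(frame, 120, seed=1)

print(len(sample), "selected; first five:", sample[:5])
```

Every selected ID is exactly five positions after the previous one, so one person in every five ends up in the sample.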
Stratified random sampling
 In this method, all the people in the sampling frame are
divided into groups or categories known as strata. Within
each stratum, a simple random sample or a systematic sample
is selected.
 Stratified sampling – population divided into subgroups
(strata) and members are randomly selected from each
group.
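A sketch of stratified random sampling under two stated assumptions: the strata are hypothetical departments, and each stratum is sampled in proportion to its size (proportional allocation is one common choice; the slide does not specify an allocation rule).

```python
import random
from collections import Counter, defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Simple random sample of the same fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = round(len(members) * fraction)  # proportional allocation (assumed)
        sample.extend(rng.sample(members, k))
    return sample

def department(i):
    # Hypothetical strata: 300 sales, 200 production, 100 admin staff.
    return "sales" if i <= 300 else "production" if i <= 500 else "admin"

frame = [(i, department(i)) for i in range(1, 601)]
sample = stratified_sample(frame, lambda e: e[1], 0.2, seed=7)

print(Counter(dept for _, dept in sample))
```

Each department contributes 20% of its members, so the sample mirrors the composition of the frame.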
Random route sampling
 This is a technique used in market research surveys for
sampling households, shops and other premises in rural and
urban areas.
 In this method, an address is selected at random from a
sampling frame, usually an electoral register, as a starting
point. Then, following given instructions, the person collecting
data identifies more addresses by taking alternate left and right
turns at road junctions and calling at every nth address (i.e.,
shop or premises).
Multi-stage sampling
 This involves drawing several samples in stages, starting
from units known as sample areas. The larger sample areas
are progressively reduced into smaller sample areas.
Eventually the smaller areas are reduced to sample
households and, by using an appropriate method such as
systematic or simple random sampling, individuals are
selected from the households.
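A three-stage sketch of this idea, assuming a hypothetical population of 4 areas, each with 5 households of 3 people. Simple random sampling is used at every stage here; the helper name and the choice of one individual per household are illustrative assumptions.

```python
import random

def multi_stage_sample(areas, n_areas, n_households, seed=None):
    """Stage 1: sample areas. Stage 2: sample households in each area.
    Stage 3: pick one individual from each selected household."""
    rng = random.Random(seed)
    chosen = []
    for area in rng.sample(sorted(areas), n_areas):
        households = areas[area]
        for household in rng.sample(sorted(households), n_households):
            person = rng.choice(households[household])
            chosen.append((area, household, person))
    return chosen

# Hypothetical population: area -> household -> list of people.
areas = {
    f"area{a}": {
        f"a{a}-h{h}": [f"a{a}-h{h}-p{p}" for p in range(1, 4)]
        for h in range(1, 6)
    }
    for a in range(1, 5)
}
sample = multi_stage_sample(areas, n_areas=2, n_households=3, seed=3)

for row in sample:
    print(row)
```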
Non-probability sampling
 These are techniques applied when it becomes
impossible to undertake a probability method of
sampling.
 Includes: purposive sampling, quota sampling,
convenience sampling, snowball sampling and
self-selection
Purposive sampling
 A purposive sample is one which is selected
subjectively. The sampling design is based on
the judgement of the researcher as to who
will provide the best information to
meet the objectives of the study.
Quota sampling
 This is often used in market surveys where the person
collecting data is required to find cases with
particular characteristics. The person collecting data
is given a quota of particular types of people to
select, and the quota is organized so that the
final sample should be representative of the
population.
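Quota filling can be sketched as below, assuming respondents are approached in the order they are encountered and accepted only while their category's quota is still open. The passers-by, categories and quota sizes are hypothetical.

```python
def quota_sample(stream, category_of, quotas):
    """Accept people in encounter order until every quota is full."""
    remaining = dict(quotas)
    sample = []
    for person in stream:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop collecting
    return sample

# Hypothetical passers-by as (name, gender) in order of encounter.
stream = [("p1", "F"), ("p2", "F"), ("p3", "M"), ("p4", "F"),
          ("p5", "M"), ("p6", "M"), ("p7", "F")]
sample = quota_sample(stream, lambda p: p[1], {"F": 2, "M": 2})

print([name for name, _ in sample])
```

No randomization is involved: who gets in depends on who happens to be encountered first, which is why the method is classed as non-probability sampling.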
Convenience sampling
 In this method, the sample comprises subjects who
are simply available in a convenient way to the
person collecting data. There is no randomness and
the likelihood of bias is high, so one cannot draw
meaningful conclusions about the population from
the results obtained.
 This method is feasible in situations where time and
resources are constrained.
Snowball sampling
 With this approach, you initially contact a few
respondents and then ask whether they know of
anybody with the same characteristics that you are
looking for in your study.
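The referral process can be sketched as a walk through a network of contacts. The referral lists, names and target size below are hypothetical; in a real study the referrals are only discovered by asking each respondent.

```python
from collections import deque

def snowball_sample(referrals, seeds, target_size):
    """Start from a few seed respondents and follow their referrals."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < target_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not recruit anyone twice
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referrals: who each respondent says they know.
referrals = {
    "Ann": ["Ben", "Cat"],
    "Ben": ["Dan"],
    "Cat": ["Ann", "Eve"],
    "Dan": [],
    "Eve": ["Fay"],
}
sample = snowball_sample(referrals, ["Ann"], target_size=5)

print(sample)
```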
Self-selection
 In this approach, respondents themselves decide
that they would like to take part in your study.
Summary of sampling methods
 Probability Sampling – Uses randomization and takes steps to ensure all
members of a population have a chance of being selected. There are
several variations on this type of sampling; the following is a list of
ways probability sampling may occur:
 Random sampling – every member has an equal chance
 Stratified sampling – population divided into subgroups (strata) and
members are randomly selected from each group
 Systematic sampling – uses a specific system to select members such
as every 10th person on an alphabetized list
 Cluster random sampling – divides the population into clusters, clusters
are randomly selected and all members of the cluster selected are
sampled
 Multi-stage random sampling – a combination of two or more of the
above methods, applied in successive stages
Summary of sampling methods
 Non-probability Sampling – Does not rely on the use of randomization
techniques to select members. This is typically done in studies where
randomization to obtain a representative sample is not possible. Bias
is more of a concern with this type of sampling. The different types of
non-probability sampling are as follows:
 Convenience or accidental sampling – members or units are selected based
on availability
 Purposive sampling – members of a particular group are purposefully sought
after
 Quota sampling – members are sampled until exact proportions of certain
types of data are obtained or until sufficient data in different categories is
collected
 Snowball sampling – members are sampled and then asked to help identify
other members to sample and this process continues until enough samples are
collected
Analyzing research data
 Data analysis can be divided into two:
• Descriptive statistics-the main objective is to
summarise the variables concerned, usually
individually
• Analytical/inferential statistics-the aim is to
describe the relationship between two
or more variables
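The two kinds of analysis can be contrasted on a small, hypothetical data set of study hours and test scores. The first part summarises each variable on its own; the second computes Pearson's correlation coefficient as one simple way of describing the relationship between the two variables. The helper pearson_r is written out by hand here for illustration.

```python
import math
import statistics

# Hypothetical data for six students.
hours = [2, 4, 5, 7, 8, 10]
scores = [50, 55, 62, 70, 74, 85]

# Descriptive: summarise each variable individually.
for name, values in (("hours", hours), ("scores", scores)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    print(f"{name}: mean={mean}, sd={sd:.2f}")

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Analytical: describe the relationship between the two variables.
r = pearson_r(hours, scores)
print(f"correlation between hours and scores: r={r:.3f}")
```

A value of r close to +1 indicates that, in this made-up data, higher study hours go with higher scores.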
Survey designs
1. Longitudinal research-studies carried out over an
extended period to observe the effect that time has on
the situation under observation and to collect primary
data (data collected at first hand) on these changes.
Survey designs
2. Cross-sectional studies involve different
organisations or groups of people, looking at
similarities or differences between them at any one
particular time. They are done when time or resources
for more extended research, e.g. longitudinal studies,
are limited, and involve a close analysis of a
situation at one particular point in time to give a
‘snap-shot’ result.
