Definitions of Statistics

• Statistics is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of
numerical data.
• There are two main branches of statistics:
➢ Descriptive statistics
➢ Inferential statistics
Both of these are employed in scientific analysis of data and both are equally important for conducting
investigation and research work.

Descriptive Statistics
• Descriptive statistics mainly deals with the presentation and collection of data.
• This is usually the first part of a statistical analysis.
• Summaries of data, which may be tabular, graphical, or numerical, are known as descriptive statistics.
• Descriptive analysis describes a large body of data with summaries, charts, and tables, but does not attempt to
draw conclusions about the population from which the sample was taken.
• Mean, Median, Mode, Range, Variation, Standard Deviation, etc. are the descriptive statistical tools used to analyse data.
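
The descriptive tools listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch: the `marks` values are hypothetical and chosen only for illustration, and "variation" is assumed here to mean the sample variance (divisor n - 1).

```python
import statistics

# Hypothetical sample: marks of 8 students (illustrative values only).
marks = [45, 52, 52, 60, 63, 52, 70, 78]

mean = statistics.mean(marks)            # arithmetic average
median = statistics.median(marks)        # middle value of the sorted data
mode = statistics.mode(marks)            # most frequent value
data_range = max(marks) - min(marks)     # largest minus smallest value
variance = statistics.variance(marks)    # sample variance (divisor n - 1), an assumption
std_dev = statistics.stdev(marks)        # sample standard deviation

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Range:", data_range)
print("Variance:", round(variance, 2))
print("Standard deviation:", round(std_dev, 2))
```

These figures only summarize the eight observations; they say nothing yet about the population the students were drawn from.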

Inferential Statistics
• Inferential analysis aims to describe a population parameter on the basis of sample evidence.
• It assumes that the drawn sample can represent the entire population.
• Conclusions are based on statistical measures calculated from the sample data, so that they can be
representative of the whole data.
• Different statistical techniques are used to draw valid conclusions.
• Z test, t test, F test, Chi-squared test, confidence intervals, correlation and regression analysis, etc. are the inferential
statistical tools used to analyse data.
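
As one illustration of an inferential tool, the sketch below carries out a two-sided one-sample Z test using only the standard library. All numbers (`mu_0`, `sigma`, `n`, `x_bar`) are hypothetical, and the population standard deviation is assumed to be known, which is what makes a Z test (rather than a t test) applicable.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical inputs (assumptions for illustration only).
mu_0 = 50.0    # population mean claimed under the null hypothesis
sigma = 8.0    # population standard deviation, assumed known
n = 64         # sample size
x_bar = 52.0   # observed sample mean

standard_error = sigma / sqrt(n)           # standard error of the sample mean
z = (x_bar - mu_0) / standard_error        # test statistic

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05                               # chosen significance level
reject_null = p_value < alpha

print("z =", round(z, 3))
print("p-value =", round(p_value, 4))
print("Reject the null hypothesis at the 5% level:", reject_null)
```

The conclusion concerns the population mean, although only the sample was observed.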
Population and Sample
• The population is the collection of units (people, objects, or whatever) that researchers are interested in knowing
about, and the number of individuals in a population is called the population size.
• A part of the population selected for study is called a sample, and the number of individuals in a sample is called the sample size.
• A population may be finite or infinite.
• The population of a city, the students in a college, the voters in an election, etc. are examples of
finite populations, whereas the customers of United Trade Centre, the stars in the sky, the listeners of a specific radio
program, etc. are examples of infinite populations, since their units cannot be counted in practice.

Parameters and Statistics


• Mathematically, we can describe samples and populations by using measures such as the mean, median, mode, and
standard deviation.
• When these measures describe the characteristics of a population, they are called parameters, and when they describe
the characteristics of a sample, they are called statistics (or estimators).

                   Population                            Sample

Definition         Collection of all items under study   Part of population chosen for study

Characteristics    Parameters                            Statistics

Symbols            Population size = N                   Sample size = n
                   Population mean = µ                   Sample mean = X̄
                   Pop. standard deviation = σ           Sample standard deviation = S
                   Pop. correlation coeff. = ρ           Sample correlation coeff. = r
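
The distinction between a parameter and a statistic can be seen by computing the same measure on a whole population and on a sample drawn from it. The sketch below uses a simulated population; the distribution, the sizes, and the seed are assumptions made only for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is reproducible

# Hypothetical population of N = 10,000 values (simulated, not real data).
population = [random.gauss(50, 10) for _ in range(10_000)]
N = len(population)

# Sample of n = 100 units drawn without replacement.
sample = random.sample(population, 100)
n = len(sample)

# Parameters: computed from every unit of the population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics: computed from the sample only.
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample standard deviation (divisor n - 1)

print(f"N = {N}, n = {n}")
print(f"Population mean = {mu:.2f}, sample mean = {x_bar:.2f}")
print(f"Population SD = {sigma:.2f}, sample SD = {s:.2f}")
```

In practice the parameters are usually unknown, and the sample statistics serve as their estimates.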


Census and Sampling
• Under the census or complete enumeration survey method, data are collected for each and every unit (person,
household, shop, organization etc.) of the population.
• Sampling is simply the process of learning about the population on the basis of a sample drawn from it.
• Thus, sampling is the technique of studying a part of the population instead of every unit of the population.

Sampling Frame (Source list)


• The sampling frame is the list of items in the population (universe) from which the sample is to be drawn.
• If a sampling frame is not available, the researcher has to prepare one, and such a list should be comprehensive, correct,
reliable, and appropriate.
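
Once a sampling frame exists, drawing a sample from it is a simple operation. The sketch below assumes a hypothetical frame of 200 household identifiers (`HH-001` to `HH-200`) and draws a simple random sample of 20 without replacement; the identifiers, sizes, and seed are illustrative only.

```python
import random

# Hypothetical sampling frame: a complete list of 200 household IDs.
frame = [f"HH-{i:03d}" for i in range(1, 201)]

# Simple random sample of 20 households, drawn without replacement.
rng = random.Random(7)          # seeded generator for a reproducible draw
sample = rng.sample(frame, 20)

print("Frame size:", len(frame))
print("Sample size:", len(sample))
print("Selected units:", sorted(sample))
```

Any unit missing from the frame has no chance of selection, which is why the frame needs to be comprehensive and correct.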

Variables
• A symbol, number, or characteristic that possesses different numerical values or categories and that can be
interpreted is called a variable.
• Variables can be classified into the following two types:
▪ Dependent variables
▪ Independent variables
Dependent Variable
• The variable which is affected by another variable is called the dependent variable.
• It responds to the independent variable.
• It is also known as explained or response variable.

Independent Variable
• The variable which is presumed to affect another variable is called the independent variable.
• It is the cause that produces an effect in the dependent variable.
• It is also known as explanatory or stimulus variable.

Example
1. Saving can be increased if the income is increased.
Dependent – Saving    Independent – Income

2. Promotion affects employee motivation.
Dependent – Employee Motivation    Independent – Promotion

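
The saving and income example can be made concrete by fitting a straight line with saving as the dependent variable and income as the independent variable. This is a minimal least-squares sketch; the five data points are hypothetical (in thousands of rupees) and are not taken from the notes.

```python
# Hypothetical data in thousands of rupees (illustrative values only).
income = [20, 25, 30, 35, 40]   # independent (explanatory) variable
saving = [2, 3, 5, 6, 8]        # dependent (response) variable

n = len(income)
mean_x = sum(income) / n
mean_y = sum(saving) / n

# Least-squares estimates for the line: saving = intercept + slope * income
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, saving))
s_xx = sum((x - mean_x) ** 2 for x in income)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"saving = {intercept:.2f} + {slope:.2f} * income")
```

The slope describes how the dependent variable responds, on average, to a one-unit change in the independent variable.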
Data
• Data are facts and figures that are collected, analysed, and summarized.
• Data contain the information needed to make a more informed decision in a particular research situation.
• Obtaining appropriate information is essential to conduct any research.
• On the basis of the source of the information, data are divided into two categories.
1. Primary Data
2. Secondary Data

Primary Data
• Data collected for the first time by the researcher in their original form, i.e. a first-hand account of an event that has
not been interpreted by anyone other than its creator, are known as primary data.
• Primary data is a type of information that is obtained directly from first-hand sources by
means of surveys, observation or experimentation.
• It is data that has not been previously published.

Methods of Collecting Primary Data

• There are different methods of collecting primary data. Each method has its relative merits and demerits.
• The investigator has to choose a particular method to collect the information and the choice to a large extent
depends on the preliminaries to data collection.
• Some of the commonly used methods are discussed below.
1. Observation method
2. Focus group discussion
3. Information received through local agencies
4. Interview methods
5. Questionnaire method
Observation method
• In this method, the researcher does not ask the respondent any questions, but observes the phenomenon.
• Observation is thus the process of recognizing and noting people, objects, and occurrences rather than asking for
information.
• In this method, the researcher observes only one group over a long period of time.
• It is a time-consuming and expensive method of collecting data.

Focus Group Discussion (FGD)

• A focus group consists of 6 to 10 randomly chosen members who discuss a product or any given topic for some time in
the presence of a moderator.
• Focus groups are relatively less expensive and can provide fairly dependable data within a short period of time.
• The strength of FGD lies in allowing the participants to agree or disagree with each other, so that it provides an
insight into how a group thinks about an issue.
• This method succeeds only when skilled moderators conduct the discussions.

[Figure: Picture of FGD]

Information received through local agencies
• In this method, the information is not actually collected by the researcher himself/herself.
• An appointed local agent collects the required information and sends it to the researcher.
• Generally, newspapers and TV channels use this method to collect news.

Interview methods
• The interview is a researcher-administered technique of collecting primary data and is widely used in research.
• It is a well-established, practicable, and reliable method of data collection.
• The researcher may ask questions on the issues of his or her interest and record the answers of the respondent on
a sheet of paper.
• Interviews can be unstructured or structured, and can be conducted face to face, by telephone, or online.

Questionnaire Method
• A questionnaire is a set of questions designed to collect information from respondents.
• Under this method, the questionnaire is given or mailed to the respondents, who answer the questions and
return it to the researcher.
• Success of this method greatly depends upon the way in which the questionnaire is drafted.
• So, the investigator must be very careful while framing the questions.
Secondary Data
• Secondary data are collected from sources that have already been created for the purpose of first-time use and
future use.
• They are collected by someone other than the user.
• Secondary data analysis saves the researcher time and money.
• They may have limited application to the specific research, so the data should be carefully collected and tabulated.

Sources of Secondary Data


Processing of Data
• After data have been collected from a representative sample of the population, the next step is to analyze them to
test the research hypotheses.
• Data analysis is now routinely done with software programs such as SPSS, SAS, Excel and the like.
• Excellent graphs and charts can also be produced through most of these software programs.
• However, before we start analyzing the data to test hypotheses, some preliminary steps need to be completed
and those steps are incorporated in processing of data.
• The data processing includes
➢ Editing of data
➢ Coding of data
➢ Classification of data and
➢ Tabulation of data

Editing of Data
• Editing is the process of examining the collected data for errors and omissions and making the necessary corrections in
the same.
• This is desirable when there is some inconsistency in the responses as entered in the questionnaire, or when it
contains only a partial or a vague answer.
Coding of Data
• Coding involves assigning numerals or other symbols to answers so that responses can be grouped into a limited
number of categories. The symbols used to indicate these categories are called codes.
• The purpose of coding is to facilitate the transfer of data from the data collection instrument into computer-readable
form.
• Coding is necessary to carry out the subsequent operations of tabulating and analysing data.
• If coding is not done, it will not be possible to reduce a large number of heterogeneous responses into meaningful
categories with the result that the analysis of data would be weak and ineffective, and without proper focus.
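
Coding can be sketched as a lookup from raw answers to numeric codes. In the example below, the five-point agreement scale, the `codebook` values, and the missing-value code `99` are all hypothetical choices made for illustration; an actual study would define its own codebook.

```python
from collections import Counter

# Hypothetical codebook: each answer category is assigned a numeric code.
codebook = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}
MISSING = 99  # assumed code for blank or unrecognized answers

def code_response(answer: str) -> int:
    """Normalize spacing and capitalization, then look up the code."""
    cleaned = answer.strip().capitalize()
    return codebook.get(cleaned, MISSING)

# Hypothetical raw answers as they might be entered in a questionnaire.
responses = ["Agree", "Neutral", "agree ", "Strongly agree", "", "Disagree"]
coded = [code_response(r) for r in responses]

print("Coded responses:", coded)
print("Frequency of each code:", dict(Counter(coded)))
```

Once responses are reduced to a small set of codes, they can be tabulated and analysed directly.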

Cross Tabulation
• Cross tabulation is a tool that enables us to examine the relationship between two different variables.
• It helps us to understand how two different variables are related to each other.
• Under the tabulation process, we need to construct a contingency table showing the two variables in terms of rows and
columns.
Example

• Suppose that we have two variables, sex and income. Further suppose that 125 individuals are randomly sampled
from a very large population as part of a study of income differences according to sex.
• A contingency table can be created to display the income distribution for males and females as shown below:

                          Income
Sex       Less than Rs. 20,000   More than Rs. 20,000   Total

Male               40                     25              65

Female             50                     10              60

Total              90                     35            N=125

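
The contingency table above can be rebuilt from individual records by counting each (sex, income) pair. The sketch below generates one record per respondent from the four cell counts in the table and then tabulates them with `collections.Counter`; the record layout and variable names are assumptions for illustration.

```python
from collections import Counter

LOW = "Less than Rs. 20,000"
HIGH = "More than Rs. 20,000"

# One (sex, income group) record per respondent, matching the cell counts above.
records = (
    [("Male", LOW)] * 40
    + [("Male", HIGH)] * 25
    + [("Female", LOW)] * 50
    + [("Female", HIGH)] * 10
)

cell_counts = Counter(records)                           # joint frequencies
row_totals = Counter(sex for sex, _ in records)          # totals by sex
col_totals = Counter(income for _, income in records)    # totals by income group
grand_total = len(records)

# Print the table with row and column totals.
print(f"{'Sex':<8}{LOW:>22}{HIGH:>22}{'Total':>8}")
for sex in ("Male", "Female"):
    print(f"{sex:<8}{cell_counts[(sex, LOW)]:>22}"
          f"{cell_counts[(sex, HIGH)]:>22}{row_totals[sex]:>8}")
print(f"{'Total':<8}{col_totals[LOW]:>22}{col_totals[HIGH]:>22}{grand_total:>8}")
```

The row and column totals are the marginal distributions of sex and income, and they must add up to the same grand total.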
Theory of Estimation
• Estimation is the use of statistical tools to estimate or predict the unknown value of a population
parameter on the basis of sample evidence.
• There are two types of estimation:
➢ Point Estimation
➢ Interval Estimation
Example:
✓ "I will earn Rs. 20,000 next month" is an example of point estimation.
✓ "I will earn Rs. 20,000 to Rs. 30,000 next month" is an example of interval estimation.
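
The two types of estimation can be compared on the same sample. The sketch below computes a point estimate (the sample mean) and an approximate 95% interval estimate for mean monthly earnings. The ten earnings figures are hypothetical, and the interval uses the normal critical value as a simplifying assumption; with so small a sample, a t-based interval would usually be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical monthly earnings in thousands of rupees (illustrative values only).
earnings = [22, 25, 19, 28, 24, 26, 21, 27, 23, 25]
n = len(earnings)

# Point estimate: a single value for the population mean.
point_estimate = mean(earnings)

# Interval estimate: a range that is likely to contain the population mean.
confidence = 0.95
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
margin = z_crit * stdev(earnings) / sqrt(n)               # margin of error
lower = point_estimate - margin
upper = point_estimate + margin

print(f"Point estimate: {point_estimate}")
print(f"{confidence:.0%} interval estimate: {lower:.2f} to {upper:.2f}")
```

The point estimate gives one number, while the interval estimate trades precision for a stated level of confidence.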

Criteria for Good Estimators


• A good estimator is one which is as close to the true value of the parameter as possible.
• The following are some of the criteria which should be satisfied by a good estimator.
➢ Unbiasedness
➢ Consistency
➢ Efficiency
➢ Sufficiency
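
Of these criteria, unbiasedness is the easiest to check by direct calculation: an estimator is unbiased when its average over all possible samples equals the parameter. The sketch below enumerates every sample of size 2 (drawn with replacement) from a tiny hypothetical population and compares three estimators. The population values and helper names are assumptions for illustration, and the other three criteria are not covered here.

```python
from fractions import Fraction
from itertools import product

# Tiny hypothetical population (exact arithmetic via Fraction).
population = [Fraction(2), Fraction(4), Fraction(6)]
N = len(population)
mu = sum(population) / N                                   # population mean
sigma_sq = sum((x - mu) ** 2 for x in population) / N      # population variance

n = 2
samples = list(product(population, repeat=n))   # all 9 ordered samples, with replacement

def sample_mean(s):
    return sum(s) / len(s)

def sample_variance(s, divisor):
    m = sample_mean(s)
    return sum((x - m) ** 2 for x in s) / divisor

# Average of each estimator over every possible sample (its expected value).
expected_mean = sum(sample_mean(s) for s in samples) / len(samples)
expected_var_n_minus_1 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
expected_var_n = sum(sample_variance(s, n) for s in samples) / len(samples)

print("Population mean:", mu, "| average sample mean:", expected_mean)
print("Population variance:", sigma_sq)
print("Average sample variance, divisor n - 1:", expected_var_n_minus_1)
print("Average sample variance, divisor n:", expected_var_n)
```

In this enumeration the sample mean and the variance with divisor n - 1 average out to the population values, while the variance with divisor n falls short, which is what being biased means.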
