
Statistics

06 January 2024 08:58

• Collection
• Classification
• Tabulation
• Analyzing
• Interpretation

• Statistics is a tool for creating new understanding from a set of numbers.


• Data are facts (especially numerical facts) collected together for reference or information.
• Information is knowledge communicated, concerning some particular fact.
(Sri Krishna taught Arjun about management, human values.)

In short, statistics is a tool that converts numbers into information.


• E.g. To travel from Jorhat to Golaghat, the average one-way travel time is 55 minutes.
• The number of Twitter users rose by 175 million last year.
• 2.8 million WhatsApp users are added every day, which comes to about 32 per second.
In all the above examples there are numerical figures or facts, such as 55 minutes,
175 million, and 32 users per second. These numerical facts are called statistics, and such numbers,
percentages and figures allow us to understand business and economic conditions.
From the above examples it becomes clear that numbers and figures are the heart of statistics, without
which the discipline cannot survive. But statistics is not limited to numbers and graphical
representation of data; it is about extracting information and conclusions from such data.

The following points are the areas of focus in statistics:

• Type and volume of data
• Data organization and summarization
• Data analysis and interpretation
• Results representation and judgement on certainty
Business Statistics Page 1

Thus, with statistics we design systems, describe them, and draw inferences from
them. Statistics may be defined as a set of procedures and rules for reducing a large mass of data to
manageable proportions and for drawing conclusions from these data. It helps in drawing
inferences from the available data and in taking decisions accordingly.

Important Definitions of Statistics:


1) The Merriam-Webster Dictionary defines statistics as
"Classified facts representing the conditions of people in a state, specifically the facts that can be
stated in numbers or any other tabular or classified arrangement."

2) The statistician Arthur Lyon Bowley defined statistics as
"numerical statements of facts in any department of inquiry placed in relation to each other."
He also said,
"Statistics may be called the science of counting."
Elsewhere he defined statistics as
"the science of averages."
i. Census Method
ii. Sample Method: E(x̄) = µ (Statutory Theory, Regulation Theorem)
x̄ = Sample Mean
µ = Population Mean
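
The relation E(x̄) = µ says that the sample mean, averaged over all possible samples, equals the population mean. The following is a minimal Python sketch of that idea, assuming a small hypothetical population of five values and simple random sampling without replacement (neither assumption comes from these notes). It enumerates every possible sample of size 2 and compares the average of the sample means with the population mean.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population (illustrative values, not from the notes).
population = [2, 4, 6, 8, 10]
mu = mean(population)  # population mean

# Every possible sample of size 2, drawn without replacement.
sample_means = [mean(s) for s in combinations(population, 2)]

# Average of the sample means over all equally likely samples.
expected_xbar = mean(sample_means)

print("Population mean:", mu)
print("Number of possible samples:", len(sample_means))
print("Mean of all sample means:", expected_xbar)
```

Any single sample mean may differ from µ; the equality holds only for the average over all possible samples.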

3) Modern Definition:
According to Horace Secrist, "By statistics we mean (1) aggregate of facts (2) affected to a marked
extent by a multiplicity of causes, (3) numerically expressed, (4) estimated according to reasonable
standards of accuracy, (5) collected in a systematic manner (6) for a pre-determined purpose and
placed in relation to each other."

Modern Definition of Statistics:


"Statistics is a branch of science which provides tools (techniques) for decision making in the face of
uncertainty (probability)." - Wallis and Roberts
"Statistics is the grammar of science." - Karl Pearson (Father of Statistics)
"If your experiment needs statistics, you ought to have done a better experiment." - Ernest Rutherford

The field of statistics has two main areas:


Mathematical Statistics
Concerns the development of new methods of statistical inference and requires detailed knowledge of
abstract mathematics for its implementation.
Applied Statistics
Involves applying the methods of mathematical statistics to specific subject areas, such as economics,
psychology, and public health.

1) Write a short note on the origin of statistics.


2) Give the one definition of statistics which can be regarded as the best.
3) Mention a few functions of statistics.
4) How can you interpret the term "division of statistics"?
5) Define the following terms:
a. Population and sample
b. Variable and attributes

Data: Qualitative and Quantitative


Data are the information or facts about some particular characteristic under study.
In order to study any problem statistically, one has to collect data through either a population
survey or a sample survey.
In order to study the growth of a company, it is necessary to have data on its capital, assets,
overheads, costs, sales, expenditures, etc. This type of data can be expressed in numbers,
hence it is called quantitative data.
But data regarding employees in terms of their marital status, religion, efficiency, level of
education, skills, honesty, etc. cannot be expressed in numbers. Hence,
this type of data is called qualitative data.

Statistical Investigation/Inquiry
A statistical investigation is an investigation carried out with the help of statistical methods. Only
information that can be expressed in quantitative terms (in the form of data) can be collected
through statistical investigations. The foundation of statistical investigation is data. Not all
numerical statements of facts are statistical data.
Statistical data are numerically expressed aggregates of facts collected in a systematic manner
for a pre-determined purpose and placed in relation to each other.
The person who conducts the statistical inquiry is known as the investigator, and the persons from
whom the information is collected are known as informants (respondents).
The unit of measurement applied to the data in any particular problem is called the statistical
unit.

Stages of Statistical Inquiry/Investigation


1. Objectives and Scope
2. Sources of Information
▫ Primary
▫ Secondary
3. Types of Inquiry
▫ Initial (Original) or Repetitive (if objective is reached, repeat the inquiry)
▫ Direct or Indirect (directly with the respondents, or indirectly)
▫ Official, Semi-Official, or Non-Official
▫ Confidential or Non-Confidential
▫ Casual or Regular
▫ Pilot or Comprehensive
4. Statistical Units
▫ Characteristics of Statistical Units
▫ Types of the Statistical Units
▪ Units of Collection of Data
▪ Units of Analysis and Interpretation of Data
5. Determination of Standard of Accuracy

Statistical Units and their characteristics:


A well-defined and identifiable basis of measurement in any statistical inquiry is called a
statistical unit. Statistical units form the basis for the collection of statistics in any
inquiry/investigation, and hence are related to the measurement of variables.

Important Characteristics:
1) It should be suitable to the subject of inquiry.
2) It should be simple, clear and self-explanatory.
3) It must be definite, specific and certifiable.
4) It should be stable and standardized.
5) It must ensure homogeneity and uniformity.

Business Statistics:
It is the science of good decision making in the face of uncertainty and is used in many
disciplines, such as financial analysis, econometrics, auditing, production and operations (including
services improvement), and marketing research.
In a competitive business environment, many organizations find themselves data-rich
but information-poor. Thus it is important for decision-makers to develop the ability to
extract meaningful information from raw data in order to make better decisions.
This is possible only through careful analysis of data guided by statistical thinking.
Learning statistics therefore helps the decision-maker understand how to:
• Present and describe information (data) to improve decisions.
• Draw conclusions about large populations based on information obtained from samples.
• Seek out relationships between pairs of variables to improve processes.
• Obtain reliable forecasts of statistical variables of interest.

Distinction between Primary and Secondary Data

| Basis | Primary data | Secondary data |
| --- | --- | --- |
| Originality | These data are original in nature. | These data are compilations of existing or already published data. |
| Nature | These data are in the form of raw materials to which statistical methods are applied for the purpose of analysis. | These data are in the form of finished products, as statistical methods have already been applied to them. |
| Economy | These data are not economical because collecting them involves cost, time and manpower. | These data are more economical as they involve less cost, time and manpower. |
| Precautions | There is no need for extra care as the data are used in their original shape. | These data need extra care in handling. |
| Editing | Editing is not required as the data are collected originally. | Editing of the data is required. |

Primary Data Sources for statistical data:


Any one of the following methods, or a combination of them, can be chosen to collect primary data:
1) Observation
In observational studies the investigator does not ask questions to get clarification on
certain issues; instead he records the behavior of interest as it occurs in the event he is
studying. Sometimes mechanical devices are also used to record the desired data. It is
one of the basic methods of collecting information in social situations.
• Structured
○ Overt or Covert Observation
○ Direct or Indirect Observation
• Unstructured
○ Participant Method or Non-Participant Method

2) Schedule
3) Questionnaire
4) Case Study

Statistical data therefore refer to those aspects of a problem situation that can be measured,
quantified, counted, or classified.
Any object or activity that generates data through this process is called a variable.
A variable is some characteristic of a population or sample. Variables are typically denoted with a
capital letter: X, Y, Z.
E.g. Student grades
The values of a variable are the range of possible values it can take.
E.g. Student marks (0..100)
Data are the observed values of a variable.
E.g. Student marks: (67, 89, 77)

Types of Data used in Data Analysis


• Categorical data are obtained by classification or description, such as type of profession, country,
etc.
• Numerical data are expressed in numbers and are of two types:
▫ A discrete variable is obtained by counting. For example, the number of shoppers who visited a
shop in a day.
▫ A continuous variable is obtained by measuring. For example, the weight or
height of the children in a school.

Techniques of data collection:


Statistical data may be collected in 2 ways:
1) Census Method
2) Sample Method
If information is collected from each and every individual unit or object of a population, the
method of collecting information is known as a census.
On the other hand, if information is collected from a properly selected representative sample,
the method of collecting information is called a sample survey or sampling method.
Thus, sampling is a technique that helps to draw inferences about the entire population
by analyzing only a few of its units. The sample survey method is very useful in market and
consumer research. In industry, sampling techniques are used for quality control of the
product.
E.g. sampling uses minimum manpower, minimum time and minimum cost.
Why is sampling preferred over a census?
Advantages of sampling over census:
• The sampling method requires less time and labor (since only a part of the population is studied
instead of the whole population).
• The sampling method is less expensive than a census.
• Sampling has greater scope than a census, since certain investigations require highly trained
investigators or advanced equipment that cannot be provided for every unit of the population.
• Sampling can give more reliable results than a census, because a smaller inquiry is easier to
supervise and check.
• Sampling has administrative convenience (since the organization and administration of a
sample survey are relatively convenient).

• In the case of an infinite or hypothetical (imaginary) population, sampling is the only feasible method of investigation.
• In the case of destructive testing, the sampling method is used.
Types of Sampling:
1) Purposive Sampling
2) Random Sampling
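
The contrast between the census method and random sampling can be illustrated with a short Python sketch. The population below (daily sales of 1,000 shops), the sample size of 50 and the fixed seed are all hypothetical choices made for illustration. They are not data from these notes.

```python
import random
from statistics import mean

# Hypothetical population: daily sales (in units) of 1,000 shops.
rng = random.Random(42)  # fixed seed so that the run is repeatable
population = [rng.randint(50, 150) for _ in range(1000)]

# Census method: every unit of the population is examined.
census_mean = mean(population)

# Sample method: only a simple random sample of 50 units is examined.
sample = rng.sample(population, 50)
sample_mean = mean(sample)

print("Units examined (census):", len(population))
print("Units examined (sample):", len(sample))
print("Census mean:", round(census_mean, 2))
print("Sample mean:", round(sample_mean, 2))
```

The sample examines 5% of the units, which is where the savings in time, labor and cost come from. The sample mean will usually be close to, but not exactly equal to, the census mean.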

Statistical Description of Data


Presentation of Data:
Collecting data, by whatever process and method, is the main
objective of any statistical investigation. The data collected
from field studies is usually huge in volume and held in forms such as
completed questionnaires; such data is known as raw data. After the
collection of data, the next important step is to classify and tabulate the
collected information, that is, to rearrange it into new groups. The data
should be presented in a comprehensive, condensed form so that its important
characteristics are highlighted and further comparison, processing and
interpretation are made easier. The presentation of data is
classified into 3 categories:
1) Textual Presentation: In this method data is presented in one or more
paragraphs that combine text and figures.
(Where is it used?) The official report of an inquiry commission is usually
presented as text.
It has the advantage of presenting the complete data and directing
attention. Its drawback is that the data take much time to read and understand.
Systematic arrangement of raw data into homogeneous classes is a must
before proceeding to tabulation. It is necessary for sorting
the relevant and significant information from the irrelevant and
insignificant. The arrangement of data into groups or classes
according to similar features is known as classification.
2) Tabulation Presentation:
3) Diagrammatic and Graphic Presentation:

Classification is the first step towards further processing after the collection and editing of
data.
Types of Classification:
1) Chronological Classification (with respect to time of occurrence)
2) Geographical Classification (with respect to area or region)
3) Qualitative Classification (by character or attribute)
4) Quantitative Classification (by magnitude or numerical value)

Q. The numbers of printing mistakes per page in a small book of 40 pages are as follows:
3,3,2,3,4,1,7,4,0,5,2,1,4,3,2,6,3,5,2,4,3,4,2,3,5,3,4,5,6,3,4,2,1,4,5,2,4,6,5,3.
Prepare a frequency distribution table for the data.

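| Mistakes (x) | Tally | Number of pages (f) | fx |
| --- | --- | --- | --- |
| 0 | I | 1 | 0 |
| 1 | III | 3 | 3 |
| 2 | IIII III | 7 | 14 |
| 3 | IIII IIII II | 10 | 30 |
| 4 | IIII IIII I | 9 | 36 |
| 5 | IIII II | 6 | 30 |
| 6 | III | 3 | 18 |
| 7 | I | 1 | 7 |
| Total | | 40 | 138 |

Xbar = Sum(fx)/Sum(f) = 138/40 = 3.45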
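
The tallying and the mean formula above can be checked with a short Python sketch. It uses the 40 observations from the question; the variable names are illustrative only.

```python
from collections import Counter

# Printing mistakes per page, copied from the question (40 pages).
mistakes = [3, 3, 2, 3, 4, 1, 7, 4, 0, 5, 2, 1, 4, 3, 2, 6, 3, 5, 2, 4,
            3, 4, 2, 3, 5, 3, 4, 5, 6, 3, 4, 2, 1, 4, 5, 2, 4, 6, 5, 3]

# Frequency distribution: how many pages have x mistakes.
freq = Counter(mistakes)

print("x  f  fx")
for x in sorted(freq):
    print(x, freq[x], x * freq[x])

total_f = sum(freq.values())                    # Sum(f)
total_fx = sum(x * f for x, f in freq.items())  # Sum(fx)
x_bar = total_fx / total_f                      # Xbar = Sum(fx) / Sum(f)

print("Sum(f) =", total_f)
print("Sum(fx) =", total_fx)
print("Mean mistakes per page =", x_bar)
```

`Counter` does the work of the tally column: each value of x is counted once per page on which it occurs.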

The working hours of 50 workers over a period of one month in a certain factory are given
below:
103,204,162,149,79,113,69,121,93,143,165,133,195,151,71,94,87,42,30,62,110,175,161,157,155,108,164,128,
114,178,140,144,187,184,197,87,40,122,203,148,130,156,167,124,164,146,116,149,104,141.
1. Identify the variable in the above data. Which scale of measurement can be used for the
identified variable?
2. Is the identified variable discrete or continuous?
3. Construct the frequency distribution for the variable.

Ans 1. Variable: Working hours
Scale: Ratio scale (measured in hours)
Ans 2. Continuous
Ans 3.
Here the working hours of a worker is the variable, and it is continuous. The maximum value in
the data is 204 and the minimum value is 30. Hence, the range of the data is 204 - 30 = 174, which is
a large number. We therefore group the given data into 7 class intervals.
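
One way to build the 7 classes is sketched below in Python. The notes give only the range (174) and the number of classes (7). The class width of 25 (174/7 rounded up), the starting point of 30 and the exclusive method (upper limit not included in the class) are assumptions made for this sketch.

```python
import math

# Working hours of the 50 workers, copied from the question.
hours = [103, 204, 162, 149, 79, 113, 69, 121, 93, 143,
         165, 133, 195, 151, 71, 94, 87, 42, 30, 62,
         110, 175, 161, 157, 155, 108, 164, 128, 114, 178,
         140, 144, 187, 184, 197, 87, 40, 122, 203, 148,
         130, 156, 167, 124, 164, 146, 116, 149, 104, 141]

num_classes = 7
low, high = min(hours), max(hours)
data_range = high - low                      # 204 - 30 = 174
# Assumption: equal class width, rounded up so 7 classes cover the range.
width = math.ceil(data_range / num_classes)  # 174 / 7 is about 24.86, so 25

# Assumption: exclusive method, e.g. 55 belongs to 55-80, not 30-55.
counts = [0] * num_classes
for h in hours:
    counts[(h - low) // width] += 1

print("Class interval  Frequency")
for i, f in enumerate(counts):
    lower = low + i * width
    print(f"{lower}-{lower + width}", f)
print("Total", sum(counts))
```

A different width or starting point (for example, classes starting at 25 or 0) would give an equally valid table with different frequencies.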

1) Range: The difference between the highest and the lowest values in the data.
2) No. of Classes: The number of classes into which the data are divided.
3) Class Interval: Used only when the data are measured on an interval or ratio scale.
4) Class Frequency: The number of values (observations) falling in a class.
5) Mid-Point: The central value between the upper limit and the lower limit of a class: (UL+LL)/2
6) Class Boundaries:
Income should be on a ratio scale for better understandability (for example, income less than
10,000).
7) Width of the Class Interval: For the classes 0-10, 10-20, 20-30,
10 is the class width.
8) Relative Frequency and Percentage Frequency:
Relative Frequency = Class Frequency/Total Frequency; Percentage Frequency = Relative Frequency x 100

| CI | f | Relative Frequency | RF% |
| --- | --- | --- | --- |
| 0-10 | 7 | 7/25 = 0.28 | 28% |
| 10-20 | 13 | 13/25 = 0.52 | 52% |
| 20-30 | 5 | 5/25 = 0.20 | 20% |

Whenever the data are grouped into class intervals, a graph can be created.
9) Frequency Density: The ratio of the class frequency to the class width of that class.
Frequency Density = Class Frequency/Class Width
In the above example, the frequency density of the class 10-20 is 13/10 = 1.3.
10) Cumulative Frequency (cf):
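
The quantities in items 5 to 10 can be computed together for the three-class example under item 8. The following Python sketch assumes that cumulative frequency is the "less than" type, that is, the running total of the class frequencies. The dictionary keys are illustrative names, not terms from the notes.

```python
# Classes and frequencies from the example table under item 8.
classes = [(0, 10), (10, 20), (20, 30)]
freqs = [7, 13, 5]
total = sum(freqs)  # total frequency = 25

rows = []
cumulative = 0
for (lower, upper), f in zip(classes, freqs):
    cumulative += f  # running total of the frequencies so far
    rows.append({
        "class": f"{lower}-{upper}",
        "mid_point": (lower + upper) / 2,   # (UL + LL) / 2
        "f": f,
        "rf": f / total,                    # relative frequency
        "rf_percent": 100 * f / total,      # percentage frequency
        "density": f / (upper - lower),     # class frequency / class width
        "cf": cumulative,                   # cumulative frequency
    })

for row in rows:
    print(row)
```

The relative frequencies add up to 1, and the last cumulative frequency equals the total frequency, which gives two quick checks on any such table.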

Q. The following are the earnings of 40 workers:


10,26,24,16,16,23,28,23
25,18,10,11,20,21,19,18
15,13,22,17,15,29,29,12
34,15,14,18,22,24,30,38
17,32,36,20,19,27,33,31
1) Construct a frequency table taking 4 as the class interval.
2) Find the percentage of workers earning below Rs. 32.
No. of classes = Range/Class Interval
= (38 - 10)/4
= 28/4
= 7 class intervals.
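
Both parts of the question can be worked through in a short Python sketch. Two assumptions are made here that the notes do not state: the classes start at the minimum value 10 and follow the exclusive method, and the last class (34-38) also includes its upper limit so that the value 38 fits within 7 classes. Because 32 is not a class limit under this scheme, the percentage for part 2 is counted from the raw data.

```python
# Earnings of the 40 workers, copied from the question.
earnings = [10, 26, 24, 16, 16, 23, 28, 23,
            25, 18, 10, 11, 20, 21, 19, 18,
            15, 13, 22, 17, 15, 29, 29, 12,
            34, 15, 14, 18, 22, 24, 30, 38,
            17, 32, 36, 20, 19, 27, 33, 31]

width = 4
low, high = min(earnings), max(earnings)
num_classes = (high - low) // width  # (38 - 10) / 4 = 7

counts = [0] * num_classes
for e in earnings:
    # Assumption: the last class (34-38) also includes its upper limit, 38.
    index = min((e - low) // width, num_classes - 1)
    counts[index] += 1

print("Class interval  Frequency")
for i, f in enumerate(counts):
    lower = low + i * width
    print(f"{lower}-{lower + width}", f)

# Part 2: percentage of workers earning below Rs. 32, from the raw data.
below_32 = sum(1 for e in earnings if e < 32)
pct_below_32 = 100 * below_32 / len(earnings)
print("Workers earning below Rs. 32:", below_32, "of", len(earnings))
print("Percentage:", pct_below_32)
```

If the strict exclusive method were applied to the last class as well, an eighth class (38-42) would be needed to hold the value 38.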
