
Meaning, Scope and Types of Statistics
ORIGIN AND DEVELOPMENT OF STATISTICS

• Statistics was regarded as the ‘Science of Statecraft’ and was the byproduct of the administrative activity of
the State. The word Statistics seems to have been derived from the Latin word ‘status’ or the Italian word
‘statista’ or the German word ‘statistik’ or the French word ‘statistique’, each of which means a political state.
• In ancient times the scope of Statistics was primarily limited to the collection of the following data by
governments for framing military and fiscal policies:
(i) Age and sex-wise population of the country;
(ii) Property and wealth of the country.
• The former gave the government an idea of the manpower of the country (in order to safeguard
itself against any outside aggression), and the latter provided it with information for the introduction of new
taxes and levies.
• Perhaps one of the earliest censuses of population and wealth was conducted by the Pharaohs (Emperors) of
Egypt in connection with the construction of the famous ‘Pyramids’.
• Such censuses were later held in England, Germany and other western countries in the Middle Ages.
• In India, an efficient system of collecting official and administrative statistics existed even 2000 years ago, in
particular during the reign of Chandragupta Maurya (324 – 300 B.C.). Historical evidence of a very good
system of collecting vital statistics and registering births and deaths even before 300 B.C. is available in
Kautilya’s ‘Arthashastra’.
• The records of land, agriculture and wealth statistics were maintained by Todar Mal, the land and revenue
minister in the reign of Akbar (1556 – 1605 A.D.). A detailed account of the administrative and statistical
surveys conducted during Akbar’s reign is available in the book ‘Ain-i-Akbari’ written by Abul Fazl (in 1596
– 97), one of the nine gems of Akbar.
• The sixteenth century saw the application of Statistics to the collection of data on the movements of
heavenly bodies (stars and planets) in order to determine their positions and to predict eclipses. J. Kepler
made a detailed study of the information collected by Tycho Brahe (1546 – 1601) regarding the movements of
the planets and formulated his famous three laws of planetary motion. These laws
paved the way for the discovery of Newton’s law of gravitation.
• The seventeenth century witnessed the origin of Vital Statistics. Captain John Graunt of London (1620 – 1674),
known as the Father of Vital Statistics, was the first man to make a systematic study of birth and death
statistics. The computation of mortality tables and the calculation of expectation of life at different ages that
followed led to the idea of ‘Life Insurance’, and a life insurance institution was founded in London in
1698. William Petty wrote the book ‘Essay on Political Arithmetic’. In those days Statistics was regarded as
Political Arithmetic.
• The backbone of the so-called modern theory of Statistics is the ‘Theory of Probability’ or the ‘Theory of Games and
Chance’, which was developed in the mid-seventeenth century. The theory of probability grew out of the prevalence
of gambling among the nobles of England and France, who wanted to estimate their chances of winning or losing
a gamble; the chief contributors were the mathematicians and gamblers of France, Germany and England.
• Two French mathematicians, B. Pascal (1623 – 1662) and P. Fermat (1601 – 1665), after a lengthy correspondence
between themselves, ultimately succeeded in solving the famous ‘Problem of Points’ posed by the French gambler
Chevalier de Méré, and this correspondence laid the foundation stone of the science of probability.
• The next stalwart in this field was J. Bernoulli (1654 – 1705), whose great treatise on probability, ‘Ars Conjectandi’, was
published posthumously in 1713, eight years after his death, by his nephew Nicolaus Bernoulli (1687 – 1759). It
contained the famous ‘Law of Large Numbers’, which was later discussed by Poisson, Khinchine and Kolmogorov.
• De Moivre (1667 – 1754) also contributed a lot to this field: he published his famous ‘Doctrine of Chances’ in 1718
and discovered the Normal probability curve, which is one of the most important contributions to Statistics.
• Other important contributors in this field are Pierre-Simon de Laplace (1749 – 1827), who published his monumental
work on probability, ‘Théorie analytique des probabilités’, in 1812; Gauss (1777 – 1855), who gave the principle of
Least Squares and established the ‘Normal Law of Errors’ independently of De Moivre; L.A.J. Quetelet (1796 – 1874),
who discovered the principle of ‘Constancy of Great Numbers’, which forms the basis of sampling; and Euler,
Lagrange, Bayes, etc.
• Russian mathematicians have also made outstanding contributions to the modern theory of probability. The main
contributors, to mention only a few, are: Chebyshev (1821 – 1894), who founded the Russian School of
Statisticians; A. Markov (Markov Chains); Liapounoff (Central Limit Theorem); A. Khinchine (Law of Large
Numbers); A. Kolmogorov (who axiomatized the calculus of probability); Smirnov, Gnedenko and so on.
• Modern stalwarts in the development of the subject of Statistics are Englishmen who did pioneering work in the
application of Statistics to different disciplines.
• Francis Galton (1822 – 1911) pioneered the study of ‘Regression Analysis’ in Biometry; Karl Pearson (1857 –
1936), who founded the greatest statistical laboratory in England, pioneered the study of ‘Correlation Analysis’. His
Chi-Square test of Goodness of Fit is the first and most important of the tests of significance in Statistics; W.S.
Gosset with his t-test ushered in an era of exact (small) sample tests.
• Perhaps most of the work in the statistical theory during the past few decades can be attributed to a single person
Sir Ronald A. Fisher (1890 – 1962) who applied Statistics to a variety of diversified fields such as genetics,
biometry, psychology and education, agriculture, etc., and who is rightly termed as the Father of Statistics. It is
only the varied and outstanding contributions of R.A. Fisher that put the subject of Statistics on a very firm footing
and earned for it the status of a full-fledged science. In addition to enhancing the existing statistical theory he is
the pioneer in Estimation Theory (Point Estimation and Fiducial Inference); Exact (small) Sampling Distributions ;
Analysis of Variance and Design of Experiments.
• Indian statisticians have not lagged behind in making significant contributions to the development of Statistics in
various diversified fields. The valuable contributions of C.R. Rao (Statistical Inference); Parthasarathy (Theory of
Probability); P.C. Mahalanobis and P.V. Sukhatme (Sample Surveys); S.N. Roy (Multivariate Analysis); R.C.
Bose, K.R. Nair, J.N. Srivastava (Design of Experiments), to mention only a few, have placed India’s name on the
world map of Statistics.
DEFINITION OF STATISTICS

• "Statistics", in its modern connotation, "is a body of methods for making wise decisions in the face of
uncertainty." (Wallis and Roberts)
• The field of utility of Statistics has been increasing steadily, and thus
different people have defined it differently according to the development of the subject. In the old days, Statistics was
regarded as the ‘science of statecraft’, but today it embraces almost every sphere of natural and human activity.
Accordingly, the old definitions, which were confined to a very limited and narrow field of enquiry, were
replaced by new definitions which are more exhaustive and elaborate in approach.
• The word Statistics conveys different meanings in the singular and the plural. Used in the
plural, statistics means a set of numerical data; used in the singular, it means the science of
statistical methods embodying the theory and techniques used for collecting, analysing and drawing
inferences from numerical data.
• 1. “Statistics are the classified facts representing the conditions of the people in a State…specially those facts
which can be stated in number or in tables of numbers or in any tabular or classified arrangement.”—Webster.
• 2. “Statistics are numerical statements of facts in any department of enquiry placed in relation to each other.”—
Bowley.
• 3. “By statistics we mean quantitative data affected to a marked extent by multiplicity of causes”.—Yule and
Kendall.
• 4. “Statistics may be defined as the aggregate of facts affected to a marked extent by multiplicity of causes,
numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a
systematic manner, for a predetermined purpose and placed in relation to each other.”—Prof. Horace Secrist.
• Secrist’s definition seems to be the most exhaustive of the four. Let us examine it in detail.
• (i) Aggregate of Facts. Single or isolated items cannot be termed Statistics unless they are part of an
aggregate of facts relating to a particular field of enquiry. For instance, the height of an individual or the
price of a particular commodity does not form Statistics, as such figures are unrelated and not comparable.
However, an aggregate of the figures of births, deaths, sales, purchases, production, profits, etc., over different
times, places, etc., will constitute Statistics.
• (ii) Affected by Multiplicity of Causes. Numerical figures should be affected by a multiplicity of factors. This
is the point emphasised in Yule and Kendall’s definition (3) above. In physical sciences, it is possible to isolate the effect of
various factors on a single item, but it is very difficult to do so in social sciences, particularly when the effect
of some of the factors cannot be measured quantitatively. However, statistical techniques have been devised to
study the joint effect of a number of factors on a single item (Multiple Correlation) or the isolated effect of a
single factor on the given item (Partial Correlation), provided the effect of each of the factors can be measured
quantitatively.
• (iii) Numerically Expressed. Only numerical data constitute Statistics. Thus the statements like ‘the standard of living
of the people in Delhi has improved’ or ‘the production of a particular commodity is increasing’ do not constitute
Statistics. In particular, the qualitative characteristics which cannot be measured quantitatively such as intelligence,
beauty, honesty, etc., cannot be termed as Statistics unless they are numerically expressed by assigning particular scores
as quantitative standards. For example, intelligence is not Statistics but the intelligence quotients which may be
interpreted as the quantitative measure of the intelligence of individuals could be regarded as Statistics.
• (iv) Enumerated or Estimated According to Reasonable Standard of Accuracy. The numerical data pertaining to
any field of enquiry can be obtained by completely enumerating the underlying population. In such a case the data will be
exact and accurate (but for the errors of measurement, personal bias, etc.). However, complete enumeration of the
underlying population may not be possible (e.g., if the population is infinite, or if testing is destructive, i.e., the item is
destroyed in the course of inspection, as in testing explosives, light bulbs, etc.), and even where possible it may not be
practicable (e.g., because the population is very large, the cost of enumeration per unit is high, or our
resources of time and money are limited). In such cases the data are estimated by using the powerful techniques
of Sampling and Estimation theory. However, the estimated values will not be as precise and accurate as the actual
values. The degree of accuracy required of the estimated values largely depends on the nature and purpose of the enquiry. For
example, while measuring the heights of individuals, accuracy will be aimed at in terms of fractions of an inch, whereas
while measuring the distance between two places it may be in terms of metres, and if the places are very distant, say
Delhi and London, a difference of a few kilometres may be ignored. However, certain standards of accuracy must be
maintained for drawing meaningful conclusions.
• (v) Collected in a Systematic Manner. The data must be collected in a very systematic manner. Thus, for any socio-
economic survey, a proper schedule depending on the object of enquiry should be prepared and trained personnel
(investigators) should be used to collect the data by interviewing the persons. An attempt should be made to reduce the
personal bias to the minimum. Obviously, the data collected in a haphazard way will not conform to the reasonable
standards of accuracy and the conclusions based on them might lead to wrong or misleading decisions.
• (vi) Collected for a Pre-determined Purpose. It is of utmost importance to define in clear and concrete
terms the objectives or the purpose of the enquiry, and the data should be collected keeping in view these
objectives. An attempt should not be made to collect too many data, some of which are never examined or
analysed, i.e., we should not waste time in collecting information which is irrelevant for our enquiry. Also it
should be ensured that no essential data are omitted. For example, if the purpose of enquiry is to measure the
cost of living index for low income group people, we should select only those commodities or items which are
consumed or utilised by persons belonging to this group. Thus for such an index, the collection of data on
commodities like scooters, cars, refrigerators, television sets, high quality cosmetics, etc., will be
absolutely useless.
• (vii) Comparable. From a practical point of view, for statistical analysis the data should be comparable. They
may be compared with respect to some unit, generally time (period) or place. For example, the data relating to
the population of a country for different years or the population of different countries in some fixed year
constitute Statistics, since they are comparable. However, the data relating to the size of the shoe of an
individual and his intelligence quotient (I.Q.) do not constitute Statistics as they are not comparable. In order
to make valid comparisons the data should be homogeneous, i.e., they should relate to the same phenomenon
or subject.
• From the definition of Horace Secrist (definition 4 above) and the discussion of its points, we may conclude that:
• “All Statistics are numerical statements of facts but all numerical statements of facts are not Statistics”.
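Point (ii) of Secrist's definition mentions partial correlation as a way to isolate the effect of a single factor. A minimal Python sketch of the first-order partial correlation coefficient follows, using the standard formula r12.3 = (r12 - r13·r23) / sqrt((1 - r13²)(1 - r23²)). The function names and the sales, advertising and price figures are hypothetical, chosen only for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(x1, x2, x3):
    """Correlation between x1 and x2 with the linear effect of x3 removed."""
    r12 = pearson(x1, x2)
    r13 = pearson(x1, x3)
    r23 = pearson(x2, x3)
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical figures: sales, advertising spend, and unit price for six periods.
sales = [20, 24, 27, 31, 36, 38]
advertising = [2, 3, 3, 4, 5, 6]
price = [9, 9, 8, 8, 7, 6]

r_simple = pearson(sales, advertising)
r_partial = partial_correlation(sales, advertising, price)
print(f"simple correlation (sales, advertising): {r_simple:.3f}")
print(f"partial correlation, price held fixed:   {r_partial:.3f}")
```

Comparing the two printed values shows how much of the apparent association between sales and advertising can be attributed to the third factor.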
• “Statistics is a method of decision making in the face of uncertainty on the basis of numerical data and calculated
risks.”—Prof. Ya-Lun-Chou
• “Statistics may be defined as the science of collection, presentation, analysis and interpretation of numerical
data.” —Croxton and Cowden
• “Statistics may be regarded as a body of methods for making wise decisions in the face of uncertainty.” (Wallis
and Roberts)
• “The science and art of handling aggregate of facts—observing, enumeration, recording, classifying and
otherwise systematically treating them.”—Harlow
• Harlow’s definition describes Statistics both as a science and an art: a science, since it provides tools and laws for
the analysis of the numerical information collected from the source of enquiry, and an art, since it undeniably has its
basis upon numerical data collected with a view to maintaining a particular balance and consistency leading to perfect
or nearly perfect conclusions. A statistician, like an artist, will fail in his job if he does not possess the requisite
skill, experience and patience while using statistical tools for any problem.
IMPORTANCE AND SCOPE OF STATISTICS

• Statistics in Planning
• Statistics in State
• Statistics in Economics
• Statistics in Business and Management
• Statistics in Accountancy and Auditing
• Statistics in Industry
• Statistics in Physical Sciences
• Statistics in Social Sciences
• Statistics in Biology and Medical Sciences
• Statistics in Psychology and Education
• Statistics in Big Data and Analytics
Basic Statistical Concepts

• Business statistics, like many areas of study, has its own language. It is important to begin our study
with an introduction of some basic concepts in order to understand and communicate about the
subject. We begin with a discussion of the word statistics. The word statistics has many different
meanings in our culture. Webster’s Third New International Dictionary gives a comprehensive
definition of statistics as a science dealing with the collection, analysis, interpretation, and
presentation of numerical data.
Basic Terminology
• The study of statistics can be organized in a variety of ways. One of the main ways is to subdivide
statistics into two branches: descriptive statistics and inferential statistics.
• A population is a collection of persons, objects, or items of interest.
• When researchers gather data from the whole population for a given measurement of interest,
they call it a census.
• A sample is a portion of the whole and, if properly taken, is representative of the whole.
• If a business analyst is using data gathered on a group to describe or reach conclusions about that
same group, the statistics are called descriptive statistics. For example, if an instructor produces
statistics to summarize a class’s examination effort and uses those statistics to reach conclusions
about that class only, the statistics are descriptive.
• Another type of statistics is called inferential statistics. If a researcher gathers data from a
sample and uses the statistics generated to reach conclusions about the population from which the
sample was taken, the statistics are inferential statistics. The data gathered from the sample are
used to infer something about a larger group. Inferential statistics are sometimes referred to as
inductive statistics.
• One application of inferential statistics is in pharmaceutical research. Some new drugs are
expensive to produce, and therefore tests must be limited to small samples of patients. Utilizing
inferential statistics, researchers can design experiments with small randomly selected samples of
patients and attempt to reach conclusions and make inferences about the population.
• Suppose a soft drink company creates an advertisement depicting a dispensing machine that talks
to the buyer, and market researchers want to measure the impact of the new advertisement on
various age groups. The researcher could stratify the population into age categories ranging from
young to old, randomly sample each stratum, and use inferential statistics to determine the
effectiveness of the advertisement for the various age groups in the population. The advantage of
using inferential statistics is that they enable the researcher to study effectively a wide range of
phenomena without having to conduct a census.
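The stratified sampling idea in the soft drink example can be sketched in Python as below. This is only an illustration under stated assumptions: the population records, the age cut-offs (30 and 60), and the names `stratified_sample` and `age_group` are hypothetical and do not come from the text.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Group units into strata, then draw a simple random sample from each stratum."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {
        name: rng.sample(units, min(per_stratum, len(units)))
        for name, units in strata.items()
    }

def age_group(person):
    """Hypothetical age categories; the cut-offs are assumptions."""
    if person["age"] < 30:
        return "young"
    if person["age"] < 60:
        return "middle"
    return "old"

# Hypothetical population of 300 viewers aged 18 to 77.
population = [{"id": i, "age": 18 + i % 60} for i in range(300)]

sample = stratified_sample(population, age_group, per_stratum=10)
for name, units in sample.items():
    print(name, [person["id"] for person in units])
```

Each stratum is sampled separately, so every age group is represented in the sample even if it is a small share of the population.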
• A descriptive measure of the population is called a parameter. Parameters are usually denoted by
Greek letters. Examples of parameters are population mean (μ), population variance (σ²), and
population standard deviation (σ).
• A descriptive measure of a sample is called a statistic. Statistics are usually denoted by Roman letters.
Examples of statistics are sample mean (x̄), sample variance (s²), and sample standard deviation (s).
• Differentiation between the terms parameter and statistic is important only in the use of inferential
statistics. A business analyst often wants to estimate the value of a parameter or conduct tests about
the parameter. However, the calculation of parameters is usually either impossible or infeasible
because of the amount of time and money required to take a census. In such cases, the business
analyst can take a random sample of the population, calculate a statistic on the sample, and infer by
estimation the value of the parameter.
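The distinction between parameters and statistics can be illustrated with Python's standard `statistics` module. In this sketch the population of 1,000 weekly sales figures is simulated and purely hypothetical. In practice the full population is usually not available, which is why the sample statistics are used as estimates.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: weekly sales figures for 1,000 stores.
population = [rng.gauss(500, 80) for _ in range(1000)]

# Parameters: computed from every unit in the population (divisor N).
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)
sigma = statistics.pstdev(population)

# Statistics: computed from a random sample of 50 stores (divisor n - 1).
sample = rng.sample(population, 50)
x_bar = statistics.fmean(sample)
s2 = statistics.variance(sample)
s = statistics.stdev(sample)

print(f"parameters: mu={mu:.1f}, sigma^2={sigma2:.1f}, sigma={sigma:.1f}")
print(f"statistics: x_bar={x_bar:.1f}, s^2={s2:.1f}, s={s:.1f}")
```

Note that `pvariance` and `pstdev` divide by the population size, while `variance` and `stdev` divide by one less than the sample size.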
• The basis for inferential statistics, then, is the ability to make decisions about parameters without
having to complete a census of the population. For example, a manufacturer of washing machines
would probably want to determine the average number of loads that a new machine can wash before it
needs repairs. The parameter is the population mean or average number of washes per machine
before repair. A company researcher takes a sample of machines, computes the number of washes
before repair for each machine, averages the numbers, and estimates the population value or
parameter by using the statistic, which in this case is the sample average.
• The Inferential Process: Inferences about parameters are made under uncertainty. Unless
parameters are computed directly from the population, the statistician never knows with certainty
whether the estimates or inferences made from samples are true. In an effort to estimate the level of
confidence in the result of the process, statisticians use probability statements.
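One common form of probability statement is a confidence interval. The sketch below applies it to the washing machine example, using the sample mean as the estimate of the population mean. The 40 simulated load counts are hypothetical, and the interval uses a normal approximation, which is an assumption that is only reasonable for moderately large samples; for small samples a t-based interval would be the usual choice.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: loads washed before first repair for 40 machines.
rng = random.Random(7)
loads_before_repair = [rng.gauss(2000, 300) for _ in range(40)]

low, high = mean_confidence_interval(loads_before_repair)
print(f"sample mean: {statistics.fmean(loads_before_repair):.0f} loads")
print(f"approximate 95% interval: {low:.0f} to {high:.0f} loads")
```

The interval expresses the uncertainty attached to the estimate: a higher confidence level gives a wider interval for the same sample.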
• Business statistics is about measuring phenomena in the business world and organizing, analyzing, and
presenting the resulting numerical information in such a way that better, more informed business
decisions can be made. Most business statistics studies contain variables, measurements, and data.
• In business statistics, a variable is a characteristic of any entity being studied that is capable of taking
on different values.
• Some examples of variables in business include return on investment, advertising dollars, labor
productivity, stock price, historical cost, total sales, market share, age of worker, earnings per share,
miles driven to work, time spent in store shopping, and many, many others. In business statistics
studies, most variables produce a measurement that can be used for analysis.
• A measurement is taken when a standard process is used to assign numbers to particular attributes
or characteristics of a variable. Many measurements are obvious, such as time spent in a store shopping
by a customer, age of the worker, or the number of miles driven to work. However, some measurements,
such as labor productivity, customer satisfaction, and return on investment, have to be defined by the
business analyst or by experts within the field.
• Once such measurements are recorded and stored, they can be denoted as “data.” It can be said that
data are recorded measurements. The processes of measuring and data gathering are basic to all that
we do in business statistics and analytics. It is data that are analyzed by business statisticians and
analysts in order to learn more about the variables being studied. Sometimes, sets of data are organized
into databases as a way to store data or as a means for more conveniently analyzing data or comparing
variables.
• Valid data are the lifeblood of business statistics and business analytics, and it is important that the
business analyst give thoughtful attention to the creation of meaningful, valid data before embarking on
analysis and reaching conclusions.
END
