# Statistics

What is Statistics?

Statistics is a collection of numerical figures or data obtained from some field of human activity, such as statistics of population, literacy rate, inflation rate, GDP growth rate etc. Such statistics are commonly found in newspapers, journals and reports, and you can also hear them on radio or television.

For example: Population of India, 1901-2001

| Census Year | Population (million) |
|---|---|
| 1901 | 238.4 |
| 1911 | 252.1 |
| 1921 | 251.3 |
| 1931 | 279.0 |
| 1941 | 318.7 |
| 1951 | 361.1 |
| 1961 | 439.2 |
| 1971 | 548.1 |
| 1981 | 683.3 |
| 1991 | 843.4 |
| 2001 | 1027.0 |

Size and literacy rate of different states of India, 2001

| States (1) | Area in sq.km (2) | Literacy rate (3) |
|---|---|---|
| Andhra Pradesh | 2,75,045 | 60.5 |
| Assam | 78,438 | 63.3 |
| Bihar | 94,163 | 47 |
| Gujarat | 1,96,024 | 69.1 |
| Haryana | 44,212 | 67.9 |
| Karnataka | 1,91,791 | 66.6 |
| Kerala | 38,863 | 90.9 (highest literacy rate) |
| Madhya Pradesh | 3,08,245 | 63.7 |
| Maharashtra | 3,07,713 | 76.9 |
| Orissa | 1,55,707 | 63.1 |
| Punjab | 50,362 | 69.7 |
| Rajasthan | 3,42,239 (largest area in the table) | 60.4 |
| Tamil Nadu | 1,30,058 | 73.5 |
| Uttar Pradesh | 2,40,928 | 56.3 |
| West Bengal | 88,752 | 68.6 |

Characteristics of Statistics:
1) In statistics all the available information is expressed in quantitative terms.

2) Statistics must be an aggregate of facts: Statistics is not concerned with an individual numerical figure; it is concerned only with a group of related numerical figures. Single and isolated figures are not statistics, for the simple reason that such figures are unrelated, cannot be compared, and no meaningful inference can be drawn from them. Always keep in mind that statistics is not just a group of numerical figures but a group of related numerical figures, so that comparison is possible and a meaningful inference can be drawn on that basis.

For example:

Consider the following information:
• The population of India in 2001 is 1027 million.
• The size of Madhya Pradesh is 308245 sq.km.
• The literacy rate of Kerala is 90.9%.

Q) This is also a group of figures, but will you consider this group of numerical figures as statistics?

Ans) No. The figures provided above are totally unrelated to each other: the first refers to the population size of India, the second to the size of the state of Madhya Pradesh, and the third to the literacy level of the state of Kerala. No comparison is possible among them, so this group of numerical figures would not be considered as statistics.

But if, instead of the single figure for the population of India in 2001, we provide the series of population figures starting from 1901, as given in Table-1, then it will be considered as statistics. All the figures from 1901 to 2001 refer to the population size of India at different times, and hence they are related. By comparing the figures of successive census years we can easily infer that the population of India grew over the period, with only a slight dip in 1921.

| Census Year | Population (million) |
|---|---|
| 1901 | 238.4 |
| 1911 | 252.1 |
| 1921 | 251.3 |
| 1931 | 279.0 |
| 1941 | 318.7 |
| 1951 | 361.1 |
| 1961 | 439.2 |
| 1971 | 548.1 |
| 1981 | 683.3 |
| 1991 | 843.4 |
| 2001 | 1027.0 |

Table-1: The population of India grew over the period, apart from a slight dip in 1921.

Similarly, if instead of the single figure for the size of Madhya Pradesh we also provide the sizes of the other states of India (in sq.km), as given in Table-2, then it will be considered as statistics. All the figures refer to the sizes of different states of India, and hence they are related. By comparing the states in respect of their size, we can easily infer that Rajasthan (3,42,239 sq.km) is the largest of the listed states and that Madhya Pradesh (3,08,245 sq.km) is the second largest.

| States (1) | Area in sq.km (2) |
|---|---|
| Andhra Pradesh | 2,75,045 |
| Assam | 78,438 |
| Bihar | 94,163 |
| Gujarat | 1,96,024 |
| Haryana | 44,212 |
| Karnataka | 1,91,791 |
| Kerala | 38,863 |
| Madhya Pradesh | 3,08,245 |
| Maharashtra | 3,07,713 |
| Orissa | 1,55,707 |
| Punjab | 50,362 |
| Rajasthan | 3,42,239 (largest area in the table) |
| Tamil Nadu | 1,30,058 |
| Uttar Pradesh | 2,40,928 |
| West Bengal | 88,752 |

Table-2: Rajasthan is the largest of the listed states; Madhya Pradesh is the second largest.

Similarly, if instead of the single figure for the literacy rate of Kerala we also provide the literacy rates of the other states of India, as given in Table-3, then it will be considered as statistics. All the figures refer to the literacy rates of different states, and hence they are related. By comparing the states in respect of their literacy rate we can easily infer that the literacy rate is highest in Kerala, which could not be concluded from the single figure for Kerala alone.

| States (1) | Literacy rate (3) |
|---|---|
| Andhra Pradesh | 60.5 |
| Assam | 63.3 |
| Bihar | 47 |
| Gujarat | 69.1 |
| Haryana | 67.9 |
| Karnataka | 66.6 |
| Kerala | 90.9 (highest literacy rate) |
| Madhya Pradesh | 63.7 |
| Maharashtra | 76.9 |
| Orissa | 63.1 |
| Punjab | 69.7 |
| Rajasthan | 60.4 |
| Tamil Nadu | 73.5 |
| Uttar Pradesh | 56.3 |
| West Bengal | 68.6 |

Table-3: The literacy rate is highest in Kerala.

3) The purpose of collecting statistical data should be well defined and specific: For example, if the objective is to collect data on prices, it would not serve the purpose unless one specifies whether the price refers to the wholesale price or the retail price, and which commodities are taken into consideration.

4) Statistics are affected by a multiplicity of causes: In every field of enquiry, the observed data are the result of a large number of factors, each of which contributes to the final figure. For example, statistics of the production of rice are affected by rainfall, quality of soil, fertilizer, method of cultivation etc. Statistics of the sales volume of a particular product are affected by the price of the product itself, the price of competing products, the purchasing power of consumers, the taste of consumers etc. Statistical methods try to separate the effects of the various factors and study them.

In the absence of the above characteristics, numerical data cannot be called statistics. Hence "all statistics are numerical statements of facts, but all numerical statements of facts are not statistics".

Statistical Method

In addition to meaning numerical data, statistics also refers to a subject, i.e. a body of methods for dealing with numerical data, known as statistical methods. Statistical methods are used for the collection, organisation, presentation, analysis and interpretation of numerical data.

1) Collection of data

Collection of data constitutes the first step in a statistical investigation. The data must be collected with the utmost care because they form the foundation of statistical analysis. If the data are faulty, then the whole analysis based on them will be misleading and will lead to a wrong conclusion. The data may be available from existing published or unpublished sources, or may be collected by the investigator himself.

2) Organisation of data

After the data have been collected, the next step is to organise and present the raw data in some suitable form. The need for proper presentation arises because the raw data collected from the investigation are so varied and numerous that no meaningful inference can be drawn from them as they stand.

Types of organisation
Organisation involves three steps: a) Editing b) Classification c) Tabulation

a) Editing of data

The collected data must be edited or scrutinised very carefully, so that omissions, inconsistencies, irrelevant answers and wrong computations in the returns from the survey are corrected. We should always remember that if the data are faulty, then the whole analysis based on this information will be misleading.

1) Editing for completeness:- The editor should check that each questionnaire is complete in all respects, i.e. each and every question has been answered. If some questions have not been answered, the informants should be contacted again, either personally or through correspondence, and the required correction should be made.

2) Editing for consistency:- The editor should check that the answers to the questions are not contradictory. If contradictory answers are found, the informants should be contacted again, either personally or through correspondence, and the required correction should be made. For example:- if, among others, two questions in the questionnaire are:
a) Are you employed?
b) What is your monthly salary?
and the reply to the former is "no" and to the latter is Rs.5000/- per month, then there is a contradiction and it should be clarified.

3) Editing for accuracy:- The reliability of the whole analysis depends on the correctness of the information. If the information supplied is inaccurate, then the whole analysis will be misleading and will lead to a wrong conclusion. It is therefore necessary for the editor to check that the information is accurate in all respects. If the inaccuracy is due to mathematical errors, it can be easily detected and rectified. But if the cause of the inaccuracy is faulty information supplied by the informants, it may be difficult to rectify, for example:- information relating to income, age, sales etc.

4) Editing for homogeneity:- The editor should check that the information supplied by the various informants is homogeneous or uniform. For example:- in answer to the question about the income of a person, if some informants have given their monthly income, some their annual income and some their daily or weekly income, then no comparison can be made. So the editor has to make sure that the information supplied by the informants is homogeneous, e.g. that all incomes are stated as monthly income.
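The editing checks described above can be sketched in Python. This is a minimal illustration under stated assumptions: the questionnaire returns, the field names (`employed`, `income`, `period`) and the helper functions are all hypothetical, and only monthly and annual incomes are handled.

```python
# Hypothetical questionnaire returns; None marks an unanswered question.
responses = [
    {"id": 1, "employed": "yes", "income": 5000, "period": "monthly"},
    {"id": 2, "employed": "no", "income": 5000, "period": "monthly"},   # contradictory
    {"id": 3, "employed": "yes", "income": None, "period": "monthly"},  # incomplete
    {"id": 4, "employed": "yes", "income": 72000, "period": "annual"},  # different unit
]

# Number of months covered by each reporting period (assumed: only these two occur).
MONTHS_COVERED = {"monthly": 1, "annual": 12}


def missing_answers(response):
    """Editing for completeness: list the questions left unanswered."""
    return [question for question, answer in response.items() if answer is None]


def is_contradictory(response):
    """Editing for consistency: not employed, yet a salary is reported."""
    return response["employed"] == "no" and bool(response["income"])


def monthly_income(response):
    """Editing for homogeneity: express every reported income per month."""
    if response["income"] is None:
        return None
    return response["income"] / MONTHS_COVERED[response["period"]]


# Informants who should be contacted again for clarification.
to_recontact = [
    r["id"] for r in responses if missing_answers(r) or is_contradictory(r)
]
print("Contact again:", to_recontact)
print("Monthly incomes:", [monthly_income(r) for r in responses])
```

Editing for accuracy is left out of the sketch on purpose: as noted above, faulty information supplied by informants generally cannot be detected by a mechanical check.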

b) Classification of Data

Classification is the process of arranging the collected statistical information under different classes according to some common characteristics possessed by the individual members.

For example

During the population census conducted by the Census of India, apart from the number of members in each family, various other statistical information on different characteristics, e.g. sex, age, geographical location etc., is collected for all people in the country. The total population is then classified according to these characteristics.

For example:- Classification on the basis of sex

Total population of India in 2001

| | Persons | Males | Females |
|---|---|---|---|
| Population in number | 1,028,737,436 | 532,223,090 | 496,514,346 |
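Classification amounts to counting individual records under each class. The sketch below illustrates this with Python's `collections.Counter`; the individual records and the age-group boundaries are hypothetical, while the census totals at the end are the ones quoted above.

```python
from collections import Counter

# Hypothetical individual records; a real census holds one per person.
people = [
    {"sex": "male", "age": 34, "area": "rural"},
    {"sex": "female", "age": 29, "area": "urban"},
    {"sex": "female", "age": 8, "area": "rural"},
    {"sex": "male", "age": 67, "area": "urban"},
    {"sex": "female", "age": 45, "area": "rural"},
    {"sex": "male", "age": 12, "area": "rural"},
]


def age_group(age):
    """Assign an age to a class (the class boundaries are an assumption)."""
    if age < 15:
        return "0-14"
    if age < 60:
        return "15-59"
    return "60+"


# Classification by one characteristic at a time.
by_sex = Counter(p["sex"] for p in people)
by_age = Counter(age_group(p["age"]) for p in people)

# Classification by two characteristics together.
by_sex_and_area = Counter((p["sex"], p["area"]) for p in people)

print(by_sex)
print(by_age)
print(by_sex_and_area)

# The classes must add up to the total, as they do in the 2001 census figures.
males, females, persons = 532_223_090, 496_514_346, 1_028_737_436
print("Males + females equals persons:", males + females == persons)
```

The last line is a simple consistency check of the kind an editor would apply to any classified total.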

c) Tabulation of data

Tabulation is the logical and systematic organisation of statistical data in rows and columns, designed to simplify the presentation. Tabulation makes the significance of the data readily understood and leaves a more lasting impression than textual presentation. It enables the reader to locate the desired information quickly, and it facilitates quick comparison of the statistical data shown in the rows and columns.

3) Representation of Data

In order to make the data suitable for analysis and interpretation, after classification and tabulation the data have to be presented properly using charts and diagrams, so that the salient characteristics of the data, which are crucial for decision making and for the adoption of new policies, come out.

Advantages:-

Diagrams are appealing to the eye as well as to the intellect, and are therefore helpful in assimilating the data readily and quickly. They help to find the relative position of different sub-divisions, so that a meaningful inference can be drawn on the basis of this comparison. Moreover, a chart or diagram can clarify a complex problem and reveal facts which are not apparent from the tabular form. Diagrams are also useful for finding the trend in a time series.

4) Analysis of data

After collection, organisation and diagrammatic presentation, the next step is to analyse the character of the data and to identify the problems associated with it. Statistical analysis is a method of abstracting significant facts from the large mass of numerical data collected during the enquiry. The methods used in analysing the presented data are numerous, ranging from simple observation of the data to complicated, sophisticated and highly mathematical techniques, such as averages, dispersion, correlation, regression, time-series analysis etc.

5) Interpretation of data

The last step of a statistical investigation is interpretation, i.e. drawing conclusions from the data collected and analysed, and suggesting suitable policies to be adopted in future. The interpretation of data is a difficult task and requires a high degree of skill and experience. If the data that have been analysed are not properly interpreted, the whole purpose of the investigation may be defeated and a wrong conclusion may be reached. Correct interpretation will lead to a valid conclusion of the study and can thus aid decision-making.

Consider the following example:-

Suppose we have statistics of the annual sales of a particular product, say plain potato chips, for the last 20 years, as shown in the table.

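| Year | Annual sales of chips |
|---|---|
| 1988 | 235568 |
| 1989 | 235574 |
| 1990 | 235586 |
| 1991 | 235596 |
| 1992 | 235623 |
| 1993 | 235635 |
| 1994 | 235656 |
| 1995 | 235658 |
| 1996 | 235666 |
| 1997 | 235689 |
| 1998 | 235698 |
| 1999 | 235689 |
| 2000 | 235655 |
| 2001 | 235645 |
| 2002 | 235617 |
| 2003 | 235623 |
| 2004 | 235621 |
| 2005 | 235616 |
| 2006 | 235613 |
| 2007 | 235609 |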

In this particular problem, it is hard to say anything, or to identify the underlying problem, by simply looking at the annual sales figures of potato chips. What we require is a proper presentation of the data through diagrams or charts. Suppose in this case we present the facts using a bar diagram.

Annual sales figure of plain potato chips, 1988-2007

Now it is quite apparent that the sales of plain potato chips increased up to the year 1998, but after 1998 the sales of the product began to decrease, and they have fallen in almost every year since.

So by ordinary observation of the data in tabular form, it is difficult to see that the sales of the product have gradually decreased in the last few years. But through the diagrammatic presentation of the data, i.e. the bar diagram, we can easily see that after a certain year the sales of the product started to decrease (specifically, after 1998).
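The pattern that the bar diagram reveals can also be checked directly from the sales figures. The following Python sketch is only illustrative: the variable names and the baseline of 235500 used for the crude text bars are choices made here, not part of the original example.

```python
years = list(range(1988, 2008))
sales = [
    235568, 235574, 235586, 235596, 235623, 235635, 235656, 235658,
    235666, 235689, 235698, 235689, 235655, 235645, 235617, 235623,
    235621, 235616, 235613, 235609,
]
sales_by_year = dict(zip(years, sales))

# Year with the highest annual sales.
peak_year = max(sales_by_year, key=sales_by_year.get)

# Change in sales compared with the previous year.
changes = {y: sales_by_year[y] - sales_by_year[y - 1] for y in years[1:]}

# Years after the peak in which sales nevertheless rose.
rises_after_peak = [y for y in years if y > peak_year and changes[y] > 0]

# Crude text bar chart: one '#' per 10 units above an assumed baseline.
baseline = 235500
for y in years:
    print(y, "#" * ((sales_by_year[y] - baseline) // 10))

print("Peak year:", peak_year)
print("Rises after the peak:", rises_after_peak)
```

The bars are drawn from a baseline close to the data because the whole series varies by only about 130 units; bars drawn from zero would all look the same height. This makes the turning point visible, but it also exaggerates the size of the decline.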

After identifying the problem, in order to identify the causes behind the fall in demand, we have to apply different statistical techniques such as correlation, regression, time-series analysis etc.
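As one hedged illustration of such a technique, the sketch below computes the Pearson correlation coefficient between year and sales, separately for the period up to 1998 and the period from 1998 onwards. The helper function `pearson` and the split at 1998 are choices made for this example; the correlation is written out by hand rather than taken from a library.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


years = list(range(1988, 2008))
sales = [
    235568, 235574, 235586, 235596, 235623, 235635, 235656, 235658,
    235666, 235689, 235698, 235689, 235655, 235645, 235617, 235623,
    235621, 235616, 235613, 235609,
]

# Split the series at the peak year, which belongs to both periods.
split = years.index(1998)
r_rising = pearson(years[: split + 1], sales[: split + 1])
r_falling = pearson(years[split:], sales[split:])

print(f"1988-1998: r = {r_rising:.2f}")
print(f"1998-2007: r = {r_falling:.2f}")
```

A strongly positive coefficient for the first period and a strongly negative one for the second confirm the change of trend, but a correlation with time alone does not identify a cause. For that, data on other factors such as prices or competing products would be needed, and such data are not part of this example.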

The last step of the statistical investigation is interpretation, i.e. drawing conclusions from the data collected and analysed, and suggesting suitable policies to be adopted in future. After analysing the annual sales figures of plain potato chips for the last 20 years, we may conclude that the main reason for the decline of the product in the last 10 years is the increase in competition in the market. In the last 10 years many competing products (for example Lays, Kurkure etc.) have been launched in the market, which are sold at a cheaper price and offer a wide variety of tastes, and this makes consumers switch away from plain chips.

The future policies that need to be undertaken in order to survive in the competitive market are:

• Improve the quality and introduce more variety in taste.
• Improve the packaging of the product.
• Adopt proper advertising strategies in order to increase the sales volume.

Statistics in states
Statistical data and statistical methods are applied in order to analyse the performance of a state in any field.
• In order to know the quality of education, data on the literacy rate, number of schools, number of teachers per school and school enrollment rate are taken.
• In order to know the health condition, the birth rate, death rate, fertility rate, infant mortality rate, neonatal mortality rate, and the number of health centres, hospitals and doctors are taken.

Statistics in Business
In order to know the performance of an existing product in the market, and what policies need to be undertaken in the near future to improve its performance, most firms are now interested in conducting market research on their products. Market research should be conducted not only for existing products but also before a new product is launched, in order to know the possible market potential for the product.

Steps of market survey

Market research needs to be conducted by following these steps:

1) Set the objective of the survey.
2) Set the sampling frame.
3) Questionnaire designing.
4) Field survey.
5) Collection of data.
6) Editing and organisation of the raw data, so that a meaningful inference can be drawn from it.
7) After transformation of the raw data into a database, the researcher analyses the data on population, purchasing power, prices of the raw materials used, habits of consumers, prices and behaviour of competing products etc., using different statistical techniques, i.e. time series, index numbers, regression and correlation analysis. All these factors are statistically taken into account before fixing the price of the new commodity, so that it may find a suitable place in the market.

Such studies help to reveal the possible market potential for the new product, which is helpful in establishing sales territories, advertising strategies to increase the sales volume etc. They also help to find the key reason behind a fall in demand for existing commodities, and to suggest the changes or policies that need to be undertaken in the near future in order to improve the performance of an existing product in the market.
