
# Data

Numbers or observations are usually obtained by some process of counting or
measurement. These are referred to collectively as data (the raw material of statistics).
For example, the heights and weights of the students of a class are data. As another
example, we can record the number of telephones that several workers install on a given
day, or that one worker installs per day over a period of several days, and call the
results our data. A collection of data is called a data set, and a single observation is
called a data point.
Reasons for obtaining data
1. Data are needed to provide the necessary input to a survey.
2. Data are needed to provide the necessary input to a study.
3. Data are needed to measure performance of an ongoing service or production
process.
4. Data are needed to evaluate conformance to standards.
5. Data are needed to assist in formulating alternative courses of action in a
decision making process.
6. Data are needed to satisfy our curiosity.
Cross-sectional and time series data. Cross-sectional data are data collected at
the same or approximately the same point in time. Time series data are data
collected over several time periods.
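
The telephone-installation example can be arranged either way. The sketch below uses made-up worker names and counts (none of them come from the text) to show how the two layouts differ: the cross-section has no natural order, while the time series does.

```python
# Hypothetical counts of telephones installed; names and numbers are illustrative only.

# Cross-sectional data: several workers observed on the same day.
cross_sectional = {"worker_a": 5, "worker_b": 7, "worker_c": 4}

# Time series data: one worker observed over several consecutive days.
time_series = [("day 1", 5), ("day 2", 6), ("day 3", 4), ("day 4", 7)]

# Order carries no meaning in the cross-section, so a summary ignores it.
average_per_worker = sum(cross_sectional.values()) / len(cross_sectional)

# Order matters in the time series, so day-to-day changes are meaningful.
daily_changes = [
    later[1] - earlier[1]
    for earlier, later in zip(time_series, time_series[1:])
]

print(f"Average per worker on one day: {average_per_worker:.2f}")
print(f"Day-to-day changes for one worker: {daily_changes}")
```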
Variable
Suppose a man is an element of a population and possesses certain characteristics, such
as height, weight, age, hair color, etc. Each of these characteristics varies from man
to man, either in magnitude or in quality, and is therefore called a variable.
There are three basic ways of classifying a data set: (i) by the number of variables
(univariate, bivariate or multivariate), (ii) by the kind of information (numbers or
categories) represented by each variable and (iii) by whether the data set is a time
sequence or comprises cross-sectional data (cross-sectional is just a fancy way of
saying that no time sequence is involved, e.g., the first quarter 1996 earnings of eight
aerospace firms).
Univariate (one-variable) data sets have just one piece of information recorded for
each item. For example, only the heights of the students of a class.
Bivariate (two-variable) data sets have exactly two pieces of information recorded
for each item. For example, the heights and weights of the students of a class.
For bivariate data, in addition to looking at each variable as a univariate data set, you
can study the relationship between the two variables and predict one variable from
the other.
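
As a rough illustration of what "study the relationship" and "predict one variable from the other" can mean in practice, the sketch below computes a correlation coefficient and a least-squares line. The heights and weights are hypothetical, and the choice of Pearson correlation and a straight-line fit is an assumption made here for illustration; the text does not prescribe a particular method.

```python
from statistics import mean

# Hypothetical bivariate data set: (height in inches, weight in lbs) per student.
students = [(60, 110), (62, 120), (64, 125), (66, 140), (68, 150), (70, 165)]

heights = [h for h, _ in students]
weights = [w for _, w in students]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


r = pearson_r(heights, weights)
slope, intercept = fit_line(heights, weights)
predicted = slope * 65 + intercept  # predicted weight for a 65-inch student

print(f"r = {r:.3f}")
print(f"predicted weight at 65 in = {predicted:.1f} lbs")
```

Each list on its own (`heights` or `weights`) is a univariate data set; the pairing of the two is what makes the relationship and the prediction possible. Because 65 inches happens to be the mean height in this made-up sample, the prediction equals the mean weight of 135 lbs.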

Multivariate (many-variable) data sets have three or more pieces of information recorded for each item. For example, the heights, weights and lengths of forearms of the students of a class. With multivariate data, too, you can look at each variable individually, as well as examine the relationships among the variables and predict one variable from the others.

Types of Data
Variables may be either quantitative or qualitative. A quantitative variable can be measured, while a qualitative variable can only be categorized. For example, the height of a man, the yield of a crop and the price of a commodity are quantitative variables, while hair color is a qualitative variable. Some categories of hair color are black hair, golden hair and white hair. A characteristic used to classify an individual into different categories is called an attribute.

A quantitative (measurable) variable may be of two types: discrete and continuous.

Discrete variable
When a variable can assume only isolated values, it is called a discrete variable. For example, if the number of children in a family is the variable of interest, it obviously cannot assume fractional values, so it is a discrete variable.

Continuous variable
A continuous variable is one that can take any value within some range. The height of a man is a continuous variable, since it can take any value, whether an integral number of inches or a fraction of an inch.

Another example of types of data:

| Data Type | Question Type | Responses |
|---|---|---|
| Categorical | Do you currently own U.S. Government Savings Bonds? | Yes / No |
| Numerical, discrete | To how many magazines do you currently subscribe? | 3 (number) |
| Numerical, continuous | How tall are you? | 67¼ inches |

Source: Berenson and Levine, Basic Business Statistics: Concepts and Applications, seventh edition, 1999, page 25 (Types of Data).

Levels of Measurement and Types of Measurement Scales
(Source: Berenson and Levine, 1999, page 27.)
Statistical data may be broadly classified as categorical and numerical. Categorical data are of two types, nominal and ordinal, while numerical data are measured on an interval scale or a ratio scale.

Attributes
The distinct categories of a qualitative variable are sometimes called attributes. A worker who is reported to be smoking, for example, is assigned to the category "smoker". His smoking behavior is used to classify him as a smoker, and thus it is an attribute.

Nominal data. Qualitative measurements with no ordering among the categories are nominal, regardless of whether the categories are designated by names (red, white, male) or numerals (June 20, Room 10). Examples: religion, urban-rural, political affiliation, etc.

Example of nominal scaling:

| Categorical Variable | Categories |
|---|---|
| Automobile ownership | Yes, No |
| Political party affiliation | Democrat, Republican, Independent, Other |

Ordinal data. When there is an ordered relationship among the categories, the variable is said to be an ordinal variable. For example, we can classify level of knowledge as good, average and poor, or education level as illiterate, primary and secondary. Another example of ordinal data is job classification, such as president, vice-president, departmental head and associate department head, recorded for each of a group of executives.

Example of ordinal scaling:

| Categorical Variable | Ordered Categories |
|---|---|
| Product satisfaction | Lowest to highest: Very Unsatisfied, Fairly Unsatisfied, Neutral, Fairly Satisfied, Very Satisfied |
| Faculty rank | Highest to lowest: Professor, Associate Professor, Assistant Professor, Lecturer |

Interval Scale: Data generated through the measurement of an interval variable are called interval data. A thermometer, for example, measures temperature in degrees, which are the same size at any point on the scale. The difference between 20 °C and 21 °C is the same as the difference between 12 °C and 13 °C. Examples: I.Q. test score, calendar time (3 p.m. to 6 p.m.), etc.

Ratio Scale: Ratio data have all the ordering and distance properties of interval data. In addition, a 'zero point' can be meaningfully designated. For example, it is quite meaningful to say that a 4-foot-tall boy is twice as tall as a 2-foot-tall boy. Examples: height, weight, distance (in km), fat consumed (in gm), etc.
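
The difference between interval and ratio scales can be checked with a change of units. The sketch below is only an illustration: the 10 °C and 20 °C comparison is a made-up pair, and the conversion helpers are written here for the example. Differences survive a unit change on both scales, but ratios survive only on a ratio scale, where zero means "none of the quantity".

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def ft_to_cm(feet):
    """Convert feet to centimetres."""
    return feet * 30.48


# Interval scale: equal differences remain equal after a change of unit.
diff_a_f = c_to_f(21) - c_to_f(20)
diff_b_f = c_to_f(13) - c_to_f(12)

# Ratios on an interval scale depend on the unit, because the zero is arbitrary.
ratio_c = 20 / 10                  # 2.0 in Celsius
ratio_f = c_to_f(20) / c_to_f(10)  # 68 / 50 = 1.36 in Fahrenheit

# Ratio scale: the ratio is the same whatever the unit.
ratio_ft = 4 / 2
ratio_cm = ft_to_cm(4) / ft_to_cm(2)

print(f"Fahrenheit differences: {diff_a_f:.1f} and {diff_b_f:.1f}")
print(f"Temperature ratio: {ratio_c:.2f} (Celsius) vs {ratio_f:.2f} (Fahrenheit)")
print(f"Height ratio: {ratio_ft:.2f} (feet) vs {ratio_cm:.2f} (centimetres)")
```

This is why "twice as tall" is a meaningful statement while "twice as hot" (in degrees Celsius or Fahrenheit) is not.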

Examples of interval and ratio scaling:

| Numerical Variable | Level of Measurement |
|---|---|
| Temperature (in degrees Celsius or Fahrenheit) | Interval |
| Calendar time (Gregorian, Hebrew or Islamic) | Interval |
| Height (in inches or centimeters) | Ratio |
| Weight (in pounds or kilograms) | Ratio |
| Age (in years or days) | Ratio |
| Salary (in American dollars or Japanese yen) | Ratio |

Sources of data
Primary data. When the investigator collects first-hand data for the purpose at hand, such data are known as primary data.
Secondary data. When the investigator obtains the data from published or unpublished government, industrial or individual sources, such data constitute secondary data.

Technique of data collection
There are five important techniques of data collection, namely (i) census, (ii) sample survey, (iii) focus group discussion, (iv) telephone interview and (v) data collection through electronic media. In addition, (vi) we may design an experiment to obtain the necessary data.

Classification
Classification is the process of arranging individuals in groups or classes according to their affinities.

Types of Classification
Broadly, the data can be classified on the following four bases:
1. Geographical, i.e., area-wise, e.g., cities, districts, etc.
2. Chronological, i.e., on the basis of time.
3. Qualitative, i.e., according to some attributes.
4. Quantitative, i.e., in terms of magnitudes.

1. Geographical Classification. In this type of classification, data are classified on the basis of geographical or locational differences between the various items, like States, cities, regions, zones, areas, etc. For instance, the production of foodgrains in India may be presented State-wise in the following manner:

State-wise Estimates of Production of Foodgrains: 1987-88

| Name of State | Total Foodgrains (Thousand tonnes) |
|---|---|
| Andhra Pradesh | … |
| Bihar | … |
| Haryana | … |
| Punjab | … |
| Uttar Pradesh | … |
| All India | … |

Geographical classifications are usually listed in alphabetical order for easy reference.

2. Chronological Classification. In this type, statistical data are classified according to the time of occurrence, such as years, months, weeks, days, hours, etc. Time series are also called chronological classifications. They are further classified into those relating to a period of time and those relating to a point of time. For example, census data are expressed in decades, national income is expressed every year, and departmental sales are expressed every month or week. Statistical data regarding population, sales in a firm, imports, exports, etc., also come under this classification. Chronological classification is illustrated below:

Population of India from 1921 to 1981

| Year | Population (in million) |
|---|---|
| 1921 | 248 |
| 1931 | 276 |
| 1941 | 313 |
| 1951 | 357 |
| 1961 | 438 |
| 1971 | 536 |
| 1981 | 684 |

3. Qualitative Classification. When the data are classified according to some quality or attribute, such as sex, colour, religion, literacy, honesty, intelligence, marital status, blindness, deafness, etc., the classification is termed qualitative or descriptive classification by attributes. In this type we can only find out the presence or absence of the attributes. It can again be divided into two types: (a) simple classification and (b) manifold classification.

(a) Simple classification. If the data are classified into only two classes, such as literate and illiterate, honest and dishonest, or skilled and unskilled, the classification is termed simple classification. This classification is normally a dichotomy, i.e., twofold, for example:

- Population: Male, Female
- Population: Literate, Illiterate

(b) Manifold classification. In manifold classification, the universe is classified on the basis of more than one attribute at a time. For example, we may first divide the population into males and females on the attribute of sex, then further divide them on the basis of literacy, and so on:

- Population
  - Male
    - Literate: Married, Unmarried
    - Illiterate: Married, Unmarried
  - Female
    - Literate: Married, Unmarried
    - Illiterate: Married, Unmarried

4. Quantitative Classification. Quantitative classification refers to the classification of data according to some characteristic that can be measured, such as height, weight, income, sales, production, profits, etc., in the given units. For example, the students of a college may be classified according to weight as follows:

| Weight (in lbs) | No. of Students |
|---|---|
| 90 – 100 | 50 |
| 100 – 110 | 200 |
| 110 – 120 | 260 |
| 120 – 130 | 360 |
| 130 – 140 | 90 |
| 140 – 150 | 40 |
| Total | 1,000 |
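
A weight table of this kind is built by counting how many raw observations fall into each class interval. The sketch below shows one way to do that with hypothetical weights for ten students (not the college data from the text). It also assumes that each class includes its lower limit and excludes its upper limit, so that a weight of exactly 100 lbs is counted in 100 – 110; the text does not say which convention it uses.

```python
from collections import Counter

# Hypothetical raw weights (lbs) of ten students; not data from the text.
weights = [92, 101, 104, 109, 113, 118, 121, 127, 133, 146]


def class_interval(value, start=90, width=10):
    """Return the (lower, upper) class containing value.

    Assumed convention: lower limit included, upper limit excluded.
    """
    lower = start + ((value - start) // width) * width
    return (lower, lower + width)


# Count the number of observations in each class interval.
frequency = Counter(class_interval(w) for w in weights)

for (lower, upper), count in sorted(frequency.items()):
    print(f"{lower} - {upper}: {count}")
print(f"Total: {sum(frequency.values())}")
```

The class frequencies always add up to the number of observations, which is a quick check on any table of this kind.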

Such a distribution is known as an empirical frequency distribution or simple frequency distribution.

Series
Series represented by a discrete variable are called discrete series. Series that can be described by a continuous variable are called continuous series. The following are two examples of discrete and continuous frequency distributions:

| No. of Children | Frequency |
|---|---|
| 0 | 10 |
| 1 | 40 |
| 2 | 80 |
| 3 | 100 |
| 4 | 250 |
| 5 | 150 |
| 6 | 50 |
| Total | 680 |

| Weight (lbs.) | No. of Persons |
|---|---|
| 100 – 110 | 10 |
| 110 – 120 | 15 |
| 120 – 130 | 40 |
| 130 – 140 | 45 |
| 140 – 150 | 20 |
| 150 – 160 | 4 |
| Total | 134 |

Tabulation of Data
Tabulation. By tabulation we mean a systematic presentation of numerical data in columns and rows in accordance with some salient features or characteristics. Columns are vertical arrangements and rows are horizontal arrangements.

Objects
Tabulation helps in understanding complex numerical data by presenting them in a simple and clear way, so that their similar and dissimilar facts are separated.

Parts of Tabulation
A good statistical table is an art. The following parts must be present in all tables:
1. Table number
2. Title
3. Head note
4. Caption
5. Stubs
6. Body of the table
7. Foot-note
8. Source-note

1. Table number. A table should always be numbered for identification and reference in the future. Each column should also be numbered as shown in the illustration.

2. Title of the table. Each table should be given a suitable title. It must be written at the top of the table. It must describe the contents of the table. It must explain (1)