BUSINESS STATISTICS & MATHEMATICS

CONTENTS
STATISTICS
Introduction
Data Organization
Linear Correlation and Linear Regression
Index Numbers
Set Theory and Probability
Random Variables and Probability Distributions
Hypothesis Testing
Test of Independence

Introduction

OBJECTIVES
Having studied this chapter, you will be able to:
* Present a broad overview of statistics as a subject.
* Bring out applications of statistics and its usefulness in decision-making.

OUTLINES
Meaning of the Word Statistics
Definition of Statistics
Characteristics of Statistics
Limitations of Statistics
Functions or Uses of Statistics
Scope of Statistics
Statistics: A Science or an Art
Chapter Summary
Exercise
Short Questions and their Answers

MEANINGS:
To understand the meanings of the word “statistics”, we look at the historical background of the word. It seems to have been derived from:
the Latin word status, which means a political state, or
the Italian word statistica, which means a political state, or
the word statist, used by Shakespeare and Milton, meaning a statesman, i.e., an expert in affairs of state, or
the German word Statistik, which means the political science of the countries, or
the English word statistics, which means the political science of the countries.
All these words were used before the 19th century, and they all referred to a branch of knowledge dealing with the affairs and arrangements of a state. During the 19th century the word began to be used in a narrower sense, i.e., the description of affairs of state by numerical methods. Another meaning was noticed in the first volume of the Journal of the Royal Statistical Society (1838-39),
where statistics was defined as the collection of facts to illustrate the conditions and prospects of society.
At present the word statistics is used in the following three meanings:
It is used in the plural sense to refer to aggregates of numerical facts (also called data), e.g., the number of students registered in different disciplines in P.U. in 2011, or the number of deaths due to different causes in a particular year.
It is used as the plural of the word “statistic”, where a statistic means a numerical quantity calculated from only a part of the data (called a sample). e.g., if we select a group of 10 students from a class of 60, give them a test and calculate the average of their marks, then this average is called a statistic.
Definition of Statistics: The word statistics is also used in the singular sense, i.e., “the branch of science that deals with the collection, processing, presentation, analysis and interpretation of numerical data, in order to make decisions (draw conclusions) in case of uncertainty”. This last sense of the word may also be regarded as a comprehensive definition of the science of statistics.
CHARACTERISTICS OF STATISTICS:
Statistics (as data) have the following characteristics:
(i) Statistics are aggregates of facts. A single fact may have no importance on its own; however, it plays a role in the aggregate.
(ii) Statistical facts are affected to a great extent by a multiplicity of causes. e.g., the price of sugar may be affected by supply, demand, weather conditions, political stability, etc.
(iii) Statistics are numerically expressed. e.g., when we say that the result of B.Com. (P.U.) in a particular year is not good, it is not a statistical statement. But when we say that the result of B.Com. is 40% whereas it was 48% last year, then it is a statistical statement.
(iv) Statistics are enumerated or estimated according to a reasonable standard of accuracy.
The standard, however, may be determined according to the purpose for which the statistics are collected. e.g., the weight of a chicken may be accurate to the nearest gram, whereas the weight of a cow may be accurate to the nearest kilogram.
(v) Statistics are collected in a systematic manner.
(vi) Statistics are collected with a definite object or purpose.
(vii) Statistical data are capable of comparison. But comparison is possible only for homogeneous groups of data; comparison of heterogeneous groups of data is useless. e.g., the wheat production of Sindh and Punjab is comparable, but rainfall in Punjab and the literacy rate of Sindh cannot be compared.
Another comprehensive definition of statistics, which includes all its characteristics, is: “The aggregates of facts affected to a marked extent by a multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standard of accuracy, collected in a systematic manner for a pre-determined purpose in view and placed in relation to each other.”
LIMITATIONS OF STATISTICS:
Following are the limitations of statistics:
(i) Statistical results are true only on average or in the long run. They cannot be applied to individuals. e.g., if we say that the average life of a Pakistani is 50 years, that does not mean that each and every Pakistani will die at the age of 50 years.
(ii) Statistics deals only with quantitative (numeric) facts and does not deal with qualitative facts. e.g., it deals with heights and weights, but not with morality, friendship, character, etc.
(iii) Statistical results may lead to wrong conclusions if sufficient care is not exercised in the collection, analysis and interpretation of data.
(iv) Only a person who has good knowledge of statistics can apply statistical conclusions efficiently.
(v) Statistics is the science that provides only the analysis of data; it cannot change the nature of the causes affecting the data.
FUNCTIONS OR USES OF STATISTICS:
1.
Statistics simplifies complex data. It presents a large amount of data in a form which is easily understood.
2. Statistics presents numerical facts in a definite form. e.g., the statement that the price of sugar has increased by 150% gives exact information, as compared to the statement that the price of sugar is high.
3. Comparison: Statistics simplifies the comparison of two or more series of data.
4. Statistics helps in the study of a variable in relation to other factors. e.g., to study the increase in production of wheat in relation to area, fertilizers, weather, supply of water, demand for wheat, etc.
5. Forecasting: Statistics helps in forecasting the future behaviour of data on the basis of previous trends.
6. Tests: There are many procedures and tests in statistics which are used in many physical sciences and social sciences to test the validity of laws.
7. Policies: Statistical analysis provides the basis for designing policies in all fields of life. e.g., the transfer of population from rural areas to urban areas may lead to a policy of providing jobs and basic facilities (infrastructure) in rural areas.
BRANCHES OF STATISTICS:
As a science, statistics may be sub-divided into two branches, i.e., descriptive statistics and inferential statistics.
Descriptive Statistics: Descriptive statistics involves the methods of collection, processing, presentation and characterization of data in order to describe its main features. For example, a new medicine is given to a group of patients, the results are collected, and the percentages show that 70% of the patients were cured, 23% showed no effect, while 7% faced side effects. In this case we have used descriptive statistics.
Inferential Statistics: The branch of statistics that deals with drawing conclusions about a body of data while studying only a part of the data.
It involves testing of hypotheses and estimation. In the above example, if a doctor estimates that in future 70% or more of patients will recover with the medicine and decides to recommend the medicine in future, then he is applying inferential statistics.
Scope of Statistics:
Initially statistics was used only in the affairs of the state, but nowadays it is applied in all fields of human life. It is applied in Business, Industry, Agriculture, Commerce, Economics, Physics, Chemistry, Biology, Biochemistry, Psychology, Metrology, Engineering, Administration, Education, Health, Geology and many other fields. Here we discuss how statistics is used in Business and Commerce.
Role of Statistics in Business and Commerce:
Statistics plays a vital role in business. It provides the quantitative basis for making decisions in all matters of business, i.e., what quantity, quality and variety the customer wants. Statistics helps us to plan production according to the taste and requirements of the customer, at a price within the customer's budget. The quality of the product and the performance of machines and workers may be improved by statistical techniques. Statistics also provides methods to test the efficiency of new production methods.
Banks use statistics for a number of purposes. Banks lend the money deposited by people and earn profit. But banks must be well aware of the amounts deposited and loaned and of the demands for withdrawal at different time periods; statistics helps in forecasting all these amounts.
Insurance companies decide the premium rate on the basis of estimated future interest rates and past mortality rates. They also use statistics to find past accident or death rates in order to forecast future rates.
CHAPTER SUMMARY
Word Statistics Means:
* Aggregates of numerical facts.
* Plural of the word statistic, where a statistic is a numeric quantity measured from a sample.
* Branch of science which deals with the collection, presentation, analysis and interpretation of numerical data for inference.
Definition of Statistics:
It is the branch of science that deals with the collection, processing, presentation, analysis and interpretation of data, in order to make decisions while facing uncertainty,
OR
The aggregate of facts affected to a marked extent by a multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standard of accuracy, collected in a systematic manner for a pre-determined purpose in view and placed in relation to each other.
Statistics may be sub-divided into two branches:
(i) Descriptive statistics is the branch of statistics that involves the methods of collection, processing, presentation and characterization of data in order to describe its features.
(ii) Inferential statistics involves the methods of estimating the characteristics of a population and making decisions about the population on the basis of the results obtained from a sample.
EXERCISE-1
Q.1 Define the following:
(i) Statistic
(ii) Statistics
(iii) Descriptive statistics
(iv) Inferential statistics
Q.2 Provide short answers:
(i) What are the different meanings of statistics in history?
(ii) What are the characteristics of statistics?
(iii) Write down the limitations of statistics.
(iv) Write down the functions of statistics.
(v) What is the scope of statistics?
SHORT QUESTIONS AND THEIR ANSWERS
Q.1 The word statistics is derived from which word?
Ans. The word statistics is derived from the Italian word “statista”, the Latin word “status”, the German word “statistik” or the French word “statistique”; all of these mean “a political state”.
Q.2 Define statistics.
Ans. The branch of science that deals with the collection, processing, presentation, analysis and interpretation of numerical data in order to draw inferences while facing uncertainty.
Q.3 What are the different senses in which the word statistics is used?
Ans. Statistics is used in three different senses:
(i) Plural sense.
(ii) Singular sense.
(iii) Plural of the word “statistic”.
Q.4 What is the plural sense of the word statistics?
Ans. Any collection of numerical data, collected in a systematic way for a specific purpose, is described as statistics.
Q.5 What is the singular sense of the word statistics?
Ans. In the singular sense, “statistics” means a science that deals with collection, presentation, processing, analysis and inferences in case of uncertainty.
Q.6 What do you mean by the plural of the word statistic?
Ans. Any numerical information obtained from a sample is called a statistic, e.g., the mean, median or mode calculated from sample data is called a statistic, and the plural of statistic is statistics.
Q.7 What are the different functions of statistics?
Ans.
(i) It helps in the collection of data.
(ii) It helps in the processing of data.
(iii) It is used for the presentation of data.
(iv) It simplifies the data and explains different features of the data.
(v) It is used for the comparison of data.
(vi) It applies tests to hypotheses.
(vii) It helps in estimation for planning.
Q.8 Write down the scope of statistics.
Ans. Statistics is applied in Business, Commerce, Management, Banking and Finance, Population census, agriculture, engineering, medical science and many other fields.
Q.9 Statistics can be divided into how many branches?
Ans. Statistics may be divided into two branches:
(i) Descriptive statistics.
(ii) Inferential statistics.
Q.10 What is descriptive statistics?
Ans. The branch of statistics that describes the features of the collected data is called descriptive statistics. It involves the collection, processing and presentation of data.
Q.11 What do you mean by inferential statistics?
Ans. The branch of statistics that deals with drawing conclusions about a body of data while studying only a part of the data. It involves testing of hypotheses and estimation.
Q.12 What are the limitations of statistics?
Ans. Statistical results are true on average and cannot be applied to individuals. Statistical laws are not exact; they give approximate information. Statistics may be misused by untrained and wrongly motivated persons. Statistics mainly focuses on quantitative data; it does not give enough detail about qualitative data.

Data Organization

OBJECTIVES
Having studied this chapter, you will be able to:
* Describe the data collection process.
* Understand types of data and the basis of their classification.
* Use techniques of organizing data in tabular form in order to enhance data analysis and interpretation.

OUTLINES
Introduction
Steps to Solve a Statistical Problem
Data and Methods of Collection of Data
Data Organization
Classification
Tabulation
Constant and Variable
Qualitative and Quantitative Variables
Discrete and Continuous Variables
Discrete and Continuous Frequency Distributions
Chapter Summary
Exercise
Short Questions and their Answers

INTRODUCTION:
A pharmaceutical manufacturer needs to determine whether a new drug is more effective than those currently in use.
WHAT DOES HE HAVE TO DO?
He has to collect data from hospitals about the results of the drugs in use. Then the new drug is given to a group of patients, and data relating to its results are collected. Data collected are normally in raw form and do not give comprehensive information; hence these data are first arranged and categorized according to requirements and then presented in different forms (i.e., tabular, graphic or descriptive form). Then all these facts are analyzed for different comparisons and for other purposes. This analysis enables him to interpret the different features of the data (drugs) and hence to reach a decision, i.e., whether the new drug is more effective or not.
The above example shows that the solution of a statistical problem involves the following steps:
1.
Collection of data
2. Organization of data
3. Presentation of data
4. Analysis of data
5. Interpretation of data
6. Decision making / prediction
DATA:
Data is a collection of facts and figures obtained from a statistical problem.
WHY DO WE NEED DATA?
Four main reasons could be given for collecting data:
1. To provide the necessary input to a survey.
2. To measure performance in an ongoing service or production process.
3. To assist in formulating alternative courses of action in a decision making process.
4. To satisfy our curiosity.
COLLECTION OF DATA:
In the solution of a statistical problem, data collection is the first and an important step. Statistical decisions depend very much on the collection of appropriate data. Any discrepancy in data collection may lead to wrong decisions. The process of collection must be planned according to the objectives of the statistical research. Depending upon the method of collection, data can be divided into two groups:
1. Primary data
2. Secondary data
1. Primary Data and Primary Source:
First hand, unprocessed information collected by an organization or researcher is called primary data. The data collector directly collects the raw data (primary data), and then it is compiled and organized. An organization or a person who collects the primary data is called a primary source. In Pakistan, the Federal Bureau of Statistics, Census Department, Central Board of Revenue, State Bank of Pakistan and Election Commission of Pakistan are examples of primary sources.
2. Secondary Data and Secondary Source:
Many times, in solving a statistical problem, an organization or a person publishes data already collected by another organization or person. Second hand information used by an organization but collected by some other source is called secondary data. The organization that publishes the secondary data is called a secondary source.
e.g., if the Economic Survey of Pakistan publishes some information collected by the State Bank of Pakistan, then the Economic Survey of Pakistan is the secondary source, whereas the State Bank of Pakistan is the primary source.
METHODS OF COLLECTING PRIMARY DATA:
Following are the methods used for the collection of primary data:
1. Direct Personal Observation: In this method the investigator interviews the persons concerned or observes the facts personally. This method gives accurate results, but is slow and expensive. It is suitable for laboratory experiments or experiments involving a small number of units.
2. Indirect Oral Investigation: Sometimes informants are reluctant to disclose the facts, or they give wrong information. In such situations information is collected on the evidence of persons or organizations supposed to know the required information. In order to avoid the possibility of anyone giving wrong information, more than one piece of evidence is recommended. This method reduces the possibility of wrong information but is slow and expensive. However, it may be applied in extensive enquiries.
3. Through Correspondents: In this method correspondents and agents send the required information on the basis of judgment, instead of exact measurements. e.g., a correspondent of the agriculture department sends the estimated production of wheat from a particular district. It is a fast method and involves low cost. However, it gives only estimates.
4. Through Enumerators: In this method trained enumerators are appointed to collect data. Forms containing questions that cover all the required information (called schedules) are filled in by informants with the assistance of enumerators. The assistance of enumerators helps to ensure that correct information is recorded. This method is considered to be very accurate but very expensive, and only government organizations can afford it.
5. By Mail: In this method a questionnaire is sent by mail.
The informants fill in the questionnaire and return it. This method is not very expensive. The informant can fill it in at his/her own convenience. These days electronic media (T.V, Radio, Telephone, Fax, Internet) have made it a popular method of data collection.
6. Registration: In this method, information is reported to the concerned department when an event occurs. This method is usually adopted by government departments. e.g., births, deaths, and the sale and purchase of vehicles and land are registered by government departments. Usually the expenses of data collection are charged to informants in the form of a fee; hence it is a low cost and sometimes profitable data collection technique.
METHODS FOR COLLECTION OF SECONDARY DATA:
Secondary data can be collected from the following sources:
1. Official Sources: Secondary data may be collected from the offices of international, national, provincial and local organizations and research departments.
(a) International Organizations: e.g., U.N.O, UNICEF, World Bank, Asian Bank, etc.
(b) National Departments: e.g., Federal Bureau of Statistics, Federal Ministries, Departments of Health and Education, etc.
(c) Provincial Departments: e.g., Provincial Bureau of Statistics, Provincial Ministries, Provincial Department of Health and Education, etc.
2. Semi-official Sources: e.g., State Bank of Pakistan, WAPDA, P.L.D.C, Research Institutes, etc.
3. Private Sources: e.g., Associations, chambers of commerce and industry, etc.
4. Publications of Research Organizations: e.g., Institute of Education and Research (IER), Pakistan Institute of Development Economics, Encyclopedias, Journals, Newspapers and Websites, etc.
ORGANIZATION OF DATA:
Once data are collected, they need to be organized in a way that achieves the following objectives:
1. To describe the most significant features of the data at a glance.
2. To categorize the data into different groups that describe similarities and dissimilarities.
3.
To provide the basis for further statistical analysis in order to make decisions and estimate the future trends of the data.
Organization of data involves three main steps:
(i) Editing of data
(ii) Classification of data
(iii) Tabulation of data
(i) Editing of Data:
The process of removing discrepancies and errors that arise during data collection is called editing. The objective of editing is to obtain data that are complete, consistent and accurate.
(ii) Classification of Data:
Collected data are usually in raw form and cannot be comprehended easily. It is, therefore, suggested to classify the data and present them in the form of tables, diagrams and graphs. “Classification of data means to divide it into groups on the basis of similarities and dissimilarities”. According to L.R. Connor, “classification is the process of arranging things in groups or classes according to their resemblances and affinities”. Classification may be compared with the process of arranging books in a library. Books on different subjects are arranged in different sections, and then books on the same subject are further classified according to authors, publishers, topics, time period and so on.
Types of Classification with respect to Characteristics:
Data can be classified by many characteristics; some important ones are:
(i) Spatial or Geographical Classification: If the data are classified on the basis of location or area, then it is called spatial or geographical classification. e.g., income tax collected from different provinces in Pakistan.
(ii) Temporal or Chronological Classification: When data are classified according to time of occurrence, the classification is called temporal or chronological classification. e.g., income tax collected from 1990 to 1995 by C.B.R.
(iii) Qualitative or Attribute Classification: If data are classified on the basis of some quality or attribute such as colour, intelligence, etc., the classification is called attribute or qualitative classification.
(iv) Quantitative Classification: If data are classified on the basis of quantity or magnitude such as weight, height, income, etc., then the classification is called quantitative classification. e.g., the frequency distribution of any data.
Types of Classification with respect to Number of Characteristics:
Following are the types of classification:
(i) One Way Classification: If data are classified by one characteristic, the classification is said to be one way classification. e.g., the population of Pakistan may be classified by religion, i.e., Muslims, Hindus, Sikhs, Christians, etc.
(ii) Two Way Classification: If data are classified by two characteristics at a time, the classification is said to be two way classification. e.g., the population of Pakistan may be classified by religion and sex.
(iii) Three Way Classification: If data are classified by three characteristics, the classification is said to be three way classification. e.g., the population of Pakistan may be classified by religion, qualification and sex.
(iv) Many Way Classification: If data are classified by more than three characteristics, the classification is said to be many way classification or multi-way classification.
Division:
Sometimes we classify qualitative data on the basis of a characteristic and divide it into further sub-classes; this process is called division. Different types of division are:
(i) Two Fold Division (Dichotomy): If we divide a characteristic into two sub-classes, one possessing the characteristic and the other not possessing it, then it is called two-fold division or dichotomy. e.g., if we study the characteristic religion, we may divide it into Muslim and Non-Muslim.
(ii) Three Fold Division (Trichotomy): If we divide a characteristic into three sub-classes, then it is called three-fold division or trichotomy. e.g., the characteristic of religion is sub-divided into Muslim, Sikh, Hindu.
(iii) Manifold Division: If we divide a characteristic into more than three sub-classes, then it is said to be manifold division. e.g., the characteristic of language is sub-divided into Urdu, Punjabi, Sindhi, Balochi and Pashto.
TABULATION:
Once data are classified into different classes, it is usually suitable to present the data in the form of a table, so that they may be comprehended at a glance. An arrangement of data into horizontal rows and vertical columns is called a table. The process of arranging data into rows and columns is called tabulation. Tabulation may be categorized as below:
(i) Simple Tabulation: If data having only one characteristic (i.e., one way classification) are represented by a table, then the tabulation is called simple tabulation.
(ii) Double Tabulation: If data having two characteristics (i.e., two way classification) are represented by a table, then the tabulation is said to be double tabulation.
(iii) Triple Tabulation: If data having three characteristics (i.e., three way classification) are represented by a table, then the tabulation is said to be triple tabulation.
(iv) Complex Tabulation: If data having more than three characteristics (i.e., many way classification) are represented by a table, then it is called complex tabulation.
TABLE:
An arrangement of data into rows and columns is called a table. A table is the simplest way of presenting data. It gives comprehensive information at a glance. It may be used for diagrammatic presentation and statistical analysis. A good table consists of the following parts:
(i) Title: The title is the heading of the table. It describes the contents of the table. It should be brief and comprehensive. It is written in capital letters at the top of the table.
(ii) Prefatory Note: The prefatory note appears after the title and gives further details about the title.
(iii) Column Caption: The headings for the different columns of a table are called column captions.
Column captions should be brief and clear and must be arranged in order of importance.
(iv) Box Head: The part of the table where the column headings are written is called the box head.
(v) Row Caption: The headings for the different rows of a table are called row captions. They should be brief, clear and arranged in order of importance.
(vi) Stub: The part of the table containing the row captions is called the stub.
(vii) Body of the Table: It is the main part of the table. It contains numeric information in cells.
(viii) Footnote: Anything not clear from the table, prefatory note, row captions or column captions is described in this part. It gives additional details about the table.
(ix) Source Note: It is usually written at the bottom of the table and describes the source of the data. It also describes the reliability of the data.
A specimen of a good table is as follows:
TABLE 2.1
Title
Prefatory Note
Box head | Column Captions
Stub (Row Captions) | Body
Foot Note:
Source:
Characteristics of a Good Table:
The main characteristics of a good table are:
(i) It should be simple, brief and comprehensive.
(ii) If the data are too large, they should be divided into several tables instead of one large table.
(iii) The table should suit the size of the paper.
(iv) Totals, averages and percentages should be placed close together.
(v) Bold lines should be used to separate different classes.
(vi) Units used in the table must be mentioned. e.g., weight in kgs and height in inches, etc.
(vii) Large quantities may be approximated to thousands, lakhs, millions, etc., to reduce unnecessary detail.
(viii) The arrangement of data in the table should be alphabetical, geographical, chronological or in order of quantity.
(ix) The table should describe the main features at a single view.
Difference between Classification and Tabulation:
Classification and tabulation may confuse readers, i.e., either they consider them two distinct processes or the very same process.
But, in fact, classification and tabulation go together to present data in different classes according to characteristics. Classification is the process of arranging data into different classes or groups according to common characteristics. It may or may not be in the form of rows and columns, whereas tabulation is the process of arranging data into rows and columns according to common characteristics.
EXAMPLE 1
According to the population census of 1961, the population of Punjab was enumerated to be 25581 thousand, of which 13643 thousand were males and 11938 thousand were females. During the same census, the population of Baluchistan was 1161 thousand, of which 640 thousand were males and 521 thousand were females. Further, in the 1972 census the population of Punjab was enumerated to be 37508 thousand, of which 19942 thousand were males and 17566 thousand were females, while for Baluchistan the 1972 census shows a total population of 2405 thousand, of which 1272 thousand were males and 1133 thousand were females. Make a table of this information showing its different parts.
POPULATION OF PUNJAB AND BALUCHISTAN FOR 1961 AND 1972 CENSUS (Title)
Figures in thousands (Prefatory note)
Census year | Punjab: Male | Punjab: Female | Punjab: Total | Baluchistan: Male | Baluchistan: Female | Baluchistan: Total (Box head)
1961 | 13643 | 11938 | 25581 | 640 | 521 | 1161
1972 | 19942 | 17566 | 37508 | 1272 | 1133 | 2405
(The census years form the stub; the figures form the body.)
All areas including Gawader (Footnote)
Source: Population census reports 1961 and 1972.
Before going further, we introduce some important concepts that are encountered frequently throughout this text.
Constant and Variable:
A quantity which assumes only one value is called a constant. This value remains unchanged while solving the statistical problem. A quantity that changes from individual to individual or over different time intervals is called a variable. In other words, variables are quantities which possess different values while solving the statistical problem, e.g., time, distance, speed, etc. Variables may be classified as qualitative variables and quantitative variables.
Qualitative Variable: A variable that changes in quality only is called a qualitative variable, e.g., colour, flavour, wisdom, etc. Qualitative variables are described in words, not in numbers.
Quantitative Variable: A variable that changes in quantity is called a quantitative variable. Quantitative variables are described in numbers, e.g., weight, height, length, time, score, etc. Quantitative variables are sub-divided into discrete and continuous variables.
Discrete Variable: A quantitative variable that has separate values at specific points along the number line, with gaps between them, is called a discrete variable. e.g., the number of ATM transactions during a day.
In Other Words: A quantitative variable which is countable (finite or infinite) is called a discrete variable, e.g., number of students, number of cars in a country, etc.
Continuous Variable: A quantitative variable that has a connected string of possible values at all points along the number line, with no gaps between them, is called a continuous variable. In other words, a quantitative variable which is measurable but not countable is called a continuous variable, e.g., weight, height, length, etc.
The following figure may explain the types of data:
[Figure: types of data. Data are expressed either in words or in numbers; numbers are obtained either by counting or by measuring.]
FREQUENCY DISTRIBUTION:
A tabular arrangement of data divided into different classes along with their respective frequencies (the number of values falling in each class) is called a frequency distribution.
Discrete Frequency Distribution:
The discrete frequency distribution is preferred whenever the number of different values of the variable is ten or fewer and the data are discrete.
To construct a discrete frequency distribution, we simply list the different values of the variable in ascending order in one column and set up a second column to tally the data. After all the data have been tallied, we add the tallies that correspond to each value and report this figure in a third column labelled "frequency". Thus, the first and third columns constitute the discrete frequency distribution.

Definition: "A discrete frequency distribution is a table consisting of two columns of information: the values of the variable and the frequency with which each value occurs in the data set".

EXAMPLE 2
To monitor the mobility of our society, a private consulting group surveyed 50 individuals and asked them to respond to several questions, one of which was "How many times within the last three years have you changed residences?" The responses to this question were as follows:
5 1 0 1 1 1 5 3 0 1 $1 1 1 3 0 2 4 0 2 4 0 0 0 3 0 0 6 1 2 4 5 0 2 4 0 4 3 0 7 2 0 4 6 6
Identify the variable of interest, classify the data, and organize these data into a discrete frequency distribution.

SOLUTION
The variable of interest, the number of residence changes, is a quantitative variable that produces discrete data. The number of different values of the variable in the sample is 8; therefore, a discrete frequency distribution is appropriate. The values from 0 to 7 are listed in column X, and the 50 values are tallied in the second column. The third column, frequency, is the number of tally marks for each value.

TABLE 2.3

Remarks:
1. For convenience, observations are tallied in sets of five tally marks consisting of 4 vertical lines and one horizontal line, with subsequent marks starting a new set.
2. Usually a discrete frequency distribution is preferred for discrete data, provided that the number of distinct values in column X is not too large.
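The tallying procedure described above can be sketched in a few lines of Python. This is only an illustration: the function name `discrete_frequency_distribution` and the list `responses` are hypothetical, and the sample values are made up for the sketch; they are not the survey data of Example 2.

```python
from collections import Counter


def discrete_frequency_distribution(data):
    """Return (value, frequency) pairs in ascending order of value."""
    counts = Counter(data)  # tallies how often each distinct value occurs
    return sorted(counts.items())


# Hypothetical responses, NOT the survey data of Example 2.
responses = [0, 1, 1, 3, 0, 2, 1, 0, 4, 2, 1, 0]

table = discrete_frequency_distribution(responses)
print("X  Frequency")
for value, frequency in table:
    print(f"{value}  {frequency}")
print("Total", sum(frequency for _, frequency in table))
```

The first column of the printed table plays the role of column X and the second the role of the frequency column; the tally column is replaced by the counting done inside `Counter`. The total should always equal the number of observations, which is a quick check on the tallying.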
Continuous Frequency Distribution:
A table consisting of two columns, i.e., the values of the variable organized into classes and the frequency of the values that occur within each class, is called a continuous frequency distribution. A continuous frequency distribution is preferred for a continuous set of data, or for a set of discrete data with a large number of values of the variable of interest, i.e., variable X. In order to make a continuous frequency distribution, we first need some guidelines for making the classes.

1. Number of Classes:
Choosing classes or intervals poses an immediate problem, i.e., how many classes should we use? If we develop too many, then each class will have relatively few observations in it and the efficiency of grouping will be lost. On the other hand, too few classes bunch the data together too much and may hide certain patterns within the data set. As a guideline, most statisticians suggest 5 to 15 classes, depending on the number of observations in the data set. If we denote the number of classes by C and the number of values in the data set by n, then a suggested relation for calculating the required number of classes is:
$n \geq 2^{C}$
For example, for n = 16:
$16 \geq 2^{C}$
$2^{4} \geq 2^{C}$
$C = 4$
Hence the number of classes suggested in this case is C = 4.

A better rule, suggested by H.A. Sturges, for the number of classes is:
$C = 1 + 3.3 \log n$
If n = 16:
$C = 1 + 3.3 \log 16$
$C = 4.97 \approx 5$

2. Class Width:
After we have chosen C, we must decide how wide to make each class. It is good practice to maintain a constant value for the class width. This assures uniformity and makes it easier to construct the frequency distribution. To satisfy the principle of inclusion, we must ensure that the C classes span the whole data set.
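The two rules for the number of classes can be checked numerically with a short Python sketch. The function names are hypothetical. Two assumptions are made here: the power-of-two relation is applied by taking the largest C that satisfies n ≥ 2^C, and the logarithm in Sturges' rule is taken to base 10, which is the base that reproduces the value 4.97 for n = 16.

```python
import math


def classes_power_of_two(n):
    """Largest C such that n >= 2**C (assumed reading of the relation)."""
    return n.bit_length() - 1  # floor(log2(n)) for a positive integer n


def classes_sturges(n):
    """Sturges' rule C = 1 + 3.3 * log10(n), returned before rounding."""
    return 1 + 3.3 * math.log10(n)


n = 16
print(classes_power_of_two(n))       # 4
print(round(classes_sturges(n), 2))  # 4.97
print(round(classes_sturges(n)))     # 5
```

For n = 16 the two rules suggest 4 and 5 classes respectively. Both are guidelines rather than strict requirements, so the final choice can still be adjusted to suit the data.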
The class width, denoted by W, can be found as follows:
$W = \frac{\text{Max} - \text{Min}}{C}$
where
Max = Maximum value in the data set
Min = Minimum value in the data set
C = Number of classes
In practice, the value of W often has many decimal places, and we should round it to the accuracy of the original data, e.g., if the data are recorded in tenths and W = 1.5273, then it is rounded to W = 1.5.

3. Generating the Classes:
There are several ways to begin generating classes; we recommend the following. Starting with the minimum value in the data set, consecutively add the value of the class width. The resulting sequence of numbers forms the classes. Consider the following example.

EXAMPLE 3
Suppose the minimum value in a set of data is 0.5 and the maximum value is 9.2. Consecutively adding W = 1.5 produces the sequence 0.5, 2.0, 3.5, 5.0, 6.5, 8.0 and 9.5. Since the maximum value 9.2 is accounted for, the classes are as follows:
Classes
0.5 - 2.0
2.0 - 3.5
3.5 - 5.0
5.0 - 6.5
6.5 - 8.0
8.0 - 9.5
The definition of class implies that the first class 0.5 - 2.0 is understood to be 0.5 ...

Introduction
Rules for Drawing Graphs / Charts
Types of Graphs / Charts
(A) Types of Graphs: (i) Histogram (ii) Frequency Curve (iii) Frequency Polygon
Types of Charts: (i) Bar Charts (ii) Simple Bar Chart (iii) Multiple Bar Chart (iv) Component Bar Chart (v) Percentage Bar Chart (vi) Pie Chart
Chapter Summary
Exercise
Short Questions and their Answers

INTRODUCTION:
It is often said, "one picture is worth a thousand words". Indeed, statisticians often employ graphic techniques to describe sets of data more vividly. These days graphs are becoming very popular, as they have an immediate visual impact that a frequency distribution lacks. By using graphs we can convey information quickly, easily and succinctly.
RULES FOR DRAWING GRAPHS / CHARTS:
Before we discuss how to construct a graph / chart, we must consider some important rules for drawing graphs / charts, i.e.,
(i) Think of a clear and comprehensive title.
(ii) The source of data must be given.
(iii) Decide the independent and dependent variables.
(iv) Take the independent variable along the x-axis and the dependent variable along the y-axis.
(v) Select a suitable scale for the variables.
(vi) The vertical scale (y-axis) and the horizontal scale (x-axis) should start at zero.
(vii) A scale break can be used between zero and the first number only if the first value of the data is too large.
(viii) Label the axes. Labels should clearly describe the variables and their units.
(ix) If more than one curve/line is drawn on the same graph, differentiate them by different line styles or colours.
(x) The graph should be good looking. It should not be overcrowded with too many curves.

TYPES OF GRAPHS:
Commonly used graphs are:
(i) Histogram
(ii) Frequency curve
(iii) Frequency polygon
(iv) Cumulative frequency polygon (Ogive)

Histogram:
A frequency distribution is presented by a histogram. A histogram is a set of adjacent vertical rectangles in which the width of each rectangle represents the class width and the height represents the class frequency. To construct the histogram we follow the procedure below:
(i) Take class boundaries along the x-axis.
(ii) Take frequencies along the y-axis.
(iii) Adjacent rectangles along the x-axis are constructed such that their width is equal to the class interval size.
(iv) Heights of the rectangles are proportional to the class frequencies if the classes are of equal width.
(v) If the classes are not of equal width, then the heights of the rectangles are proportional to the adjusted frequencies, where adjusted frequencies are calculated by dividing the frequencies by the corresponding class interval sizes, i.e.,
$\text{Adjusted frequency of a class} = \frac{\text{Frequency of the class}}{\text{Class interval size}}$

Remarks:
1. A histogram can be constructed for either equal or unequal class widths.
2. A histogram is constructed on class boundaries, not on class limits.

EXAMPLE 1
Construct a histogram for the frequency distribution given below:
Values | ... | 25-29 | 30-34 | 35-39 | 40-44
Frequency | ... 8 10 6 ...

SOLUTION
Class boundaries: 19.5 - 24.5, 24.5 - 29.5, 29.5 - 34.5, 34.5 - 39.5, 39.5 - 44.5
[Figure: HISTOGRAM. x-axis: Values (class boundaries 19.5 to 44.5); y-axis: Frequency.]

EXAMPLE 2
Construct the histogram for the following frequency distribution:
Values | 5-10 | 10-15 | 15-25 | 25-45 | 45-55
Frequency | 20 | 30 | 50 | 60 | 40

SOLUTION
As the sizes of the class intervals are not equal, we adjust the frequencies to construct a histogram:
Values | Frequency | Class interval | Adjusted frequency
5-10 | 20 | 5 | 20/5 = 4
10-15 | 30 | 5 | 30/5 = 6
15-25 | 50 | 10 | 50/10 = 5
25-45 | 60 | 20 | 60/20 = 3
45-55 | 40 | 10 | 40/10 = 4
[Figure: HISTOGRAM. x-axis: Values; y-axis: Adjusted frequency.]

Frequency Curve:
Another graph that presents the frequency distribution is the frequency curve. A frequency curve is constructed by two methods.
Method 1: This method involves the following steps:
(i) Obtain the class marks (mid-points of the class limits or class boundaries) by dividing the sum of the limits or boundaries of each class by 2.
(ii) Take class marks along the x-axis and frequencies along the y-axis.
(iii) Plot a dot against each class mark at the height of the corresponding class frequency.
(iv) Join the points by means of a free-hand curve, not by straight lines.
Method 2: This method involves the following steps:
(i) Draw the histogram.
(ii) Mark the mid-point at the top of each rectangle of the histogram.
(iii) Join the mid-points by means of a curve (not by straight lines).

EXAMPLE 3
Draw the frequency curve for the frequency distribution in Example 1.

SOLUTION
[Figure: FREQUENCY CURVE. x-axis: Class Marks; y-axis: Frequency.]

Frequency Polygon:
A frequency polygon is also a graphic presentation of a frequency distribution. It is also constructed by two methods. These methods involve all the steps involved in the construction of the frequency curve.
The exceptions are as follows:
(i) Points are joined by straight lines instead of smooth curves.
(ii) Two extra classes with zero frequencies are added at both ends, i.e., a class before the first class with zero frequency and a class after the last class with zero frequency.
(iii) The graph is extended to the zero frequencies of step (ii), and hence the graph touches the x-axis.

EXAMPLE 4
Draw the frequency polygon by two methods for the frequency distribution in Example 1.

SOLUTION
Method 1:
[Figure: FREQUENCY POLYGON. x-axis: Class Marks (Values); y-axis: Frequency.]
Method 2:
Class boundaries: 19.5 - 24.5, 24.5 - 29.5, 29.5 - 34.5, 34.5 - 39.5, 39.5 - 44.5
[Figure: FREQUENCY POLYGON. x-axis: class boundaries from 14.5 to 49.5; y-axis: Frequency.]

Ogive or Cumulative Frequency Polygon:
In this graph, cumulative frequencies are plotted against the upper or lower class boundaries. The plotted points are joined by straight lines, and the graph so formed is called an ogive. As a cumulative frequency distribution is categorized as either a less than or a more than cumulative frequency distribution, the ogive is also categorized into two types, i.e., the less than ogive and the more than ogive.

EXAMPLE 5
The following data were collected during a traffic survey. Draw the less than ogive and the more than ogive for the following frequency distribution:
Speed of vehicles | 40-50 | 50-60 | 60-70 | 70-80 | 80-90 | 90-100
No. of vehicles |

SOLUTION
Less than cumulative frequency distribution:
Less than 40
Less than 50
Less than 60
Less than 70
Less than 80
Less than 90
Less than 100
[Figure: LESS THAN OGIVE. x-axis: Speed of vehicle (40 to 100); y-axis: No. of vehicles with less than stated speed.]

More than cumulative frequency distribution:
More than 40 | 56
More than 50 | 51
More than 60 | 44
More than 70 | 32
More than 80 | 14
More than 90 | 3
More than 100 | 0
[Figure: MORE THAN OGIVE. x-axis: Speed of vehicle (40 to 100); y-axis: No. of vehicles with more than stated speed.]

TYPES OF CHARTS:
Important charts are:
(a) Bar Charts:
(i) Simple Bar Chart
(ii) Multiple Bar Chart
(iii) Component Bar Chart
(iv) Percentage Bar Chart
(b) Pie Chart

Simple Bar Chart:
A simple bar chart is used to represent only one variable at a time. The variable may be classified on the basis of time, quantity, region or quality. In a simple bar chart, horizontal or vertical bars of equal width but of different lengths are drawn. The length of each bar is proportional to the magnitude of the quantity it represents.

EXAMPLE 6
The following data represent the number of accidental deaths in the United States due to various causes during a recent year:
Cause of Death: Machines, Airplanes, Buses, Caught in objects, Dog bites
Present the data by a simple bar chart.

SOLUTION
[Figure: SIMPLE BAR CHART. Horizontal bars for Dogs, Objects, Buses, Planes and Machines. x-axis: Number of Deaths; y-axis: Causes of Death.]

EXAMPLE 7
Following is the time series that shows the production (in millions of units) of a factory:
Year | 1970 | 1971 | 1972 | 1973 | 1974 | 1975
Production | s3 | 78 | 78 | 87 | 67 | 86
Construct a simple bar chart to represent the above data.
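As a rough illustration of the idea that the length of each bar is proportional to the magnitude it represents, the following Python sketch prints a text-based horizontal bar chart. The function name `simple_bar_chart` and the `width` parameter are hypothetical choices for this sketch, and only the clearly legible production figures for 1971 to 1975 from Example 7 are used. It is not a substitute for a properly drawn chart with a title, labelled axes and a stated source.

```python
def simple_bar_chart(values, width=40):
    """Return text lines with one horizontal bar per category.

    Bar length is proportional to the value; `width` is the length
    in characters of the longest bar (an arbitrary display choice).
    """
    largest = max(values.values())
    lines = []
    for label, value in values.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label} | {bar} {value}")
    return lines


# Production in millions of units, years 1971-1975 of Example 7.
production = {1971: 78, 1972: 78, 1973: 87, 1974: 67, 1975: 86}

for line in simple_bar_chart(production):
    print(line)
```

All bars have the same thickness (one text row), so only their lengths differ, which matches the description of a simple bar chart for a single variable classified on the basis of time.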
