STATISTICS FOR MANAGEMENT
G ARULMOZHI
Tata McGraw-Hill

Published by the Tata McGraw-Hill Education Private Limited, 7 West Patel Nagar, New Delhi 110 008.

Statistics for Management, 2/e

Copyright © 2009, by Vijay Nicole Imprints Private Limited.

No part of this publication may be reproduced or distributed in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise or stored in a database or retrieval system without the prior written permission of the publishers and copyright holders. The program listings (if any) may be entered, stored and executed in a computer system, but they may not be reproduced for publication.

This edition can be exported from India only by the publishers, Tata McGraw-Hill Education Private Limited.

ISBN-13: 978-0-07-015368-4
ISBN-10: 0-07-015368-X

Information contained in this work has been obtained by publishers, from sources believed to be reliable. However, neither publishers nor copyright holders guarantee the accuracy or completeness of any information published herein, and neither publishers nor copyright holders shall be responsible for any errors, omissions, or damages arising out of use of this information. This work is published with the understanding that publishers and copyright holders are supplying information but are not attempting to render engineering or other professional services. If such services are required, the assistance of an appropriate professional should be sought.
Laser Typeset at: Vijay Nicole Imprints Private Limited, Chennai - 600 042
Printed at: Novena Offset Printing Co., Chennai - 600 005

Contents

Preface

Chapter 1 Introduction
  Definition of Statistics
  Meaning of Statistics
  Applications of Statistics
  Limitations of Statistics
  Computer Packages

Chapter 2 Diagrammatic Representation
  Population and Sample
  Tabular and Graphical Methods
  Frequency Distribution
  Number of Classes
  Width of Classes
  Graphical Presentation of Frequency Distribution
  Bar Diagram
  Histogram
  Frequency Polygon
  Frequency Curve
  Ogive
  Pie Chart
  Exercise

Chapter 3 Measures of Central Tendency & Measures of Dispersion
  Measures of Central Tendency
  Arithmetic Mean
  Median
  Mode
  Methods
  Harmonic Mean
  Measures of Dispersion
  Range
  Variance and Standard Deviation
  The Inter Quartile Range or the Quartile Deviation
  Mean Deviation
  Percentile
  Comparison
  Exercise

Chapter 4 Moments, Skewness and Kurtosis
  Introduction
  Moments
  Relation Connecting Central and Raw Moments
  Sheppard's Correction for Grouping
  Skewness
  Measures of Skewness
  Kurtosis
  Measures of Kurtosis
  Quartiles
  Exercise

Chapter 5 Correlation and Regression
  Correlation
  Correlation Analysis
  Definition of Correlation
  Types of Correlation
  Measures of Correlation
  Scatter Diagram
  Karl Pearson's Coefficient of Correlation
  Steps Involved in the Computation of Correlation Coefficient
  Rank Correlation
  Spearman's Rank Correlation Coefficient
  Repeated Ranks
  Correlation Coefficient for Grouped Data
  Regression
  Correlation and Regression
  Definition
  Regression Lines
  Regression Equation of y on x
  Regression Equation of x on y
  Regression Coefficients
  Properties of Regression Coefficients
  Angle Between Regression Lines
  Standard Error of Estimate and Coefficient of Determination
  Exercise

Chapter 6 Probability
  What is a Probability?
  The Probability Scale
  Probability Interpretation
  Probabilities of 'or' Events
  Permutations and Combinations
  Permutation
  Combination
  Theorems of Probability
  Conditional Probability
  Multiplicative Rule and Independent Events
  Multiplication Theorem for Independent Events
  Exercise

Chapter 7 Theoretical Distributions
  Bernoulli Distribution
  The Binomial Distribution
  Binomial A
  Exercise I
  Poisson Distribution
  Constants of Poisson Distribution
  Fitting a Statistical Distribution
  Fitting a Poisson Distribution
  Exercise II
  Importance of Normal Distribution
  Exercise III

Chapter 8 Theory of Sampling
  Population
  Classification of Population
  Basic Properties of Population
  Sample
  Census
  Sampling
  Advantages of Sampling over Census
  Principles of Sampling
  Limitations of Sampling
  Sampling and Non-sampling Errors
  Sampling Methods
  Simple Random Sampling or Unrestricted Random Sampling
  Stratified Random Sampling
  Merits
  Demerits
  Systematic Sampling
  Cluster Sampling (Single Stage Sampling)
  Multistage Sample
  Judgement Sampling [Purposive or Deliberate Sampling]
  Quota Sampling
  Convenience or Chunk Sampling
  Selection of Appropriate Method of Sampling
  Exercise

Chapter 9 Tests of Hypotheses
  Theory of Estimation
  Theory of Test of Hypothesis
  Statistical Hypothesis
  Null Hypothesis
  Alternative Hypothesis
  Types of Error and Level of Significance
  Rejection Region and Critical Value
  Tips
  P-value Approach
  Sampling Distribution
  Standard Error
  Large Sample Tests
  Test of Hypothesis Concerning a Population Parameter
  Test of Hypothesis Concerning Two Populations
  Small Sample Tests
  Students' t-Distribution
  Application of Students' t-Distribution
  Chi-Square Distribution
  F-Distribution
  Applications of Chi-Square ($\chi^2$) Test
  Applications of F-Test
  Solved Problems

Chapter 10 Non-parametric Methods
  The Sign Test
  The Sign Test for a Population Median
  The Sign Test for Paired-sample or Two Related Samples
  The Wilcoxon's Tests
  The Wilcoxon's Signed-Rank Test
  The Wilcoxon's Rank Sum Test
  Mann-Whitney U-Test
  Median Test
  Runs Test
  The Kruskal-Wallis H-Test
  Exercise I
  Exercise II

Chapter 11 Index Numbers
  Classification of Index Numbers
  Method of Construction of Index Number
  Notations
  Price [Quantity] Relatives
  Identity Property
  Circular Property
  Time Reversal Property
  The Construction of Various Indices
  Simple Price Index (Simple Quantity Index, Simple Value Index)
  Unweighted Index Numbers
  Weighted Index Numbers
  Comparison of Different Methods
  Value Index
  Test of Consistency of Index Number Formulae
  Unit Test
  Time Reversal Test
  Factor Reversal Test
  Circular Test
  Consumer Price Index
  Base Shifting
  Chain Base Method
  Uses of Index Numbers
  Limitations of Index Numbers
  Guidelines for The Construction of Index Numbers
  The Purpose of the Index
  The Choice of Base Period
  Choice of Average
  Choice of Commodities
  Choice of Appropriate Weights
  Choice of Index
  Exercise

Chapter 12 Analysis of Time Series
  Definitions
  Utility of Time Series Analysis
  Variations/Components of Time Series
  Secular Trend or Long-term Movement
  Seasonal Variation (S)
  Cyclic Variation (C)
  Irregular Variation (I or R)
  Principles or Models of Time Series
  Preliminary Adjustments
  Calendar Variation
  Population Changes
  Price Changes
  Miscellaneous Changes
  Estimation of Secular Trend
  Free Hand Drawing Method
  Semi-average Method
  Moving Average Method
  Method of Least Square
  Estimation of Seasonal Variation
  Simple Average Method
  Ratio-to-Trend Method
  Ratio-to-Moving Average Method
  Link Relative Method
  Working Method
  Exercise
  Estimation of Cyclical Variation
  Percent of Trend Method
  Relative Cyclical Residual Method
  Estimation of Irregular Variation
  An Illustration Involving All Four Components of a Time Series

Index

Diagrammatic Representation

9 6 18 6 10 4 14 MW 19 6 10 4 if 12 15 3 6 4 0 8 4 19 8 5 6 15 0 3 0 18 16 7 4 8 0 13 5 9 3 0 12 5 oll 19 2 4 i 0 10 1 16 3 5 2 12 13 18 12 1 1 1

If the collected information is kept as it is, no useful information can be drawn.
When the collected information is not summarised or rearranged in a meaningful manner, we refer to it as raw data or ungrouped data (refer to the data given in Table 2.1). It is 'raw' because it is unprocessed by statistical methods. Raw numbers alone do not reveal any underlying pattern from which conclusions can be drawn. Only by organising the data can we gain information to use in future planning. So the data should be rearranged, condensed and presented in a more meaningful form in a frequency table (refer Table 2.2); data in this form are known as grouped data. The grouping of related facts into classes helps us in comparison and further analysis of data.

FREQUENCY DISTRIBUTION

Without sacrificing information, all the numbers given in Table 2.1 are converted to a tabular form (refer Table 2.2), called a frequency table. One way in which we can compress data is to use a frequency table or a frequency distribution. A frequency distribution is a tabular summary showing the number of items in each of several non-overlapping classes. Data organised in a frequency distribution are called grouped data. Let us see how Table 2.2 is constructed from the data given in Table 2.1.

Class Interval

A frequency table is constructed from raw data as follows:

Bar Diagram

Various kinds of diagrams have been developed for presenting statistical data, and the most common form of one-dimensional diagram is the bar diagram. A bar graph is a graphical device for depicting data that have been summarised in a frequency, relative frequency or percent frequency distribution.
It is the simplest of all statistical diagrams. It consists of a series of bars of equal width standing on a common base line at equal intervals, the length of each bar being proportional to the magnitude of the value it represents. A variety of diagrams go under the general heading of bar charts. The feature they have in common is that lines or bars have lengths representing the frequencies. Variations of the simple bar diagram can be used to present more complicated data. The bars should be equally spaced to give the diagram a neat look, and different shades may be used for different bars. The following are the most common types of bar diagrams used in practice:
(i) Simple bar diagram
(ii) Subdivided bar diagram
(iii) Percentage bar diagram
(iv) Multiple bar diagram
(v) Deviation bars

Simple bar diagram

A simple bar diagram is used to represent only one variable. Examples of simple bar diagrams are:
(a) Marks of a student in five subjects or marks of five students in a subject.
(b) Production of a company in 6 years.
(c) Salary of employees of six firms.
(d) Runs scored by a cricketer in 7 matches.
Lines may be used instead of bars when the number of values to be shown is large.

Example 1 The following data represent the runs scored by 5 players in a home series test match:

Players   Runs
Tom       32
Jerry     45
Jack      np
Jill      65
Tick      20

Draw an appropriate diagram for the performance of the players.

Solution
[Figure: multiple bar chart. Legend: Computer Science, Science, Arts, Commerce. Horizontal axis: Colleges.]
Fig. 2.4 A multiple bar diagram showing the number of students admitted in four colleges in different courses

Example 5 The following data correspond to the average water levels of dams A and B during 2001 to 2003.

Year   Dam A   Dam B
2001   R       55
2002   oa      40
2003   2       B

Draw a multiple bar diagram for the above data.

Solution
[Figure: multiple bar chart of the water levels of Dam A and Dam B.]
Fig. 2.5 Water levels of two dams in the past three years

Deviation bars

To represent net results like net profit, net loss, net export or import, deviation charts are used. In this type of data both positive and

Class interval   No. of bugs
0 - 200          10
200 - 400        15
400 - 600        35
600 - 800        25
800 - 1000       15
Total            100

Solution
In this problem all the class intervals are of equal width, so first the histogram is drawn and then the middle points of the tops of the bars are joined to get the frequency polygon.
[Figure: histogram with frequency polygon. Horizontal axis: No. of lines.]
Fig. 2.9 Histogram and frequency polygon showing the no. of bugs and software code

Frequency Curve

A frequency curve is obtained by smoothing the minor irregularities of a frequency polygon in such a way that the total area enclosed represents the total frequency.

Example 9 The following data represent the distribution of salary of 50 U.G. students who got placement through the college placement cell.

Class interval   No. of UG students
0 - 1500         6
1500 - 2000      12
2000 - 2500      18
2500 - 3000      10
3000 - 3500      4

Draw the histogram, frequency polygon and frequency curve for the above data.
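For equal-width classes such as those in the bug-count example with Fig. 2.9, the "middle points" that a frequency polygon joins are simply the pairs (class midpoint, frequency). The sketch below lists them for that data; the function name `polygon_points` is illustrative and not from the text, and the check for equal widths is an added assumption of this sketch.

```python
# Class intervals (lines of code) and frequencies (number of bugs) from the
# equal-width example illustrated in Fig. 2.9.
classes = [(0, 200), (200, 400), (400, 600), (600, 800), (800, 1000)]
frequencies = [10, 15, 35, 25, 15]


def polygon_points(classes, frequencies):
    """Return the (midpoint, frequency) points joined by a frequency polygon."""
    widths = {upper - lower for lower, upper in classes}
    if len(widths) != 1:
        # Assumption of this sketch: unequal widths are out of scope here.
        raise ValueError("this sketch assumes equal class widths")
    return [((lower + upper) / 2, f)
            for (lower, upper), f in zip(classes, frequencies)]


points = polygon_points(classes, frequencies)
for midpoint, frequency in points:
    print(midpoint, frequency)
```

Plotting these points and joining them with straight lines gives the polygon; smoothing that polygon by hand gives the frequency curve described above.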
a circle of any radius to represent all the data. Then we use the relative frequencies to subdivide the circle into sectors, or parts, that correspond to the relative frequency for each class. So instead of representing the values by bars, they can be represented by sectors whose areas are proportional to the values of the variable. This representation is known as a pie diagram. So a pie chart is a circle divided into sectors whose areas represent the frequencies.

Example 12 The table gives the amount spent by a family during a month.

Total income      20,000
Food & clothing   5,000
Fuel and travel   3,000
House rent        3,500
Education         2,000
Miscellaneous     3,000

The remaining amount is kept as savings for the future. Draw a pie chart for the expenditure distribution.

Solution
To draw a pie chart for the above data, let us first draw a circle of any radius. The angle at the centre of the circle is 360° and it corresponds to Rs. 20,000, the total income of the family. The angle of the sector that corresponds to food and clothing is calculated as $(5000 \times 360)/20000 = 90$ degrees. In the same way the angles for fuel and travel, house rent, education, miscellaneous and savings are calculated:

Item              Amount   Degree
Food & clothing   5000     $\frac{5000 \times 360}{20000} = 90$
Fuel and travel   3000     $\frac{3000 \times 360}{20000} = 54$
House rent        3500     $\frac{3500 \times 360}{20000} = 63$
Education         2000     $\frac{2000 \times 360}{20000} = 36$
Miscellaneous     3000     $\frac{3000 \times 360}{20000} = 54$
Savings           3500     $\frac{3500 \times 360}{20000} = 63$
Total             20000    360
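The sector angles in Example 12 all come from the same rule, angle = amount × 360 / total. A minimal sketch of that arithmetic follows; the function name `sector_angles` and the dictionary layout are illustrative choices, not part of the text, and savings are taken as total income minus the listed expenses, as the example states.

```python
# Monthly budget from Example 12 (amounts in Rs.).
total_income = 20_000
expenses = {
    "Food & clothing": 5_000,
    "Fuel and travel": 3_000,
    "House rent": 3_500,
    "Education": 2_000,
    "Miscellaneous": 3_000,
}


def sector_angles(expenses, total):
    """Return the central angle (degrees) of each sector of the pie chart."""
    items = dict(expenses)
    # The unspent remainder of the income is treated as savings.
    items["Savings"] = total - sum(expenses.values())
    return {name: amount * 360 / total for name, amount in items.items()}


angles = sector_angles(expenses, total_income)
for name, angle in angles.items():
    print(f"{name:<16} {angle:5.1f}")
```

Since the amounts add up to the total income, the angles add up to 360, which is a quick check worth doing before drawing the chart.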
During the year 2006, oil consumption was 20 million barrels per day. The following data represent the percentage breakdown of the sources of that consumption.

Source of consumption: Electric utilities; Highway transportation; House, industry and business; Miscellaneous; Total
% Usage: 1s 29 a 33 4 58

a. Construct a bar chart.
b. Construct a pie chart.
c. Which of these charts is preferable and why?

The following are the times taken by the police department, on receipt of a complaint, to find the culprit.

5 0) B 4 no 6 5 12 6 10D & B oD 3 7 © 2 2 BB 2 8 4 Q 123 20 6 14 107 R 8 B 2 A 114 49 2
SAxRsSvrR 19 1 45 36 7 A

a. Form the frequency distribution table.
b. Plot the histogram.
c. Form the cumulative distribution.
d. Plot the ogive.

Draw ogives for the following distribution. How many students are getting marks between 60 and 72?

Marks       No. of Students
50 - 55     6
55 - 60     10
60 - 65     22
65 - 70     30
70 - 75     16
75 - 80     12
80 - 100    15

Measures of Central Tendency & Measurement of Dispersion

1. Ungrouped Data

Direct Method
If we use the letter $X$ to represent an observation, then to distinguish each observation when we are dealing with $n$ observations, we use the numbers $1, 2, 3, \ldots, n$ as subscripts. Thus $X_1$ represents the first observation, $X_2$ the second observation and so on, and $X_n$ represents the $n$th observation.
The sample mean is represented by $\bar{X}$ and written as

$$\bar{X} = \frac{X_1 + X_2 + X_3 + \cdots + X_n}{n}.$$

If $\Sigma$, the capital Greek letter sigma, indicates summation, then the mean can be represented by

$$\bar{X} = \frac{\sum X}{n}.$$

Example 1 Consider the marks scored by 8 students in an examination: 45, 82, 70, 92, 35, 58, 75, 90. Find the average performance of the students.

Solution
The mean or average of these marks is $(45 + 82 + 70 + 92 + 35 + 58 + 75 + 90)/8 = 547/8 = 68.375$.

Example 2 The amounts spent by Mr. Sai's family on clothing during the twelve months of 2006 are 500, 2000, 3500, 400, 800, 100, 2000, 150, 4500, 300, 100, 200. Find the average expenditure of the family on clothing.

Solution
Let the expenditure be denoted by the variable $X$. The values of $X$ are 500, 2000, 3500, 400, 800, 100, 2000, 150, 4500, 300, 100, 200.

$$\bar{X} = \frac{\sum X}{n} = \frac{14550}{12} = 1212.50$$

The average spending on clothing is 1212.50.

Example 3 From the following data of scores of a player in 9 matches, calculate the player's average score.

Matches       1    2    3    4    5    6    7    8    9
Runs scored   85   23   40   89   45   50   5    12   65

Solution
Number of matches = 9. Let the runs scored be denoted by $X$. The values of $X$ are 85, 23, 40, 89, 45, 50, 5, 12, 65.

$n = 9$, $\sum X = 414$

x       f     d = (x - A)/C   fd
0       40    -4              -160
1       50    -3              -150
2       100   -2              -200
3       120   -1              -120
4       75    0               0
5       55    1               55
6       20    2               40
7       25    3               75
8       15    4               60
Total   500                   -630 + 230 = -400

$$\text{Mean} = \bar{X} = A + \left(\frac{\sum fd}{N}\right) C, \quad \text{here } A = 4,\ \sum fd = -400,\ N = 500,\ C = 1$$

Solution
For the calculation of the arithmetic mean we can use either the shortcut method or the direct method. Let us choose $A = 25$ and $C = 5$.

$C$ = width of the class interval = 500 and $\sum fd = -345$

$$\bar{X} = 4750 + \left(\frac{-345}{750}\right) \times 500 = 4520$$

Let us do the same problem by the direct method.

Salary (C.I.)   f     x      xf
3000 - 3500     30    3250   97500
3500 - 4000     150   3750   562500
4000 - 4500     200   4250   850000
4500 - 5000     185   4750   878750
5000 - 5500     125   5250   656250
5500 - 6000     60    5750   345000
Total           750          3390000

$$\bar{X} = \frac{\sum fx}{N} = \frac{3390000}{750} = 4520$$

Note For some problems the direct method is easier, while the statistical constants of other problems can be computed more easily using the shortcut formula. For a few problems both methods involve the same amount of computation. We have to choose one of them depending on the data given.

Weighted Arithmetic mean

While calculating the arithmetic mean, all the observations in a distribution are given equal importance. But in practical problems, some items in a distribution are more important than others. In such cases, by giving weightage according to the importance of the various items, a descriptive measure called the weighted arithmetic mean is computed. This weighted average is representative of the distribution. For example, in a family budget, expenditure on food is more than the amount spent on entertainment. Also, the allotments for fuel and education vary.
In the weighted mean, proper weightage is given to the various items, the weight attached to each item being proportional to the importance of the item in the distribution. If $X_1, X_2, X_3, \ldots, X_n$ are $n$ values of the variable and $W_1, W_2, W_3, \ldots, W_n$ are the corresponding weights, then the weighted arithmetic mean is

$$\text{W.A.M} = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n} = \frac{\sum WX}{\sum W}$$

MEDIAN

After the mean, the most common measure of central tendency is the median. To find the median of a set of numbers, we arrange them in ascending or descending order of magnitude and pick out the one in the

where,
$l$ = lower limit of the median class
$m$ = cumulative frequency preceding the median class
$f$ = frequency of the median class
$C$ = width of the median class

Example 15 The following table shows the distribution of 105 families according to their expenditure per day. Find the median of the expenditure.

Expenditure       0-10   10-20   20-30   30-40   40-50
No. of families   14     25      27      24      15

Solution

Class interval   f     c.f.
0-10             14    14
10-20            25    39
20-30            27    66
30-40            24    90
40-50            15    105

Here $\frac{N}{2} = \frac{105}{2} = 52.5$. The cumulative frequency just greater than 52.5 is 66. The corresponding frequency is 27, and the class 20-30 is the median class.

$\therefore\ l = 20,\ m = 39,\ f = 27,\ C = 10$

$$\text{Median} = l + \left(\frac{\frac{N}{2} - m}{f}\right) \times C = 20 + \left(\frac{52.5 - 39}{27}\right) \times 10 = 20 + (0.5) \times 10 = 25$$

$\therefore$ the median is 25.

Example 16 Draw the ogives for the following data and calculate the median.

Class       10-19   20-29   30-39   40-49   50-59   60-69
Frequency   12      25      36      40      27      10

Compute the median value using the formula. Compare and comment on the results.

Solution
The median is calculated using the formula. To use the formula, the inclusive class intervals (10-19, 20-29, ...) should first be converted into continuous class intervals.

$f_2$ = frequency succeeding the modal class
$C$ = width of the modal class

Example 19 Calculate the mode for the following data:

Monthly rent   No. of families
1500 - 2000    17
2000 - 2500    40
2500 - 3000    75
3000 - 3500    27
3500 - 4000    15
4000 - 4500    20
4500 - 5000    6
Total          200

Solution
Consider the class intervals and the corresponding frequencies:

C.I.           f
1500 - 2000    17
2000 - 2500    40
2500 - 3000    75
3000 - 3500    27
3500 - 4000    15
4000 - 4500    20
4500 - 5000    6
Total          200

The highest frequency here is 75, so 2500 - 3000 is the modal class.

$l$ = lower limit of the modal class = 2500
$f_1$ = frequency of the modal class = 75
$f_0$ = frequency preceding the modal class = 40
$f_2$ = frequency succeeding the modal class = 27
$C$ = width of the modal class = 500

$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times C = 2500 + \left(\frac{75 - 40}{2 \times 75 - 40 - 27}\right) \times 500 = 2500 + \frac{35}{83} \times 500 = 2710.84$$

Example 20 From the following data of monthly income of 250 software professionals, find the median and mode.
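Since Example 20 asks for both the median and the mode of grouped data, the two formulas used in Examples 15 and 19 can be sketched side by side. This is a minimal sketch under stated assumptions: the classes are continuous and of equal width, the function names `grouped_median` and `grouped_mode` are illustrative, and the mode function deliberately refuses cases where the highest frequency is repeated or sits in the first or last class.

```python
def grouped_median(lower_limits, freqs, width):
    """Median = l + ((N/2 - m) / f) * C for continuous, equal-width classes."""
    half = sum(freqs) / 2
    cumulative = 0  # m: cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cumulative + f > half:  # first class whose c.f. exceeds N/2
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must be positive")


def grouped_mode(lower_limits, freqs, width):
    """Mode = l + ((f1 - f0) / (2*f1 - f0 - f2)) * C for one clear modal class."""
    f1 = max(freqs)
    i = freqs.index(f1)
    if freqs.count(f1) > 1 or i == 0 or i == len(freqs) - 1:
        # Assumption of this sketch: the formula needs a unique modal class
        # with a neighbouring class on each side.
        raise ValueError("modal class is not clear; formula not applied")
    f0, f2 = freqs[i - 1], freqs[i + 1]
    return lower_limits[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width


# Example 15: expenditure per day of 105 families.
median = grouped_median([0, 10, 20, 30, 40], [14, 25, 27, 24, 15], 10)

# Example 19: monthly rent of 200 families.
mode = grouped_mode(
    [1500, 2000, 2500, 3000, 3500, 4000, 4500],
    [17, 40, 75, 27, 15, 20, 6],
    500,
)

print(median, round(mode, 2))
```

Both functions take the lower class limits, the frequencies and the common class width, so the same calls can be reused on any table laid out like the ones above.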
Grouping Table

Columns: Marks, Col. 1, Col. 2, Col. 3, Col. 4, Col. 5, Col. 6
Marks: 68, 70, 71, 72, 73, 74, 75, 76, 78, 80
Entries: 8 20 2 46 22 ] 30 20 38 58 ] 84 46 @ 94 ] @ 12 4 ] 16 Je 2

Using the grouping table, the analysis table is written by placing the column numbers 1, 2, 3, 4, 5, 6 on the left hand side and the probable modal values 72, 73, 74, 75 on the right hand side. The analysis table is filled in by considering the maximum values in the different columns. The maximum frequency in the first column is 48 and the corresponding mark is 75, so put '1' in the first row under 75. The maximum value in the second column is 84 and it corresponds to marks 73 and 74, so put '1' in the second row under 73 and 74. Proceeding in this way for the other columns, the analysis table is

Analysis Table
Column No.   Marks

From the analysis table we notice that 74 occurs the maximum number of times (5 times). So mode = 74.

Analysis Table

Col. No.   50-60   60-70   70-80   80-90   90-100
1 1 1 afufaffro 1 1
Total      1       3       6       3       1

It is evident that the group 70-80 occurs the maximum number of times, so it is the modal class.

If the modal class is distinctly clear, we can use the formula method directly. If the modal class is not clear, we use the grouping method to find the modal class. If the modal class is unique, we use the formula method; otherwise the empirical relation is used. In any one of the following cases also, the mode is determined by the method of grouping:
* If the maximum frequency is repeated.
* If the maximum frequency occurs at the very beginning or at the end of the distribution.
* If there are irregularities in the distribution.

GEOMETRIC MEAN

The geometric mean and harmonic mean are not used as often as the other measures of central tendency, but introducing them gives some familiarity with the basic statistical constants. The geometric mean is the $n$th root of the product of $n$ numbers; it has applications in economics, for computing average interest rates, and in population genetics. If $X_1, X_2, X_3, \ldots, X_n$ are $n$ observations, then the geometric mean of these observations is $\sqrt[n]{X_1 X_2 X_3 \cdots X_n}$. For example, if 3, 1, 2, 10 are the numbers, then the G.M is $\sqrt[4]{X_1 X_2 X_3 X_4} = \sqrt[4]{3 \times 1 \times 2 \times 10} = 2.78$.

The geometric mean can also be computed by:
1. Taking the logarithm of each number.
2. Computing the arithmetic mean of the logarithms.
3. Raising the base used to take the logarithms to the arithmetic mean

The sum of squared differences of the scores from the mean is smaller than the sum of squared differences from any other number. In other words, if we had used any number other than the mean as the value from which each score is subtracted, the resulting sum of squared differences would be greater. (You can try it yourself: see if any number other than 625 used in the preceding calculation yields a sum of squared differences less than 102500.) The standard deviation is simply the positive square root of the variance. The variance and standard deviation of a population are denoted by $\sigma^2$ and $\sigma$ respectively.
Variance and standard deviation of a sample are denoted by s?and s, respectively Ungrouped or Raw data Example 26 Compute the variance for the following data of stock price of SUN foot ware company quoted in Chennai Stock Exchange in the last eleven days of November 2006. 85, 86, 87, 88, 89, 90,91, 92, 93, 94, 95. Solution x ¢ 85 25 86 16 87 9 88 4 89 I 90 0 0 91 1 1 92 2 4 93 2 9 94 4 16 95 5 25 {_ Totat 0 110 | 89210 | 110 Direct method {f we use the formula > 190, ya TL 99 n 11 o? = 15x? x? = S210 90? - 8110-8100 =10 Since ¥ is an integer we can also use the formula. of =D Fp =U. 19 Shortcut method To apply shortcut suciod we take A= 90, C=1 aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. Measures of Central Tendency & Measurement of Dispersion 65 Mean= X= 4+(2f4)xc -75+(5)xs=4929 120 ‘ 1 EM) | 2 Vi =o =|}— 2 | oT ‘ariance = 0 he“ ( v Ne 3 (#+-3) jes 120 120 =[6.975~(0.3583)* ]x25=171.165 Example 29 Indian army recruited 140 people with the following height distribution: 168 - 169 4 169 - 170 5 Find the mean height and S.D height of recruits. Solution : frequency ‘| X-A | Height x d=—— fd iP cnt c | “ | * 168 - 169 4 168.5 169 - 170 5 169.5 25 125 Total 140 L 0 688 16 64 aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. aa You have either reached a page that is unavailable for viewing or reached your viewing limit for this book. Measures of Central Tendency & Measurement of Dispersion 69 where, for raw data Q, =Ist Quartile =size of (“Jo item and Q, = 3rd Quartile =size of cv +I)th item. 
For grouped data,
$Q_1 = l + \dfrac{\frac{N}{4} - m}{f} \times c$ and $Q_3 = l + \dfrac{\frac{3N}{4} - m}{f} \times c$

Example 31
The following data corresponds to the marks obtained by 7 students in an examination: 20, 28, 40, 12, 30, 15, 50. Find the quartile deviation and its coefficient from the above data.

Solution
Arrange the items in increasing order: 12, 15, 20, 28, 30, 40, 50

$Q_1$ = Size of the $\left(\dfrac{N+1}{4}\right)$th item = Size of the $\left(\dfrac{8}{4}\right)$th item = 2nd item = 15

$Q_3$ = Size of the $\dfrac{3(N+1)}{4}$th item = Size of the $\left(\dfrac{3 \times 8}{4}\right)$th item = 6th item = 40

Q.D $= \dfrac{Q_3 - Q_1}{2} = \dfrac{40 - 15}{2} = 12.5$

Coeff. of Q.D $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1} = \dfrac{40 - 15}{40 + 15} = \dfrac{25}{55} = 0.455$

[...]

Solution

$x$ | $f$ | c.f. | $fd$ | $fd^2$ | $|X - \text{Mode}|$ | $f|X - \text{Mode}|$
0 | 5 | 5 | -20 | 80 | 4 | 20
1 | 7 | 12 | -21 | 63 | 3 | 21
2 | 10 | 22 | -20 | 40 | 2 | 20
3 | 17 | 39 | -17 | 17 | 1 | 17
4 | 22 | 61 | 0 | 0 | 0 | 0
5 | 15 | 76 | 15 | 15 | 1 | 15
6 | 10 | 86 | 20 | 40 | 2 | 20
7 | 8 | 94 | 24 | 72 | 3 | 24
8 | 6 | 100 | 24 | 96 | 4 | 24
Total | 100 | | -78 + 83 = 5 | 423 | | 161

The highest frequency is 22 and the corresponding X is 4. So mode = 4.

To find the median, let us get $\dfrac{N}{2} = \dfrac{100}{2} = 50$. In the cumulative frequency column, the cumulative frequency just greater than this is 61, and the corresponding value of X, namely 4, is the median.

Mean $= \bar{X} = A + \left(\dfrac{\sum fd}{N}\right) \times c = 4 + \left(\dfrac{5}{100}\right) \times 1 = 4.05$

$\sigma = \sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2} \times c = \sqrt{\dfrac{423}{100} - \left(\dfrac{5}{100}\right)^2} = \sqrt{4.2275} = 2.056$

Mean deviation about mode $= \dfrac{1}{N}\sum f|X - \text{Mode}| = \dfrac{161}{100} = 1.61$

Continuous data

Example 34
Find mean deviation about mean and mean deviation about median for the following data:

Class interval | Frequency

[...]
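The figures in the discrete-data solution above (the table with values 0 to 8 and total frequency 100) can be re-derived directly from the values and frequencies. The following Python sketch is offered only as a check under stated assumptions: the variable names are illustrative, the mean and standard deviation are computed straight from the data rather than through the step-deviation shortcut used in the text, and the median is taken as the first value whose cumulative frequency exceeds N/2, as the solution describes.

```python
from math import sqrt

# Values and frequencies from the solution table above.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
f = [5, 7, 10, 17, 22, 15, 10, 8, 6]

n = sum(f)

# Mode: the value with the highest frequency.
mode = x[f.index(max(f))]

# Median: first value whose cumulative frequency is greater than N/2.
median = None
cumulative = 0
for xi, fi in zip(x, f):
    cumulative += fi
    if cumulative > n / 2:
        median = xi
        break

# Mean and standard deviation computed directly from the data.
mean = sum(xi * fi for xi, fi in zip(x, f)) / n
variance = sum(fi * (xi - mean) ** 2 for xi, fi in zip(x, f)) / n
sd = sqrt(variance)

# Mean deviation about the mode.
md_mode = sum(fi * abs(xi - mode) for xi, fi in zip(x, f)) / n

print(n, mode, median, round(mean, 2), round(sd, 3), round(md_mode, 2))
```

The direct computation agrees with the shortcut results in the text: mean 4.05, standard deviation about 2.056 and mean deviation about the mode 1.61.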
[...]

Comparing the average downtime of computer A and computer B, we conclude that the mean downtime of B is more, but the C.V of A is more than that of B.

Combined mean and S.D
If $n_1$ and $n_2$ are the number of items in two groups, $\bar{X}_1$ and $\bar{X}_2$ are their arithmetic means and $\sigma_1^2, \sigma_2^2$ are the variances of the two groups, then the combined mean and variance, denoted by $\bar{X}$ and $\sigma^2$, are calculated by the formulas

$\bar{X} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$

$\sigma^2 = \dfrac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2)}{n_1 + n_2}$

where $d_1$ and $d_2$ are the deviations of the two group means from the combined mean, $d_1 = \bar{X}_1 - \bar{X}$ and $d_2 = \bar{X}_2 - \bar{X}$.

The above formula can be extended to three groups. If $n_1, n_2, n_3$ are the number of observations, $\bar{X}_1, \bar{X}_2, \bar{X}_3$ are their means and $\sigma_1^2, \sigma_2^2$ and $\sigma_3^2$ are their variances, then

$\bar{X} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3}$

$\sigma^2 = \dfrac{n_1(\sigma_1^2 + d_1^2) + n_2(\sigma_2^2 + d_2^2) + n_3(\sigma_3^2 + d_3^2)}{n_1 + n_2 + n_3}$

where $d_i = \bar{X}_i - \bar{X}$.

Example 36
The following data pertain to information on police recruits who were examined by three medical officers. Find the mean weight and the S.D of the entire data grouped together.

Medical examiner | No. examined | Mean wt. (lbs) | S.D (lbs)
X | 50 | 113 | 6
Y | 60 | 120 | 7
Z | 90 | 115 | 8

Solution
$n_1 = 50$, $n_2 = 60$, $n_3 = 90$
$\bar{X}_1 = 113$, $\bar{X}_2 = 120$, $\bar{X}_3 = 115$, $\sigma_1 = 6$, $\sigma_2 = 7$, $\sigma_3 = 8$
When the combined mean and variance of three groups are required we have

$\bar{X} = \dfrac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3}$

[...]

$\dfrac{X_1 + X_2 + X_3 + X_4 + X_5}{5} = 5.4$

i.e., $2 + 8 + 9 + X_4 + X_5 = 27$

$X_4 + X_5 = 27 - 19 = 8$ (1)

Also, $\sigma^2 = \dfrac{1}{n}\sum x^2 - \bar{x}^2$

$2.737^2 = \dfrac{1}{5}\left(2^2 + 8^2 + 9^2 + X_4^2 + X_5^2\right) - 5.4^2$

$5 \times (2.737^2 + 5.4^2) = 4 + 64 + 81 + X_4^2 + X_5^2$

$X_4^2 + X_5^2 = 183 - 4 - 64 - 81 = 34$ (2)

But from (1), $X_4 = 8 - X_5$. Substituting this in (2),

$(8 - X_5)^2 + X_5^2 = 34$

$2X_5^2 - 16X_5 + 30 = 0$

Cancelling the common factor 2 in all the terms,

$X_5^2 - 8X_5 + 15 = 0$ (3)

Solving the equation, we get $X_5 = 3, 5$. If $X_5 = 3$, then $X_4 = 5$. Or if we take $X_5 = 5$, then $X_4 = 3$. Therefore the two observations are 5 and 3.

Example 41
Find the missing frequencies for the following series if the total frequency is 160 and the median is 35.83.

Income | No. of persons
20 - 25 | 14
25 - 30 | 28
30 - 35 | ?
35 - 40 | 30
40 - 45 | 20
45 - 50 | ?
50 - 55 | 13
55 - 60 | 7

Solution
Let the frequency of class 30-35 be x. Then that of 45-50 is
160 - (14 + 28 + x + 30 + 20 + 13 + 7) = 160 - (112 + x) = 48 - x
Therefore the frequency table is

Income | No. of persons (f) | c.f.
20 - 25 | 14 | 14
25 - 30 | 28 | 42
30 - 35 | x | 42 + x
35 - 40 | 30 | 72 + x
40 - 45 | 20 | 92 + x
45 - 50 | 48 - x | 140
50 - 55 | 13 | 153

[...]

2. In a small town, a survey was conducted in respect of profit made by textile shops. The following results were obtained.

Profit or Loss in 1000 Rs. | No. of shops
[...] | 4
30 - 40 | 24
40 - 50 | 18
50 - 60 | 10

a. Calculate the average profit made by a retail shop.
b. Total profit by all shops.
c. The coefficient of variation of earnings.

3. From the following table showing the wage distribution in a factory, determine
a. The mean wage.
b. The median wage.
c. The modal wage.
d. The wage limits for the middle 50% of the wage earners.
e. The percentage of workers who earned between Rs. 75 and 125.
f. The percentage who earned more than Rs. 150.
g. The percentage who earned less than Rs. 100 per week.

Wages (Rs.) | No. of employees
20 - 40 | 8
40 - 60 | 12
60 - 80 | 20
80 - 100 | 30
100 - 120 | 40
120 - 140 | 35
140 - 160 | 18
160 - 180 | 7
180 - 200 | 5

[...]

20. The mean marks of 100 students were found to be 40. Later on it was discovered that a score of 53 was misread as 83. Find the correct mean corresponding to the correct score.
Ans: $\bar{x} = 39.7$

21. Calculate median and mode for the following distribution.

Production per day in Tons | 21 - 22 | 23 - 24 | 25 - 26 | 27 - 28 | 29 - 30
No. of days | 7 | 13 | 22 | 10 | 8

22. Define mean, median and mode. Given mean = 70.2 and mode = 70.5, find the median using the empirical relationship between them.

23. A random sample of 50 customers in a bank is considered and the waiting times of these customers are as follows:
3, 2, 7, 14, 6, 9, 3, 4, 11, 15, 10, 12, 20, 16, 18, 16, 17, 14, 15, 15, 7, 6, 9, 22, 18, 19, 22, 24, 18, 11, 12, 17, 19, 3, 8, 7, 21, 17, 18, 16, 14, 12, 16, 18, 14, 7, 16, 20, 22, 10.
Form the frequency table for the customers' waiting time and compute any three measures of central tendency.

24. The times taken to receive baggage after landing in an airport on 8 occasions for Mr. Govind are as follows: 15, 18, 17, 10, 20, 25, 14, 10. Find the mean, median and mode for the above data.

25. Given the following set of data from a sample of size 8:
8, -6, -7, 9, -4, 4, 14, 10
Compute the mean, median and mode.

26. A manufacturer of flash light batteries took a sample of 15 batteries from a week's production and used them continuously until they were drained. The number of hours they were used until failure were: 320, 420, 620, 435, 265, 430, 260, 950, 850, 600, 400, 390, 340, 250, 370.
a. Compute the mean, median, mode.
b. In what way can these measures be useful to the manufacturer?

27. Find Q.D for the scores of 15 golfers.
61, 75, 49, 78, 82, 22, 80, 45, 90, 46, 53, 45, 63, 62, 60.

28. Age distribution of 100 insurance policy holders is as follows:

Age (on nearest birthday) | No. of Policy holders
[...] - 19.5 | 9
20 - 25.5 | 16
[...] | 12
[...] | 26
[...] | 14
[...] | 12
[...] | 6
[...] | 3

Calculate the coefficient of quartile deviation and the mean deviation about mean.

[...]

Moments, Skewness and Kurtosis

INTRODUCTION

The measures of central tendency and dispersion are inadequate to describe a distribution completely. The measures of central tendency give an idea about the concentration of the values in the central part of the distribution. The measures of dispersion give the degree of scatter or variation of the variables about a central value; they are a yardstick to determine the extent of variation or spread of items about the central value. Two distributions may have the same mean and S.D, but may still differ in the shape of the distribution. So, to measure the principal characteristics of a distribution, further description is necessary, and it is provided by measures of skewness and kurtosis. Moments are one of the tools used to define and describe these measures. So first let us define moments.

MOMENTS

The turning effect or rotating effect of a force is generally referred to as a moment. In statistics, the averages of various powers of the deviations taken from the mean of a distribution are referred to as moments. There are two types of moments, namely raw moments and central moments.
If the deviations are taken from any arbitrary constant, say A, then it is known as a raw moment, and if the deviations are taken from the arithmetic mean, it is known as a central moment. First let us define raw and central moments and give the relation between them.

The $r^{th}$ order raw moment about any arbitrary constant A (positive or negative), denoted by $\mu'_r(A)$, is defined as

$\mu'_r(A) = \mu'_r = \dfrac{1}{n}\sum (X - A)^r$ (1)

and the $r^{th}$ order central moment, denoted by $\mu_r$, is defined as

$\mu_r = \dfrac{1}{n}\sum (X - \bar{X})^r$ (2)
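Definitions (1) and (2) translate almost directly into code. The Python sketch below is an illustration only: the function names `raw_moment` and `central_moment` are assumptions, the divisor is taken to be the number of observations for ungrouped data, and the data are the eleven stock prices borrowed from Example 26 of the previous chapter.

```python
def raw_moment(values, r, a=0.0):
    """r-th order raw moment about the arbitrary constant `a`, as in (1)."""
    return sum((v - a) ** r for v in values) / len(values)


def central_moment(values, r):
    """r-th order central moment, i.e. the raw moment about the mean, as in (2)."""
    mean = sum(values) / len(values)
    return raw_moment(values, r, a=mean)


# Stock prices borrowed from Example 26 (mean 90, variance 10).
prices = [85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95]

print("first raw moment about A = 88:", raw_moment(prices, 1, a=88))
print("second raw moment about A = 88:", raw_moment(prices, 2, a=88))
print("first central moment:", central_moment(prices, 1))
print("second central moment:", central_moment(prices, 2))
```

For these data the first central moment is zero and the second central moment equals the variance found in Example 26, while the raw moments change with the choice of A.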
