A CONCISE COURSE IN ADVANCED LEVEL STATISTICS
With Worked Examples
Fourth Edition
J CRAWSHAW and J CHAMBERS

Text © J Crawshaw and J Chambers 1984, 1990, 1994, 2001
Original illustrations © Nelson Thornes Ltd 1994, 2001
Text © ICT Statistics Supplement, Douglas Butler, 2001

The rights of J Crawshaw and J Chambers to be identified as authors of this work have been asserted by them in accordance with the Copyright, Design and Patents Act 1988.

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher or under licence from the Copyright Licensing Agency Limited of 90 Tottenham Court Road, London W1T 4LP.

Any person who commits any unauthorised act in relation to this publication may be liable to criminal prosecution and civil claims for damages.

First published in 1984 by Stanley Thornes (Publishers) Ltd
Second Edition 1990
Third Edition 1994
Fourth Edition 2001

Reprinted in 2002 by:
Nelson Thornes Ltd
Delta Place
27 Bath Road
CHELTENHAM GL53 7TH
United Kingdom

A catalogue record of this book is available from the British Library

ISBN 0 7487 8475-X

Page make-up by Mathematical Composition Setters Ltd
Printed and bound by Graficas Estella

Contents

Preface

1 Representation and summary of data
  Discrete data
  Continuous data
  Stem and leaf diagrams (stemplots)
  Ways of grouping data
  Histograms
  Frequency polygons
  Frequency curves
  Circular diagrams or pie charts
  The mean
  Variability of data
  The standard deviation, s, and the variance, s²
  Combining sets of data
  Scaling sets of data
  Using a method of coding to find the mean and standard deviation
  Cumulative frequency
  Cumulative percentage frequency diagrams
  Median, quartiles and percentiles
  Skewness
  The normal distribution
  Box and whisker diagrams (box plots)
  Summary

2 Regression and correlation
  Scatter diagrams
  Regression function
  Linear correlation and regression lines
  The product-moment correlation coefficient, r
  Spearman's coefficient of rank correlation, rₛ
  Summary

3 Probability
  Experimental probability
  Probability when outcomes are equally likely
  Subjective probabilities
  Probability notation and probability laws
  Illustrating two or more events using Venn diagrams
  Probability rule for combined events
  Exclusive (or mutually exclusive) events
  Exhaustive events
  Conditional probability
  Independent events
  Probability trees
  Bayes' Theorem
  Some useful methods
  Arrangements
  Permutations of r objects from n objects
  Combinations of r objects from n objects
  Summary

4 Probability distributions I - discrete variables
  Probability distributions
  Expectation of X, E(X)
  Expectation of any function of X, E(g(X))
  Variance, Var(X) or V(X)
  The cumulative distribution function, F(x)
  Two independent random variables
  Distribution of X₁ + X₂ + ... + Xₙ
  Comparing the distributions of X₁ + X₂ and 2X
  Summary

5 Special discrete probability distributions
  The uniform distribution
  The geometric distribution
  Expectation and variance of the geometric distribution
  The binomial distribution
  Expectation and variance of the binomial distribution
  The Poisson distribution
  Using the Poisson distribution as an approximation to the binomial distribution
  The sum of independent Poisson variables
  Summary

6 Probability distributions II - continuous variables
  Continuous random variables
  Probability density function (p.d.f.)
  Expectation of X, E(X)
  Expectation of any function of X
  Variance of X, Var(X)
  The mode
  Cumulative distribution function, F(x)
  Obtaining the p.d.f., f(x), from the cumulative distribution function
  The continuous uniform (or rectangular) distribution
  Expectation and variance of the uniform distribution
  The cumulative distribution function, F(x), for a uniform distribution
  Summary

7 The normal distribution
  Finding probabilities
  The standard normal variable, Z
  Using standard normal tables
  Using standard normal tables for any normal variable, X
  Using the standard normal tables in reverse to find z when Φ(z) is known
  Using the tables in reverse for any normal variable, X
  Value of μ or σ or both
  The normal approximation to the binomial distribution
  Continuity corrections
  Deciding when to use a normal approximation and when to use a Poisson approximation for a binomial distribution
  The normal approximation to the Poisson distribution
  Summary

8 Linear combinations of normal variables
  The sum of independent normal variables
  The difference of independent normal variables
  Multiples of independent normal variables
  Summary

9 Sampling and estimation
  Sampling
  Surveys
  Sampling methods
  Simulating random samples from given distributions
  Sample statistics
  The distribution of the sample mean
  Central limit theorem
  The distribution of the sample proportion, p
  Unbiased estimates of population parameters
  Point estimates
  Interval estimates
  The t-distribution
  Confidence intervals for the population proportion, p
  Summary

10 Hypothesis tests: discrete distributions
  Hypothesis test for a binomial proportion, p (small sample size)
  Procedure for carrying out a hypothesis test
  One-tailed and two-tailed tests
  Summary of stages of a hypothesis (significance) test
  Type I and Type II errors
  Significance test for a Poisson mean λ
  Summary of stages of a significance test
  Summary of Type I and Type II errors

11 Hypothesis testing (z-tests and t-tests)
  Hypothesis testing
  One-tailed and two-tailed tests
  Critical z-values
  Summary of critical values and rejection criteria
  Stages in the hypothesis test
  Hypothesis test 1: testing μ (the mean of a population)
  Type I and Type II errors
  Hypothesis test 2: testing a binomial proportion p when n is large
  Hypothesis test 3: testing μ₁ - μ₂, the difference between means of two normal populations
  Summary

12 The χ² significance test
  The χ² significance test
  Performing a χ² goodness-of-fit test
  Summary of the procedure for performing a χ² goodness-of-fit test
  Test 1 - goodness-of-fit test for a uniform distribution
  Test 2 - goodness-of-fit test for a distribution in a given ratio
  Test 3 - goodness-of-fit test for a binomial distribution
  Test 4 - goodness-of-fit test for a Poisson distribution
  Test 5 - goodness-of-fit test for a normal distribution
  Summary of the number of degrees of freedom for a goodness-of-fit test
  The χ² significance test for independence
  Summary
13 Significance tests for correlation coefficients
  Significance tests for correlation coefficients
  Test for the product-moment correlation coefficient, r
  Spearman's coefficient of rank correlation, rₛ
  Summary

ICT statistics supplement

Appendix
  Cumulative binomial probabilities
  Cumulative Poisson probabilities
  The standard normal distribution function
  Critical values for the normal distribution
  Critical values for the t-distribution
  Critical values for the χ² distribution
  Critical values for correlation coefficients
  Random numbers

Answers

Preface

Introduction

This fully revised and updated edition of A Concise Course in Advanced Level Statistics is a comprehensive text for use primarily by students and teachers of Advanced Level Mathematics, both at AS and A2 level. It also provides useful support for those studying statistics as part of science, social science and humanities courses.

Features

• Points of theory are explained concisely and illustrated clearly by worked examples, many taken from Advanced Level papers.
• Carefully graded exercises help you to consolidate ideas and gain experience in applying theory to different situations.
• Frequent hints pinpoint common misunderstandings and reinforce ideas.
• Key concepts and formulae are highlighted in colour to increase clarity. Frequent summaries provide a quick reference.
• Extensive miscellaneous exercises and end-of-chapter tests provide practice in tackling examination questions, providing essential examination preparation.
• Answers to all exercises are provided.
• An ICT supplement explores the use of ICT in the study of statistics.

Specifications

The text covers the main theory required in the specifications of all the examination boards for the statistics sections of AS and A2 Mathematics.
Examination Questions

We are grateful to the following Awarding Bodies for permission to reproduce questions from their past examinations:

• Assessment and Qualifications Alliance (AQA), including Northern Examinations and Assessment Board (NEAB/JMB) and Associated Examining Board (AEB)
• The Edexcel Foundation, including University of London Examinations and Assessment Council (L)
• Mathematics in Education and Industry (MEI)
• Oxford, Cambridge and RSA (OCR), including University of Cambridge Local Examinations Syndicate (C), Oxford & Cambridge Schools Examination Board (O & C) and Oxford Delegacy of Local Examinations (O)
• Welsh Joint Education Committee (WJEC)

All answers and worked solutions provided for examination questions are the responsibility of the authors.

We hope that you will enjoy using this text and that it will enhance your understanding of statistics and give you confidence to succeed.

J Crawshaw & J Chambers 2001

1 Representation and summary of data

In this chapter you will learn about
• discrete and continuous data
• stem and leaf diagrams (stemplots)
• histograms, frequency polygons and the shape of a distribution
• pie charts
• means and weighted means
• standard deviation and variance
• cumulative frequency
• medians, quartiles and inter-percentile ranges
• skewness, including Pearson's coefficient and quartile coefficient
• the shape of the normal distribution
• box-and-whisker diagrams (boxplots) and outliers

DISCRETE DATA

In a survey of 1 m quadrats in a field, the number of snails in each of 30 quadrats was recorded as follows:

0 2 3 1 4 2 2 3 2 0 1 1 1 3 5 0 2 0

This is an example of discrete raw data.

Discrete data can take only exact values, for example the number of cars passing a checkpoint in 30 minutes, the shoe sizes of children in a class, or the number of tomatoes on each plant in a greenhouse.

The data are known as raw because they have not been ordered in any way.
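Raw discrete data of this kind can be summarised by counting how often each value occurs. The following is a minimal Python sketch of that tally; the list `counts` is made-up illustrative data (it is not the survey data), and the helper names `frequency_distribution` and `modes` are hypothetical.

```python
from collections import Counter

# Hypothetical counts, for illustration only (not the survey data).
counts = [0, 2, 3, 1, 4, 2, 2, 3, 2, 0, 1, 1]


def frequency_distribution(values):
    """Return (value, frequency) pairs in increasing order of value."""
    return sorted(Counter(values).items())


def modes(values):
    """Return every value that occurs with the highest frequency."""
    tally = Counter(values)
    highest = max(tally.values())
    return sorted(v for v, f in tally.items() if f == highest)


if __name__ == "__main__":
    for value, frequency in frequency_distribution(counts):
        print(value, frequency)
    print("mode(s):", modes(counts))
```

Returning a list from `modes` is a design choice: a data set can have more than one most frequent value, and a single return value would hide that.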
Frequency distribution for discrete data

To illustrate the data more concisely, count the number of times each value occurs and summarise these in a table, known as a frequency distribution.

    Number of snails   0   1            2            3            4   5
    Frequency          3   [illegible]  [illegible]  [illegible]  2   1    Total 30

The frequency distribution can be represented diagrammatically by a vertical line graph or a bar chart. The height of the line or bar represents the frequency.

[Figures: vertical line graph to show number of snails; bar chart to show number of snails. Horizontal axis: number of snails (0 to 5); vertical axis: frequency.]

Notice that
• in the vertical line graph the distinct lines reinforce the discrete nature of the variable,
• in the bar chart the bars are all the same width and they are labelled in the middle of the bar on the horizontal axis.

The mode

The mode is the value that occurs most often. The mode is the most popular value, deriving from the French 'à la mode' meaning fashionable.

It is easy to see from the diagrams above that the mode is 2 snails per quadrat.

CONTINUOUS DATA

The following data were obtained in a survey of the heights of 20 children in a sports club. Each height was measured to the nearest centimetre.

133 136 120 138 133 131 127 141 127 143 130 131 125 144 128 134 133 129

This is an example of continuous raw data.

Continuous data cannot take exact values but can be given only within a specified range or measured to a specified degree of accuracy.

For example, the measurement 144 cm (given to the nearest cm) could have arisen from any value in the interval 143.5 cm ≤ height < 144.5 cm.

(e) 20 boys and 20 girls took part in a reaction-timing experiment. Their results were measured to the nearest hundredth of a second.

Girls: 0.22, 0.21, 0.18, 0.18, 0.16, 0.19

State the value ringed and the width of the interval that it is in when the diagram illustrates
(a)
the times taken for a journey, where 6|8 represents 6.8 hours,
(b) the masses, in grams to three decimal places, of components, where 6|8 represents 0.068 g.

WAYS OF GROUPING DATA

The following frequency distributions show some of the ways that data can be grouped. The information is more concise than the raw data, but the disadvantage is that the original information has been lost.

(i) Frequency distribution to show the lengths, to the nearest millimetre, of 30 rods

    Length (mm)   27-31   32-36   37-46   47-51
    Frequency     4       11      12      3

The interval 27-31 means 26.5 mm ≤ length < 31.5 mm.
The class boundaries are 26.5, 31.5, 36.5, 46.5, 51.5.
The class widths are 5, 5, 10, 5.

(ii) Frequency distribution to show the marks in a test of 100 students

    Mark        30-39   40-49   50-59   60-69   70-79   80-99
    Frequency   10      14      26      20      18      12

This distribution can be interpreted in two ways.

(a) As discrete data, the interval 30-39 represents 30 ≤ mark < 40.
The class boundaries are 30, 40, 50, 60, 70, 80, 100.
The class widths are 10, 10, 10, 10, 10, 20.

(b) As continuous data, assuming marks are to the nearest integer, 30-39 would represent 29.5 ≤ mark < 39.5.
The class boundaries are 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 99.5.
The class widths are 10, 10, 10, 10, 10, 20.

(iii) Frequency distribution to show the lengths of 50 telephone calls

    Length of call (min)   ...
    Frequency              ...

The interval '3-' means 3 minutes ≤ time < 6 minutes, so any time including 3 minutes and up to (but not including) 6 minutes comes into this interval.
The class boundaries are 0, 3, 6, ..., 12, 18.
The class widths are ...

(iv) Frequency distribution to show the masses of 40 packages brought to a particular counter at a post office

    Mass (g)    -100   -250   -500   -800
    Frequency   8      10     16     6

The interval '-250' means 100 g < mass ≤ 250 g, so any mass over 100 grams up to and including 250 grams comes into this interval.
The class boundaries are 0, 100, 250, 500, 800.
The class widths are 100, 150, 250, 300.

(v) Frequency distribution to show the speeds of 50 cars passing a checkpoint

    Speed (km/h)   20-30   30-40   40-60   60-80   80-100
    Frequency      2       7       ...     ...     ...

The interval 30-40 means 30 km/h ≤ speed < 40 km/h.
The class boundaries are 20, 30, 40, 60, 80, 100.
The class widths are 10, 10, 20, 20, 20.

(vi) Frequency distribution to show ages (in completed years) of applicants for a teaching post

    Age (years)   21-24   25-28   29-32   33-40   41-52
    Frequency     4  2  1  1  (one entry illegible)

Since the ages are given in completed years (not to the nearest year), '21-24' means 21 ≤ age < 25. Someone who is 24 years and 11 months would come into this category. Sometimes this interval is written '21-' and the next is '25-', etc.
The class boundaries are 21, 25, 29, 33, 41, 53.
The class widths are 4, 4, 4, 8, 12.

HISTOGRAMS

Grouped data can be displayed in a histogram as in the following diagram.

[Figure: histogram of the ages of passengers. Horizontal axis: age of passengers (years).]

This histogram represents the following table for the distribution of ages of passengers on a shuttle flight from Denver, Colorado to Salt Lake City, Utah.

    Age, x years   ...

[Calculator key sequence for SD mode: set SD mode, clear memories, input data, clear SD mode.]

Therefore the standard deviation is 1.22 (2 d.p.), as before.

In a grouped frequency distribution, the mid-interval value is taken as representative of the interval, as in the following example.

Example 1.24

[Figure: histogram of the times taken. Horizontal axis: time (minutes); vertical axis: frequency density (candidates per minute).]

An intelligence test was taken by 115 candidates. For each candidate the time taken to complete the test was recorded, and the times were summarised in a histogram (see diagram).

Write down the frequency for each of the class intervals 0-1, 1-2, 2-3, 3-5 and 5-10 minutes.

Calculate estimates of the mean and standard deviation of the times taken to complete the test. (C)

Solution 1.24

Frequency = frequency density × interval width.
Note that the interval 2-3, for example, represents 2 ≤ time < 3.

    Time (min)   0-1   1-2   2-3   3-5   5-10
    Frequency    10    15    25    40    25

To calculate estimates for the mean and standard deviation, use mid-interval values, x.

    Time (min)   x     f     fx      fx²
    0-1          0.5   10    5       2.5
    1-2          1.5   15    22.5    33.75
    2-3          2.5   25    62.5    156.25
    3-5          4     40    160     640
    5-10         7.5   25    187.5   1406.25
                       Σf = 115   Σfx = 437.5   Σfx² = 2238.75

$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{437.5}{115} = 3.80\ldots$

$s = \sqrt{\dfrac{\sum fx^2}{\sum f} - \bar{x}^2} = \sqrt{\dfrac{2238.75}{115} - (3.80\ldots)^2} = 2.2\ldots$

The mean time is 3.8 minutes and the standard deviation is 2.2 minutes.

[You could have calculated these directly using the calculator in SD mode. Check them yourself.]

If you are given summary information, rather than the raw data or frequency distribution, you cannot use the calculator in SD mode. You will have to use the formulae to calculate the mean and standard deviation, as in the following example.

Example 1.25

(a) Cartons of orange juice are advertised as containing 1 litre. A random sample of 100 cartons gave the following results for the volume, x.

$\sum x = 101.4, \quad \sum x^2 = 102.83$

Calculate the mean and the standard deviation of the volume of orange juice in these 100 cartons.

(b) A machine is supposed to cut lengths of rod 50 cm long. A sample of 20 rods gave the following results for the length, x.

$\sum fx = 997, \quad \sum fx^2 = 49\,711$

(i) Calculate the mean length of the 20 rods.
(ii) Calculate the variance of the lengths of the 20 rods. State the units of the variance in your answer.

Solution 1.25

(a) $\sum x = 101.4$, $\sum x^2 = 102.83$, $n = 100$

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{101.4}{100} = 1.014$

The mean volume is 1.014 litres.

$s = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2} = \sqrt{\dfrac{102.83}{100} - 1.014^2} = 0.0101\ldots$

The standard deviation of the volume is 0.010 litres (2 s.f.).

(b) $\sum fx = 997$, $\sum fx^2 = 49\,711$, $\sum f = 20$

(i) $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{997}{20} = 49.85$

The mean length of the rods is 49.85 cm.

(ii) Variance $= \dfrac{\sum fx^2}{\sum f} - \bar{x}^2 = \dfrac{49\,711}{20} - 49.85^2 = 0.5275$ cm²

Exercise: Mean and standard deviation

1. Do not use the statistical program on your calculator for this question.
(i) For each of the following sets of numbers, calculate the mean and the standard deviation. Try using both forms of the formula for the standard deviation in parts (a) to (c). In parts (d) to (f) choose one of the methods.
(a) 2, 4, 5, 6, 8
(b) 6, 8, 9, 11
(c) 1, 14, 17, 23, 29
(d) 5, 13, 7, 9, 16, 15
(e) 46, 27, 31, 008, 62
(f) 200, 203, 266, 207, 209
(ii) Now check your answers using your calculator in SD (STAT) mode.

2. The table shows the weekly wages in £ of each of 100 factory workers.
(a) Draw a histogram to illustrate this information.
(b) Calculate the mean wage and the standard deviation.

    Wage £              ...
    Number of workers   ...

For a set of 20 numbers Σx = 300 and Σx² = 5500. For a second set of 30 numbers Σx = 480 and Σx² = 9600. Find the mean and the standard deviation of the combined set of 50 numbers.

If the mean of the following frequency distribution is 3.66, find the value of a.

    x   1   2   3   4    5   6
    f   3   9   a   11   8   7

A bag contained five balls each bearing one of the numbers 1, 2, 3, 4, 5. A ball was drawn from the bag, its number noted, and then replaced. This was done 50 times in all and the table below shows the resulting frequency distribution.

    Number      1   2   3   4   5
    Frequency   (row illegible; two of the entries are x and y)

If the mean is 2.7, determine the values of x and y.

Parplan Opinion Polls Ltd conducted a nationwide survey into the attitudes of teenage girls. One of the questions asked was "What is the ideal age for a girl to have her first baby?" In reply, the sample of 165 girls from the Northern zone gave a mean of 23.4 years and a standard deviation of 1.6 years. Subsequently, the overall sample of 384 girls (Northern plus Southern zones) gave a mean of 24.8 years and a standard deviation of 2.2 years. Assuming that no girl was consulted twice, calculate the mean and standard deviation for the 219 girls from the Southern zone. (AEB)

10. The manager of a car showroom monitored the numbers of cars sold during two successive five-day periods. During the first five days the numbers of cars sold per day had mean 1.8 and variance 0.56. During the next five days the numbers of cars sold per day had mean 2.8 and variance 1.76. Find the mean and variance of the numbers of cars sold per day during the full ten days. (NEAB)

11. Prior to the start of delicate wage negotiations in a large company, the unions and the management take independent samples of the work force and ask them at what percentage level they believe a settlement should be made. The results are as follows:

    Sample         Size   Mean    Standard deviation
    'management'   350    12.4%   2.1%
    'union'        237    10.7%   1.8%

Assuming that no individual was consulted by both sides, calculate the mean and standard deviation for these 587 workers. (AEB)

12. In a germination experiment, 200 rows of seeds, with ten seeds per row, were incubated. The frequency distribution of the number of seeds which germinated per row is shown below.

    Number of seeds germinated   ...
    Frequency                    0  4  10  16  28  34  44  32  16  10  6  0

(a) Calculate the mean and the standard deviation of the number of seeds germinating per row.
For another 50 rows an analysis shows that the mean is 4.4 seeds and the standard deviation is 2.2 seeds.
(b) Determine the mean and, to two decimal places, the standard deviation for the 250 rows.

13. The figures in the table below are the ages, to the nearest year, of a random sample of 30 people negotiating a mortgage with a bank.

    29 26 31 42 38 45 35 37 38 38
    36 39 49 40 32 32 34 27 61 29
    33 31 33 52 44 32 30 38 42 33

Copy and complete the following stem and leaf diagram. Use the diagram to identify two features of the shape of the distribution.

[Partially completed stem and leaf diagram; entries not recoverable.]

Find the mean age of the 30 people. Given that 18 of them are men and that the mean age of the men is 37.72, find the mean age of the 12 women. (MEI)

14. A travel agency has two shops, R and S.
The number of holidays purchased in a particular week and the mean and standard deviation of the costs of these holidays at each shop are shown in the following table. Calculate the mean and, to the nearest penny, the standard deviation of the costs of all the 56 holidays purchased.

              Number of holidays   Mean cost (£)   SD (£)
    Shop R    32                   190.35          ...
    Shop S    24                   202.25          ...

15. Three random samples of 50, 30 and 20 bags respectively are taken from the production line of '12 kg bags' of cat litter. The contents of each bag are then weighed. A summary of the results is shown in the table. Find, in kilograms to two decimal places, the mean weight per bag and the standard deviation for the 100 bags. (L)

16. The average height of 20 boys is 160 cm, with a standard deviation of 4 cm. The average height of 30 girls is 155 cm, with a standard deviation of 3.5 cm. Find the standard deviation of the whole group of 50 children.

SCALING SETS OF DATA

Example 1.29

Sweets are packed into bags with a nominal mass of 75 g. Ten bags are picked at random from the production line and weighed. Their masses, in grams, are

76, 74.2, 75.1, 73.7, 72, 74.3, 75.4, 74, 73.1, 72.8

(a) Use your calculator to find the mean mass and the standard deviation.

It was later discovered that the scales were reading 3.2 g below the correct weight.

(b) What was the correct mean mass of the ten bags and the correct standard deviation?
(c) Compare your answers to (a) and (b) and comment.

Solution 1.29

(a) According to the scales, with measurements being given in grams,
$\bar{x} = 74.06$, $s = 1.166\ldots = 1.17$ (2 d.p.)

(b) The correct readings are:
79.2, 77.4, 78.3, 76.9, 75.2, 77.5, 78.6, 77.2, 76.3, 76
$\bar{x} = 77.26$, $s = 1.166\ldots = 1.17$ (2 d.p.)

(c) Notice that 77.26 - 74.06 = 3.2,
i.e. correct mean - original mean = 3.2.
So correct mean = original mean + 3.2;
correct s.d. = original s.d.

If each reading is increased by 3.2, then the mean is increased by 3.2. The standard deviation, however, remains unaltered.
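The comparison in Solution 1.29 can be checked numerically. The sketch below uses Python's standard `statistics` module, with `pstdev` (divisor n) matching the standard deviation used in this chapter. The two readings that are unclear in the list of masses are taken here as 73.7 and 72.8; this is an assumption, chosen because it agrees with the corrected readings in part (b) and with the stated mean of 74.06.

```python
from statistics import fmean, pstdev

SCALE_ERROR = 3.2  # grams by which the scales under-read

# Recorded masses in grams; 73.7 and 72.8 are assumed values (see lead-in).
recorded = [76, 74.2, 75.1, 73.7, 72, 74.3, 75.4, 74, 73.1, 72.8]

# Adding the same constant to every reading gives the corrected masses.
corrected = [x + SCALE_ERROR for x in recorded]

mean_shift = fmean(corrected) - fmean(recorded)
sd_change = pstdev(corrected) - pstdev(recorded)

if __name__ == "__main__":
    print(f"recorded:  mean {fmean(recorded):.2f}, s.d. {pstdev(recorded):.2f}")
    print(f"corrected: mean {fmean(corrected):.2f}, s.d. {pstdev(corrected):.2f}")
    print(f"mean shift {mean_shift:.2f}, change in s.d. {sd_change:.2f}")
```

The mean moves by exactly the constant added, while the spread about the mean is untouched, because every deviation from the mean is unchanged by the shift.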
Showing the two sets of readings on a graph helps to show that although the mean increased, the spread of the data about the mean remained the same.

[Figure: dot plots of the original data and the new data on a common scale from 72 to 79, with the original mean and the new mean marked.]

In general, if each number is increased by a constant c
• the mean is increased by c,
• the standard deviation remains unaltered,

i.e. if $y = x + c$, then $\bar{y} = \bar{x} + c$ and $s_y = s_x$.

Now consider what happens when each number in a set of readings is multiplied by a constant.

For the four numbers 2, 3.5, 5, 6
$\bar{x} = 4.125$, $s_x = 1.515\ldots$

Multiplying each number by 3 to obtain y, where $y = 3x$, gives the numbers 6, 10.5, 15, 18. For these,
$\bar{y} = 12.375$, $s_y = 4.546\ldots$

Now $12.375 \div 4.125 = 3$, so $\bar{y} = 3\bar{x}$
and $4.546\ldots \div 1.515\ldots = 3$, so $s_y = 3s_x$.

You can see from the diagram that the new set of data is much more spread out.

[Figure: dot plots of the original data and the new data on a common scale, with the original mean and the new mean marked.]

In general, if each number is multiplied by a constant k
• the mean is multiplied by k,
• the standard deviation is multiplied by |k|, where |k| is the positive value of k,

i.e. if $y = kx$, then $\bar{y} = k\bar{x}$ and $s_y = |k|\,s_x$.

For example, if $y = -\tfrac{1}{2}x$, then $\bar{y} = -\tfrac{1}{2}\bar{x}$ and $s_y = \tfrac{1}{2}s_x$, since $\left|-\tfrac{1}{2}\right| = \tfrac{1}{2}$.

Combining these two results, where a and b are constants:

if $y = ax + b$, then $\bar{y} = a\bar{x} + b$ and $s_y = |a|\,s_x$.

Example 1.30

Joe's mean mark for the physics tests for the term was 72. His teacher decided to scale all the marks according to the formula y = 2x - 6, where y is the new mark and x the original mark. Find Joe's new mean mark.

Solution 1.30

Joe's new mean mark is 84.

Example 1.31

The standard deviation of three numbers a, b, c is 3.2.
(a) State the standard deviation of the three numbers 3a, 3b, 3c.
(b) State the standard deviation of the three numbers a + 2, b + 2, c + 2.
(c) State the standard deviation of the three numbers 2a + 5, 2b + 5, 2c + 5. (C)

Solution 1.31

(a) If $y = 3x$, then $s_y = 3s_x = 9.6$.
(b) If $y = x + 2$, then $s_y = s_x = 3.2$.
(c) If $y = 2x + 5$, then $s_y = 2s_x = 6.4$.
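The combined scaling rule can be tried out on the four numbers 2, 3.5, 5, 6 used in the text. This is a minimal Python sketch, not part of the original course: the helper names `transform` and `predicted_summary` are hypothetical, and the pairs (a, b) are illustrative choices, including a negative multiplier to show why the standard deviation is scaled by |a| rather than a.

```python
from statistics import fmean, pstdev


def transform(values, a, b):
    """Apply the linear scaling y = a*x + b to every value."""
    return [a * x + b for x in values]


def predicted_summary(values, a, b):
    """Mean and s.d. of y = a*x + b predicted from the rule, without forming y."""
    return a * fmean(values) + b, abs(a) * pstdev(values)


x = [2, 3.5, 5, 6]
cases = [(3, 0), (1, 2), (-0.5, 0), (2, 5)]  # illustrative (a, b) pairs

# For each case: (a, b, mean of y, s.d. of y), both computed directly from y.
results = []
for a, b in cases:
    y = transform(x, a, b)
    results.append((a, b, fmean(y), pstdev(y)))

if __name__ == "__main__":
    for a, b, mean_y, sd_y in results:
        rule_mean, rule_sd = predicted_summary(x, a, b)
        print(f"a={a}, b={b}: direct ({mean_y:.3f}, {sd_y:.3f}), "
              f"rule ({rule_mean:.3f}, {rule_sd:.3f})")
```

Computing the summary directly from the transformed values and again from the rule gives the same figures in every case, which is the point of the rule: the new mean and standard deviation follow from the old ones without re-processing the data.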
Comparing sets of data

If you wish to compare two sets of data, for example examination marks in two papers, you can scale one of the sets of data so that the two means are the same and the two standard deviations are the same.

Example 1.32

For students on an Electronics course the assessment consists of two components: a written examination paper and a project. The marks for the examination paper are distributed with a mean of 62 and a standard deviation of 16. Those for the project have a mean of 37 and a standard deviation of 6.

Anna, a student on the course, scored 80 marks on the examination paper and 46 marks for her project.

(a) Transform each of Anna's marks into a standardised score, such that, for each component, the mean and standard deviation for all students on the course are 50 and 20, respectively.
(b) Hence compare Anna's relative performance in the two assessment components. (NEAB)

Solution 1.32

(a) Standardised values: $\bar{y} = 50$, $s_y = 20$

Examination: $\bar{x} = 62$, $s_x = 16$
Let $y = ax + b$,
then $\bar{y} = a\bar{x} + b$, so $50 = 62a + b$ ①
Now $s_y = a s_x$, so $20 = 16a$ and $a = 1.25$.
Substituting in ①, $b = -27.5$.
The transformation for the examination paper is $y = 1.25x - 27.5$.
When $x = 80$, $y = 1.25 \times 80 - 27.5 = 72.5$.
Anna's standardised mark for the examination is 72.5.

Project: $\bar{x} = 37$, $s_x = 6$
Let $y = ax + b$,
then $50 = 37a + b$ ②
Now $20 = 6a$, so $a = 3\tfrac{1}{3}$.
Substituting in ②, $b = -73\tfrac{1}{3}$.
The transformation for the project is $y = 3\tfrac{1}{3}x - 73\tfrac{1}{3}$.
When $x = 46$, $y = 3\tfrac{1}{3} \times 46 - 73\tfrac{1}{3} = 80$.
Anna's standardised mark for the project is 80.

(b) Relatively, Anna performed better on the project than in the examination.

Exercise 1h Scaling sets of data

1. (a) Find the mean and the standard deviation of the set of numbers 4, 6, 9, 3, 5, 6, 9.
(b) Deduce the mean and the standard deviation of the set of numbers 514, 516, 519, 513, 515, 516, 519.
(c) Deduce the mean and the standard deviation of the set of numbers 52, 78, 117, 39, 65, 78, 117.

2. A set of numbers has a mean of 22 and a standard deviation of 6. If 3 is added to each number of the set, and each resulting number is then doubled, find the mean and standard deviation of the new set. (C Additional)

3.
A set of values of a variable X has a mean μ and a standard deviation σ. State the new value of the mean and of the standard deviation when each of the variables is
(a) increased by 2,
(b) multiplied by p.
Values of a new variable Y are obtained by using the formula Y = 3X + 5. Find the mean and the standard deviation of the set of values of Y. (C Additional)

4. Show that the standard deviation of the integers 1, 2, 3, 4, 5, 6, 7 is 2. Using this result find the standard deviation of the numbers
(a) 101, 102, 103, 104, 105, 106, 107,
(b) 100, 200, 300, 400, 500, 600, 700,
(c) 2.01, 3.02, 4.03, 5.04, 6.05, 7.06, 8.07.
(d) Write down seven integers which have mean 5 and standard deviation 6. (L Additional)

5. It is proposed to convert a set of marks whose mean is 52 and standard deviation is 4 to a set of marks with mean 61 and standard deviation 3. The equation for the transformation necessary to convert the marks is y = ax + b. Find
(a) the values of a and b,
(b) the value of the scaled mark which corresponds to a mark of 64 in the original data,
(c) the value in the original data if the scaled mark is 79.

6. The marks of five students in a mathematics test were 27, 31, 35, 47, 50.
(a) Calculate the mean mark and the standard deviation.
(b) The marks are scaled so that the mean and standard deviation become 50 and 20 respectively. Calculate, to the nearest whole number, the new marks corresponding to the original marks of 31 and 50. (C Additional)

7. It is proposed to convert a set of values of a variable X, whose mean and standard deviation are 20 and 5 respectively, to a set of values of a variable Y whose mean and standard deviation are 42 and [illegible] respectively. If the conversion formula is Y = aX + b, calculate the values of a and of b. (C Additional)

8. In order to compare the performances of candidates in two schools a test was given. The mean mark at school A was 45, and the mean mark at school B was 31 with a standard deviation of 5.
The marks of school A are scaled so that the mean and standard deviation are the same as school B and a mark of 85 at school A becomes 63. Find the values of a and b if the transformation used is y = ax + b. Find also the original standard deviation of the marks from school A.

9. The following is a set of 109 examination marks ordered for convenience.

     6 11 11 12 13 14 16 17 18 20
    21 21 23 24 25 25 25 25 26 26
    27 27 28 28 28 29 29 29 30 31
    31 32 32 32 33 33 34 34 35 36
    36 37 37 37 37 38 38 38 39 39
    39 39 39 39 39 39 49 40 40 40
    40 40 41 41 41 42 42 42 42 43
    43 43 44 43 46 46 47 47 47 47
    48 50 50 51 51 52 52 52 53 53
    54 54 55 57 58 58 59 59 61 62
    63 64 66 66 67 70 76 77 82

(a) Construct a grouped frequency distribution using a class width of 10 and starting with 0-9.
(b) Draw a histogram and comment on the shape of the distribution.
(c) Using the frequency table estimate the mean and standard deviation of the marks.
(d) The marks are to be scaled linearly by the relation Y = a + bX where X is the old mark and Y the new mark. The new mean and standard deviation are to be 50 and 10 respectively. Using your estimates in (c) calculate suitable values for a and b.

10. The mean of the marks scored by candidates in an examination is 45. These marks are scaled linearly to give a mean of 50 and a standard deviation of 15. Given that the scaled mark of 80 corresponds to an original mark of 70, calculate
(a) the standard deviation of the original marks,
(b) the mark which is unchanged by the scaling.
Given that the greatest and least scaled marks are 92 and 2 respectively, calculate the corresponding original marks. (C Additional)

USING A METHOD OF CODING TO FIND THE MEAN AND STANDARD DEVIATION

Example 1.33

Salt is packed in bags which the manufacturer claims contain 25 kg each. Eighty bags are examined and the mass, x kg, of each is found. The results are

$\sum (x - 25) = 27.2, \quad \sum (x - 25)^2 = 85.1$

Find the mean and the standard deviation of the masses.
Solution 1.33

You do not know the actual masses and a coding has been used to summarise the results.

The coding is $y = x - 25$, where $\sum y = 27.2$ and $\sum y^2 = 85.1$.

Therefore $\bar{y} = \dfrac{27.2}{80} = 0.34$

Now if $y = x - 25$, then $x = y + 25$.
So $\bar{x} = \bar{y} + 25$.
Therefore $\bar{x} = 0.34 + 25 = 25.34$

Also $s_x = s_y$:

$s_y^2 = \dfrac{85.1}{80} - 0.34^2 = 0.948\,15$, so $s_y = 0.9737\ldots$ and $s_x = 0.9737\ldots$

The mean mass is 25.34 kg and the standard deviation is 0.97 kg (2 d.p.).

NOTE: The value 25 used here is sometimes known as the assumed mean.

Example 1.34

Use the coding $y = \dfrac{x - 200\,000}{25\,000}$ to find the mean and standard deviation of the following:

    x   125 000   150 000   175 000   200 000   225 000   250 000   275 000
    f   5         19        27        35        24        12        3

Solution 1.34

$y = \dfrac{x - 200\,000}{25\,000}$, so $x = 25\,000y + 200\,000$, i.e. $\bar{x} = 25\,000\bar{y} + 200\,000$ and $s_x = 25\,000\,s_y$.

    x         y     f     fy     fy²
    125 000   -3    5     -15    45
    150 000   -2    19    -38    76
    175 000   -1    27    -27    27
    200 000   0     35    0      0
    225 000   1     24    24     24
    250 000   2     12    24     48
    275 000   3     3     9      27
                    Σf = 125   Σfy = -23   Σfy² = 247

$\bar{y} = \dfrac{-23}{125} = -0.184$

$s_y = \sqrt{\dfrac{247}{125} - (-0.184)^2} = 1.393\ldots$

$\bar{x} = 25\,000 \times (-0.184) + 200\,000 = 195\,400$

$s_x = 25\,000\,s_y = 25\,000 \times 1.393\ldots = 34\,840.207\ldots = 34\,800$ (3 s.f.)

The mean is 195 400 and the standard deviation is 34 800 (3 s.f.).

In general, if the set of numbers $x_1, x_2, \ldots, x_n$ is transformed to the set of numbers $y_1, y_2, \ldots, y_n$ by means of the coding

$y = \dfrac{x - a}{b}$, then $x = a + by$,

so $\bar{x} = a + b\bar{y}$ and $s_x = b\,s_y$.

Exercise 1i Coding

1. Find the mean and the standard deviation of the following sets of data, using the coding indicated:

(a)
    x     f
    304   1
    308   5
    312   9
    316   4
    320   4
    324   2

(b)
    Interval   f
    100
