STATISTICS
INTRODUCTION:
In its modern meaning, Statistics can be understood as the application of scientific methods to inductive
research.
The application of statistical methods in the textile field began shortly before the Second World War.
Textile manufacture is largely a system of mass production. For a mass production unit to have an effective
quality control system, the data collected by the testing laboratory must be analyzed with statistical
techniques. The analysis can involve either simple techniques such as the average, standard deviation and
coefficient of variation, or advanced methods such as analysis of variance, correlation and regression. Simple
analysis is common practice in the quality control section of all textile mills, particularly for commodity
textiles. Advanced statistical analysis is used for detailed and more accurate interpretation of test data, and
is often applied to value-added technical textiles.
All forms of textiles have measurable qualities. These qualities are extremely important because they have a
profound effect upon the performance characteristics of the finished product.
In the last few decades, Statistical Methods have increasingly come to be seen as a body of propaedeutic
(preparatory) knowledge for the most varied fields of industrial technique. Their role has expanded from the
original sectors of Quality Control and Market Research to embrace more general problems of process management
under conditions of uncertainty (phenomena of wear, reliability, allocation and interference of machinery, etc.).
Statistical methods also represent an indispensable aid in planning and the correct interpretation of
experiments and thus for applied research.
In particular, many specific applications have been developed in the textile sector. For some time now, both
the physical methods of measurements on fibres and theory of fibre structure are firmly based on principles
of Mathematical Statistics.
What is Statistics?
“Statistics is the science that deals with the collection, analysis, and interpretation of numerical
information”
“The science of collecting and analyzing data for the purpose of drawing conclusions and making
decisions.”
The Growth and Development of Modern Statistics:
Historically, the growth and development of modern statistics can be traced to three separate phenomena: the
needs of government to collect data on its citizenry, the development of the mathematics of probability theory,
and the advent of the computer.
Data have been collected throughout recorded history. During the ancient Egyptian, Greek, and Roman
civilizations, data were obtained primarily for the purposes of taxation and military conscription. In the
Middle Ages, church institutions often kept records concerning births, deaths, and marriages. In America,
various records were kept during colonial times and beginning in 1790, the federal Constitution required the
taking of a census every ten years. In fact, the expanding needs of census helped spark the development of
tabulating machines at the beginning of the twentieth century. This led to the development of large-scale
mainframe computers and eventually to the personal computer revolution.
These developments have profoundly changed the field of statistics in the last 30 years. Mainframe packages
such as SAS and SPSS became popular during the 1960s and 1970s. During the 1980s, statistics software
experienced a vast technical revolution. Besides the usual improvements manifested in periodic updates, the
availability of personal computers led to the development of new packages. In addition, personal computer
versions of existing packages such as SAS, SPSS, and Minitab quickly became available, and the increasing
use of popular spreadsheet packages such as Lotus 1-2-3 and Microsoft Excel led to the incorporation of
statistical features in these packages.
Sampling is the selection of part of an aggregate or totality known as population, on the basis
of which a decision concerning the population is made.
The following are the advantages and/or necessities for sampling in statistical decision-
making:
1. Cost: Cost is one of the main arguments in favor of sampling, because often a sample
can furnish data of sufficient accuracy and at much lower cost than a census.
2. Accuracy: Much better control over data collection errors is possible with sampling
than with a census, because a sample is a smaller-scale undertaking.
3. Timeliness: Another advantage of a sample over a census is that the sample produces
information faster. This is important for timely decision making.
4. Amount of Information: More detailed information can be obtained from a sample
survey than from a census, because it takes less time, is less costly, and allows us to
take more care in the data processing stage.
5. Destructive Tests: When a test involves the destruction of an item under study,
sampling must be used. Statistical sampling determination can be used to find the
optimal sample size within an acceptable cost.
Parameter:
Any numerical value calculated from an entire population is called a parameter. A parameter is a constant
value and is represented by Greek letters, e.g. the proportion of a population opposed to war.
Statistic:
Any numerical value calculated from a sample is called a statistic; it is a numerical function of the sample
used to estimate a population parameter. A statistic varies from sample to sample, which is why it may be
called a variable. Statistics are denoted by letters of the English alphabet.
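The distinction between a parameter and a statistic can be illustrated with a small simulation. The sketch below is only an illustration: the population of 1,000 yarn-strength readings is made up, and the names `population`, `mu` and `sample_mean` are assumptions introduced here, not part of the notes.

```python
import random
import statistics

# Hypothetical population: strength readings for 1,000 bobbins (made-up values).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50.0, 5.0) for _ in range(1000)]

# Parameter: computed from the entire population, so it is one fixed number.
mu = statistics.mean(population)

def sample_mean(n):
    """Statistic: the mean of a simple random sample of n units."""
    return statistics.mean(rng.sample(population, n))

# Each new sample gives a different value of the statistic.
sample_means = [sample_mean(30) for _ in range(5)]
print(f"parameter mu = {mu:.2f}")
print("sample means =", [round(m, 2) for m in sample_means])
```

The population mean stays the same however often the script samples, while the five sample means differ from one another and scatter around it.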
Variable:
A property or attribute of each unit, e.g. age, height
Observation:
Values of all variables for an individual unit
Precision:
Spread of the estimator of a parameter
Accuracy:
How close the estimator is to the true value; the opposite of bias
Bias:
Systematic deviation of the estimate from the true value
Importance of Statistics:
Statistics is perhaps a subject that is used by everybody. The following functions and uses of statistics in
most diverse fields serve to indicate its importance.
i. Statistics assists in summarizing the larger sets of data in a form that is easily understandable.
ii. Statistics assists in the efficient design of laboratory and field experiments as well as surveys.
iii. Statistics assists in a sound and effective planning in any field of inquiry.
iv. Statistics assists in drawing general conclusions and in making predictions of how much of a
thing will happen under given conditions.
v. Statistical techniques, being powerful tools for analyzing numerical data, are used in almost
every branch of learning. Textiles, the Biological and Physical Sciences, Genetics,
Agronomy, Anthropometry, Astronomy, Physics, Geology, etc. are the main areas where
statistical techniques have been developed and are increasingly used.
vi. A businessman, an industrialist and a research worker all employ statistical methods in their
work. Banks, Insurance companies and Governments all have their statistics departments.
vii. A modern administrator whether in public or private sector, leans on statistical data to
provide a factual basis for decision.
viii. A politician uses statistics advantageously to lend support and credence to his arguments
while elucidating the problems he handles.
ix. A social scientist uses statistical methods in various areas of socio-economic life of a nation.
It is sometimes said “A social scientist without an adequate understanding of statistics, is
often like a blind man groping in a dark room for a black cat that is not there”.
Statistical Studies:
Descriptive:
One group, e.g. survey, poll
Comparative:
Two or more groups, e.g. comparing the effectiveness of different teaching methods. (Textile)
Experimental:
The investigator actively intervenes to control the study conditions and looks at the relationship between
predictor (explanatory) and response (outcome) variables. Can establish causation, e.g. drug trial
Observational:
The investigator records data without intervening. It is difficult to distinguish the effects of predictors
from those of confounding (lurking) variables. Can establish association, e.g. Framingham Heart Study
Summation Notation:
The notation $\sum$ is called summation notation, and is a symbolic representation of the series
$x_1 + x_2 + x_3 + \dots + x_n$. The symbol $\Sigma$ is the capital letter sigma in Greek, and it indicates
that the sequence function to its right should be summed. If $x_i$ represents a measurement of variable X,
then in statistics an entire set of n such measurements is typically summed from $x_1$ to $x_n$. This series
is indicated by
\[ \sum_{i=1}^{n} x_i . \]
However, where it is clear that it is the entire set being summed, the lower and upper limits of summation
are often omitted. When this is clear, then
\[ \sum_{i=1}^{n} x_i = \sum_{i} x_i = \sum x . \]
Examples:
\[ \sum_{x=2}^{5} 3x = 3(2) + 3(3) + 3(4) + 3(5) = 42 \]
\[ \sum_{i=1}^{3} (x_i - a) = (x_1 - a) + (x_2 - a) + (x_3 - a) \]
\[ \sum_{i=1}^{2} (x - i + 1) = (x - 1 + 1) + (x - 2 + 1) = 2x - 1 \]
When the summand is the constant 5, the typical element is 5 and it does not change. The sequence is
therefore 5, 5, 5, …, and
\[ \sum_{x=1}^{3} 5 = 5 + 5 + 5 = 15 \]
Theorem: Let c be a constant and x be the variable of summation. Then
\[ \sum_{x=1}^{n} c = nc \]
e.g.
\[ \sum_{x=2}^{7} 3a = 6(3a) = 18a \]
Theorem: Let c be a constant. Then
\[ \sum_{i=1}^{n} c x_i = c \sum_{i=1}^{n} x_i \]
e.g.
\[ \sum_{x=1}^{3} (3x - 5) = 3 \sum_{x=1}^{3} x - 3(5) = 3(1 + 2 + 3) - 15 = 18 - 15 = 3 \]
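Theorem:
\[ \sum_{i=1}^{n} (x_i \pm y_i \pm z_i) = \sum_{i=1}^{n} x_i \pm \sum_{i=1}^{n} y_i \pm \sum_{i=1}^{n} z_i \]
e.g.
\[
\begin{aligned}
\sum_{x=1}^{4} (x^2 + ax + 5) &= \sum_{x=1}^{4} x^2 + \sum_{x=1}^{4} ax + \sum_{x=1}^{4} 5 \\
&= \sum_{x=1}^{4} x^2 + a \sum_{x=1}^{4} x + 4(5) \\
&= (1 + 4 + 9 + 16) + a(1 + 2 + 3 + 4) + 20 \\
&= 50 + 10a
\end{aligned}
\]
\[
\begin{aligned}
\sum_{i=1}^{4} (X^2 + 3i) &= \sum_{i=1}^{4} X^2 + \sum_{i=1}^{4} 3i \\
&= 4X^2 + 3 \sum_{i=1}^{4} i \\
&= 4X^2 + 3(1 + 2 + 3 + 4) \\
&= 4X^2 + 30
\end{aligned}
\]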
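The worked examples and theorems above can be checked numerically. The following is a minimal sketch using Python's built-in `sum()`; the value chosen for `a` is arbitrary and only there so that the symbolic result 50 + 10a can be compared with a number.

```python
# Direct checks of the worked examples, using Python's built-in sum().
# range(a, b + 1) runs over the integers a, a+1, ..., b.

# Sum of 3x for x = 2, 3, 4, 5
assert sum(3 * x for x in range(2, 6)) == 42

# Theorem: the sum of a constant c over n terms equals n*c (here c = 5, n = 3)
assert sum(5 for _ in range(1, 4)) == 3 * 5

# Theorem: a constant factor can be taken outside the sum
assert sum(3 * x - 5 for x in range(1, 4)) == 3 * sum(range(1, 4)) - 3 * 5 == 3

# Theorem: the sum of several terms is the sum of the separate sums.
# a is an arbitrary illustrative value; the result should equal 50 + 10a.
a = 7
total = sum(x**2 + a * x + 5 for x in range(1, 5))
assert total == 50 + 10 * a
print(total)
```

If any identity were wrong, the corresponding `assert` would stop the script with an error, so a silent run followed by the printed total is the check.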
$\sum_{i=1}^{n} \sum_{j=1}^{m} x_{ij}$ means that we first sum over the subscript j, using the theory for
single summation, and we then sum the result over the subscript i.
m) Evaluate $\sum_{j=3}^{7} \sum_{i=0}^{3} a^{2} i j$
a) $\sum_{x=0}^{n} \binom{n}{x} p^{x} q^{n-x} = 1$
b) $\sum_{x=0}^{n} x \binom{n}{x} p^{x} q^{n-x} = np$
c) $\sum_{x=0}^{n} x(x-1) \binom{n}{x} p^{x} q^{n-x} = n(n-1)p^{2}$
d) $\sum_{x=1}^{n} a r^{x-1} = \dfrac{a(1 - r^{n})}{1 - r}$
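Results (a) to (d) can be spot-checked numerically before attempting a proof. The sketch below assumes that q = 1 - p in (a) to (c), which the notes do not state explicitly but which the identities require, and uses arbitrary illustrative values for n, p, a and r. The function names are introduced here for illustration only.

```python
from math import comb, isclose

def binomial_sums(n, p):
    """Return the three sums in (a)-(c) for given n and p, taking q = 1 - p."""
    q = 1 - p
    terms = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
    total = sum(terms)
    first = sum(x * t for x, t in enumerate(terms))
    second = sum(x * (x - 1) * t for x, t in enumerate(terms))
    return total, first, second

def geometric_sum(a, r, n):
    """Left-hand side of (d): a + ar + ... + ar^(n-1)."""
    return sum(a * r**(x - 1) for x in range(1, n + 1))

n, p = 10, 0.3          # illustrative values, not from the text
total, first, second = binomial_sums(n, p)
print(isclose(total, 1.0), isclose(first, n * p), isclose(second, n * (n - 1) * p**2))

a, r, m = 2.0, 0.5, 8   # illustrative values, not from the text
print(isclose(geometric_sum(a, r, m), a * (1 - r**m) / (1 - r)))
```

A numerical check of this kind does not prove the identities, but it catches a mis-copied exponent or limit quickly.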
Types of Data:
Data: A set of counts or measurements
Character
   Nominal, e.g. color: red, green, blue
   Binary, e.g. (M, F), (H, T), (0, 1)
   Ordinal, e.g. attitude to war: agree, neutral, disagree
Numeric
   Discrete, e.g. number of children
   Continuous, e.g. distance, time, temperature
also:
   Interval, e.g. Fahrenheit temperature
   Ratio (real zero), e.g. distance, number of children
There are basically two types of random variables that yield the observed outcomes or data: categorical
and numerical. Categorical random variables yield categorical responses, while numerical random
variables yield numerical responses. For example, the response to the question “Do you currently own
U.S. Government Savings Bonds?” is categorical. The choices are limited to “yes” or “no”. On the other
hand, responses to questions such as “To how many magazines do you currently subscribe?” or “How tall
are you?” are clearly numerical. In the first case, the numerical random variable may be considered
discrete, while in the second case it can be thought of as continuous.
Discrete data are numerical responses that arise from a counting process, while continuous data
are numerical responses that arise from a measuring process. “The number of magazines subscribed
to” is an example of a discrete numerical variable, since the response takes on one of a (finite)
number of integers. On the other hand, “The height of an individual” is an example of a continuous
numerical variable, since the response can take on any value within a continuum or interval,
depending on the precision of the measuring instrument. For example, a person whose height is
reported as 67 inches may be measured as $67\tfrac{1}{4}$ inches, $67\tfrac{7}{32}$ inches, or
$67\tfrac{58}{250}$ inches if more precise instrumentation is available. Therefore, we can see that
height is a continuous phenomenon that can take on any value within an interval.
GRAPHICAL REPRESENTATION:
Tabulation is a good method of condensing and representing data in a readily understandable form,
but many people have no taste for figures. They would prefer a way of representation where figures
could be avoided. This purpose is achieved by the presentation of statistical data in a visual form.
The visual display of statistical data in the form of points, lines, areas and other geometrical forms
and symbols is, in the most general terms, known as Graphical Representation. With this method,
statistical data can be studied without going through figures presented in the form of tables.
Such visual representations are described in the sections that follow. The basic difference between
a graph and a diagram is that a graph is a representation of data by a continuous curve, usually
shown on graph paper, while a diagram is any other one-, two- or three-dimensional form of visual
representation.
DIAGRAMS:
Diagrammatic representation is best suited to spatial series and data split into different categories.
Whenever a comparison of the same type of data at different places is to be made, diagrams will be
the best way to do that. Diagrammatic representation has several advantages over tabular
representation of figures. Beautifully and neatly constructed diagrams are more attractive than simple
figures. Diagrams, being a visual display, leave more effective and long lasting impression on the
mind of a reader. They make unwieldy data intelligible at a glance. Comparison is made easier with
diagrams. Diagrams have some disadvantages too. Diagrams are less accurate than tables, they cost
money and time, and the amount of information conveyed is limited. However, this method of
representation is extensively used in business and administration.
Different types of diagrams or charts commonly used for displaying statistical data are described
below:
1) Linear or One-Dimensional Diagrams: They consist of Simple Bars, Multiple Bars and
Component Bar charts. Here the values are represented only by one dimension, generally the length
of the bar.
2) Areal or Two-Dimensional Diagrams: They consist of Rectangles, Sub-divided Rectangles and
Squares, the areas of which are proportional to the values of given quantities. This device is used to
represent data having moderately large variations.
3) Cubic or Three-Dimensional Diagrams: They are in the form of Cubes and Cylinders, whose
volumes are proportional to the values they represent. These diagrams are used when the variation
among the values of the data to be portrayed is so large that even the square roots of the values
concerned fail to reduce the variation appreciably.
4) Pie Diagrams: They are in the form of circles and sectors. Here the areas of Circles or Sectors
are in proportion to the values they represent or compare.
5) Pareto-Diagrams:
6) Pictograms: They consist of pictures or small symbolic figures representing the statistical data.
Simple Bar Chart: Simple bar diagrams are made to represent geographical, historical, numerical
and qualitative data. Vertical or horizontal bars are used to represent the data when the
difference between the quantities is not very large. The quantities may be arranged in ascending
or descending order, but time series data are kept in their chronological order. (A time series
consists of numerical data collected, observed or recorded at more or less regular intervals of time:
each hour, day, month, quarter or year.)
[Figure: simple bar chart of Percentage of Schools by location: Urban 61.2, Suburban 28.8, Rural 10.0.]
Example: Draw a simple bar diagram of the frequency distribution of all faults present on 70 pieces of the
article given in the following table.
[Figure: simple bar chart of Frequency (vertical axis) against No. of faults (horizontal axis, 1 to 10).]
Flaw                       f     f%
missing weft               4  25.00
incorrect interweaving     1   6.25
hole                       2  12.50
group of broken threads    9  56.25
[Figure: horizontal bar chart of f for each flaw.]
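The f% column of the flaw table is each frequency expressed as a percentage of the total number of flaws. A minimal sketch of that calculation follows; the dictionary and function names are illustrative choices, not part of the notes.

```python
# Flaw counts taken from the table above.
flaws = {
    "missing weft": 4,
    "incorrect interweaving": 1,
    "hole": 2,
    "group of broken threads": 9,
}

def percentage_frequencies(counts):
    """Convert absolute frequencies f into percentage frequencies f%."""
    total = sum(counts.values())
    return {name: 100 * f / total for name, f in counts.items()}

for name, pct in percentage_frequencies(flaws).items():
    print(f"{name:<26}{flaws[name]:>3}{pct:>8.2f}")
```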
Example: Draw a simple bar diagram to represent the turnover of a company for 5 years.
Years:              1965    1966    1967    1968    1969
Turnover (Rupees):  35,000  42,000  43,500  48,000  48,500
[Figure: simple bar chart of turnover (thousand rupees, vertical axis) by Year (horizontal axis).]
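A simple bar chart only needs bar lengths proportional to the values. The sketch below draws the turnover data as a text bar chart so that it runs without any plotting library; the scale of one symbol per 500 rupees and the function name `bar` are arbitrary choices made for this illustration.

```python
# Turnover figures from the example above, in rupees.
turnover = {1965: 35_000, 1966: 42_000, 1967: 43_500, 1968: 48_000, 1969: 48_500}

def bar(value, unit=500, symbol="#"):
    """One bar: one symbol for every full `unit` rupees."""
    return symbol * (value // unit)

# Years stay in chronological order because this is a time series.
for year, value in turnover.items():
    print(f"{year} | {bar(value)} {value / 1000:g}")
```

Because every bar uses the same unit, the ratio of any two bar lengths equals the ratio of the corresponding turnovers.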
Multiple Bar Chart: A multiple bar chart shows two or more characteristics
corresponding to the values of a common variable in the form of grouped bars, whose lengths
are proportional to the values of the characteristics, and each of which is shaded or coloured
differently to aid identification. This is a good device for the comparison of two or more
kinds of information. For example, imports, exports and productions of a country can be
compared from year to year by grouping the three bars together.
Example: Draw multiple bar charts to show the area and production of cotton in the
Punjab from the following data:
[Figure: multiple bar chart of Area and Production by Year (1965-66, 1970-71, 1975-76).]
[Figure: component bar chart of Female and Male for Peshawar, Rawalpindi, Sargodha and Lahore.]
Example: Place the data for female and male hair colour together in a component
and multiple parts frequency bar chart and a consecutive-parts frequency bar chart.
[Figure: bar chart of Frequency by hair Colour (Black, Blonde, Brown, Red) for Male and Female.]
The necessary computations required for the drawing of sub-divided rectangles are given
below and the diagram is shown:
                         Family A                Family B
Items of            Actual    Percentage    Actual    Percentage
Expenditure         Expenses  Expenses      Expenses  Expenses
Food                   24       60.0           60       50.0
Clothing                4       10.0           14       11.7
House Rent              4       10.0           16       13.3
Education               3        7.5            6        5.0
Litigation              2        5.0           10        8.3
Conventional            1        2.5            6        5.0
Miscellaneous           2        5.0            8        6.7
Total                  40      100.0          120      100.0
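The percentage columns in the table can be reproduced from the actual expenses. The following sketch recomputes them, rounded to one decimal place as in the table; the variable and function names are illustrative.

```python
# Actual expenses from the table above, in the same order of items.
items = ["Food", "Clothing", "House Rent", "Education",
         "Litigation", "Conventional", "Miscellaneous"]
family_a = [24, 4, 4, 3, 2, 1, 2]
family_b = [60, 14, 16, 6, 10, 6, 8]

def percentage_expenses(actual):
    """Express each item as a percentage of the family's total expenditure."""
    total = sum(actual)
    return [round(100 * x / total, 1) for x in actual]

pct_a = percentage_expenses(family_a)
pct_b = percentage_expenses(family_b)
for item, a, b in zip(items, pct_a, pct_b):
    print(f"{item:<14}{a:>6.1f}{b:>6.1f}")
```

In a sub-divided rectangle diagram the percentages give the heights of the components, while the widths of the two rectangles are drawn in proportion to the totals (here 40 : 120, i.e. 1 : 3).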
Example: The following table shows the number of employees in a certain Textile Mills.
Represent the data by means of a pictogram.
Pie Chart
[Figure: pie chart with sectors Food 34%, Misc. 23%, Clothing 20%, House Rent 13%, Fuel and Light 10%.]
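To draw a pie chart by hand, each percentage share is converted into a sector angle out of 360 degrees. A minimal sketch using the shares shown in the pie chart above follows; the function name `sector_angles` is an illustrative choice.

```python
# Percentage shares read from the pie chart above.
shares = {"Food": 34, "Clothing": 20, "House Rent": 13,
          "Fuel and Light": 10, "Misc.": 23}

def sector_angles(percentages):
    """Each sector's angle is its share of the full 360 degrees."""
    return {name: 360 * pct / 100 for name, pct in percentages.items()}

angles = sector_angles(shares)
for name, angle in angles.items():
    print(f"{name:<15}{angle:>7.1f} degrees")
```

The angles should add up to 360 degrees, which is a quick check that the shares add up to 100%.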
Profit and Loss Chart: This is virtually a percentage component bar chart in which
profits can be shown above the normal base line and losses below the base line. Since the
bars are to be extended from the zero line to show losses, we start from the top. For an
illustration, the following data are represented:
1. Divide each observation in the data set into two parts, the Stem and the Leaf.
2. List the stems in order in a column, starting with the smallest stem and ending with the
largest.
3. Proceed through the data set, placing the leaf for each observation in the appropriate stem
row.
Depending on the data, a display can use one, two or five lines per stem. Among the different
stems, two-line stems are widely used.
Example 2.5 The quantity of glucose in the blood of 100 persons is measured and recorded in
Table 2.0b (unit is mg%). Using SPSS we obtain the following Stem-and-Leaf display for this data
set.
70 79 80 83 85 85 85 85 86 86
86 87 87 88 89 90 91 91 92 92
93 93 93 93 94 94 94 94 94 94
95 95 96 96 96 96 96 97 97 97
97 97 98 98 98 98 98 98 100 100
101 101 101 101 101 101 102 102 102 103
103 103 103 104 104 104 105 106 106 106
106 106 106 106 106 106 106 107 107 107
107 108 110 111 111 111 111 111 112 112
112 115 116 116 116 116 119 121 121 126
Table 2.0b Quantity of glucose in blood of 100 students (unit: mg%)
GLUCOSE
1.00 7 . 9
2.00 8 . 03
11.00 8 . 55556667789
15.00 9 . 011223333444444
18.00 9 . 556666677777888888
18.00 10 . 001111112223333444
16.00 10 . 5666666666677778
9.00 11 . 011111222
6.00 11 . 566669
2.00 12 . 11
Stem width: 10
The stem and leaf display above partitions the data set into 12 classes corresponding to 12
stems. Here two-line stems are used. The number of leaves in each class gives the class
frequency.
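The three construction steps can be sketched in a few lines. The function below is an illustrative implementation with stem width 10 and a configurable number of lines per stem; it is not the SPSS procedure. In particular, the SPSS display above shows only one leaf on stem 7, so SPSS appears to handle the value 70 separately, whereas this sketch simply lists every value. Only the first two rows of Table 2.0b are used, to keep the example short.

```python
from collections import defaultdict

def stem_and_leaf(values, lines_per_stem=2):
    """Build a stem-and-leaf display with stem width 10.

    With lines_per_stem=2, leaves 0-4 go on the first line of a stem
    and leaves 5-9 on the second.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        line = leaf * lines_per_stem // 10   # which line of the stem
        rows[(stem, line)].append(str(leaf))
    return [(stem, "".join(leaves)) for (stem, _), leaves in sorted(rows.items())]

# First two rows of Table 2.0b, used here as a short illustration.
glucose = [70, 79, 80, 83, 85, 85, 85, 85, 86, 86,
           86, 87, 87, 88, 89, 90, 91, 91, 92, 92]

for stem, leaves in stem_and_leaf(glucose):
    # frequency, stem, leaves
    print(f"{len(leaves):>3} {stem:>3} . {leaves}")
```

The number of leaves printed at the start of each row is the class frequency.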
Advantages of a stem and leaf display over a frequency distribution (considered in the next
section):
17 665
18 809
19 85
20 688
21 836
22 209
23 4487
24 4238
25 467
26 488647478
27 98965853
28 96869
Pareto Diagram
Purposes:
Pareto diagrams are named after Vilfredo Pareto, an Italian sociologist and economist, who
invented this method of information presentation toward the end of the 19th century. The
chart is similar to the histogram or bar chart, except that the bars are arranged in decreasing
order from left to right along the abscissa. The fundamental idea behind the use of Pareto
diagrams for quality improvement is that the first few (as presented on the diagram)
contributing causes to a problem usually account for the majority of the result. Thus,
targeting these "major causes" for elimination results in the most cost-effective improvement
scheme.
How to Construct:
1. Determine the categories and the units for comparison of the data, such as frequency, cost,
or time.
2. Total the raw data in each category, then determine the grand total by adding the
totals of each category.
3. Re-order the categories from largest to smallest.
4. Determine the cumulative percent of each category (i.e., the sum of each category
plus all categories that precede it in the rank order, divided by the grand total and
multiplied by 100).
5. Draw and label the left-hand vertical axis with the unit of comparison, such as
frequency, cost or time.
6. Draw and label the horizontal axis with the categories. List from left to right in rank
order.
7. Draw and label the right-hand vertical axis from 0 to 100 percent. The 100 percent
should line up with the grand total on the left-hand vertical axis.
8. Beginning with the largest category, draw in bars for each category representing the
total for that category.
9. Draw a line graph beginning at the right-hand corner of the first bar to represent the
cumulative percent for each category as measured on the right-hand axis.
10. Analyze the chart. Usually the top 20% of the categories will comprise roughly 80%
of the cumulative total.
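Steps 2 to 4 above (totals, rank order, cumulative percent) can be sketched as follows. The data are the flaw counts from the bar-chart example earlier in these notes, reused here only as an illustration; the function name `pareto_table` is an assumption of this sketch.

```python
def pareto_table(counts):
    """Return (category, total, cumulative percent) rows, largest category first."""
    grand_total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    rows, running = [], 0
    for category, total in ordered:
        running += total
        rows.append((category, total, 100 * running / grand_total))
    return rows

# Flaw counts from the bar-chart example earlier in these notes.
flaws = {"missing weft": 4, "incorrect interweaving": 1,
         "hole": 2, "group of broken threads": 9}

for category, total, cum_pct in pareto_table(flaws):
    print(f"{category:<26}{total:>3}{cum_pct:>8.2f}")
```

The totals give the bar heights for the left-hand axis and the cumulative percentages give the points of the line graph read against the right-hand axis.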
Tips:
Exercise:
Construct a Pareto diagram from the data given in the table below.
HISTOGRAM
A Histogram is used to display in bar graph format measurement data distributed by categories.
A HISTOGRAM IS USED FOR:
1. Making decisions about a process, product, or procedure that could be improved after examining the
variation (example: Should the school invest in a computer-based tutoring program for low achieving
students in Algebra I after examining the grade distribution?; are more shafts being produced out of
specification that are too big rather than too small?)
2. Displaying easily the variation in the process (example: Which units are causing the most difficulty
for students?; is the variation in a process due to parts that are too long or parts that are too short?)
1. Gather and tabulate data on a process, product, or procedure. This could be time, weight, size,
frequency of occurrences, test scores, GPA's, pass/fail rates, number of days to complete a cycle,
diameter of shafts built, etc.
2. Calculate the range of the data by subtracting the smallest number in the data set from the largest.
Call this value R.
3. Decide about how many bars (or classes) you want to display in your eventual histogram. Call this
number K. This number should never be less than four and seldom exceeds 12. With 100 numbers,
K=7 generally works well. With 1000 pieces of data, K=11 works well.
4. Determine the fixed width of each class by dividing the range, R, by the number of classes, K. This
value should be rounded to a "nice" number, generally a number ending in a zero. For example, 11.3
would not be a "nice" number; 10 would be considered a "nice" number. Call this number i, for
interval width. It is important to use "nice" numbers, or else the histogram created will have weird
scales on the X axis.
5. Create a table of upper and lower class limits. Add the interval width i to the first "nice" number less
than the lowest value in the data set to determine the upper limit of the first class. This first "nice"
number becomes the lowest lower limit of the first class. The upper limit of the first class becomes
the lower limit of the second class. Adding the interval width (i) to the lower limit of the second class
determines the upper limit for the second class. Repeat this process until the largest upper limit
exceeds the biggest piece of data. You should have approximately K classes or categories in total.
6. Sort, organize, or categorize the data in such a way that you can count or tabulate how many pieces
of data fall into each of the classes or categories in your table above. These are the frequency counts
and will be plotted on the Y axis of the histogram.
7. Create the framework for the horizontal and vertical axes of the histogram. On the horizontal axis
plot the lower and upper limits of each class determined above. The scale on the vertical axis should
run from zero to the first "nice" number greater than the largest frequency count determined above.
8. Plot the frequency data on the histogram framework by drawing vertical bars for each class. The
height of each bar represents the number or frequency of values occurring between the lower and
upper limits of that class.
9. Interpret the histogram for skew and clustering problems:
EXAMPLE
The data below are the spelling test scores for 20 students on a 50 word-spelling test. The scores
(number correct) are: 48, 49, 50, 46, 47, 47, 35, 38, 40, 42, 45, 47, 48, 44, 43, 46, 45, 42, 43, 47.
The largest number is 50 and the smallest is 35. Thus, the range, R = 15. We will use 5 classes, so
K=5. The interval width i= R/K = 15/5=3.
Then we will make our lowest lower limit, the lower limit for the first class, 35. Thus the first upper
limit is 35+3 or 38. The second class will have a lower limit of 38 and an upper limit of 41. The
completed table (with frequencies tabulated) will look like the following:
Class Lower Limit Upper Limit Frequency
1 35 38 1
2 38 41 2
3 41 44 4
4 44 47 5
5 47 50 8
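The class table can be reproduced with a short script. This sketch follows the example as given (K = 5, width R/K = 3, lowest limit equal to the smallest score) and does not round to a "nice" number. It assumes a counting convention that the notes do not spell out: a score equal to a shared limit is counted in the higher class, except that the maximum score goes in the last class. That convention reproduces the frequencies in the table.

```python
scores = [48, 49, 50, 46, 47, 47, 35, 38, 40, 42,
          45, 47, 48, 44, 43, 46, 45, 42, 43, 47]

def class_frequencies(data, k):
    """Count data into k equal-width classes spanning min(data)..max(data).

    Assumed convention: a value equal to a shared limit is counted in the
    higher class, except that the maximum goes in the last class.
    """
    low, high = min(data), max(data)
    width = (high - low) / k            # interval width i = R / K
    counts = [0] * k
    for x in data:
        index = min(int((x - low) / width), k - 1)
        counts[index] += 1
    limits = [(low + j * width, low + (j + 1) * width) for j in range(k)]
    return limits, counts

limits, counts = class_frequencies(scores, 5)
for number, ((lower, upper), f) in enumerate(zip(limits, counts), start=1):
    print(f"{number}  {lower:g}  {upper:g}  {f}")
```

The frequencies are the bar heights of the histogram, and they add up to the 20 scores.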
In 1977, John Tukey published an efficient method for displaying a five-number data summary. The graph is
called a boxplot (also known as a box and whisker plot).
Box and whisker diagrams are often used to display statistically analyzed data. The
traditional box and whisker diagram displays the range (maximum to minimum) of the
data, the median, and the 1st and 3rd quartile about the median. The second quartile is also
the median. A review of quartiles is provided below. An alternate form of the box and
whisker diagram is to show the mean and 1 standard deviation about the mean. This latter
form of the box and whisker diagram is easier to compute. This document will describe
the two variations of the box and whisker diagram.
This simplest possible box plot displays the full range of variation (from min to max), the likely
range of variation (the IQR), and a typical value (the median). Not uncommonly real datasets will
display surprisingly high maximums or surprisingly low minimums called outliers. John Tukey has
provided a precise definition for two types of outliers:
• Outliers are either 3×IQR or more above the third quartile or 3×IQR or more below the first quartile.
• Suspected outliers are slightly more central versions of outliers: either 1.5×IQR or more above the
third quartile or 1.5×IQR or more below the first quartile.
If either type of outlier is present the whisker on the appropriate side is taken to 1.5×IQR from the quartile
(the "inner fence") rather than the max or min, and individual outlying data points are displayed as unfilled
circles (for suspected outliers) or filled circles (for outliers). (The "outer fence" is 3×IQR from the quartile.)
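The two fence rules can be written as a small classification function. This is a sketch of the definitions above, taking the quartiles as given inputs (different software computes quartiles slightly differently); the quartile values used are hypothetical and chosen only for illustration.

```python
def classify(value, q1, q3):
    """Label a value using the inner (1.5 x IQR) and outer (3 x IQR) fences."""
    iqr = q3 - q1
    # Distance outside the box; zero or negative when the value lies inside it.
    distance = max(q1 - value, value - q3)
    if distance >= 3 * iqr:
        return "outlier"
    if distance >= 1.5 * iqr:
        return "suspected outlier"
    return "not an outlier"

# Hypothetical quartiles, chosen only for illustration (IQR = 5.5).
q1, q3 = 8.5, 14.0
for v in (12, 23, 31, -8):
    print(v, classify(v, q1, q3))
```

With these quartiles the inner fences lie 8.25 units and the outer fences 16.5 units beyond the box, so 23 is a suspected outlier while 31 and -8 are outliers.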
IQR = 1.35
Suspected outliers are not uncommon in large normally distributed datasets (say more than 100
data-points). Outliers are expected in normally distributed datasets with more than about 10,000
data-points. Here is an example of 1000 normally distributed data displayed as a boxplot:
Note that outliers are not necessarily "bad" data-points; indeed they may well be the most
important, most information rich, part of the dataset. Under no circumstances should they be
automatically removed from the dataset. Outliers may deserve special consideration: they may be
the key to the phenomenon under study or the result of human blunders.
A box and whisker plot is based on medians. The first step is to rewrite the data in order, from smallest
length to largest:
Now find the median of all the numbers. Notice that since there are 13 numbers, the middle one will be the
seventh number:
This must be the median (middle number) because there are six numbers on each side.
The next step is to find the lower median. This is the middle of the lower six numbers. The exact centre is
half-way between 8 and 9 ... which would be 8.5
Now find the upper median. This is the middle of the upper six numbers. The exact centre is half-way
between 14 and 14 ... which must be 14
Now you are ready to construct the actual box & whisker graph. First you will need to draw an ordinary
number line that extends far enough in both directions to include all the numbers in your data:
First, locate the main median 12 using a vertical line just above your number line:
Now locate the lower median 8.5 and the upper median 14 with similar vertical lines:
Next, draw a box using the lower and upper median lines as endpoints:
Finally, the whiskers extend out to the data's smallest number 5 and largest number 20:
The shading below, as an example, shows the quarter of the numbers that are between 12 and 14:
Here is a picture of the quarter of the data that is between 8.5 and 12. Notice that the data is more spread out
here:
This picture is showing where half the data numbers are. Half of all the fish caught had a length between
8.5 and 14 centimetres:
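The median-based steps above can be sketched in code. The original list of fish lengths is not reproduced in these notes, so the 13 values below are hypothetical, made up only to match the summary values quoted in the walkthrough (smallest 5, lower median 8.5, median 12, upper median 14, largest 20).

```python
from statistics import median

def five_number_summary(data):
    """Minimum, lower median, median, upper median, maximum.

    Follows the method above: the lower and upper medians are the medians of
    the values below and above the middle position (the middle value itself
    is left out when the number of values is odd). Needs at least 2 values.
    """
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return ordered[0], median(lower), median(ordered), median(upper), ordered[-1]

# Hypothetical fish lengths (cm), made up to match the summary values quoted above.
lengths = [12, 5, 14, 9, 20, 8, 13, 7, 14, 10, 15, 11, 13]
print(five_number_summary(lengths))
```

The second and fourth values returned are the ends of the box, the third is the line inside it, and the first and last are the ends of the whiskers.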
Q. Employees’ salary data from the SPSS Data Editor are given here. Construct the box-and-whisker plot
using the SPSS program.
Id No. Current Salary Salary Begin
1 $57,000 $27,000
2 $40,200 $18,750
3 $21,450 $12,000
4 $21,900 $13,200
5 $45,000 $21,000
6 $32,100 $13,500 61 $22,500 $9,750
7 $36,000 $18,750 62 $48,000 $21,750
8 $21,900 $9,750 63 $55,000 $26,250
9 $27,900 $12,750 64 $53,125 $21,000
10 $24,000 $13,500 65 $21,900 $14,550
11 $30,300 $16,500 66 $78,125 $30,000
12 $28,350 $12,000 67 $46,000 $21,240
13 $27,750 $14,250 68 $45,250 $21,480
14 $35,100 $16,800 69 $56,550 $25,000
15 $27,300 $13,500 70 $41,100 $20,250
16 $40,800 $15,000 71 $82,500 $34,980
17 $46,000 $14,250 72 $54,000 $18,000
18 $103,750 $27,510 73 $26,400 $10,500
19 $42,300 $14,250 74 $33,900 $19,500
20 $26,250 $11,550 75 $24,150 $11,550
21 $38,850 $15,000 76 $29,250 $11,550
22 $21,750 $12,750 77 $27,600 $11,400
23 $24,000 $11,100 78 $22,950 $10,500
24 $16,950 $9,000 79 $34,800 $14,550
25 $21,150 $9,000 80 $51,000 $18,000
26 $31,050 $12,600 81 $24,300 $10,950
27 $60,375 $27,480 82 $24,750 $14,250
28 $32,550 $14,250 83 $22,950 $11,250
29 $135,000 $79,980 84 $25,050 $10,950
30 $31,200 $14,250 85 $25,950 $17,100
31 $36,150 $14,250 86 $31,650 $15,750
32 $110,625 $45,000 87 $24,150 $14,100
33 $42,000 $15,000 88 $72,500 $28,740
34 $92,000 $39,990 89 $68,750 $27,480
35 $81,250 $30,000 90 $16,200 $9,750
36 $31,350 $11,250 91 $20,100 $11,250
37 $29,100 $13,500 92 $24,000 $10,950
38 $31,350 $15,000 93 $25,950 $10,950
39 $36,000 $15,000 94 $24,600 $10,050
40 $19,200 $9,000 95 $28,500 $10,500
41 $23,550 $11,550 96 $30,750 $15,000
42 $35,100 $16,500 97 $40,200 $19,500
43 $23,250 $14,250 98 $30,000 $15,000
44 $29,250 $14,250 99 $22,050 $10,950
45 $30,750 $13,500 100 $78,250 $27,480
46 $22,350 $12,750 101 $60,625 $22,500
47 $30,000 $16,500 102 $39,900 $15,750
48 $30,750 $14,100 103 $97,000 $35,010
49 $34,800 $16,500 104 $27,450 $15,750
50 $60,000 $23,730 105 $31,650 $13,500
51 $35,550 $15,000
52 $45,150 $15,000
53 $73,750 $26,250
54 $25,050 $13,500
55 $27,000 $15,000
56 $26,850 $13,500
57 $33,900 $15,750
58 $26,400 $13,500
59 $28,050 $14,250
60 $30,900 $15,000
[Figure: SPSS box-and-whisker plots of Current Salary and Salary Begin on a vertical axis from $0 to
$140,000, with outlying cases labelled by Id No.]
▪ The 4 M’s:
• Methods, Machines, Materials, Manpower
▪ The 4 P’s:
• Place, Procedure, People, Policies
▪ The 4 S’s:
• Surroundings, Suppliers, Systems, Skills
Note: You may use one of the four categories suggested, combine them in any fashion or
make up your own. The categories are to help you organize your ideas.
This tool is called a fishbone diagram because it looks like the skeleton of a fish. Its purpose is to identify the
main causes of an outcome. A fishbone diagram can help to explain why a process works well or poorly, and it can
also be applied to outcomes such as grades; it is a way of looking at a process and showing a cause-and-effect
relationship.
Fishbone diagrams work with cause and effect. They help to discover the root cause, or main cause, of a problem,
and so answer the question "why?".
By using the fishbone diagram, the problem is addressed in a systematic, step-by-step way. This helps the team
to think through what needs to be done.
Fishbone Diagram
Purpose:
To identify all of the possible factors that contribute to a problem, i.e. the "effect."
Guidelines
1. Clearly describe the problem, i.e. the "effect," to be diagrammed (for example: files out of
place, too many students in line, or job cost above estimate).
Four often-used categories are people, equipment, methods and materials. These categories are only
suggestions. The team may use any category that helps them think creatively.
2. Draw a box around each category with an arrow pointing at the effect arrows.
3. Brainstorm the detailed factors that contribute to the problem (i.e. the "effect"). Ask for each factor,
"what causes this cause (i.e. factor)?" These are written on the diagram and connected to the
appropriate main category with arrows.
4. Each cause may have sub-causes, which should be shown on the diagram. Continue to ask "why" in
order to identify root causes.
Exercise # 2
Q. 1 During 1995 a nationwide auto-leasing company sold 15,000 cars to the individuals who had originally leased
them. The types of cars involved are shown in the following frequency distribution.
Q. 2 It has been estimated that the number of barrels of oil in the Western Hemisphere, excluding Alaska, is given by the pie
chart shown in the figure. Assuming that there are 130 billion barrels of oil in reserve, answer the following:
a. How many barrels are in reserve in the United States?
b. How many barrels are there in Mexico?
c. How many barrels are there in Canada?
United States
29.7%
South America
Mexico
17.8%
46.1%
Canada
6.3%
Multiple Choices:
For Questions 1-3, refer to the following circle graph, which gives the place of origin of the 150,000 immigrants
living in a large city in the southeast.
[Circle graph: Russia 10%, China 28%, Poland 7%, Other countries 6%, France 23%, Italy 14%, Belgium 12%]
a. Draw a bar graph and pie chart to picture this information for both dogs and cats
b. Which graph makes comparisons easier? Explain your answer.
Q. 4 In the 1992 Gas Mileage Guide published by the U.S. Department of Energy, an estimate of 27 miles per gallon (mpg) is
given for the Ford Ranger pickup truck equipped with a 2.3-liter engine and two-wheel drive, five-speed transmission.
These estimates are based on repeated tests of the vehicles. In one survey of 50 Ford Ranger pickup trucks, the following
mpg results were obtained.
31 24 29 30 31 26 26 29 31
25 23 32 27 32 28 29 27 30
27 27 33 26 28 27 27 28
28 26 25 28 26 29 24 27
33 28 28 24 24 30 29 26
24 30 31 34 27 32 33 28
Q. 5 A scientist from the Environmental Protection Agency took samples of the toxic substance polychlorinated biphenyl
(PCB) levels from the soil at 60 different waste disposal facilities located throughout the United States. The following
results (in 0.0001 grams per kilogram of soil) were obtained:
38.8 35.6 31.8 32.8 36.3 40.2 39.7 33.9 34.4 33.1 39.3 34.8
35.3 38.1 35.7 39.1 37.8 39.5 36.4 38.6 37.6 37.8 31.7 35.7
39.1 35.8 38.4 34.5 37.9 38.2 38.3 40.1 38.8 33.9 30.8 37.6
31.8 32.4 35.9 36.1 38.1 37.6 36.7 30.8 37.8 35.5 39.8 36.9
40.2 33.8 34.7 39.0 36.0 37.3 31.4 31.7 32.9 30.7 37.5 31.8
Data Array: The arrangement of raw data by observations in either ascending or descending order.
Data Point: A single observation from a data set.
Ogive: A graph of a cumulative frequency distribution.
Relative Frequency Distribution: The display of a data set that shows the fraction or percentage of the total data set that
falls into each of a set of mutually exclusive and collectively exhaustive classes.
Representative Sample: A sample that contains the relevant characteristics of the population in the same proportions, as
they are included in that population.
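The definitions above can be illustrated with a small sketch. It assumes equal-width classes that are closed on the left and open on the right; the function names and the choice of the first nine mpg readings from Q. 4 as sample data are illustrative only.

```python
def data_array(values, descending=False):
    """Return the raw observations sorted into a data array."""
    return sorted(values, reverse=descending)


def relative_frequency_distribution(values, lower, width, n_classes):
    """Count observations in the classes [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - lower) // width)
        if not 0 <= index < n_classes:
            # The classes must be all-inclusive: every data point needs a class.
            raise ValueError(f"{v} falls outside the chosen classes")
        counts[index] += 1
    total = len(values)
    return [
        (lower + i * width, lower + (i + 1) * width, count, count / total)
        for i, count in enumerate(counts)
    ]


sample = [31, 24, 29, 30, 31, 26, 26, 29, 31]  # first row of the mpg data in Q. 4
array = data_array(sample)
table = relative_frequency_distribution(sample, lower=24, width=3, n_classes=3)
for low, high, count, share in table:
    print(f"{low} to under {high}: {count} observations ({share:.1%})")
```

Because each value lands in exactly one class, the relative frequencies add up to 1, which is what "mutually exclusive and collectively exhaustive" requires.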
Q. 6
Circle the correct answer or fill in the blank.
1. In comparison to a data array, the frequency distribution has advantage of representing data in compressed
form. T F
2. A histogram is a series of rectangles, each proportional in width to the number of items falling within a specific
class of data. T F
3. The classes in any relative frequency distribution are both all-inclusive and mutually exclusive. T F
4. When a sample contains the relevant characteristics of a certain population in the same proportions, as they are
included in that population, the sample is said to be representative sample. T F
5. A population is a collection of all the elements we are studying. T F
6. Before information is arranged and analyzed, using statistical methods, it is known as preprocessed data. T F
7. One disadvantage of the data array is that it does not allow us to easily find the highest and lowest value in the
data set. T F
8. Discrete data can be expressed only in whole numbers. T F
9. As a general rule, statisticians regard a frequency distribution as incomplete if it has fewer than 20 classes. T F
10. It is always possible to construct a histogram from a frequency polygon. T F
11. Arranging raw data in order of time of observation forms a data array. T F
12. A baseball player’s batting average is computed using a sample. T F
13. The class widths of a frequency distribution are of equal size. T F
14. Which of the following represents the most accurate scheme of classifying data?
(a) Quantitative methods.
(b) Qualitative methods
(c) A combination of quantitative and qualitative methods
(d) A scheme can be determined only with specific information about the situation.
15. Which of the following is not an example of compressed data
(a) Frequency distribution
(b) Data array
(c) Histogram
(d) Ogive.
16. Why is it true that classes in frequency distributions are all-inclusive?
(a) No data point falls into more than one class
(b) There are always more classes than data points.
(c) All data fit into one class or another
(d) (a) and (c) but not (b).
17. When constructing a frequency distribution, the first step is
(a) Divide the data into at least five classes
(b) Sort the data points into classes and count the number of points in each class
(c) Decide on the type and number of classes for dividing the data.
(d) None of these.
18. As a general rule, statisticians tend to use which of the following number of classes when arranging data?
Q.7 The given table shows the 100 count-test results, which have been entered into a frequency distribution. Present the
above frequency distribution in frequency polygon and histogram
Answers:
Summary Statistics
1. Last year a small statistical consulting company paid each of its five statistical clerks $22,000, two
statistical analysts $50,000 each, and the senior statistician/owner $270,000. The number of
employees earning less than the mean salary is:
a. 0
b. 4
c. 5
d. 6
e. 7 (e)
2. The following table represents the relative frequency of accidents per day in a city.
Accidents 0 1 2 3 4 or more
Relative 0.55 0.20 0.10 0.15 0
Frequency
Which of the following statements are true?
a. I only
b. II only
c. III only
d. I, II and III
e. I, II (c)
3. During the past few months, major league baseball players were in the process of negotiating with
the team owners for higher minimum salaries and more fringe benefits. At the time of the
negotiations, most of the major league baseball players had salaries in the $100,000 to $150,000 a
year range. However, there were a handful of players who, via the free agent system, earned nearly
three million dollars per year. Which measure of central tendency of players' salaries, the mean or the
median, might the players have used in an attempt to convince the team owners that they (the
players) were deserving of higher salaries and more fringe benefits?
a. Not enough information is given to answer the question.
b. Either one, since all measures of central tendency are basically the same.
c. Mean.
d. Median.
e. Both the mean and the median.
a. 4.67, 3.82
b. 3.82, 4.67
c. 4.67, 1.95
d. 1.95, 4.67
e. 4.67, 1.84 (c)
6. The effect of acid rain upon the yield of crops is of concern in many places. In order to determine
baseline yields, a sample of 13 fields was selected, and the yield of barley (g/400m2) was
determined. The output from SAS appears below:
QUANTILES(DEF=4) EXTREMES
N 13 SUM WGTS 13 100% MAX 392 99% 392 LOW HIGH
MEAN 220.231 SUM 2863 75% Q3 234 95% 392 161 225
STD DEV 58.5721 VAR 3430.69 50% MED 221 90% 330 168 232
SKEW 2.21591 KURT 6.61979 25% Q1 174 10% 163 169 236
USS 671689 CSS 41168.3 0% MIN 161 5% 161 179 239
CV 26.5958 STD MEAN 16.245 1% 161 205 392
The mean, standard deviation, median, and the highest value are:
QUANTILES(DEF=4) EXTREMES
N 24 SUM WGTS 24 100% MAX 22.6 99% 22.6 LOW HIGH
MEAN 9.09 SUM 218.3 75% Q3 11.45 95% 22.52 0.7 15.1
STD DEV 6.64 VARIANCE 44.0 50% MED 8.15 90% 21.8 1 19.8
SKEWNE 0.924 KURTO -0.0209 25% Q1 3.775 10% 1.6 2.2 21.3
USS 2998 CSS 1012.73 0% MIN 0.7 5% 0.77 2.2 22.3
CV 72 STD MEAN 1.35 1% 0.7 2.8 22.6
T:MEAN=0 6.7153 PROB>|T| 0.0001 RANGE 21.9
SGN RANK 150 PROB>|S| 0.0001 Q3-Q1 7.675
The mean, standard deviation, tenth percentile, and the highest value are:
a. 170, 169
b. 170, 170
c. 169, 170
d. 176, 169
e. 176, 176 (a)
9. If most of the measurements in a large data set are of approximately the same magnitude except for
a few measurements that are quite a bit larger, how would the mean and median of the data set
compare and what shape would a histogram of the data set have?
a. The mean would be smaller than the median and the histogram would be skewed with a long
left tail.
b. The mean would be larger than the median and the histogram would be skewed with a long
right tail.
c. The mean would be larger than the median and the histogram would be skewed with a long
left tail.
d. The mean would be smaller than the median and the histogram would be skewed with a long
right tail.
e. The mean would be equal to the median and the histogram would be symmetrical. (b)
10. In measuring the centre of the data from a skewed distribution, the median would be preferred over
the mean for most purposes because:
a. the median is the most frequent number while the mean is most likely
b. the mean may be too heavily influenced by the larger observations and this gives too high an
indication of the centre
c. the median is less than the mean and smaller numbers are always appropriate for the centre
d. the mean measures the spread in the data
e. the median measures the arithmetic average of the data excluding outliers. (b)
11. In general, which of the following statements is FALSE?
a. The sample mean is more sensitive to extreme values than the median.
b. The sample range is more sensitive to extreme values than the standard deviation.
c. The sample standard deviation is a measure of spread around the sample mean.
d. The sample standard deviation is a measure of central tendency around the median.
e. If a distribution is symmetric, then the mean will be equal to the median. (d)
12. The frequency distribution of the amount of rainfall in December in a certain region for a period of
30 years is given below:
Rainfall Number
(in inches) of years
2.0 - 4.0 3
4.0 - 6.0 6
6.0 - 8.0 8
8.0 - 10.0 8
10.0 - 12.0 5
a. 7.30
b. 7.25
c. 7.40
d. 8.40
e. 6.50 (c)
13. A consumer affairs agency wants to check the average weight of a new product on the market. A
random sample of 25 items of the product was taken and the weights (in grams) of these items were
classified as follows:
a. 83.00
b. 75.00
c. 83.75
d. 18.75
e. 84.50 (c)
14. A random sample of 40 smoking people is classified in the following table:
Ages Frequency
10 - 20 4
20 - 30 6
30 - 40 12
40 - 50 10
50 - 60 8
Total 40
a. 4.5
b. 8.0
c. 34.5
d. 38.0
e. 1520.0 (not available)
15. A frequency distribution of weekly wages for a group of employees is given below:
a. $112.50
b. $125.00
c. $105.41
d. $117.13
e. $118.50 (not available)
16. Consider the following cumulative relative frequency distribution:
Less than
or equal to Cum. rel. freq.
5.0 0.23
10.0 0.34
15.0 0.41
20.0 1.00
If this distribution is based on 800 observations, then the frequency in the second interval is:
a. 34
b. 272
c. 80
d. 88
e. 456 (d)
Class Frequency
0-5 8
5 -10 2
10-15 6
15-20 8
20-25 5
25-30 5
30-35 0
35-40 1
Number of Number of
Bacteria Samples
20-30 5
30-40 20
40-50 15
50-60 5
The 80th percentile is approximately:
a. 45
b. 47
c. 80
d. 48
e. 36 (b)
21. Recently, the City of Winnipeg has been criticized for its excessive discharges of untreated sewage
into the Red River. A microbiologist takes 50 samples of water downstream from the treated sewage
outlet and measures the number of coliform bacteria present. A summary table is as follows:
Number of Number of
Bacteria Samples
50-60 5
60-70 20
70-80 10
80-90 10
a. 70
b. 71
c. 66
d. 76
e. 65 (c)
22. Using the same data as in the previous question, the 75th percentile is approximately:
a. 76.5
b. 77.5
c. 75.0
d. 78.5
e. 78.0 (b)
23. A sample of 99 distances has a mean of 24 feet and a median of 24.5 feet. Unfortunately, it has just
been discovered that an observation which was erroneously recorded as "30" actually had a value of
"35". If we make this correction to the data, then:
a. the mean remains the same, but the median is increased
b. the mean and median remain the same
c. the median remains the same, but the mean is increased
d. the mean and median are both increased
e. we do not know how the mean and median are affected without further calculations; but the
variance is increased. (c)
24. The term test scores of 15 students enrolled in a Business Statistics class were recorded in ascending
order as follows:
4, 7, 7, 9, 10, 11, 13, 15, 15, 15, 17, 17, 19, 19, 20
After calculating the mean, median, and mode, an error is discovered: one of the 15's is really a 17.
The measures of central tendency which will change are:
where L indicates that the earthquake had an intensity below 4.0 and a H indicates that the
earthquake had an intensity above 9.0. The median earthquake intensity of the sample is:
where L indicates that the earthquake had an intensity below 4.0 and a H indicates that the
earthquake had an intensity above 9.0. One measure of central tendency is the x% trimmed mean
computed after trimming x% of the upper values and x% of the bottom values. The value of the 20%
trimmed mean is:
The n's represent children who had not put the jacket on after 120 seconds (in which case the
children were allowed to stop). Which of the following would be the best value to use as the
"typical" time required to put on the jacket?
Group A Group B
sample size 45 30
sample mean 1000 lbs 800 lbs
sample std. dev 80 lbs 70 lbs
a. Group A is less variable than Group B because Group A's std. deviation is larger.
b. Group A is relatively less variable than Group B because Group A's coefficient of variation
(the ratio of the standard deviation to the mean) is smaller
c. Group A is less variable than Group B because the std deviation per animal is smaller.
d. Group A is relatively more variable than Group B since the sample mean is larger.
e. Group A is more variable than Group B since the sample size is larger. (b)
32. "Normal" body temperature varies by time of day. A series of readings was taken of the body
temperature of a subject. The mean reading was found to be 36.5°C with a standard deviation of
0.3°C. When converted to °F, the mean and standard deviation are: (°F = °C(1.8) + 32).
a. 97.7, 32
b. 97.7, 0.30
c. 97.7, 0.54
d. 97.7, 0.97
e. 97.7, 1.80 (c)
33. A scientist is weighing each of 30 fish. She obtains a mean of 30 g and a standard deviation of 2 g.
After completing the weighing, she finds that the scale was misaligned, and always under reported
every weight by 2 g, i.e. a fish that really weighed 26 g was reported to weigh 24 g. What is mean
and standard deviation after correcting for the error in the scale? [Hint: recall that the mean measures
central tendency and the standard deviation measures spread.]
a. 28 g, 2 g
b. 30 g, 4 g
c. 32 g, 2 g
d. 32 g, 4 g
e. 28 g, 4 g (c)
34. A researcher wishes to calculate the average height of patients suffering from a particular disease.
From patient records, the mean was computed as 156 cm, and standard deviation as 5 cm. Further
investigation reveals that the scale was misaligned, and that all readings are 2 cm too large, e.g., a
patient whose height is really 180 cm was measured as 182 cm. Furthermore, the researcher would
like to work with statistics based on metres. The correct mean and standard deviation are:
a. 1.56m, .05m
b. 1.54m, .05m
c. 1.56m, .03m
d. 1.58m, .05m
e. 1.58m, .07m (b)
35. Rainwater was collected in water collectors at thirty different sites near an industrial basin and the
amount of acidity (pH level) was measured. The mean and standard deviation of the values are 4.60
and 1.10 respectively. When the pH meter was recalibrated back at the laboratory, it was found to be
in error. The error can be corrected by adding 0.1 pH units to all of the values and then multiplying the
result by 1.2. The mean and standard deviation of the corrected pH measurements are:
a. 5.64, 1.44
b. 5.64, 1.32
c. 5.40, 1.44
d. 5.40, 1.32
e. 5.64, 1.20 (b)
36. Which of the following statements is NOT true?
a. In a symmetric distribution, the mean and the median are equal.
b. The first quartile is equal to the twenty-fifth percentile.
c. In a symmetric distribution, the median is halfway between the first and the third quartiles.
d. The median is always greater than the mean. (d)
e. The range is the difference between the largest and the smallest observations in the data set.
37. An experiment was conducted where a person's heart rate was measured 4 times in the space of 10
minutes. This was repeated on a sample of 20 people. Which of the following is not correct?
a. The standard deviation within subjects refers to the repeated measurements of a single
person's heart rate.
b. The standard deviation among subjects refers to the variation in heart rates among different
people.
c. The variation among subjects was larger than the variation within subjects.
d. The variation in heart rates based on measurements taken for 30 seconds was larger than the
variation of heart rates based on measurements taken for 15 seconds.
e. The average of the heart rate computed from the 15 seconds measuring period was about the
same as the average of the heart rates computed from the 30 second measurement periods.
(d)
38. Here is a summary graph of complex carbohydrates for each of the three fibre groups in the cereal
dataset.
Presentation of Data
The process of gathering data often results in a massive volume of statistical data in the form of individual
measurements or counts. It is difficult to learn anything by examining unorganized data, which are more often confusing than
clarifying. The mass of data must therefore be organized and condensed into a form that can be more rapidly and easily understood
and interpreted. For this purpose, techniques of classification and graphic display are used. The techniques for the presentation of data
are
• Individual Item Data
• Classification
• Tabulation
• Frequency Distribution
• Stem and Leaf Display
• Graphical Presentation
• Diagrams
• Graphs
Individual Item Data
In individual item data, the observations are listed individually.
e.g. Let X represent the marks of a student out of 100 in Statistics:
Student Name   A    B    C    D    E    F    G
X              86   76   45   78   90   65   55
This is an example of individual item data.
Frequency Distribution:
$$\bar{X} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{\sum X}{n}$$ (if the data set is individual item data)
If the data form a discrete frequency distribution or grouped data, then the A.M. is given by
$$\bar{X} = \frac{f_1 x_1 + f_2 x_2 + f_3 x_3 + \dots + f_n x_n}{f_1 + f_2 + f_3 + \dots + f_n} = \frac{\sum_{i=1}^{n} f_i x_i}{\sum_{i=1}^{n} f_i} = \frac{\sum fX}{\sum f}$$
The arithmetic mean is the sum of all the values divided by the total number of values.
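As a minimal sketch of the two formulas (sum of X over n for individual items, sum of fX over sum of f for frequency data), the following functions may help; their names are illustrative and not part of the original notes. The data are sets i and iii of Example # 1.

```python
def arithmetic_mean(values):
    """Direct method for individual item data: sum of values / number of values."""
    values = list(values)
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


def frequency_mean(xs, fs):
    """Direct method for a frequency distribution: sum(f*x) / sum(f)."""
    total_f = sum(fs)
    if total_f == 0:
        raise ValueError("total frequency must be positive")
    return sum(f * x for x, f in zip(xs, fs)) / total_f


mean_i = arithmetic_mean([5, 6, 8, 7, 9, 12, 15, 14, 17])
mean_iii = frequency_mean([4, 6, 8, 12, 15, 16, 18, 19], [5, 6, 12, 8, 10, 9, 8, 5])
print(round(mean_i, 3), round(mean_iii, 5))
```

For grouped data, the same `frequency_mean` call applies once each class is represented by its midpoint.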
Example # 1:
Calculate the arithmetic mean of the following data by
1. Direct Method
2. Short-cut Method
i. X: 5, 6, 8, 7, 9, 12, 15, 14, 17
ii. Y: 2.3, 1.2, 1.5, 1.9, 2.5, 3.2, 1.6, 2.6, 1.8, 2.6
iii.
X   4   6   8    12   15   16   18   19
f   5   6   12   8    10   9    8    5
iv.
X   1.3   1.8   2.8   3.12   5.5   4.16   3.18   4.19
f   21    25    36    24     39    45     12     21
v.
a.                 b.
Group    f         Group      f
0-10     5         20-40      10
10-20    12        41-60      25
20-30    14        61-80      24
30-40    13        81-95      34
40-50    16        96-102     56
50-60    10        103-125    24
60-70    11        126-145    15
70-80    11        146-163    15
80-90    12        164-170    13
Solution:
1. Direct Method:
i. X: 5, 6, 8, 7, 9, 12, 15, 14, 17
n = 9.
$\sum X = 5+6+8+7+9+12+15+14+17 = 93$.
$\bar{X} = \frac{\sum X}{n} = \frac{93}{9} = 10.333$
ii. Y: 2.3, 1.2, 1.5, 1.9, 2.5, 3.2, 1.6, 2.6, 1.8, 2.6
n = 10.
$\sum Y = 2.3+1.2+1.5+1.9+2.5+3.2+1.6+2.6+1.8+2.6 = 21.2$
$\bar{Y} = \frac{\sum Y}{n} = \frac{21.2}{10} = 2.12$
iii.
X       f     fX
4       5     20
6       6     36
8       12    96
12      8     96
15      10    150
16      9     144
18      8     144
19      5     95
Total   63    781
$\sum fX = 781$, $\sum f = 63$
$\bar{X} = \frac{\sum fX}{\sum f} = \frac{781}{63} = 12.39683$ Ans.
iv.
X       f     fX
1.30    21    27.30
1.80    25    45.00
2.80    36    100.80
3.12    24    74.88
5.50    39    214.50
4.16    45    187.20
3.18    12    38.16
4.19    21    87.99
Total   223   775.83
$\sum fX = 775.83$, $\sum f = 223$
$\bar{X} = \frac{\sum fX}{\sum f} = \frac{775.83}{223} = 3.4791$
v. b.
Group      f     X       fX
20-40      10    30      300
41-60      25    50.5    1262.5
61-80      24    70.5    1692
81-95      34    88      2992
96-102     56    99      5544
103-125    24    114     2736
126-145    15    135.5   2032.5
146-163    15    154.5   2317.5
164-170    13    167     2171
Total      216           21047.5
$\sum fX = 21047.5$, $\sum f = 216$
$\bar{X} = \frac{\sum fX}{\sum f} = \frac{21047.5}{216} = 97.44$
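2. Short-cut Method (D = X - a, where a is an assumed mean):
i. a = 10
$\bar{X} = a + \frac{\sum D}{n} = 10 + \frac{3}{9} = 10.333$
ii. a = 2
$\bar{Y} = a + \frac{\sum D}{n} = 2 + \frac{1.2}{10} = 2.12$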
iii) a = 12
X       f     D = X - 12    fD
4       5     -8            -40
6       6     -6            -36
8       12    -4            -48
12      8     0             0
15      10    3             30
16      9     4             36
18      8     6             48
19      5     7             35
Total   63                  25
$\bar{X} = a + \frac{\sum fD}{\sum f} = 12 + \frac{25}{63} = 12.39683$
iv) a = 3
X       f     D = X - 3     fD
1.30    21    -1.70         -35.70
1.80    25    -1.20         -30.00
2.80    36    -0.20         -7.20
3.12    24    0.12          2.88
5.50    39    2.50          97.50
4.16    45    1.16          52.20
3.18    12    0.18          2.16
4.19    21    1.19          24.99
Total   223                 106.83
$\bar{X} = a + \frac{\sum fD}{\sum f} = 3 + \frac{106.83}{223} = 3.4791$
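v) a. a = 40
$\bar{X} = a + \frac{\sum fD}{\sum f} = 40 + \frac{680}{104} = 46.5385$
Using step deviations $u = \frac{X - a}{h}$ with class width h = 10:
$\bar{X} = a + \frac{\sum fu}{\sum f} \times h = 40 + \frac{68}{104} \times 10 = 46.5385$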
v) b. a = 95
Group      f     X       D = X - 95    fD
20-40      10    30      -65           -650
41-60      25    50.5    -44.5         -1112.5
61-80      24    70.5    -24.5         -588
81-95      34    88      -7            -238
96-102     56    99      4             224
103-125    24    114     19            456
126-145    15    135.5   40.5          607.5
146-163    15    154.5   59.5          892.5
164-170    13    167     72            936
Total      216           54            527.5
$\bar{X} = a + \frac{\sum fD}{\sum f} = 95 + \frac{527.5}{216} = 97.44$
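The short-cut and step-deviation calculations can be checked with a short sketch. The function names are illustrative; `a` is the assumed mean and `h` the class width, as in the worked solutions, and grouped data are represented by class midpoints.

```python
def shortcut_mean(xs, fs, a):
    """Short-cut method: a + sum(f*D) / sum(f), with D = x - a."""
    total_f = sum(fs)
    sum_fd = sum(f * (x - a) for x, f in zip(xs, fs))
    return a + sum_fd / total_f


def step_deviation_mean(xs, fs, a, h):
    """Step-deviation method: a + (sum(f*u) / sum(f)) * h, with u = (x - a) / h."""
    total_f = sum(fs)
    sum_fu = sum(f * (x - a) / h for x, f in zip(xs, fs))
    return a + (sum_fu / total_f) * h


# Data set iii, assumed mean a = 12
xs_iii = [4, 6, 8, 12, 15, 16, 18, 19]
fs_iii = [5, 6, 12, 8, 10, 9, 8, 5]
mean_iii = shortcut_mean(xs_iii, fs_iii, a=12)

# Data set v.a: midpoints of the classes 0-10, 10-20, ..., 80-90
mid_va = [5, 15, 25, 35, 45, 55, 65, 75, 85]
fs_va = [5, 12, 14, 13, 16, 10, 11, 11, 12]
mean_va = step_deviation_mean(mid_va, fs_va, a=40, h=10)
print(round(mean_iii, 5), round(mean_va, 4))
```

Whatever assumed mean is chosen, the result agrees with the direct method; the assumed mean only keeps the intermediate numbers small for hand calculation.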
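Considering
$$\sum (X - a)^2 = \sum [(X - \bar{X}) + (\bar{X} - a)]^2$$
$$= \sum [(X - \bar{X})^2 + (\bar{X} - a)^2 + 2(\bar{X} - a)(X - \bar{X})]$$
$$= \sum (X - \bar{X})^2 + n(\bar{X} - a)^2 + 2(\bar{X} - a)\sum (X - \bar{X})$$
$$= \sum (X - \bar{X})^2 + n(\bar{X} - a)^2, \quad \text{since } \sum (X - \bar{X}) = 0$$
$$\therefore \sum (X - \bar{X})^2 \le \sum (X - a)^2$$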
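If k groups containing $n_1, n_2, n_3, \dots, n_k$ observations have means $\bar{X}_1, \bar{X}_2, \bar{X}_3, \dots, \bar{X}_k$, then $\bar{X}_c$, the mean for all the data, is given by
$$\bar{X}_c = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3 + \dots + n_k\bar{X}_k}{n_1 + n_2 + n_3 + \dots + n_k} = \frac{\sum n_i \bar{X}_i}{\sum n_i} \quad (i = 1, 2, 3, \dots, k)$$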
Question No. 1
If a sample of 22 items has a mean of 15 and another sample of 18 items has a mean of
20, find the mean of the combined sample.
$n_1 = 22$, $\bar{X}_1 = 15$
$n_2 = 18$, $\bar{X}_2 = 20$
$$\bar{X}_c = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2} = \frac{22(15) + 18(20)}{22 + 18} = \frac{690}{40} = 17.25$$
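The combined-mean calculation in Question No. 1 can be sketched as a size-weighted average of the group means. The function name `combined_mean` is illustrative.

```python
def combined_mean(sizes, means):
    """Mean of the pooled data: sum(n_i * mean_i) / sum(n_i)."""
    if len(sizes) != len(means):
        raise ValueError("sizes and means must have the same length")
    total_n = sum(sizes)
    if total_n == 0:
        raise ValueError("total sample size must be positive")
    return sum(n * m for n, m in zip(sizes, means)) / total_n


result = combined_mean([22, 18], [15, 20])
print(result)
```

Note that the result is not the simple average of 15 and 20, because the larger sample pulls the combined mean towards its own mean.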
Question No.2
The average salary of male employees in a firm was Rs. 520 and that of females was Rs. 420. The mean salary
6. If $Y = aX + b$, where a and b are any two numbers and $a \neq 0$, then
$$\bar{Y} = a\bar{X} + b$$
Proof: Considering $Y = aX + b$ and summing over all values,
$$\sum Y = \sum (aX + b) = a\sum X + nb$$
Dividing by n,
$$\frac{\sum Y}{n} = a\frac{\sum X}{n} + \frac{nb}{n}$$
$$\bar{Y} = a\bar{X} + b$$
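A quick numerical check of this property, as a sketch: the values of X are data set i from Example # 1, and the constants a = 1.8 and b = 32 are chosen only for illustration (they are the Celsius-to-Fahrenheit constants quoted earlier in the notes).

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


xs = [5, 6, 8, 7, 9, 12, 15, 14, 17]
a, b = 1.8, 32

ys = [a * x + b for x in xs]      # transform every observation
mean_of_transformed = mean(ys)    # mean of Y computed directly
transformed_mean = a * mean(xs) + b  # a * mean(X) + b
print(round(mean_of_transformed, 4), round(transformed_mean, 4))
```

The two printed values coincide up to floating-point rounding, so the mean only needs to be computed once when data are re-expressed in new units.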
The Geometric Mean: G.M. or G.
The geometric mean, G, of a set of n positive values $x_1, x_2, x_3, \dots, x_n$ is defined as the positive nth root of their product,
i.e. $$G.M. = \sqrt[n]{x_1 x_2 x_3 \cdots x_n} = (x_1 x_2 x_3 \cdots x_n)^{1/n}, \quad \text{where } x_i > 0$$
Example: Find the geometric mean of 12, 13, 15, 16, 17, 20
Solution: $G.M. = (12 \times 13 \times 15 \times 16 \times 17 \times 20)^{1/6} = 15.28043764$
When n is large, the computation of the geometric mean becomes laborious, as we have to multiply all the values and then
extract the nth root. The arithmetic is simplified by using logarithms to the base 10. Thus, taking logarithms, we get
$$\log G = \frac{1}{n}[\log x_1 + \log x_2 + \dots + \log x_n] = \frac{1}{n}\sum \log x_i = \frac{1}{n}\sum \log X$$
$$G = \text{antilog}\left[\frac{1}{n}\sum \log X\right]$$
It means the geometric mean is the anti-logarithm of the arithmetic mean of the logarithms of the values themselves.
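The logarithm route can be sketched as follows. The function name is illustrative, and `math.log10` plays the role of the log tables; the data are those of the six-value example above.

```python
import math


def geometric_mean_by_logs(values):
    """Antilog (base 10) of the arithmetic mean of the base-10 logarithms."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("the geometric mean needs at least one value, all positive")
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log


data = [12, 13, 15, 16, 17, 20]
g_logs = geometric_mean_by_logs(data)
g_root = math.prod(data) ** (1 / len(data))  # direct nth root of the product
print(round(g_logs, 4), round(g_root, 4))
```

Both routes give the same value here; the logarithmic form is preferable for long series because the running product can become extremely large or small.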
For data organized into a grouped frequency distribution having k classes with class marks $x_1, x_2, x_3, \dots, x_k$ and the corresponding
frequencies $f_1, f_2, f_3, \dots, f_k$, the formula for the geometric mean is given by
$$\log G = \frac{1}{n}[\log x_1^{f_1} + \log x_2^{f_2} + \dots + \log x_k^{f_k}] = \frac{1}{n}[f_1 \log x_1 + f_2 \log x_2 + \dots + f_k \log x_k] = \frac{1}{n}\sum f_i \log x_i$$
$$G = \text{antilog}\left[\frac{1}{n}\sum f \log X\right], \quad \text{where } n = \sum f$$
Example: The Birch Company, a manufacturer of electrical circuit boards, has manufactured the following number of
units over the past five years:
1992 1993 1994 1995 1996
12,500 13,250 14,310 15,741 17,630
Solution:
X        log X
12500    4.0969
13250    4.1222
14310    4.1556
15741    4.1970
17630    4.2463
Total    20.8180
$$G = \text{antilog}\left[\frac{1}{n}\sum \log X\right] = \text{antilog}\left[\frac{20.81805}{5}\right] = 14575.0482$$
Example: Calculate the Geometric Mean from the following data
a.              b.                  c.
X       f       Group      f        Group      f
1250    20      120-141    45       170-174    35
1360    23      141-161    54       175-179    53
1564    21      161-181    47       180-184    54
1235    24      181-196    48       185-189    65
1587    25      196-203    52       190-194    120
1689    26      203-226    35       195-199    85
1798    28      226-246    42       200-204    52
1598    15      246-264    32       205-209    40
1756    19      264-270    30       210-214    25
Solution:
a.
X       f     log X     f log X
1250    20    3.0969    61.9382
1360    23    3.1335    72.07139
1564    21    3.1942    67.07897
1235    24    3.0917    74.20001
1587    25    3.2006    80.01442
1689    26    3.2276    83.91837
1798    28    3.2548    91.13411
1598    15    3.2036    48.05365
1756    19    3.2445    61.64597
Total   201             640.0551
$$G = \text{antilog}\left[\frac{1}{n}\sum f \log X\right] = \text{antilog}\left[\frac{640.0551}{201}\right] = 1528.8107$$
b.
Group      f     X         log X     f log X
120-141    45    130.50    2.1156    95.2025
141-161    54    151.00    2.1790    117.6648
161-181    47    171.00    2.2330    104.9508
181-196    48    188.50    2.2753    109.2149
196-203    52    199.50    2.2999    119.5970
203-226    35    214.50    2.3314    81.5999
226-246    42    236.00    2.3729    99.6623
246-264    32    255.00    2.4065    77.0093
264-270    30    267.00    2.4265    72.7953
Total      385                       877.6969
$$G = \text{antilog}\left[\frac{1}{n}\sum f \log X\right] = \text{antilog}\left[\frac{877.6969}{385}\right] = 190.43$$
c.
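$$H.M. = H = \text{Reciprocal of } \frac{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \dots + \frac{1}{x_n}}{n} = \frac{n}{\sum \frac{1}{x_i}} = \frac{n}{\sum \frac{1}{X}}$$
Harmonic mean for the frequency distribution is given by
$$H.M. = \frac{\sum f}{\sum \frac{f}{X}}$$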
Example: An automobile is running at the rate of 10 Km/hr during the first 60 Km; at 20 Km/hr during second 60 Km; 30
Km/hr during the third 60 Km; 40 Km/hr during the fourth 60 Km and 50 Km/hr during the last 60 Km. What would be the
average speed?
Solution:
X        1/X
10       0.1000
20       0.0500
30       0.0333
40       0.0250
50       0.0200
Total    0.2283
$$H.M. = \frac{n}{\sum \frac{1}{x_i}} = \frac{5}{0.2283} = 21.90 \text{ Km/hr}$$
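The average-speed example can be cross-checked in a short sketch: since every stage covers the same 60 Km, the harmonic mean of the speeds should equal total distance divided by total time. The function name is illustrative.

```python
def harmonic_mean(values):
    """n divided by the sum of the reciprocals of the values."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("the harmonic mean needs at least one value, all positive")
    return len(values) / sum(1 / v for v in values)


speeds = [10, 20, 30, 40, 50]   # Km/hr on each 60 Km stage
stage_km = 60

hm = harmonic_mean(speeds)
total_time = sum(stage_km / s for s in speeds)     # hours: 6 + 3 + 2 + 1.5 + 1.2
average_speed = stage_km * len(speeds) / total_time
print(round(hm, 2), round(average_speed, 2))
```

Carrying full precision in the reciprocals gives about 21.90 Km/hr; rounding each reciprocal to two decimals before adding them would understate the result.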
Example: Calculate the Harmonic Mean from the following data
a.              b.                  c.
X       f       Group      f        Group      f
1250    20      120-141    45       170-174    35
1360    23      141-161    54       175-179    53
1564    21      161-181    47       180-184    54
1235    24      181-196    48       185-189    65
1587    25      196-203    52       190-194    120
1689    26      203-226    35       195-199    85
1798    28      226-246    42       200-204    52
1598    15      246-264    32       205-209    40
1756    19      264-270    30       210-214    25
Solution:
a.
X       f     1/X         f/X
1250    20    0.000800    0.016
1360    23    0.000735    0.016912
1564    21    0.000639    0.013427
1235    24    0.000810    0.019433
1587    25    0.000630    0.015753
1689    26    0.000592    0.015394
1798    28    0.000556    0.015573
1598    15    0.000626    0.009387
1756    19    0.000569    0.01082
Total   201               0.132698
$$H.M. = \frac{\sum f}{\sum \frac{f}{X}} = \frac{201}{0.132698} = 1514.7176$$
c.
Group      f      X      1/X         f/X
170-174    35     172    0.005814    0.203488
175-179    53     177    0.005650    0.299435
180-184    54     182    0.005495    0.296703
185-189    65     187    0.005348    0.347594
190-194    120    192    0.005208    0.625000
195-199    85     197    0.005076    0.431472
200-204    52     202    0.004950    0.257426
205-209    40     207    0.004831    0.193237
210-214    25     212    0.004717    0.117925
Total      529                       2.772279
$$H.M. = \frac{\sum f}{\sum \frac{f}{X}} = \frac{529}{2.772279} = 190.8177$$
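The frequency form of the harmonic mean can be sketched in the same way. The function name is illustrative; the first call uses the discrete values 1250 to 1756 with their frequencies, and the second uses the midpoints 172, 177, ..., 212 of the classes 170-174 to 210-214.

```python
def grouped_harmonic_mean(xs, fs):
    """sum(f) / sum(f / x); xs are values or class midpoints."""
    if any(x <= 0 for x in xs):
        raise ValueError("all values or class midpoints must be positive")
    return sum(fs) / sum(f / x for x, f in zip(xs, fs))


xs_a = [1250, 1360, 1564, 1235, 1587, 1689, 1798, 1598, 1756]
fs_a = [20, 23, 21, 24, 25, 26, 28, 15, 19]
hm_a = grouped_harmonic_mean(xs_a, fs_a)

mid_c = [172, 177, 182, 187, 192, 197, 202, 207, 212]
fs_c = [35, 53, 54, 65, 120, 85, 52, 40, 25]
hm_c = grouped_harmonic_mean(mid_c, fs_c)
print(round(hm_a, 2), round(hm_c, 2))
```

For data set a the harmonic mean (about 1514.7) is smaller than the geometric mean (about 1528.8), as expected for positive values that are not all equal.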
Exercise # 1
Q.1 The deviation of a data about x = 22 are 0, 2, -3, -4, 6, 8 –1, 3, 0.
Q.2 A computer calculated a mean value of 42 from 20 observations. It was later discovered at the time of checking that he
had copied down two values as 45 and 38, whereas the correct values were 35 and 58. Find correct value of mean.
Q.3 The following table shows the diameters of rivets manufactured by a company.
Q.4 Find the arithmetic mean and geometric mean of the series 1, 2, 4, 8, 16, . . . , 2^n. Find also the harmonic mean.
Q.5 Find (i) arithmetic mean, (ii) geometric mean, and (iii) harmonic mean of the series 1, 3, 9, 27, 81, . . . , 3^n.
Q.6 The following data relate to sizes of shoes sold at a store during a given week. Find the median shoe size. Also
calculate the quartiles, the 7th decile and the 64th percentile.
Size of Shoes   5   5½   6    6½   7    7½   8    8½   9   9½
No. of pairs    2   5    15   30   60   40   23   11   4   1
Q.7 In a group of 500 wage-earners, the weekly wages of 4% were under Rs. 60 and those of 15% were under Rs. 62.50. 15%
of the workers earned Rs. 95 and over, and 5% of them got Rs. 100 and over.
The median and quartile wages were Rs. 82.25, Rs. 72.75 and Rs. 90.50; the fourth and sixth decile wages were Rs.
78.75 and Rs. 85.25 respectively.
Put the above information in the form of a frequency distribution and estimate the mean wage of the 500 wage-earners
there from.
Q.8 The following table shows the distribution of the maximum loads in short tons supported by certain cables produced by a
company.
Max. Loads (Short tons)   9.8-10.2   10.3-10.7   10.8-11.2   11.3-11.7   11.8-12.2   12.3-12.7
No. of cables             7          12          17          14          6           4
Determine the mean, the median, and the mode.
Q.9 A professor has decided to use a weighted average in figuring grades for his seminar students. The homework average
will count for 20 percent of a student’s grade; the midterm, 25 percent; the final, 35 percent; the term paper, 10 percent;
and quizzes, 10 percent. From the following data, compute the final average for the five students in the seminar.
1 85 89
2 78 84
3 94 88
4 82 79
5 95 90
Q.11 The growth in bad-debt expense for Johnston Office Company over the last few years follows. Calculate the average
percentage increase in bad-debt expense over this time period. If this rate continues, estimate the percentage increase in
bad-debts for 1997, relative to 1995.
Hayes Textile has shown the following percentage increase in net worth over the last 5 years:
Q.13 A test “Stein Strength Test in Pounds” is made and the readings are as given below:
Construct discrete frequency distribution and also compute A.M. by short-cut method
Stein Strength Test in Pounds
Q.14
Circle the correct answer or fill in the blank.
69. The value of every observation in the data set is taken into account when we calculate its median. T F
70. When the population is either negatively or positively skewed, it is often preferable to use the median as the best measure of
location because it always lies between the mean and the mode. T F
71. Measures of central tendency in a data set refer to the extent to which the observations are scattered. T F
72. A measure of peakedness of a distribution curve is its skewness. T F
73. With ungrouped data, the mode is most frequently used as the measure of central tendency. T F
74. If we arrange the observations in a data set from highest to lowest, the data point lying in the middle of the data set is the median. T F
75. When working with grouped data, we may compute an approximate mean by assuming that each value in a given class is
equal to its midpoint. T F
76. The value most often repeated in a data set is called the arithmetic mean. T F
77. If the curve of a certain distribution tails off toward the left end of the measuring scale on the horizontal axis, the
distribution is said to be negatively skewed. T F
78. After grouping a set of data into a number of classes, we may identify the median class as being the one that has the largest
number of observations. T F
79. A mean calculated from grouped data always gives a good estimate of the true value, although it is seldom exact. T F
80. We can compute a mean for any data set once we are given its frequency distribution. T F
81. The mode is always found at the highest point of a graph of a data distribution. T F
82. The number of elements in a population is defined by n. T F
83. For a data array with 50 observations, the median will be the value of the 25th observation in the array. T F
84. Extreme values in a data set have a strong effect on the median. T F
85. The difference between the largest and smallest observations in a data set is called the geometric mean. T F
86. The dispersion of a data set gives insight into the reliability of the measure of central tendency. T F
87. The standard deviation is equal to the square root of the variance. T F
88. The difference between the highest and the smallest observations in a data set is called the geometric mean. T F
89. The interquartile range is based on only two values taken from the data set. T F
90. The standard deviation is measured in the same units as the observations in the data set. T F
91. A fractile is a location in a frequency distribution that a given proportion (or fraction) of the data lies at or above. T F
92. The variance, like the standard deviation, takes into account every observation in the data set. T F
93. The coefficient of variation is an absolute measure of dispersion. T F
94. The measure of dispersion most often used by statisticians is the standard deviation. T F
95. One of the advantages of dispersion measures is that any statistic that measures absolute variation also measures relative
variation. T F
96. One disadvantage of using the range to measure dispersion is that it ignores the nature of the variations among most of the
observations. T F
97. The variance indicates the average distance of any observation in the data set from the mean. T F
98. Every population has a variance, which is signified by $s^2$. T F
99. According to Chebyshev’s theorem, no more than 11 percent of the observations in a population can have population
standard scores greater than 3 or less than –3. T F
100. The interquartile range is a specific example of an interfractile range. T F
101. It is possible to measure the range of an open-ended distribution. T F
102. The interquartile range measures the average range of the lower fourth of a distribution. T F
103. When calculating the average rate of debt expansion for a company, the correct mean to use is the
a. Arithmetic mean
b. Weighted mean
c. Geometric mean
d. Either (a) or (c).
104. The mode has all the following disadvantages except
a. A data set may have no modal value.
b. Every value in a data set may be a mode.
c. A multimodal data set is difficult to analyze.
d. The mode is unduly affected by extreme values.
105. What is the major assumption we make when computing a mean from grouped data?
a. All values are discrete.
b. Every value in a class is equal to the midpoint.
c. No value occurs more than once.
d. Each class contains exactly the same number of values.
106. Which of the following statements is NOT correct?
a. Some data sets do not have means.
b. Calculation of a mean is affected by extreme data values.
c. A weighted mean should be used when it is necessary to take the importance of each value into account.
d. All these statements are correct.
107. Which of the following is the first step in calculating the median of a data set?
a. Average the middle two values of the data set.
b. Array the data.
c. Determine the relative weights of the data values in terms of importance.
d. None of these.
108. Which of the following is NOT an advantage of using a median?
a. Extreme values affect the median less strongly than they do the mean.
b. A median can be calculated for qualitative descriptions.
c. The median can be calculated for every set of data, even for sets containing open-ended classes.
d. The median is easy to understand.
e. All these are advantages of using a median.
109. Why is it usually better to calculate a mode from grouped, rather than ungrouped, data?
a. The ungrouped data tend to be bimodal.
b. The mode for the grouped data will be the same, regardless of the skewness of the distribution.
c. Extreme values have less effect on grouped data.
d. The chance of an unrepresentative value being chosen as the mode is reduced.
110. In which of these cases would the mode be most useful as an indicator of central tendency?
a. Every value in a data set occurs exactly once.
b. All but three values in a data set occur once; three values occur 100 times each.
c. All values in a data set occur 100 times each.
d. Every observation in a data set has the same value.
111. Which of the following is an example of a parameter?
a. $\bar{x}$.
b. n.
c. $\mu$.
d. All of these
e. (b) and (c), but not (a)
112. Which of the following is NOT a measure of central tendency.
a. Geometric mean
b. Median
c. Mode
d. Arithmetic mean
e. All these are measures of central tendency.
113. When a distribution is symmetrical and has one mode, the highest point on the curve is called the
a. Range
b. Mode
c. Median
d. Mean
e. All of these
f. (b), (c), and (d), but not (a)
114. When referring to a curve that tails off the left end, you would call it
a. Symmetrical
b. Skewed right
c. Positively skewed
d. All of these
e. None of these
115. Disadvantages of using the range as a measure of dispersion include all of the following except
a. It is heavily influenced by extreme values
b. It can change drastically from one sample to the next
c. It is difficult to calculate
d. It is determined by only two points in the data set.
116. Why is it necessary to square the differences from the mean when computing the population variance?
a. So that extreme values will not affect the calculation.
b. Because it is possible that N could be very small.
c. Some of the differences will be positive and some will be negative.
d. None of these
117. Assume that a population has $\mu = 100$ and $\sigma = 10$. If a particular observation has a standard score of 1, it can be
concluded that
a. Its value is 110
b. It lies between 90 and 110, but its exact value cannot be determined
c. Its value is greater than 110
d. Nothing can be determined without knowing N
118. Assume that a population has $\mu = 100$ and $\sigma = 10$, and N = 1,000. According to Chebyshev’s theorem, which of the
following situations is NOT possible?
a. 150 values are greater than 130.
b. 93 values lie between 100 and 108.
c. 22 values lie between 120 and 125.
d. 70 values are less than 90.
e. All these situations are possible.
119. Which of the following is an example of a relative measure of dispersion?
a. Standard deviation.
b. Variance.
c. Coefficient of variation.
d. All of these.
e. (a) and (b), but not (c).
120. Which of the following is true?
a. The variance can be calculated for grouped or ungrouped data.
b. The standard deviation can be calculated for grouped or ungrouped data.
c. The standard deviation can be calculated for grouped or ungrouped data, but the variance can be calculated only for
ungrouped data.
d. (a) and (b), but not (c).
121. If one were to divide the standard deviation of a population by the mean of the same population and multiply this value by
100, one would have calculated the
a. Population standard score.
b. Population variance.
c. Population standard deviation.
d. Population coefficient of variation.
e. None of these.
122. How does the computation of a sample variance differ from the computation of a population variance?
a. $\mu$ is replaced by $\bar{x}$.
b. N is replaced by n-1.
c. N is replaced by n.
d. (a) and (c), but not (b).
e. (a) and (b), but not (c).
123. The square of the variance of a distribution is the
a. Standard deviation.
b. Mean.
c. Range.
d. Absolute deviation.
e. (a) and (d).
f. None of these.
124. Chebyshev’s theorem says that 99 percent of the values will lie within 3 standard deviations from the mean for
a. Bell shaped distributions
b. Positively skewed distributions.
c. Left-tailed distributions.
d. All distributions.
e. No distributions.
125. If a curve can be divided into two equal parts that are mirror images, it is _______________________. If it cannot be
divided in this way, it is _______________________.
Measures of Dispersion:
Data are required to obtain the average dimensions and the degree of dispersion so that we can determine whether it is all right to
receive or ship the lot, and whether the production process used for manufacturing the lot was suitable or some action must be
taken.
Products from the same production line usually differ slightly in dimension, hardness or other qualities. If, after measuring ten
samples, they were all found to measure 10.0, 10.0, 10.0, . . . , 10.0, there would be cause for doubt. We would suspect that the
measuring instrument was wrong, or we might even wonder if they had ever been measured at all! We commute to work every
day, and even if we take the same route and the same vehicle we usually find that the trip takes longer on some days than on
others; to make the trip in exactly the same time every day would require a good deal of effort. In this way, when we look at a
certain amount of data we can detect some dispersion; we actually live in a world of dispersion.
A second important property that describes a set of numerical data is variation. Variation is the amount of dispersion or spread in
the data.
There are two types of measures of dispersion:
1. Absolute Measures of Dispersion
a. Range
b. Quartile Deviation
c. Mean Deviation
d. Standard Deviation
e. Variance
2. Relative Measures of Dispersion
a. Coefficient of Range
b. Coefficient of Quartile Deviation
c. Coefficient of Mean Deviation (U%)
d. Coefficient of Variation (C.V.%)
An absolute measure is expressed in the same unit as the data, whereas a relative measure is a unit-free ratio used to compare the dispersion of different data sets.
Range:
The range is the difference between the maximum and minimum values of the data set. Mathematically, the range is defined as
$$R = X_m - X_o$$
where $X_m$ is the largest value of the data set and $X_o$ is the smallest value of the data set.
$$\text{Coefficient of Range} = \frac{X_m - X_o}{X_m + X_o}$$
Quartile Deviation:
Quartile deviation is the semi-interquartile range, i.e.
$$Q.D. = \frac{Q_3 - Q_1}{2}$$
where $Q_3 - Q_1$ is the interquartile range.
$$\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$$
Mean Deviation:
Mean deviation is the average of the absolute deviations of the data from the A.M. Mathematically,
$$M.D. = \frac{\sum |X - \bar{X}|}{n}$$
for individual (ungrouped) data, and
$$M.D. = \frac{\sum f\,|X - \bar{X}|}{\sum f}$$
for a frequency distribution.
$$\text{Coefficient of Mean Deviation} = \frac{M.D.}{\bar{X}}$$
$$U\% = \frac{M.D.}{\bar{X}} \times 100$$
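The range and mean deviation formulas, together with their relative counterparts, can be sketched in Python for ungrouped data. The function name `range_and_mean_deviation` and the dictionary keys are illustrative choices rather than anything defined in these notes; the sample values are data set (i) of the worked example later in this section.

```python
from statistics import mean


def range_and_mean_deviation(values):
    """Range, mean deviation and their coefficients for ungrouped data."""
    x_max, x_min = max(values), min(values)
    x_bar = mean(values)
    # Mean deviation: average absolute deviation from the arithmetic mean.
    mean_dev = sum(abs(x - x_bar) for x in values) / len(values)
    return {
        "range": x_max - x_min,
        "coefficient_of_range": (x_max - x_min) / (x_max + x_min),
        "mean_deviation": mean_dev,
        "coefficient_of_mean_deviation": mean_dev / x_bar,
    }


data = [12, 15, 16, 18, 20, 32, 18, 19, 20, 22, 23]
result = range_and_mean_deviation(data)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Multiplying `coefficient_of_mean_deviation` by 100 gives U%. Small differences from hand calculations are expected when the hand calculation rounds intermediate values.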
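Q. Show that
i. $\sum (X - \bar{X})^2 = \sum X^2 - \dfrac{(\sum X)^2}{n}$
ii. $\sum (X - \bar{X})^2 = \sum D^2 - \dfrac{(\sum D)^2}{n}$, where $D = X - a$
iii. $\sum (X - \bar{X})^2 = \left[\sum u^2 - \dfrac{(\sum u)^2}{n}\right] h^2$, where $u = \dfrac{X - a}{h}$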
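Solution:
i.
$$\begin{aligned}
\text{L.H.S.} &= \sum (X - \bar{X})^2 \\
&= \sum \left(X^2 + \bar{X}^2 - 2X\bar{X}\right) \\
&= \sum X^2 + n\bar{X}^2 - 2\bar{X}\sum X \\
&= \sum X^2 + n\left(\frac{\sum X}{n}\right)^2 - 2\,\frac{\sum X}{n}\sum X \\
&= \sum X^2 + \frac{(\sum X)^2}{n} - 2\,\frac{(\sum X)^2}{n} \\
&= \sum X^2 - \frac{(\sum X)^2}{n} = \text{R.H.S.}
\end{aligned}$$
Similarly, for a frequency distribution,
$$\sum f (X - \bar{X})^2 = \sum fX^2 - \frac{(\sum fX)^2}{\sum f}$$
ii. Since $D = X - a$, we have $X = D + a$ and $\bar{X} = \bar{D} + a$.
$$\begin{aligned}
\text{L.H.S.} &= \sum (X - \bar{X})^2 \\
&= \sum \left[(D + a) - (\bar{D} + a)\right]^2 \\
&= \sum (D - \bar{D})^2 \\
&= \sum D^2 - \frac{(\sum D)^2}{n} = \text{R.H.S.} \quad \text{(by part i)}
\end{aligned}$$
iii. Since $u = \dfrac{X - a}{h}$ and $\bar{u} = \dfrac{\bar{X} - a}{h}$, we have $X = a + hu$ and $\bar{X} = a + h\bar{u}$.
$$\begin{aligned}
\text{L.H.S.} &= \sum (X - \bar{X})^2 \\
&= \sum \left[(a + hu) - (a + h\bar{u})\right]^2 \\
&= h^2 \sum (u - \bar{u})^2 \\
&= \left[\sum u^2 - \frac{(\sum u)^2}{n}\right] h^2 = \text{R.H.S.} \quad \text{(by part i)}
\end{aligned}$$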
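Standard Deviation:
Standard deviation is the positive square root of the average of the squared deviations about the A.M. Mathematically, for ungrouped data,
$$S.D. = S = \sqrt{\frac{\sum (X - \bar{X})^2}{n}} = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2} = \sqrt{\frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2}$$
where $D = X - a$.
For a frequency distribution,
$$S.D. = S = \sqrt{\frac{\sum f (X - \bar{X})^2}{\sum f}} = \sqrt{\frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2} = \sqrt{\frac{\sum fD^2}{\sum f} - \left(\frac{\sum fD}{\sum f}\right)^2} = h\sqrt{\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2}$$
where $u = \dfrac{X - a}{h}$.
$$C.V.\% = \text{Coefficient of variation} = \frac{S.D.}{\bar{X}} \times 100$$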
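Variance:
Variance is the square of the S.D. For ungrouped data,
$$\text{Variance} = Var(X) = S^2 = \frac{\sum (X - \bar{X})^2}{n} = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{\sum D^2}{n} - \left(\frac{\sum D}{n}\right)^2$$
For a frequency distribution,
$$\text{Variance} = Var(X) = S^2 = \frac{\sum f (X - \bar{X})^2}{\sum f} = \frac{\sum fX^2}{\sum f} - \left(\frac{\sum fX}{\sum f}\right)^2 = \frac{\sum fD^2}{\sum f} - \left(\frac{\sum fD}{\sum f}\right)^2 = h^2\left[\frac{\sum fu^2}{\sum f} - \left(\frac{\sum fu}{\sum f}\right)^2\right]$$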
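As a rough numerical check on the frequency-distribution formulas, the sketch below computes the variance once from the definition and once by the coding method with u = (X − a)/h. The function names and the choice a = 20, h = 2 are assumptions made for illustration; the values are data set (ii) of the example that follows.

```python
def grouped_variance(xs, freqs):
    """Variance from the definition: sum f (X - mean)^2 / sum f."""
    n = sum(freqs)
    x_bar = sum(f * x for x, f in zip(xs, freqs)) / n
    return sum(f * (x - x_bar) ** 2 for x, f in zip(xs, freqs)) / n


def grouped_variance_coded(xs, freqs, a, h):
    """Variance by the coding method, u = (X - a) / h."""
    n = sum(freqs)
    us = [(x - a) / h for x in xs]
    sum_fu = sum(f * u for u, f in zip(us, freqs))
    sum_fu2 = sum(f * u * u for u, f in zip(us, freqs))
    return h ** 2 * (sum_fu2 / n - (sum_fu / n) ** 2)


xs = [12, 14, 16, 18, 20, 22, 24, 26, 28]
freqs = [8, 9, 5, 10, 12, 8, 4, 3, 7]
var_direct = grouped_variance(xs, freqs)
var_coded = grouped_variance_coded(xs, freqs, a=20, h=2)  # a and h chosen arbitrarily
print(f"direct: {var_direct:.4f}, coded: {var_coded:.4f}, S.D.: {var_direct ** 0.5:.4f}")
```

Any other origin a and scale h should give the same variance, which is the point of identity iii above.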
Example:
Calculate the Range, Q.D., M.D., S.D. and $S^2$ from the following data, using all the methods discussed above.
Also calculate their respective relative measures.
i. X: 12, 15, 16, 18, 20, 32, 18, 19, 20, 22, 23
ii.
X 12 14 16 18 20 22 24 26 28
f 8 9 5 10 12 8 4 3 7
iii.
Diameter (in inches) Frequency
47 ⎯ 49 12
50 ⎯ 52 15
53 ⎯55 16
56⎯58 19
59⎯61 20
62⎯64 20
65⎯67 22
68⎯70 14
71⎯73 15
74⎯76 13
Solution:
Range:
i. X: 12, 15, 16, 18, 20, 32, 18, 19, 20, 22, 23
$X_m = 32$, $X_o = 12$, $R = X_m - X_o = 32 - 12 = 20$
ii.
X 12 14 16 18 20 22 24 26 28
f 8 9 5 10 12 8 4 3 7
$X_m = 28$, $X_o = 12$, $R = X_m - X_o = 28 - 12 = 16$
iii.
Diameter (in inches) Frequency X
47 ⎯ 49 12 48
50 ⎯ 52 15 51
53 ⎯ 55 16 54
56 ⎯ 58 19 57
59 ⎯ 61 20 60
62 ⎯ 64 20 63
65 ⎯ 67 22 66
68 ⎯ 70 14 69
71 ⎯ 73 15 72
74 ⎯ 76 13 75
Quartile Deviation:
ii.
X 12 14 16 18 20 22 24 26 28
f 8 9 5 10 12 8 4 3 7
c.f. 8 17 22 32 44 52 56 59 66
$n = \sum f = 66$
$Q_1 = \dfrac{n+1}{4}$th value $= \dfrac{66+1}{4}$th value = 16.75th value = 14
$Q_3 = \dfrac{3(n+1)}{4}$th value $= \dfrac{3(66+1)}{4}$th value = 50.25th value = 22
$Q.D. = \dfrac{Q_3 - Q_1}{2} = \dfrac{22 - 14}{2} = 4$
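The cumulative-frequency lookup used here can be sketched as follows. The helper name `value_at` is hypothetical; it returns the value whose cumulative frequency first reaches the requested position, which is how the 16.75th and 50.25th values are read off the c.f. row.

```python
from bisect import bisect_left
from itertools import accumulate


def value_at(xs, freqs, position):
    """Value of the position-th ordered observation (1-based, may be fractional)."""
    cumulative = list(accumulate(freqs))
    # First class whose cumulative frequency is at least `position`.
    return xs[bisect_left(cumulative, position)]


xs = [12, 14, 16, 18, 20, 22, 24, 26, 28]
freqs = [8, 9, 5, 10, 12, 8, 4, 3, 7]
n = sum(freqs)

q1 = value_at(xs, freqs, (n + 1) / 4)
q3 = value_at(xs, freqs, 3 * (n + 1) / 4)
qd = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)
print(f"n = {n}, Q1 = {q1}, Q3 = {q3}, Q.D. = {qd}, coefficient = {coeff_qd:.4f}")
```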
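iii.
Diameter (in inches) Frequency C.B. C.F.
47 ⎯ 49 12 46.5 ⎯ 49.5 12
50 ⎯ 52 15 49.5 ⎯ 52.5 27
53 ⎯ 55 16 52.5 ⎯ 55.5 43 Q1 group
56 ⎯ 58 19 55.5 ⎯ 58.5 62
59 ⎯ 61 20 58.5 ⎯ 61.5 82
62 ⎯ 64 20 61.5 ⎯ 64.5 102
65 ⎯ 67 22 64.5 ⎯ 67.5 124
68 ⎯ 70 14 67.5 ⎯ 70.5 138 Q3 group
71 ⎯ 73 15 70.5 ⎯ 73.5 153
74 ⎯ 76 13 73.5 ⎯ 76.5 166
Total 166
$Q_1 = l + \dfrac{h}{f}\left(\dfrac{n}{4} - c\right)$, with $\dfrac{n}{4} = \dfrac{166}{4} = 41.5$. The Q1 group is the first class whose cumulative frequency reaches 41.5, so $l = 52.5$, $h = 3$, $f = 16$, $c = 27$:
$Q_1 = 52.5 + \dfrac{3}{16}(41.5 - 27) = 55.22$
$Q_3 = l + \dfrac{h}{f}\left(\dfrac{3n}{4} - c\right)$, with $\dfrac{3n}{4} = 124.5$. The Q3 group is the first class whose cumulative frequency reaches 124.5, so $l = 67.5$, $h = 3$, $f = 14$, $c = 124$:
$Q_3 = 67.5 + \dfrac{3}{14}(124.5 - 124) = 67.61$
$Q.D. = \dfrac{Q_3 - Q_1}{2} = \dfrac{67.61 - 55.22}{2} = 6.195$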
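The interpolation formula l + (h/f)(target − c) for grouped data can be sketched as below. The function name `grouped_quantile` and the way the class boundaries are generated are assumptions for illustration; the quantile class is taken as the first class whose cumulative frequency reaches the target position.

```python
def grouped_quantile(boundaries, freqs, fraction):
    """Interpolated quantile for grouped data.

    boundaries: list of (lower, upper) class boundaries, in ascending order.
    fraction:   0.25 for Q1, 0.75 for Q3, and so on.
    """
    target = fraction * sum(freqs)
    cumulative = 0  # c: cumulative frequency before the current class
    for (lower, upper), f in zip(boundaries, freqs):
        if cumulative + f >= target:
            return lower + (upper - lower) / f * (target - cumulative)
        cumulative += f
    raise ValueError("fraction must lie between 0 and 1")


# Class boundaries 46.5-49.5, 49.5-52.5, ..., 73.5-76.5 (width h = 3).
boundaries = [(46.5 + 3 * i, 49.5 + 3 * i) for i in range(10)]
freqs = [12, 15, 16, 19, 20, 20, 22, 14, 15, 13]

q1 = grouped_quantile(boundaries, freqs, 0.25)
q3 = grouped_quantile(boundaries, freqs, 0.75)
qd = (q3 - q1) / 2
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, Q.D. = {qd:.3f}")
```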
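Mean Deviation:
i.
X $X - \bar{X}$ $|X - \bar{X}|$
12 -7.55 7.55
16 -3.55 3.55
20 0.45 0.45
18 -1.55 1.55
20 0.45 0.45
23 3.45 3.45
15 -4.55 4.55
18 -1.55 1.55
32 12.45 12.45
19 -0.55 0.55
22 2.45 2.45
Total 215 0.00 38.55
$n = 11$, $\bar{X} = \dfrac{215}{11} = 19.5455$, $\sum |X - \bar{X}| = 38.55$
$M.D. = \dfrac{\sum |X - \bar{X}|}{n} = \dfrac{38.55}{11} = 3.50$
Coefficient of M.D. $= \dfrac{M.D.}{\bar{X}} = \dfrac{3.50}{19.5455} = 0.1791$
$U\% = \dfrac{M.D.}{\bar{X}} \times 100 = 17.91\%$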
ii.
X f fX $X - \bar{X}$ $|X - \bar{X}|$ $f|X - \bar{X}|$
12 8 96 -7.21 7.21 57.70
14 9 126 -5.21 5.21 46.91
16 5 80 -3.21 3.21 16.06
18 10 180 -1.21 1.21 12.12
20 12 240 0.79 0.79 9.45
22 8 176 2.79 2.79 22.30
24 4 96 4.79 4.79 19.15
26 3 78 6.79 6.79 20.36
28 7 196 8.79 8.79 61.52
Total: $\sum f = 66$, $\sum fX = 1268$, $\sum f|X - \bar{X}| = 265.58$
$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{1268}{66} = 19.21$
$M.D. = \dfrac{\sum f|X - \bar{X}|}{\sum f} = \dfrac{265.58}{66} = 4.0239$
Coefficient of M.D. $= \dfrac{M.D.}{\bar{X}} = \dfrac{4.0239}{19.21} = 0.2094$
$U\% = \dfrac{M.D.}{\bar{X}} \times 100 = 20.94\%$
iii.
Diameter (in inches) Frequency X fX $X - \bar{X}$ $|X - \bar{X}|$ $f|X - \bar{X}|$
47 ⎯ 49 12 48 576 -13.5723 13.5723 162.8675
50 ⎯ 52 15 51 765 -10.5723 10.5723 158.5843
53 ⎯ 55 16 54 864 -7.5723 7.5723 121.1566
56 ⎯ 58 19 57 1083 -4.5723 4.5723 86.8735
59 ⎯ 61 20 60 1200 -1.5723 1.5723 31.4458
62 ⎯ 64 20 63 1260 1.4277 1.4277 28.5542
65 ⎯ 67 22 66 1452 4.4277 4.4277 97.4096
68 ⎯ 70 14 69 966 7.4277 7.4277 103.9880
71 ⎯ 73 15 72 1080 10.4277 10.4277 156.4157
74 ⎯ 76 13 75 975 13.4277 13.4277 174.5602
Total: $\sum f = 166$, $\sum fX = 10221$, $\sum f|X - \bar{X}| = 1121.855$
$\bar{X} = \dfrac{\sum fX}{\sum f} = \dfrac{10221}{166} = 61.5723$
$M.D. = \dfrac{\sum f|X - \bar{X}|}{\sum f} = \dfrac{1121.855}{166} = 6.7582$
Coefficient of M.D. $= \dfrac{M.D.}{\bar{X}} = \dfrac{6.7582}{61.5723} = 0.1098$
$U\% = \dfrac{M.D.}{\bar{X}} \times 100 = 10.98\%$
Standard Deviation:
i. Direct method
X $X - \bar{X}$ $(X - \bar{X})^2$
12 -7.55 57.0025
16 -3.55 12.6025
20 0.45 0.2025
18 -1.55 2.4025
20 0.45 0.2025
23 3.45 11.9025
15 -4.55 20.7025
18 -1.55 2.4025
32 12.45 155.0025
19 -0.55 0.3025
22 2.45 6.0025
Total 215 0 268.73
$S.D. = S = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}} = \sqrt{\dfrac{268.73}{11}} = 4.94$
$C.V. = \dfrac{S.D.}{\bar{X}} \times 100 = \dfrac{4.94}{19.5455} \times 100 = 25.27\%$
X-method
X $X^2$
12 144
16 256
20 400
18 324
20 400
23 529
15 225
18 324
32 1024
19 361
22 484
Total 215 4471
$\bar{X} = \dfrac{\sum X}{n} = \dfrac{215}{11} = 19.5455$
$S.D. = S = \sqrt{\dfrac{\sum X^2}{n} - \left(\dfrac{\sum X}{n}\right)^2} = \sqrt{\dfrac{4471}{11} - \left(\dfrac{215}{11}\right)^2} = 4.94$
D-method
X D = X - 18 $D^2$
12 -6 36
16 -2 4
20 2 4
18 0 0
20 2 4
23 5 25
15 -3 9
18 0 0
32 14 196
19 1 1
22 4 16
Total 17 295
$\bar{X} = a + \dfrac{\sum D}{n} = 18 + \dfrac{17}{11} = 19.5455$
$S.D. = S = \sqrt{\dfrac{\sum D^2}{n} - \left(\dfrac{\sum D}{n}\right)^2} = \sqrt{\dfrac{295}{11} - \left(\dfrac{17}{11}\right)^2} = 4.94$
$C.V. = \dfrac{S.D.}{\bar{X}} \times 100 = \dfrac{4.94}{19.5455} \times 100 = 25.27\%$
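The three routes to the standard deviation for data set (i), the definition, the X-method and the D-method with a = 18, can be compared in a short Python sketch. The function names are illustrative only, and the code works with unrounded intermediate values, so the last digits may differ slightly from a hand calculation.

```python
from math import sqrt


def sd_direct(xs):
    """Square root of the mean squared deviation from the arithmetic mean."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sqrt(sum((x - x_bar) ** 2 for x in xs) / n)


def sd_x_method(xs):
    """sqrt(sum X^2 / n - (sum X / n)^2)."""
    n = len(xs)
    return sqrt(sum(x * x for x in xs) / n - (sum(xs) / n) ** 2)


def sd_d_method(xs, a):
    """Same formula applied to the deviations D = X - a from an assumed origin a."""
    return sd_x_method([x - a for x in xs])


data = [12, 16, 20, 18, 20, 23, 15, 18, 32, 19, 22]
s_direct = sd_direct(data)
s_x = sd_x_method(data)
s_d = sd_d_method(data, a=18)
print(f"direct: {s_direct:.4f}, X-method: {s_x:.4f}, D-method: {s_d:.4f}")
```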
Exercise
Q.1 Find the Quartile Deviation from the following data (i) graphically, (ii) using an appropriate formula.
Income per week (Rs.)   41 − 50   51 − 60   61 − 70   71 − 80   81 − 90   91 − 100   Total
No. of Earners          30        36        43        104       73        14         300
The Main Characteristics of the Mode, the Median, and the Mean
Fact No. 1
The Mode: It is the most frequent value in the distribution; it is the point of greatest density.
The Median: It is the value of the middle point of the array (not midpoint of range), such that half the items are above and half below it.
The Mean: It is the value in a given aggregate, which would obtain if all the values were equal.
Fact No. 2
The Mode: The value of the mode is established by the predominant frequency, not by the value in the distribution.
The Median: The value of the median is fixed by its position in the array and doesn't reflect the individual value.
The Mean: The sums of deviations on either side of the mean are equal; hence, the algebraic sum of the deviations is equal to zero.
Fact No. 3
The Mode: It is the most probable value, hence the most typical.
The Median: The aggregate distance between the median point and all the values in the array is less than from any other point.
The Mean: It reflects the magnitude of every value.
Fact No. 4
The Mode: A distribution may have 2 or more modes. On the other hand, there is no mode in a rectangular distribution.
The Median: Each array has one and only one median.
The Mean: An array has one and only one mean.
Fact No. 5
The Mode: The mode does not reflect the degree of modality.
The Median: It cannot be manipulated algebraically: medians of subgroups cannot be weighted and combined.
The Mean: Means may be manipulated algebraically: means of subgroups may be combined when properly weighted.
Fact No. 6
The Mode: It cannot be manipulated algebraically: modes of subgroups cannot be combined.
The Median: It is stable in that grouping procedures do not affect it appreciably.
The Mean: It may be calculated even when individual values are unknown, provided the sum of the values and the sample size n are known.
Fact No. 7
The Mode: It is unstable in that it is influenced by grouping procedures.
The Median: Values must be ordered, and may be grouped, for computation.
The Mean: Values need not be ordered or grouped for this calculation.
Fact No. 8
The Mode: Values must be ordered and grouped for its computation.
The Median: It can be computed when ends are open.
The Mean: It cannot be calculated from a frequency table when ends are open.
Fact No. 9
The Mode: It can be calculated when table ends are open.
The Median: It is not applicable to qualitative data.
The Mean: It is stable in that grouping procedures do not seriously affect it.
The Geometric Mean: The geometric mean (G) of n non-negative numerical values is the nth
root of the product of the n values.
If some values are very large in magnitude and others are small, then the geometric
mean is a better representative of the data than the simple average. In a "geometric
series", the most meaningful average is the geometric mean (G); the arithmetic
mean is strongly biased toward the larger numbers in the series.
An Application: Suppose sales of a certain item increase to 110% of the initial level in the
first year and to 150% of that in the second year. For simplicity, assume you sold 100 items
initially. Then the number sold in the first year is 110 and the number sold in the
second is 150% x 110 = 165. The arithmetic average of 110% and 150% is 130%, so
we would incorrectly estimate that the number sold in the first year is 130 and
the number in the second year is 169. The geometric mean of 110% and 150% is
$G = (1.65)^{1/2} \approx 1.2845$, so we would correctly estimate that we would sell
$100 \times G^2 = 165$ items in the second year.
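The sales illustration can be reproduced in a few lines of Python. This is only a sketch of the arithmetic in the paragraph above; the variable names are arbitrary.

```python
from math import prod

growth_factors = [1.10, 1.50]   # 110% in year one, 150% of that in year two
initial_sales = 100
years = len(growth_factors)

arithmetic_mean = sum(growth_factors) / years
geometric_mean = prod(growth_factors) ** (1 / years)

actual_final = initial_sales * prod(growth_factors)
estimate_with_am = initial_sales * arithmetic_mean ** years
estimate_with_gm = initial_sales * geometric_mean ** years

print(f"A.M. = {arithmetic_mean:.4f}, G = {geometric_mean:.4f}")
print(f"actual: {actual_final:.1f}, A.M. estimate: {estimate_with_am:.1f}, "
      f"G estimate: {estimate_with_gm:.1f}")
```

Compounding the geometric mean recovers the actual second-year sales, whereas compounding the arithmetic mean overstates them.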
The Harmonic Mean: The harmonic mean (H) is another specialized average, which
is useful in averaging variables expressed as a rate per unit of time, such as miles
per hour or number of units produced per day. The harmonic mean (H) of n non-zero
numerical values x(i) is: $H = n / \sum_i [1/x(i)]$.
For example, suppose four machines take 2.5, 2.0, 1.5 and 6.0 minutes, respectively, to
produce one part. The harmonic mean is: H = 4/[(1/2.5) + (1/2.0) + (1/1.5) + (1/6.0)] = 2.31 minutes.
If all machines work for one hour, how many parts will be produced? Since four
machines running for one hour represent 240 minutes of operating time, 240 /
2.31 = 104 parts will be produced.
The Order Among the Three Means: If all three means exist, then the Arithmetic Mean is never less
than the other two; moreover, the Harmonic Mean is never larger than the other two.
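A short Python sketch of the machine calculation follows. It assumes, as the arithmetic above implies, that the four figures are minutes per part for four machines; `statistics.harmonic_mean` is the standard-library function.

```python
from statistics import harmonic_mean

minutes_per_part = [2.5, 2.0, 1.5, 6.0]   # assumed: one figure per machine
h = harmonic_mean(minutes_per_part)

# Four machines running for one hour give 4 * 60 = 240 machine-minutes.
machine_minutes = len(minutes_per_part) * 60
parts_via_h = machine_minutes / h

# Cross-check: count the parts machine by machine.
parts_direct = sum(60 / t for t in minutes_per_part)

print(f"H = {h:.2f} minutes, parts per hour = {parts_via_h:.0f}")
```

The machine-by-machine count agrees with 240 / H, which is why the harmonic mean, and not the arithmetic mean of the four times, is the appropriate average here.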
Example # 1
Find the Arithmetic Mean of the following data
15.6, 17.0,
Probability:
Introduction:
The word probability has two basic meanings:
1. A quantitative measure of uncertainty
2. A measure of degree of belief in a particular statement or problem
The foundations of probability were laid by two French mathematicians of the seventeenth century, Blaise
Pascal (1623-1662) and Pierre de Fermat (1601-1665), in connection with gambling problems. The theory
was later developed by Jakob Bernoulli (1654-1705).
Visit sites:
http://www.maths.tcd.ie/pub/HistMath/People/Pascal/RouseBall/RB_Pascal.html
http://www.thocp.net/biographies/pascal_blaise.html
Q. “Use of this product may be hazardous to your health. This product contains saccharin, which has been
determined to cause cancer in laboratory animals.” How might probability have played a part in this
statement?
Answer: Extensive tests with animals indicated (with other factors held as constant as possible) that subjects
that consumed saccharin were more likely to develop cancer than those not exposed to saccharin.
Extrapolating these results to humans, it was concluded that consumption produces an increased risk of
cancer.
Q. A well-known soft drink company decides to alter the formula of its oldest and most popular product.
How might probability theory be involved in such a decision?
Answer: This decision involves estimates of consumer preference, brand loyalty, competitive response and
numerous other factors, all involving uncertainty. Hence the estimates are based on probabilities.
Q. In textile testing, how might probability theory be involved?
Answer: In textile testing, the result of every test is based on a sample. For example, in determining the count of
yarn, samples are tested rather than the whole cone of yarn. The uncertainty that arises from sampling is
measured by using the concept of probability.
Basic Terminologies in Probability:
In general, probability is the chance that something will happen. Probabilities are expressed as fractions
(1/6, ½, 8/9) or as decimals (0.167, 0.500, 0.889) between zero and 1.
A probability of zero means that something can never happen; a probability of 1 indicates that
something will always happen.
Experiment:
The term experiment means a planned activity or process whose results yield a set of data.
Random Experiment:
An experiment that may produce different results even though it is repeated a large number of times
under essentially similar conditions is called a random experiment, e.g.
1. Flipping a fair coin
2. Throwing a balanced die
3. Drawing a card from a well-shuffled deck of 52 playing cards (see http://jducoeur.org/game-hist/seaan-cardhist.html)
4. Selecting a sample
Introduction to Playing Cards:
Total cards = 52
Total suits = 4
Suit         Ace   Number Cards                 Picture Cards   Total
1. Heart     A     2, 3, 4, 5, 6, 7, 8, 9, 10   J, Q, K         13
2. Diamond   A     2, 3, 4, 5, 6, 7, 8, 9, 10   J, Q, K         13
3. Club      A     2, 3, 4, 5, 6, 7, 8, 9, 10   J, Q, K         13
4. Spade     A     2, 3, 4, 5, 6, 7, 8, 9, 10   J, Q, K         13
Total        4     36                           12              52
Event:
“An event is an individual outcome or any number of outcomes of a random experiment or a trial” or
“Any subset of a sample space S of the experiment is called an event”, i.e. $A \subseteq S$.
It is customary to denote an event by a capital letter A, B, C, . . . etc.
Note:
1. Every set is a subset of itself. In particular $S \subseteq S$, so the sample space S is also an event. ‘S’
is called the sure event.
2. $\emptyset$ (the null set, called the impossible event) is an event because $\emptyset \subseteq S$.
3. Let A be an event; then $\bar{A}$ or $A^c$ is called the negation (complement) of A.
Example:
All possible events of the sample space S={H, T}:
A=$\emptyset$ (impossible event)
B={H}
C={T}
D={H, T} (sure event)
Rule:
A sample space consisting of n sample points can produce $2^n$ different events (subsets).
Q. How many events are possible for each of the given sample space
i) One coin is tossed: S ={H, T}
ii) Two coins are flipped: S = {HH, HT, TH, TT}
iii) Three coins are flipped: S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
iv) One die is rolled: S = {1, 2, 3, 4, 5, 6}
v) Two dice are rolled
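The rule can be checked by listing every subset of a small sample space. The helper names `all_events` and `count_events` below are hypothetical; for the larger spaces only the count is computed, since enumerating every subset quickly becomes impractical.

```python
from itertools import combinations


def all_events(sample_space):
    """Every subset of the sample space, from the empty set up to S itself."""
    points = list(sample_space)
    return [set(subset)
            for size in range(len(points) + 1)
            for subset in combinations(points, size)]


def count_events(number_of_sample_points):
    """Number of possible events for a sample space with n sample points."""
    return 2 ** number_of_sample_points


one_coin = ["H", "T"]
two_coins = ["HH", "HT", "TH", "TT"]

print(all_events(one_coin))            # includes set() and {'H', 'T'}
print(len(all_events(two_coins)))      # matches count_events(4)
print(count_events(8), count_events(6), count_events(36))
```

The last line gives the counts for three coins (8 sample points), one die (6) and two dice (36).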
Examples:-
It is not uncommon for people to confuse the concepts of mutually exclusive events and
independent events.
Mutually exclusive events
If event A happens, then event B cannot, and vice versa. The two events "it rained on Tuesday" and
"it did not rain on Tuesday" are mutually exclusive events. When calculating the probability that one
or the other of two mutually exclusive events occurs, you add the probabilities.
Independent events
The outcome of event A has no effect on the outcome of event B, as with "It rained on Tuesday"
and "My chair broke at work". When calculating the probability that two independent events both
occur, you multiply the probabilities. You are effectively asking what the chance is of both events
happening, bearing in mind that the two are unrelated.
To be or not to be.....?
So, if A and B are mutually exclusive (and each has a non-zero probability), they cannot be independent.
If A and B are independent, they cannot be mutually exclusive. However, if the events were "it rained
today" and "I left my umbrella at home", they are not mutually exclusive, but they are probably not
independent either, because one would think that you'd be less likely to leave your umbrella at home on
days when it rains. That fact aside, use the following to understand the definitions.
What happens if we want to throw 1 and 6 in any order? We now do not mind whether
the first die shows 1 or 6, as we are still in with a chance. But if 1 falls uppermost on the
first die, it clearly rules out 6 being uppermost on that die, so the two outcomes, 1 and 6,
are mutually exclusive. In this case, the probability of throwing 1 or 6
with the first die is the sum of the two probabilities, 1/6 + 1/6 = 1/3.
The probability of the second die being favourable is still 1/6, as the second die can only be one
specific number: a 6 if the first die is 1, and vice versa.
Therefore the probability of throwing 1 and 6 in any order with two dice is 1/3 x 1/6 = 1/18. Note
that we multiplied the last two probabilities because they are independent of each other.
The probability of throwing a double three with two dice is the probability of throwing three with the first
die and three with the second die. There is one favourable outcome in six for the first
event and one in six for the second, therefore the probability is (1/6) * (1/6) = 1/36 or 2.78%.
The two events are independent, since whatever happens to the first die cannot affect the throw of
the second; the probabilities are therefore multiplied, giving 1/36.
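Both dice results can be confirmed by enumerating the 36 equally likely outcomes of two dice. The sketch below uses exact fractions; the helper name `probability` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die).
rolls = list(product(range(1, 7), repeat=2))


def probability(event):
    """Classical probability: favourable outcomes / total outcomes."""
    favourable = sum(1 for roll in rolls if event(roll))
    return Fraction(favourable, len(rolls))


p_one_and_six = probability(lambda roll: set(roll) == {1, 6})   # (1, 6) or (6, 1)
p_double_three = probability(lambda roll: roll == (3, 3))

print(p_one_and_six, p_double_three)
```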
Often you do not need the probability that an event occurs but the opposite:
the probability that the event will not occur. For example, the probability of throwing a 1 on a die
is 1/6, therefore the probability of a 'non-1' is (1-1/6), which equals 5/6.
Converse probabilities are used to work out problems such as, "What is the probability of
exactly one soccer match ending in a draw within a group of three separate matches?"
Let us assume the chance of a draw occurring in any match is 1/3 or 33.33%. To fulfill our target of
only one match ending in a draw, we require the other matches not to end in a draw, which has
probability (1-(1/3)) = 2/3 or 66.67%.
Therefore the probability that one particular match is drawn and the other two are not is
1/3 x 2/3 x 2/3 = 4/27 or 14.81%.
In our group of three matches there are three ways for only one match to be drawn: DXX, XDX, XXD.
These are mutually exclusive, therefore we add their probabilities.
The final answer for the probability of exactly one match being drawn is (4/27)+(4/27)+(4/27) = 4/9 or
(.1481+.1481+.1481) = 44.44%.
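The soccer calculation can be checked by enumerating all sequences of draws (D) and non-draws (X). The sketch assumes, as the text does, a draw probability of 1/3 for every match and independence between matches; the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product

P_DRAW = Fraction(1, 3)   # assumed chance of a draw in any single match


def prob_exactly_k_draws(matches, k):
    """Add up the probabilities of every D/X sequence with exactly k draws."""
    total = Fraction(0)
    for outcome in product("DX", repeat=matches):
        if outcome.count("D") == k:
            p = Fraction(1)
            for result in outcome:
                p *= P_DRAW if result == "D" else 1 - P_DRAW
            total += p
    return total


p_one_draw = prob_exactly_k_draws(3, 1)
print(p_one_draw, float(p_one_draw))
```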
Converse probabilities are used to work out the famous birthday problem. Many people find the
answer puzzling, but it can be checked either by asking your personnel manager for birthday dates or
by flicking through the Who's Who in your reference library.
"How many people should be gathered in a room together before it is more likely than not that two
of them share the same birthday?"
When the first person enters the room and announces their birthday, the probability of the second
person sharing the same birthday is 1/365. Conversely, the probability of the second birthday
being different is 364/365. When two different birthdays are known, the
probability of the third being different from both is 363/365, as two of the 365 days are now
taken. The compound probability of birthday 2 being different from birthday 1, and of
birthday 3 being different from the other two, is:
(364/365)*(363/365) = 0.991796 or a 99.2% chance that no two of the three people share the same
birthday.
Note that the start of the sequence is (365/365). We have removed this as it does not affect the result
of the calculation.
All that is necessary now is to continue multiplying in further terms until the product is less than 1/2 or
50%, since as soon as the probability that all birthdays are different is less than 1/2, the probability
that at least two are the same is clearly more than 1/2. In other words, it is more likely than not that two
people in the room share the same birthday. The following chart shows the number of people
in the room and the probability that they DO NOT share a birthday.
People   Chance %
2        99.7
3        99.2
4        98.4
5        97.3
6        96.0
7        94.4
8        92.6
9        90.5
10       88.3
11       85.9
12       83.3
13       80.6
14       77.7
15       74.7
16       71.6
17       68.5
18       65.3
19       62.1
20       58.9
21       55.6
22       52.4
23       49.3
24       46.2
50       3.0
100      3,254,690 to 1 on
The probability drops to less than 1/2 with 23 people, so it is more likely than not that in any
gathering of 23 or more persons, two of them will share a birthday.
Only 50 people need be present for the 'coincidence' of two of them having the same birthday to
become, roughly, a 30-1 on chance.
In a company of 100 employees, the odds are more than three million to one on that two share a
birthday.
The birthday proposition is one where a gambler who can estimate probabilities can make money
from unsuspecting punters.
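The chart can be regenerated with a few lines of Python. The sketch assumes 365 equally likely birthdays and ignores leap years, as the text does; the function names are illustrative.

```python
def prob_all_different(people, days=365):
    """Probability that `people` birthdays are all different."""
    p = 1.0
    for k in range(people):
        p *= (days - k) / days   # the k-th person must avoid k days already taken
    return p


def smallest_group(threshold=0.5):
    """Smallest group size for which P(all different) falls below the threshold."""
    n = 1
    while prob_all_different(n) >= threshold:
        n += 1
    return n


for n in (2, 3, 10, 22, 23, 50):
    print(f"{n:3d} people: {100 * prob_all_different(n):5.1f}% chance of no shared birthday")

p100 = prob_all_different(100)
print(f"smallest group: {smallest_group()}, odds on a match among 100: {(1 - p100) / p100:,.0f} to 1")
```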
Venn Diagram:
A Venn diagram, devised by the English logician John Venn (1834-1923), represents sets by circular regions,
parts of circular regions or their complements with respect to a rectangle representing the sample space S.
Venn diagrams are used to represent sets and subsets and to verify the relationships among them.
[Figure: Venn diagrams of set operations (complement, union, intersection and differences such as A \ B and B \ A) on events A, B, C and D within the sample space S.]
This ability to represent a "sharing of conditions" makes Venn diagrams useful tools for solving complicated
problems. Consider the following example:
Example:
Twenty-four dogs are in a kennel. Twelve of the dogs are black, six
of the dogs have short tails, and fifteen of the dogs have long hair.
There is only one dog that is black with a short tail and long hair.
Two of the dogs are black with short tails and do not have long hair.
Two of the dogs have short tails and long hair but are not black. If
all of the dogs in the kennel have at least one of the mentioned
characteristics, how many dogs are black with long hair but do not
have short tails?
Solution:
Draw a Venn diagram to represent the situation described in the problem.
Represent the number of dogs that you are looking for with x.
• Notice that the number of dogs in each of the three categories is labeled OUTSIDE of the circle in a
colored box. This number is a reminder of the total of the numbers which may appear anywhere inside
that particular circle.
• After you have labeled all of the conditions listed in the problem, use this OUTSIDE box number to
help you determine how many dogs are to be labeled in the empty sections of each circle.
• Once you have EVERY section in the diagram labeled with a number or an expression, you are ready
to solve the problem.
• Add together EVERY section in the diagram and set it equal to the total number of dogs in the
kennel (24). Do NOT use the OUTSIDE box numbers.
• 9 - x + 2 + 1 + 1 + 2 + x + 12 - x = 24
• 27 - x = 24
• x = 3 (There are 3 dogs which are black with long hair but do not have a short tail.)
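The bookkeeping in the steps above can be sketched in a few lines of Python. This is only an illustration: the function name `solve_kennel` and the constant names are made up here, and the region formulas are the ones read off the Venn diagram (9 - x, 1 and 12 - x for the three "only" regions).

```python
# Given data from the kennel problem.
TOTAL = 24
BLACK, SHORT, LONG = 12, 6, 15
ALL_THREE = 1          # black, short tail and long hair
BLACK_SHORT_ONLY = 2   # black and short tail, not long hair
SHORT_LONG_ONLY = 2    # short tail and long hair, not black


def solve_kennel():
    """Try each x (black, long hair, no short tail) until the regions sum to 24."""
    for x in range(TOTAL + 1):
        black_only = BLACK - BLACK_SHORT_ONLY - ALL_THREE - x                # 9 - x
        short_only = SHORT - BLACK_SHORT_ONLY - ALL_THREE - SHORT_LONG_ONLY  # 1
        long_only = LONG - SHORT_LONG_ONLY - ALL_THREE - x                   # 12 - x
        regions = [black_only, short_only, long_only,
                   BLACK_SHORT_ONLY, SHORT_LONG_ONLY, ALL_THREE, x]
        # Every region must be non-negative and together they cover all dogs.
        if min(regions) >= 0 and sum(regions) == TOTAL:
            return x
    return None


print(solve_kennel())
```

Because the regions always add up to 27 - x, only one value of x can satisfy the total, which is why a simple search is enough.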
In general, the events E1, E2, ... , En are said to be mutually exclusive if the occurrence of any
one of them automatically implies the non-occurrence of the remaining n-1 events. In other words,
no two mutually exclusive events can occur together.
Examples
• A flipped coin coming up heads and the same coin coming up tails are mutually exclusive events.
• A student passing a test and failing it are mutually exclusive (though someone can fail a test, retake it, and
then pass).
Q. Which of the following pairs of events are mutually exclusive in the drawing of one card from a standard
deck of 52, or in the rolling of two dice?
i) A heart and a queen
ii) A club and red card
iii) An even number and a spade
iv) An ace and an even number
v) A total of 5 points and 5 on one die
vi) A total of 7 points and an even number of points on both dice
vii) A total of 9 points and a 2 on one die
viii) A total of 10 points and a 4 on one die
Answer:
i) Not mutually exclusive (not m.e.)
ii) m.e.
iii) not m.e.
iv) m.e.
v) m.e.
vi) m.e. (two even numbers cannot total 7)
vii) m.e. (a total of 9 needs 3 + 6 or 4 + 5, so no die shows a 2)
viii) not m.e.
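The dice cases (v) to (viii) can be checked by listing all 36 outcomes of two dice and looking for an outcome that satisfies both events. The sketch below is illustrative only; the helper names are invented here, and "an even number of points on both dice" is assumed to mean that each die shows an even number.

```python
from itertools import product

ROLLS = list(product(range(1, 7), repeat=2))  # all 36 ordered outcomes


def mutually_exclusive(event_a, event_b):
    """True when no single outcome satisfies both events."""
    return not any(event_a(roll) and event_b(roll) for roll in ROLLS)


def total_is(t):
    return lambda roll: sum(roll) == t


def shows(face):
    return lambda roll: face in roll


def both_even(roll):
    # Assumed reading of "an even number of points on both dice".
    return roll[0] % 2 == 0 and roll[1] % 2 == 0


results = {
    "v": mutually_exclusive(total_is(5), shows(5)),
    "vi": mutually_exclusive(total_is(7), both_even),
    "vii": mutually_exclusive(total_is(9), shows(2)),
    "viii": mutually_exclusive(total_is(10), shows(4)),
}
print(results)
```

Only case (viii) has a shared outcome, namely a 4 together with a 6.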
Probability:
1. Quantitative Approach
2. Subjective Approach
Definition of Probability:
There are three basic approaches to define quantitative probability
1. Mathematical or Classical Definition of Probability
2. Axiomatic Definition of Probability
3. Relative Frequency Definition of Probability
For example, in a deck of 52 cards, the probability of pulling one of the 13 hearts from the deck is much
higher than the likelihood of pulling out the ace of spades. To calculate an exact value for the probability of
drawing a heart from the deck, divide the number of hearts you could possibly draw by the total number of
cards in the deck:
P(heart) = 13/52 = 1/4
In contrast, the probability of drawing the single ace of spades from the deck is:
P(ace of spades) = 1/52
After looking at these examples, you should be able to understand the general formula for calculating
probability. Here’s another example:
Joe has 3 green marbles, 2 red marbles, and 5 blue marbles. If all the marbles are dropped into a dark bag, what is
the probability that Joe will pick out a green marble?
There are 3 ways for Joe to pick a green marble (since there are 3 different green marbles), but there are 10
total possible outcomes (one for each marble in the bag). Therefore, you can simply calculate the probability
of picking a green marble:
P(green) = 3/10
When calculating probabilities, always be careful to count all of the possible favorable outcomes among the
total possible outcomes. In the last example, you may have been tempted to leave out the three chances of
picking a green marble from the total possibilities, yielding the equation P = 3⁄7. If you did that, you’d be
wrong.
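The card and marble examples all follow the same pattern: favourable outcomes divided by total outcomes. A minimal Python sketch, assuming equally likely outcomes (the function name `classical_probability` is only illustrative), keeps the answers as exact fractions:

```python
from fractions import Fraction


def classical_probability(favourable, total):
    """Classical probability for equally likely outcomes, as an exact fraction."""
    return Fraction(favourable, total)


p_heart = classical_probability(13, 52)         # 13 hearts in 52 cards
p_ace_of_spades = classical_probability(1, 52)  # one ace of spades
p_green = classical_probability(3, 3 + 2 + 5)   # 3 green marbles out of 10

print(p_heart, p_ace_of_spades, p_green)
```

Note that the denominator for the marbles is all 10 marbles, green ones included.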
The Range of Probability
The probability, P, of any event occurring will always be 0 ≤ P ≤ 1. A probability of 0 for an event means
that the event will never happen. A probability of 1 means the event will always occur. For example,
drawing a green card from a standard deck of cards has a probability of 0; getting a number less than seven
on a single roll of one die has a probability of 1.
Sometimes you need the probability that an event will not occur. In that case, figure out the probability
of the event occurring and subtract that number from 1.
Example # 2: A pair of fair coins is tossed. Find the probability of getting
i) One head
ii) Two heads
iii) At least one tail
iv) At most one head
Solution:
Sample space when two coins are tossed: S = {HH, HT, TH, TT}
Example # 3: Three coins are tossed simultaneously. Find the probability of getting
i) One head
ii) At least one head
iii) At least two tails
iv) No head
Solution:
Sample space when three coins are tossed:
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
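The parts of Examples 2 and 3 can be worked by counting outcomes in the sample spaces listed above. The following sketch is one way to do that in Python; it assumes "one head" means exactly one head and reads part (ii) of Example 3 as "at least one head", and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product


def coin_probability(n_coins, predicate):
    """Probability of an event over the 2**n equally likely coin outcomes."""
    space = list(product("HT", repeat=n_coins))
    hits = [outcome for outcome in space if predicate(outcome)]
    return Fraction(len(hits), len(space))


def heads(outcome):
    return outcome.count("H")


def tails(outcome):
    return outcome.count("T")


two_coins = {
    "one head": coin_probability(2, lambda o: heads(o) == 1),
    "two heads": coin_probability(2, lambda o: heads(o) == 2),
    "at least one tail": coin_probability(2, lambda o: tails(o) >= 1),
    "at most one head": coin_probability(2, lambda o: heads(o) <= 1),
}
three_coins = {
    "one head": coin_probability(3, lambda o: heads(o) == 1),
    "at least one head": coin_probability(3, lambda o: heads(o) >= 1),
    "at least two tails": coin_probability(3, lambda o: tails(o) >= 2),
    "no head": coin_probability(3, lambda o: heads(o) == 0),
}
print(two_coins)
print(three_coins)
```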
Example # 4 When a fair die is rolled, what is the probability that the upper face shows
i) 6
ii) not 6
iii) even integer
iv) less than 5
v) between 3 and 5
vi) either 4 or 6
Solution:
Sample space when one die is rolled: S = {1, 2, 3, 4, 5, 6}
i) Let A = 6 appears. A = {6}, n(S) = 6, n(A) = 1
$P(A) = \frac{n(A)}{n(S)} = \frac{1}{6}$
ii) Let B = Not 6. B = {1, 2, 3, 4, 5}, n(S) = 6, n(B) = 5
$P(B) = \frac{n(B)}{n(S)} = \frac{5}{6}$
iii) Let C = Even integer. C = {2, 4, 6}, n(S) = 6, n(C) = 3
$P(C) = \frac{n(C)}{n(S)} = \frac{3}{6} = \frac{1}{2}$
iv) Let D = Less than 5. D = {1, 2, 3, 4}, n(S) = 6, n(D) = 4
$P(D) = \frac{n(D)}{n(S)} = \frac{4}{6} = \frac{2}{3}$
v) Let E = Between 3 and 5. E = {4}, n(S) = 6, n(E) = 1
$P(E) = \frac{n(E)}{n(S)} = \frac{1}{6}$
vi) Let F = Either 4 or 6. F = {4, 6}, n(S) = 6, n(F) = 2
$P(F) = \frac{n(F)}{n(S)} = \frac{2}{6} = \frac{1}{3}$
Example # 5 Two fair dice are rolled at the same time and the number of dots appearing on both dice is
counted. Find the probability that this sum is
i) 7
ii) Odd number greater than 6
iii) Less than 2
iv) More than 12
v) At least 4
vi) Between 2 & 12 inclusively
vii) At most 8
viii) Divisible by 4
Solution:
Sample space when two dice are rolled:
S = {
(1,1) (2,1) (3,1) (4,1) (5,1) (6,1)
(1,2) (2,2) (3,2) (4,2) (5,2) (6,2)
(1,3) (2,3) (3,3) (4,3) (5,3) (6,3)
(1,4) (2,4) (3,4) (4,4) (5,4) (6,4)
(1,5) (2,5) (3,5) (4,5) (5,5) (6,5)
(1,6) (2,6) (3,6) (4,6) (5,6) (6,6)
}; n(S) = 36
i) Let A = Sum is 7. n(S) = 36, n(A) = 6
$P(A) = \frac{n(A)}{n(S)} = \frac{6}{36} = \frac{1}{6}$
ii) Let B = Odd number greater than 6. B = {7, 9, 11}, n(S) = 36, n(B) = 6 + 4 + 2 = 12
$P(B) = \frac{n(B)}{n(S)} = \frac{12}{36} = \frac{1}{3}$
iii) Let C = Less than 2. C = { }, n(S) = 36, n(C) = 0
$P(C) = \frac{n(C)}{n(S)} = \frac{0}{36} = 0$
iv) Let D = More than 12. D = { }, n(S) = 36, n(D) = 0
$P(D) = \frac{n(D)}{n(S)} = \frac{0}{36} = 0$
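All eight parts of Example 5 can be obtained by counting sums over the 36 outcomes of two dice. The sketch below is an illustration in Python (the names `SUMS`, `p` and `answers` are invented here), using exact fractions so the results can be compared with hand calculations.

```python
from fractions import Fraction
from itertools import product

# The 36 equally likely sums of two fair dice.
SUMS = [a + b for a, b in product(range(1, 7), repeat=2)]


def p(condition):
    """Probability that the sum of the two dice satisfies the condition."""
    favourable = sum(1 for s in SUMS if condition(s))
    return Fraction(favourable, len(SUMS))


answers = {
    "i": p(lambda s: s == 7),
    "ii": p(lambda s: s % 2 == 1 and s > 6),
    "iii": p(lambda s: s < 2),
    "iv": p(lambda s: s > 12),
    "v": p(lambda s: s >= 4),
    "vi": p(lambda s: 2 <= s <= 12),
    "vii": p(lambda s: s <= 8),
    "viii": p(lambda s: s % 4 == 0),
}
for part, value in answers.items():
    print(part, value)
```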
Permutations and combinations are counting tools. They have vast applications in probability, especially in
determining the number of successful outcomes and the number of total outcomes in a given scenario.
Questions about permutations and combinations on the Math IC will not be complex, nor will they require
advanced math. But you will need to understand how they work and how to work with them. Important to
both of these undertakings is a familiarity with factorials.
Factorials
The factorial of a number, n!, is the product of the natural numbers up to and including n:
n! = n × (n - 1) × (n - 2) × ... × 2 × 1
If you are ever asked to find the number of ways that the n elements of a group can be ordered, you simply
need to calculate n!. For example, if you are asked how many different ways 6 people can sit at a table with six
chairs, you could either list all of the possible seating arrangements or just answer 6! = 6 × 5 × 4 × 3 × 2
× 1 = 720.
Permutations
Permutation means arrangement of things. The word arrangement is used, if the order of things is
considered.
A permutation is an ordering of elements. For example, say you’re running for student council. There are six
different offices to be filled (president, vice president, secretary, treasurer, spirit coordinator, and
parliamentarian) and there are six candidates running. Assuming the candidates don’t care which office
they’re elected to, how many different ways can the student council be composed?
The answer is 6! because there are 6 students running for office, and thus, 6 elements in the set.
Say that due to budget cuts, there are now only the three offices of president, vice president, and
treasurer to be filled. The same 6 candidates are still running. To handle this situation, we will now have to
change our method of calculating the number of permutations.
In general, the permutation, nPr, is the number of ordered subgroups of size r that can be taken from a set
with n elements:
nPr = n!/(n - r)!
With three offices and six candidates, this gives 6P3 = 6!/3! = 6 × 5 × 4 = 120.
At a dog show, three awards are given: best in show, first runner-up, and second runner-up.
A group of 10 dogs are competing in the competition. In how many different ways can the
prizes be awarded?
Example: Suppose we have to form a three-digit number using the digits 1, 2, 3 and 4. To form this
number the digits have to be arranged, and different numbers are formed depending on the order in
which we arrange the digits. This is an example of permutation.
Now suppose that we have to make a team of 11 players out of 20 players. This is an example of
combination, because the order of the players does not change the team: no matter in which order
we list the players, the team remains the same. For a different team to be formed, at least one
player has to be changed.
Addition rule: If an experiment can be performed in ‘n’ ways, and another experiment can be
performed in ‘m’ ways, then either of the two experiments can be performed in (m+n) ways. This
rule can be extended to any finite number of experiments.
Example: Suppose there are 3 doors in a room, 2 on one side and 1 on the other side. A man
wants to go out of the room. He has 2 + 1 = 3 options: he can leave by door ‘A’, door ‘B’ or
door ‘C’.
Multiplication rule: If one operation can be done in ‘m’ ways and another operation can be done in
‘n’ ways, then both operations can be performed in m x n ways. The rule can be extended to any
finite number of operations.
Example: Suppose a man wants to cross a room which has 2 doors on one side and 1 door on the
other side, entering by one side and leaving by the other. He has 2 x 1 = 2 ways to do it.
Ex. 5! = 5 x 4 x 3 x 2 x 1 = 120
Note: 0! = 1
Since (n-1)! = [n x (n-1)!]/n = n!/n,
putting n = 1, we have
0! = 1!/1
or 0! = 1
Permutation
Number of permutations of ‘n’ different things taken ‘r’ at a time is given by:
nPr = n!/(n-r)!
Proof: Clearly the first place can be filled in ‘n’ ways. The number of things left after filling the
first place = n-1, so the second place can be filled in (n-1) ways. The number of things left after
filling the first and second places = n-2, and so on, until the r-th place can be filled in (n-r+1) ways.
Hence:
nPr = n(n-1)(n-2) ... (n-r+1)
nPr = n!/(n-r)!
Number of permutations of ‘n’ different things taken all at a time is given by:
nPn = n!
Proof: Now we have ‘n’ objects and ‘n’ places, so nPn = n(n-1)(n-2) ... 2 x 1 = n!
Concept: We have nPr = n!/(n-r)!
Putting r = n, we have
nPn = n!/(n-n)! = n!/0!
But nPn = n!, so 0! = 1.
Note: The factorial of a negative number is not defined. The expression (-3)! has no meaning.
Examples
Q. How many different signals can be made by 5 flags from 8 flags of different colours?
Ans. 8P5 = 8!/(8-5)!
= 8 x 7 x 6 x 5 x 4 = 6720
Q. How many words can be made by using the letters of the word “SIMPLETON” taken all at a
time?
Ans. The 9 letters are all different, so the number of words = 9! = 362880.
Number of permutations of ‘n’ things, taken all at a time, in which ‘p’ are of one type, ‘q’ are of a
second type, ‘r’ are of a third type, and the rest are all different is given by:
n!/(p! x q! x r!)
Example: In how many ways can the letters of the word “Pre-University” be arranged?
Ans. The word has 13 letters, with R, E and I each appearing twice, so the number of arrangements
is 13!/(2! x 2! x 2!)
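The rule for repeated letters can be sketched as a short Python function that counts how often each letter occurs and divides n! by the factorial of each count. The function name `distinct_arrangements` is illustrative, and ignoring the hyphen in "Pre-University" is an assumption made here.

```python
from collections import Counter
from math import factorial


def distinct_arrangements(word):
    """n! divided by the factorial of each letter's multiplicity."""
    letters = [c for c in word.upper() if c.isalpha()]  # ignore hyphens, spaces
    count = factorial(len(letters))
    for repeats in Counter(letters).values():
        count //= factorial(repeats)
    return count


print(distinct_arrangements("SIMPLETON"))
print(distinct_arrangements("Pre-University"))
```

The same function can be used to check exercises on words with repeated letters.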
Number of permutations of ‘n’ things, taken ‘r’ at a time, when each thing can be repeated up to
r times is given by n^r.
Proof. Each of the r places can be filled in n ways, because repetition is allowed.
Hence the total number of ways in which the first, second, ..., r-th places can be filled
= n x n x n ... (r factors)
= n^r
Example: A child has 3 pockets and 4 coins. In how many ways can he put the coins in his
pockets?
Ans. The first coin can be put in 3 ways; similarly the second, third and fourth coins can each be
put in 3 ways. Hence the total number of ways = 3 x 3 x 3 x 3 = 3^4 = 81.
This problem is a permutation since the question asks us to order the top three finishers among 10
contestants in a dog show. There is more than one way that the same three dogs could get first place, second
place, and third place, and each arrangement is a different outcome. So, the answer is 10P3 = 10!⁄(10 -3)! =
10!⁄7! = 720.
Graphing calculators and most scientific calculators have a permutation function, labeled nPr. In most cases,
you must enter n, then press the button for permutation, and then enter r. This will calculate a permutation
for you, but if n is a large number, the calculator often cannot calculate n!. If this happens to you, don’t give
up! In cases like this, your knowledge of the permutation function will save you. Since you know that 100P3
is 100!⁄(100 - 3)!, you can simplify it to 100!⁄97!, or 100 × 99 × 98 = 970,200.
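The cancellation trick described above (multiplying only the top r factors instead of computing n!) can be sketched in Python. `math.perm` is part of the standard library from Python 3.8 onward; `falling_product` is an illustrative helper written here to show the shortcut explicitly.

```python
from math import perm


def falling_product(n, r):
    """n x (n-1) x ... x (n-r+1): the r factors left after cancelling (n-r)!."""
    result = 1
    for k in range(n, n - r, -1):
        result *= k
    return result


print(perm(10, 3), falling_product(10, 3))    # dog show awards
print(perm(100, 3), falling_product(100, 3))  # large n, no full factorial needed
```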
Combinations
Combination means selection of things. The word selection is used, when the order of things has no
importance.
A combination is an unordered grouping of a set. An example of a scenario in which order doesn’t matter is
a hand of cards: a king, an ace, and a five is the same as an ace, a five, and a king.
Combinations are represented as nCr, where unordered subgroups of size r are selected from a set of
size n. Because the order of the elements in a given subgroup doesn’t matter, nCr will be less than nPr
whenever r is at least 2: any one combination can then be turned into more than one permutation. nCr is
calculated as follows:
nCr = n!/(r! x (n - r)!)
Here’s an example:
Suppose that a committee of 10 people must elect three leaders, whose duties are all the
same. In how many ways can this be done?
In this example, the order in which the leaders are assigned to positions doesn’t matter—the leaders aren’t
distinguished from one another in any way, unlike in the student council example. This distinction means
that the question can be answered with a combination rather than a permutation. We are looking for how
many different groups of three can be taken from a group of 10:
10C3 = 10!/(3! x 7!) = (10 x 9 x 8)/(3 x 2 x 1) = 120
There are only 120 different ways to elect three leaders, as opposed to 720 ways when their roles were
differentiated.
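The relationship between the two counts (each combination of r things can be ordered in r! ways) can be checked with a short Python sketch. `math.comb` and `math.perm` are standard library functions from Python 3.8 onward; the variable names are illustrative.

```python
from math import comb, factorial, perm

n, r = 10, 3
unordered = comb(n, r)  # committees where roles are the same
ordered = perm(n, r)    # line-ups where roles are differentiated

print(unordered, ordered)
# Each unordered group of r people can be arranged in r! ways.
print(ordered == unordered * factorial(r))
```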
Exercise:
1. How many distinct arrangements are possible using all the letters in the word
DOODLED? (Express your answer in factorial notation.)
2. How many committees of 6 people can be selected from 10 girls and 4 boys?
(Give answer in factorial form.)
3. How many distinct arrangements are possible using all the letters in the word
TORONTO? (Give answer in factorial form.)
4. How many ways can four students be seated in a row if two of them must be
seated together?
5. How many distinct arrangements are possible using all the letters in the word
“wormwood”? Leave answer in factorial form.
6. In how many ways can 4 girls and 2 boys be seated at a round table? (Express
your answer in factorial notation.)
Multiple Choice:
1. Find the number of ways 8 different books can be arranged on a shelf if 3
particular books must be placed together.
a) 4320
b) 40 320
c) 241 920
d) 6720
Answer: a)
2. Harry, Peter and four girls sit in a row. Harry can’t sit in an end seat. In how
many ways can they be arranged?
a) 96
b) 600
c) 480
d) 240
Answer: c)
Long Answer:
1. Given the letters of the word ELEMENTS, how many 5 letter “words” can be
found in which vowels and consonants alternate?
Answer: 80
2. How many integers that do not contain repeated digits are there from 1 to 1000
inclusive?
Answer: 738
3. How many four digit multiples of five that are greater than 4000 can be formed
from the digits 0,1,2,3,5,9? (Repetitions are not allowed)
Answer: 36
4. There are 12 people to be seated around two circular tables, the first with seven
chairs and the second with five chairs. In how many ways can this be done?
Answer: 13 685 760 ways (12C7 x 6! x 4!)
5. How many four digit multiples of 5 can be formed using the digits 1, 2, 5, 7, 9, 0,
if no repetitions are allowed?
Answer: 108
6. How many different 3 letter “words” are possible using the letters in OCTOBER?
Answer: 135 arrangements
7. How many different two letter arrangements can be made from the letters in the
word SEEN?
Answer: 7
8. How many 4 digit numbers, greater than 7500 can be made using the digits
0,3,5,7,8,9, without repetition?
Answer: 156 numbers
9. A standard deck consists of 52 cards. How many 5 card hands can be made if a
hand must contain 2 pairs, each pair of a different value? The fifth card must
have a different value than either of the pairs.
Answer: 123 552 ways
10. Five boys and five girls are going for a picnic. Six can ride in one car and four in
another. In how many ways can they be distributed between the two cars?
Answer: 210 ways
11. How many 5 card hands can be formed from a deck of 52 cards if the hands
contain a pair and 3 of a kind?
Answer: 3744 ways
12. Four boys and four girls have decided to go out for dinner at a local restaurant.
There are two circular tables available, one near the window, and one in the centre
of the restaurant. Each table seats 4 people.
a) In how many ways can the eight people be seated randomly?
b) In how many ways can they be seated if all the boys must sit at one table and all
the girls at another table?
Answer: a) 2520
b) 36
13. A committee of 10 people is to be chosen from 10 boys and 10 girls. How many
ways can this be done if the committee must contain 9 boys and 1 girl?
Answer: 100 ways
Problem 1
Mike, a DJ at a high-school radio station, needs to play two or three more songs before the end of the
school dance. If each composition must be selected from a list of the 10 most popular songs of the
year, how many song sequences are available for the remainder of the dance?
(A) 6
(B) 90
(C) 120
(D) 720
(E) 810
Members of a student parliament took a vote on a proposition for a new social event on Fridays. If all the
members of the parliament voted either for or against the proposition and if the proposition was accepted in
a 5-to-2 vote, in how many ways could the members vote?
(A) 7
(B) 10
(C) 14
(D) 21
(E) 42
A set consists of all integers between 600 and 1999, inclusive. If a number is selected at random from this
set, what is the probability that it is divisible by both 5 and 13?
(A) 65/1399
(B) 13/280
(C) 1/50
(D) 3/200
(E) 21/1399
Jake, Lena, Fred, John and Inna need to drive home from a corporate reception in an SUV that can seat 7
people. If only Inna or Jake can drive, how many seat allocations are possible?
(A) 30
(B) 42
(C) 120
(D) 360
(E) 720
Circular permutations
(a) If clockwise and anticlockwise orders are taken as different, then the total number of
circular permutations of ‘n’ things is given by (n-1)!
(b) If clockwise and anticlockwise orders are taken as not different, then the total number of
circular permutations is given by (n-1)!/2
Proof (a):
Consider 4 persons A, B, C and D sitting around a round table. They can be shifted four times, but
these four arrangements are the same, because the sequence of A, B, C, D is unchanged. But if A,
B, C, D are sitting in a row and they are shifted, then the four linear arrangements are different.
Hence if we have ‘4’ things, then for each circular arrangement the number of linear arrangements = 4.
Similarly, if we have ‘n’ things, then for each circular arrangement the number of linear arrangements
= n.
So the total number of linear arrangements, n! = n x (number of circular arrangements), and the
number of circular arrangements = n!/n = (n-1)!
Proof (b): When clockwise and anticlockwise arrangements are not different, an arrangement can be
viewed from both sides and looks the same. Here two permutations are counted as one, so the total
number of permutations is halved. Hence in this case:
Circular permutations = (n-1)!/2
For ‘n’ different things taken ‘r’ at a time:
(a) If clockwise and anticlockwise orders are taken as different, then the total number of circular
permutations = nPr/r
(b) If clockwise and anticlockwise orders are taken as not different, then the total number of
circular permutations = nPr/2r
Example: How many necklaces of 12 beads each can be made from 18 beads of different colours?
Ans. A necklace can be turned over, so clockwise and anticlockwise orders are not different:
18P12/(2 x 12) = 18!/(6! x 24)
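The circular formulas can be collected into one small Python function. This is a sketch under stated assumptions: the function name and the `reflections_distinct` flag are invented here, and the halved count is only meaningful when at least 3 things are placed on the circle.

```python
from math import factorial, perm


def circular_permutations(n, r=None, reflections_distinct=True):
    """Circular arrangements of r out of n different things.

    nPr / r when clockwise and anticlockwise orders differ,
    nPr / (2r) when they do not (assumes r >= 3 in that case).
    """
    r = n if r is None else r
    count = perm(n, r) // r
    if not reflections_distinct:
        count //= 2
    return count


print(circular_permutations(4))                               # (4-1)!
print(circular_permutations(4, reflections_distinct=False))   # (4-1)!/2
print(circular_permutations(18, 12, reflections_distinct=False))
```

With r equal to n, the first case reduces to (n-1)!, in agreement with the round-table argument.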
Restricted Permutations
(a) Number of permutations of ‘n’ things, taken ‘r’ at a time, when a particular thing is
always included in each arrangement
= r x (n-1)P(r-1)
(b) Number of permutations of ‘n’ things, taken ‘r’ at a time, when a particular thing is fixed
= (n-1)P(r-1)
(c) Number of permutations of ‘n’ things, taken ‘r’ at a time, when a particular thing is never taken
= (n-1)Pr
(d) Number of permutations of ‘n’ things, taken all at a time, when ‘m’ specified things always come
together = m! x (n-m+1)!
(e) Number of permutations of ‘n’ things, taken all at a time, when ‘m’ specified things are not
all together = n! - [m! x (n-m+1)!]
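Example: How many words can be formed with the letters of the word ‘OMEGA’ when:
Ans.
(iii) The three vowels (O, E, A) can be arranged in the odd places (1st, 3rd and 5th) in 3! ways,
and the two consonants (M, G) can be arranged in the even places (2nd, 4th) in 2! ways,
giving 3! x 2! = 12 ways.
The three vowels can be placed together in 3! x 3! = 36 ways, so out of the 5! = 120 arrangements
the vowels are not all together in 120 - 36 = 84 ways.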
Number of combinations of ‘n’ different things, taken ‘r’ at a time is given by:
nCr = n!/(r! x (n-r)!)
Proof: Each combination consists of ‘r’ different things, which can be arranged among themselves
in r! ways. Hence the nCr combinations give r! x nCr arrangements in all -------(1)
But the number of arrangements of ‘n’ things taken ‘r’ at a time = nPr -------(2)
From (1) and (2):
nPr = r! x nCr
or n!/(n-r)! = r! x nCr
or nCr = n!/(r! x (n-r)!)
Restricted Combinations
(a) Number of combinations of ‘n’ different things taken ‘r’ at a time, when ‘p’ particular things
are always included = (n-p)C(r-p)
(b) Number of combinations of ‘n’ different things, taken ‘r’ at a time, when ‘p’ particular things
are always excluded = (n-p)Cr
Ans:
(i) A particular player is always chosen, which means that 10 players are selected out of the
remaining 14 players.
= 14C10 = 14!/(10! x 4!) = 1001
(ii) A particular player is never chosen, which means that 11 players are selected out of 14 players.
= 14C11 = 14!/(11! x 3!) = 364
(iii) Number of ways of selecting zero or more things from ‘n’ different things is given by 2^n
=> Total number of ways of selecting one or more things out of n different things
= 2^n - 1 [since nC0 = 1]
Example: John has 8 friends. In how many ways can he invite one or more of them to dinner?
Ans. John can select one or more of his 8 friends, which can be done in 2^8 - 1 = 255 ways.
(iv) Number of ways of selecting zero or more things from ‘n’ identical things is given by n+1
Example: In how many ways can zero or more letters be selected from the letters AAAAA?
Ans. 5 + 1 = 6 ways.
(v) Number of ways of selecting one or more things from ‘p’ identical things of one type, ‘q’
identical things of another type, ‘r’ identical things of the third type and ‘n’ different things is
given by:
(p+1)(q+1)(r+1) x 2^n - 1
Example: Find the number of different choices that can be made from 3 apples, 4 bananas and 5
mangoes, if at least one fruit is to be chosen.
Ans: (3+1)(4+1)(5+1) - 1 = 119
(vi) Number of ways of selecting ‘r’ things from ‘n’ identical things is ‘1’.
Example: In how many ways can 5 balls be selected from ‘12’ identical red balls?
Ans. The balls are identical, so the total number of ways of selecting 5 balls = 1.
Example: How many numbers of four digits can be formed with digits 1, 2, 3, 4 and 5?
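The selection rules for identical and different things can be cross-checked by brute force. The Python sketch below is illustrative only: the function names are invented, and the closed form used is the standard result (p+1)(q+1)(r+1) x 2^n - 1 for choosing at least one item.

```python
from itertools import product
from math import comb


def nonempty_selections(identical_groups, n_distinct=0):
    """Ways to pick one or more items from groups of identical items
    plus n_distinct different items."""
    ways = 2 ** n_distinct          # each different item is in or out
    for size in identical_groups:
        ways *= size + 1            # take 0, 1, ..., size of each identical kind
    return ways - 1                 # drop the empty selection


def brute_force(identical_groups):
    """Count non-empty picks by listing how many of each kind are taken."""
    picks = product(*(range(size + 1) for size in identical_groups))
    return sum(1 for pick in picks if any(pick))


print(nonempty_selections([], n_distinct=8))  # John's 8 friends
print(nonempty_selections([3, 4, 5]), brute_force([3, 4, 5]))  # fruit example
print(comb(14, 10), comb(14, 11))  # 10 of 14 players, 11 of 14 players
```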
Set A consists of all positive integers less than 100; Set B consists of 10 integers, the first four of which are
2, 3, 5, and 7. What is the difference between the median of Set A and the range of Set B?
(A) Statement (1) alone is sufficient, but statement (2) alone is not sufficient.
(B) Statement (2) alone is sufficient, but statement (1) alone is not sufficient.
(C) BOTH statements TOGETHER are sufficient, but NEITHER statement ALONE is sufficient.
(D) Each statement ALONE is sufficient.
(E) Statements (1) and (2) TOGETHER are NOT sufficient.
A university needs to select a nine-member committee on extracurricular life, whose members must belong
either to the student government or to the student advisory board. If the student government consists of 10
members, the student advisory board consists of 8 members, and 6 students hold membership in both
organizations, how many different committees are possible?
(A) 72
(B) 110
(C) 220
(D) 720
(E) 1096
Solution: http://www.projectgmat.com/solutions.html
Answers to the six multiple-choice problems above, in order: E, D, D, E, E, C
Example # 6 From a pack of 52 cards, two are drawn at random. Find the probability that one is a king and
the other is a queen
Solution:
$P(\text{one king and one queen}) = \frac{{}^{4}C_{1} \cdot {}^{4}C_{1} \cdot {}^{44}C_{0}}{{}^{52}C_{2}} = \frac{16}{1326} = \frac{8}{663}$
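The counting in Example 6 can be reproduced with `math.comb` (Python 3.8 onward). This is a minimal sketch; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# One of 4 kings, one of 4 queens, none of the other 44 cards.
favourable = comb(4, 1) * comb(4, 1) * comb(44, 0)
total = comb(52, 2)  # any 2 cards out of 52

p_king_and_queen = Fraction(favourable, total)
print(favourable, total, p_king_and_queen)
```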
Example # 7 A box contains 4 red, 4 white and 5 green balls. Three are drawn from the box together. Find
the probability that they may be
i) All of different colours
ii) All of the same colours
Solution:
Red White Green Total
4 4 5 13
Drawn = 3
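One way to complete Example 7 is to count the favourable draws with combinations. The sketch below assumes all 13C3 draws are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb, prod

balls = {"red": 4, "white": 4, "green": 5}
total_draws = comb(sum(balls.values()), 3)  # 3 balls out of 13

# i) one ball of each colour
all_different = prod(comb(count, 1) for count in balls.values())
# ii) all three from the same colour
all_same = sum(comb(count, 3) for count in balls.values())

p_all_different = Fraction(all_different, total_draws)
p_all_same = Fraction(all_same, total_draws)
print(p_all_different, p_all_same)
```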
These rules let you test if one number can be evenly divided by another, without having to do too
much calculation!
A number is divisible by: | If: | Example:
2 | The last digit is even (0, 2, 4, 6, 8) | 128 is; 129 is not
3 | The sum of the digits is divisible by 3 | 381 (3+8+1=12, and 12÷3 = 4) Yes; 217 (2+1+7=10, and 10÷3 = 3 1/3) No
4 | The last 2 digits are divisible by 4 | 1312 is (12÷4=3); 7019 is not
5 | The last digit is 0 or 5 | 175 is; 809 is not
6 | The number is divisible by both 2 and 3 | 114 (it is even, and 1+1+4=6 and 6÷3 = 2) Yes; 308 (it is even, but 3+0+8=11 and 11÷3 = 3 2/3) No
7 | If you double the last digit and subtract it from the rest of the number, the answer is 0 or divisible by 7 (you can apply this rule to that answer again if you want) | 672 (Double 2 is 4, 67-4=63, and 63÷7=9) Yes; 905 (Double 5 is 10, 90-10=80, and 80÷7=11 3/7) No
8 | The last three digits are divisible by 8 | 109816 (816÷8=102) Yes; 216302 (302÷8=37 3/4) No
11 | If you sum every second digit and then subtract all other digits, the answer is 0 or divisible by 11 | 1364 ((3+4) - (1+6) = 0) Yes; 3729 ((7+9) - (3+2) = 11) Yes; 25176 ((5+7) - (2+1+6) = 3) No
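The rules for 7 and 11 are the least obvious ones in the table, so here is a short Python sketch of both. The function names are illustrative; the rule for 7 is applied repeatedly until the number is below 70, at which point it is small enough to check directly.

```python
def divisible_by_7(n):
    """Repeatedly subtract twice the last digit from the rest of the number."""
    n = abs(n)
    while n >= 70:
        n = abs(n // 10 - 2 * (n % 10))
    return n % 7 == 0  # small remaining value, checked directly


def divisible_by_11(n):
    """Sum of every second digit minus the sum of the other digits."""
    digits = [int(d) for d in str(abs(n))]
    alternating = sum(digits[1::2]) - sum(digits[0::2])
    return alternating % 11 == 0


print(divisible_by_7(672), divisible_by_7(905))
print(divisible_by_11(1364), divisible_by_11(3729), divisible_by_11(25176))
```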
Exercise # 4
Q. 1 In a single throw of two fair dice, find the probability that the product of the numbers is
i) Between 8 and 16
ii) Divisible by 4
iii) Divisible by 4 or 6
Q.2 A card is selected from an ordinary deck of 52 cards. What is the probability of getting
i) A queen
ii) A diamond
iii) Picture card
iv) The king of clubs
Q.3 A bag contains 6 white and 4 black identical balls. A ball is selected at random then what is the
probability that the selected ball is white.
Q.4 A bag contains 3 white and 2 black balls. If two balls are selected at random, what is the probability
that
i) Both balls are white
ii) Both are of different colours
Q.5 From a group of 6 men and 8 women 5 people are chosen at random. Find the probability that there are
more men than women.
Q.6 Three applicants are to be selected at random out of 4 boys and 6 girls. What is the probability of
selecting:
i) All girls
ii) At least one boy
Q.7 A bag contains 14 identical balls, of which 4 are red, 5 black and 5 white. Six balls are drawn from the bag.
Find the probability that:
i) 3 are white
ii) at least two are white
Q.8 Three distinct integers are chosen at random from the first 20 positive integers. Compute the
probability that:
i) Their sum is even
ii) Their product is even
Q.9 Of 12 eggs in a refrigerator 2 are bad. From those 4 eggs are chosen at random to make a cake. What is
the probability that
i) Exactly one is bad
ii) At least one is bad
Q.10 The face cards are removed from a full pack. Out of the remaining 4 are drawn at random. What is
the probability that they belong to different suits.
Q.11 A normal pack of 52 cards contains four aces and 48 other cards. Find the probability that a random
hand of 13 cards contains
i) Four aces
ii) No ace
iii) At least one ace
iv) At least 2 aces
Q.12 If a symmetrical 6 sided die is thrown 4 times. What is the probability that at least one six appears?
Q.13 A store receives 5 red shirts and 10 green shirts. A random sample of 5 shirts is selected. Determine
the probability that:
i) It contains 3 red shirts
ii) It contains 1 red shirt
iii) What percentage of the samples contain 3 green shirts
Example: If there is a party and every person shakes hands with every other person once, and there are 45
handshakes, how many people are there at the party?
Solution: If there are n people at the party, then each person will shake hands with n-1 other people. So
with n people each making (n-1) handshakes, it appears at first sight that there are n(n-1) handshakes.
However, each handshake will have been counted twice, i.e. A->B and
B->A, so we must divide by 2.
n(n-1)/2 = 45
n(n-1) = 90
n^2 - n - 90 = 0
(n - 10)(n + 9) = 0
Since n must be positive, n = 10 people.
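The handshake count n(n-1)/2 is the same as nC2, so the party size can also be found by a simple search. This Python sketch is illustrative only; the function name is invented here.

```python
from math import comb


def people_for_handshakes(handshakes):
    """Smallest n with nC2 equal to the given number of handshakes, if any."""
    n = 1
    while comb(n, 2) < handshakes:
        n += 1
    return n if comb(n, 2) == handshakes else None


print(people_for_handshakes(45))
```

The function returns None when no whole number of people gives exactly that many handshakes.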
Axiom, in mathematics and logic, general statement accepted without proof as the basis for logically
deducing other statements (theorems). Examples of axioms used widely in mathematics are those related to
equality (e.g., "Two things equal to the same thing are equal to each other"; "If equals are added to equals,
the sums are equal") and those related to operations (e.g., the associative law and the commutative law). A
postulate, like an axiom, is a statement that is accepted without proof; however, it deals with specific subject
matter (e.g., properties of geometrical figures) and thus is not so general as an axiom. It is sometimes said
that an axiom or postulate is a "self-evident" statement, but the truth of the statement need not be evident
and may in some cases even seem to contradict common sense. Moreover, a statement may be an axiom or
postulate in one deductive system and may instead be derived from other statements in another system. A set
of axioms on which a system is based is often required to be independent; i.e., no one of its members can be
deduced from any combination of the others. (Historically, the development of non-Euclidean geometry
grew out of attempts to prove or disprove the independence of the parallel postulate of Euclid.) The axioms
should also be consistent; i.e., it should not be possible to deduce contradictory statements from them.
Completeness is another property sometimes mentioned in connection with a set of axioms; if the set is
complete, then any true statement within the system described by the axioms may be deduced from them.
Definition: The event A is said to imply the event B if every element of A is also an element of B,
written A ⊂ B or B ⊃ A. If A implies B and B implies A, then A and B are equal, or A = B.
Definition: The union of A and B (that is, A or B), denoted A ∪ B, is the set of all elementary events
that are either in A or B (or both). A ∪ B = {x ∈ S : x ∈ A or x ∈ B or x ∈ A ∩ B}
Definition: The intersection of A and B (that is, A and B), denoted A ∩ B or, more conveniently, AB,
is the set of all elementary events that are in both A and B. A ∩ B = {x ∈ S : x ∈ A and x ∈ B}
Definition: Given two sets A and B, the set difference of A and B is denoted A − B or A\B, and is defined
as
A − B = {x ∈ S : x ∈ A and x ∉ B}
Definition: A collection of sets {Ai, i = 1, 2, ..., n} (we will permit n = ∞) is said to be disjoint or mutually
exclusive if
Ai ∩ Aj = ∅ for all i, j such that i ≠ j.
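The set operations defined above map directly onto Python's built-in set type. The sketch below uses an illustrative sample space (the faces of one die) and two illustrative events; none of these specific sets come from the notes.

```python
# Sample space: the faces of one die (an illustrative choice, not from the notes).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # "the face is even"
B = {4, 5, 6}   # "the face is greater than 3"

union = A | B            # A ∪ B: in A or in B (or both)
intersection = A & B     # A ∩ B: in both A and B
difference = A - B       # A − B: in A but not in B
complement_A = S - A     # complement of A within S

print(union, intersection, difference, complement_A)
print({2} <= A)                      # True: the event {2} implies A
print(A.isdisjoint(complement_A))    # True: A and its complement are mutually exclusive
```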
Note:
The sum of the probabilities of all possible outcomes is equal to one, i.e. Σ P(Ei) = 1, where the sum runs over i = 1, 2, ..., n.
Laws of Probability:
Prove the following probability laws:
a. If ∅ is the impossible event, then P(∅) = 0
b. P(Ā) = 1 − P(A)
If A and B are any two events defined in a sample space
c. If A ⊂ B, then P(A) ≤ P(B)
d. P(Ā ∩ B) = P(B) − P(A ∩ B)
e. P(A ∩ B̄) = P(A) − P(A ∩ B)
f. P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
   If A and B are mutually exclusive then
   P(A ∪ B) = P(A) + P(B)
g. If A, B and C are any three events in a sample space S, then the probability of at least one of them
occurring is given by
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
In general, the formula for k events is
P(A1 ∪ A2 ∪ ... ∪ Ak) = Σ P(Ai) − Σ P(Ai ∩ Aj) + Σ P(Ai ∩ Aj ∩ Am) − ... + (−1)^(k+1) P(A1 ∩ A2 ∩ ... ∩ Ak),
where the sums run over i < j < m.
h. P(A ∩ B) ≥ 1 − P(Ā) − P(B̄)
Proof:
1. If ∅ is the impossible event, then P(∅) = 0
Proof: S ∪ ∅ = S
Applying probability, we get
P(S ∪ ∅) = P(S)
P(S) + P(∅) = P(S), since S and ∅ are mutually exclusive events
P(∅) = 0. Hence proved.
2. P(Ā) = 1 − P(A)
Proof: A ∪ Ā = S
Applying probability, we get
P(A ∪ Ā) = P(S)
P(A) + P(Ā) = 1, since A and Ā are mutually exclusive events and P(S) = 1
P(Ā) = 1 − P(A). Hence proved.
3. If A ⊂ B, then P(A) ≤ P(B)
Proof: From the Venn diagram (A drawn inside B),
Shaded area = Ā ∩ B
B = A ∪ (Shaded area)
= A ∪ (Ā ∩ B)
Now applying probability, we get
P(B) = P{A ∪ (Ā ∩ B)}
= P(A) + P(Ā ∩ B), since A and (Ā ∩ B) are mutually exclusive
P(A) ≤ P(B), since P(Ā ∩ B) ≥ 0
Hence proved
4. P(Ā ∩ B) = P(B) − P(A ∩ B)
Proof:
From the Venn diagram, we can write
B = (Shaded area) ∪ (A ∩ B)
= (Ā ∩ B) ∪ (A ∩ B)
P(B) = P((Ā ∩ B) ∪ (A ∩ B))
= P(Ā ∩ B) + P(A ∩ B),
since (Ā ∩ B) and (A ∩ B) are mutually exclusive
P(Ā ∩ B) = P(B) − P(A ∩ B). Hence proved
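5. P(A ∩ B̄) = P(A) − P(A ∩ B)
Proof: Do it yourself; the argument is similar to 4, with the roles of A and B interchanged.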
6. P(A ∪ B) = P(A) + P(B) − P(A ∩ B); also,
if A and B are mutually exclusive then P(A ∪ B) = P(A) + P(B)
Proof:
From the Venn diagram, we can write
A ∪ B = A ∪ (Shaded area)
= A ∪ (Ā ∩ B)
P(A ∪ B) = P(A) + P(Ā ∩ B) ------- (i)
since (Ā ∩ B) and A are mutually exclusive
Now considering
B = (Shaded area) ∪ (A ∩ B)
= (Ā ∩ B) ∪ (A ∩ B)
P(B) = P((Ā ∩ B) ∪ (A ∩ B))
= P(Ā ∩ B) + P(A ∩ B), since (Ā ∩ B) and (A ∩ B) are mutually exclusive
P(Ā ∩ B) = P(B) − P(A ∩ B)
Substituting in (i): P(A ∪ B) = P(A) + P(B) − P(A ∩ B). Hence proved
a. If A & B are mutually exclusive then P(A ∩ B) = 0, therefore
P(A ∪ B) = P(A) + P(B)
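7. If A, B and C are any three events in a sample space S, then the probability of at least one of
them occurring is given by:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
In summation form, for three events A1, A2, A3:
P(A1 ∪ A2 ∪ A3) = Σ P(Ai) − Σ P(Ai ∩ Aj) + P(A1 ∩ A2 ∩ A3), where the second sum is over pairs i < j.
For k events the same pattern continues with alternating signs up to the term (−1)^(k+1) P(A1 ∩ A2 ∩ ... ∩ Ak).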
8. P(A ∩ B) ≥ 1 − P(Ā) − P(B̄)
Proof: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(A ∩ B) = P(A) + P(B) − P(A ∪ B)
= 1 − P(Ā) + 1 − P(B̄) − P(A ∪ B)
= 1 − P(Ā) − P(B̄) + 1 − P(A ∪ B)
P(A ∩ B) ≥ 1 − P(Ā) − P(B̄), since 1 − P(A ∪ B) ≥ 0
Hence proved
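The laws proved above can be spot-checked on a small finite sample space. The sketch below is only an illustration in Python: the sample space (two fair dice) and the events A, B, C are assumptions chosen for the check, and `P` is a hypothetical helper that counts equally likely outcomes. A numerical check on one example is not a proof.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice (illustrative assumption).
S = set(product(range(1, 7), repeat=2))


def P(event):
    """Probability of an event (a subset of S) under equally likely outcomes."""
    return Fraction(len(event), len(S))


A = {s for s in S if sum(s) > 6}       # the sum is greater than 6
B = {s for s in S if s[0] == s[1]}     # both dice show the same face
C = {s for s in S if s[0] == 1}        # the first die shows 1
Ac, Bc = S - A, S - B                  # complements of A and B

# Note: on Python sets, `|` is union and `&` is intersection (not "given").
assert P(set()) == 0                                    # law a
assert P(Ac) == 1 - P(A)                                # law b
assert (A & B) <= A and P(A & B) <= P(A)                # law c, with A ∩ B ⊂ A
assert P(Ac & B) == P(B) - P(A & B)                     # law d
assert P(A & Bc) == P(A) - P(A & B)                     # law e
assert P(A | B) == P(A) + P(B) - P(A & B)               # law f
assert P(A | B | C) == (P(A) + P(B) + P(C) - P(A & B) - P(A & C)
                        - P(B & C) + P(A & B & C))      # law g
assert P(A & B) >= 1 - P(Ac) - P(Bc)                    # law h

print(P(A | B))  # probability of the union A ∪ B
```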
Problem: For each of the following, state whether it is always true, always
false, or neither.
(a) P(A) < P(A ∩ B)
(b) P(A) > P(A ∪ B)
(c) P(A ∩ B) ≤ P(A)
(d) P(A|B) = 1 − P(Ā|B), where P(B) > 0 and Ā is the
complement of A (i.e. the set of outcomes that are not in A).
Solution
(a) Always false. Since A ∩ B ⊂ A, we always have P(A) ≥ P(A ∩ B).
More intuitively, it can never be strictly more likely for A and B to
occur than for A to occur.
(b) Always false. Since A ⊂ A ∪ B, we always have P(A) ≤ P(A ∪ B). As
before, it can never be strictly more likely for A to occur than for A or
B to occur.
(c) Always true. We showed this in part (a). In particular, whenever A
and B happen, then A must have happened, so A is at least as probable
as A and B.
(d) Always true. P(A|B) + P(Ā|B) = P(Ω|B) = 1. Subtracting P(Ā|B)
from both sides gives P(A|B) = 1 − P(Ā|B). If I tell you that B
happens, then you can divide this into two disjoint pieces: when B
and A happen and when B happens and A doesn't happen (that is,
when B and Ā happen). And these pieces make up everything, so their
probabilities must sum to 1.
Example: What is the probability of being dealt a bridge hand void in a specified suit from an ordinary
deck of 52 playing cards?
Solution:
The probability of something is the number of ways it works divided by the total number of ways possible.
There are a total of
C(52,13) = 52! / (13!(52 − 13)!) = 635013559600 different possible bridge hands.
A hand void in the specified suit takes all 13 of its cards from the other 39 cards, which can be done in
C(39,13) = 8122425444 ways, so the probability is
8122425444 / 635013559600 = 0.01279, or about a 1.279 percent chance.
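The two counts in this example can be reproduced with Python's standard library. This is a small sketch under the assumption that a hand void in the specified suit is simply any 13 cards drawn from the other 39 cards; the variable names are illustrative.

```python
from math import comb

total_hands = comb(52, 13)   # all possible 13-card bridge hands
void_hands = comb(39, 13)    # hands drawn only from the other three suits
probability = void_hands / total_hands

print(total_hands, void_hands, round(probability, 5))
```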
Conditional Probability:
Conditional probability represents the chance that one event will occur given that a second event has already
occurred.
Conditional probability can be defined as the probability of an event A if we assume that another event B
has occurred. We write the conditional probability of A, given B, as P(A/B) and compute it by the formula
P(A/B) = P(A ∩ B) / P(B); provided that P(B) ≠ 0
Similarly, the conditional probability of B, given A, can be written as
P(B/A) = P(A ∩ B) / P(A); provided that P(A) ≠ 0
Example: A math teacher gave her class two tests. 25% of the class passed both tests and 42% of the
class passed the first test. What percent of those who passed the first test also passed the second test?
Solution: Let A = the students pass the first test &
B = the students pass the second test
Given that
P(A) = 0.42, P(A ∩ B) = 0.25
P(B/A) = P(A ∩ B) / P(A) = 0.25 / 0.42 = 0.595, or about 60%
Example: At National Textile University, the probability that a student takes Technology and Computer
is 0.087. The probability that a student takes Technology is 0.68. What is the probability that a student takes
Computer given that the student is taking Technology?
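Solution: Let A = the student takes Computer and B = the student takes Technology.
P(A ∩ B) = 0.087, P(B) = 0.68
P(A/B) = P(A ∩ B) / P(B) = 0.087 / 0.68 = 0.128 (approximately)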
P(A/B) = P(A ∩ B) / P(B) = (6/36) / (18/36) = 1/3
ii. Let C = The sum is greater than 6
P(A/C) = P(A ∩ C) / P(C) = (6/36) / (21/36) = 2/7
iii. Let D = The two dice had the same outcomes
P(A/D) = P(A ∩ D) / P(D) = (0/36) / (6/36) = 0
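The three conditional probabilities above can be reproduced by counting outcomes of two dice. The statement defining A and B is not present in this section, so the choices below (A = "the sum is 7", B = "the sum is odd") are assumptions, picked only because they give the same numerators and denominators; C and D follow the text. `cond_prob` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

S = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes


def cond_prob(a, b):
    """P(a/b) = P(a ∩ b) / P(b), with events given as predicates on an outcome."""
    n_b = sum(1 for s in S if b(s))
    n_ab = sum(1 for s in S if a(s) and b(s))
    if n_b == 0:
        raise ValueError("conditioning event has probability zero")
    return Fraction(n_ab, n_b)


# Hypothetical events A and B, chosen only because they reproduce the values above.
A = lambda s: sum(s) == 7          # assumed: the sum is 7
B = lambda s: sum(s) % 2 == 1      # assumed: the sum is odd
C = lambda s: sum(s) > 6           # the sum is greater than 6
D = lambda s: s[0] == s[1]         # the two dice had the same outcomes

print(cond_prob(A, B), cond_prob(A, C), cond_prob(A, D))  # 1/3 2/7 0
```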
Example: What is the probability that a randomly selected poker hand contains exactly 3 aces given
that it contains at least 2 aces?
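No worked solution for the poker example appears in this section. One way to compute it, sketched in Python under the usual assumption of a 5-card hand from a 52-card deck, is to count hands by their number of aces and apply P(A/B) = P(A ∩ B) / P(B). The helper name `hands_with_aces` is hypothetical.

```python
from fractions import Fraction
from math import comb


def hands_with_aces(k: int) -> int:
    """Number of 5-card hands containing exactly k of the 4 aces."""
    return comb(4, k) * comb(48, 5 - k)


exactly_three = hands_with_aces(3)
at_least_two = sum(hands_with_aces(k) for k in (2, 3, 4))

# "Exactly 3 aces" is contained in "at least 2 aces", so the intersection
# is just the "exactly 3" event and the common denominator C(52,5) cancels.
p = Fraction(exactly_three, at_least_two)
print(p, float(p))
```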
Solution: Let D1 and D2 be the events of finding a defective fuse in the first and second tests, respectively.
We are interested in P(D1 ∩ D2).
Defectives   Good   Total
2            5      7
P(D1 ∩ D2) = P(D1).P(D2/D1)
= 2/7 × 1/6 = 1/21
This can be generalized:
For three events A, B, C
P(A ∩ B ∩ C) = P(A).P(B/A).P(C/A ∩ B)
Example: Suppose that five good and two defective fuses have been mixed up. To find the defective ones,
we test them one by one at random, and without replacement. What is the probability that we find both of
the defective fuses in exactly three tests?
Solution: Let D1, D2 and D3 be the events that the first, second and third fuses tested are defective,
respectively. Let G1, G2 and G3 be the events that the first, second and third fuses tested are good,
respectively. We are interested in the probability of the event (G1 ∩ D2 ∩ D3) ∪ (D1 ∩ G2 ∩ D3).
Defectives   Good   Total
2            5      7
P{(G1 ∩ D2 ∩ D3) ∪ (D1 ∩ G2 ∩ D3)} = P(G1 ∩ D2 ∩ D3) + P(D1 ∩ G2 ∩ D3)
= P(G1).P(D2/G1).P(D3/G1 ∩ D2) + P(D1).P(G2/D1).P(D3/D1 ∩ G2)
= 5/7 × 2/6 × 1/5 + 2/7 × 5/6 × 1/5 = 2/21 ≈ 0.095
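The fuse calculation can be cross-checked by brute force. The sketch below, in Python, enumerates every testing order of the seven fuses (treated as distinguishable, which is an implementation choice) and counts the orders in which the second defective fuse turns up exactly on the third test. The helper name `second_defective_on_test` is hypothetical.

```python
from fractions import Fraction
from itertools import permutations

fuses = ["D", "D", "G", "G", "G", "G", "G"]   # 2 defective, 5 good

# Every ordering of the 7 (distinguishable) fuses is equally likely.
orders = list(permutations(range(len(fuses))))


def second_defective_on_test(order, k):
    """True if the second defective fuse is found exactly on the k-th test."""
    tested = [fuses[i] for i in order[:k]]
    return tested.count("D") == 2 and tested[-1] == "D"


hits = sum(1 for o in orders if second_defective_on_test(o, 3))
print(Fraction(hits, len(orders)))   # 2/21
```

Calling the same helper with k = 2 gives the 1/21 obtained for finding both defective fuses in the first two tests.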
Exercise:
Q.1 A box contains 15 items, 4 of which are defective and 11 are good. Two items are selected. What is the
probability that the first is good and the second defective? Ans. 0.16
Q.2 Two cards are dealt from a pack of ordinary playing cards. Find the probability that the second card
dealt is a heart. Ans. 0.25
Q.3 Box A contains 5 green and 7 red balls. Box B contains 3 green, 3 red and 6 yellow balls. A box is chosen
at random and a ball is drawn from it. What is the probability that the ball drawn is green? Ans. 1/3
Q.4 An urn contains 10 white and 3 black balls. Another urn contains 3 white and 5 black balls. Two balls
are transferred from the first urn and placed in the second and then one ball is taken from the latter. What is the
probability that it is a white ball?
Q.6 In throwing two fair dice, what is the probability of sum 5 if they land on different numbers?
Q.7 If eight defective and 12 nondefective items are inspected one by one, at random, and without
replacement, what is the probability that (a) the first four items inspected are defective; (b) from the first
three items at least two are defective? Ans. 0.0144, 0.344
Independent Events:
Two events A and B in the same sample space S are said to be independent if the occurrence of one event
does not affect the occurrence or non-occurrence of the other, that is, P(A/B) = P(A) and P(B/A) = P(B). It
then follows that
two events A and B are independent if and only if P(A ∩ B) = P(A).P(B).
Informally, independent events are two events in which the outcome of the second is not affected by the
outcome of the first, for example:
• Landing on heads after tossing a coin AND rolling a 5 on a single 6-sided die.
• Choosing a marble from a jar AND landing on heads after tossing a coin.
• Choosing a 3 from a deck of cards, replacing it, AND then choosing an ace as the
second card.
• Rolling a 4 on a single 6-sided die, AND then rolling a 1 on a second roll of the die.
Important Note:
1. Two mutually exclusive events A and B are independent if and only if P(A).P(B) = 0, which is true
only if either P(A) = 0 or P(B) = 0.
2. If both events A and B have nonzero probabilities and are independent then they can never be
mutually exclusive.
3. Two events with nonzero probabilities that are mutually exclusive are also dependent.
For m.e. events A and B: P(A ∩ B) = 0 ≠ P(A).P(B)
⇒ A & B are dependent
4. Two events that are not mutually exclusive may be either independent or dependent.
For not m.e. events A and B: P(A ∩ B) ≠ 0 ⇒ either P(A ∩ B) = P(A).P(B) or P(A ∩ B) ≠ P(A).P(B)
⇒ A & B are either independent or dependent.
Example: A drawer contains 3 red paperclips, 4 green paperclips, and 5 blue paperclips. One paperclip is
taken from the drawer and then replaced. Another paperclip is taken from the drawer. What is the
probability that the first paperclip is red and the second paperclip is blue?
Solution: Because the first paper clip is replaced, the sample space of 12 paperclips does not change from
the first event to the second event. The events are independent.
P(red then blue) = P(red) x P(blue) = 3/12 • 5/12 = 15/144 = 5/48.
Example: A dresser drawer contains one pair of socks of each of the following colors: blue, brown, red,
white and black. Each pair is folded together in matching pairs. You reach into the sock drawer and choose a
pair of socks without looking. The first pair you pull out is red -the wrong color. You replace this pair and
choose another pair. What is the probability that you will choose the red pair of socks twice?
Solution: P(red) = 1/5
P(red and red) = P(red) · P(red) = 1/5 × 1/5 = 1/25
Example: A card is chosen at random from a deck of 52 cards. It is then replaced and a second card is
chosen. What is the probability of choosing a jack and an eight?
Solution: P(jack) = 4/52
P(8) = 4/52
P(jack and 8) = P(jack) · P(8) = 4/52 × 4/52 = 1/169
Theorem: If A and B are two independent events in a sample space, then show that
i. A and B̄ are independent
ii. Ā and B are independent
iii. Ā and B̄ are independent
Solution:
Since A & B are independent, P(A ∩ B) = P(A).P(B)
i. The events A ∩ B and A ∩ B̄ are m.e. and their union is A, i.e. A = (A ∩ B) ∪ (A ∩ B̄).
Therefore P(A) = P(A ∩ B) + P(A ∩ B̄).
P(A ∩ B̄) = P(A) − P(A ∩ B)
= P(A) − P(A).P(B) [since A & B are independent]
= P(A)[1 − P(B)]
= P(A)P(B̄)
Hence A and B̄ are independent
ii. Similarly P(B) = P(A ∩ B) + P(Ā ∩ B).
P(Ā ∩ B) = P(B) − P(A ∩ B)
= P(B) − P(A).P(B) [since A & B are independent]
= P(B)[1 − P(A)]
= P(B)P(Ā)
= P(Ā)P(B)
Hence Ā and B are independent
iii. Using De Morgan's Law, Ā ∩ B̄ is the complement of A ∪ B
P(Ā ∩ B̄) = P(complement of A ∪ B)
= 1 − P(A ∪ B)
= 1 − [P(A) + P(B) − P(A ∩ B)]
= 1 − [P(A) + P(B) − P(A).P(B)]
= {1 − P(A)} − P(B) + P(A).P(B)
= P(Ā) − P(B){1 − P(A)}
= P(Ā) − P(B)P(Ā)
= P(Ā)[1 − P(B)]
= P(Ā)P(B̄)
Hence Ā and B̄ are independent
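The theorem can be illustrated on a concrete pair of independent events. This is a minimal Python sketch: the sample space (two fair dice) and the events A and B are assumptions chosen for illustration, and `P` and `independent` are hypothetical helper names. It checks the product rule for A and B and for each of the three pairs involving complements.

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # two fair dice (illustrative assumption)


def P(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(S))


def independent(x, y):
    """True if P(x ∩ y) = P(x).P(y)."""
    return P(x & y) == P(x) * P(y)


A = {s for s in S if s[0] % 2 == 0}   # the first die is even
B = {s for s in S if s[1] > 4}        # the second die shows 5 or 6
Ac, Bc = S - A, S - B                 # complements of A and B

print(independent(A, B), independent(A, Bc), independent(Ac, B), independent(Ac, Bc))
```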
Exercise:
Q.1 Two cards are drawn from a well-shuffled ordinary deck of 52 cards. Find the probability that they
are both aces if the first card is (i) replaced, (ii) not replaced.
Ans. 1/169, 1/221
Q.2 A pair of fair dice is thrown twice. What is the probability of getting totals of 5 and 11?
Ans. 1/81
Q.3 The probability that a man will be alive in 25 years is 3/5, and the probability that his wife will be
alive in 25 years is 2/3. Find the probability that (i) both will be alive, (ii) only the man will be alive, (iii)
only the wife will be alive, (iv) at least one will be alive and (v) neither will be alive in 25 years.
Ans. 2/5, 1/5, 4/15, 13/15, 2/15
Q.4 Three missiles are fired at a target. If the probabilities of hitting the target are 0.4, 0.5 and 0.6,
respectively, and if missiles are fired independently, what is the probability
i. That all the missiles hit the target?
ii. That at least one of the three hits the target?
iii. That exactly one hits the target?
iv. That exactly 2 hit the target? Ans. 0.12, 0.88, 0.38, 0.38
Q.5 A committee of three, A, B and C, is to make a decision on the basis of majority vote. What is the
probability of a wrong decision by the committee if the probabilities of a wrong decision by each
member are 0.05, 0.04, and 0.10 respectively?
1. The probability that the Red River will flood in any given year has been estimated from 200 years of
historical data to be one in four. This means:
a. The Red River will flood every four years.
b. In the next 100 years, the Red River will flood exactly 25 times.
c. In the last 100 years, the Red River flooded exactly 25 times.
d. In the next 100 years, the Red River will flood about 25 times.
e. In the next 100 years, it is very likely that the Red River will flood exactly 25 times.
2. The chances that you will be ticketed for illegal parking on campus are about 1/3. During the last nine
days, you have illegally parked every day and have NOT been ticketed (you lucky person)! Today, on
the 10th day, you again decide to park illegally. The chances that you will be caught are:
a. greater than 1/3 since you were not caught in the last nine days.
b. less than 1/3 since you were not caught in the last nine days.
c. still equal to 1/3 since the last nine days do not affect the probability.
d. equal to 1/10 since you were not caught in the last nine days.
e. equal to 9/10 since you were not caught in the last nine days.
3. The chance that a person will contract AIDS after a sexual contact with an infected partner has been
estimated to be 1/4. This means:
a. A person will be infected after exactly 4 sexual contacts with infected partners.
b. Of 1000 people having sexual contacts with infected partners, exactly 250 will become
infected.
c. Of 200 people having sexual contacts with infected partners, about 50 will become infected.
d. In exactly 25% of all sexual contacts with infected partners, the infection will spread.
e. Of 20 people having sexual contacts with infected partners, it is very likely that exactly 5
people will become infected.
4. A random variable Y has the following distribution:
Y    | -1   0    1    2
P(Y) | 3C   2C   0.4  0.1
a. 0.10
b. 0.15
c. 0.20
d. 0.25
e. 0.75
5. A random variable X has a probability distribution as follows:
r | 0 1 2 3
P(R=r) | 2k 3k 13k 2k
a. .90
b. .25
c. .65
d. .15
e. 1.00
6. Suppose that the allele for tallness (T) is dominant over shortness (t); that for Yellow (Y) is dominant
over green (y); and that for roundness (W) is dominant over wrinkled(w). Suppose we cross two plants
with genotypes TTYyWw and TtYyWw. The probability of a Tall, Yellow, Round plant is:
a. 9/16
b. 3/32
c. 1/16
d. 9/32
e. 3/16
1. A company president is deciding how to fill three vice-presidencies in the company: VP-Marketing,
VP-Finance, and VP-Production. Twelve executives are eligible and qualified for promotion, and
each could fill any of the three positions. In how many ways can the positions be filled?
See if you can do the problem two (slightly) different ways, which might actually represent two
different methods of thinking that the decision-maker might use.
The Lottery Commission is considering a new game in which five balls would be withdrawn
from a box containing 10 balls, numbered 0 to 9. The five balls would come out of the box at nearly
the same time, as they do in the current Lotto game, in which six balls come out of a box into a tube
at nearly the same time.
In this new game, however, the winning ticket must have the five lucky numbers in the same
order as they came out of the box.
3. Permutations/Combinations--Dinner Party
a. "I forgot to buy vegetables for our dinner party tonight. Will you go back
to the store and get three bags of frozen vegetables?" If the store has 10 bags each of 20 different frozen
vegetables, in how many ways, with respect to kinds of vegetables, can the errand be performed?
b. "I forgot to buy vegetables for our dinner party tonight. Will you go back
to the store and get three bags of different frozen vegetables?" If the store has 10 bags each of 20
different frozen vegetables, in how many ways, with respect to kinds of vegetables, can the errand be
performed?
c. In setting the dinner table, including place cards, for eight people, how
many seating arrangements are possible?
d. What if it is a round table and it really does not matter exactly where
people sit, but it does matter who is sitting next to whom? How many seating arrangements are possible
with 8 people?
4. Card Game
a. If you are dealt a hand of five cards from a standard deck of 52 cards, what
is the probability that your hand contains four aces? (This can be done using a permutation/combination
computation, but there is another way.)
b. What is the probability that your hand contains a straight flush--a sequence
of five consecutive cards, all of the same suit? (A-K-Q-J-10 is the highest sequence in each suit, and 6-
5-4-3-2 is the lowest.)
SOLUTIONS
1. Permutations/Combinations--Executive Decision
Method One: The decision-maker first selects three people from among the twelve, not yet
thinking about their job assignments (order not important). This can be done C(12,3) = 220 ways.
Then the decision-maker assigns the three chosen people to the three jobs (order important). This
can be done P(3,3) = 6 ways. So the total number of ways is 220 x 6 = 1,320.
Method Two: The decision maker selects a person and immediately assigns him/her to one of the three
positions (order important). This is repeated two more times. The number of ways is P(12,3) = 1,320.
Method Three--formula-free sequential method: 12 people can be selected for the first position, 11 for
the second, and 10 for the third. 12 x 11 x 10 = 1,320.
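The three methods can be compared directly with the standard library's counting functions. A minimal Python sketch (variable names are illustrative):

```python
from math import comb, perm

method_one = comb(12, 3) * perm(3, 3)   # choose 3 people, then assign the 3 jobs
method_two = perm(12, 3)                # ordered selection of 3 from 12
method_three = 12 * 11 * 10             # sequential counting

print(method_one, method_two, method_three)   # 1320 1320 1320
```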
Formula-free sequential method: the first number has 10 possibilities, the second number has 9, the third
number has 8, the fourth number has 7, and the fifth number has 6. 10 x 9 x 8 x 7 x 6 = 30,240.
3. Permutations/Combinations--Dinner Party
Note: As long as there are three or more bags of each vegetable, the actual number of bags of each
vegetable is not relevant. The question asked how many ways "with respect to kinds of vegetables"
can the errand be performed. Therefore "n" is the number of kinds of vegetables, not the total
number of bags.
Formula-free: going around the table, there are 8 choices for the first seat, 7 choices for the second
seat, etc. 8 x 7 x 6 x 5 x 4 x 3 x 2 x 1 = 40,320.
d. The first person can sit anywhere. There are then 7 choices for the person
to the first person's right, 6 choices for the person to that person's right, etc. 7 x 6 x 5 x 4 x 3 x 2 x 1 =
5,040. This is also equal to P(7,7).
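The dinner-party counts can be sketched in Python as follows. The values for parts a and b are not shown in this section; the formulas used for them here are the standard combinations with repetition, C(n + r − 1, r), and without repetition, C(n, r), applied on the assumption (from the note above) that only the kinds of vegetables matter. Parts c and d follow the formula-free reasoning given in the text.

```python
from math import comb, factorial

kinds = 20   # number of kinds of frozen vegetables
bags = 3     # number of bags to bring back

# 3a: kinds may repeat -> combinations with repetition, C(n + r - 1, r)
with_repeats = comb(kinds + bags - 1, bags)
# 3b: three different kinds -> ordinary combinations, C(n, r)
all_different = comb(kinds, bags)
# 3c: eight labelled seats (place cards) -> 8!
labelled_seats = factorial(8)
# 3d: round table where only the neighbours matter, as reasoned above -> 7!
round_table = factorial(8 - 1)

print(with_repeats, all_different, labelled_seats, round_table)
```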
4. Card Game
a. Method One: There are C(52,5) = 2,598,960 possible hands. 48 of these contain 4 aces (4 aces and
the king of spades, 4 aces and the queen of spades, 4 aces and the jack of spades, etc.), so the
probability is 48/2,598,960 = 0.000018469.
Or, the "sequential approach." As you pick up your cards, what is the probability of picking up four
aces, followed by a non-ace? By the multiplicative rule, this is:
(4/52)(3/51)(2/50)(1/49)(48/48) = 0.0000036938
But you do not have to get the four aces first and the non-ace last. You could get the non-ace fourth:
(4/52)(3/51)(2/50)(48/49)(1/48) = 0.0000036938
Or you could get the non-ace first, followed by the four aces:
(48/52)(4/51)(3/50)(2/49)(1/48) = 0.0000036938
By the addition rule, the union of all of the "or's" above would be the sum of the five individual
probabilities, which is 0.000018469, in exact agreement with the first method.
b. There are C(52,5) = 2,598,960 possible hands. Among these, how many are "straight flushes?"
There are nine straight flushes in each suit: A-K-Q-J-10, K-Q-J-10-9, Q-J-10-9-8, J-10-9-8-7, 10-9-8-7-6,
9-8-7-6-5, 8-7-6-5-4, 7-6-5-4-3, and 6-5-4-3-2. There are four suits, so there are 36 possible
straight flushes, and the probability is 36/2,598,960 = 0.000013852.
Note that this is rarer than four-of-a-kind, even four aces. So, in poker, any straight flush will beat
any four-of-a-kind.
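The agreement between the counting method and the sequential method can be confirmed with exact fractions. The Python sketch below uses illustrative variable names and takes the count of nine straight flushes per suit from the text.

```python
from fractions import Fraction
from math import comb

hands = comb(52, 5)   # all 5-card hands

# Four aces, counting method: 48 choices for the fifth card.
p_four_aces = Fraction(48, hands)

# Four aces, sequential method: one particular order (four aces, then a non-ace),
# multiplied by the 5 possible positions of the non-ace.
one_order = (Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)
             * Fraction(1, 49) * Fraction(48, 48))
p_sequential = 5 * one_order

# Straight flushes as counted in the text: 9 per suit, 4 suits.
p_straight_flush = Fraction(9 * 4, hands)

print(hands, float(p_four_aces), float(p_straight_flush))
```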
3. The following data are the results of an extensive series of standard linear-density tests carried out on
a large delivery of worsted yarn.
Results of Linear-density Tests (tex)
on a Large Consignment of Worsted Yarn
31.3 31.3 31.5 31.3 31.3 32.0 31.9 31.8 33.1 30.6
30.2 31.2 29.6 32.7 32.7 31.8 30.2 31.8 30.5 30.5
31.4 30.6 31.4 31.5 30.1 30.3 31.2 30.7 30.9 31.9
30.9 30.1 32.4 32.8 31.6 31.8 31.7 29.5 30.7 31.6
30.6 31.4 31.0 31.0 30.5 30.5 31.0 29.1 30.2 31.1
29.8 30.6 32.2 30.4 32.1 31.7 31.5 31.7 31.4 30.4
31.5 30.4 31.3 31.9 31.1 31.9 32.0 31.6 30.3 32.1
31.0 31.4 33.1 30.6 31.2 32.2 32.6 31.9 32.2 31.3
30.7 30.9 30.7 32.3 32.7 31.3 32.5 31.3 31.3 31.5
31.9 31.0 31.0 32.3 31.5 29.8 32.4 31.7 31.6 32.0
30.6 30.8 31.1 32.1 29.9 31.6 30.6 30.6 31.1 30.0
32.4 31.1 29.7 31.2 30.6 31.5 31.0 31.1 31.2 31.6
31.1 30.8 30.9 31.6 30.6 30.4 30.9 29.7 30.2 30.1
30.3 29.4 30.0 30.0 32.8 31.9 30.7 31.7 31.8 31.5
31.0 30.8 32.1 30.8 31.1 32.5 31.7 30.5 30.5 30.5
31.1 31.2 31.4 29.5 31.5 31.2 31.4 30.1 32.2 30.5
31.2 30.9 30.6 31.2 30.3 30.6 31.8 31.4 30.6 31.3
30.9 31.2 30.2 29.6 31.2 29.9 30.5 31.1 30.8 31.8
31.4 29.3 31.2 31.1 31.1 31.0 31.0 30.7 31.3 30.7
31.0 30.2
4. The following are the results of counting the number of warp breakages during the weaving of 92
standard lengths of a certain kind of cloth.
Construct a frequency distribution for these data, calculate the relative frequencies, and draw a
frequency polygon.
6. Rayon yarn is wound on metal spools that are made to a specified mass of 226 g with a tolerance range of
3 g. A random sample of 100 spools was found to have the following masses (g):
206 210 231 235 225 225 223 210 212 218
227 211 208 230 228 223 230 228 208 226
209 228 210 208 206 210 227 215 213 210
218 208 226 227 207 207 226 226 232 226
227 225 228 227 209 225 234 209 223 210
233 217 227 210 228 210 225 229 210 231
228 226 208 224 216 210 217 227 226 219
207 208 225 212 210 224 208 209 223 230
232 230 209 220 223 206 206 226 209 222
209 227 211 218 227 207 209 226 229 225
Construct a frequency distribution with a class interval of 3 g and draw the corresponding histogram.
Comment on the performance of the spool-making machines and on the size of the tolerance range.
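One possible way to build the frequency distribution is sketched below in Python. The choice to start the first class at the smallest observed mass is an assumption (the exercise fixes only the class interval of 3 g), and `class_of` is a hypothetical helper. The text histogram is a rough stand-in for the drawn one; the interpretation asked for in the exercise is left to the reader.

```python
from collections import Counter

# Spool masses (g), copied row by row from the table above.
masses = [
    206, 210, 231, 235, 225, 225, 223, 210, 212, 218,
    227, 211, 208, 230, 228, 223, 230, 228, 208, 226,
    209, 228, 210, 208, 206, 210, 227, 215, 213, 210,
    218, 208, 226, 227, 207, 207, 226, 226, 232, 226,
    227, 225, 228, 227, 209, 225, 234, 209, 223, 210,
    233, 217, 227, 210, 228, 210, 225, 229, 210, 231,
    228, 226, 208, 224, 216, 210, 217, 227, 226, 219,
    207, 208, 225, 212, 210, 224, 208, 209, 223, 230,
    232, 230, 209, 220, 223, 206, 206, 226, 209, 222,
    209, 227, 211, 218, 227, 207, 209, 226, 229, 225,
]

WIDTH = 3             # class interval in grams, as the exercise asks
START = min(masses)   # assumption: the first class starts at the smallest mass


def class_of(x):
    """Return the (lower, upper) limits of the 3 g class containing x."""
    lower = START + ((x - START) // WIDTH) * WIDTH
    return lower, lower + WIDTH - 1


freq = Counter(class_of(x) for x in masses)
for (lower, upper), f in sorted(freq.items()):
    print(f"{lower}-{upper} g  {f:3d}  {'*' * f}")   # class, frequency, text histogram bar
```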
7. Yarn is wound on large spools, which are run at high speeds. At times the spools run erratically, and
when this occurs the operation is stopped and the spools are doffed short of their intended load. The
following data refer to a sample of 147 spools, in which the frequency of doffing is given for the
percentage of the intended load at doffing, arranged in classes as shown.
Class %     0-5   5-15  15-25  25-35  35-45  45-55  55-65  65-75  75-85  85-95  95-100
Frequency   45    4     10     5      2      11     10     6      2      1      51
(number of letters)! / (number of repeating letters)! = 9! / (4! 2!) = 7560
2. Joslyn's dance group consists of 8 girls and 5 boys. 4 people are to be chosen at random to perform at
the next recital. If the group of performers must include at least 2 girls, in how many ways can the
performers be chosen?
Answer:
C(8,2)C(5,2) + C(8,3)C(5,1) + C(8,4)C(5,0)
= (28)(10) + (56)(5) + (70)(1)
= 280 + 280 + 70
= 630
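The count of 630 can be checked in a couple of lines of Python. The sketch sums over the allowed numbers of girls (2, 3 or 4), and the variable names are illustrative.

```python
from math import comb

girls, boys, chosen = 8, 5, 4

# Sum over groups containing 2, 3 or 4 girls; the rest of the group are boys.
ways = sum(comb(girls, g) * comb(boys, chosen - g) for g in range(2, chosen + 1))
print(ways)   # 630
```

The same number results from taking all C(13,4) groups and removing those with no girls or exactly one girl.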
3. There are 12 people on Kari's soccer team. Individual pictures are taken and 8 pictures are selected to be hung
in a row. How many different arrangements of pictures are possible if Kari's picture must be among those
hung?
Answer:
Kari's picture can hang in any of the 8 positions, and the other 7 positions are filled, in order, from the
remaining 11 pictures: 8 × P(11,7) = 8 × 1,663,200 = 13,305,600
so there are 13,305,600 different arrangements.
4. Six strangers arrive at a business seminar and each person shakes hands with every other person. How
many handshakes are there?
Answer:
Each man shakes hands with 5 other men, for a total of 30 handshakes.
However, if man A shakes hands with man B, it is the same as B with A. Therefore we divide our total by two and get
the answer...
15 handshakes
Another way is 5+4+3+2+1 = 15...
5. How many different 4-digit PINs can be created using the numbers 2, 3, 0, 6, 7, and 8 with repetition? I know
that with repetition it would be 6 x 6 x 6 x 6 = 1296.
HOWEVER, for a PIN like 2222, rearranging the numbers is useless because you don't get a new
number. This is the same for a number like 2223 or 2233 <- rearranging is pointless.
So, please help me figure out how to apply these restrictions with all the numbers.
6. There are twenty tiles numbered 1-20. What is the probability of drawing two even-numbered tiles, without
replacing them? Express the answer in permutation form.
P(10,2)/P(20,2): there are 10 even numbers and you grab only 2 tiles; there are a total of 20 and you grab only 2.
= 90/380
= 9/38