
Summarise a set of data using a table or frequency distribution, and display it graphically using a line plot, a
box plot, a bar chart, histogram, stem and leaf plot, or other appropriate elementary device.
Contents
1 Descriptive Statistics
2 Representation of Data
2.1 Frequency Table
2.2 Stem and Leaf Diagrams
2.3 Pie Chart
2.4 Line Plot
2.5 Boxplot
2.6 Bar Chart
2.7 Histogram

Descriptive Statistics
Descriptive statistics describe the main features of a collection of data quantitatively. Descriptive statistics
provide simple summaries about the sample and the measures. Together with simple graphics analysis, they
form the basis of quantitative analysis of data. Statistical inference is the process of drawing conclusions
from data that are subject to random variation. A statistical population is a set of entities concerning which
statistical inferences are to be drawn, often based on a random sample taken from the population. A sample is
a subset chosen from a population for investigation. A random sample is one chosen by a method
involving an unpredictable component.
Distribution - The distribution is a summary of the frequency of individual values or ranges of values for a variable.
The simplest distribution would list every value of a variable and the number of cases that had that value.
Frequency distributions are depicted as a table or as a graph. The range of a set of data is the difference
between the highest and lowest values in the set.
Central Tendency - The central tendency of a distribution locates the "center" of a distribution of values.
The three major types of estimates of central tendency are the mean, the median, and the mode.
Representation of Data
Frequency Table
A frequency table is a way of summarizing a set of data. It is a record of how often each value (or set of
values) of the variable in question occurs.
Example. Suppose that in thirty shots at a target, a marksman makes the following scores:
5,2,2,3,4,4,3,2,0,3,0,3,2,1,5,1,3,1,5,5,2,4,0,0,4,5,4,4,5,5.
CT3_(i)_(1)
The frequencies of the different scores can be summarised as:
Score   Frequency   Percentage
0       4           13%
1       3           10%
2       5           17%
3       5           17%
4       6           20%
5       7           23%
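The tally above can be reproduced mechanically. The following is a minimal sketch in Python; the function name `frequency_table` is our own choice for illustration, and percentages are rounded to the nearest whole number as in the table.

```python
from collections import Counter

# The thirty scores from the marksman example.
scores = [5, 2, 2, 3, 4, 4, 3, 2, 0, 3, 0, 3, 2, 1, 5,
          1, 3, 1, 5, 5, 2, 4, 0, 0, 4, 5, 4, 4, 5, 5]

def frequency_table(values):
    """Return (value, frequency, rounded percentage) rows in ascending order of value."""
    counts = Counter(values)
    n = len(values)
    return [(v, counts[v], round(100 * counts[v] / n)) for v in sorted(counts)]

for score, freq, pct in frequency_table(scores):
    print(f"{score:>5} {freq:>9} {pct:>9}%")
```

The frequencies must add up to the number of observations (here 30), which is a quick check that no score was missed.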
Stem and Leaf Diagrams
A stem and leaf diagram is a way of grouping your data into classes. The good thing about it is that from
the diagram you can obtain the original data- so no information is lost.
Example. Suppose you have the heights of 21 people as follows:
154,143,148,139,143,147,153,162,136,147,144,143,139,142,143,156,151,164,157,149,146.
First decide upon what you want the classes (groups) to be, choosing classes of equal width. We might, for
example, choose our classes to have a width of 5 and have the following classes: 135-139, 140-144,
145-149, 150-154, 155-159, 160-164. We can easily see which heights fall into which classes:
135-139:139,136,139
140-144:143,143,144,143,142,143
145-149:148,147,147,149,146
150-154:154,153,151
155-159:156,157
160-164:162,164.
What we have here is almost a stem and leaf diagram. Note that with the data written in this way you can see
what the modal class is (the one with the most values- it is 140-144). You can also see the shape of the
distribution- most of the values are in the 140's with higher or lower values rarer. To change this into a
proper stem and leaf diagram, we just simplify it a little. Instead of writing out the full figures each time (
143,143,144,143,...) we write 14 and call this the stem and then write 3,3,4,3,... (these being the leaves). So
we finish up with (Key: 13|6 means 136):
Stem Leaf
13 6,9,9
14 2,3,3,3,3,4
14 6,7,7,8,9
15 1,3,4
15 6,7
16 2,4
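The grouping into classes and the split into stems and leaves can be sketched in Python as follows. This is only an illustration of the procedure described above; the function name `stem_and_leaf` and its `width` parameter are assumptions of this sketch, and it expects a class width that divides 10 (such as 5 or 10).

```python
from collections import defaultdict

heights = [154, 143, 148, 139, 143, 147, 153, 162, 136, 147, 144,
           143, 139, 142, 143, 156, 151, 164, 157, 149, 146]

def stem_and_leaf(values, width=5):
    """Group values into classes of the given width and return (stem, leaves) rows."""
    rows = defaultdict(list)
    for v in sorted(values):
        class_start = (v // width) * width   # e.g. 143 -> 140, 147 -> 145
        rows[class_start].append(v % 10)     # the leaf is the last digit
    return [(start // 10, leaves) for start, leaves in sorted(rows.items())]

for stem, leaves in stem_and_leaf(heights):
    print(stem, ",".join(str(leaf) for leaf in leaves))
```

Because every leaf is kept, the original data can be rebuilt from the rows, which is the property that distinguishes this diagram from a plain frequency table.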
Pie Chart
A pie chart is a way of summarising a set of categorical data. It is a circle which is divided into segments.
Each segment represents a particular category. The area of each segment is proportional to the number of
cases in that category. Look at this record of traffic traveling down a particular road.
Traffic Survey 31 January 2008
Type of Vehicle Number of Vehicles
Cars 140
Motorbikes 70
Vans 55
Buses 5
Total vehicles 270
To draw a pie chart, we need to represent each part of the data as a proportion of 360, because there are 360
degrees in a circle. For example, if 55 out of 270 vehicles are vans, we will represent this on the circle as a
segment with an angle of: (55/270) x 360 = 73 degrees. This will give the following results:
Traffic Survey 31 January 2008
Type of Vehicle   Number of Vehicles   Calculation        Degrees of a circle
Cars              140                  (140/270) x 360    187
Motorbikes        70                   (70/270) x 360     93
Vans              55                   (55/270) x 360     73
Buses             5                    (5/270) x 360      7
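The same calculation can be sketched in Python. The function name `pie_angles` is our own; angles are rounded to whole degrees as in the table above.

```python
# Vehicle counts from the traffic survey.
counts = {"Cars": 140, "Motorbikes": 70, "Vans": 55, "Buses": 5}

def pie_angles(category_counts):
    """Convert category counts into pie-chart angles in whole degrees."""
    total = sum(category_counts.values())
    return {name: round(count / total * 360) for name, count in category_counts.items()}

angles = pie_angles(counts)
for name, angle in angles.items():
    print(f"{name:<12}{angle:>4} degrees")
```

In this example the rounded angles still add up to 360; with other data, rounding each angle separately can leave the total a degree or two off.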
Line Plot
A line plot shows data on a number line with x or other marks to show frequency.
Example. The line plot below shows the test scores of 26 students.
Boxplot
A boxplot is a way of summarizing a set of data measured on an interval scale. It is often used in exploratory
data analysis. It is a type of graph which is used to show the shape of the distribution, its central value, and
variability. The picture produced consists of the most extreme values in the data set (maximum and minimum
values), the lower and upper quartiles, and the median.
Example. These MINITAB boxplots represent lottery payoffs for winning numbers for three time periods
(May 1975-March 1976, November 1976-September 1977, and December 1980-September 1981).
The median for each dataset is indicated by the black center line, and the first and third quartiles are the
edges of the red area, which is known as the inter-quartile range (IQR). The extreme values (within 1.5 times
the inter-quartile range from the upper or lower quartile) are the ends of the lines extending from the IQR.
Points lying more than 1.5 times the IQR beyond the upper or lower quartile are plotted individually as asterisks.
These points represent potential outliers.
In this example, the three boxplots have nearly identical median values. The IQR is decreasing from one
time period to the next, indicating reduced variability of payoffs in the second and third periods. In addition,
the extreme values are closer to the median in the later time periods.
Bar Chart
A bar chart or bar graph is a chart with rectangular bars whose lengths are proportional to the values that they
represent. The bars can be plotted vertically or horizontally. Bar charts are used for plotting discrete (or
discontinuous) data, i.e. data which take separate, distinct values and are not continuous. Some examples of
discontinuous data include shoe size or eye color, for which you would use a bar chart. In contrast, some
examples of continuous data would be height or weight. A bar chart is a convenient way to record and compare
such counts. Bar charts also look a lot like histograms, and the two are often mistaken for each other. In a bar
chart the height of each bar represents the frequency, and the data are discrete (unlike histograms, where the
data are continuous). The bars should be separated by small gaps.
Histogram
Histograms are similar to bar charts apart from the consideration of areas. In a bar chart, all of the bars are
the same width and the only thing that matters is the height of the bar. In a histogram, the area is the
important thing. The illustration, below, is a histogram showing the results of a final exam given to a
hypothetical class of students. Each score range is denoted by a bar of a certain color.
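Since it is the area that carries the frequency, the height of each bar in a histogram is the frequency divided by the class width (the frequency density). A minimal sketch in Python follows; the score classes and counts are made up for illustration and are not taken from the illustration referred to above.

```python
def frequency_density(classes):
    """classes: list of (lower, upper, frequency). Returns the bar height for each class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]

# Hypothetical exam-score classes of unequal width: (lower, upper, frequency).
classes = [(0, 40, 8), (40, 60, 12), (60, 70, 10), (70, 100, 6)]
bar_heights = frequency_density(classes)

for (lower, upper, freq), height in zip(classes, bar_heights):
    print(f"{lower:>3}-{upper:<3} frequency={freq:<3} height={height:.2f}")
```

With classes of equal width the heights are proportional to the frequencies, which is why such a histogram looks like a bar chart without gaps.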
Describe the level/location of a set of data using the mean, median, mode, as appropriate.
Contents
1 Numerical Summaries
1.1 Mean
1.2 Median
1.3 Mode
Numerical Summaries
Mean
The sample mean, or average, of a group of values is calculated by taking the sum of all of the values and
dividing by the total number of values. In other words, for n values $x_1, x_2, x_3, \dots, x_n$, the mean is
$\bar{x} = (x_1 + x_2 + x_3 + \dots + x_n)/n$, or
\[ \bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k. \]
Median
A median is described as the numerical value separating the higher half of a sample, a population, or a
probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging
all the observations from lowest value to highest value and picking the middle one. If there is an even
number of observations, then there is no single middle value; the median is then usually deemed to be the
mean of the two middle values.
Mode
The mode is the value that occurs most frequently in a data set or a probability distribution. Like the
statistical mean and the median, the mode is a way of capturing important information about a random
variable or a population in a single quantity. The mode is in general different from the mean and median. The
mode of a data sample is the element that occurs most often in the collection. For example, the mode of the
sample [1,3,6,6,6,6,7,7,12,12,17] is 6. Given the list of data [1,1,2,4,4] the mode is not unique - the data set
may be said to be bimodal, while a set with more than two modes may be described as multimodal.
Example. Comparison of common averages of values {1,2,2,3,4,7,9}
Type              Description                                         Example              Result
Arithmetic Mean   Sum of the values divided by the number of values   (1+2+2+3+4+7+9)/7    4
Median            Middle value of a data set                          1,2,2,3,4,7,9        3
Mode              Most frequent value in a data set                   1,2,2,3,4,7,9        2
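The three averages can be checked with Python's standard `statistics` module. This is a sketch only; `multimode` (available from Python 3.8) is used because it returns every mode, which also covers the bimodal list [1,1,2,4,4] mentioned above.

```python
import statistics

values = [1, 2, 2, 3, 4, 7, 9]

mean = statistics.mean(values)        # sum of the values divided by how many there are
median = statistics.median(values)    # middle value of the sorted data
modes = statistics.multimode(values)  # list of all most frequent values

print(mean, median, modes)
print(statistics.multimode([1, 1, 2, 4, 4]))  # a bimodal example
```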
CT3_(i)_(2)
Describe the spread/variability of a set of data using the standard deviation, range, interquartile range, as
appropriate.
Numerical Summaries
Quartiles
The first quartile of a group of values is the value such that 25% of the values fall at or below this value. The
third quartile of a group of values is the value such that 75% of the values fall at or below this value. The first
quartile may be approximately calculated by placing a group of values in ascending order and determining
the median of the values below the true median, and the third quartile is approximately calculated by
determining the median of the values above the true median. For an odd number of observations, the median
is excluded from the calculation of the first and third quartiles. The distance between the first and third
quartiles is known as the Inter-Quartile Range (IQR). A useful graphical representation of a distribution
including the quartiles is a boxplot.
Example. The quartiles may be approximately calculated as follows: First order the data:
59,60,64,67,68,68,70,71,72,73. Since there are an even number of observations (10), the first half of the data
is considered in calculating the first quartile: 59,60,64,67,68. The median of these values is 64, so this is the
first quartile. The second half of the data is considered in calculating the third quartile: 68,70,71,72,73. The
median of these values is 71 so this is the third quartile. For this example, the Inter-Quartile Range is
71-64=7.
Example. Data set in a table.
i     x[i]   Quartile
1     102    -
2     104    -
3     105    Q1
4     107    -
5     108    -
6     109    Q2 (median)
7     110    -
8     112    -
9     115    Q3
10    116    -
11    118    -
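The approximate method described above (take the median of the values below and above the overall median, excluding the median itself when the number of observations is odd) can be sketched in Python. The function name `quartiles` is our own, and other software may use slightly different quartile conventions.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves approximation."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # values below the median
    upper = data[-half:]    # values above the median (median excluded if n is odd)
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

even_example = [59, 60, 64, 67, 68, 68, 70, 71, 72, 73]
odd_example = [102, 104, 105, 107, 108, 109, 110, 112, 115, 116, 118]

for data in (even_example, odd_example):
    q1, q2, q3 = quartiles(data)
    print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={q3 - q1}")
```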
Variance and Standard Deviation
The variance of a group of values measures the spread of the distribution. A large variance indicates a wide
range of values, while a small variance indicates that the values lie close to their mean. The variance is
calculated by summing the squared distances from each value to the mean of the values, then dividing by one
fewer than the number of observations, i.e.
\[ s^2 = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2. \]
The standard deviation $s$ is the square root of the variance.
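As a sketch of this calculation, the Python code below computes the sample variance and standard deviation directly from the definition and cross-checks them against the standard `statistics` module. For illustration it reuses the ten ordered values from the quartile example above; this choice of data is ours.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared distances from the mean, divided by one fewer than the number of values."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

data = [59, 60, 64, 67, 68, 68, 70, 71, 72, 73]
variance = sample_variance(data)
std_dev = math.sqrt(variance)

print(round(variance, 3), round(std_dev, 3))
print(round(statistics.variance(data), 3), round(statistics.stdev(data), 3))
```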
CT3_(i)_(3)
Example. The following calculation computes the variance for the student height data, where the mean was
previously calculated to be :
The standard deviation is the square root: .
Explain what is meant by symmetry and skewness for the distribution of a set of data.
Skewness
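Skewness is a measure of the asymmetry of a probability distribution. The skewness value is a real number and
can be positive or negative. In some situations skewness can be undefined. Qualitatively, a negative skew
indicates that the tail on the left side of the distribution is longer than the right side, a positive skew indicates
that the right tail is longer, and a symmetric distribution has zero skewness.
Definition
For a sample of $n$ values $x_1, \dots, x_n$ the sample skewness is
\[ g = \frac{m_3}{m_2^{3/2}} = \frac{\frac{1}{n}\sum_{k=1}^{n}(x_k-\bar{x})^3}{\left(\frac{1}{n}\sum_{k=1}^{n}(x_k-\bar{x})^2\right)^{3/2}}, \]
where $\bar{x}$ is the sample mean, $m_3$ is the third central moment, and $m_2$ is the sample variance
(computed here with divisor $n$).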
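To see how the sign of the skewness follows the longer tail, here is a minimal Python sketch of the sample skewness as the third central moment divided by the 3/2 power of the second central moment, both taken with divisor n. The three small samples are made up for illustration.

```python
def sample_skewness(values):
    """Third central moment divided by the 3/2 power of the second (both with divisor n)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]      # hypothetical data, symmetric about 3
right_tail = [1, 2, 3, 4, 10]    # hypothetical data with a long right tail
left_tail = [1, 7, 8, 9, 10]     # hypothetical data with a long left tail

for sample in (symmetric, right_tail, left_tail):
    print(sample, round(sample_skewness(sample), 3))
```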
CT3_(i)_(4)
Explain what is meant by a set function, a sample space for an experiment, and an event.
Contents
1 Probability
2 Multiple Events: Independent and Dependent Events
3 Possibility Spaces
4 Probability Tree
5 Sample Space
Probability
Probability is the likelihood or chance of an event occurring. When all outcomes are equally likely, probability =
the number of ways of achieving success divided by the total number of possible outcomes. For example, the
probability of flipping a fair coin and it being heads is 1/2 because there is 1 way of getting a head and the total
number of possible outcomes is 2 (a head or tail). We write P(heads) = 1/2. The probability of something which
is certain to happen is 1. The probability of something which cannot happen is 0. The probability of something
not happening is 1 minus the probability that it will happen.
Multiple Events: Independent and Dependent Events
Suppose now we consider the probability of 2 events happening. For example, we might throw 2 dice and
consider the probability that both are 6. We call two events independent if the outcome of one of the events
doesn't affect the outcome of another. For example, if we throw two dice, the probability of getting a 6 on the
second die is the same, no matter what we get with the first one- it's still 1/6. On the other hand, suppose we
have a bag containing 2 red and 2 blue balls. If we pick 2 balls out of the bag, the probability that the second
is blue depends upon what the color of the first ball picked was. If the first ball was blue, there will be 1 blue
and 2 red balls in the bag when we pick the second ball. So the probability of getting a blue is 1/3. However,
if the first ball was red, there will be 1 red and 2 blue balls left so the probability the second ball is blue is
2/3. When the probability of one event depends on another, the events are dependent.
Possibility Spaces
Example. Probabilities of events in simple situations. When working out the probability of two
things happening, a possibility space can be drawn. For example, if you throw two dice, what
is the probability that you will get a total of: a) 8, b) 9, c) either 8 or 9?
Die 1 \ Die 2   1  2  3  4  5  6
1               .  .  .  .  .  .
2               .  .  .  .  .  A
3               .  .  .  .  A  B
4               .  .  .  A  B  .
5               .  .  A  B  .  .
6               .  A  B  .  .  .
A indicates the ways of getting 8 (a 2 and a 6, a 3 and a 5...). There are 5 different ways. The probability
space shows us that when throwing 2 dice, there are 36 different possibilities (36 squares). With 5 of these
possibilities, you will get 8. Therefore P(8)=5/36. B indicates the ways of getting 9. There are four ways,
therefore P(9)=4/36=1/9. You will get an 8 or 9 in any of the marked squares. There are 9 altogether, so
P(8 or 9)=9/36=1/4.
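The possibility space can also be enumerated by computer. The Python sketch below lists all 36 equally likely outcomes and counts those with the required total; the function name `prob_total` is our own.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

def prob_total(targets):
    """Probability that the total of the two dice is one of the target values."""
    favourable = [o for o in outcomes if sum(o) in targets]
    return Fraction(len(favourable), len(outcomes))

print(prob_total({8}), prob_total({9}), prob_total({8, 9}))
```

Using `Fraction` keeps the answers exact and in lowest terms.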
Probability Tree
Another way of representing 2 or more events is on a probability tree.
Example. There are 3 balls in a bag: red, yellow and blue. One ball is picked out, and not replaced, and then
another ball is picked out.
The first ball can be red, yellow or blue. The probability is 1/3 for each of these. If a red ball is picked out,
there will be two balls left, a yellow and blue. The probability the second ball will be yellow is 1/2 and the
probability the second ball will be blue is 1/2. The same logic can be applied to the cases of when a yellow
or blue ball is picked out first. In this example, the question states that the ball is not replaced. If it were, the
probability of picking a red ball (etc.) the second time would be the same as the first (i.e. 1/3). An outcome is
the result of an experiment or other situation involving uncertainty. The set of all possible outcomes of a
probability experiment is called a sample space.
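The branches of the probability tree for the three-ball example can be listed with a short Python sketch. Each path multiplies the probability of the first pick (1/3) by the probability of the second pick given the first (1/2). The function name `tree_paths` is our own.

```python
from fractions import Fraction
from itertools import permutations

balls = ["red", "yellow", "blue"]

def tree_paths(items):
    """Map each (first, second) pick without replacement to its probability."""
    first_pick = Fraction(1, len(items))        # any ball is equally likely first
    second_pick = Fraction(1, len(items) - 1)   # one ball fewer is left in the bag
    return {path: first_pick * second_pick for path in permutations(items, 2)}

paths = tree_paths(balls)
for (first, second), p in paths.items():
    print(f"{first} then {second}: {p}")

p_second_blue = sum(p for (first, second), p in paths.items() if second == "blue")
print("P(second ball is blue) =", p_second_blue)
```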
Sample Space
The sample space is an exhaustive list of all the possible outcomes of an experiment. Each possible result of
such a study is represented by one and only one point in the sample space, which is usually denoted by $S$.
Example. Experiment Rolling a die once:
Sample space = {1,2,3,4,5,6}
Experiment Tossing a coin:
Sample space = {Heads,Tails}
Experiment Measuring the height (cm) of a girl on her first day at school:
Sample space = the set of all possible real numbers.
An event is any collection of outcomes of an experiment. Formally, any subset of the sample space is an
event. Any event which consists of a single outcome in the sample space is called an elementary or simple
event. Events which consist of more than one outcome are called compound events.
CT3_(ii)_(1)
Define probability as a set function on a collection of events, stating basic axioms.
Introduction to Sets
In order to discuss the basic concepts of the probabilistic models which we wish to develop, it will be very
convenient to have available some ideas and concepts of the theory of sets. This subject is a very extensive
one, and much has been written about it.
A mathematical set is defined as an unordered collection of distinct elements. That is, elements of a set can
be listed in any order and elements occurring more than once are equivalent to occurring only once. We say
that an element is a member of a set. An element of a set can be anything. It's easiest to begin with only
numbers as elements. For that reason, most of the examples will only include numbers, but this is only a
technique to make the topic less abstract.
For a set $A$ having an element $a$, the following are all used synonymously: $a$ is a member of $A$; $a$ is
contained in $A$; $a$ is included in $A$; $a$ is an element of the set $A$; $A$ contains $a$; $A$ includes $a$.
As stated above, sets are defined by their members. However some sets are given names to ease referencing
them. The set with no members is the empty or null set. The expressions $\{\}$ and $\emptyset$ both specify the
empty set.
A set $A$ is a subset of set $B$ if every member of $A$ is a member of $B$. We use the horseshoe notation $\subseteq$ to
indicate subsets.
Example. The expression $\{1,2\} \subseteq \{1,2,3\}$ says that {1,2} is a subset of {1,2,3}. The empty set is a subset
of every set. Every set is a subset of itself. A proper subset of $A$ is a subset of $A$ that is not identical with
$A$. The expression $\{1,2\} \subset \{1,2,3\}$ says that {1,2} is a proper subset of {1,2,3}.
A power set of a set is the set of all its subsets. A script $\mathcal{P}$ is used for the power set. Note that the empty set
and the set itself are members of the power set.
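The union of two sets $A$ and $B$, written $A \cup B$, is the set that contains all the members of $A$ and all the
members of $B$ (and nothing else). That is,
\[ A \cup B = \{x : x \in A \text{ or } x \in B\}. \]
Example. $\{1,2,3\} \cup \{2,3,4\} = \{1,2,3,4\}$.
The intersection of two sets $A$ and $B$, written $A \cap B$, is the set that contains everything that is a member
of both $A$ and $B$. That is,
\[ A \cap B = \{x : x \in A \text{ and } x \in B\}. \]
Example. $\{1,2,3\} \cap \{2,3,4\} = \{2,3\}$.
Two sets $A$ and $B$ are disjoint if their intersection is empty. That is, $A \cap B = \emptyset$. The relative
complement of $B$ in $A$, denoted $A \setminus B$ (sometimes $A - B$), is the set containing all the members of $A$
that are not members of $B$. That is,
\[ A \setminus B = \{x : x \in A \text{ and } x \notin B\}. \]
Example. $\{1,2,3\} \setminus \{2,3,4\} = \{1\}$.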
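Define a universe $U$, or a set containing all of the elements we wish to consider. Then the absolute
complement of a set $A$ is $U \setminus A$. The absolute complement of $A$ in $U$ is denoted $A^c$ (or $\bar{A}$). The
union and intersection operations are symmetric. That is, if $A$ and $B$ are sets, $A \cup B = B \cup A$, and
$A \cap B = B \cap A$. Furthermore, they are associative. That is, if $A$, $B$, and $C$ are sets,
\[ (A \cup B) \cup C = A \cup (B \cup C), \qquad (A \cap B) \cap C = A \cap (B \cap C). \]
Union distributes over intersection and intersection distributes over union, i.e.
\[ A \cup (B \cap C) = (A \cup B) \cap (A \cup C), \qquad A \cap (B \cup C) = (A \cap B) \cup (A \cap C). \]
Recall two important relations which are known as De Morgan's laws, i.e.
\[ (A \cup B)^c = A^c \cap B^c, \qquad (A \cap B)^c = A^c \cup B^c. \]
A set of sets is usually referred to as a family of sets. For a family of sets $\{A_i\}_{i \in I}$, define the union and
intersection of the family by
\[ \bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \in I\}, \qquad \bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for all } i \in I\}. \]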
To illustrate set theoretic operations we use so-called Venn diagrams or set diagrams that show possible
relations between a finite collection of sets. Venn diagrams were developed around 1880 by John Venn. They
are used to teach elementary set theory, as well as illustrate simple set relationships in probability and
statistics. Recall that there are two common notations for the complementary event, i.e. $A^c$ and $\bar{A}$.
Here A is denoted by 1, B by 3, $A \cap B$ by 2 and $\overline{A \cup B}$ by 4.
Example If A and B are two events in the sample space $S$, then
$A \cup B$ (A union B) = either A or B occurs or both occur.
$A \cap B$ (A intersection B) = both A and B occur.
$A \subseteq B$ (A is a subset of B) = if A occurs, so does B.
$A^c$ or $\bar{A}$ = event A does not occur.
$\emptyset$ (the empty set) = an impossible event.
$S$ (the sample space) = an event that is certain to occur.
Example Experiment: rolling a die once -
Sample space $S$ = {1,2,3,4,5,6}.
Events A = score < 4 = {1,2,3}.
B = score is even = {2,4,6}.
C = score is 7 = $\emptyset$.
D = $A \cup B$ = the score is < 4 or even or both = {1,2,3,4,6}.
E = $A \cap B$ = the score is < 4 and even = {2}.
$A^c$ or $\bar{A}$ = event A does not occur = {4,5,6}.
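Python's built-in `set` type mirrors these operations directly (`|` for union, `&` for intersection, `-` for relative complement), so the die example can be checked with a short sketch. The variable names are our own.

```python
S = set(range(1, 7))                 # sample space for one roll of a die

A = {x for x in S if x < 4}          # score < 4
B = {x for x in S if x % 2 == 0}     # score is even
C = {x for x in S if x == 7}         # score is 7: impossible, so the empty set

D = A | B                            # union: score < 4 or even or both
E = A & B                            # intersection: score < 4 and even
A_complement = S - A                 # A does not occur

print(sorted(D), sorted(E), sorted(A_complement), C == set())

# De Morgan's laws on this example.
print(S - (A | B) == (S - A) & (S - B), S - (A & B) == (S - A) | (S - B))
```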
CT3_(ii)_(2)
Derive basic properties satisfied by the probability of occurrence of an event, and calculate probabilities of
events in simple situations.
Measures and Probability Spaces
Another basic notion is the concept of event. An event A with respect to a particular sample space $S$
associated with an experiment $\varepsilon$ is a set of possible outcomes. In set terminology, an event is a subset of the
sample space $S$. This means that $S$ itself is an event and so is the empty set $\emptyset$ (impossible event). When the
sample space $S$ is finite or countably infinite, every subset may be considered as an event. However, if $S$ is
uncountably infinite, a theoretical difficulty arises. It turns out that not every conceivable subset may be
considered as an event. Some nonadmissible subsets must be excluded.
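Let $\Omega$ be a non-empty set and $\mathcal{F}$ be a collection of subsets of $\Omega$. We call $\mathcal{F}$ a $\sigma$-algebra if the following
hold:
1. $\Omega \in \mathcal{F}$.
2. If $A \in \mathcal{F}$ then $A^c \in \mathcal{F}$.
3. If $A_1, A_2, \dots$ is a sequence of subsets in $\mathcal{F}$ then $\bigcup_{n=1}^{\infty} A_n \in \mathcal{F}$.
The pair $(\Omega, \mathcal{F})$ is called a measurable space. A measure on $(\Omega, \mathcal{F})$ is a mapping (set function)
$\mu : \mathcal{F} \to [0, \infty]$ that satisfies
1. $\mu(\emptyset) = 0$.
2. For every sequence $A_1, A_2, \dots$ of mutually disjoint sets, i.e. $A_i \cap A_j = \emptyset$ for $i \ne j$, in $\mathcal{F}$,
\[ \mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \sum_{n=1}^{\infty} \mu(A_n). \]
The triple $(\Omega, \mathcal{F}, \mu)$ is called a measure space. The measure $\mu$ is finite if its total mass $\mu(\Omega)$ is finite, i.e. $\mu(\Omega) < \infty$.
More generally, a measure is $\sigma$-finite if there is a sequence $A_1, A_2, \dots \in \mathcal{F}$ such that for any $n$
we have $\mu(A_n) < \infty$ and $\bigcup_{n=1}^{\infty} A_n = \Omega$.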
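Definition: Let $\varepsilon$ be an experiment and $S$ be a sample space associated with $\varepsilon$. With each event $A$
we associate a real number $P(A)$ called the probability of A satisfying the following properties
1. $0 \le P(A) \le 1$.
2. $P(S) = 1$.
3. If $A$ and $B$ are mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$.
If $A_1, A_2, \dots, A_n$ are pairwise mutually exclusive events, then by 3 (applied repeatedly),
\[ P(A_1 \cup A_2 \cup \dots \cup A_n) = P(A_1) + P(A_2) + \dots + P(A_n). \]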
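Theorem If $\emptyset$ is the empty set, then $P(\emptyset) = 0$.
Proof. For any event A we may write $A = A \cup \emptyset$. Since A and $\emptyset$ are mutually exclusive, it follows from
property 3 that $P(A) = P(A \cup \emptyset) = P(A) + P(\emptyset)$, and hence $P(\emptyset) = 0$.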
CT3_(ii)_(3)
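Remark The converse of the above Theorem is not true. That is, if P(A)=0, we cannot conclude that
$A = \emptyset$.
Theorem If $\bar{A}$ is a complementary event of A, then
\[ P(A) = 1 - P(\bar{A}). \]
Proof. We may write $S = A \cup \bar{A}$ and using 2 and 3, we obtain $1 = P(A) + P(\bar{A})$.
Theorem If A and B are any two events, then
\[ P(A \cup B) = P(A) + P(B) - P(A \cap B). \]
Observe that
\[ A \cup B = A \cup (B \cap \bar{A}) \]
and
\[ B = (A \cap B) \cup (B \cap \bar{A}). \]
Hence, by property 3,
\[ P(A \cup B) = P(A) + P(B \cap \bar{A}) \]
and
\[ P(B) = P(A \cap B) + P(B \cap \bar{A}). \]
Subtracting the second equation from the first we get $P(A \cup B) - P(B) = P(A) - P(A \cap B)$, which is the proof.
Theorem If $A \subset B$, then $P(A) \le P(B)$.
Since $A \subset B$, we may write $B = A \cup (B \cap \bar{A})$, then
\[ P(B) = P(A) + P(B \cap \bar{A}) \ge P(A), \]
because $A$ and $B \cap \bar{A}$ are mutually exclusive and by 1, $P(B \cap \bar{A}) \ge 0$.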
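These properties can be checked numerically on a small finite sample space. The Python sketch below assumes one roll of a fair die, with all six outcomes equally likely; the events and the helper `P` are our own choices for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space: one roll of a fair die (assumed equally likely)

def P(event):
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    return Fraction(len(event & S), len(S))

A = {1, 2, 3}
B = {2, 4, 6}

print(P(A | B), "=", P(A) + P(B) - P(A & B))   # addition rule
print(P(S - A), "=", 1 - P(A))                 # complement rule
print(P({2}) <= P(B))                          # {2} is a subset of B
```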
Derive the addition rule for the probability of the union of two events, and use the rule to calculate
probabilities.
Multiplication Principle
Suppose that a procedure designated by 1 can be performed in $n_1$ ways. Let us assume that a second
procedure, designated by 2, can be performed in $n_2$ ways. Suppose that each way of doing 1 may be followed
by any way of doing 2. Then by the multiplication principle the procedure consisting of 1 followed by 2
may be performed in $n_1 n_2$ ways.
This principle may be extended to any number of procedures. If there are k procedures and the ith procedure
may be performed in $n_i$ ways, $i = 1, 2, \dots, k$, then the procedure consisting of 1 followed by 2,...,k may be
performed in $n_1 n_2 \cdots n_k$ ways.
Example. A manufactured item must pass through three control stations. At each station the item is
inspected for a particular characteristic and marked accordingly. At the first station, three ratings are
possible while at the last two stations four ratings are possible. Hence there are $3 \cdot 4 \cdot 4 = 48$ ways in
which the item may be marked.
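The count for the control-station example can be confirmed by listing every possible marking. This Python sketch labels the ratings at each station 0, 1, 2, ... purely for illustration.

```python
from itertools import product
from math import prod

ratings_per_station = [3, 4, 4]   # number of possible ratings at stations 1, 2 and 3

# Every marking is one rating per station: (rating at 1, rating at 2, rating at 3).
markings = list(product(*(range(n) for n in ratings_per_station)))

print(len(markings), prod(ratings_per_station))
```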
Addition Principle
Suppose that a procedure designated by 1 can be performed in $n_1$ ways. Assume that a second procedure,
designated by 2, can be performed in $n_2$ ways. Suppose furthermore that it is not possible that both 1 and 2
are performed together. Then the number of ways in which we can perform 1 or 2 is $n_1 + n_2$.
This principle, too, may be generalised as follows. If there are k procedures and the ith procedure may be
performed in $n_i$ ways, $i = 1, 2, \dots, k$, then the number of ways in which we may perform procedure 1 or
procedure 2 or ... or procedure k is given by $n_1 + n_2 + \dots + n_k$, assuming that no two procedures may be
performed together.
Example. Suppose that we are planning a trip and are deciding between bus or train transportation. If there
are three bus routes and two train routes, then there are 3+2=5 different routes available for this trip.
Example. Suppose that we have n different objects. In how many ways, say ${}_nP_n$, may these objects be
permuted (arranged)? If we have objects a,b,c, we can consider the following arrangements:
abc, acb, bac, bca, cab, cba.
Thus the answer is 6. In general, arranging n objects is equivalent to putting them into a box with n
compartments, in some specified order.
The first slot may be filled in any one of n ways, the second slot in any one of $(n-1)$ ways,..., and the last
slot in exactly one way. Hence applying the multiplication principle, we see that the box may be filled in
$n(n-1)(n-2)\cdots 1$ ways.
Definition If $n$ is a positive integer, we define $n! = n(n-1)(n-2)\cdots 1$ and call it $n$-factorial. We define $0! = 1$.
Thus the number of permutations of n different objects is ${}_nP_n = n!$.
Example. Consider n different objects. This time we wish to choose r of these objects, $0 \le r \le n$, and
permute the chosen r. Denote the number of ways of doing this by ${}_nP_r$. We again resort to the above
scheme of filling a box having n compartments. This time we simply stop after the rth compartment has been
filled. Thus the first compartment may be filled in n ways, the second in (n-1) ways,..., and the rth
compartment in $n-(r-1)$ ways. Thus, by the multiplication principle we get
\[ {}_nP_r = n(n-1)(n-2)\cdots(n-r+1) \]
ways, or
\[ {}_nP_r = \frac{n!}{(n-r)!}. \]
Example. Consider n different objects. We are concerned with counting the number of ways we may choose
r ($0 \le r \le n$) out of these n objects without regard to order. For example, we have objects a, b, c, d and r=2;
we wish to count ab, ac, ad, bc, bd, cd. We do not count ab and ba since the same objects are involved and
only the order differs. To get the general result we recall the formula for ${}_nP_r$. Let $C$ be the number of
ways of choosing r out of n, disregarding the order. Note that once the r items have been chosen, there are r!
ways of permuting them. Hence, applying the multiplication principle we obtain $C \, r! = \frac{n!}{(n-r)!}$. Thus the
number of ways of choosing r out of n different objects, disregarding order, is given by
\[ C = \binom{n}{r} = \frac{n!}{r!\,(n-r)!}. \]
The numbers $\binom{n}{r}$ often
are called binomial coefficients, for they appear as coefficients in the expansion of the binomial expression
$(a+b)^n$.
The numbers have many interesting properties only three of which we mention here
where .
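The counting formulas for permutations and combinations can be checked against explicit listings. The Python sketch below uses the standard `itertools` and `math` modules (`math.perm` and `math.comb` require Python 3.8 or later); the objects a, b, c, d are those of the examples above.

```python
import math
from itertools import combinations, permutations

# All arrangements of the three objects a, b, c.
arrangements = ["".join(p) for p in permutations("abc")]
print(arrangements, len(arrangements), math.factorial(3))

# Choosing r = 2 out of the n = 4 objects a, b, c, d without regard to order.
pairs = ["".join(c) for c in combinations("abcd", 2)]
print(pairs, len(pairs), math.comb(4, 2))

# Ordered choices of 2 out of 4: n!/(n-r)!, which is r! times the unordered count.
print(math.perm(4, 2), math.comb(4, 2) * math.factorial(2))
```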
CT3_(ii)_(4)
Define the conditional probability of one event given the occurrence of another event, and calculate such
probabilities.
Conditional probability
Definition Let A and B be two events associated with an experiment $\varepsilon$. Define $P(B \mid A)$, the conditional
probability of the event B, given that A has occurred, as
\[ P(B \mid A) = \frac{P(A \cap B)}{P(A)}, \]
provided that $P(A) > 0$.
We start with an example.
Example. A lot of 100 items consists of 20 defective and 80 nondefective items. Suppose that we choose 2
items from this lot, (a) with replacement; (b) without replacement.
Define the following two events.
\[ A = \{\text{the first item is defective}\}, \qquad B = \{\text{the second item is defective}\}. \]
If we are choosing with replacement, $P(A) = P(B) = 20/100 = 1/5$. For each time we choose
from the lot there are 20 defective items among the total of 100. However, if we are choosing without
replacement, the results are not quite immediate. Clearly, $P(A) = 1/5$. In order to compute $P(B)$ we
should know the composition of the lot at the time the second item is chosen. That is, we should know whether A
did or did not occur. If A has occurred, then on the second drawing there are only 99 items left, 19 of which are
defective. In our example, $P(B \mid A) = 19/99$.
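Whenever we compute $P(B \mid A)$ we are essentially computing $P(B)$ with respect to the reduced sample
space A, rather than with respect to the original sample space $S$. When we evaluate $P(B)$ we are asking
ourselves how probable it is that we shall be in $B$, knowing that we must be in $S$, i.e. since $P(S) = 1$ then
$P(B) = P(B \mid S)$. When we compute $P(B \mid A)$ we are asking ourselves how probable it is that
we shall be in B, knowing that we must be in A. That is, the sample space has been reduced to A. It is a
simple matter to verify that $P(B \mid A)$, for a fixed A, satisfies the various postulates of probability.
1. $0 \le P(B \mid A) \le 1$.
2. $P(S \mid A) = 1$.
3. If $B_1 \cap B_2 = \emptyset$ then $P(B_1 \cup B_2 \mid A) = P(B_1 \mid A) + P(B_2 \mid A)$.
4. If A=S, $P(B \mid S) = P(B)$. As $P(S) = 1$ and $B \cap S = B$, it follows from the
definition.
Multiplication Theorem
\[ P(A \cap B) = P(B \mid A)\,P(A) \quad \text{or} \quad P(A \cap B) = P(A \mid B)\,P(B). \]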
CT3_(ii)_(5)
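Example. Consider the lot of 20 defective and 80 nondefective items. If we choose two items at random,
without replacement, what is the probability that both items are defective? As before, we define events A
and B as follows:
\[ A = \{\text{the first item is defective}\}, \qquad B = \{\text{the second item is defective}\}. \]
Hence we require $P(A \cap B)$, which we may compute as $P(A \cap B) = P(B \mid A)\,P(A)$. But as we know
$P(B \mid A) = 19/99$, while $P(A) = 1/5$ and hence $P(A \cap B) = 19/495$.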
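The values in this example can be verified by brute force: list every ordered pair of distinct items and count. The Python sketch below labels the items 0 to 99 with the first 20 taken to be the defective ones; this labelling and the helper names are assumptions of the sketch only.

```python
from fractions import Fraction
from itertools import permutations

# 100 items: "D" marks a defective item, "N" a nondefective one.
items = ["D"] * 20 + ["N"] * 80

# Every ordered pair of distinct items = two draws without replacement (9900 pairs).
draws = list(permutations(range(len(items)), 2))

def P(condition):
    """Probability of a condition on (first, second) when all ordered draws are equally likely."""
    return Fraction(sum(1 for d in draws if condition(d)), len(draws))

def first_defective(d):
    return items[d[0]] == "D"

def second_defective(d):
    return items[d[1]] == "D"

p_a = P(first_defective)
p_b = P(second_defective)
p_ab = P(lambda d: first_defective(d) and second_defective(d))
p_b_given_a = p_ab / p_a   # definition of conditional probability

print(p_a, p_b, p_b_given_a, p_ab)
```

Note that P(B) is still 1/5 without replacement when nothing is known about the first draw; only the conditional probability given A changes.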
Here are some properties of conditional probability.
If then .

If then , since
.

.

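Definition We say that events $B_1, B_2, \dots, B_k$ represent a partition of the sample space $S$ if
1. $B_i \cap B_j = \emptyset$ for any $i \ne j$.
2. $\bigcup_{i=1}^{k} B_i = S$.
3. $P(B_i) > 0$ for any $i$.
In words: When the experiment is performed one and only one of the events $B_i$ occurs.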
Derive Bayes' Theorem for events, and use the result to calculate probabilities.
Bayes Theorem
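Theorem Let $B_1, B_2, \dots, B_k$ be a partition of the sample space $S$ and let $A$ be an event with $P(A) > 0$. Then
\[ P(B_i \mid A) = \frac{P(A \mid B_i)\,P(B_i)}{\sum_{j=1}^{k} P(A \mid B_j)\,P(B_j)}, \qquad i = 1, 2, \dots, k. \]
Clearly,
\[ P(B_i \mid A) = \frac{P(A \cap B_i)}{P(A)} = \frac{P(A \mid B_i)\,P(B_i)}{P(A)}. \]
Let $A$ be any event; then $A = (A \cap B_1) \cup \dots \cup (A \cap B_k)$ and $(A \cap B_i) \cap (A \cap B_j) = \emptyset$ for any $i \ne j$
since $B_i \cap B_j = \emptyset$ for any $i \ne j$. Hence, by the multiplication theorem
\[ P(A) = \sum_{j=1}^{k} P(A \cap B_j) = \sum_{j=1}^{k} P(A \mid B_j)\,P(B_j). \]
The above formula gives us the probability of a particular $B_i$, given the event A has occurred.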
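Example. Suppose that 10% of the undergraduates at a certain university are foreign students, and that 25%
of the graduate students are foreign. If there are four times as many undergraduates as graduate students,
what fraction of foreign students are undergraduates? This question can be phrased as: If a student selected
at random is found to be a foreign student, what is the probability that the student is an undergraduate? Write
U for undergraduate, G for graduate and F for foreign. This information implies:
\[ P(F \mid U) = 0.10, \quad P(F \mid G) = 0.25, \quad P(U) = 0.8, \quad P(G) = 0.2. \]
We want to find P(U|F). From Bayes' theorem, we have
\[ P(U \mid F) = \frac{P(F \mid U)\,P(U)}{P(F \mid U)\,P(U) + P(F \mid G)\,P(G)} = \frac{0.10 \times 0.8}{0.10 \times 0.8 + 0.25 \times 0.2} = \frac{0.08}{0.13} \approx 0.615. \]
These computations are summarized in the following table of joint probabilities of the indicator functions of
U and F:
            F       not F   Total
U           0.08    0.72    0.8
G (not U)   0.05    0.15    0.2
Total       0.13    0.87    1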
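The calculation can be reproduced with exact fractions. The function `bayes` below is a minimal sketch of the theorem for a finite partition; its name and dictionary-based interface are our own.

```python
from fractions import Fraction

def bayes(priors, likelihoods, target):
    """P(target | evidence) for a partition with given P(part) and P(evidence | part)."""
    evidence = sum(priors[part] * likelihoods[part] for part in priors)   # total probability
    return priors[target] * likelihoods[target] / evidence

priors = {"U": Fraction(4, 5), "G": Fraction(1, 5)}               # P(U), P(G)
likelihoods = {"U": Fraction(10, 100), "G": Fraction(25, 100)}    # P(F | U), P(F | G)

p_u_given_f = bayes(priors, likelihoods, "U")
print(p_u_given_f, round(float(p_u_given_f), 3))
```

Although only 10% of undergraduates are foreign, undergraduates are so much more numerous that they still make up most of the foreign students.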
CT3_(ii)_(6)
Define independence for two events, and calculate probabilities in situations involving independence.
The notion of independence is one of the most important in Probability Theory. Intuitively we would like to
call two events or random variables independent if there is no mutual dependency. For example, tossing a
coin and rolling a die are independent events.
Recall that two events $A$ and $B$ are called independent if
\[ P(A \cap B) = P(A)\,P(B). \]
This property is consistent with the intuition of independence.
However, it is not sufficient to define independence only for pairs of events or random variables. It is
possible that all the pairs of events $(A_1, A_2)$, $(A_1, A_3)$ and $(A_2, A_3)$ are independent, but the triple
$(A_1, A_2, A_3)$ is not mutually independent.
Example. Consider a non-traditional die with four faces. We number the first three faces with numbers 1, 2
and 3 respectively, and on the fourth face we put all the three numbers 1, 2 and 3. Now let us throw the die,
and let
\[ A_i = \{\text{the number } i \text{ appears on the face that lands}\}, \qquad i = 1, 2, 3. \]
Simple logic says that $A_3$ depends on $A_1$ and $A_2$, since if $A_1$ and $A_2$ happen simultaneously then $A_3$ happens too
with probability one. But one can easily check that all the pairs $(A_1, A_2)$, $(A_1, A_3)$ and $(A_2, A_3)$ are
independent, but the triple $(A_1, A_2, A_3)$ is not mutually independent. (See question at the end of the page).
Definition. Events $A_1, \dots, A_n$ are called mutually independent if
\[ P(A_{i_1} \cap \dots \cap A_{i_k}) = P(A_{i_1}) \cdots P(A_{i_k}) \]
for any $k = 2, \dots, n$ and any $1 \le i_1 < \dots < i_k \le n$.
Questions
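1. Formalise the above example as a probability space $(\Omega, \mathcal{F}, P)$ with events $A_1$, $A_2$
and $A_3$.
Prove that the pairs $(A_1, A_2)$, $(A_1, A_3)$ and $(A_2, A_3)$ are independent, but the triple $(A_1, A_2, A_3)$ is not
mutually independent according to the above definition.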
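Before writing a formal proof, the claim can be checked by enumeration. The Python sketch below assumes the four faces are equally likely and models each face as the set of numbers written on it; the helper `P` is our own. It is a numerical check, not a substitute for the proof asked for.

```python
from fractions import Fraction
from itertools import combinations

# Each face is the set of numbers written on it; the four faces are assumed equally likely.
faces = [{1}, {2}, {3}, {1, 2, 3}]

def P(*numbers):
    """Probability that every one of the given numbers appears on the face that lands."""
    hits = sum(1 for face in faces if all(k in face for k in numbers))
    return Fraction(hits, len(faces))

pairwise_independent = all(P(i, j) == P(i) * P(j) for i, j in combinations([1, 2, 3], 2))
triple_factorises = P(1, 2, 3) == P(1) * P(2) * P(3)

print(P(1), P(1, 2), P(1, 2, 3))
print(pairwise_independent, triple_factorises)
```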
CT3_(ii)_(7)
Solutions
Statistics (CT3)
Component 1
Data and probability
Coursework questions
1. The numeric data in Table 1 is a record of the numbers of MP3 players sold in a
music store during 27 business days.
30 11 29 34 54 36
49 31 42 45 25 25
15 18 13 25 13
55 55 38 31 43
38 22 37 20 36
Table 1.
Sort the data in Table 1 in a stem-and-leaf plot.
2. Given the set of numbers 0, 0, 2, 3, 5, 6, 7, 8, 8, 9, 10, find the mean, median, mode,
variance, standard deviation and skewness.
3. A set of 6 boxes labelled 1, 2, ..., 6 contains raisins. Boxes 2, 3, 4, 5 and 6 contain
the same number of raisins, but there are twice this number in box 1. If all raisins
are poured into a bag and a raisin is selected at random out of the bag, what is the
probability that
(a) the raisin came from box 3;
(b) the raisin did not come from box 1.
Statistics (CT3)
Component 1
Data and probability
Coursework solutions
1. Choose classes of width 10: 10-19, 20-29, 30-39, 40-49, 50-59.
Stem Leaf
1 1, 3, 3, 5, 8
2 0, 2, 5, 5, 5, 9
3 0, 1, 1, 4, 6, 6, 7, 8, 8
4 2, 3, 5, 9
5 4, 5, 5
Table 2.
2. The mean is
\[ \bar{x} = \frac{1}{n} \sum_{k=1}^{n} x_k = \frac{1}{11}(0 + 0 + 2 + 3 + 5 + 6 + 7 + 8 + 8 + 9 + 10) = \frac{58}{11} = 5.273. \]
The median is 6. The sequence has two modes, 0 and 8, and is called bimodal. The
variance is
\[ \begin{aligned} m_2 = \frac{1}{n-1} \sum_{k=1}^{n} (x_k - \bar{x})^2 = \frac{1}{11-1} \Big[ & (0-5.273)^2 + (0-5.273)^2 + (2-5.273)^2 + (3-5.273)^2 + (5-5.273)^2 \\ & + (6-5.273)^2 + (7-5.273)^2 + (8-5.273)^2 + (8-5.273)^2 + (9-5.273)^2 + (10-5.273)^2 \Big] = 12.618 \end{aligned} \]
and the standard deviation is $S = \sqrt{m_2} = \sqrt{12.618} = 3.552$. The skewness $g$ is
\[ g = \frac{n^{-1} \sum_{k=1}^{n} (x_k - \bar{x})^3}{\left( n^{-1} \sum_{k=1}^{n} (x_k - \bar{x})^2 \right)^{3/2}}. \]
Computing the sums in the numerator and denominator with the unrounded mean $\bar{x} = 58/11$ we get
\[ \begin{aligned} \sum_{k=1}^{n} (x_k - \bar{x})^3 = {} & (0-\bar{x})^3 + (0-\bar{x})^3 + (2-\bar{x})^3 + (3-\bar{x})^3 + (5-\bar{x})^3 + (6-\bar{x})^3 \\ & + (7-\bar{x})^3 + (8-\bar{x})^3 + (8-\bar{x})^3 + (9-\bar{x})^3 + (10-\bar{x})^3 = -136.463, \end{aligned} \]
\[ \begin{aligned} \sum_{k=1}^{n} (x_k - \bar{x})^2 = {} & (0-\bar{x})^2 + (0-\bar{x})^2 + (2-\bar{x})^2 + (3-\bar{x})^2 + (5-\bar{x})^2 + (6-\bar{x})^2 \\ & + (7-\bar{x})^2 + (8-\bar{x})^2 + (8-\bar{x})^2 + (9-\bar{x})^2 + (10-\bar{x})^2 = 126.182, \end{aligned} \]
\[ g = \frac{11^{-1} \times (-136.463)}{\left( 11^{-1} \times 126.182 \right)^{3/2}} = -0.319. \]
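The figures in this solution can be checked with a short Python sketch. It uses the standard `statistics` module for the mean, median, modes, variance and standard deviation, and computes the skewness directly from the formula above (with divisor n in both moments). No rounding is applied until the final printout.

```python
import statistics

data = [0, 0, 2, 3, 5, 6, 7, 8, 8, 9, 10]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
modes = statistics.multimode(data)
variance = statistics.variance(data)   # divisor n - 1
std_dev = statistics.stdev(data)

sum_sq = sum((x - mean) ** 2 for x in data)
sum_cubes = sum((x - mean) ** 3 for x in data)
skewness = (sum_cubes / n) / (sum_sq / n) ** 1.5

print(round(mean, 3), median, modes)
print(round(variance, 3), round(std_dev, 3))
print(round(sum_cubes, 3), round(sum_sq, 3), round(skewness, 3))
```

The negative skewness reflects the two zeros pulling the left tail of the data out further than the right.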
3. (a) Let $x$ be the proportion of raisins in box 3,
\[ x = \frac{\text{number of raisins in box 3}}{\text{total number of raisins}}. \]
Then the proportion of raisins in box 1 is $2x$ and
\[ 1 = 2x + 5x = 7x. \]
Thus the probability that the raisin came from box 3 is 1/7.
(b) Let $y$ be the probability that the raisin came from box 1. Each of the other five boxes holds half as
many raisins as box 1, so
\[ 1 = y + \frac{5y}{2}. \]
Solving for $y$ we get $y = 2/7$. Hence the probability that the raisin did not come
from box 1 is
\[ 1 - \frac{2}{7} = \frac{5}{7}. \]