ENGINEERING MATHEMATICS

1 PROBABILITY AND STATISTICS

The use of probability is widespread in real life, as we routinely predict the chances of different events. Some events seem certain in their outcome, while a few are certain not to happen. Others vary between these two extremes. Roughly speaking, a measure of the extent to which an event happens or fails to happen is given by the term Probability. The words probability and chance are synonymous and may be taken, in this context, to be indistinguishable.
The concepts of permutation and combination are used throughout probability, so a good understanding of them is needed before approaching probability questions.

1. PERMUTATION AND COMBINATION

Permutation and combination are, at bottom, counting principles.

1.1. Counting Principle:

The counting principle is used to calculate the number of distinct ways of doing a certain task. Permutation and combination help to count these ways using different rules.

A) Fundamental principle of multiplication:
If a task T can be divided into sub-tasks S1, S2 and S3, and there are m, n and r ways of doing these sub-tasks independently, then the task can be done in m × n × r ways.
Note: Try to divide the task into sub-tasks and apply this method to calculate the total ways. This is the basic method for counting total ways.

B) Fundamental principle of addition:
If a task can be done in m, n or r mutually exclusive ways, then the task can be done in m + n + r ways.
Note: This method is used when the task is not divided into sub-tasks; rather, there are alternative ways to do the whole task, and so the counts are added.
Example: There are 3 cabs, 4 trains and 5 buses from City A to City B. Thus, the total number of different ways to travel from City A to City B is 3 + 4 + 5 = 12.


1.2. Factorial:
The factorial of a natural number is the product of all the positive integers from 1 up to that number. The factorial of a given integer n is usually written as n!:
n! = n(n − 1)(n − 2) × … × 1
n! = n(n − 1)!
0! = 1
$^nP_r = \frac{n!}{(n-r)!}$
$^nC_r = {}^nC_{n-r} = \frac{n!}{(n-r)!\,r!}$
$^nC_r = \frac{^nP_r}{r!}$
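The formulas above can be checked with Python's standard library (a minimal sketch; math.perm and math.comb require Python 3.8+):

```python
import math

n, r = 10, 4

# n! = n x (n-1)!
assert math.factorial(n) == n * math.factorial(n - 1)

# nPr = n! / (n-r)!
assert math.perm(n, r) == math.factorial(n) // math.factorial(n - r)

# nCr = nC(n-r) = nPr / r!
assert math.comb(n, r) == math.comb(n, n - r)
assert math.comb(n, r) == math.perm(n, r) // math.factorial(r)

print(math.perm(n, r), math.comb(n, r))  # 5040 210
```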
1.3. Permutation:
Permutation simply means an arrangement or order of things. So if a question involves arrangement or order (making different words, seating arrangements or forming numbers), it is a permutation question. The key words for permutation are "seating arrangement", "order matters", "making words", "making numbers".
Some formulas for the common cases:
Case 1) Total distinct ways to arrange n different things:
$^nP_n = n!$
Example: In how many ways can 10 people be arranged in 10 chairs?
Solution: $^{10}P_{10} = 10!$
Case 2) Total distinct ways to arrange n things taken all at a time, where p are alike, q are alike and r are alike:
$\frac{^nP_n}{p!\,q!\,r!} = \frac{n!}{p!\,q!\,r!}$
Case 3) Total distinct ways to arrange n different things taken r at a time:
A) Repetition is not allowed:
$^nP_r = \frac{n!}{(n-r)!}$
B) Repetition is allowed: $n^r$
Case 4) Total distinct ways to arrange n different things taken all at a time:
A) Where r particular things are always together, the total distinct ways are:
(n − r + 1)! r!
B) Where r particular things are never all together:
n! − (n − r + 1)! r!
Case 5) If a restriction or constraint is given, then always start with the restricted position.


Case 6) Circular Permutations: A circular permutation is a special case of arrangement where things are arranged in a circle. Suppose the four numbers 1, 2, 3, 4 are arranged in the form of a circle. The arrangement is read in the anti-clockwise direction, starting from any point, as 1432, 4321, 3214 or 2143. These four linear permutations correspond to one circular permutation. Thus circular permutations differ only when the relative order of the objects changes. Each circular permutation of n objects corresponds to n linear permutations, depending on which of the n positions we start from. Equivalently, fix the position of one of the n objects and arrange the remaining n − 1 in (n − 1)! ways.
Example: 5 persons can be arranged around a circular table in 4! ways.
Note:
• If nothing is stated in number-forming questions, assume that repetition is allowed.
• Repetition is used 1) when forming numbers, or 2) when there is an unlimited supply of the things to be used.
Example: There are three choices of fruit for each of breakfast, lunch and dinner. If Jonny has to select one fruit for each meal, how many different meal plans are possible in a day?
Solution: There are 3 choices for breakfast, 3 for lunch and 3 for dinner, so the total number of ways is 3 × 3 × 3 = 27.
• If in a circular permutation the anti-clockwise and clockwise arrangements are considered the same, then the total number of distinct ways is $\frac{(n-1)!}{2}$.
A brute-force check of these circular-arrangement counts appears below.
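A small brute-force sketch of the circular-arrangement counts, enumerating all linear permutations and canonicalising under rotation (and, optionally, reflection); the helper name circular_count is ours, chosen for illustration:

```python
from itertools import permutations
from math import factorial

def circular_count(n, reflections_equal=False):
    """Count distinct circular arrangements of n distinct objects."""
    seen = set()
    for p in permutations(range(n)):
        # Canonical form under rotation: rotate so object 0 comes first.
        i = p.index(0)
        canon = p[i:] + p[:i]
        if reflections_equal:
            # Reading the circle the other way gives the reversed tuple;
            # rotate it to start at 0 as well, then keep the smaller form.
            rev = tuple(reversed(canon))
            j = rev.index(0)
            rev = rev[j:] + rev[:j]
            canon = min(canon, rev)
        seen.add(canon)
    return len(seen)

n = 5
print(circular_count(n), factorial(n - 1))                  # 24 24
print(circular_count(n, True), factorial(n - 1) // 2)       # 12 12
```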

1.4. Combinations:
Combination is used for selection, grouping or making teams. In a combination, the arrangement or order of things does not matter, so the number of combinations is never more than the number of arrangements. Key words for combination questions are "selection", "order does not matter", "making a team, committee", etc. The relation between combination and permutation is $^nC_r = \frac{^nP_r}{r!}$.
The different cases for combinations are the following:
Case 1) Total ways to select r different things out of n different things: $^nC_r$
Example: How many teams of 11 players can we make from 14 players?
Solution: There are $^{14}C_{11}$ ways to select 11 out of 14 players.
Case 2) Total ways to select r things out of n identical things: only one way.
Example: In how many ways can we select 5 black balls from 15 identical black balls?
Solution: As all the balls are identical, every selection of 5 balls is the same. Thus there is only one way to select 5 identical balls from 15 identical balls.
Case 3) Total ways to select zero or more things out of n different things:
$^nC_0 + {}^nC_1 + {}^nC_2 + \dots + {}^nC_{n-1} + {}^nC_n = 2^n$


Example: In how many ways can we select any number of marbles from 10 different marbles?
Solution: We can select any number of marbles from 0, 1, 2 up to 10, so the total number of ways is the sum of all these:
$^{10}C_0 + {}^{10}C_1 + {}^{10}C_2 + \dots + {}^{10}C_9 + {}^{10}C_{10} = 2^{10}$
Case 4) Total ways to select zero or more things out of n identical things: (n + 1) ways.
Example: In how many ways can we select any number of balls from 15 identical balls?
Solution: As all the balls are identical, there is exactly one distinct way to select each possible number of balls (0, 1, 2, …, 15), so there are 16 ways in total.
Case 5) Distribution: Whenever things are divided into groups of fixed sizes, the total number of ways can be calculated as follows (a brute-force check follows the examples below):
• Dividing (m + n + r) things into three groups of m, n and r things:
$\frac{(m+n+r)!}{m!\,n!\,r!}$
• If some of the groups contain the same number of things, divide by the factorial of the number of such equal-sized groups. If (2m + n) things have to be distributed into three groups of m, m and n things, then the total number of ways is:
$\frac{(2m+n)!}{(m!)^2\,n!\,2!}$
• If each group is given a distinct name, multiply the total by the number of ways to assign the names.
Example: Divide 21 persons into three groups "A", "B" and "C" with 5, 6 and 10 persons in some order.
Solution: The total number of ways to divide 21 persons into unnamed groups of 5, 6 and 10 is $\frac{21!}{5!\,6!\,10!}$. The names can then be assigned in 3! ways, because any group can receive any name. Thus the total number of ways is $\frac{21!}{5!\,6!\,10!} \times 3!$
Example: In how many ways can 15 people be divided into three teams of 5, 7 and 3 people?
Solution: $\frac{15!}{5!\,7!\,3!}$
Example: In how many ways can 20 people be divided into three teams of 7, 7 and 6 people?
Solution: $\frac{20!}{(7!)^2\,6!\,2!}$
Example: In how many ways can 21 people be divided into three teams of 7 people each?
Solution: $\frac{21!}{(7!)^3\,3!}$
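A brute-force sketch of the group-division formula, checked on a small case (dividing 6 people into three unlabelled groups of 2); the helper divisions is hypothetical, written only for this illustration:

```python
from itertools import combinations
from math import factorial

def divisions(people, sizes):
    """All unordered ways to split `people` into groups of the given sizes."""
    if not sizes:
        return {frozenset()}
    out = set()
    size, rest = sizes[0], sizes[1:]
    for grp in combinations(people, size):
        remaining = people - set(grp)
        for sub in divisions(remaining, rest):
            # A division is a set of groups; the outer set removes
            # duplicates that arise when groups have equal sizes.
            out.add(frozenset({frozenset(grp)}) | sub)
    return out

people = set(range(6))
count = len(divisions(people, (2, 2, 2)))
formula = factorial(6) // (factorial(2) ** 3 * factorial(3))
print(count, formula)              # 15 15 (unnamed groups)
print(count * factorial(3))        # 90    (named groups: multiply by 3!)
```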


Case 6) When a restriction or constraint is given, always start with the restriction.
Example: There are 15 players (A, B, C, D, …, M, N, O) in a squad. We have to make a team of 11 players such that:
A) A and B are always in the team.
B) A is in the team but B is not.
Solution: A) As A and B are always in the team, we only have to select 9 more players out of the remaining 13, so the total number of ways is $^{13}C_9$.
Solution: B) As A is in the team, we have to select 10 more players, and since B cannot be in the team there are 13 players left to choose from. Thus the total number of ways to select 10 players from 13 is $^{13}C_{10}$.
Note:
• Identify the type of question (permutation or combination) from the key words.
• Try to divide the task into sub-tasks and then use the fundamental principle of multiplication. If there are alternative ways to do the whole task, use the fundamental principle of addition.

2. PROBABILITY

Probability is a numerical way of describing how likely (or not) an event is to happen.
2.1. Definitions:
2.1.1 Random Experiment: An experiment is any operation whose outcomes are well defined but cannot be predicted in advance with certainty; more loosely, any action which gives some outcome is called an experiment. Throwing a die, picking a card from a pack, tossing a coin, etc. are experiments. For discrete probability, the experiment has a finite set of known outcomes.
2.1.2 Trial: When an experiment is repeated under similar conditions and it does not
give the same result each time but may result in any one of the several possible
outcomes, the experiment is called a trial and the outcomes are called cases. The number
of times the experiment is repeated is called the number of trials.
For example: If tossing a coin is an experiment then tossing the coin 4 times gives us 4
trials.
2.1.3. Sample space: The sample space S for an experiment is the set of all possible
outcomes that might be observed. Each element of the sample space is called a sample
point. n(S) denotes the number of elements of sample space S.
Example:
• In toss of a coin, S = {H, T} where H and T are sample points representing a head and
a tail respectively.
• In throw of a die, S = {1, 2, 3, 4, 5, 6} where the numbers are the sample points
representing the six faces.


2.1.4 Event: An event is a particular or desired set of outcomes of a given experiment. Equivalently, an event is a subset of the sample space S.
Example: Consider the experiment of tossing two coins. The sample space is S = {TT, TH, HT, HH}, where H stands for head and T for tail; n(S) = 4.
An event may be "heads on exactly one of the two coins", E = {HT, TH}. Thus n(E) = 2.
2.2. Definition of Probability:
When an experiment is performed, the chance of an event occurring is known as the probability of that event. If all the elements in the sample space are equally likely, then we can define the probability of an event E as the ratio of the number of outcomes in E to the total number of possible outcomes of the experiment.
The probability of an event is denoted by $P(E) = \frac{n(E)}{n(T)}$
where n(E) is the number of outcomes in event E and n(T) is the total number of possible outcomes of the experiment.
2.3. Basic probability axioms
The three basic probability axioms can be summarised as follows:
1. The probability of an event always lies between 0 and 1:
0 ≤ P(E) ≤ 1
If P(E) = 0 the event is impossible; if P(E) = 1 the event is certain.
2. P(S) = 1
The probability of an event that is certain to occur must be 1. The sample space S contains all possible outcomes, so the probability of S must be 1.
3. For an event A and its complement Ac in sample space S:
P(Ac) = 1 − P(A), since A ∩ Ac = ∅ and A ∪ Ac = S

2.4. Types of Events:
2.4.1. Simple event
Each sample point in the sample space is called an elementary or simple event. For example, the occurrence of a head in the toss of a coin is a simple event.
2.4.2. Sure event
The set containing all sample points is a sure event; for example, in the throw of a die, the occurrence of a natural number less than 7 is a sure event.
2.4.3. Null event
The set which does not contain any sample point.


2.4.4. Mixed/compound event


A subset of sample space S containing more than one element is called a mixed event or
a compound event.
2.4.5. Complement of an event
Let S be the sample space and E be an event; then Ec or Ē represents the complement of event E, the subset containing all sample points in S which are not in E. It refers to the non-occurrence of event E.
2.4.6. Equally Likely events:
If, when an experiment is performed, the probability of each of its events is the same, then the events are called equally likely events.
Example 1: When throwing a die, the probabilities of getting 1, 2, 3, 4, 5 or 6 are all equal, each being 1/6, so all these events are equally likely.
Example 2: When selecting a number from 1 to 10, each number has the same probability as any other: P(getting 1) = P(getting 2) = P(getting 3) = … = P(getting 10) = 1/10. All these events are equally likely.
2.4.7. Independent and Dependent events: Two or more events are said to be independent if the happening of one does not affect the others. On the other hand, if the happening of one event affects (partially or totally) another event, then they are said to be dependent events.
Example 1: Tossing two coins simultaneously;
Event A: a head comes on the first coin.
Event B: a tail comes on the second coin.
Example 2: Two dice and 2 coins are thrown together;
Event A: exactly two heads come.
Event B: the sum of the numbers on the dice is even.
2.4.8. Mutually exclusive events or disjoint events:
If events are such that the occurrence of one ensures the non-occurrence of the others, then the events are called mutually exclusive. Equivalently, if no outcome is common to any two of the events, the events are mutually exclusive.
Example 1: Throw a die
Event A: the number 1 comes.
Event B: the number 2 comes.
Event C: the number 3 comes.
Event D: the number 4 comes.
Event E: the number 5 comes.


Event F: the number 6 comes.


Example 2: Toss a coin
Event A: head comes
Event B: tail comes
2.4.9. Collectively Exhaustive Events:
A set of events is exhaustive if the performance of the experiment results in occurrence
of at least one of them. For example:
• In throw of a die, the event of occurrence of an even number and the event of
occurrence of an odd number are exhaustive.
• In throw of a fair coin, occurrence of a head or a tail is exhaustive.
Exhaustive events cover the whole of the sample space. Their union is equal to S.
2.5. Odds for an Event and Odds against an Event:
If an event A happens in m of the cases and the total number of exhaustive cases is n, then:
The probability of event A: $P(A) = \frac{m}{n}$
and $P(\bar{A}) = 1 - \frac{m}{n} = \frac{n-m}{n}$
Odds in favour of A $= \frac{P(A)}{P(\bar{A})} = \frac{m/n}{(n-m)/n} = \frac{m}{n-m}$
Odds against A $= \frac{P(\bar{A})}{P(A)} = \frac{(n-m)/n}{m/n} = \frac{n-m}{m}$
So odds in favour of A = m : (n − m)
Odds against A = (n − m) : m
2.6. Addition Theorem of Probability:
If there is more than one event for an experiment, the addition principle is used to calculate the probability that at least one of the events happens.
Case I: When events are mutually exclusive:
If A and B are mutually exclusive events, then n(A ∩ B) = 0, so P(A ∩ B) = 0 and
P(A ∪ B) = P(A) + P(B)
Similarly, P(A ∪ B ∪ C) = P(A) + P(B) + P(C)
and P(A1 + A2 + … + An) = P(A1) + P(A2) + … + P(An)
i.e. $P\left(\sum A_i\right) = \sum P(A_i)$


Case II: When events are not mutually exclusive:
If A and B are two events which are not mutually exclusive, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
or P(A + B) = P(A) + P(B) − P(AB)
For any three events A, B, C:
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(B ∩ C) − P(C ∩ A) + P(A ∩ B ∩ C)
or P(A + B + C) = P(A) + P(B) + P(C) − P(AB) − P(BC) − P(CA) + P(ABC)
In general, for n events A1, A2, …, An of sample space S:
$P(A_1 \cup A_2 \cup \dots \cup A_n) = \sum_{i=1}^{n} P(A_i) - \sum_{1 \le i < j \le n} P(A_i \cap A_j) + \sum_{1 \le i < j < k \le n} P(A_i \cap A_j \cap A_k) - \dots + (-1)^{n-1} P(A_1 \cap \dots \cap A_n)$
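A sketch verifying the three-event addition theorem on a single throw of a die; the events A, B, C below are our own choices for illustration:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number
B = {3, 6}      # multiple of 3
C = {5, 6}      # greater than 4
P = lambda E: Fraction(len(E), len(S))

lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(B & C) - P(C & A)
       + P(A & B & C))
print(lhs, rhs)  # 5/6 5/6
```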

2.7. Multiplication Theorem of Probability: The multiplication principle of probability is used to calculate the probability of two or more events occurring together.
Case I: When events are independent:
If A1, A2, …, An are independent events, then
P(A1 · A2 · … · An) = P(A1) P(A2) … P(An)
So if A and B are two independent events, the happening of B has no effect on A:
P(A/B) = P(A) and P(B/A) = P(B), so
P(A ∩ B) = P(A) · P(B), or P(AB) = P(A) · P(B)
Case II: If the events are mutually exclusive, then
P(A ∩ B) = P(A and B) = 0

2.8. Types of questions asked in GATE:
The questions asked in GATE can broadly be divided into four categories.
1) Based on cards: There are 52 cards in a pack, of four suits: Spade, Diamond, Club, Heart. Questions may be based on drawing cards.
2) Based on coins, children and True/False: These are questions where the total number of outcomes of a single trial is 2. There are always 2^n total outcomes in the case of n trials (n tosses, n children or n questions).
Type 1) Toss a coin: only two outcomes, head or tail.
Type 2) Select a child: either a boy or a girl.
Type 3) A question which has only two answers, True or False.

3) Based on dice: These are questions where one or more dice are thrown in an experiment. The total number of outcomes is 6^N, where N is the number of dice thrown together or the number of times one die is thrown.
4) General: There may be questions which can be solved using the concepts of permutation and combination. Use P & C methods to calculate the total outcomes as well as the favourable outcomes for an event.
2.9. Geometric Probability: This concept is used in GATE for geometric figures. It is analogous to ordinary probability: here the favourable outcome is the favourable area, and the total outcome is the total area.
$P(G) = \frac{\text{shaded (favourable) area}}{\text{total area}}$
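A Monte Carlo sketch of geometric probability; the favourable region (a unit circle inscribed in a 2 × 2 square) is an assumed example, not one from the text:

```python
import random

def estimate(trials=100_000):
    hits = 0
    for _ in range(trials):
        # Uniform random point in the 2 x 2 square centred at the origin.
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x * x + y * y <= 1:   # point falls inside the favourable circle
            hits += 1
    return hits / trials

# P = favourable area / total area = pi * 1^2 / 4 ~ 0.7854
print(estimate())
```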
2.10. Conditional Probability:
If A and B are dependent events, then the probability of B given that A has happened is called the conditional probability of B with respect to A, denoted P(B/A).
In this case A serves as a new (reduced) sample space, and the probability is the fraction of A that corresponds to A ∩ B. Thus
$P(B/A) = \frac{P(A \cap B)}{P(A)}$, where P(A) ≠ 0
Similarly, the conditional probability of A given B is
$P(A/B) = \frac{P(A \cap B)}{P(B)}$, where P(B) ≠ 0
From the above two expressions, for two events A and B with P(A) ≠ 0 and P(B) ≠ 0:
P(A ∩ B) = P(A) · P(B/A) = P(B) · P(A/B). (Multiplication theorem)

In case of independent events:
If events A and B are such that
P(A ∩ B) = P(A) P(B),
they are called independent events. Assuming P(A) ≠ 0 and P(B) ≠ 0, in this case
P(A/B) = P(A), P(B/A) = P(B)
This means that the probability of A does not depend on the occurrence or non-occurrence of B, and conversely; this justifies the term independent.
Similarly, m events A1, …, Am are called independent if
P(A1 ∩ … ∩ Am) = P(A1) … P(Am)
and, for every choice of k different events Aj1, Aj2, …, Ajk,
$P(A_{j_1} \cap A_{j_2} \cap \dots \cap A_{j_k}) = P(A_{j_1})\,P(A_{j_2}) \dots P(A_{j_k})$
where k = 2, 3, …, m − 1.
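A small sketch of conditional probability on two dice; the events are assumed for illustration (A = "sum is 8", B = "first die is even"):

```python
from fractions import Fraction
from itertools import product

S = set(product(range(1, 7), repeat=2))   # all 36 outcomes of two dice
A = {s for s in S if sum(s) == 8}
B = {s for s in S if s[0] % 2 == 0}
P = lambda E: Fraction(len(E), len(S))

# P(A|B) = P(A n B) / P(B): only (2,6), (4,4), (6,2) lie in A n B.
print(P(A & B) / P(B))  # 1/6
```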


2.11. Total Probability:
Consider a sample space S, and let Ai, i = 1 to n, be a set of n mutually exclusive and exhaustive events of S.
Thus $A_i \cap A_j = \varnothing$ for 1 ≤ i < j ≤ n, and $\sum_{i=1}^{n} P(A_i) = 1$ since $\bigcup_{i=1}^{n} A_i = S$.
[Figure: the sample space S partitioned into A1, A2, …, An, with an event A overlapping each Ai in Ai ∩ A.]
Let A be any event of S. Then the total probability of the event A is given by
$P(A) = \sum_{i=1}^{n} P(A_i)\,P(A/A_i)$
where P(A/Ai) gives the contribution of Ai to the occurrence of A.
This result is obtained from A = (A1 ∩ A) ∪ (A2 ∩ A) ∪ … ∪ (An ∩ A), so
P(A) = P(A1 ∩ A) + P(A2 ∩ A) + … + P(An ∩ A)
$= P(A_1)P(A/A_1) + P(A_2)P(A/A_2) + \dots + P(A_n)P(A/A_n) = \sum_{i=1}^{n} P(A_i)\,P(A/A_i)$

2.12. Bayes Theorem:
Bayes' theorem gives the probability of the cause of an event when the outcome of the experiment is known.
Consider a sample space S, and let Ai, i = 1 to n, be a set of n mutually exclusive and exhaustive events of S.
Thus $A_i \cap A_j = \varnothing$ for 1 ≤ i < j ≤ n, and $\sum_{i=1}^{n} P(A_i) = 1$ since $\bigcup_{i=1}^{n} A_i = S$.
[Figure: the sample space S partitioned into A1, A2, …, An, with the event B overlapping each Ai in Ai ∩ B.]
Let B be an event of S which has already occurred. Then the conditional probability of occurrence of any one of the events, say Ak, out of the Ai, i = 1, 2, …, n, is
$P(A_k/B) = \frac{P(A_k \cap B)}{P(B)} = \frac{P(A_k)\,P(B/A_k)}{P(B)}$
Now, using the concept of total probability, we get Bayes' theorem as follows:
$P(A_k/B) = \frac{P(A_k)\,P(B/A_k)}{\sum_{i=1}^{n} P(A_i)\,P(B/A_i)}$
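A sketch of total probability and Bayes' theorem together. The scenario and all numbers are hypothetical: machines A1, A2, A3 produce 50%, 30% and 20% of the output with defect rates 1%, 2% and 3%, and B is "a randomly chosen item is defective":

```python
priors = [0.5, 0.3, 0.2]          # P(Ai), hypothetical production shares
likelihoods = [0.01, 0.02, 0.03]  # P(B | Ai), hypothetical defect rates

# Total probability: P(B) = sum over i of P(Ai) * P(B | Ai)
p_b = sum(p * l for p, l in zip(priors, likelihoods))

# Bayes: P(Ak | B) = P(Ak) * P(B | Ak) / P(B)
posteriors = [p * l / p_b for p, l in zip(priors, likelihoods)]

print(p_b)         # 0.017
print(posteriors)  # [5/17, 6/17, 6/17] as floats; they sum to 1
```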


3. STATISTICS

Statistics is the science of collecting, organizing and interpreting numerical facts, which we often call data. Synonyms for data are scores, measurements and observations. The study and collection of data involves classifying data under various heads; representing a characteristic by numbers in this way is termed measurement.
3.1. Measure of Central Tendency:
The most commonly used measures of central tendency are:
3.1.1. The Arithmetic Mean, Average, Expectation (E(x)):
The arithmetic mean of a statistical data set is defined as the quotient of the sum of all the values of the variable by the total number of items. It is denoted by A.M.
Case 1: For an individual series:
$A.M. = \frac{\Sigma x}{n}$
Case 2: For a frequency distribution (mean of grouped data):
$A.M. = \frac{\Sigma fx}{\Sigma f}$

Case 3: Weighted Arithmetic Mean:
If w1, w2, w3, …, wn are the weights assigned to the values x1, x2, x3, …, xn respectively, then the weighted average is defined as:
$\text{Weighted A.M.} = \frac{w_1 x_1 + w_2 x_2 + \dots + w_n x_n}{w_1 + w_2 + \dots + w_n}$
Case 4: If p(x1), p(x2), p(x3), …, p(xn) are the probabilities assigned to the values x1, x2, x3, …, xn of a random variable respectively, then
$E[x] = \sum x_i\,p(x_i)$
Properties:
1. The mean determines only the total of the data, not its spread.
2. The mean lies between the minimum and maximum values of the data set, and equals both only if all values are the same.
3. There must be at least one data point greater than or equal to the mean, and at least one data point less than or equal to the mean.
4. If each data point is increased, decreased, multiplied or divided by k, then the mean is also increased, decreased, multiplied or divided by k.
A worked sketch of these mean formulas follows.
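A minimal sketch of the three mean formulas above on made-up data (all values are hypothetical):

```python
xs = [2, 4, 4, 4, 5, 5, 7, 9]

# Case 1: simple arithmetic mean
am = sum(xs) / len(xs)

# Case 3: weighted arithmetic mean with hypothetical weights
weights = [1, 2, 1, 1, 3, 1, 1, 2]
wam = sum(w * x for w, x in zip(weights, xs)) / sum(weights)

# Case 4: expectation of a random variable with given probabilities
values = [0, 1, 2, 3]
probs = [0.125, 0.375, 0.375, 0.125]
ex = sum(x * p for x, p in zip(values, probs))

print(am, wam, ex)  # 5.0, the weighted mean, 1.5
```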
3.1.2. Median:
Median is defined as the middle most or the central value of the variables in a set of
observations, when the observations are arranged either in ascending or in descending
order of their magnitudes. It divides the arranged series in two equal parts. Median is a
position average, whereas, the arithmetic mean is the calculated average.


Case I: When n is odd:
The median is the $\left(\frac{n+1}{2}\right)$th value, i.e. $M = \left(\frac{n+1}{2}\right)$th term.
Case II: When n is even:
There are two middle terms, the $\frac{n}{2}$th and the $\left(\frac{n}{2}+1\right)$th. The median is the average of these two terms:
$M = \frac{\left(\frac{n}{2}\right)\text{th term} + \left(\frac{n}{2}+1\right)\text{th term}}{2}$
Case III: When the series is continuous:
In this case the data is given in the form of a frequency table with class intervals, and the following formula is used to calculate the median:
$M = L + \frac{\frac{n}{2} - C}{f} \times i$, where
L = lower limit of the class in which the median lies
n = total number of frequencies, i.e. n = Σf
f = frequency of the class in which the median lies
C = cumulative frequency of the class preceding the median class
i = width of the class interval of the class in which the median lies.
Properties:
1. The median is the middle value when the data is sorted (in ascending or descending order).
2. The median lies between the minimum and maximum values of the data set.
3. At least 50% of the data points are greater than or equal to the median, and at least 50% of the data points are less than or equal to the median.
4. If there are more than 3 data points, then increasing the maximum value or decreasing the minimum value will not change the median of the data set.
5. If each data point is increased, decreased, multiplied or divided by k, then the median is also increased, decreased, multiplied or divided by k.
3.1.3 Mode:
Mode is defined as that value in a series which occurs most frequently. In a frequency
distribution mode is that variate which has the maximum frequency.
Continuous Frequency Distribution:
i) Modal Class: The class in a grouped frequency distribution in which the mode lies.
$\text{Mode} = L + \frac{f_m - f_1}{2f_m - f_1 - f_2} \times i$, where
L = the lower limit of the modal class
i = the width of the modal class
f1 = the frequency of the class preceding the modal class
fm = the frequency of the modal class
f2 = the frequency of the class succeeding the modal class.


Sometimes the above formula fails to give the mode; in that case the modal value lies in a class other than the one containing the maximum frequency. In such cases we use the following formula:
$\text{Mode} = L + \frac{f_2}{f_1 + f_2} \times i$, where L, f1, f2, i have their usual meanings.
Properties:
1. If each data point is increased, decreased, multiplied or divided by k, then the mode is also increased, decreased, multiplied or divided by k.
3.1.4. Symmetrical and Asymmetrical Distributions:
A distribution in which the mean, median and mode coincide is called a symmetrical distribution. If the distribution is moderately asymmetrical, then the mean, median and mode are connected by the empirical formula
Mode = 3 Median − 2 Mean
3.2. Measures of Dispersion
Dispersion means scatterness. The degree to which numerical data tend to spread about
an average value is called the dispersion of the data.
3.2.1 Range:
Range = L − S, where L = largest value, S = smallest value.
$\text{Coefficient of Range} = \frac{L - S}{L + S}$

3.2.2 Mean Deviation:
It is the average of the absolute values of the deviations of the observations in a series, taken from the mean or the median.
Methods for Calculation of Mean Deviation:
Case I: For ungrouped data:
$\text{M.D.} = \frac{\Sigma|x - A|}{n} = \frac{\Sigma|d|}{n}$
where d stands for the deviation from the mean or median, |d| is always positive whether d itself is positive or negative, and n is the total number of items.
Case II: For grouped data:
Let x1, x2, x3, …, xn occur with frequencies f1, f2, f3, …, fn respectively, let Σf = n, and let M be either the mean or the median. Then the mean deviation is given by
$\text{M.D.} = \frac{\Sigma f|x - M|}{\Sigma f} = \frac{\Sigma f|d|}{n}$
where d = x − M and Σf = n.
$\text{Coefficient of Mean Deviation} = \frac{\text{Mean Deviation}}{\text{Median}}$
or $= \frac{\text{Mean Deviation}}{\text{Mean}}$ (when the deviations are taken from the mean)
Mean


3.2.3 Standard Deviation and Variance:
The positive square root of the average of the squared deviations of all observations from their mean is called the standard deviation. It is generally denoted by the Greek letter σ (or s). The square of the standard deviation is called the variance and is denoted by σ².
Standard Deviation for Ungrouped Data:
1. Direct method:
In the case of an individual series, the standard deviation is obtained by the formula
$\sigma = \sqrt{\frac{\Sigma(x - \bar{x})^2}{n}} = \sqrt{\frac{\Sigma d^2}{n}}$

where d = x − x̄, x = value of the variable or observation, x̄ = arithmetic mean, and n = total number of observations.
2. Standard deviation for grouped data:
$\sigma = \sqrt{\frac{\Sigma f(x - \bar{x})^2}{n}}$
where x̄ is the A.M., x is the size of the item, and f is the corresponding frequency, in the case of a discrete series.
3. Standard deviation in a continuous series:
Direct method: the standard deviation in the case of a continuous series is obtained by the same formula,
$\sigma = \sqrt{\frac{\Sigma f(x - \bar{x})^2}{n}}$
where x = mid-value of the class, x̄ = A.M., f = frequency, n = total frequency.


4. If p(x1), p(x2), p(x3), …, p(xn) are the probabilities assigned to the values x1, x2, x3, …, xn of a random variable respectively, then
$\text{Var}(x) = (\sigma_x)^2$
$\text{Var}(x) = E[x^2] - (E[x])^2$
$\text{Var}(x) = \sum x_i^2\,p(x_i) - \left(\sum x_i\,p(x_i)\right)^2$

Properties:
1. If each data point is increased or decreased by k, the standard deviation and variance are unchanged.
2. If each data point is multiplied or divided by k, the standard deviation is multiplied or divided by |k| and the variance by k².
A sketch verifying these properties follows.
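A sketch checking the shift and scale properties just listed, using the population versions pstdev and pvariance from the standard library (which divide by n, matching the formulas above); the data values are made up:

```python
from statistics import pstdev, pvariance

xs = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, population variance 4, SD 2
k = 3

# Shift by k: SD and variance unchanged.
assert pstdev([x + k for x in xs]) == pstdev(xs)

# Scale by k: variance scales by k^2 (and SD by |k|).
assert pvariance([k * x for x in xs]) == k**2 * pvariance(xs)

print(pstdev(xs), pvariance(xs))  # 2.0 4.0
```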
3.2.4. Combined Standard Deviation:
Let σ1 and σ2 be the standard deviations of two groups containing n1 and n2 items respectively, and let x̄1 and x̄2 be their respective means. Let x̄ and σ be the mean and standard deviation of the combined group. Then
$\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
$\sigma = \sqrt{\frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$, where $d_1 = \bar{x}_1 - \bar{x}$ and $d_2 = \bar{x}_2 - \bar{x}$.
$\text{Variance} = \frac{\Sigma(x - \bar{x})^2}{n}$, i.e. Variance = σ², and σ = √Variance.
$\text{Variance} = \left[\frac{\Sigma f d'^2}{n} - \left(\frac{\Sigma f d'}{n}\right)^2\right] \times i^2$ (continuous series, where d′ is the step deviation and i the class width)

3.2.5. Symmetric and Skewed Distributions:
In a symmetrical distribution, the mean, median and mode coincide; frequencies are symmetrically distributed on both sides of a central value.
A distribution which is not symmetrical is called skewed. In a moderately skewed distribution,
Mean − Mode = 3 (Mean − Median)
[Figure: three frequency curves showing the relative positions of the mean (x̄), median (Me) and mode (M0) for a symmetrical, a negatively skewed and a positively skewed distribution.]
1. In a positively skewed distribution:
Mean ≥ Median ≥ Mode
2. In a negatively skewed distribution:
Mode ≥ Median ≥ Mean
3. For zero-skew data:
Mean = Median = Mode
3.2.6 Coefficient of Standard Deviation or Coefficient of Variation (C.V.) (Measurement of Consistency):
The measure of variability which is independent of units is called the coefficient of variation (C.V.), defined as
$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$, x̄ ≠ 0
where σ and x̄ are the standard deviation and mean of the data respectively.
The series having the greater C.V. is said to be more variable than the other; the series having the lesser C.V. is said to be more consistent.

3.2.7 Moments:
Moments are particular expected values that summarise features of a distribution.
$E[(x - c)^k]$ is the kth moment (kth-order moment) of X about c, denoted by $\mu_k$.
Moments about the mean are called central moments. A non-central (raw) moment is a moment about zero.
Note:
1. E[X] (= μ) is the first-order moment about zero; it gives the average value.
2. E[X] and E[X²] are called the first and second raw moments respectively.
3. The variance is the second moment about the mean.
4. $\mu_3$ measures skewness: $\mu_3 > 0$ indicates positive skewness, $\mu_3 < 0$ negative skewness, and $\mu_3 = 0$ zero skewness (symmetric).

4. RANDOM VARIABLE AND PROBABILITY DISTRIBUTION

A random variable is a function from the sample space to the real numbers:
X: Ω → R
A random variable associates each sample point with a unique real number, e.g. X(s1) = x1.

4.1. Discrete Random Variable and Distribution:
Discrete Random Variable: a random variable whose outcomes are discrete. Examples: number of tosses, number of heads, sum of two dice, etc.
4.1.1 General probability distribution of a random variable
A listing of the values of a random variable together with their corresponding probabilities is called the probability distribution of the random variable.
Suppose a discrete variable X assumes the values $x_1, x_2, \dots, x_n$ with probabilities $p_1, p_2, \dots, p_n$ respectively, where $p_1 + p_2 + \dots + p_n = 1$ and $0 \le p_i \le 1$ for all $i = 1, 2, \dots, n$. Then the following table describes the probability distribution:

X    | x1 | x2 | x3 | x4 | … | xn
P(X) | p1 | p2 | p3 | p4 | … | pn


Example:
A coin is tossed 3 times, and the random variable X is defined as the number of heads.
Sample space Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
X(HHH) = 3, X(HHT) = 2, X(HTH) = 2, X(HTT) = 1, X(THH) = 2, X(THT) = 1, X(TTH) = 1, X(TTT) = 0
So X takes the values xi ∈ {0, 1, 2, 3}, with the distribution:

xi | Cases           | Number of cases f(xi) | P(xi)
0  | {TTT}           | 1                     | 1/8
1  | {HTT, THT, TTH} | 3                     | 3/8
2  | {HHT, HTH, THH} | 3                     | 3/8
3  | {HHH}           | 1                     | 1/8

Basic properties:
1. $E[g(x)] = \sum g(x_i)\,p(x_i)$. Example: $E[x^2] = \sum x_i^2\,p(x_i)$
2. $\text{Var}(g(x)) = E[g(x)^2] - (E[g(x)])^2 = \sum g(x_i)^2 p(x_i) - \left(\sum g(x_i)\,p(x_i)\right)^2$
3. Suppose x and y are two random variables and a and b are constants; then
1. E(ax ± by) = aE(x) ± bE(y)
2. Var(ax ± b) = a² Var(x)
3. Var(ax ± by) = a² Var(x) + b² Var(y) ± 2ab Cov(x, y), where Cov(x, y) = E(xy) − E(x)E(y)
4. If x and y are independent variables then Cov(x, y) = 0, i.e. E(xy) = E(x)E(y), and Var(ax ± by) = a² Var(x) + b² Var(y)
4.1.2 Bernoulli distribution
A Bernoulli trial is an experiment which has (or can be regarded as having) only two
possible outcomes – s (“success”) and f (“failure”).
Sample space S = {s, f}. The words “success” and “failure” are merely labels – they do
not necessarily carry with them the ordinary meanings of the words.
For example: in life insurance, a success could mean a death!
The random variable X is defined by X(s) = 1, X(f) = 0; X is the number of successes that occur (0 or 1).
Probability measure: P(X = 1) = θ, P(X = 0) = 1 − θ
PMF (probability mass function): $P(x) = \theta^x(1-\theta)^{1-x}$; x = 0, 1 and 0 < θ < 1
Moments:
Mean: μ = E(x) = θ
Variance: Var(x) = σ² = θ − θ² = θ(1 − θ)


4.1.3. Geometric distribution:


Consider again a sequence of independent, identical Bernoulli trials with P({s}) = θ. The
variable of interest now is the number of trials that have to be performed until the first
success occurs. Because trials are performed one after the other and a success is awaited,
this distribution is one of a class of distributions called waiting-time distributions.
Random variable X: number of the trial on which the first success occurs
Distribution: For X = k there must be a run of (k − 1) failures followed by a success, so
PMF (probability mass function): $P(X = k) = \theta(1-\theta)^{k-1}$; k = 1, 2, 3, …; 0 < θ < 1
Mean: $\mu = E(x) = \frac{1}{\theta}$
Variance: $\text{Var}(x) = \sigma^2 = \frac{1-\theta}{\theta^2}$
A simulation sketch of this distribution follows.
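A simulation sketch of the geometric distribution; the value of θ is an arbitrary choice for illustration:

```python
import random

theta = 0.25

def first_success():
    """Number of the Bernoulli trial on which the first success occurs."""
    k = 1
    while random.random() >= theta:  # failure with probability 1 - theta
        k += 1
    return k

samples = [first_success() for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, 1 / theta)               # both near 4
print(var, (1 - theta) / theta**2)   # both near 12
```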

4.1.4 Binomial Distribution:


In n independent trials of a random experiment, let X be the number of times an event A
(success) occurs. In each trial, event A has same probability as P(A) = p referred to as
success. Then in a trial non-occurrence of A is referred as failure and given by q = 1− p.
Here X can assume values from o to n. Now X = r means success occurs in r trials and (n
− r).
Note: this is two parameter-based distribution (𝑛 𝑎𝑛𝑑 𝑝)
Random variable X: number of the successes in n trials
PMF(Probability Mass function): 𝑃(𝑋 = 𝑘) = 𝑛𝐶𝑘 𝑝𝑘 (1 − 𝑝)𝑛−𝑘
Mean: μ =E(x)= = np
Variance: Var(x) = 𝜎 2 = 𝑛𝑝𝑞
Hence Variance ≤ Mean of the Binomial Distribution. i.e. mean of the binomial distribution
is always greater than the variance.
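A sketch computing the binomial PMF from the formula above and checking the mean and variance against np and npq (n and p chosen arbitrarily):

```python
from math import comb

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum(k * k * pk for k, pk in enumerate(pmf)) - mean**2

print(round(mean, 6), n * p)           # 3.0 3.0
print(round(var, 6), n * p * (1 - p))  # 2.1 2.1
```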
4.1.5 Hypergeometric distribution:
Suppose n objects are selected at random, one after another, without replacement, from a finite population consisting of r "successes" and N − r "failures". The trials are not independent, since the result of one trial (the selection of a success or a failure) affects the make-up of the population from which the next selection is made.
Random variable X: number of successes among the n selected objects
PMF (probability mass function): $P(X = k) = \frac{{}^rC_k \times {}^{N-r}C_{n-k}}{{}^NC_n}$
Mean: $\mu = \frac{nr}{N}$

4.1.6 Poisson Distribution:
It is a limiting case of the binomial distribution: if the number of trials n is very large (n → ∞), the probability of success p in each trial is very small, and np = λ is finite, then the binomial distribution tends to the Poisson distribution.
Note: this is a one-parameter distribution (λ).
Random variable X: number of occurrences: 0, 1, 2, 3, …
PMF (probability mass function): $P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!}$, where k = 0, 1, 2, … and λ = np.
Here λ is the parameter of the distribution.
Mean = Variance = λ.
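A sketch of the limiting relationship: for large n and small p with np = λ, the binomial PMF approaches the Poisson PMF (λ and n chosen arbitrarily):

```python
from math import comb, exp, factorial

lam = 2.0
n = 1000
p = lam / n   # keep np = lambda fixed while n grows

for k in range(5):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = exp(-lam) * lam**k / factorial(k)
    print(k, round(binom, 5), round(poisson, 5))  # nearly equal columns
```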

4.2. Continuous Random Variable:
Definition: a random variable whose outcomes are continuous. The range of a continuous random variable is an interval (or a collection of intervals) on the real line.
Examples: duration of an event, a point in a plane, a value between 2 and 5, the height of a person, etc.
4.2.1. General Continuous Probability Distribution:
I. Probability density function
The probability associated with an interval of values (a, b) is represented as P(a < x < b) or P(a ≤ x ≤ b) (these have the same value) and is the area under the curve of the probability density function (PDF) f(x) from a to b. So probabilities are evaluated by integrating the PDF; this relationship defines the PDF.
Thus:
$P(a < x < b) = \int_a^b f(x)\,dx$
The conditions for a function to serve as a PDF are
$f(x) \ge 0$ for $-\infty < x < \infty$, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$

II. Cumulative distribution function
The cumulative distribution function (CDF) is defined as
$F(b) = P(x < b) = \int_{-\infty}^{b} f(x)\,dx$
For a continuous random variable, F(x) is a continuous, non-decreasing function defined for all real values of x.
III. Axioms of probability:
1. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
2. $P(a < x < b) = \int_a^b f(x)\,dx$
$P(x < b) = \int_{-\infty}^{b} f(x)\,dx$ is the cumulative probability at the point b.
$P(a < x) = \int_a^{\infty} f(x)\,dx$
3. The probability at a single point is zero: $P(x = a) = \int_a^a f(x)\,dx = 0$


IV. Expectation or mean of a continuous variable:
Expected values are numerical summaries of important characteristics of the distributions of random variables.
$E[x] = \int_{-\infty}^{\infty} x f(x)\,dx$
The expectation of any function g(x) is defined as
$E[g(x)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx$
Example: $E[x^2 + 3] = \int_{-\infty}^{\infty} (x^2 + 3) f(x)\,dx$
V. Variance and standard deviation
$\text{Var}(x) = E[x^2] - (E[x])^2 = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} x f(x)\,dx\right)^2$
Similarly:
$\text{Var}(g(x)) = E[g(x)^2] - (E[g(x)])^2 = \int_{-\infty}^{\infty} g(x)^2 f(x)\,dx - \left(\int_{-\infty}^{\infty} g(x) f(x)\,dx\right)^2$
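A numerical sketch of these integrals using a simple midpoint rule; the PDF f(x) = 2x on [0, 1] is an assumed example (it integrates to 1, so it is a valid PDF):

```python
def integrate(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return sum(g(a + (i + 0.5) * h) for i in range(steps)) * h

f = lambda x: 2 * x                  # assumed PDF on [0, 1]
print(integrate(f, 0, 1))            # ~1.0: total probability axiom

ex = integrate(lambda x: x * f(x), 0, 1)        # E[x]   = 2/3
ex2 = integrate(lambda x: x * x * f(x), 0, 1)   # E[x^2] = 1/2
print(ex, ex2 - ex**2)               # ~0.6667 and Var(x) ~ 0.0556
```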

4.2.2 Uniform distribution
X takes values between two specified numbers a and b.
Note: this is a two-parameter distribution (a and b).
Probability density function: $f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$
Cumulative probability of an interval: $P(c < x < d) = \frac{d-c}{b-a}$, for $a \le c \le d \le b$
Moments:
Expectation: $E[x] = \frac{a+b}{2}$
Variance: $\text{Var}[x] = \sigma^2 = \frac{(b-a)^2}{12}$

4.2.3 Exponential Distribution:
The exponential distribution is used as a simple model for the lifetimes of certain types of equipment. x takes values from 0 onwards.
Note: this is a one-parameter distribution (λ).
Probability density function: $f(x) = \lambda e^{-\lambda x}$, x > 0
Probability of an interval: $P(c < x < d) = e^{-\lambda c} - e^{-\lambda d}$, for $0 \le c \le d$
Moments:
Expectation: $E[x] = \mu = \frac{1}{\lambda}$
Variance: $\text{Var}[x] = \sigma^2 = \frac{1}{\lambda^2}$
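A simulation sketch checking the interval-probability formula for the exponential distribution, using random.expovariate from the standard library; λ, c and d are arbitrary choices:

```python
import random
from math import exp

lam, c, d = 0.5, 1.0, 3.0
n = 200_000

# Fraction of exponential samples falling in (c, d).
hits = sum(1 for _ in range(n) if c < random.expovariate(lam) < d)

# Both values should be close to e^(-lam*c) - e^(-lam*d) ~ 0.3834.
print(hits / n, exp(-lam * c) - exp(-lam * d))
```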


4.2.4 Normal distribution:
This distribution, with its symmetrical "bell-shaped" density curve, is of fundamental importance in both statistical theory and practice:
(i) it is a good model for the distribution of measurements that occur in practice in a wide variety of situations, for example heights, weights, IQ scores or exam scores;
(ii) it provides good approximations to various other distributions; in particular it is a limiting form of the binomial and Poisson distributions;
(iii) the distribution has 2 parameters, which can conveniently be expressed directly as the mean μ and the standard deviation σ of the distribution. The distribution is symmetrical about μ.
The notation used for the normal distribution is $X \sim N(\mu, \sigma^2)$
Probability density function: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, $-\infty < x < \infty$
Moments:
Expectation: E[x] = μ
Variance: Var[x] = σ²
4.2.5 Standard Normal distribution:
If we standardise by putting $Z = \frac{X - \mu}{\sigma}$ (so that E[Z] = 0 and Var[Z] = 1), we get the standard normal distribution, where
Probability density function: $f(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$, $-\infty < z < \infty$
We write $Z \sim N(0, 1^2)$.
Since Z is symmetric about zero, we can write:
P(Z < −k) = P(Z > k) = 1 − P(Z < k) and P(Z > −k) = P(Z < k) = 1 − P(Z > k)
AREA RESULT for the Normal Distribution:
For the normal curve,
(i) the interval (μ − σ, μ + σ) contains 68.27% of the items;
(ii) the interval (μ − 2σ, μ + 2σ) contains 95.45% of the items;
(iii) the interval (μ − 3σ, μ + 3σ) contains 99.73% of the items.
A quick check of these percentages appears below.
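A quick check of these percentages using the standard normal CDF written in terms of math.erf, since P(μ − kσ < X < μ + kσ) = erf(k/√2):

```python
from math import erf, sqrt

for k in (1, 2, 3):
    # Probability mass within k standard deviations of the mean.
    print(k, round(100 * erf(k / sqrt(2)), 2))  # 68.27, 95.45, 99.73
```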

****
