S Chaturvedi
August 18, 2021
Contents
1 Rudiments of Probability Theory
1.1 Compound events and associated probabilities
1.2 Mutual exclusiveness of events, independence of events
1.3 Conditional probability, Bayes' theorem
9 Boltzmann probabilities
11 Doppler broadening
1 Rudiments of Probability Theory
Probability theory deals with results of experiments performed on a system
assumed to exist in certain well defined ‘states’. For simplicity we will assume
that the states can be labelled by a discrete index i taking values from 1
to k. By an experiment one means performing a specific operation on the
system and recording the outcome. Some typical examples of systems and
experiments performed on them are:
1. System-A coin, States: Heads (H) or Tails (T ), Experiment- Tossing
the coin and recording the outcome H or T .
2. System-A die, States: 1, 2, 3, 4, 5, 6, Experiment- Casting the die and
recording the outcome i, i = 1, · · · , 6.
3. System: A deck of cards, States:1, 2, · · · , 52, Experiment- Pulling out
a card from the deck and recording the outcome i, i = 1, · · · , 52.
4. Two coins, States: HH, HT, T H, T T , Experiment: Tossing the coins
and recording which of the four outcomes is realized.
The use of the term ‘state’ here is inspired by quantum mechanics. In con-
ventional probability theory what we call a state is referred to as a simple
event and the set of all possible states, the state space, as the sample space.
Remark: It is appropriate to note that here we are dealing with ‘classical’
probability theory. Important and fundamental differences arise when we pass
to quantum probabilities: the notions of specification of the state of a quantum
system, the act of measurement, and the description of the state of a composite
quantum system are radically different from their classical counterparts.
If an experiment is performed a number of times, say N times, a particular
simple event i may occur ni times. The number ni may of course depend
on N . To make that explicit we write ni as ni (N ). We may now define the
frequency of occurrence of the event i as
f_i = n_i(N) / N
Clearly
f_i ≥ 0;   Σ_i f_i = 1
Remark: Performing N experiments on the same system at different in-
stants of time may also be viewed as performing a single experiment at the
same instant of time on N copies of the same system. While the process
of computing the frequencies of outcomes from the first point of view may
be regarded as having been obtained by a ‘time averaging’ procedure, those
computed from the second point of view may appropriately be regarded as
‘ensemble averages’.
To remove the dependence on N , we consider the limit N → ∞ and arrive
at the notion of the probability pi of the event i :
p_i = lim_{N→∞} n_i(N) / N
Again, as with fi ’s we have
p_i ≥ 0;   Σ_i p_i = 1

In situations where there is no reason to favour one simple event over another,
all of them are assigned equal probabilities:

p_i = 1/Ω,   i = 1, 2, · · · , Ω

where Ω gives the size of the sample space, the total number of simple events.
The probabilities thus assigned are referred to as ‘a priori’ probabilities.
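These properties of the frequencies, and their convergence toward the a priori probabilities, can be checked by direct simulation. The sketch below (the trial count and seed are illustrative choices, not from the text) rolls a fair die and compares the frequencies f_i with p_i = 1/Ω = 1/6:

```python
import random
from collections import Counter

def estimate_frequencies(n_trials, seed=0):
    """Roll a fair die n_trials times and return the empirical
    frequencies f_i = n_i(N) / N for each face i = 1..6."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(n_trials))
    return {i: counts[i] / n_trials for i in range(1, 7)}

freqs = estimate_frequencies(100_000)

# The frequencies are non-negative and sum to 1 ...
assert all(f >= 0 for f in freqs.values())
assert abs(sum(freqs.values()) - 1.0) < 1e-12
# ... and each approaches the a priori value 1/6 as N grows.
for i in range(1, 7):
    assert abs(freqs[i] - 1 / 6) < 0.01
```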
[Figure: a sample space of simple events, with a compound event shown as a subset of the sample space]
Given two compound events A and B, the probability p(A ∪ B), giving the
probability that the events A or B occur is given by
p(A ∪ B) = p(A) + p(B) − p(A ∩ B)
as should be evident from the figure below:
[Figure: Venn diagram of two overlapping compound events A and B in the sample space]
1.2 Mutual exclusiveness of events, independence of events
Two compound events A and B are said to be mutually exclusive if they
have no simple events in common, i.e. A ∩ B is empty, and in that case we
have
p(A ∩ B) = 0
and as a result
p(A ∪ B) = p(A) + p(B)
(Since simple events are, by definition, mutually exclusive we immediately see
that this is consistent with the way we associate probabilities with compound
events.) Thus, with reference to the figure below,
[Figure: Venn diagram of three events A, B, C in the sample space; A overlaps both B and C, while B and C are disjoint]
we see that while the events B and C are mutually exclusive, the events A
and B and the events A and C are not. As a result only in the first case do
we have the situation that the probability that B or C occur is obtained by
adding up the probability of occurrence of B and that of C.
Two events A and B are said to be independent if
p(A ∩ B) = p(A)p(B)
i.e. if the probability that A and B occur is the product of the respective
probabilities.
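Independence can likewise be checked by counting. The sketch below uses the two-dice sample space of example 4 in the opening list; the particular events are illustrative choices:

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))      # 36 outcomes (i, j)
A = {(i, j) for (i, j) in omega if i % 2 == 0}   # first die even
B = {(i, j) for (i, j) in omega if j >= 5}       # second die at least 5
C = {(i, j) for (i, j) in omega if i + j == 12}  # sum equals 12

def p(event):
    return Fraction(len(event), len(omega))

# A and B refer to different dice, so they are independent:
assert p(A & B) == p(A) * p(B)
# A and C are not independent: C forces the first die to show 6.
assert p(A & C) != p(A) * p(C)
```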
1.3 Conditional probability, Bayes' theorem
Having defined the notion of a joint probability we now introduce the notion
of a conditional probability p(A|B) – the probability that A occurs given that
B has occurred. It is defined as follows
p(A|B) = p(A ∩ B) / p(B)
Note that, in general,

p(A|B) ≠ p(B|A)

Further, if the two events A and B are independent then

p(A|B) = p(A),   p(B|A) = p(B)

justifying the use of the term ‘independent’ in that the occurrence of one
does not affect the chances of occurrence of the other.
Further from the definition of conditional probability and the symmetry
of the joint probability it follows that
p(A|B)p(B) = p(B|A)p(A)
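Both the asymmetry of the conditional probability and the symmetry relation above can be verified by counting on a single die. The events below are illustrative choices, not taken from the text:

```python
from fractions import Fraction

omega = set(range(1, 7))
A = {2, 4, 6}        # "face is even"
B = {1, 2, 3, 4}     # "face is at most 4"

def p(event):
    return Fraction(len(event), len(omega))

def cond(X, Y):
    """Conditional probability p(X|Y) = p(X ∩ Y) / p(Y)."""
    return p(X & Y) / p(Y)

assert cond(A, B) == Fraction(1, 2)   # A ∩ B = {2, 4}
assert cond(B, A) == Fraction(2, 3)   # in general p(A|B) ≠ p(B|A)
# The relation p(A|B) p(B) = p(B|A) p(A) always holds:
assert cond(A, B) * p(B) == cond(B, A) * p(A)
```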
[a] Compute the probability of occurrence of
[1] A.
[2] B.
[3] C.
[4] A or B.
[5] A or C.
[6] B or C.
[7] A and B.
[8] A and C.
[9] B and C.
[10] A given that B has occurred.
[11] A given that C has occurred.
[12] B given that A has occurred.
[13] B given that C has occurred.
[14] C given that A has occurred.
[15] C given that B has occurred.
III p_i = 0 for i ≠ 6, p_6 = 1
IV p_i = 1/8 for i ≠ 3, p_3 = 3/8
Consider now cases I and II. If one were to select one of these dice for purposes
of gambling, one would obviously select the die in II and bet on number
six. The reason for choosing die II is that compared to the die in I one is
less uncertain about the outcome. Indeed, if one wanted to be absolutely
certain about the outcome one would choose the die in case III. Based on
these intuitively obvious considerations we can draw the following comparison
table:
I Most uncertain
II Less uncertain than I
III No uncertainty
IV Same uncertainty as in II
We can see that the uncertainty does not depend on which event has certain
probability but rather on values of all the probabilities.
Consider now an experiment involving rolling two dice. The sample space
now consists of 36 points which can be labelled as (i, j) with i, j = 1, · · · , 6
by associating i and j with the first and the second die respectively. Sup-
pose we choose I for the first die and III for the second. Since the dice are
independent pij = pi (I)pj (III). The uncertainty in the present case is clearly
the uncertainty in I as the uncertainty in III is zero and we can tentatively
stipulate that the uncertainty in this combination (I,III) = uncertainty(I) +
uncertainty(III) as uncertainty(III) =0. Consider now the combination I and
II. In this combination all we can say is that the uncertainty must be at least
as large as the uncertainty in I and the uncertainty in II separately. Clearly
uncertainty(I)+uncertainty(II) satisfies this requirement.
Based on these considerations we may now list the desired properties for
a quantifier of the uncertainty associated with an experiment:
• The uncertainty of an experiment consisting of two independent events
equals the sum of their individual uncertainties.
• The maximum uncertainty of an experiment occurs if all the probabil-
ities are equal. If one event has probability 1 then the uncertainty is
zero.
These requirements lead to the following quantitative measure of the uncertainty
associated with a probability distribution {p_i}:

S = −k_B Σ_i p_i ln p_i

where k_B is a constant. The most mysterious part of this expression is the
logarithmic term. One can easily check that it is needed to satisfy the first
requirement in the list above.
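Taking the measure of uncertainty S = −k_B Σ_i p_i ln p_i with k_B = 1, the comparisons above can be checked numerically. Cases I and II are described on a page not reproduced in this excerpt; the sketch below assumes I is the fair die and II assigns 3/8 to face six and 1/8 to the rest, so that IV is II with faces relabelled, as the table asserts:

```python
from math import log

def S(probs):
    """Uncertainty S = -sum_i p_i ln p_i (k_B = 1 here)."""
    return -sum(p * log(p) for p in probs if p > 0)

# Assumed probability assignments (I and II are not shown in this excerpt):
die_I   = [1/6] * 6                      # fair die
die_II  = [1/8] * 5 + [3/8]              # assumed: biased toward six
die_III = [0, 0, 0, 0, 0, 1]             # certain outcome
die_IV  = [1/8, 1/8, 3/8, 1/8, 1/8, 1/8]

assert S(die_I) > S(die_II) > S(die_III) == 0   # I most uncertain, III none
assert abs(S(die_II) - S(die_IV)) < 1e-12       # II and IV: same uncertainty
# For independent dice the uncertainties add: S(I, III) = S(I) + S(III)
joint = [p * q for p in die_I for q in die_III]
assert abs(S(joint) - (S(die_I) + S(die_III))) < 1e-12
```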
Exercise 2
Verify that the above expression for a quantitative measure of uncertainty in
an experiment does indeed bear out the intuitive conclusions drawn above in
the context of experiments I, II, III, IV, (I,III) and (I,II).
Exercise 3
Show that S attains its maximum value for pi = 1/Ω where Ω is the size of
the sample space and that the maximum value of S is kB ln Ω.
Exercise 4
Use the method of Lagrange multipliers to maximize S subject to the
constraints Σ_i p_i = 1 and Σ_i p_i E_i = U to show that the maximum occurs
when the p_i's are given by

p_i = e^{−βE_i}/Z,   Z = Σ_i e^{−βE_i}   (Boltzmann probabilities)

and that the maximum value of S is
S = kB (ln Z + βU )
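For a small set of energy levels, the Boltzmann probabilities and the relation S = k_B(ln Z + βU) can be verified numerically. The energy levels and the value of β below are arbitrary illustrative choices:

```python
from math import exp, log

kB = 1.0                      # work in units where k_B = 1
beta = 0.5                    # illustrative inverse temperature
E = [0.0, 1.0, 2.0, 3.0]      # illustrative energy levels

Z = sum(exp(-beta * Ei) for Ei in E)            # partition function
p = [exp(-beta * Ei) / Z for Ei in E]           # Boltzmann probabilities
U = sum(pi * Ei for pi, Ei in zip(p, E))        # mean energy
S = -kB * sum(pi * log(pi) for pi in p)         # uncertainty

assert abs(sum(p) - 1.0) < 1e-12
# S = k_B (ln Z + beta U), as stated above
assert abs(S - kB * (log(Z) + beta * U)) < 1e-12
```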
3 Discrete random variables, probability distributions
There are many circumstances in which we can associate numerical values
x1 , x2 , · · · xn to the n points i = 1, 2, · · · , n in the sample space. In such
circumstances one speaks of a random variable X taking values in the set
{x1 , x2 , · · · , xn } and interprets the probability pi as the probability p(X = xi )
that the random variable X has a numerical value xi . With this understand-
ing, for notational convenience, it is customary to abbreviate p(X = xi ) as
p(xi ) and refer to the collection {p(xi ), i = 1, 2, · · · , n} as the probability
distribution of the discrete random variable X. The statements
p_i ≥ 0,   Σ_i p_i = 1

translate into

p(x_i) ≥ 0,   Σ_i p(x_i) = 1
Consider the probability p(n) that we get n heads if the coin is tossed N times. Here n is the random
variable taking values in the set {0, 1 · · · , N }. Elementary considerations
based on what we have already learnt show that
p(n) = (N!/(n!(N−n)!)) p^n q^{N−n}
µ = lim_{N→∞, p→0} (Np)

p(n) = q^n p
4.4 Multinomial distribution
Setting n = n1 , n2 = N − n and p = p1 , q = p2 with n1 + n2 = N and
p1 + p2 = 1, the binomial distribution can be written as
N!
p(n1 , n2 ) = pn1 pn2 .
n1 !n2 ! 1 2
This immediately generalises to a multinomial distribution:
N!
p(n1 , n2 , · · · , nm ) = pn1 1 pn2 2 · · · pnmm
n1 !n2 ! · · · nm !
with n1 + n2 + · · · + nm = N and p1 + p2 + · · · + pm = 1
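The normalization of the multinomial distribution (which follows from the multinomial theorem applied to (p_1 + · · · + p_m)^N = 1) can be checked exactly; the m = 3 probabilities and N below are illustrative values:

```python
from fractions import Fraction
from math import factorial
from itertools import product

def multinomial_pmf(counts, probs):
    """p(n1,...,nm) = N!/(n1! ... nm!) p1^n1 ... pm^nm."""
    N = sum(counts)
    coeff = factorial(N)
    for n in counts:
        coeff //= factorial(n)          # exact integer division
    out = Fraction(coeff)
    for n, p in zip(counts, probs):
        out *= p**n
    return out

probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
N = 4
# Sum over all (n1, n2, n3) with n1 + n2 + n3 = N:
total = sum(multinomial_pmf(c, probs)
            for c in product(range(N + 1), repeat=3) if sum(c) == N)
assert total == 1
```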
[In mathematics literature one often deals with what is called the cumulative
distribution function
Pr(X ≤ x) = ∫_{−∞}^{x} dx′ p(x′)
to accommodate cases where p(x) may not be well defined. The Dirac delta
function distribution is a case in point]
A useful analogue of the generating function in the context of the dis-
crete probability distribution function is the characteristic function F (k) of
a continuous probability distribution function p(x) defined as the Fourier
transform of p(x)
F(k) = ∫_{−∞}^{∞} dx e^{−ikx} p(x)
6 Some important continuous probability distributions
6.1 Uniform distribution
p(x) = 1 for 0 ≤ x ≤ 1,   p(x) = 0 otherwise
6.2 Gaussian distribution

p(x) = (1/√(2πσ²)) e^{−(x−µ)²/(2σ²)}

⟨X⟩ = µ;   ⟨X²⟩ − ⟨X⟩² = σ²
The importance of the Gaussian distribution in all walks of life stems from
the central limit theorem :
Let X1 , X2 , · · · , Xn be independent and identically distributed random vari-
ables with mean µ and variance σ 2 . They need not be Gaussian random
variables. Then in the limit n → ∞ the distribution of the random variable
Z_n = (X_1 + X_2 + · · · + X_n − nµ)/(σ√n) tends to a Gaussian distribution
with zero mean and unit variance.
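The central limit theorem can be illustrated with non-Gaussian inputs; the sketch below uses uniform variables on [0, 1] (for which µ = 1/2 and σ² = 1/12) with illustrative choices of n, sample count, and seed:

```python
import random
from math import sqrt

def clt_sample(n, rng):
    """Z_n = (X_1 + ... + X_n - n mu) / (sigma sqrt(n))
    for X_i uniform on [0, 1], with mu = 1/2, sigma^2 = 1/12."""
    mu, sigma = 0.5, sqrt(1 / 12)
    total = sum(rng.random() for _ in range(n))
    return (total - n * mu) / (sigma * sqrt(n))

rng = random.Random(1)
samples = [clt_sample(50, rng) for _ in range(20_000)]

mean = sum(samples) / len(samples)
var = sum(z * z for z in samples) / len(samples) - mean**2
assert abs(mean) < 0.05        # approaches the Gaussian's zero mean
assert abs(var - 1.0) < 0.05   # ... and unit variance
```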
The Gaussian distribution generalises to multivariate situations as fol-
lows:
p(x_1, · · · , x_n) = √(Det A / π^n) e^{−(x−µ)^T A(x−µ)}
where x and µ denote n-dimensional columns with entries (x1 , · · · , xn ) and
(µ1 , · · · , µn ) respectively and A is an n × n real symmetric matrix with pos-
itive eigenvalues.
6.3 Lorentzian distribution
Besides the ubiquitous Gaussian distribution, another continuous probability
distribution that one comes across in physics is the Lorentzian distribution:
p(x) = (1/π) λ/((x − µ)² + λ²)
p(x) = δ(x − µ)
In (iii) the prime denotes differentiation with respect to x and the x_i's are the
zeros of f(x).
Exercise 5
Compute the integrals
(a) ∫_{−∞}^{∞} dx e^{−αx²}   (b) ∫_{−∞}^{∞} dx x⁴ δ(x² − 4)   (c) ∫_{−∞}^{∞}∫_{−∞}^{∞} dx_1 dx_2 e^{−(x_1² + x_2² + x_1 x_2)}
Exercise 6
Compute ⟨X⟩ and ⟨X²⟩ − ⟨X⟩² for the Poisson and the Gaussian distributions.
Exercise 7
Compute the generating function and the characteristic function for the Pois-
son and the Gaussian distributions respectively and hence deduce the quantities
in Ex. 6.
7 Intuitive notions of temperature, heat, thermal equilibrium
Temperature is a measure of ‘hotness’ or ‘coldness’. When a hot object is
brought in contact with a cold object, there is a net flow of ‘heat’ from the
hot object to the cold object. After some time the flow of heat stops and
we say that the two objects are in thermal equilibrium - they are at the
same temperature. Heat is thus to be thought of as the ‘energy in transit’.
Quantification of the notion of temperature culminating in the advent of the
Kelvin scale has a long and fascinating history as you might have learnt in
courses earlier. The notion of being in thermal equilibrium is transitive: If
A is in thermal equilibrium with B (i.e. there is no net flow of heat when
the two are brought in thermal contact with each other) and B is in thermal
equilibrium with C then A and C are also in thermal equilibrium with each
other. The zeroth law of thermodynamics expresses the transitivity of the
notion of being in thermal equilibrium.
Let us assume that the total energy

E = (p_1² + · · · + p_N²)/(2m)
is taken to specify the macrostates of the system. The number of microstates
with energies lying between E and E + dE will be given by the number of
points lying between two shells of spheres in a 6N dimensional space.
9 Boltzmann probabilities
Consider two macroscopic systems A and B which are in thermal equilibrium
with a reservoir (or a bath) at a temperature T. Let p_A(E_A) denote
the probability of finding the system A in a particular microstate
corresponding to the energy E_A. Let p_B(E_B) denote the corresponding quantity
for the system B.
Consider now the composite system made up of A and B. Let p_{A+B}(E_{A+B})
denote the probability that the composite system is in a microstate
corresponding to the energy E_{A+B}. Assuming that A and B are non-interacting,
so that E_{A+B} = E_A + E_B, we have

p_{A+B}(E_{A+B}) = p_{A+B}(E_A + E_B)

The RHS denotes the probability that the composite system is in a microstate
corresponding to the energy E_A + E_B. In view of the independence of A and
B we have

p_{A+B}(E_A + E_B) = p_A(E_A) p_B(E_B)
Further, using the fact that the derivative of a function f(x + y) with respect
to x is the same as its derivative with respect to y, we get

p′_A(E_A) p_B(E_B) = p_A(E_A) p′_B(E_B)

or

p′_A(E_A)/p_A(E_A) = p′_B(E_B)/p_B(E_B) = −β
where β is a parameter independent of A or B and therefore depends only on
the properties of the reservoir, viz. its temperature.
This relation then gives the structure of the probabilities associated with
a microstate corresponding to a given energy:

p_A(E_A) = C_A e^{−βE_A}

With the identification β = 1/k_B T, as will be done later, one is led to the
probabilities associated with microstates corresponding to energy E.
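The factorization property that drove this derivation can be checked directly for the exponential form. The constants below are arbitrary illustrative values:

```python
from math import exp, isclose

beta, C_A, C_B = 0.7, 0.4, 0.9   # illustrative constants

def p(C, E):
    """Boltzmann form p(E) = C exp(-beta E)."""
    return C * exp(-beta * E)

# The product of the individual probabilities depends on E_A and E_B
# only through the sum E_A + E_B, as the derivation requires:
for EA, EB in [(0.3, 1.1), (1.0, 0.4), (1.4, 0.0)]:
    assert isclose(p(C_A, EA) * p(C_B, EB),
                   C_A * C_B * exp(-beta * (EA + EB)))

# and p'(E)/p(E) = -beta for either system (finite-difference check):
h = 1e-6
deriv = (p(C_A, 1.0 + h) - p(C_A, 1.0 - h)) / (2 * h)
assert isclose(deriv / p(C_A, 1.0), -beta, rel_tol=1e-6)
```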