
Parameter Estimation

Png Wen Han

27/4/2022

Contents
1 Classical Fisher Information
  1.1 Parameter Estimation
  1.2 Multiple measurement
  1.3 Cramer Rao Bound
  1.4 Extension to vector parameter
  1.5 Extension to vectorized Gaussian parameter

2 Quantum Fisher Information
  2.1 Properties of quantum Fisher information
  2.2 Example 1: Unitary families
  2.3 Example 2: Pure state
  2.4 Example 3: Mixed state
  2.5 Example 4: GHZ state
  2.6 Example 5: Separable state

3 Appendix
  3.1 Derivation of scalar CRLB
  3.2 Symmetric logarithmic derivative
  3.3 Derivation of quantum Fisher information
  3.4 Uncertainty relation in angular momentum component

Preliminaries: Revision on Statistics

Classical (continuous variable):

Mean: $\mu_X = E(X) = \int x\, f_X(x|\theta)\,dx$

Variance: $\mathrm{Var}(X) = E\!\left([X - E(X)]^2\right) = \int (x-\mu_X)^2 f_X(x|\theta)\,dx$

Covariance: $\mathrm{Cov}(X,Y) = E\!\left([X - E(X)][Y - E(Y)]\right) = \int\!\!\int (x-\mu_X)(y-\mu_Y)\, f_{X,Y}(x,y|\theta)\,dx\,dy$

Correlation (tells whether X and Y behave in the same way):

$$\mathrm{Cor}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{|\mathrm{Cov}(X,Y)|}$$

1 Classical Fisher Information
1.1 Parameter Estimation

Suppose we have a random variable X with probability density function $f(x|\theta)$, where $\theta \in \Theta$. Since the parameter θ is unknown to us, how do we decide which θ best describes X? The measure of how much information about the unknown parameter θ we can extract from the data is the Fisher information.
The relation between θ and X is implicit in the log-likelihood function

$$l(x|\theta) = \ln f(x|\theta)$$

The first-order derivative of the log-likelihood function suffices to determine whether X provides any information on θ:

$$l'(x|\theta) = \frac{\partial}{\partial\theta}\ln f(x|\theta) = \frac{f'(x|\theta)}{f(x|\theta)}$$
Averaging its square over the probability density gives the Fisher information

$$F(\theta) = \int \left[l'(x|\theta)\right]^2 f(x|\theta)\,dx = E_\theta\left\{\left[l'(X|\theta)\right]^2\right\} \tag{1}$$

Now we can see that if $l'(x|\theta) \sim 0$ then $F(\theta) \sim 0$ and X provides no information on θ. For $[l'(x|\theta)]^2 \gg 0$, we can learn a significant amount of information on θ from the data X. Alternative expressions for the Fisher information (derived in the appendix) are [4]

$$F(\theta) = \mathrm{Var}_\theta\left[l'(X|\theta)\right] \tag{2}$$

or

$$F(\theta) = -E_\theta\left[l''(X|\theta)\right] \tag{3}$$

For a discrete distribution function, the integral is replaced by a sum,

$$\int f(x,\theta)\,dx \to \sum_x f(x,\theta),$$

the mean value $E(\dots)$ becomes

$$E_\theta\{\dots\} = \sum_x (\dots)\, f(x,\theta),$$

and the Fisher information for a discrete probability distribution reads

$$F(\theta) = \sum_x \left(\frac{\partial\ln f(x,\theta)}{\partial\theta}\right)^2 f(x,\theta) = \sum_x \frac{1}{f(x,\theta)}\left(\frac{\partial f(x,\theta)}{\partial\theta}\right)^2 \tag{4}$$
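As a quick numerical check of Eq. 4, the sketch below evaluates the discrete Fisher information for a Bernoulli model $f(1,\theta)=\theta$, $f(0,\theta)=1-\theta$ by finite differences and compares it with the closed form $1/[\theta(1-\theta)]$ (the helper name `fisher_discrete` is illustrative, not from the text):

```python
import numpy as np

def fisher_discrete(p_of_theta, theta, eps=1e-6):
    """Eq. 4: F(theta) = sum_x (1/f) (df/dtheta)^2, df via central differences."""
    f = p_of_theta(theta)
    df = (p_of_theta(theta + eps) - p_of_theta(theta - eps)) / (2 * eps)
    return float(np.sum(df**2 / f))

# Bernoulli: f(x=0) = 1 - theta, f(x=1) = theta
bernoulli = lambda th: np.array([1 - th, th])

theta = 0.3
F_num = fisher_discrete(bernoulli, theta)
F_exact = 1.0 / (theta * (1 - theta))   # closed form: 1/theta + 1/(1-theta)
print(F_num, F_exact)
```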

1.2 Multiple measurement

Extending to n random variables $X = (X_1, X_2, \dots, X_n)$ with estimator $\hat\theta = r(X_1, X_2, \dots, X_n) = r(X)$, the generalized Fisher information reads

$$F_n(\theta) = \int\!\cdots\!\int l_n'(x|\theta)^2\, f(x|\theta)\,dx_1\cdots dx_n$$

or

$$F_n(\theta) = \mathrm{Var}_\theta\left[l_n'(X|\theta)\right]$$

or

$$F_n(\theta) = -E_\theta\left[l_n''(X|\theta)\right] = nF(\theta) \tag{5}$$

1.3 Cramer Rao Bound

Say now we have found a θ which provides a non-zero amount of Fisher information; we wish to further quantify the precision of the estimator, $\mathrm{Var}_\theta[\hat\theta]$. This can be done through the Cramer-Rao bound, which is easily derived from the Cauchy-Schwarz inequality

$$\left(\mathrm{Cov}_\theta\left[\hat\theta, l_n'(x|\theta)\right]\right)^2 \le \mathrm{Var}_\theta[\hat\theta]\;\mathrm{Var}_\theta\left[l_n'(x|\theta)\right]$$

where $\mathrm{Cov}_\theta[\hat\theta, l_n'(x|\theta)] = \frac{\partial}{\partial\theta}E(\hat\theta)$. Hence

$$\left(\frac{\partial}{\partial\theta}E(\hat\theta)\right)^2 \le \mathrm{Var}_\theta[\hat\theta]\; nF(\theta), \qquad \mathrm{Var}_\theta[\hat\theta] \ge \left(\frac{\partial}{\partial\theta}E(\hat\theta)\right)^2 \Big/\, nF(\theta)$$

This is the Cramer-Rao bound, or information inequality. It shows that the larger F(θ) is, the more precisely θ can be estimated.

For an unbiased estimator, $E(\hat\theta) = \theta$ and $\frac{\partial}{\partial\theta}E(\hat\theta) = 1$, so

$$\mathrm{Var}_\theta[\hat\theta] \ge \frac{1}{nF(\theta)} \tag{6}$$

where $1/nF(\theta)$ is the Cramer-Rao lower bound (CRLB).


Example 1:
Suppose a random sample $X_1, \dots, X_n$ is drawn from a normal distribution $N(\mu, \theta)$ with μ given and the variance θ unknown:

$$f(x|\theta) = \frac{1}{\sqrt{2\pi\theta}}\exp\left(-\frac{(x-\mu)^2}{2\theta}\right)$$

Then

$$l(x|\theta) = -\frac{(x-\mu)^2}{2\theta} - \frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\theta$$

$$l'(x|\theta) = \frac{(x-\mu)^2}{2\theta^2} - \frac{1}{2\theta}$$

$$l''(x|\theta) = -\frac{(x-\mu)^2}{\theta^3} + \frac{1}{2\theta^2}$$

Therefore

$$F(\theta) = -E_\theta\left[l''(x|\theta)\right] = \frac{1}{2\theta^2}$$

and the CRLB is $\frac{2\theta^2}{n}$.
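Eq. 2 offers a Monte Carlo check of this example: sample from $N(\mu,\theta)$, evaluate the score $l'(x|\theta)$ for each sample, and compare its variance with $1/(2\theta^2)$. A minimal sketch (sample size and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, theta, n = 0.0, 2.0, 200_000   # theta is the (unknown) variance

# Score of a single sample: l'(x|theta) = (x - mu)^2 / (2 theta^2) - 1/(2 theta)
x = rng.normal(mu, np.sqrt(theta), size=n)
score = (x - mu)**2 / (2 * theta**2) - 1 / (2 * theta)

F_mc = np.var(score)            # Eq. 2: F(theta) = Var_theta[l'(X|theta)]
F_exact = 1 / (2 * theta**2)    # = 0.125 for theta = 2
print(F_mc, F_exact)            # per-sample info; CRLB for n samples: 2 theta^2 / n
```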

1.4 Extension to vector parameter

Consider the case where we have data x and a vector parameter $\theta = (\theta_1, \theta_2, \dots, \theta_N)$. The Cramer-Rao bound is given by

$$\mathrm{Var}(\hat\theta_i) \ge \left[F^{-1}(\theta)\right]_{ii}$$

where F(θ) is the $N\times N$ Fisher information matrix

$$[F(\theta)]_{ij} = -E\left[\frac{\partial^2\ln f(x|\theta)}{\partial\theta_i\,\partial\theta_j}\right] \tag{7}$$

1.5 Extension to vectorized Gaussian parameter

For a single data point x, the PDF in Gaussian form, $x \sim N(\mu, \sigma^2)$, reads

$$P(x|\theta) = \frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

where σ > 0. For multiple data entries $x = (x_1, \dots, x_N)$, the PDF is modified to

$$P(x|\theta) = \prod_{i=1}^{N}\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right) = (2\pi\sigma^2)^{-N/2}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{N}(x_i-\mu)^2\right)$$

In the simplest case the model has two parameters, $\theta = (\mu, \sigma)$.


Extending to the vectorized case $x \sim N(\mu(\theta), C(\theta))$,

$$P(x|\theta) = \frac{1}{(2\pi)^{N/2}\sqrt{\det C(\theta)}}\exp\left(-\frac{1}{2}\left(x - \mu(\theta)\right)^T C^{-1}(\theta)\left(x - \mu(\theta)\right)\right)$$

where C is positive definite.
Example: for $C = \sigma^2 I$, $P(x|\theta)$ reduces to the familiar form

$$P(x|\theta) = \frac{1}{(2\pi\sigma^2)^{N/2}}\exp\left(-\frac{1}{2\sigma^2}\sum_{n=1}^{N}\left(x_n - \mu_n(\theta)\right)^2\right)$$
The Fisher information is given by [1]

$$[F(\theta)]_{ij} = \left(\frac{\partial\mu(\theta)}{\partial\theta_i}\right)^T C^{-1}(\theta)\,\frac{\partial\mu(\theta)}{\partial\theta_j} + \frac{1}{2}\,\mathrm{tr}\left[C^{-1}(\theta)\,\frac{\partial C(\theta)}{\partial\theta_i}\,C^{-1}(\theta)\,\frac{\partial C(\theta)}{\partial\theta_j}\right] \tag{8}$$
where

$$\frac{\partial\mu(\theta)}{\partial\theta_i} = \begin{pmatrix}\frac{\partial[\mu(\theta)]_1}{\partial\theta_i}\\ \frac{\partial[\mu(\theta)]_2}{\partial\theta_i}\\ \vdots\\ \frac{\partial[\mu(\theta)]_N}{\partial\theta_i}\end{pmatrix}, \qquad \frac{\partial C(\theta)}{\partial\theta_i} = \begin{pmatrix}\frac{\partial[C(\theta)]_{11}}{\partial\theta_i} & \frac{\partial[C(\theta)]_{12}}{\partial\theta_i} & \cdots & \frac{\partial[C(\theta)]_{1N}}{\partial\theta_i}\\ \frac{\partial[C(\theta)]_{21}}{\partial\theta_i} & \frac{\partial[C(\theta)]_{22}}{\partial\theta_i} & \cdots & \frac{\partial[C(\theta)]_{2N}}{\partial\theta_i}\\ \vdots & \vdots & \ddots & \vdots\\ \frac{\partial[C(\theta)]_{N1}}{\partial\theta_i} & \frac{\partial[C(\theta)]_{N2}}{\partial\theta_i} & \cdots & \frac{\partial[C(\theta)]_{NN}}{\partial\theta_i}\end{pmatrix}$$

Then the Cramer-Rao bound reads

$$\Delta\theta_i \ge \Delta\theta_{\mathrm{CRB},i} = \sqrt{\left[F(\theta)^{-1}\right]_{ii}}$$


For N independent systems with the same but independent multiparameter measurements, the Fisher information matrix has entries $[F(\theta)]_{n,m;i,j}$, where $\{n,m\} \le N$ index the independent systems and $\{i,j\} \le M$ index the parameters/estimators; the matrix has dimension $(MN)\times(MN)$.
The block relevant to the covariances $(\Delta\theta_{1,n}, \dots, \Delta\theta_{M,n})$ of system n reads

$$\begin{pmatrix}[F(\theta)]_{n,n} & [F(\theta)]_{n,n+N} & \cdots & [F(\theta)]_{n,n+(M-1)N}\\ [F(\theta)]_{n+N,n} & [F(\theta)]_{n+N,n+N} & & \\ \vdots & & \ddots & \\ [F(\theta)]_{n+(M-1)N,n} & & & [F(\theta)]_{n+(M-1)N,n+(M-1)N}\end{pmatrix}$$
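Eq. 8 can be verified numerically for the simplest model above, N i.i.d. samples $x \sim N(\mu,\sigma^2)$ with $\theta = (\mu,\sigma)$, where the closed form is $\mathrm{diag}(N/\sigma^2,\, 2N/\sigma^2)$. A sketch (the helper name `gaussian_fim` is illustrative):

```python
import numpy as np

def gaussian_fim(dmu, dC, C):
    """Eq. 8: [F]_ij = dmu_i^T C^{-1} dmu_j + 0.5 tr(C^{-1} dC_i C^{-1} dC_j).
    dmu, dC: lists of derivatives of mu(theta), C(theta) for each parameter."""
    Cinv = np.linalg.inv(C)
    p = len(dmu)
    F = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            F[i, j] = dmu[i] @ Cinv @ dmu[j] \
                      + 0.5 * np.trace(Cinv @ dC[i] @ Cinv @ dC[j])
    return F

# N iid samples x ~ N(mu, sigma^2), parameters theta = (mu, sigma)
N, sigma = 10, 2.0
dmu = [np.ones(N), np.zeros(N)]                    # d mu/d mu, d mu/d sigma
dC  = [np.zeros((N, N)), 2 * sigma * np.eye(N)]    # d C/d mu,  d C/d sigma
F = gaussian_fim(dmu, dC, sigma**2 * np.eye(N))
print(F)   # diag(N/sigma^2, 2N/sigma^2) = diag(2.5, 5.0)
```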

2 Quantum Fisher Information

A side-by-side comparison of the classical (continuous-variable) and quantum quantities:

Mean: $\mu_X = E(X) = \int x f_X(x|\theta)\,dx$ (classical); $\langle X\rangle = \mathrm{Tr}(\hat\rho_\theta X)$ (quantum)

Variance: $\mathrm{Var}(X) = E([X-E(X)]^2) = \int (x-\mu_X)^2 f_X(x|\theta)\,dx$ (classical); $(\Delta X)^2 = \langle X^2\rangle - \langle X\rangle^2 = \mathrm{Tr}(X^2\hat\rho_\theta) - \mathrm{Tr}(\hat\rho_\theta X)^2$ (quantum)

Covariance: $\mathrm{Cov}(X,Y) = E([X-E(X)][Y-E(Y)]) = \int (x-\mu_X)(y-\mu_Y) f_{X,Y}(x,y|\theta)\,dx\,dy$ (classical); $\langle XY\rangle - \langle X\rangle\langle Y\rangle = \mathrm{Tr}(XY\hat\rho_\theta) - \mathrm{Tr}(\hat\rho_\theta X)\,\mathrm{Tr}(\hat\rho_\theta Y)$ (quantum)

Schwarz inequality: $(\Delta xy)^2 \le \Delta x\,\Delta y$ (classical); $|\mathrm{Tr}[O^\dagger P]|^2 \le \mathrm{Tr}[O^\dagger O]\,\mathrm{Tr}[P^\dagger P]$ (quantum)

From the Cramer-Rao bound, it is apparent that the amount of Fisher information determines the best achievable accuracy of the estimator of θ. An infinite amount of Fisher information would give zero variance, i.e. perfect precision in θ. However, there exists a fundamental limit to the achievable precision. The upper bound of the Fisher information, which can be achieved through generalized measurements in quantum mechanics, is hereinafter named the quantum Fisher information.

Derivation. Assume the data is encoded/measured in the space $x \in \{0,1\}^{\otimes N}$, where N is the number of qubits. Using the Fisher information from Eq. 4,

$$F(\theta) = \sum_{x\in\{0,1\}^N} \frac{1}{f(x,\theta)}\left(\frac{\partial f(x,\theta)}{\partial\theta}\right)^2$$

where the sum runs over the $k = 2^N$ outcomes $x \in \{0,1\}^N = \{0\dots0,\, 0\dots1,\, \dots,\, 1\dots1\}$.


The probability distribution read
f (xi , θ) = µθ ({i}) = T r[ρ̂θ Ei ] (9)
P
where ρ̂θ = k |k⟩⟨k| is the density matrix and Ei is the diagonal element from the spectral decomposition of
observable A with dimension of k × k .
k
X
A= λi Ei
i=1
P Pk Pk
A is semi-positive Ei ≥ 0 and trace-preserving i Ei = 1 in which fulling i=1 f (xi , θ) = T r[ρ̂θ i=1 Ei ] =
T r[ρ̂θ ] = 1.
Now substituting $f(x_i,\theta) = \mathrm{Tr}[\hat\rho_\theta E_i]$ into F(θ) gives

$$F(\theta) = \sum_{i=1}^{k} \frac{1}{\mathrm{Tr}[\hat\rho_\theta E_i]}\left(\frac{\partial\,\mathrm{Tr}[\hat\rho_\theta E_i]}{\partial\theta}\right)^2 = \sum_{i=1}^{k} \frac{\left(\mathrm{Tr}[\partial_\theta\hat\rho_\theta\, E_i]\right)^2}{\mathrm{Tr}[\hat\rho_\theta E_i]} \tag{10}$$

where $\partial_\theta\hat\rho_\theta = \frac{\partial\hat\rho_\theta}{\partial\theta}$.
Using the symmetric logarithmic derivative (SLD), we introduce a self-adjoint operator $L_\theta$ which satisfies

$$\partial_\theta\hat\rho_\theta = \frac{1}{2}\left(L_\theta\hat\rho_\theta + \hat\rho_\theta L_\theta\right)$$

Since the trace is invariant under cyclic permutation, the trace involving the SLD follows

$$\mathrm{Tr}[\partial_\theta\hat\rho_\theta E_i] = \mathrm{Tr}\left[\frac{1}{2}\left(\hat\rho_\theta L_\theta E_i + L_\theta\hat\rho_\theta E_i\right)\right] = \frac{1}{2}\mathrm{Tr}\left[\hat\rho_\theta E_i L_\theta + \left(\hat\rho_\theta E_i L_\theta\right)^\dagger\right] = \mathrm{Re}\left(\mathrm{Tr}[\hat\rho_\theta E_i L_\theta]\right)$$

where we have used $\frac{1}{2}(z + z^*) = \mathrm{Re}[z]$. Then we rewrite the Fisher information as

k
X Re (T r[ρ̂θ Ei Lθ ]) 2
F (θ) =
i=1
T r[ρ̂θ Ei ]
k
X |T r[ρ̂θ Ei Lθ ]| 2

i=1
T r[ρ̂θ Ei ]

T r[A† B] 2 ≤ T r[A† A]T r[B † B]



Using schwarz inequality the upper bound of the terms within the summation
can be obtained. Rearrange the terms within the trace under cyclic permutation, we get


k T r[ρ̂1/2 E 1/2 E 1/2 L ρ̂1/2 ] 2
X θ i i θ θ
F (θ) ≤ 1/2 1/2 1/2 1/2
i=1 T r[ρ̂θ Ei Ei ρ̂θ ]
k h i
1/2 1/2 1/2 1/2
X
≤ T r Ei Lθ ρ̂θ ρ̂θ Lθ Ei (11)
i=1

The upper bound of the Fisher information can then be written as

$$F_Q(\theta) = \sum_{i=1}^{k} \mathrm{Tr}\left[E_i^{1/2} L_\theta\hat\rho_\theta L_\theta E_i^{1/2}\right] = \mathrm{Tr}\left[\sum_{i=1}^{k} E_i\, L_\theta\hat\rho_\theta L_\theta\right] = \mathrm{Tr}\left[\hat\rho_\theta L_\theta^2\right]$$

Since

$$\mathrm{Tr}[\partial_\theta\hat\rho_\theta L_\theta] = \frac{1}{2}\mathrm{Tr}\left[L_\theta\hat\rho_\theta L_\theta + \hat\rho_\theta L_\theta^2\right] = \mathrm{Tr}\left[\hat\rho_\theta L_\theta^2\right],$$

the upper bound of the Fisher information can also be written as

$$F(\theta) \le F_Q(\theta) = \mathrm{Tr}\left[\hat\rho_\theta L_\theta^2\right] = \mathrm{Tr}\left[\partial_\theta\hat\rho_\theta L_\theta\right]$$

Remarks on the quantum Fisher information:

1. The upper bound follows from two inequalities:

(a) $[\mathrm{Re}(z)]^2 \le |z|^2$, saturated when z is real for all θ;

(b) $\left|\mathrm{Tr}[A^\dagger B]\right|^2 \le \mathrm{Tr}[A^\dagger A]\,\mathrm{Tr}[B^\dagger B]$, saturated when $A = \lambda B$ with λ a real constant.

2. The quantumness comes from the optimal quantum measurement, associated with the use of POVMs in the Fisher information.

3. The bound singles out an optimal POVM rather than an optimal estimator (also shown for pure states in Sec. 2.3). Note that $L_\theta$ does not represent an optimal observable/estimator (e.g. the right θ); rather, it determines the optimal POVM. Saturation of inequality 1(a) implies that $\mathrm{Tr}[\hat\rho_\theta E_i L_\theta]$ takes on real values, while saturation of the Cauchy-Schwarz inequality in Eq. 11 requires $\hat\rho_\theta^{1/2} E_i^{1/2} = c\,\hat\rho_\theta^{1/2} L_\theta E_i^{1/2}$, which implies the POVM elements are $\{E_i\} = \{|k_i\rangle\langle k_i|\}$ given the SLD is $L_\theta = \sum_i w_i |k_i\rangle\langle k_i|$.

4. A larger quantum Fisher information requires a larger entanglement depth.

After a lengthy derivation (see appendix), the quantum Fisher information reads [3, 2]

$$F[\hat\rho_\theta] \le F_Q[\hat\rho_\theta] = \sum_k \frac{(\partial_\theta q_k)^2}{q_k} + \sum_{k'\ne k} \sigma_{k,k'}\,|\langle k|\partial_\theta k'\rangle|^2 \tag{12}$$

where $\sigma_{k,k'} = \frac{2(q_{k'} - q_k)^2}{q_{k'} + q_k}$.
It can be seen that the first term is purely classical, while the second term contains the truly quantum contribution. For $\partial_\theta|k\rangle = 0$, or equivalently $[\hat\rho_\theta, \partial_\theta\hat\rho_\theta] = 0$, the quantum Fisher information reduces to the classical Fisher information.
The Cramer-Rao bound provided by the quantum Fisher information is the quantum Cramer-Rao (QCR) bound,

$$\Delta\theta_{QCR} = \frac{1}{\sqrt{F_Q[\hat\rho_\theta]}} \tag{13}$$

with $\Delta\theta \ge \Delta\theta_{QCR}$, where $\Delta\theta_{QCR}$ is the lowest attainable bound for a given parameter θ.
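The commuting case can be made concrete with a minimal sketch: for $\rho_\theta = \mathrm{diag}(\theta, 1-\theta)$ the eigenvectors are θ-independent, the second term of Eq. 12 vanishes, and $F_Q$ coincides with the classical (Bernoulli) Fisher information:

```python
import numpy as np

# rho_theta = diag(theta, 1 - theta): eigenvectors do not depend on theta,
# so |<k|d_theta k'>| = 0 and only the classical term of Eq. 12 survives.
theta = 0.3
q  = np.array([theta, 1 - theta])   # eigenvalues q_k
dq = np.array([1.0, -1.0])          # d_theta q_k

F_Q = np.sum(dq**2 / q)             # sum_k (d_theta q_k)^2 / q_k
F_classical = 1 / (theta * (1 - theta))
print(F_Q, F_classical)
```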

2.1 Properties of quantum Fisher information

i) Convexity: mixing quantum states cannot increase the sensitivity of the estimated phase,

$$F\left[p\,\rho_\theta^{(1)} + (1-p)\,\rho_\theta^{(2)}\right] \le p\,F\left[\rho_\theta^{(1)}\right] + (1-p)\,F\left[\rho_\theta^{(2)}\right]$$

or

$$F\left[\sum_k p_k\,\rho_\theta^{(k)}\right] \le \sum_k p_k\,F\left[\rho_\theta^{(k)}\right] \tag{14}$$

ii) Additivity: the QFI of independent quantum systems is additive,

$$F\left[\rho_\theta^{(1)} \otimes \rho_\theta^{(2)}\right] = F\left[\rho_\theta^{(1)}\right] + F\left[\rho_\theta^{(2)}\right] \tag{15}$$

For N independent quantum systems $\rho^{\otimes N}$, the quantum Fisher information reads $F[\rho^{\otimes N}] = \sum_{i=1}^{N} F[\rho_\theta^{(i)}]$. If all systems are identical, the quantum Cramer-Rao bound reads

$$\Delta\theta_{QCR} = \frac{1}{\sqrt{N\,F_Q[\rho_\theta^{(1)}]}} \propto \frac{1}{\sqrt{N}}$$

This represents the standard quantum limit (SQL) or shot-noise limit. A classical analogy would be independent measurements on N identical copies of the probe state, giving the same factor of $1/\sqrt{N}$.
2.2 Example 1: Unitary families

Consider a state $\rho_\theta = U_\theta\rho\,U_\theta^\dagger$, where $U_\theta = \exp(-iG\theta)$ is a unitary perturbation to the initial state $\rho = \sum_k q_k|k\rangle\langle k|$ and G is a Hermitian generator. Then

$$\rho_\theta = \exp(-iG\theta)\,\rho\,\exp(iG\theta) = \sum_k q_k|\psi_k\rangle\langle\psi_k|$$

where $|\psi_k\rangle = U_\theta|k\rangle$. The first derivative of $\rho_\theta$ reads

$$\partial_\theta\rho_\theta = \sum_k q_k|\partial_\theta\psi_k\rangle\langle\psi_k| + \sum_k q_k|\psi_k\rangle\langle\partial_\theta\psi_k| = -i\left(G\sum_k q_k|\psi_k\rangle\langle\psi_k| - \sum_k q_k|\psi_k\rangle\langle\psi_k|\,G\right) = -i[G,\rho_\theta] = -iU_\theta[G,\rho]U_\theta^\dagger$$

It can be seen that $\partial_\theta q_k = 0$, which removes the classical contribution to the Fisher information in Eq. 12.
Substituting into Eq. 18, and noting that everything is now treated in the basis $|\psi_k\rangle$,

$$L_\theta = 2\sum_{k,k'} \frac{\langle\psi_k|\partial_\theta\rho_\theta|\psi_{k'}\rangle}{q_k + q_{k'}}\,|\psi_k\rangle\langle\psi_{k'}|$$

$$L_\theta = -2i\sum_{k,k'} \frac{\langle k|U_\theta^\dagger\, U_\theta[G,\rho]U_\theta^\dagger\, U_\theta|k'\rangle}{q_k + q_{k'}}\,U_\theta|k\rangle\langle k'|U_\theta^\dagger = -2i\,U_\theta\left(\sum_{k,k'} \frac{\langle k|G\rho|k'\rangle - \langle k|\rho G|k'\rangle}{q_k + q_{k'}}\,|k\rangle\langle k'|\right)U_\theta^\dagger$$

$$= 2i\,U_\theta\left(\sum_{k,k'} \langle k|G|k'\rangle\,\frac{q_k - q_{k'}}{q_k + q_{k'}}\,|k\rangle\langle k'|\right)U_\theta^\dagger = U_\theta\, L(G)\, U_\theta^\dagger$$

where $L(G) = 2i\sum_{k,k'} \langle k|G|k'\rangle\,\frac{q_k - q_{k'}}{q_k + q_{k'}}\,|k\rangle\langle k'|$.
The Fisher information follows:

$$F[\hat\rho_\theta] = \mathrm{Tr}[\rho_\theta L_\theta^2] = \mathrm{Tr}[U_\theta\rho U_\theta^\dagger\, U_\theta L(G)^2 U_\theta^\dagger] = \mathrm{Tr}[\rho\, L(G)^2]$$

It can be seen that the quantum Fisher information is independent of the estimator; it depends on the generator, or more precisely on an SLD that depends on G. Due to this dependence on the generator G, the Fisher upper bound is sometimes also written as $F[\hat\rho_\theta] = F[\hat\rho_\theta, G]$, where

$$L(G)^2 = \left(2i\sum_{k,k'} \langle k|G|k'\rangle\,\frac{q_k - q_{k'}}{q_k + q_{k'}}\,|k\rangle\langle k'|\right)\left(2i\sum_{j,j'} \langle j|G|j'\rangle\,\frac{q_j - q_{j'}}{q_j + q_{j'}}\,|j\rangle\langle j'|\right)$$

$$= -4\sum_{k,k',j'} \langle k|G|k'\rangle\langle k'|G|j'\rangle\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)\left(\frac{q_{k'} - q_{j'}}{q_{k'} + q_{j'}}\right)|k\rangle\langle j'|$$

$$\rho\, L(G)^2 = -4\sum_{k,k',j'} q_k\,\langle k|G|k'\rangle\langle k'|G|j'\rangle\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)\left(\frac{q_{k'} - q_{j'}}{q_{k'} + q_{j'}}\right)|k\rangle\langle j'|$$

$$\mathrm{Tr}[\rho\, L(G)^2] = 4\sum_{k,k'} q_k\,|\langle k|G|k'\rangle|^2\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2$$

Then we get the Fisher information

$$F_Q[\hat\rho_\theta, G] = \mathrm{Tr}[\rho\, L(G)^2] = 2\sum_{k,k'} |\langle k|G|k'\rangle|^2\,\sigma_{k,k'} \tag{16}$$

where here $\sigma_{k,k'} = 2q_k\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2$.
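A small sanity check of Eq. 16, assuming a qubit with $\rho = \mathrm{diag}(p, 1-p)$ in the $\sigma_z$ basis and $G = \sigma_x/2$; both the form above and the symmetrized form $2\sum_{k,k'}\frac{(q_k-q_{k'})^2}{q_k+q_{k'}}|\langle k|G|k'\rangle|^2$ should give $(2p-1)^2$:

```python
import numpy as np

p = 0.8
q = np.array([p, 1 - p])
G = 0.5 * np.array([[0, 1], [1, 0]])   # sigma_x / 2

# Eq. 16: F = 4 sum_{k,k'} q_k |G_kk'|^2 ((q_k - q_k')/(q_k + q_k'))^2
F = sum(4 * q[k] * abs(G[k, kp])**2 * ((q[k] - q[kp]) / (q[k] + q[kp]))**2
        for k in range(2) for kp in range(2))

# Equivalent symmetric form: 2 sum (q_k - q_k')^2 / (q_k + q_k') |G_kk'|^2
F_sym = sum(2 * (q[k] - q[kp])**2 / (q[k] + q[kp]) * abs(G[k, kp])**2
            for k in range(2) for kp in range(2))
print(F, F_sym)   # both equal (2p - 1)^2 = 0.36
```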

2.3 Example 2: Pure state

A pure state satisfies $\rho^2 = \rho$. Using this relation, the first-order derivative gives

$$\partial_\theta\rho_\theta = \partial_\theta\rho_\theta^2 = \rho_\theta(\partial_\theta\rho_\theta) + (\partial_\theta\rho_\theta)\rho_\theta$$

Comparing with the symmetric logarithmic derivative $\partial_\theta\rho_\theta = \frac{1}{2}(\rho_\theta L + L\rho_\theta)$, we identify

$$L = 2\,\partial_\theta\rho_\theta = 2\left[|\partial_\theta\psi_\theta\rangle\langle\psi_\theta| + |\psi_\theta\rangle\langle\partial_\theta\psi_\theta|\right]$$

where $\rho_\theta = |\psi_\theta\rangle\langle\psi_\theta|$ with $|\psi_\theta\rangle = U_\theta|\psi_0\rangle$. Now the Fisher information can be written as

$$F[\hat\rho_\theta] = \mathrm{Tr}[\partial_\theta\rho_\theta\, L] = \frac{1}{2}\mathrm{Tr}[L^2]$$

where

$$\mathrm{Tr}[L^2] = 8\left(\langle\psi_\theta|\partial_\theta\psi_\theta\rangle^2 + \langle\partial_\theta\psi_\theta|\partial_\theta\psi_\theta\rangle\right)$$

using $\langle\partial_\theta\psi_\theta|\psi_\theta\rangle = -\langle\psi_\theta|\partial_\theta\psi_\theta\rangle$. With

$$|\partial_\theta\psi_\theta\rangle = \partial_\theta U_\theta|\psi_0\rangle = -iG|\psi_\theta\rangle$$

$$\langle\partial_\theta\psi_\theta|\partial_\theta\psi_\theta\rangle = \langle\psi_\theta|G^2|\psi_\theta\rangle = \langle\psi_0|U_\theta^\dagger G^2 U_\theta|\psi_0\rangle = \langle\psi_0|G^2|\psi_0\rangle$$

$$\langle\partial_\theta\psi_\theta|\psi_\theta\rangle = i\langle\psi_\theta|G|\psi_\theta\rangle = i\langle\psi_0|G|\psi_0\rangle$$

so that $\langle\psi_\theta|\partial_\theta\psi_\theta\rangle^2 = -\langle\psi_0|G|\psi_0\rangle^2$.

Lastly, the Fisher information reads

$$F[\hat\rho_\theta] = 4\left(\langle\psi_\theta|G^2|\psi_\theta\rangle - \langle\psi_\theta|G|\psi_\theta\rangle^2\right) = 4\langle\psi_0|\Delta G^2|\psi_0\rangle = 4\langle\Delta G^2\rangle$$
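As a concrete check of $F = 4\langle\Delta G^2\rangle$, take the pure state $|\psi\rangle = (|0\rangle+|1\rangle)/\sqrt{2}$ with $G = \sigma_z/2$ (a minimal sketch):

```python
import numpy as np

psi = np.array([1.0, 1.0]) / np.sqrt(2)   # (|0> + |1>)/sqrt(2)
G = 0.5 * np.diag([1.0, -1.0])            # sigma_z / 2

mean_G  = psi @ G @ psi                   # <G>   = 0
mean_G2 = psi @ G @ G @ psi               # <G^2> = 1/4
F = 4 * (mean_G2 - mean_G**2)
print(F)   # F = 1, so Delta theta >= 1/sqrt(F) = 1
```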

The Cramer-Rao bound becomes

$$\Delta\theta \ge \frac{1}{\sqrt{4\langle\Delta G^2\rangle}} = \frac{1}{2\sqrt{\langle\Delta G^2\rangle}} \tag{17}$$

where it can be seen that the lower bound is independent of θ but depends on the fluctuation of the generator G, manifesting a truly quantum contribution. The Hamiltonians in quantum optics are mainly governed by fermionic operators/generators $G \in \{\sigma_+, \sigma_-, \sigma_z\}$ or bosonic operators/generators $G \in \{a, a^\dagger\}$, whose minimum fluctuations are given by the Heisenberg uncertainty principle.
Bosonic:

$$\Delta x\,\Delta p \ge \frac{\hbar}{2}$$

Fermionic (see appendix for the derivation):

$$\frac{\hbar}{2}\left|\epsilon_{ijk}\langle J_k\rangle\right| \le \Delta J_i\,\Delta J_j$$

2.4 Example 3: Mixed state

From Eq. 16,

$$F[\hat\rho_\theta, G] = 4\sum_{k,k'} q_k\,|\langle k|G|k'\rangle|^2\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2$$

Meanwhile, the variance of G is defined as

$$\langle\Delta G^2\rangle = \langle G^2\rangle - \langle G\rangle^2 = \mathrm{Tr}[G^2\rho_0] - \mathrm{Tr}[G\rho_0]^2$$

where

$$\mathrm{Tr}[G^2\rho_0] = \sum_{k,k'} q_k\,\langle k|G|k'\rangle\langle k'|G|k\rangle = \sum_{k,k'} q_k\,|\langle k|G|k'\rangle|^2, \qquad \mathrm{Tr}[G\rho_0]^2 = \left(\sum_k q_k\langle k|G|k\rangle\right)^2$$

To isolate the part of the Fisher information that carries solely the state-mixing information, we rewrite the Fisher information as

$$F[\hat\rho_\theta, G] = 4\langle\Delta G^2\rangle + 4\langle G\rangle^2 - 4\langle G^2\rangle + 4\sum_{k,k'} q_k\,|\langle k|G|k'\rangle|^2\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2$$

$$= 4\langle\Delta G^2\rangle + 4\langle G\rangle^2 + 4\sum_{k,k'} q_k\,|\langle k|G|k'\rangle|^2\left[\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2 - 1\right]$$

$$= 4\langle\Delta G^2\rangle + 4\langle G\rangle^2 - 8\sum_{k,k'} \frac{q_k\, q_{k'}}{q_k + q_{k'}}\,|\langle k|G|k'\rangle|^2$$

where the terms beyond $4\langle\Delta G^2\rangle$ represent the contribution from mixing (they cancel for a pure state).

2.5 Example 4: GHZ state

$$|GHZ\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle^{\otimes N} + |1\rangle^{\otimes N}\right)$$

We show that the Fisher information is $F[\rho_{GHZ}, J_z] = 4e_0^2 N^2$, where $J_z = \sum_{k=1}^{N} e_0\,\sigma_z^{(k)}$.
Now

$$\rho_{GHZ} = |GHZ\rangle\langle GHZ| = \frac{1}{2}\sum_{l,l'} |l\rangle\langle l'| = \sum_k^M q_k|k\rangle\langle k|$$

where $\{|l\rangle, |l'\rangle\} \in \{|0\rangle^{\otimes N}, |1\rangle^{\otimes N}\}$. After the eigendecomposition,

$$|k=1\rangle = \tfrac{1}{\sqrt{2}}\left(|0\rangle^{\otimes N} + |1\rangle^{\otimes N}\right), \quad q_1 = 1$$
$$|k=2\rangle = \tfrac{1}{\sqrt{2}}\left(|0\rangle^{\otimes N} - |1\rangle^{\otimes N}\right), \quad q_2 = 0$$
$$\vdots$$
$$|k=2^N\rangle = \dots, \quad q_{2^N} = 0$$

Comparing with $\rho = \sum_k^M q_k|k\rangle\langle k|$, it is apparent that the density matrix has rank 1 after the decomposition, i.e. $M = 1$ and $q_k = \delta_{k,1}$. The initial state then evolves under the spin operator $J_z$:

$$\rho_\theta = \exp(-iJ_z\theta)\,\rho\,\exp(iJ_z\theta)$$


The action of the operator $J_z$ on the GHZ components gives

$$J_z|0\rangle^{\otimes N} = Ne_0|0\rangle^{\otimes N}, \qquad J_z|1\rangle^{\otimes N} = -Ne_0|1\rangle^{\otimes N}$$

hence

$$\langle k=2|J_z|k=1\rangle = \frac{1}{2}\left(\langle 0|^{\otimes N} - \langle 1|^{\otimes N}\right)J_z\left(|0\rangle^{\otimes N} + |1\rangle^{\otimes N}\right) = \frac{1}{2}\left(\langle 0|^{\otimes N} - \langle 1|^{\otimes N}\right)Ne_0\left(|0\rangle^{\otimes N} - |1\rangle^{\otimes N}\right) = Ne_0$$

$$|\langle k=1|J_z|k=2\rangle|^2 = e_0^2 N^2$$

From Eq. 16, substituting $J_z$ for G in the Fisher information,

$$\mathrm{Tr}[\rho\, L_\theta^2] = 4\sum_{k,k'} q_k\,|\langle k|J_z|k'\rangle|^2\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2 = 4\,|\langle k=1|J_z|k=2\rangle|^2\, q_1\left(\frac{q_1 - q_2}{q_1 + q_2}\right)^2$$

$$F[\rho_{GHZ}, J_z] = 4e_0^2 N^2$$

where $|\langle k|J_z|k'\rangle|^2 = 0$ for $k = k'$ and for $k, k' > 2$.
For the collective spin operator, $e_0 = 1/2$. The quantum Cramer-Rao bound is then

$$\Delta\theta \ge \Delta\theta_{CRB} = \frac{1}{N}$$

which is the Heisenberg limit. For the transverse components, $F[\rho_{GHZ}, J_x] = F[\rho_{GHZ}, J_y] = 4e_0^2 N$.
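The Heisenberg scaling can be checked numerically: since $\rho_{GHZ}$ is pure, $F = 4\langle\Delta J_z^2\rangle$, and the sketch below (the helper name `collective_jz` is illustrative) builds $J_z$ with $e_0 = 1/2$ and recovers $F = N^2$:

```python
import numpy as np
from functools import reduce

def collective_jz(N):
    """J_z = sum_k (1/2) sigma_z^(k) on N qubits (e_0 = 1/2)."""
    sz, eye = np.diag([0.5, -0.5]), np.eye(2)
    return sum(reduce(np.kron, [sz if i == k else eye for i in range(N)])
               for k in range(N))

N = 4
ghz = np.zeros(2**N)
ghz[0] = ghz[-1] = 1 / np.sqrt(2)   # (|0...0> + |1...1>)/sqrt(2)
Jz = collective_jz(N)

F = 4 * (ghz @ Jz @ Jz @ ghz - (ghz @ Jz @ ghz)**2)   # F = 4 <Delta Jz^2>
print(F, N**2)   # Heisenberg scaling: F = N^2
```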

2.6 Example 5: Separable state

Consider the state $|0\rangle^{\otimes N}$, with the measurement made along the x basis. Find $F[\rho_{sep}, J_x]$, where $J_x = \sum_{k=1}^{N} e_0\,\sigma_x^{(k)}$.
The density matrix of the state reads

$$\rho = (|0\rangle\langle 0|)^{\otimes N} = \sum_k q_k|k\rangle\langle k|$$

where $|k\rangle \in \{|0\rangle, |1\rangle\}^{\otimes N}$:

$$|k=1\rangle = |0\rangle^{\otimes N}, \quad q_1 = 1$$
$$|k=2\rangle = |00\dots1\rangle, \quad q_2 = 0$$
$$\vdots$$
$$|k=2^N\rangle = |1\rangle^{\otimes N}, \quad q_{2^N} = 0$$

Comparing with $\rho = \sum_k^M q_k|k\rangle\langle k|$, it is apparent that the density matrix has rank 1 after the decomposition, i.e. $M = 1$ and $q_k = \delta_{k,1}$. The initial state then evolves under the spin operator $J_x$:

$$\rho_\theta = \exp(-iJ_x\theta)\,\rho\,\exp(iJ_x\theta)$$

The action of the operator $J_x$ on the initial state gives

$$J_x|0\rangle^{\otimes N} = e_0\left[|10\dots0\rangle + |01\dots0\rangle + \dots + |00\dots1\rangle\right]$$

where the initial product state becomes a sum of $\binom{N}{1} = N$ single-excitation states.
From Eq. 16, substituting $J_x$ for G in the Fisher information,

$$\mathrm{Tr}[\rho\, L_\theta^2] = 4\sum_{k,k'} q_k\,|\langle k|J_x|k'\rangle|^2\left(\frac{q_k - q_{k'}}{q_k + q_{k'}}\right)^2 = 4\sum_{k} |\langle k|J_x|0\rangle^{\otimes N}|^2\, q_1\left(\frac{q_1 - q_k}{q_1 + q_k}\right)^2$$

$$F[\rho_{sep}, J_x] = 4e_0^2 N$$

For the collective spin operator, $e_0 = \frac{1}{2}$. The quantum Cramer-Rao bound is then

$$\Delta\theta \ge \Delta\theta_{CRB} = \frac{1}{\sqrt{N}}$$

which is the standard quantum limit. For the other components, $F[\rho_{sep}, J_z] = 0$ and $F[\rho_{sep}, J_y] = 4e_0^2 N$.
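The standard-quantum-limit scaling can likewise be checked: the probe $|0\rangle^{\otimes N}$ is pure, so $F = 4\langle\Delta J_x^2\rangle$, which the sketch below (mirroring the GHZ check, with $e_0 = 1/2$) shows equals N:

```python
import numpy as np
from functools import reduce

def collective_jx(N):
    """J_x = sum_k (1/2) sigma_x^(k) on N qubits (e_0 = 1/2)."""
    sx, eye = 0.5 * np.array([[0, 1], [1, 0]]), np.eye(2)
    return sum(reduce(np.kron, [sx if i == k else eye for i in range(N)])
               for k in range(N))

N = 4
psi = np.zeros(2**N)
psi[0] = 1.0                        # |0>^{tensor N}
Jx = collective_jx(N)

F = 4 * (psi @ Jx @ Jx @ psi - (psi @ Jx @ psi)**2)   # F = 4 <Delta Jx^2>
print(F, N)   # standard quantum limit: F = N, Delta theta >= 1/sqrt(N)
```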

3 Appendix
3.1 Derivation of scalar CRLB

A scalar parameter is given by $\alpha = g(\theta)$, which is parametrized by θ. Assume an unbiased estimator,

$$E(\hat\alpha) = \int \hat\alpha\, p(x;\theta)\,dx = \alpha = g(\theta)$$

where $p(x;\theta)$ is the PDF.

Let the log-likelihood function be $l(x;\theta) = \ln p(x;\theta)$. Its derivatives with respect to θ are

$$l'(x;\theta) = \frac{p'(x;\theta)}{p(x;\theta)}, \qquad l''(x;\theta) = \frac{p''(x;\theta)\,p(x;\theta) - p'(x;\theta)^2}{p(x;\theta)^2}$$
Integrating the first derivative against the PDF gives

$$E(l'(x;\theta)) = \int l'(x;\theta)\,p(x;\theta)\,dx = \int \frac{\partial}{\partial\theta}p(x;\theta)\,dx = \frac{\partial}{\partial\theta}\int p(x;\theta)\,dx = \frac{\partial}{\partial\theta}\,1 = 0$$

The regularity condition is satisfied if the order of differentiation and integration is interchangeable.


Integrating the second derivative gives

$$E(l''(x;\theta)) = \int l''(x;\theta)\,p(x;\theta)\,dx = \int\left[p''(x;\theta) - \frac{p'(x;\theta)^2}{p(x;\theta)}\right]dx$$

$$= \int\left[\frac{\partial^2}{\partial\theta^2}p(x;\theta) - l'(x;\theta)^2\,p(x;\theta)\right]dx = -\int l'(x;\theta)^2\,p(x;\theta)\,dx = -E\left(l'(x;\theta)^2\right) = -I$$

where $I = E\left(l'(x;\theta)^2\right) = -E(l''(x;\theta))$ is the Fisher information.
Now let us look back at the scalar parameter α. Differentiating both sides of $E(\hat\alpha) = g(\theta)$,

$$\frac{\partial}{\partial\theta}\int \hat\alpha\, p(x;\theta)\,dx = \frac{\partial g(\theta)}{\partial\theta}$$

or

$$\int \hat\alpha\, l'(x;\theta)\,p(x;\theta)\,dx = \frac{\partial g(\theta)}{\partial\theta}$$

or

$$\int (\hat\alpha - \alpha)\, l'(x;\theta)\,p(x;\theta)\,dx = \frac{\partial g(\theta)}{\partial\theta}$$

where

$$\int \alpha\, l'(x;\theta)\,p(x;\theta)\,dx = \alpha\, E(l'(x;\theta)) = 0$$

Using the Cauchy-Schwarz inequality, $\Delta x^2\,\Delta y^2 \ge (\Delta xy)^2$, or in integral form,

$$\left(\int h(x)^2 w(x)\,dx\right)\left(\int g(x)^2 w(x)\,dx\right) \ge \left(\int g(x)h(x)w(x)\,dx\right)^2$$

where the condition of equality implies $g(x) = c\,h(x)$ with c a constant independent of x. Now let

$$w(x) = p(x;\theta), \qquad h(x) = l'(x;\theta), \qquad g(x) = \hat\alpha - \alpha$$

The inequality becomes

$$\left(\int l'(x;\theta)^2 p(x;\theta)\,dx\right)\left(\int (\hat\alpha - \alpha)^2 p(x;\theta)\,dx\right) \ge \left(\int (\hat\alpha - \alpha)\,l'(x;\theta)\,p(x;\theta)\,dx\right)^2$$

$$E\left(l'(x;\theta)^2\right)\mathrm{Var}(\hat\alpha) \ge \left(\frac{\partial g(\theta)}{\partial\theta}\right)^2$$

If the parameter is unbiased, $g(\theta) = \alpha = \theta$ and $g'(\theta) = 1$. Then we get the CRLB

$$\mathrm{Var}(\hat\alpha) \ge \frac{1}{E\left(l'(x;\theta)^2\right)}$$

3.2 Symmetric logarithmic derivative

3.3 Derivation of quantum Fisher information

Relation of the symmetric logarithmic derivative to the Lyapunov equation

The continuous Lyapunov equation is of the form

$$AX + XA^H + Q = 0$$

where $A, Q, X \in \mathbb{C}^{n\times n}$.
Given any $Q > 0$, there exists a unique $X > 0$ satisfying $AX + XA^H + Q = 0$ iff the linear system $\dot x = Ax$ is globally asymptotically stable; the quadratic function $V(x) = x^T X x$ is then a Lyapunov function that can be used to verify the stability.
The solution to the Lyapunov equation is

$$X = \int_0^\infty e^{A\tau}\, Q\, e^{A^H\tau}\,d\tau$$

The SLD equation has the form of a Lyapunov equation with $A = -\hat\rho_\theta/2$ and $Q = \partial_\theta\rho_\theta$:

$$-\frac{1}{2}\left(\rho_\theta L_\theta + L_\theta\rho_\theta\right) + \partial_\theta\rho_\theta = 0$$

Then the solution reads

$$L_\theta = 2\int_0^\infty e^{-\rho_\theta\tau}\,\partial_\theta\rho_\theta\,e^{-\rho_\theta\tau}\,d\tau$$
0
Now expanding in the eigenbasis $\rho_\theta = \sum_k q_k|k\rangle\langle k|$,

$$L_\theta = 2\int_0^\infty \left(\sum_k e^{-q_k\tau}|k\rangle\langle k|\right)\partial_\theta\rho_\theta\left(\sum_{k'} e^{-q_{k'}\tau}|k'\rangle\langle k'|\right)d\tau = 2\sum_{k,k'}\left(\int_0^\infty e^{-(q_k+q_{k'})\tau}\,d\tau\right)\langle k|\partial_\theta\rho_\theta|k'\rangle\,|k\rangle\langle k'|$$

$$L_\theta = 2\sum_{k,k'} \frac{\langle k|\partial_\theta\rho_\theta|k'\rangle}{q_k + q_{k'}}\,|k\rangle\langle k'| \tag{18}$$

where $q_k + q_{k'} \ne 0$.
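The Lyapunov route can be tested directly with SciPy, assuming it is available: `scipy.linalg.solve_continuous_lyapunov(A, Q)` solves $AX + XA^H = Q$, so choosing $A = -\rho_\theta/2$, $Q = -\partial_\theta\rho_\theta$ yields the SLD. For the unitary qubit family of Sec. 2.2 with $\rho_0 = \mathrm{diag}(p, 1-p)$ and $G = \sigma_x/2$, $\mathrm{Tr}[\rho_\theta L_\theta^2]$ should be θ-independent and equal $(2p-1)^2$:

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

p, theta = 0.8, 0.4
rho0 = np.diag([p, 1 - p]).astype(complex)
G = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)   # sigma_x / 2

U = expm(-1j * G * theta)
rho = U @ rho0 @ U.conj().T
drho = -1j * (G @ rho - rho @ G)    # d_theta rho = -i [G, rho_theta]

# Solve (-rho/2) L + L (-rho/2)^H = -drho, i.e. (1/2)(rho L + L rho) = drho
L = solve_continuous_lyapunov(-rho / 2, -drho)
F_Q = np.trace(rho @ L @ L).real
print(F_Q)   # (2p - 1)^2 = 0.36, independent of theta
```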
The Fisher information reads

$$F[\hat\rho_\theta] = \mathrm{Tr}[\partial_\theta\rho_\theta\, L_\theta] = 2\sum_m\sum_{k,k'} \frac{\langle k|\partial_\theta\rho_\theta|k'\rangle}{q_k + q_{k'}}\,\langle m|\partial_\theta\rho_\theta|k\rangle\langle k'|m\rangle = \sum_{k,k'} \frac{2}{q_k + q_{k'}}\,\left|\langle k'|\partial_\theta\rho_\theta|k\rangle\right|^2$$
P
Now we work out the term $\partial_\theta\rho_\theta = \sum_k\left(\partial_\theta q_k\,|k\rangle\langle k| + q_k|\partial_\theta k\rangle\langle k| + q_k|k\rangle\langle\partial_\theta k|\right)$:

$$L_\theta = 2\sum_{k,k'} \frac{\langle k|\sum_l\left(\partial_\theta q_l|l\rangle\langle l| + q_l|\partial_\theta l\rangle\langle l| + q_l|l\rangle\langle\partial_\theta l|\right)|k'\rangle}{q_k + q_{k'}}\,|k\rangle\langle k'|$$

$$= 2\sum_k \frac{\partial_\theta q_k}{2q_k}|k\rangle\langle k| + 2\sum_{k,k'} \frac{q_{k'}\langle k|\partial_\theta k'\rangle}{q_k + q_{k'}}|k\rangle\langle k'| + 2\sum_{k,k'} \frac{q_k\langle\partial_\theta k|k'\rangle}{q_k + q_{k'}}|k\rangle\langle k'|$$

$$= \sum_k \frac{\partial_\theta q_k}{q_k}|k\rangle\langle k| + 2\sum_{k'\ne k} \frac{q_{k'} - q_k}{q_{k'} + q_k}\,\langle k|\partial_\theta k'\rangle\,|k\rangle\langle k'|$$

where we have used $\langle k|k'\rangle = \delta_{k,k'}$ and $\partial_\theta\langle k|k'\rangle = \langle\partial_\theta k|k'\rangle + \langle k|\partial_\theta k'\rangle = 0$, which implies $\langle\partial_\theta k|k'\rangle = -\langle k|\partial_\theta k'\rangle$.
The generalized Fisher information then reads

$$\mathrm{Tr}[\partial_\theta\rho_\theta\, L_\theta] = \sum_k \frac{(\partial_\theta q_k)^2}{q_k} + 2\sum_{k'\ne k} q_{k'}\frac{q_{k'} - q_k}{q_{k'} + q_k}\left|\langle k|\partial_\theta k'\rangle\right|^2 - 2\sum_{k'\ne k} q_k\frac{q_{k'} - q_k}{q_{k'} + q_k}\left|\langle k|\partial_\theta k'\rangle\right|^2$$

$$F[\hat\rho_\theta] = \sum_k \frac{(\partial_\theta q_k)^2}{q_k} + \sum_{k'\ne k} \sigma_{k,k'}\left|\langle k|\partial_\theta k'\rangle\right|^2$$

where $\sigma_{k,k'} = \frac{2(q_{k'} - q_k)^2}{q_{k'} + q_k}$.

3.4 Uncertainty relation in angular momentum component

$$\langle J^2\rangle = \langle J_x^2 + J_y^2 + J_z^2\rangle = \hbar^2 j(j+1)$$

$$\langle J\rangle^2 = \langle J_x\rangle^2 + \langle J_y\rangle^2 + \langle J_z\rangle^2 = \langle J_z\rangle^2 = (m\hbar)^2 \le j^2\hbar^2$$

taking the quantization axis along z. Now the variance of the total angular momentum reads

$$(\Delta J)^2 = \langle(J - \langle J\rangle)^2\rangle = \langle J^2\rangle - \langle J\rangle^2 \ge \hbar^2 j(j+1) - j^2\hbar^2$$

so that

$$\Delta J_x^2 + \Delta J_y^2 + \Delta J_z^2 \ge j\hbar^2$$

The uncertainty relation between the angular momentum components can be derived through the commutation relation

$$[J_i, J_j] = i\hbar\,\epsilon_{ijk}J_k, \qquad \left|\langle[J_i, J_j]\rangle\right| = \hbar\left|\epsilon_{ijk}\right|\left|\langle J_k\rangle\right|$$

Using the Robertson relation (which follows from the Cauchy-Schwarz inequality), $\Delta J_i\,\Delta J_j \ge \frac{1}{2}\left|\langle[J_i, J_j]\rangle\right|$, we then get

$$\frac{\hbar}{2}\left|\epsilon_{ijk}\langle J_k\rangle\right| \le \Delta J_i\,\Delta J_j$$
Spin squeezing
Consider a non-entangled/separable state composed of N single-qubit density matrices $\rho_k^{(i)}$,

$$\rho = \sum_k p_k\,\rho_k^{(1)} \otimes \rho_k^{(2)} \otimes \dots \otimes \rho_k^{(N)}$$

where the $p_k$ are positive real numbers fulfilling the completeness relation $\sum_k p_k = 1$; in $\rho_k^{(i)}$, i is the qubit index and k is the summation index.
The variance of the angular momentum component along the z direction is

$$(\Delta J_z)^2 = \langle J_z^2\rangle - \langle J_z\rangle^2$$

where $J_z = \sum_{i=1}^{N} j_z^{(i)}$. The spin-squeezing parameter for mutually orthogonal components $(i, j, k)$ is

$$\xi^2 \equiv \frac{N(\Delta J_k)^2}{\langle J_i\rangle^2 + \langle J_j\rangle^2} \ge \frac{1}{N}\,\frac{\langle J_k^2\rangle - \langle J_k\rangle^2}{\langle J\rangle^2 + \langle J_k\rangle^2}$$

References
[1] Steven M. Kay. Fundamentals of statistical signal processing. 1993.

[2] Matteo G. A. Paris. Quantum estimation for quantum technology. International Journal of Quantum Information, 7(Suppl.):125-137, 2009.

[3] Jader P Santos. Quantum estimation theory.

[4] Songfeng Zheng. Fisher Information and Cramér-Rao Bound.

