http://www.cs.tut.fi/kurssit/ELT-53606/
Network analysis and dimensioning I D.Moltchanov, TUT, 2013
OUTLINE:
• Why do we need statistics?
• Description of statistical data;
• Statistical CDFs and histograms;
• Estimating parameters of the sample;
• Point estimators of parameters;
• Interval estimators of parameters;
• Criteria of fitting accuracy;
• Criteria of homogeneity of samples;
• Statistics of stochastic processes;
• Tests for autocorrelations and white noise;
• Tests for stationarity.
where
• σ²[X] is the estimate of the variance;
• m is the estimate of the mean, given by m = (1/N) Σ_{i=1}^{N} X_i.
Important: such an estimate of the variance is biased, meaning that it makes a systematic error.
Sample (left) and ordered sample (right):

i   X_i     i   X_(i)
1   82      1   75
2   80      2   78
3   80      3   78
4   78      4   80
5   78      5   80
6   84      6   81
7   82      7   82
8   75      8   82
9   85      9   84
10  81      10  85
[Figure: statistical CDF F*_X(x) = Pr{X ≤ x}, a step function with jumps at the ordered sample points x_1, …, x_5.]
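The statistical CDF can be computed directly from the sample; a minimal Python sketch (the function name `ecdf` is mine, not the lecture's), using the 10 measurements from the table above:

```python
# Statistical (empirical) CDF: F*_X(x) = (number of X_i <= x) / N.
def ecdf(sample, x):
    """Fraction of measurements not exceeding x."""
    return sum(1 for v in sample if v <= x) / len(sample)

# The 10 measurements from the table above.
sample = [82, 80, 80, 78, 78, 84, 82, 75, 85, 81]
ordered = sorted(sample)  # variational (ordered) series: 75, 78, ..., 85

print(ordered[0], ordered[-1])   # sample minimum and maximum: 75 85
print(ecdf(sample, 80))          # fraction of values <= 80: 0.5
```

Between consecutive ordered sample points the function is constant, which produces the step shape of the statistical CDF.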
[Figures: statistical CDFs F_i(Δ), labelled "SPDF: 100 random numbers with Normal(15,30)" and "SPDF: 1000 random numbers with Normal(15,30)".]
p*_1  p*_2  p*_3  p*_4  …  p*_i  …  p*_N
[Figures: histograms f_{i,E}(Δ) and counts E(i) for 100 and 10000 random numbers with Geom(0.1).]
[Figure: statistical PDF f*_X(x) over the intervals bounded by x_1, …, x_5.]
[Figures: histograms f_{i,E}(Δ) and counts E(i) for 100 and 1000 random numbers with Normal(15,30).]
[Figures: histograms f_{i,E}(Δ) and counts E(i) for 100 and 10000 random numbers with Geom(0.1).]
a* = φ(X_1, X_2, …, X_N).  (13)
Consistency: a* → a, N → ∞.  (20)
Unbiasedness: E[a*] = a, N → ∞.  (21)
Efficiency: σ²[a*] → 0, N → ∞.  (22)
• the first term is the sample mean of X², which probabilistically tends to E[X²];
• the second term m² probabilistically tends to E²[X];
• finally, we find that the estimator is consistent:

σ*²[X] = lim_{N→∞} [(1/N) Σ_{i=1}^{N} X_i² − m²] = α₂[X] − E²[X] = σ²[X],  (29)

where α₂[X] = E[X²] is the second moment.
• note that for the centered measurements X'_i = X_i − E[X], i = 1, 2, …, N, we have:

E[(X'_i)²] = σ²[X],  i = 1, 2, …, N.  (33)

• substituting, we get:

E[σ*²[X]] = ((N−1)/N²) Σ_{i=1}^{N} σ²[X] − (2/N²) Σ_{i<j} E[X'_i X'_j] = ((N−1)/N²) N σ²[X] = ((N−1)/N) σ²[X],  (35)

since for independent measurements E[X'_i X'_j] = 0, i ≠ j.
Important note:
• the multiplier N/(N − 1) must be taken into account whenever N < 50;
• as N grows, the multiplier N/(N − 1) tends to one and can be dropped.
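The bias and its correction are easy to see by simulation; a small illustrative sketch (parameters and names are mine, not the lecture's):

```python
import random

def biased_var(xs):
    # sigma*^2[X] = (1/N) sum (X_i - m)^2 -- biased: its mean is (N-1)/N * sigma^2
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

random.seed(1)
N, trials = 5, 20000
# average the biased estimator over many samples from N(0, 1), true variance 1.0
avg = sum(biased_var([random.gauss(0, 1) for _ in range(N)])
          for _ in range(trials)) / trials

print(avg)               # close to (N-1)/N = 0.8, not 1.0: systematic error
print(avg * N / (N - 1)) # after the N/(N-1) correction: close to 1.0
```

With N = 5 the correction matters; for large N the multiplier N/(N − 1) ≈ 1 and the two estimates practically coincide.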
• where

m_X = (1/N) Σ_{i=1}^{N} X_i,   m_Y = (1/N) Σ_{i=1}^{N} Y_i.  (40)
• which is equivalent to

Pr{ E[X] − z_{α/2} σ[X]/√N ≤ μ ≤ E[X] + z_{α/2} σ[X]/√N } ≈ 1 − α,  (46)

– substitute your values to get the 100(1 − α)% confidence interval for μ;
– where −z_{α/2} and z_{α/2} are the lower and upper critical values of N(0, 1).
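A sketch of computing the interval (46) in Python (function name and test data are mine; z = 1.96 is the α = 0.05 critical value of N(0, 1)):

```python
import math
import random

def mean_ci(xs, z=1.96):
    """Confidence interval (46) for the mean, z = z_{alpha/2}."""
    n = len(xs)
    m = sum(xs) / n
    # unbiased sample standard deviation (with the N/(N-1) correction)
    s = math.sqrt(sum((x - m) ** 2 for x in xs) / (n - 1))
    half = z * s / math.sqrt(n)
    return m - half, m + half

random.seed(7)
data = [random.gauss(50, 10) for _ in range(100)]  # true mean 50
lo, hi = mean_ci(data)
print(lo, hi)   # the interval covers the true mean in about 95% of runs
```

Note the √N in the denominator: quadrupling the sample size only halves the interval width.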
[Figure: N(0, 1) density with tail areas α/2 beyond the critical values ±z_{α/2}.]
Let us take a null hypothesis H0 stating that the RV X has the following PF:

x:  x_1  x_2  x_3  x_4  …  x_i  …  x_k
p:  p_1  p_2  p_3  p_4  …  p_i  …  p_k
• deviations of the frequencies p*_i from the probabilities p_i are due purely to stochastic reasons.
Note: to verify or reject this hypothesis we have to define a certain measure of deviation.
r = k − l − 1,  (51)

where l is the number of distribution parameters estimated from the sample.
• count the frequencies n_i of observations falling in each interval, i = 1, 2, …, k;
• using the hypothetical function F(x), estimate p_i = Pr{X ∈ Δ_i}, i = 1, 2, …, k;
• using tables, get the quantile χ²_{1−α}(r);
• estimate the χ² statistic as

χ² = Σ_{i=1}^{k} (n_i − n p_i)² / (n p_i),  (58)

where n is the sample size.
– note: np_5 = 0.6 < 5 and np_4 = 3 < 5, while np_3 = 12.2 > 5;
– we have to join the last three intervals (an important empirical rule: each expected count np_i should be at least 5).
• determine the number of degrees of freedom: r = k − l − 1 = 4 − 1 − 1 = 2;
• choosing α = 0.05, get χ²_{1−α}(r) = χ²_{0.95}(2) = 5.99;
• estimate the χ² statistic;
– since χ² = 0.9 < χ²_{0.95}(2) = 5.99, we accept H0.
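The fitting procedure above can be sketched in Python; the counts and probabilities below are an illustrative fair-die example of mine, not the lecture's data:

```python
def chi2_stat(counts, probs):
    """Chi-square statistic (58): sum over i of (n_i - n*p_i)^2 / (n*p_i)."""
    n = sum(counts)
    return sum((ni - n * pi) ** 2 / (n * pi)
               for ni, pi in zip(counts, probs))

# hypothetical example: 60 throws of a die, H0: the die is fair
counts = [9, 11, 10, 8, 12, 10]   # observed frequencies n_i
probs = [1 / 6] * 6               # hypothetical probabilities p_i
chi2 = chi2_stat(counts, probs)
r = len(counts) - 0 - 1           # r = k - l - 1, no estimated parameters (l = 0)

print(chi2)           # 1.0 here; all n*p_i = 10 >= 5, so no joining of intervals
print(chi2 < 11.07)   # True: below chi^2_{0.95}(5) = 11.07, so H0 is accepted
```

The comparison step is always the same: accept H0 when the statistic falls below the χ²_{1−α}(r) quantile.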
7. Homogeneity of samples
Homogeneity of samples:
• we test whether two samples are taken from the same distribution.
We have:
• criteria for fitting to a hypothetical distribution;
• criteria for homogeneity of samples.
The difference between the two:
• fitting: we compare a statistical distribution with an analytical one;
• homogeneity: we compare two statistical distributions.
What tests are available:
• Smirnov’s test;
• χ2 test.
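Smirnov's statistic is the maximum distance between the two statistical CDFs, D = max |F*₁(x) − F*₂(x)|; a sketch with illustrative data of my own:

```python
def smirnov_d(xs, ys):
    """Two-sample Smirnov statistic: max |F*_1(x) - F*_2(x)|."""
    pts = sorted(set(xs) | set(ys))
    ecdf = lambda t, s: sum(1 for v in s if v <= t) / len(s)
    return max(abs(ecdf(t, xs) - ecdf(t, ys)) for t in pts)

a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]  # completely separated sample
c = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]           # identical to a

print(smirnov_d(a, b))   # 1.0: the two statistical CDFs never overlap
print(smirnov_d(a, c))   # 0.0: the two statistical CDFs coincide
```

In practice, D scaled by √(n·m/(n + m)) is compared with a quantile of the Kolmogorov distribution to accept or reject homogeneity.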
[Figures: sample autocorrelation functions K_Y(i) versus lag i, (a) AR(1) model with K(1) = 0.6, (b) AR(1) model with K(1) = 0.0.]
White noise: a sequence of iid random variables with finite mean and variance.
The test is as follows:
• out of the 3! = 6 equally likely orderings of three successive observations, four have a turning point in the middle:
– for white noise there should therefore be about (2/3)(n − 2) turning points among n observations;
• for large n, the number of turning points is approximately distributed as N(2n/3, 8n/45);
• reject H0 at α = 0.05 if the number of turning points falls outside 2n/3 ± 1.96 √(8n/45).
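The turning point test above can be sketched as follows (function names and simulated data are mine):

```python
import math
import random

def turning_points(xs):
    """Count points that are strict local maxima or minima."""
    return sum(1 for i in range(1, len(xs) - 1)
               if xs[i - 1] < xs[i] > xs[i + 1]
               or xs[i - 1] > xs[i] < xs[i + 1])

def is_white_noise(xs, z=1.96):
    """Accept H0 if the count falls inside 2n/3 +- z * sqrt(8n/45)."""
    n = len(xs)
    mean, var = 2 * n / 3, 8 * n / 45
    return abs(turning_points(xs) - mean) <= z * math.sqrt(var)

random.seed(3)
iid = [random.random() for _ in range(1000)]
trend = list(range(1000))      # monotone sequence: no turning points at all

print(turning_points(iid))     # around (2/3)(n - 2), i.e. about 665
print(is_white_noise(trend))   # False: 0 is far below 2n/3
```

A pure trend fails immediately, while iid data typically lands within the ±1.96 band.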
First trace:
• χ²_{α=0.05}(5) = 11.07 < Q(5) = 49.013: H0 rejected;
• χ²_{α=0.05}(10) = 18.31 < Q(10) = 51.416: H0 rejected;
• χ²_{α=0.05}(15) = 25.00 < Q(15) = 55.332: H0 rejected.
Second trace:
• χ²_{α=0.05}(5) = 11.07 > Q(5) = 4.193: H0 accepted;
• χ²_{α=0.05}(10) = 18.31 > Q(10) = 8.416: H0 accepted;
• χ²_{α=0.05}(15) = 25.00 > Q(15) = 11.499: H0 accepted.
Figure 18: Example of the 'Portmanteau' test for traces generated from AR(1) models (sample ACF K_Y(i) versus lag i).
First trace:
• χ²_{α=0.05}(5) = 11.07 < Q(5) = 50.789: H0 rejected;
• χ²_{α=0.05}(10) = 18.31 < Q(10) = 53.467: H0 rejected;
• χ²_{α=0.05}(15) = 25.00 < Q(15) = 58.028: H0 rejected.
Second trace:
• χ²_{α=0.05}(5) = 11.07 > Q(5) = 4.454: H0 accepted;
• χ²_{α=0.05}(10) = 18.31 > Q(10) = 9.105: H0 accepted;
• χ²_{α=0.05}(15) = 25.00 > Q(15) = 12.66: H0 accepted.
Figure 19: Example of the modified 'Portmanteau' test for traces generated from AR(1) models (sample ACF K_Y(i) versus lag i).
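The Q statistic behind these tests can be sketched in the Box–Pierce form, Q(h) = n Σ_{i=1}^{h} r_i² (the modified form, Q = n(n+2) Σ r_i²/(n−i), differs only in the weighting); the simulated traces below are mine, not the lecture's:

```python
import random

def acf(xs, lag):
    """Sample autocorrelation coefficient at the given lag."""
    n = len(xs)
    m = sum(xs) / n
    c0 = sum((x - m) ** 2 for x in xs)
    ck = sum((xs[i] - m) * (xs[i + lag] - m) for i in range(n - lag))
    return ck / c0

def box_pierce_q(xs, h):
    """Q(h) = n * sum of squared ACF values for lags 1..h."""
    n = len(xs)
    return n * sum(acf(xs, i) ** 2 for i in range(1, h + 1))

random.seed(5)
white = [random.gauss(0, 1) for _ in range(500)]
ar1 = [0.0]
for _ in range(499):
    ar1.append(0.6 * ar1[-1] + random.gauss(0, 1))  # AR(1) with K(1) = 0.6

print(box_pierce_q(white, 5))  # small: compare with chi^2_{0.95}(5) = 11.07
print(box_pierce_q(ar1, 5))    # large: H0 (white noise) is rejected
```

Under H0 the statistic Q(h) is approximately χ²-distributed with h degrees of freedom, which is why the quantiles χ²_{1−α}(h) serve as the thresholds.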
Figure 20: Signal-to-noise ratio process over IEEE 802.11b wireless channel.
[Figures: histograms f_{i,E}(Δ) of two segments of the trace.]
Counts of each value T in the two parts of the trace:

T:  0, 1, 2, …
m = (0, 1, 10, 61, 72, 126, 329, 914, 602, 513, 476, 522, 433, 427, 870, 353, 78, 36, 21, 16, 7)
n = (0, 11, 17, 24, 36, 31, 70, 251, 213, 176, 240, 212, 197, 167, 266, 118, 55, 49, 26, 19, 7)
Conclusion: since the 1-dimensional distributions are different, the trace is not strictly stationary.
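The two count rows can be compared formally with a χ² homogeneity statistic; a sketch using the counts above (this particular computation is mine, not the lecture's):

```python
def chi2_homogeneity(m, n):
    """Chi-square homogeneity statistic for two rows of frequencies."""
    M, N = sum(m), sum(n)
    chi2, df = 0.0, -1
    for mi, ni in zip(m, n):
        t = mi + ni
        if t == 0:                 # skip empty cells (no observations at all)
            continue
        em = M * t / (M + N)       # expected count in row m
        en = N * t / (M + N)       # expected count in row n
        chi2 += (mi - em) ** 2 / em + (ni - en) ** 2 / en
        df += 1                    # degrees of freedom: nonzero cells minus 1
    return chi2, df

m = [0, 1, 10, 61, 72, 126, 329, 914, 602, 513, 476, 522,
     433, 427, 870, 353, 78, 36, 21, 16, 7]
n = [0, 11, 17, 24, 36, 31, 70, 251, 213, 176, 240, 212,
     197, 167, 266, 118, 55, 49, 26, 19, 7]
chi2, df = chi2_homogeneity(m, n)
print(chi2, df)   # compare chi2 with the chi^2_{0.95}(df) quantile
```

With df = 19 the 0.95 quantile is about 30.14; the statistic far exceeds it here, confirming that the two one-dimensional distributions differ.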