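Convergence in Probability: A sequence of random variables $X_1, X_2, \ldots, X_n, \ldots$ is said to converge in probability to $a$ if, for any $\varepsilon > 0$,

$$\lim_{n\to\infty} P(|X_n - a| < \varepsilon) = 1 \quad\text{or}\quad \lim_{n\to\infty} P(|X_n - a| \ge \varepsilon) = 0.$$

This is written $X_n \xrightarrow{p} a$.

Chebyshev's Theorem: If $X_1, X_2, \ldots, X_n$ is a sequence of random variables, if the mean $\mu_n$ and standard deviation $\sigma_n$ of $X_n$ exist for all $n$, and if $\sigma_n \to 0$ as $n \to \infty$, then $X_n - \mu_n \xrightarrow{p} 0$ as $n \to \infty$.

Proof: By Chebyshev's inequality, for $\varepsilon > 0$,

$$P(|X_n - \mu_n| \ge \varepsilon) \le \frac{\sigma_n^2}{\varepsilon^2} \to 0 \quad\text{as } n \to \infty.$$

Hence $X_n - \mu_n \xrightarrow{p} 0$ as $n \to \infty$, provided $\sigma_n \to 0$ as $n \to \infty$.

Weak Law of Large Numbers (W.L.L.N.): Let $X_1, X_2, \ldots, X_n$ be a sequence of random variables with respective means $\mu_1, \mu_2, \ldots, \mu_n$, and let $B_n = \mathrm{Var}(X_1 + X_2 + \cdots + X_n) < \infty$. Then, for any $\varepsilon > 0$,

$$P\left\{ \left| \frac{X_1 + X_2 + \cdots + X_n}{n} - \frac{\mu_1 + \mu_2 + \cdots + \mu_n}{n} \right| < \varepsilon \right\} \to 1 \quad\text{as } n \to \infty,$$

provided $B_n / n^2 \to 0$ as $n \to \infty$.

If $X_1, X_2, \ldots, X_n$ are independent and identically distributed (i.i.d.), i.e., $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$ for all $i$, then

$$B_n = \mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = n\sigma^2,$$

so that $B_n / n^2 = \sigma^2 / n \to 0$. Thus the W.L.L.N. holds for the sequence of i.i.d. r.v.'s and we get $\bar{X}_n \xrightarrow{p} \mu$, i.e., $\bar{X}_n$ converges in probability to $\mu$.

Conditions for the WLLN to hold:

When the sequence $\{X_n\}$ is independent:
i) $E(X_i)$ exists for all $i$,
ii) $B_n = \mathrm{Var}(X_1 + X_2 + \cdots + X_n)$ exists,
iii) $B_n / n^2 \to 0$ as $n \to \infty$.

When the sequence $\{X_n\}$ is i.i.d., the existence of $E(X_i)$ is enough for the WLLN to hold. This result is called Khinchin's Theorem.

If at least one of the conditions is not met, then we apply the further test, Markov's theorem.

Markov's theorem: The WLLN holds if, for some $\delta > 0$, $E(|X_i|^{1+\delta})$ exists and is bounded for all $i$. Markov's theorem provides only a sufficient condition for the WLLN to hold. This means that if $E(|X_i|^{1+\delta})$ is unbounded for every $\delta > 0$, the theorem gives no guarantee, and the WLLN cannot be established for the sequence of r.v.'s $\{X_n\}$ by this test.

Result: If the variables are uniformly bounded, then the condition

$$\lim_{n\to\infty} \frac{B_n}{n^2} = 0$$

is necessary as well as sufficient for the WLLN to hold.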
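The i.i.d. case above can be illustrated numerically. The following is a minimal sketch, assuming Uniform(0, 1) variables (so $\mu = 1/2$, $\sigma^2 = 1/12$); the function names `chebyshev_bound` and `empirical_tail` are hypothetical and chosen only for this illustration. It compares the simulated frequency of $|\bar{X}_n - \mu| \ge \varepsilon$ with the Chebyshev bound $\sigma^2 / (n\varepsilon^2)$.

```python
import random


def chebyshev_bound(variance, n, eps):
    """Chebyshev upper bound on P(|mean of n i.i.d. draws - mu| >= eps)."""
    return min(1.0, variance / (n * eps ** 2))


def empirical_tail(n, eps, trials=1000, seed=0):
    """Fraction of trials in which the mean of n Uniform(0,1) draws
    misses mu = 0.5 by at least eps (assumed distribution, for illustration)."""
    rng = random.Random(seed)
    misses = 0
    for _ in range(trials):
        mean = sum(rng.random() for _ in range(n)) / n
        if abs(mean - 0.5) >= eps:
            misses += 1
    return misses / trials


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The simulated frequency should shrink with n and stay below the bound.
        print(n, empirical_tail(n, 0.1), chebyshev_bound(1 / 12, n, 0.1))
```

The bound is usually far from tight; the point is only that both columns shrink as $n$ grows, which is what convergence in probability of $\bar{X}_n$ to $\mu$ asserts.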
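The necessary and sufficient condition for the sequence $\{X_n\}$ to satisfy the WLLN is

$$E\left[ \frac{Y_n^2}{1 + Y_n^2} \right] \to 0 \quad\text{as } n \to \infty, \qquad\text{where } Y_n = \frac{S_n - E(S_n)}{n},\; S_n = X_1 + X_2 + \cdots + X_n.$$

Example: Examine whether the weak law of large numbers holds for the sequence $\{X_k\}$ of independent random variables defined as follows:

$$P(X_k = \pm 2^k) = 2^{-(2k+1)}, \qquad P(X_k = 0) = 1 - 2^{-2k}.$$

Solution:

$x$ | $2^k$ | $-2^k$ | $0$
$P(X_k = x)$ | $2^{-(2k+1)}$ | $2^{-(2k+1)}$ | $1 - 2^{-2k}$

$$E(X_k) = \sum x\,P(X_k = x) = 2^k \cdot 2^{-(2k+1)} - 2^k \cdot 2^{-(2k+1)} + 0 = 0 \quad\text{(finite)},$$

$$\mathrm{Var}(X_k) = E(X_k^2) - (E(X_k))^2 = 2^{2k} \cdot 2^{-(2k+1)} + 2^{2k} \cdot 2^{-(2k+1)} + 0 - 0^2 = \tfrac{1}{2} + \tfrac{1}{2} = 1.$$

Since the $X_k$'s are independent,

$$B_n = \mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = 1 + 1 + \cdots + 1 = n,$$

$$\frac{B_n}{n^2} = \frac{n}{n^2} = \frac{1}{n} \to 0 \quad\text{as } n \to \infty.$$

Hence, the WLLN holds for the sequence of independent r.v.'s.

Example: Examine if the WLLN holds for the sequence $\{X_i\}$ of i.i.d. r.v.'s with

$$P\left(X_i = (-1)^{k-1} k\right) = \frac{6}{\pi^2 k^2}, \qquad k = 1, 2, 3, \ldots;\; i = 1, 2, 3, \ldots$$

Solution:

$$E(X_i) = \sum_{k=1}^{\infty} x_k P(x_k) = \frac{6}{\pi^2} \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} = \frac{6}{\pi^2} \log_e 2,$$

which is finite (the alternating series converges). Hence, by Khinchin's theorem, the WLLN holds for the sequence $\{X_i\}$ of i.i.d. r.v.'s.

Example: Let $X_i$ assume the values $i$ and $-i$ with equal probabilities. Show that the law of large numbers cannot be applied to the independent variables $X_1, X_2, \ldots$

Solution:

$$E(X_i) = \sum x\,P(X_i = x) = \frac{i}{2} - \frac{i}{2} = 0, \qquad \mathrm{Var}(X_i) = E(X_i^2) - (E(X_i))^2 = \frac{i^2}{2} + \frac{i^2}{2} - 0 = i^2.$$

$$B_n = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6},$$

$$\frac{B_n}{n^2} = \frac{(n+1)(2n+1)}{6n} \to \infty \quad\text{as } n \to \infty.$$

The condition $B_n / n^2 \to 0$ is not met, so we cannot draw any conclusion from it. Here, we apply the further test:

$$E(|X_i|^{1+\delta}) = \sum |x|^{1+\delta} P(X_i = x) = \frac{i^{1+\delta}}{2} + \frac{i^{1+\delta}}{2} = i^{1+\delta},$$

which is unbounded for $\delta > 0$. Hence Markov's condition fails as well, and the WLLN cannot be applied to the sequence $\{X_i\}$ of independent r.v.'s.

Example: $\{X_k\}$, $k = 1, 2, \ldots$ is a sequence of independent random variables each taking the values $-1, 0, 1$. Given that

$$P(X_k = 1) = \frac{1}{k} = P(X_k = -1), \qquad P(X_k = 0) = 1 - \frac{2}{k},$$

examine if the law of large numbers holds for this sequence or not.

Solution: We use $E(X_k) = \sum x\,P(X_k = x)$ and $\mathrm{Var}(X_k) = E(X_k^2) - (E(X_k))^2$.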
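Here $E(X_k) = \frac{1}{k} - \frac{1}{k} = 0$ and $\mathrm{Var}(X_k) = \frac{1}{k} + \frac{1}{k} - 0 = \frac{2}{k}$, so

$$B_n = \mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = 2\left[\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right],$$

and $B_n / n^2 \to 0$ as $n \to \infty$. Thus, the WLLN holds for the sequence $\{X_k\}$.

Example: Examine whether the WLLN holds good for the sequence $X_n$ of independent random variables, where

$$P\left(X_n = \frac{1}{\sqrt{n}}\right) = \frac{2}{3}, \qquad P\left(X_n = -\frac{1}{\sqrt{n}}\right) = \frac{1}{3}.$$

Solution:

$$E(X_n) = \sum x\,P(X_n = x) = \frac{1}{\sqrt{n}} \cdot \frac{2}{3} - \frac{1}{\sqrt{n}} \cdot \frac{1}{3} = \frac{1}{3\sqrt{n}},$$

$$\mathrm{Var}(X_n) = E(X_n^2) - (E(X_n))^2 = \frac{2}{3n} + \frac{1}{3n} - \frac{1}{9n} = \frac{8}{9n}.$$

$$B_n = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = \frac{8}{9}\left[\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right],$$

so $B_n / n^2 \to 0$ as $n \to \infty$. Hence, the WLLN holds for the sequence.

Example: Examine whether the weak law of large numbers holds for the sequence $\{X_k\}$ of independent random variables defined by

$$P(X_k = \pm\sqrt{k}) = \frac{1}{2}.$$

Solution: $E(X_k) = 0$ and $\mathrm{Var}(X_k) = \frac{k}{2} + \frac{k}{2} = k$. Since the $X_k$'s are independent,

$$B_n = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}, \qquad \frac{B_n}{n^2} = \frac{n+1}{2n} \to \frac{1}{2} \ne 0.$$

Hence, we cannot draw any conclusion whether the WLLN holds or not. Here, we apply the further test:

$$E(|X_k|^{1+\delta}) = \sum |x|^{1+\delta} P(X_k = x) = \frac{k^{(1+\delta)/2}}{2} + \frac{k^{(1+\delta)/2}}{2} = k^{(1+\delta)/2},$$

which is unbounded for $\delta > 0$. Hence Markov's condition fails as well, and the WLLN cannot be applied to the sequence $\{X_k\}$ of independent r.v.'s.

Example: Let $X_1, X_2, \ldots, X_n$ be i.i.d. variables with mean $\mu$ and variance $\sigma^2$, and suppose that as $n \to \infty$,

$$\frac{X_1^2 + X_2^2 + \cdots + X_n^2}{n} \xrightarrow{p} c$$

for some constant $c$ $(0 < c < \infty)$. Find $c$.

Solution: Given that $E(X_i) = \mu$ and $\mathrm{Var}(X_i) = \sigma^2$,

$$\mathrm{Var}(X_i) = E(X_i^2) - \mu^2 \;\Rightarrow\; E(X_i^2) = \mu^2 + \sigma^2 \quad\text{(finite) for } i = 1, 2, \ldots$$

Thus, by Khinchin's theorem, the WLLN holds for the sequence $X_i^2$ of i.i.d. r.v.'s, so that

$$\frac{X_1^2 + X_2^2 + \cdots + X_n^2}{n} \xrightarrow{p} E(X_i^2) = \mu^2 + \sigma^2.$$

Hence $c = \mu^2 + \sigma^2$.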
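The check $B_n / n^2 \to 0$ used throughout these examples can be tabulated numerically. The sketch below is illustrative only: the helper name `bn_over_n2` is hypothetical, and the variance formulas passed to it ($1$, $k^2$, $2/k$, $8/(9k)$, $k$) are the ones appearing in the worked solutions above. It evaluates the ratio for a few values of $n$ so the limiting behaviour can be seen.

```python
def bn_over_n2(var_of, n):
    """Return B_n / n**2, where B_n = Var(X_1) + ... + Var(X_n)
    for independent variables with Var(X_k) = var_of(k)."""
    b_n = sum(var_of(k) for k in range(1, n + 1))
    return b_n / n ** 2


# Variances taken from the worked examples (k is the index of X_k).
EXAMPLES = {
    "Var = 1": lambda k: 1.0,
    "Var = k^2": lambda k: float(k * k),
    "Var = 2/k": lambda k: 2.0 / k,
    "Var = 8/(9k)": lambda k: 8.0 / (9.0 * k),
    "Var = k": lambda k: float(k),
}

if __name__ == "__main__":
    for label, var_of in EXAMPLES.items():
        ratios = [bn_over_n2(var_of, n) for n in (10, 100, 1000)]
        print(label, ["%.4f" % r for r in ratios])
```

A ratio that shrinks toward zero supports the WLLN condition; a ratio that grows ($k^2$) or settles near $1/2$ ($k$) signals that the condition fails. A finite table is only suggestive and does not replace the limit argument.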
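Example: Determine whether the WLLN holds for the following sequence of independent random variables:

$$P(X_k = 1) = \tfrac{1}{2}\left(1 - 2^{-k}\right) = P(X_k = -1).$$

Solution:

$$E(X_k) = \sum x\,P(X_k = x) = \tfrac{1}{2}(1 - 2^{-k}) - \tfrac{1}{2}(1 - 2^{-k}) = 0 \quad\text{(finite)},$$

$$\mathrm{Var}(X_k) = E(X_k^2) - (E(X_k))^2 = \tfrac{1}{2}(1 - 2^{-k}) + \tfrac{1}{2}(1 - 2^{-k}) - 0 = 1 - 2^{-k}.$$

Since the $X_k$'s are independent,

$$B_n = \mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \sum_{k=1}^{n} \left(1 - 2^{-k}\right) = n - \frac{\tfrac{1}{2}\left(1 - (\tfrac{1}{2})^n\right)}{1 - \tfrac{1}{2}} = n - 1 + \left(\tfrac{1}{2}\right)^n,$$

$$\frac{B_n}{n^2} = \frac{n - 1 + (\tfrac{1}{2})^n}{n^2} \to 0 \quad\text{as } n \to \infty.$$

Hence, the WLLN holds for the sequence $\{X_k\}$.

Example: Let $\{X_k\}$ be mutually independent and identically distributed random variables with mean $\mu$ and finite variance. If $S_n = X_1 + X_2 + \cdots + X_n$, check whether the law of large numbers holds for the sequence $\{S_n\}$.

Solution: Given that the $\{X_k\}$ are mutually i.i.d. with mean $\mu$ and variance $\sigma^2$ (say),

$$S_1 = X_1, \quad S_2 = X_1 + X_2, \quad \ldots, \quad S_n = X_1 + X_2 + \cdots + X_n, \qquad E(S_n) = n\mu \quad\text{(finite)}.$$

Thus,

$$B_n = \mathrm{Var}(S_1 + S_2 + \cdots + S_n) = \mathrm{Var}\left[X_1 + (X_1 + X_2) + (X_1 + X_2 + X_3) + \cdots + (X_1 + X_2 + \cdots + X_n)\right]$$

$$= \mathrm{Var}\left(nX_1 + (n-1)X_2 + (n-2)X_3 + \cdots + X_n\right) = n^2\,\mathrm{Var}(X_1) + (n-1)^2\,\mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n)$$

$$= \sigma^2\left[n^2 + (n-1)^2 + \cdots + 1^2\right] = \frac{\sigma^2\, n(n+1)(2n+1)}{6}.$$

Therefore $B_n / n^2 = \sigma^2 (n+1)(2n+1) / (6n) \to \infty$ as $n \to \infty$, so the condition $B_n / n^2 \to 0$ is not satisfied for the sequence $\{S_n\}$.

Example: If $X_i$ can have only two values, $i^{\alpha}$ and $-i^{\alpha}$, with equal probabilities, show that the law of large numbers can be applied to the independent variables $X_1, X_2, \ldots$ if $\alpha < \tfrac{1}{2}$.

Solution:

$$E(X_i) = \sum x\,P(X_i = x) = \frac{i^{\alpha}}{2} - \frac{i^{\alpha}}{2} = 0, \qquad \mathrm{Var}(X_i) = E(X_i^2) - (E(X_i))^2 = \frac{i^{2\alpha}}{2} + \frac{i^{2\alpha}}{2} - 0 = i^{2\alpha}.$$

$$B_n = \mathrm{Var}(X_1) + \mathrm{Var}(X_2) + \cdots + \mathrm{Var}(X_n) = 1^{2\alpha} + 2^{2\alpha} + \cdots + n^{2\alpha} \approx \int_0^n x^{2\alpha}\,dx = \frac{n^{2\alpha+1}}{2\alpha+1} \quad\text{(by the Euler-Maclaurin formula)},$$

$$\frac{B_n}{n^2} \approx \frac{n^{2\alpha-1}}{2\alpha+1} \to 0 \quad\text{only if } 2\alpha - 1 < 0, \text{ i.e., } \alpha < \tfrac{1}{2}.$$

The central limit theorem (CLT) is an important result in Statistics, which states that the normal distribution is the limiting distribution of the sum of independent random variables with finite variance as the number of random variables gets indefinitely large.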
Since many real processes yield distributions with finite variance, this theorem has a wide range of applications.

Central Limit Theorem: Let $X_i$ $(i = 1, 2, \ldots, n)$ be independently distributed random variables such that $E(X_i) = \mu_i$ and $\mathrm{Var}(X_i) = \sigma_i^2$. Then, as $n \to \infty$, the distribution of the sum of these random variables, namely $S_n = X_1 + X_2 + \cdots + X_n$, tends to the normal distribution with mean $\mu$ and variance $\sigma^2$, where $\mu = \sum_{i=1}^{n} \mu_i$ and $\sigma^2 = \sum_{i=1}^{n} \sigma_i^2$.

Example: A coin is tossed 200 times. Find the approximate probability that the number of heads obtained is between 80 and 120.

Solution: Let $X$ be the number of heads. For an unbiased coin, $X$ has mean $np = 100$ and variance $npq = 50$, and by the CLT it is approximately normal. Required probability

$$= P(80 \le X \le 120) = P(X = 80) + P(X = 81) + \cdots + P(X = 120).$$

Example: If $X_1, X_2, \ldots, X_n$ are Poisson variables with parameter 2, use the central limit theorem to estimate $P(120$ …

The principle of least squares consists of adjusting the values of the unknown parameters $a, b$ such that the sum of squares of the errors of estimate is minimized, i.e.,

$$E = \sum (y - y_e)^2 = \sum (y - a - bx)^2.$$

For minimization, $\dfrac{\partial E}{\partial a} = 0$ and $\dfrac{\partial E}{\partial b} = 0$, which gives

$$\frac{\partial E}{\partial a} = -2\sum (y - a - bx) = 0 \;\Rightarrow\; \sum (y - a - bx) = 0,$$

$$\frac{\partial E}{\partial b} = -2\sum x(y - a - bx) = 0 \;\Rightarrow\; \sum (xy - ax - bx^2) = 0.$$

Hence, we get

$$\sum y = na + b\sum x, \qquad \sum xy = a\sum x + b\sum x^2.$$

These equations are called the NORMAL EQUATIONS. After simplifying these two equations, we can get the values of the parameters $a$ and $b$. Similarly, we can find the trend equations for other curves.

A quicker way to write the normal equations. For the straight line

$$y = a + bx \qquad (1)$$

what are the coefficients of the constants $a, b$?

Coefficient of $a$ is 1. Multiplying Eq. (1) by 1 and summing: $\sum y = \sum a + \sum bx \;\Rightarrow\; \sum y = na + b\sum x$.

Coefficient of $b$ is $x$. Multiplying Eq. (1) by $x$ and summing: $\sum xy = \sum ax + \sum bx^2 \;\Rightarrow\; \sum xy = a\sum x + b\sum x^2$.

Thus, the normal equations for the straight line / linear curve are

$$\sum y = na + b\sum x, \qquad \sum xy = a\sum x + b\sum x^2.$$

After solving these, we get the values of $a, b$, and the corresponding equation $y = a + bx$ so obtained is the trend equation.

For the quadratic curve

$$y = a + bx + cx^2 \qquad (2)$$

what are the coefficients of the constants $a, b, c$?

Coefficient of $a$ is 1. Multiplying Eq. (2) by 1 and summing: $\sum y = na + b\sum x + c\sum x^2$.
Coefficient of $b$ is $x$. Multiplying Eq. (2) by $x$ and summing: $\sum xy = a\sum x + b\sum x^2 + c\sum x^3$.

Coefficient of $c$ is $x^2$. Multiplying Eq. (2) by $x^2$ and summing: $\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$.

Thus, the normal equations for the quadratic / parabolic curve are

$$\sum y = na + b\sum x + c\sum x^2,$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3,$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4.$$

After solving these, we get the values of $a, b, c$, and the corresponding equation $y = a + bx + cx^2$ so obtained is the trend equation.

Example 1: For the following data,
a) Fit the linear trend by the least squares method.
b) Calculate the trend values.
c) Estimate the production for the year 2020.

Year | 2010 | 2012 | 2014 | 2016 | 2018
Production (in '000 units) | 18 | 21 | 23 | 27 | 16

Solution: The linear equation is $y = a + bx$. The normal equations of the straight line are

$$\sum y = na + b\sum x, \qquad \sum xy = a\sum x + b\sum x^2.$$

Year (X) | Production (y) | x = X − 2014 | xy | x²
2010 | 18 | −4 | −72 | 16
2012 | 21 | −2 | −42 | 4
2014 | 23 | 0 | 0 | 0
2016 | 27 | 2 | 54 | 4
2018 | 16 | 4 | 64 | 16
Total | 105 | 0 | 4 | 40

Substituting into the normal equations:

$$105 = 5a + b \times 0, \qquad 4 = a \times 0 + 40b,$$

so $a = 21$, $b = 0.1$. Thus, the trend equation is $y = 21 + 0.1x$.

b) Trend values: substituting $x = -4, -2, 0, 2, 4$ gives 20.6, 20.8, 21.0, 21.2, 21.4 ('000 units).

c) Estimate of the production for the year 2020: for 2020 the value of $x$ is $2020 - 2014 = 6$. Hence, the estimated value is $21 + 0.1 \times 6 = 21.6$ ('000 units).

Example 2: Below are given the figures of production (in '000 tons) of a factory:

Year | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018
Production | 77 | 88 | 94 | 85 | 91 | 98 | 90

a) Fit the straight line by the method of least squares and show the trend values.
b) What is the monthly increase in production?

Solution: The linear equation is $y = a + bx$. The normal equations of the straight line are

$$\sum y = na + b\sum x, \qquad \sum xy = a\sum x + b\sum x^2.$$

Year (X) | Production (y) | x = X − 2015 | xy | x² | Trend value
2012 | 77 | −3 | −231 | 9 | 83
2013 | 88 | −2 | −176 | 4 | 85
2014 | 94 | −1 | −94 | 1 | 87
2015 | 85 | 0 | 0 | 0 | 89
2016 | 91 | 1 | 91 | 1 | 91
2017 | 98 | 2 | 196 | 4 | 93
2018 | 90 | 3 | 270 | 9 | 95
Total | 623 | 0 | 56 | 28 |

Substituting into the normal equations:

$$623 = 7a + b \times 0, \qquad 56 = a \times 0 + 28b.$$

Hence, $a = 89$, $b = 2$. Thus, the trend equation is $y = 89 + 2x$.

b) What is the monthly increase in production? The trend equation is $y = 89 + 2x$. The trend values increase at a rate of 2 ('000 tons) every year,
i.e., the yearly increase in production is 2000 tons. Hence, the monthly increase in production = 2000/12 = 166.667 tons.

Example 3: The prices of a commodity during 2011-2016 are given below. Fit a parabola to these data. Estimate the price for the year 2020.

Year (X) | 2011 | 2012 | 2013 | 2014 | 2015 | 2016
Price (y) | 100 | 107 | 128 | 140 | 181 | 192
x = 2(X − 2013.5) | −5 | −3 | −1 | 1 | 3 | 5

Solution: The parabola equation is $y = a + bx + cx^2$. The normal equations for the quadratic / parabolic curve are

$$\sum y = na + b\sum x + c\sum x^2,$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3,$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4.$$

With $x = 2(X - 2013.5)$ we have $\sum x = \sum x^3 = 0$, $\sum x^2 = 70$, $\sum x^4 = 1414$, $\sum y = 848$, $\sum xy = 694$ and $\sum x^2 y = 10160$, so the normal equations become

$$848 = 6a + 70c, \qquad 694 = 70b, \qquad 10160 = 70a + 1414c.$$

After solving, we get $a = 136.130$, $b = 9.914$, $c = 0.446$. Thus, the parabolic trend equation is

$$y = 136.130 + 9.914x + 0.446x^2.$$

What is the price for the year 2020? For the year 2020, $x = 2(2020 - 2013.5) = 13$. Substituting $x = 13$ in the trend equation gives $y = 340.386$. Hence, the estimated price for the year 2020 is 340.386.

For curves like $y = ab^x$, $y = ae^{bx}$ or $y = ax^b$, etc.: how do we write the normal equations, and how do we fit time-series data to them? The next lecture will be on the method of least squares (curve fitting) for such curves.

Remember: the normal equations of such curves cannot be written down directly, as in the case of the linear or quadratic curve. In the exponential curve $y = ab^x$, the coefficient of $a$ is $b^x$ and the coefficient of $b$ depends on $a$; both coefficients involve the unknowns. In the linear case $y = a + bx$, the coefficient of $a$ is 1 and the coefficient of $b$ is $x$; in the quadratic case $y = a + bx + cx^2$, the coefficient of $a$ is 1, the coefficient of $b$ is $x$ and the coefficient of $c$ is $x^2$. Here, these coefficients are known and are independent of $a, b, c$.
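The normal equations of Examples 1 and 3 can also be solved mechanically. The following is a minimal pure-Python sketch; the function names `solve_linear` and `fit_polynomial` are hypothetical, and the data are the values from the two examples with the coded $x$ used in the text. It builds the normal equations $\sum x^{i+j} \cdot c_j = \sum x^i y$ and solves them by Gaussian elimination.

```python
def solve_linear(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [a, b, c, ...] of y = a + b x + c x^2 + ...
    obtained from the normal equations."""
    m = degree + 1
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)] for i in range(m)]
    rhs = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(m)]
    return solve_linear(A, rhs)


if __name__ == "__main__":
    # Example 1: x = X - 2014
    print(fit_polynomial([-4, -2, 0, 2, 4], [18, 21, 23, 27, 16], 1))
    # Example 3: x = 2(X - 2013.5)
    print(fit_polynomial([-5, -3, -1, 1, 3, 5],
                         [100, 107, 128, 140, 181, 192], 2))
```

Solved without intermediate rounding, Example 3 gives $a \approx 136.125$, $b \approx 9.914$, $c \approx 0.446$; the value $a = 136.130$ quoted in the example appears to come from substituting the rounded $c = 0.446$ back into the first equation.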
Fitting of Exponential Trend: The exponential curve is given by

$$y = ab^x.$$

Taking logarithms of both sides, we get

$$\log y = \log a + \log b^x \;\Rightarrow\; \log y = \log a + x \log b \;\Rightarrow\; Y = A + xB \qquad (1)$$

where $Y = \log y$, $A = \log a$, $B = \log b$.

Now, Eq. (1) is a straight-line trend between $Y$ and $x$. Hence, the normal equations are

$$\sum Y = nA + B\sum x \quad\text{and}\quad \sum xY = A\sum x + B\sum x^2.$$

After solving these for $A$ and $B$, we get $a = \mathrm{Antilog}(A)$ and $b = \mathrm{Antilog}(B)$.

Example: Fit a trend function $y = ab^x$ to the following data:

Year | 2013 | 2014 | 2015 | 2016 | 2017
Sales ('0000) | 1.6 | 4.5 | 13.8 | 40.2 | 125

Solution: For the curve $y = ab^x$, taking logarithms (base 10) of both sides, we get

$$\log y = \log a + x \log b \;\Rightarrow\; Y = A + xB,$$

where $Y = \log y$, $A = \log a$, $B = \log b$.

Year (X) | Sales (y) | x = X − 2012 | Y = log y | xY | x² | Trend value
2013 | 1.6 | 1 | 0.2041 | 0.2041 | 1 | 1.5573
2014 | 4.5 | 2 | 0.6532 | 1.3064 | 4 | 4.6361
2015 | 13.8 | 3 | 1.1399 | 3.4197 | 9 | 13.8017
2016 | 40.2 | 4 | 1.6042 | 6.4168 | 16 | 41.0877
2017 | 125 | 5 | 2.0969 | 10.4845 | 25 | 122.310
Total | | 15 | 5.6983 | 21.8315 | 55 |

Thus, the normal equations are

$$\sum Y = nA + B\sum x: \quad 5.6983 = 5A + 15B,$$
$$\sum xY = A\sum x + B\sum x^2: \quad 21.8315 = 15A + 55B.$$

After solving, we get $A = -0.2814$, $B = 0.4737$. Therefore,

$$a = \mathrm{Antilog}(A) = \mathrm{Antilog}(-0.2814) = 0.5231, \qquad b = \mathrm{Antilog}(B) = \mathrm{Antilog}(0.4737) = 2.977.$$

Thus, the trend equation is $y = (0.5231)(2.977)^x$. The trend values in the last column of the table are obtained by substituting $x = 1, 2, \ldots, 5$ in this equation.

Fitting of Trend $y = ae^{bx}$: For the curve $y = ae^{bx}$, taking natural logarithms on both sides, we get

$$\log y = \log a + \log e^{bx} \;\Rightarrow\; Y = A + bx,$$

where $Y = \log y$, $A = \log a$, which is a linear trend between $Y$ and $x$. The normal equations are

$$\sum Y = nA + b\sum x \quad\text{and}\quad \sum xY = A\sum x + b\sum x^2.$$

After solving, we get the values of $A$ and $b$. The value of $a$ is then computed by $a = \mathrm{Antilog}(A)$.
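The log-linearisation used for $y = ab^x$ can be reproduced in a few lines. This is a minimal sketch, assuming base-10 logarithms as in the worked example; the function name `fit_exponential` is hypothetical. It solves the two normal equations in closed form and takes antilogs.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on Y = log10(y) = A + B*x.
    Returns (a, b) = (10**A, 10**B). Requires every y > 0."""
    n = len(xs)
    Y = [math.log10(y) for y in ys]
    sx, sY = sum(xs), sum(Y)
    sxx = sum(x * x for x in xs)
    sxY = sum(x * v for x, v in zip(xs, Y))
    # Closed-form solution of the two normal equations.
    B = (n * sxY - sx * sY) / (n * sxx - sx ** 2)
    A = (sY - B * sx) / n
    return 10 ** A, 10 ** B


if __name__ == "__main__":
    a, b = fit_exponential([1, 2, 3, 4, 5], [1.6, 4.5, 13.8, 40.2, 125])
    print(round(a, 4), round(b, 4))
    # The same curve in the form y = a * exp(c * x) has c = ln(b).
    print(round(math.log(b), 4))
```

Small differences from the hand-computed values are expected, because the worked example rounds the logarithms to four decimals before solving. Note that the fit minimizes squared errors in $\log y$, not in $y$ itself.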
Fitting of Trend $y = ax^b$: For the curve $y = ax^b$, taking logarithms on both sides, we get

$$\log y = \log a + \log x^b \;\Rightarrow\; \log y = \log a + b \log x \;\Rightarrow\; Y = A + bX,$$

which is a linear trend between $Y$ and $X$, where $Y = \log y$, $A = \log a$, $X = \log x$.
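By the same pattern as for the straight line, the normal equations here are $\sum Y = nA + b\sum X$ and $\sum XY = A\sum X + b\sum X^2$. The sketch below is illustrative only: the function name `fit_power` is hypothetical, and the data are not from the lecture but are generated from the assumed curve $y = 2x^{1.5}$ so that the fit can be checked against known parameters.

```python
import math


def fit_power(xs, ys):
    """Fit y = a * x**b by least squares on log10(y) = A + b*log10(x).
    Returns (a, b). Requires every x > 0 and y > 0."""
    n = len(xs)
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    sX, sY = sum(X), sum(Y)
    sXX = sum(u * u for u in X)
    sXY = sum(u * v for u, v in zip(X, Y))
    # Closed-form solution of the two normal equations in (A, b).
    b = (n * sXY - sX * sY) / (n * sXX - sX ** 2)
    A = (sY - b * sX) / n
    return 10 ** A, b


if __name__ == "__main__":
    # Hypothetical data generated from y = 2 * x**1.5 (an assumption).
    xs = [1, 2, 3, 4, 5]
    ys = [2 * x ** 1.5 for x in xs]
    print(fit_power(xs, ys))
```

Because $X = \log x$ must exist, the coded time variable has to be strictly positive (for example $x = 1, 2, 3, \ldots$), unlike the centred coding used for the linear and quadratic fits.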
