Falk1997 Article OnMadAndComedians

‘Ann. Inst, Statist. Math. Vol. 49, No. 4, 615-644 (1907) ON MAD AND COMEDIANS MICHAEL FALK Mathematisch-Geographische Fakultit, Katholische Universitat Bichetitt, Ostenstrasse 26-28, D-85071 Bichstatt, Germany (Received March 8, 1996; revised October 15, 1998) Abstract. A popullar robust measure of dispersion of a random variable (rv) X is the median. absolute dewiatinn fram the median ened(|X ~ mea(¥))), MAD for short, which is basod on the medion med(X) of X. By choosing, Y = X, the MAD tumns out to be a special case of the comedian med((X ~ med(X)}(Y — med(Y))), which is a robust measure of covariance between rvs X and Y. We investigate the comedian in detail, in particular in the norenal ‘case, and establish strong consistency and asymptotic normality of empirical ‘courterparts. ‘This leads to a robust competitor of the coefficient of correlation as an asymptotic lovela-statistic for testing independence of X and Y. An example shows the weird fact that knowledge of the population med(.X) does not necessarily pay (in the sense of asymptotic relative eflciency) when estimating the MAD. ey worty and plrases: Meulan absohie deviation from the median, robust measure of correlation, comedian, breakdown point, covariance, correlation eo- officient, bivariate normal vectors, strong consistency, asymptotic normality. 1. Introduction A widely accepted robust measure of location of a random variable (rv) X is the median med(X), which satisfies the inequalities P{X < med(X)} > 1/2 < P(X 2 wed(X)}. The median is in general nor unique, bur med(x) = 1 *(4/2) is a possible choice, where Py) wie RQ Z gs 4S 1) denotes the generalized inverse of the distribution function (edf) F(t) = P(X < Hh, ECT, of N. Consequently, we aay med(X) — (or <)>) 6 ¢ R, if there exists at least one median of X, which satisfies this relation. In case of a symmetric X, the median of X is the centor of symmetry and coolucides tur chis case with che expecration E(X) of X, Af Ie exists; the normal case is a popular example. ‘A robust alternative to the standard deviation VAR(X)"/? := B{(X — b{A)}?)" as a measiire ot scale is the median absolute deviation from the median 615ore MICHAEL ALK (MAD), defined by (1) MAD(X) := med(|X — med(X))). As Huber ((1981), p. 107) points out, the ‘MAD has emerged as the single most usefull ancillary estimate of scale”. Hampel (1974), who called MAD the median deviation, showed it to be an M-estimate of scale, which facilitates calculation ot the influence curve (see e.g. Huber (1981), p. 137)). In particular for a normal rv X with mean jc and variance VAR(X) = 0?, we obtain from Lemma 1.1 below the following linear correspondence between MAD(X) and the standard deviation of X: MAD/X) = 9 med(|(X — u}/ol) = 0113/4). where © denotes the standard normal cdf, Our approach towards robustiaess of statistical functionals relies on the data based concept of breakdown points (Rousseeuw and Leroy (1987)} i.e., the least proportion of observations in a data set that must be moved in order to lot the value of the statistical functional under consideration break down iJe., tend to infinity. ‘The MAD has the highest possible breakdown point, since roughly one half of the data have to be pushed to infinity to let the median follow. While the MAD or the (in case of a svmmetric rv X eauivalent) interouartile ranae JOR P-1(3/4) ~ F-1(1/4) as a farther robust measure of dispersion underlying a box: plot (Tukey (1977}) have become quite popular in recent years and are now built-in procedures in statistioal software narksees, a median hasee and therefore rahnst alternative to the covariance COV(X,Y) = B((X — E(X))(¥ — E(Y))) as a measure of covariance between rvs X, ¥ is not ready at hand, though it plays a crucial role in various topion of atatistion ‘The covariance is the center of interest in correlation analysis, it is essen- tially the (simple) regression coefficient in regression analysis and it is of cause the componcat of the covariance matrin 8 of e random vector X. The covariance matrix is basic to different multidimensional techniques such as principal com- ponent analysis, where the eigenvectors of $ are the (orthogonal) directions of largest varlauon oF in factor unaiysts Lesed on prluclyal compoueuts. Phe yeu eralized squared distance (a — y)$~"(x ~ y) between vectors @, y is known as their Mahalanobis distance, typically applied in discriminant analysis and cluster anatysis. ‘Lhe near transtormauon Y := 5 ‘/*.A 1s known as data sphermg, resulting in a random vector ¥ whose covariance matrix is the nity matrix. Here S-/? denotes the usual symmetric root of $—"/?. ‘The common empirical counterpart of COV(X,Y), based on n independent copies (X1,¥i),--+s(Xn, Ya) of the bivariate random vector (X,Y), is the empirical covariance GOVA(X,Y) = (nD SG = XaM(Ki nds where X, Dh XM and ¥, Y; are the sample means. The empirical covariance (inatrix) is obviously highly sensitive to extreme observa- 2 ne it can corinuely he affected hy only ane evtreme ohservation among,ON MAD AND COMEDIANS 517 (HY )s ee (M Yh) fees ils Lecahow as poi is 1/st, Never hiekess i in (ypivally in use as a black box in statistical software packages. An exposition of robust covariances is given by Huber (1981), Chapter 8) In this paper we propose an alternative measiire of dependence between rvs X,Y, which we will call comedian of X and Y. Tt generalizes the MAD since it equals MAD? in case X In this sense, MAD and comedian parallel o and. covariance which satisfy o# = COV(X,Y) in case X = Y. As the comedian is actually a median, it alao has the highest possible breakdown point In the following we compile several basic results for general rvs X, Y, thus evaluating parallel and different properties of COV(X,¥) and the comedian. In the next section we investigate the particular ease where (X, ¥’) is bivariate nortnal We will see that there is a (continously differentiable) ono-to-one corresnandence between the coefficient of correlation @ of normal vectors (X.Y) and the come- dinn based correlation median. This observation offers a way to define a robust astimatar af @. In Sections 2 and A we ostablish Himit rovulte for estimates af the comedian and show by an example the weird fact that knowledge of the population med(X) does not necessarily pay (in the sense of asymptotic relative efficiency) whon MAD(X) hao to be catimated. By £(X) we denote in tho following the distribution of X. Our first lemma is obvious but nevertheless quite useful LEMO4a LL. Ran veal numbers a, bG BR med(aX +b) = amed(X) +b. Note that the proceding equality includes negative a. By a) COMLEX, ¥) <= mad (X — mad K)YY — moa (V))) we denote the comedian of X and Y. The comedian parallels COV(X,¥) and is therefore a measure of covariance. But COV(X,Y) requires the existence of the first two moments of X and Y, whereas COM(X,Y) always exists. Take for ‘example X, ¥ independent, each following a standard Cauchy distribution. Then COV(X, Y) does not exist but their comedian is zero. This is immediate from the following lemma, Lemma 12. If X and ¥ are independent, then COM(X,Y) = 0. PRoor, The assertion follows from Fubini’s theorem: PUL — mod(X))(¥ — med(¥)) i oy =f POX — meat xY > 0160 — meat Vian) fos) © +f P(X —med(X) > O}L{Y — med(¥)dy) oop) & + PUY = med(¥)} sysois MICHAEL FALK Le., CUM(A,Y) = 0.0 ‘The proof of tho following lemma uses the fact that med(X) < med(¥) if PAY ROOF. ‘Lhe mequality tollows trom the preceding remark and Lemma Ll: 0 = med(X ~ med(X)) < med({X| ~ med(X)) = med(1X1) — med(X), and 0 = med(mod(X) ~ X) < med(med(X) + |Xj) = med(X) + med |X’ which imply ~ med(|X|) < med(X) < med{|X)). ‘The equality med(|X|) = med(|X|")}/* follows by utilizing the generalized inverse: med(|X|) = inf{t > 0: PA|X| 1/2} = inf{t > 0: P(X <#} > 1/2} = (inf{s 20: P{|X]? < 8} > 1/2})'/” = med(|xX|P)*/”. a Lemmas 1.1 and 1.3 imply that MAD(aX + 5) = |a|MAD(X) and COM(X,Y) = aMAD(X)*, if ¥ = aX +h as. for some a,b € R. By choosing X—Y, ue MAD theiefure luis out uy be a spectal eave uf de eoutediad: ‘The comedian is moreover symmetric, location invariant and scale equivariant i.e., COM(X,a¥ +5) = aCOM(X,¥) = aCOM(Y,X). A natural median based al- Leruaulve 10 the coespieient of corretation = COV(A,Y}/(ozdy) 18 therefore the correlation median : |) eg SOMO) (13) 8X Y= = seaDeR MAD) Since 6 = 0 if X and Y are independent by Lemma 1.2, and é € {1,1} in case of complete dependence Y = aX +4 a.s., the question naturally arises, whether 5 € [—1, 1] im general, just like 9. The answer is however “no” for general rvs X. Y, whereas in the bivariate normal case the answer is “yes” (see the discussion after Lemma 2.1 below). Even more, we will see in the next section that there is a smiooth one-to-one correspondence hetween o and § in the normal case. Towether with the limit results in Section 3 on empirical counterparts of MAD and COM, this observation offers therefore a way to estimate the coefficient of correlation Af hivariate normal vortors hy moans af tho comndian ie, by an setimator with highest breakdown point.ON MAD AND COMEDIANS 19 2. The normal case In this section we evaluate the comedian in the predominant case of biyariate normal vectors {X,¥), ie. £((X,¥)}) = N(y, 5), where p= (uy .42)" € R’ is the vector of means and ¥ = (0,,) is the (2 x 2)-covariance matrix. We assume that a2, > 0. i= 1.2, Since the distributions of X and Y are svmmetric with respect: to jy and 2, we have med(X) = j1,, med(X) = jz, Lemma 1.1 farther entails COM(X,¥) = med((X = pa )(¥ — 19)) X= mY =ouon sed ( =) ou On and thus, we assume without loss of generality throughout this section that = 0 and that B= (/ 9) with |o| < 1. In the following we compile a list of auxiliary results on explicit representations ol (X,¥) and XY via independent rvs. By noting that XY = ((X+¥)®—(X —¥)?)/4, the following auxiliary result is immediate from the well-known fact that if (X,Y) is bivariate normal N(0,), then X +Y and X —¥ are independent normal with means 0 and respective variances 2(1 + 9) and 2(1 ~ 9) LEMMA 2.1, Suppose (X,Y) is bivariate normal N(0,5). Then eK) 6 Ge + wet ot) ; where Zy, Zs are independent standard normal rvs. Lemma 2.1 implies that the comedian of X and ¥, whose joint distribution is N(O.¥). is a strictly monotone increasing, symmetric and continions funetion af 2 € [4,1]. Define the comedian funetion by g(g) = COM(X, Y). ‘Then we have olor) < oes) for -1< ae Land lot (X,7) have che wingorm aiseribution on the set ((@,y)i1S@Saifasy st (is ubviy620 MICHAEL FALK aul ay > 1) U{(e,y) +1 Se S -1/a,-a Sy S -1 and zy > 4). then, by symmetry, med(X) = med({¥} = 0, med(X*) = med(¥) = 1 which implies MAD(X) = MAD(Y) = 1, but P{XY < 1} = 0 ie, COM(X,Y) > 1. The comedian has therelore the drawback, that the comedian matrix (COM{X;,X;}) as a robust alternative to the covariance matrix of rvs X;,...,Xq is in general not positive (semi-}definite, ‘Lhe problem of non positive semicefiniteness of estimators frequently occurs in robust estimation of covariance matrices; see Roussoomw and Molenberghs (1993), who extensively discuss and propose methods to transform such non positive semidefinite matrices to positive semidefinite ones. The investigation of the comedian matrix in bivariate models parametrized by some association parameters will be the content of future research work, Tn ence af arhitrary hivariate narmal random vactore the proceding consid. erations imply however that the correlation median is always between ~1 and 1, ie COMOC.Y) MAD(X) MAD(Y) where 0 is the usual correlation coefficient of X.Y, By (2.1). 6 = 6(o} is continuous and strictly increasing function from [—1,1] onto [1,1], satisfying 6(-@) = -8(0) and 6(—1) = ~1, 6(0) = 0, &(1) = 1. We will see below that 6 is continously differentiable an (—1, 1} (Propasitian 9 1) We could therefore utilize the correlation median of normal vectors (X,Y) as a measure of dependence between (X,Y). In particular, the obvious empirical counterpart $y can serve ax a robust acanure of the comelation covfficient by inversion: = olel/aih) €[-L.1, (23) On = G9 “G1)On) is a consistent estimate of 9, if 6, estimates 6 consistently. ‘This will be the content of Sections 3 and 4 As a byproduct of Lemma 2.1 we obtain the characteristic function gy BlexplitX¥)), t € B, of XV. Note that the characteristic fnetion of the 2 distributed rv Z? is given by E(exp(itZ?)) = (1 — 2it)-V?, t € R. From Lemma 241 we obtain therefore (24) pelt) = Elexp(itXY)) a = Blan ("re ye ‘Tho reprovontation of a bivariate standard noninal wactar by polar soardinates known as the polar method and going back to Box and Muller (1958), together with Lemma 2.1 yields the following representation of C(XY), which will be crucial for our further investigationsON MAD AND COMEDIANS ou Convunarr 2.1, Let uyuin (X,¥) huue dist dation N (0B). Then we have L(XY) = £(R(cos(sU) + 0), where R and U are independent, U is uniformly on (0,1) distributed and R follows the standard exponential distribution on (0,00). As an immediate consequence of the preceding representation of £(XY) we obtain the following well known fact (Feller (1971), p. 101, Huber (1981), p. 209): P{XY > 0} = 2 arccos(~), @ € [-1,1) Note that if we parametrize 9 = —cos(ra), a € [0,1], then P{XY > 0} = @ becomes the identity on [0, 1]. Proor oF CorouLary 2.1. Let Uj, Up be independent and uniformly on (0.1) distributed rvs. The polar method implies that Z: := /—2loe Us cas(2rU>) and Zp := ¥—2log Uj sin(2xU) are independent and standard normal rvs. As a consequence we obtain from Lemma 2.1 the representation L(XY) =£ Ga + eB? ~(1— 02) = {log 4 ((1 + 0) cos*(zn02) ~ (1 = 9) sim?(2n0%))) L£(-logU (cos? (2xUs} ~ sin®(2rU2) + 0) = P(-Ing alons(AeTla) + @)) Finally, observe that R= —log UJ; has a standard exponential distribution and thet £(con(txUz)} — L(coo(nt's))- 2 Corollary 2.1 enables us to write down the df of C(XY) in closed form. Its wan be derived fut Uic followings Fouule by iaterctaigiing differeatiation and integration, COKOLLAKY 2.2. Under ihe conditions of Corottary 2.1 we nave l om ( cielyug® 18° P{XY >t} i 1 fp ( EPEC ac A ( amr) aexe MICHABL FALK PRooP. Corollary 2.1 together with Fubini’s theorem yield P{R(cos(aU) + @) > t} P{R(cos(nU) + g) > t,cos(rU) + @ < 0} + PER Cum(nW) + w) & beus(ul) + wz OF (ifr) arcon(=0) t [ {n>} aw #20 > case) £0. ala : P4R<——— tA { i aro} +P cos(rU) + 0 > 0}, t 4 from which the auertion follows. © ‘The preceding result together with some elementary analysis implies that the dE g(t) — 1 PLXY & 4) of XV i continuously differentiable at #0 with derivative (ayn) acon) t Sa f bin ( costa) + 3) costa) eos ** 0 the integral equation (2.6) fe) = (@*(8/4))" 1 _ 2) 1 Pew (222) 1 ure) Gnawa sexp, ‘We are presently unable to solve this integral equation and to provide the function g in explicit form. A Monte Carlo approximation visualizing its shape is given at the end ot this section. PROOF OF PROPOSITION 2.1. The following proof of the differentiability of g(o) for @ # 0 is based on implicit differentiation. Define the fanction H on [-1,1] « R by H(o.t) := Holt) — 1/2 = PEXY > t}~ 1/2 Then H(o,-) is a continuous and strietly decreasing function with g being the unique solution of the equation H(9, 9(9)) = 0, @ € [-1,1) Fix now 6 € (0,1) and ¢ > 0. Formula (2.5) implies t 1 ate) tras ® a u ples = Novt we compte the partial derivative nf H with rospont ta @. By sub urs x7! arceos(1s) we obtain from Corollary 2.2 the expansion ituting (et ht) Ale) aa °° arr : t 1 it Ca 3) wai Twin IL (0 (-aepea) -© (aia) ear Fix now a € (~g, 1). Splitting up the preceding integral into the sum f*, + f° of two integrals, substituting w+ u—h in the first integral and using ‘Diylor’s Formula, we obvaln that she above megral equais rt [onl wie) (eye ft t 1 oe [lo (S) aeraou MICHARL FALK 1 £ ‘ : - -1 ar) (ae aa pan) ) I exp 1 rate (Q~w) =e t 1 = ha lex -4) aoa ton) oo [ow ( ty Gow +H (= (why)? YA = 12 ra or i ee [OP are) Wr uer oth Owe ut 1 L tow (sts) gata af" “ “ ome foe (sea) aaa a t t ste [loo (oa) erat em™ This sepusentation inaplico a 1 t ate ara ( oH) | t u fives) aan tet fied Cos ee [, Pure) ura By letting now a tend to ~g we obtain t 1 ae) uwroru-wyn>? ‘The implicit differentiability theorem implies therefore that the function g is differentiable on (0,1] with derivative satisfying J ilo.so) + Ziloatoa'te) =0 ie, mo), exp ( at d(e)= Fo ( T ae) wrod eeON MAD AND COMEDIANS 025 coetficient p Fig, 1, Approximation (obtained bv Monte Carlo) of the comeclian fanetion als = COM(X, ¥) as a fimetion of the correlation coefficient p. Note that g(o) ranges over [-a(4),a(2}h, where g(1) = (671(0.75))? = DaB6. It remains to show that o/(g) ~ 0 as @~ 0. Fix to this end = € (0,1) and aplit wp the intel ie the saneralor of y'(e) inte ue san fy + {2.7 C > 0 denoting an appropriate constant we obtain for o | 0 the bound ate) aS) wee tot) Pawo) Ei ‘The substitution w+ g(g)u ~ o implies that the right hand side equals Ca(0) 1° ox (— G0) = LAS epee an (4) Las ot 1G emp (2) 2a 0 ro He + Dyoince wo (e-+2}/y(y) > oxy dhe mumicratn ip Lauuted but Ue deat converges to infinity, This proves 4g} —» 0 as 0 — O. Since g is continuous on [0,1], the mean value theorem together with the preceding considerations mipltes With 1 © (0,2) ae)— 9{0) if p30626 MICHAEL FALK Les, g 18 differentiable at g = 0 whi y'(0) = 0 and ube pruvt i complete. Since we are presently unable to state g(o) explicitely as a function of g, its evaluation as a solution t, of the equation H,(t) = 0 (see Corollary 2.2) requires approximation methods. Figure 1 is the result of a Monte Catlo simulation of ‘o using GAUSSra«. version 2.1, based on the empirical medians of 8000 pseudo random numbers — log{us)(cos(rw,) + 9), with @ = —1 to 1 by 0.01. ‘Tho function 4g is almost linear near 0. 3, Strong consistency of the comedian In this section we establish strong consistency ot the empirical comedian, based on independent copies (X1,¥;), (Xz; Ya);.- of (X,Y). Denote by med,(X), med,.(Y) empirical medians of Xy. ‘X. and Ys -Y. and define the empirical MAD and comedian by (ot) MAD, (2) .— mcd (|X, — edn (X)|6—1,---4m) COM, (X,Y) = med{(X; — med, (X))(¥; ~ med, (V7), = 1,.-.,n} In the following result we establish strong consistency of GOM,(X,¥) for general random vectors (X,Y) under suitable regularity conditions. By F, @ and If we denule die cs of X,Y aud (XN — wed X))(Y ~ usal(¥)), seopectively. ‘TuvoreM 3.1. Suppose that F, G and H are continuous and strictly increasing at F~'(1/2), @*(1/2) and H~"(1/2), respectively. ‘Then, COM.OCY) 2 COMUX.Y) as. brom the tact that CUM(A,A} = MAD®(A) for a general rv X (see the discussion after Lemma 1.3 (i)), strong consistency of MAD, (X), which has been established by Hall and Welsh ((1985), Theorem 1), is a special case of the preceding result: by choosing ¥, = Xj Conouiany 21, Tf P and Hare continuous and strictly imerensing af F-3(1/2) and H-"(1/2), respectively, we have MAD,(A) — MAD(x) aus. By the continuity and strong monotonicity of the comedian function g, defined in (2.1), the two preceding results itnply strong consistency of the transformed tuupirival wirelation median im the aormal eaveON MAD AND COMEDIANS ear Cononane 3.2. Suppuoe that (X,Y) lous use uibitrury nundcycncrute bi variate normal distribution with correlation coefficient o € [-1,1]. Then the transformed empirical correlation median (orga), MAD.COMAD.(Y) with the convention Bq = 1 or —1 if the argument is above g(1) or below g(—L), is fn etrangly concictont estimate af 4 0 Furthermore, MAD (A)NADAT), SENT en OOVGY) a. Corollary 3.2 suggests as a highly robust and consistent alternative to the empirical correlation matrix of normal vectors (X{1,...,X{"),...,(XAY,....X0)) that matrix, whose entries are the pairwise transformed correlation medians dy bn(X, XU). This matrix, like the comedian matrix introduced in the preceding section, will in general not be positive semidefinite. ‘To counteract this disadvantage it should therefore also be transformed as proposed in Rousseeuw and Molenberghs (1993). For the proof of Theorem 3.1, which uses a truncation argument, we need the following lemma. LEMMA 3.1. Let a1 < ay <-+- < dy and by < by < +++ < by be two sets of real numbers such that at most k <(n—2)/4 numbers in the two sets differ. Then we have Imed{a) ~ med(b)] < Prony sacs — jaya} where wup{k integer: k Bjay2\-m > ¥jnj2}-24 Which implies the assertion. Cl PROOF OF THEOREM 3.1. In the following we will make use of the quantile transform method i.c., a sequence of iid rvs with an arbitrary common edf F can bo anumod to bo given in the form F-1(U,), P-1(Uy),... whore Uy,Us,... aro028 MICHAEL EALK independent and uniformly on (0,1) distributed rvs (sce e.g. Section 1.2 of Reiss (1989) for details). Furthor wo will utilize the well-known fact: that k (32) “net = ax a Vien 0 as, where Zin < Zan S++ S Zney denote the ordered values pertaining to arbitrary NB Zi Zn. Chance for a given small ¢ > 04 number A — Ae) > 0 largo onough ouch that P{|X] < A} > 1 —e/2 and P{|¥| < A} > 1 —«/2 and define for ¢ = 1,2, the truncated rvs A if X,>A A if K>A weafr, if |X)¢A and ¥ {x if Wl Dand that as. mod,(V) € [H>'(1/2 — 32), Hp"(1/2 + 32)] ultimately and lim sup, Vinya}4afeniaa © AE'(L/2 + 32), lim inf, oo Vjnjaj—afen) 2 HEML/2 — 8c), (use the quantile tranaformation mothod, (3.2) and the monotonicity of IZ? on (0,1). Thus we obtain from (3.4) and the triangular inequality that as. (recall that COM(X,Y) = H~!(1/2)) limsup |COM,(X, ¥) - COM(X, Y)] < limsup |COM,(X, ¥) — ied, (V)| + lim sup jied..(V) — H-"(1/2)| s Ho'(1)2-+82) — Hz1(A)2— 32) + HE /2 +32) — H-*(4/2) + |HEMI/2~ 82) — 1-1/2) since H. converges to H pointwise, which implies the convergence H>"(q) —+.—0 H-“(g) ab every continuity point of H-! (see e.g. Lemma 1.2.9 in Reiss (1989)), and since H~! is continous at 1/2. This completes the proof of Theorem 3.1.0 4. Asvmatetic normality of the comedian In this section we will establish asymptotic normality of empirical counterparts of tho comodian, Tt turn out that in tho partioular ooo, whore X, ¥ are independent, the limiting distribution is not affected if the marginal medians med(X), med(¥) are known. ‘This is different to the estimation of the MAD, where the phenomenon occurs that an valimated maiginal median cau even reduce the limiting variance. For reference we state the following theorem on asymptotic normality of the empirical MAD, proved by Hall and Welsh (1985). Further results for ihe MAD tn a regression model were established by Welsh (1960). THBORBM 4.1, Suppose that the F is continuous near and differentiable at Fo(L/2), Fo'(1/2) +MAD(X) and F~'(1/2) — MAD(X) with f(F-"(1/2)) > 0 and A:= f(F-1(1/2) + MAD(X)) + {(F-M(1/2) - MAD(X)) > 0, where f =F’. Then, vn(MAD,(X) — MAD(X)) = (0, a (: + jt) A where C= f(F-M1/2) — MAD(X)) - f(F“1/2) + MAD(X)), and B = 0? + 1OF(E1G/9) FUP G/a) 1 MADCX)) (P12) MAD(X)))690 MICHAEL FALK It FUP 1/2) = MAD(X)} = FUP 4/2) + MADLX)), which is in particular true if the edf F is symmetric with respect to F~(1/2) ie. F(F7"(1/2) — y) = 1- F(F (1/2) +9), y € R, then B equals zero and thus, the variance of the limiting normal distribution in the preceding result reduces ta 1/(16f(F-(1/2)-+ MAD(X))?). ConoLtaRY 4.1. If in addition to the assumptions of Theorem 4.1 the edf F is symmetric around F-*(1/2), then at 1 Va(MAD,(X) - MAD(X)) =) (0 mF) If F is in particular the edf of the normal distribution with arbitrary mean and variance o* > 0, then VAIDAD (A) = 090/828 (O55 ie war) i where denotes the standard normal density. ‘Theorem 4.1 implies that under those assumptions on the cdf F of X {tt Nl Yat RI) ~ COME) 39 (42 MABEDD (5B) ifY =aX +das, for some a.b€R. The constants A. B are defined in Theorem 4d. Ii the median of F is known and unique, then we can replace MAD,(X) by MAD, (X) == med{|X; — med(X)|,i = 1,...,n}. It P is differeutiable at wol(X) + MAD(X) atl uel(X) — MAD(X) with flmed(X) + MAD(X)) + f(med(X) — MAD(X)) > 0, then by Example 4.1.4 in Reiss (1989) Since the term B in the variance of the limiting distribution of /i(MAD,(X) — MAN(Y)} in Theorem 41 does not wanish in general, the limiting distributions of MAD,(X) and MAD,(X) difler in general. This is different to the limiting distribution of the sample variance, which is not affected by plugging. in the sample It is easy to find examples of edfs where the term B is positive. ‘This is what wwe would expect as we replace the unknown median med(X) in MAD,(X) by ‘the empirical one, Bur one can also Mud examples, where Bis nogavlve, which (4a) VYi(MAD,(X) ~ MAD(X}) = (6.ON MAD AND COMEDIANS eat contradicts our intution, since in this ease MEAD, (2) 18 tor large m more con- centrated around MAD(X) than MAD,(X). The following example leads to a negative value of B. Put 1/4, A/a 1 4/508 — where 2 satisfies f°" 1/4 + 4/52% — a2dr = 1/2. Then the median of the corresponding caf F is'0 aud MAD « [1.9191,1.9192]. Consequently, /(— MAD) — (MAD) = MAD? —4/5 MAD® <0 and (42) Se) B= ({(-MAD) - f(MAD))* + 4f(0)(f(- MAD) ~ f(MAD))(1 ~ F(MAD) ~ F(— MAD)) = (MAD? —4/5MAD3)(MAD?_-7/15MAD®_ MAD# /s) <0. ‘This example shows that knowledge of the population median does not nee- scarily, pay (in the sense of asymptotic relative etheiency) when estimating the MAD. It is interesting to compare the limiting normal distribution of MAD,(X)/ ~1(9/4) as an estimator of the standard deviation ¢ with that of VAR,(X) i Dla (%i— Xn)? in the particular case, where X has the normal distribution whereas mn Vi(VAR,, (X) ~ 2) 3 N(0,07/2). ‘Whe osymptotic relative ficiency (ARE) of MAD with reopect to the empirical standard deviation, defined as the ratio of the variances of their limiting normal distributions equals therefore (43) ARE = 8("!(3/4)}°"1(3/4)? ~ 0.368. Asan estimator of ¢, the MAD requires therefore about 3n observations to perform roughly as well as the empirical standard deviation based on observations, if is large (see, for example, Section 1.15.3 in Serfling (1980) for a discussion of the concept of ARE). This is the price one has to pay for the robustness of the estimator "Pho prinn far mnhuctnese can actnally he lawered hy utilizing for eample the estimator (4) Gp — emery,632 MICHAEL FALK introduced by Rousseeuw and Croux (1993) as an alternarive to the empirical MAD. It was proved that S, is consistent for the corresponding poptilation functional (45) S(X) := emed x, {med x, [Xi - Xol}, where Xi, Xp are independent copies of X. Note that the constant ¢ ean be chosen so that S(X) equals the standard deviation « at normal distributions, just like the constant. 1/~'(3/4) for the MAD. The asymptotic efficiency of Sy is 58%, compared to 37%, roughly, for the empirical MAD. Morecver, S, is location. free in the sense that it does not use any location estimator (like the sample median), so the effect that knowing med (X) in advance may be a disadvantage for the sample MAD does not occur for Sy. In view of the general equality med(|X|) = med(X?)'/? (see Lemma 1.3) one could aleo extend §), to an ootimatar af ravarianca, given hy (4.6) Sa XY) = EF medi cica{medscjcn(ai — 2;}(yi ~ 95) ‘The author is grateful to the referee for providing this idea, whose investigation has to be the content: of future research work. In the casey where med(X) and med(¥) are Imown, wo oan also modify the ‘empirical comedian in an obvious way and obtain immediately from the asymptotic normality of sample quantiles (see eg. Reiss (1989), Example 4.1.1) its limiting isceibution, stave ia Ube following sult, Tie peautive and ¥ are rvs which are symmetric with respect to some known real numbers. In this case, these conters of symmetry are med(X) and med(¥). Wh ip uflest aroused Chat X PROPOSITION 4.1. Suppose that med(X) and med{Y) are unique and that the edf H of (X -med(X))(¥ —med(V)) is differentiable at COM(X, ¥) with posi- five derivative h(COM(X,Y)) > 0. Put COM,(X.¥) := med (X;—med(X))(Yi— med(¥)),i = 1,...s}, where (X:,¥i}(X2,¥2),... are independent copies of (X.Y). Then a a i 1 vii(COM, (X,Y) — COM(X,¥)) 35 N (0. | Let now (X,Y) have an arbitrary bivariate normal distribution with positive variances o%, of of X and ¥ and correlation coefficient @ # 0. In this case, the preceding result implies (4.2) n(GOM, (X, ¥) — COM(X,Y)) (“agtion) “4h3(a(e)) ]” where hy is given in (2.5). If X and ¥ are however independent, Proposition 4.1 does not entail nondegenerate asymptotic normality of COM, (X.Y), since in Ubis vase y ~ 0, y{y) — 0 aud thus, hglg(e)) — oe; vce the romorke aftor (2.5), DON MAD AND COMEDIANS: 63s The innlepoilence vane svyuiion Uscsefine ait calta iuvestigativa, provided in tlie following result, Trmonem 42. Suppose Huul the ulfo P, @ of X wad ¥ wre differentiubte weer med(X}, med(Y) with derivatives J, 9 that are Hélder-continuous at med(X) and med(¥), respectively, and satisfy f(med(X)) > 0, g(med{¥}) > 0. Then we have Vin log(n)GOM,,(X,¥) ZO 1/(4f(med(X))* g(med(¥))*)). ‘The proof of Theorem 4.2 indicates that 2/71 log(n} f(med(X))g(med(¥))(1+ log log(n) /log(n))COM,(X,¥) approaches the standard normal distribution much taster than 2/n login) f (med( X )}g{med(Y ))COM,, (X,Y). ‘Lhis is also ver- ified by numerous Monte Carlo simulations. COROLLARY 4.2. Suppose that X, ¥ are independent normal rus with post- tive variances o%, 0}. ‘Then, based on n independent copies of (X,¥), we have log(n)COM,(X,¥) = N(O,o%a}). ‘This result entails that in the case 9 = 0, the empirical comedian COM,(X,Y) as an estimator of COM(X,¥) = 0 outperforms the sample covariance COV.(X.¥Y) of normal vertors with known means, since faCov,(X, Ye N(,o% 09). Proposition 4.1 and Corollary 4.2 indicate that the cmpirical comedian is superefficient” at the parameter p = J, Whether CUM,(4,¥) 18 actually su perefficiont in the sense defined for example im Section 8.6 in Pfanzagl (1994), requires knowledge about its limiting distribution along a sequence of alteruatives fn > p © (1,1). But this is an open problem to the best of our knowledge. ‘The same applios to the sample comedian GOM,,(X,Y) with estimated marginal ruedians defined below and to the test for @ = 0 defined in (4.12). Proor. We utilize again the quantile transformation technique and assume the representation X;¥; = HW1(U;), i = 1,....n, where Ur,...,U» are indeven- dent and uniformly on (0, 1) distributed rvs and H denotes the edf of XY. Without loss of generality we can suppose med(X} = med(¥) = Fix t € R. From Example 4.1.1 in Reiss (1989) we obtain P{ Yiilog(n)GOM,(X,Y) < t} — PLofe( Ry /2) ~ 1/9) < dfn H(ta, ®(2Vn(H (ta,,) — 1/2}) + ofl), whore Bh, donates the ampisical eaf of U,...,U, and an + 1/(ymlog(n)). Te remains to show that for t€R (48) vil H{tan) — 1/2), —» FlOdo(O0M. 1/2)} 4 (1)034 MICHAEL FALK Fix mberefore ¢ © R. Fubiu’s theorem uuplies (49) YnlH (tan) ~1/2) A f Gtan/2)C(X)dx — VR { * G(tan/2)C(X)de lb Ico =vnf * Gtan/2) — GO)E(X)(da) ‘ va f Cltanfe) C(0)L(X)(de). Choose a large constant KC > 0 such that t//K is so close to zero that it is in the domain of Hélder-continuity of g. ‘Then we have 420) vm f Gttagian - GrOrerxvan) = vif Glea,/2) ~G(0) E(x (ds) + 0) et f . ‘(tan /2)tan /2£(X)(dz) + o(1) = VF J (0am /2£1X)(ee) + va, (QVtan/t) — g(U)}tan/ TELA )(dx) + O(),, where 9 € (0,1). Fix © > 0 small such that the interval (0.¢] is in the domain of Hélder-continuity of f. The first integral then equals for large n. way 208 Upton oe gt fe 1 = Eat dica, 2M) + LE) — FONde + 01) _ gO) Ft fo ~ Toxin) fiero) 9(0)F(O}t “Tog(ny “oale) ~ eat Kan)) + of1) (0) F(0)t + 01). Ry the cane arguments nna chews that the socand integral in (L1N) is af order o(1). Together with (4.10) we have thus showa that the first integral in (4.9) equals g(0)}(0)t/2. In complete analogy ono shows that the second integral equals ‘a(0)f(0)#/2, which yields (4.8) and thus the aonertion. © In the next result we establish asymptotic normality of the empirical comedian fn the cove, where med(X) and med(V’) are unknown.ON MAD AND COMEDIANS: 635 Treureat 4.3, Suppose in wildicion co dhe assumptions im Theorem 4.2 thus the derivatives f, g are Hélder-continuous in a neighborhood of med(X) and med(¥), respectively. Then we have Vir log(n) COM, (X,Y) = N(O, 1/(4F(med(X))?a(med(¥))?)) ‘The variance of the limiting distribution of the comedian with unknown marginal medians is therefore the same as in the case with known medians. Formulas (4.16)-(4.19) in the proof of Theorem 4.3 indicate that with b,, == 1 +2loglog(n)/log(n), the distribution of 2b, f{med(X))g(med(¥))COM, (X,Y) approaches the standard normal distrihntion much faster than withont the vari ance correction term by ‘Theorem 4.3 offers a way to test independence of arbitrary normal rvs X, ¥ bby meano of the comedian, Denote again by MADy(X) — med[|Xz _sedy(X)] + i=1,...,m} the empirical MAD with estimated median of X, based on n independent copies of X. From Corollary 3.1 we obtain MAD, (X) 100 MAD(X) = ox ® *(3/4} almost surely and thus, the following result Is an immediate consequence of Theorem 4.3. COROLLARY 4.3. For mdependent but arbitrary normal rvs X,Y we nave Vilog(n)COMn(X,Y) 3 NO, ra 07} and thus, nt F1GAM ftog(ny COMME) ar0,1) 7 MAD,(X)MAD,(¥) © Again, it is preferable to multiply the left hand side above hy b= (1 + 2loglog(n)/log(n)) in order to accelerate the rate of convergence to N(0,1). If we denote again by 6 = COM, (X,Y)/(MADa(X)MAD,(¥}) the empirical correlation median, then the preceding reoult implica that (4.12) n(n) = Lea-t(1a/2)n/(bu 9°13 /4)? Yi log(n)}.0o} IPD) is an asymptotic level a-test for testing independence of normal rvs X and Y. From Theorem 3.1 we know that COM,(X,Y) converges to COM(X,¥) as. for an arbitrary hivariate normal vactar (X.¥) and fram (91) that COM(X, ¥) = O iff X, ¥ are independent. The asymptotic level atest yn(6,) defined above is therefore also asymptotically consistent, These considerations are summarized in the met result CoroLtaRy 44, If bq is based on n independent copies of an arbitrary bi- susiate wurmad wectin (X,Y) we dawwe a if X,Y are independent 1 otherwise. tela) a {836 MICHAEL FALK It would clearly be interesting to evaluate the power function of 3, (6,,) a sequence of alternatives (X,Y) converging to the independent case, would enable us to compare ¢,,(6,) with other tests for independence, in particular that one based on the empirical correlation coefficient. But to this end, we need the asymptotic distribution of 6, in the dependent case i.e., along a sequence of alternatives converging to zero. ‘To the best of our knowledge, this is an open, problem. The standardization of the empirical comedian hy MAD..CXOMAT.(Y) is however taylor made for normal rvs. A nonparametric ie., distributional cobust standardization of COM, (X,Y) can be based on estimators of the marginal den- sitios at med(X), mod(¥). ‘An obvious choice can be based on the kernel estimator of the quantile density function (F~!)'(q) = 1/ f(F1(q)) at q = 1/2. Put (4.13) inte) = (Fe @oa?4(- 2/002) whore the bandwidth a, > 0 converges to zero and the Kernel fimetion & has hounded support and satisfies f k(x)dx = 0, f xk(x)dr = —1. This kernel estimator bias been extensively studied in the last decade, see e.g. Xiang (1994) and tho literature cited therein, In particular Theorem 2 in Falk (1986) implies that if f = F’ oxists and is positive near med(X) = F-*(1/2), then via (62 - ) M0. f Kid. where K{y) = J, b(e)dz, provided na? — oo and nad — 0. Consequently, it naz — oo and na® — U, we obtain under the conditions of ‘Theorem 4.3 2al1/2)Gn(1/2} vnloginJCOMa (X,Y) BNGn, whore ju(1/2) ia dofined the some way as fy(1/2) but with Fiz1(o) replaced by the empirical quantile function of Y,,...,¥ In particular for the choice ky(z) = —32/2, x € [1,1], ke(x) = 0 elsewhere, which io tlw detivative of de Byranccteuihor (1909) heausel f(a) ~ a/-4(1 — 2), x [-1,1] we obtain f KA(y)dy = 3/5 and, moreover, with a, € (0, 1/2) 2ne, SE Nalaca it~ 1/2)Xin where Xin S--- $ Xam denote the ordered values of Xi,-...Xn- Equally, one can approximate jn(1/2} with Xuq replaced by Yin. These estimators are ahviously quite mabust with asymptotia breakdown paint 1/9ON MAD AND COMEDIANS 6st Au alternative standardizauion of Ue comedian can be based on the kernel deusity estimator directly. Put (4.14) Faltiody{X)) Wigan (x) ~ Xi . 7 where K has bounded support and satisfies [ K (x)dx = 1, f 2K (x)dx = 0. Notice that k(x) = K'(2), © € R, is a kernel as above, provided A is continuously difforontiablo on Rand has bounded cupport. Tho following raoult oan bo proved by utilizing the conditioning technique of the proof of Theorem 4.3 below. Here we assume that for an even sample size n = 2m the empirical median med, (X) Jy defued a a Axed couver combination EXman +A ~ C)Xm pins © (0,2), of che most central order statistics Xwn, Xmsin- The common choice is ¢ = 1/2. LEMMA 4.1. Suppose that the cdy F of A 18 aifferentaanle near # (1/2) with positive and Lipschitz-continuous derivative J. If K has bounded support and satisfies [ K(a)de = 1, f wK*(x)dr = 0, f K%(o}dr < 00, then Salinedn(X)) LE ade rit ( Foned XT) :) 3" (figs provided na ~+ 0, nay +00 as n+ 90. Different to fn(1/2), the limiting normal distribution of (med, (X)) depends ‘on the value f(med(X)). In case of the Epanechnikoy kernels kes and Kp, fy(1/2) outperforms asymptotically the standardization f,(ied,,(X)) iff f(med(X)) < 2 Proor or THrone 4.3. The proof is based on iterated conditioning and ‘on the quantile transformation. Suppose in the following X, = F-'(U;), Yi = GAW,), i= 1,2,..., where U%,U,.-. and W,, Wa,... are independent sequences of independent and uniformly on (0,1) distributed rvs. By Fy, Gy we denote the empirical edfs of U;,...,U, and Wr,...,Wy, re spectively. For simplicity we assume that m = 2m + 1 is odd, Fix t > 0 and put 1/(fiilog(n)). Then we have P(Viilog(n)GOM,(X,¥) < t} 7 ry 1 oo ag| (Xi = FENN — G3"(1/29)) > vs) =/°( 1c tay] (PM) — FF 1/2) = A(GTWY) — GONG T (1/2)))) > m| valPe (1/2) ~ 1/2) = eas 0) = 1/2))(au)638 MICHAEL FALK _ aa p Sen a UE) A (G7) — "(GE 5"(0/2))) + DU costaat (PU?) — Fe M(0/2 + w/v) tat x (GW) Gz ‘aay >m} x L(vn(Fy (1/2) — 1/2)(du), since conditional on F>1(1/2) = 1/2 + u//n, the sample U,...,Uq consists of two independent samples of independent rvs U;"),...,US? and U?,...,U!? and dhe value 1/2-ru/ YT, where dhe 7 are unlformly on (0, 1/2-+u/ yM) austributea and the U! are uniformly on (1/2 + u//%, 1) distributed. Moreover, these rvs are independent of W, W2, By repeating this argument. and conditioning on G;'(1/2) = 1/2 + w/ Ym, the preceding integral equals Mit va/2 s UT ~Lend an (-eosan] (FU) — FML/2 + w/v) x ‘wl — 10/24 w/va))) tc astag UE!) — PQ ye + vm) 2 (GHEE) GAL /2 + w/yia)}) > “| x LVnlGy (1/2) ~ 1/2))(dw)£( ya F (1/2) — 1/2))(du), where r(j) € (0,1,2}, Ws" is uniformly on (0,1/2 + w/yf) distributed, W°? is uniformly on (1/24 w/ /a,1) distributed and Wi4,..., 14), wi2),..., Wl?) are independent rvs, W{"” = 1/2 + w/y/n. The random vector r= (r(1),...,7(n)) is a random permutation of m ones, m twos and one zero, and independent of the Oj aud Wi; ‘The number NV of ones among the r(i) in the first sum in the preceding integral has therefore a hypergeometric distribution with parameters 2m +1, m, m ic. == CY JOR. k=O cm By conditioning on N = k, this approximation together with the asymptotic normality of sample quantiles ii vatiativual dintanue (see G2. TheoristsON MAD AND COMEDIANS 630 els (1984)) implies that the above multaple integral equals Pe (cootanlttt WU) — U2 uy vA) x (GW!) — AM J24 w/a) $+ Acer (PMU) — PHD + w/v) x (ow?) ~@*(1/2+ w/Va))) + zy Uccetagy (FU?) = FOMA/2 + uf/n)) WL) — ANA 9 4 anf) + DO Aecostenl (FUP) — FMR + w/ VA) (CWE) ~ G2 + w/a) > oh X C(N)(ak) (VG, (1/2) ~ 1/2))(dw) x £¢vn( Fr (1/2) ~ 1/2)) (du) + 0(1) arta Ell tel x PAS tera) ~ F124 u/va)) ce! wi) — ea /2 +u/va))) + YS a ceeaat (EMO!) — BG /2 + why) x eon) HG N24 w/yay)) + FH acccsagl(PU2) — F/2 + w/v) x ee (Ww) —G7(1/2 + w/a) + > A-oe.tag)( (FN /2 + uf VR) iemohth (GW?) — (1/2 + w/Vn))} > w} A ELV UR) NCO,1/(ay'(G (2/2))) Kaw)

Falk1997 Article OnMadAndComedians

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Falk1997 Article OnMadAndComedians

Uploaded by

Copyright:

Available Formats

You might also like