The Resistant Line and Related Regression Methods

Author(s): Iain M. Johnstone and Paul F. Velleman
Source: Journal of the American Statistical Association, Vol. 80, No. 392 (Dec., 1985), pp. 10411054
Published by: Taylor & Francis, Ltd. on behalf of the American Statistical Association
Stable URL:
Accessed: 11-08-2015 18:18 UTC

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact

Taylor & Francis, Ltd. and American Statistical Association are collaborating with JSTOR to digitize, preserve and extend
access to Journal of the American Statistical Association.

This content downloaded from on Tue, 11 Aug 2015 18:18:38 UTC
All use subject to JSTOR Terms and Conditions

Berkeley. has not been analyzed theo. Minitab and P-Stat). The method thus combines bounded influence and acceptable efficiency with conceptualand computationalsimplicity. or x. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . Ithaca. 1.A dual plot representationof the procedure yields a fast. y) population. 80. Theoryand Methods 1041 This content downloaded from 132. Variance-reductionswindle. see Velleman and Hoaglin 1981 and Emerson and 1. Peter Rousseeuw. VELLEMAN* In a bivariate(x. influence function. Cornell University.MarkMatthewsfor Figure 1. (1. we discuss extensions to multiple carriersbriefly in Section 7. While retainingthe computationaland resistancepropertiesof the resistantline. Dual 2.) extremein y. John Tukey. and the line with an equal numberof points above and our use of the terms "resistant. Where resistance is 1.. these estimators can have substantiallyhigher efficiency. The resulting tribution-free"is intendedto conformwith thatof Huber1981.and heavy-tailederrordistributions.1 Summary in efficiency or in resistance.) Designed as a "pencil and paper"method. Robust regression. by the NationalScience Foundation. and the referees for detailed and helpful comments on drafts of this article. distribution-freeconfidence intervals (Section 3). The modificationsproposedin Section 5 can increasethe line as an exploratorydata-analytictool for quickly fitting a practicalefficiency to roughly75% undera wide rangeof error straightline to bivariate(x. INTRODUCTION important.(minimumvariance)estimatorof the form (1. asymptoticand small-sampleefficiency for bothGaussian Berry-Esseen theorem. (For detaileddiscussions including examples.x are subject to error. Al.200 on Tue. They showed for Gaussian y that an optimal of "Vg-resistant" lines constructedfrom M-estimatorsof loca. (1946) noted the advantageof three groupsin simple partitionEconomic and Social Statistics. Brown-Mood regression. No. and he found that the optimal diStanfordby Office of Naval ResearchContractN00014-81-K-0340. The experiment uses two new variance reduction "swindles" and provides small-sampleresults not previously availablefor most of the estimatorsstudied. CA. and Independentlyof Wald. convergentalgorithm. middle.nonparametricconfidence intervals for the slope. Monte Carlo results show that small sample efficiencies exceed their asymptotic values in importantcases. Velleman is Associate Professor. Stanford. We then use the preceding theory and the results of a Monte Carlo simulation experimentof small sample properties(Section 6) to compare the resistant lines with related line-fitting procedures.2 Background Hoaglin 1983.1) is obtainedby using only nl3 points in each of the extremegroups. peated median regression. equally spacedx-values divided into asymptoticnormality(Section 4) for the resistantline. JOHNSTONEand PAULF. includingprotectionfrom datavalues extreme in x as well as those extreme in y ("boundedinfluence") KEY WORDS: Bounded influence. A class three linearregression.248. tion is then introduced(Section 5).Stanford its efficiency relative to least squares from 4 to '. {(xi. yi) into two groups L = {(xi. Replacingmediansby otherM-estimators of location increases efficiency substantiallywithout affecting breakdown or computational complexity. yi): xi > med xi} and used an estimatorbased on group like many exploratorymethods. consistency. resistance.228. parameterestimates are resistantto the effects of data points chap. 3. The resistantline does well for criteria1 and 2 and in many cases has an efficiency of roughly 60%. Johnstone'swork was supported in part at Cornell by an AustralianNational University Scholarship. Consequently. New YorkState School of IndustrialandLabor based estimation of the correlationcoefficient of a bivariate Relations. the technique. yi): xi < med xi} and R = statistics packages (e.g. and similarestimatorto fixed. Johnstoneis AssistantProfessor. (In this article." "nonparametric. NY 14853. CA 94305.Wald(1940) studiedthe errors-in-variables parametricregression in computer-assisteddata analyses. ? 1985 AmericanStatisticalAssociation Journalof the AmericanStatisticalAssociation December 1985."and "disbelow it in each of the outer groups is fitted. increasing * IainM. (YR . values. and at the MathematicalSciences ResearchInstitute. the Tukey's methodblendsfeaturesof two traditionalapproaches resistant line is also useful as a computationallyquick non.Re. cost of computation plot. Nair and Shrivastava(1942) applied a asymptotic results on consistency. y) scatterplotthe three-groupresistantline is that line for which the median residual in each outer third of the data (orderedon x) is zero. He partitionedthe observed data points metric linear regressionproceduresand is widely available in (xi.1) XL).Vol. y) data. The datapoints are divided distributionsand increase the asymptoticrelative efficiency to into three groups according to smallest. and asymptotic normality results.Breakdown. 10) proposedthe (threegroup) resistant tool.YL)/(XR bw= This articlepresentsa fast computationalalgorithm(Section 2). andCornellUniversityfor computerfunds. 392.The authorsthank David Hoaglin.problemof fitting a straightline y = a + bx when both y and though it compares favorably to other well-known nonpara.Confidence interval.Departmentof Statistics.The Resistant Line and Related Regression Methods IAIN M. whetherone is interestedin such cases 1. In consideringthe meritsof variousalternativesto leastsquares regressionmethodsin exploratoryworkour primarycriteriaare the following: 1. It was proposed by Tukey as an exploratorymethodresistantto outliersin x andy and suited to handcalculation.M estimator.means: retically. or largest x 80% under classical regression assumptions. influence function. chap. or Gaussian (x.Finally. Both breakdownand efficiency are adequate for exploratorywork. the resistant line is a valuable Tukey (1970. Mosteller University. Paul F.a value of 60% may not be noticeablydifferentfrom 100%.

which has generallyhigh efficiency. respectively (nL + nM + nR = n). The interceptaBM is then chosen to make the median of all the residualszero. but they rewardthe extra effort with higher breakdownvalues or greaterefficiencies.bxi)intersects TL exactlyonce.from(1.1) iER iEL The interceptestimateaRLmay. The same optimalitycalculationapplies to the regressionproblem (Section 4) and in a differentcontext leads to the so called "27%rule"in psychometrics(compareKelley 1939 andMcCabe 1980 for some references).bxi). Inequality(2.3) whichboundsthe differencein the slopesof the mediantraces nearbRL.XR)might be empty. nL = nR = n/2.. see Bingham and Emerson 1983). andso forth.248. The second approach(Mood 1950. -and this step of the iteration worse. Brown and Mood 1951) is nonparametric:Divide the datainto two groupswith the split at the medianx (as in Wald's method). Yn)be given with xl s .mediFRyi.bTs = med{(y . least absolute residual(L1)regression(e. The resistantline slope estimate bRLis then defined as the solution to b(k+ 1) - b(k) + (et) - eJ)) (/ CR XL). jR . With this caveat. the Vu-resistant Several other interestingestimatorswere not includedin our study. ei(b) = yixib.2) > Ib(k).xi). encountered bytheMoodThedualplotrevealsthedifficulties Tukeyalgorithm. The iterationcontinues until the two group medians are equal or the correctionis as small as required. and the related repeated median estimator bRM = medi medjoi{(yj. Now.LetXR = mediER xi. L1regressionperformscomparablyto lines with respect to criteria 1-3. M.)We shall occasionally abuse notation by writing i E L for xi E L. (See Vellemanand Hoaglin 1981 for a detailed algorithm.yi)/(xj . This content downloaded from 132. see Emerson and Hoaglin (1983).228. for example. Furtherpertinentwork is surveyed by Emerson and Hoaglin (1983).bRL bRL ~~~~> 2Ax.200 on Tue. if . December 1985 vision places about27% of the datain each of the outergroups.xi) Ixj > 4.Thus the iterationcan be stymied x-values.and nR data points.TL(b(k))]/Ax. is thedifferencebetweenthe medianresidualsin theleft andrightgroups.1 A Dual Plot The dual plot interchanges the roles of points and lines. nM. respectively.2) can be has made the approximation writtenas TR(b(k)) F b(k) - TL(bRL)| TR(bRL) TL(b()) | .1042 Joumal of the American Statistical Association.The third. Ignore the intercept.xib). If nL is odd. against b. it fails to converge often enough to be inappropriatefor practicaldata analysis (e. gL = med{yi:xi-E L}. the slope of the line in (x. This appearsto be the only published algorithmfor Brown-Mood regression. Brown-Moodregression is the special case in which nM = 0. .g.. Foranillustrative example. The resistant line is a special case of Kildea's (1981) weighted median estimatorsthat uses only 0-1 weights. Fisher (1983) discussed these and other graphicalnonparametricregressionmethods. Let XL = med{fx:xi i L}.YL)I(XR . TLconsists of line segmentsfromtheduallinesin L. yi). . PROPERTIES (xv.. we includethreeothersof currentinterestin the comparisons.TL(b(k))]IAxI > 2|bk) I[TR(b(k)) then Ib(k+1) bRL bRLI. Dolby (1960) and Daniels (1954) have used the dual plot in simple linearregressioncontexts. Laplace1818andBassettandKoenker 1978) has been suggested as an improvementon least squares for heavy-tailederrordistributions. and so forth. yi)' xzc * .and Ax = XR . computes residuals e(l). This yields a line with slope -xi and intercept y1 to correspondto the original data point (xi.2) Here e and it) are the medians of the kth step residuals in the right and left groups.yj) (interpreted = xj). 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . the rightmediantraceTR(b)= mediER(Yi .y. 2. . and so points (x1. The b valueof this intersectionis the slope of the resistantline (see Figure1). and iterates and R containing nL. Let. andthe S-estimators proposedby Rousseeuw and Yohai (1983). We hope comparisonswill be made with related techniques such as bounded-influencerobust regressions of the Krasker-Welsch(1982) type.XL).xi)} introducedby Siegel (1982). theline segmentsin TL averageadjacentpairsof duallines. into three groups L.However. (2. yi) and as ooif xi (xj. (1.min{R} and M C (XL. (2. which has an optimal 50% breakdown. Section 2 offers an explanation and an algorithmwhose rapid convergence is guaranteed. On the dualplot the distancebetween themediantracesat anyslope. the least medianof squares(LMS) estimatorsproposedby Rousseeuw(1984). the resistantline combines the resistanceto outliers of the Brown-Mood median regressionwith the improvedefficiency obtainedfrom three partitions.Thus. In addition to the regression estimatorsjust discussed. the x-values Partition x_.b. In short. Some of our results were first reportedin Johnstone and Velleman (1982). LINE OF THERESISTANT 2.A mediantracecanbe steeponly if anx valuein its group is large in magnitude.It has an influencefunction that is unboundedin x and thus can be sensitive to extreme x values.XL. b(k+l) = b(k) + [TR(b(k)) . SinceXL < XR. where XL = max{L} < xR. Thenthe firstestimatedslope is b(l) = (YR. This graphicalconstructionshowsthatbRLalwaysexists andis unique. med (yi . and find the slope bBM makingthe medianresidualin the left groupequalto the median of the residuals in the right group." _ b(k) . The medianstrace of the left group is the piecewise-linear functionTL(b)= mediEL (y1 .g. Mood's (1950) algorithmfor findingthe slope (repeatedin Tukey 1970) begins with b(') = (YR. be chosen to make the medianresidualsof both groupszero. y) space joining (xi.yL)/AX. (2. Two dual lines ei(b) and ej(b) will intersectat a point having b-value bij= (y.)I(xj .bRLI. and plot each residual.2). namely. a. If nL is even.Two are more expensive to compute than those discussed thus far.) = med (yi .y1)I(xi.bx. These arethe median-of-pairwise-slopes procedurestudiedby Theil (1950) andSen (1968). See Emerby high-leveragepoints with extraordinary son and Hoaglin (1983) for a specific example constructedby Andrew Siegel.

Asymptotically. y) as the line e = y . .the firsttwo estimatesfromMood's original iterationoften suffice. TLand TR. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . however. Figure 1. Brent (1973) showed that this algorithmwill always converge in fewer than [log2(bma.our experienceconfirmsthis.0 + e and 1. the breakdownpoint is the largest fractionof the data that can be changed arbitrarilywithout changing the estimatorbeyond all bounds. Among algorithms for finding zeros of functions.. n).bRL- 2. Loosely.248. The abscissa of theirintersectionis the resistant lineslope..bma) to search. The resistantline has a breakdownvalue of (1/n) min{[(nL . If nL and nR are both odd. [(nR .Johnstone and Velleman: The Resistant Line and Related Regression Methods 1043 e(b) ~~ % 20. in which the {ei} are iid and (a) the Xi are fixed or (b) the Xi are iid and the ei are independentof them.xb.d ~ (_s -a. where toll = .228.bmijn)tol1)]2 steps.2 An Algorithm for Resistant Lines The dual plot suggests an algorithmfor the resistantline that will always converge.. BecauseE(b) is monotone.)) $ sign(E(bj)) are suitable. The results are given for model a and follow for model b by conditioning on X. and Ullman 1974. ( t 80} /~~~O .Dekker 1969. see Aho.. The zero of the piecewise-linearmonotone-decreasingfunction E(b) = TL(b) . The dual plot displays each data point. -7) 7.e. note that{bRL f > c} = {medR(ei.1)/2]} becausewe needonly alter half of the datapointsin one of the outergroupsto takecomplete This content downloaded from 132. (x.. Lines are shown only forpoints in the left or rightgroups of the data set.i and e is the machine epsilon (i.the one known as ZEROIN(Wilkinson1967. so this algorithmrequires0(n) operations. Hopcroft. Brent claims that the algorithmcan usually be expected to converge much more quickly.and an interval(bmjn. The convergence rate of ZEROIN is independent of n. 2.The algorithm requireschoice of a tolerance. then bRL is median unbiasedfor f without any assumptionof symmetryon the gi.. The median traces.r.In this case bRLandaRLaresymmetrically distributedabout f and a.the medianof n numberscan be found with 0(n) comparisons(e.e. pp. b.followthe median residual..0 x e x Ibma. which is the amountof error in b that can be tolerated. .ci. For moderatesample") > medL(ei . Each step requiresthe median of the residuals in each of the outer groups. for each slope.. ?- h 2. 2. Supposenow also thatthe errors ei are symmetricabout 0 and independent.1)/2].g. To see this.s 0 N~~~FI.tol.. iER and similarlyP{bRL C iEL fi} 1.for our application.. .. andbmi.5 tol + 2.0 differ) in practice0(n log n) median-findingalgorithms are usually used. sTL %% ( ~~~^~~S ( -20. respectively. -2.. . Brent 1973) is well suitedto this problem.c} is equivalentto the same event with every ci replacedby . Velleman and Hoaglin (1981) providedportableprogramsfor findingresistantlines using this algorithm. Thus P{bRL P>f}= med ei ' med i 4'. 97-99). and thus are median < oo) for all values of nL and unbiased and unbiased (if EIe1i nR.but not necessarily identicallydistributed.3 Unbiasedness Consider the bivariate linear model yi = a + fXi + ei (i = 1. ..cxj)} and that {bRL .0 (-4.200 on Tue.anyb.4 Breakdown Bounds The (gross error)breakdownpoint of an estimatormeasures its abilityto resistthe effects of outliers(Hampel1971).thatsatisfysign(E(b. the smallest number for which the machine's floating point representationsof 1.TR(b)yields the resistant line slope. *I" /S* -4) . n/3 is too small to make the large overhead of specialized methods worthwhile.f < ..

05 . Table 1 shows. modifying the repeatedmedian by computingbijonly for xi E L and xj E R is equivalent to using the 0(n) resistant line estimator.228. and define e(R)(fl)similarly in the rightgroup. i E L}.25 . resistance..200 on Tue.10 . The quick. In reality. bRm = med med bij.0 NOTE:a is the probabilityt of gross error.014 . i joi - pays for its optimal gross errorbreakdownof nearly50% with a computationalcomplexity of 0(n2) operations.xif for the residualof (xi.. For convenience take nL = nR = 2k + 1. If nL is even. in a regressionmodel may be derivedfrom an analogof the method that gives interval estimates for the populationmedian.002 .04 . Theil-Sen regression(29%). Highly efficient estimators either involve more computation(M-estimators. where Ix(p.19 . and x.248. it is enough to note that er(bRL) C TL(bRL).07 . This gives a plausible bad-case ratherthan worst-case bound on the protectionavailable. {y . as bir s br.Pr{Bin(2k + 1. Since bRL satisfies medreRer(bRL)= TL(bRL) by definition. k + 1)2. the possibility of breakdownis affected only by the size of the subset drawn so that the breakdownprobabilityis given by a single binomial probability.998 .of even x2 cardinality. From the definition TL(br) = medlELe. In the dual plot.f Distribution-freeconfidence intervalsfor the slope. all bad points must lie in the same outerthird. a contaminationlevel near 25% seems appropriateas a guide in determiningthe applicabilityof the resistantline method. and efficiency that occur in currentregressionmethods.1044 Journal of the American controlof the slope estimate. section 5. CONFIDENCEINTERVALS FOR. Writeei(fl) = yi. q) is Pearson's incomplete beta function. the conclusion of the propositionis just that . a) ? k}2 = 1 .05 . 1. not all subsets of size W are equally dangerous for the resistantline. and if both nL and nR are even (for example). Proposition 2. Although a breakdown value of Ais certainlyresistant(least squaresand least-absolutevalue regression both have breakdownsof 0%). and for Brown-Moodregression nL = nR = [nl2]. lomed lomed blr S Table 1.26 . in 90% of cases the resistantline can tolerateabout25% contamination with outliers (and this still assumes that all outliers reinforce each other in location and sign). let the intersectionof er(b) = Yr .Interestingly.II-a(k + 1.the results extend to the random-carriers model as before. In an ordered set xl c x2.25 . that for sample size 27.1). Here.(br) and from the relation el(br) c TL(br) holds. the breakdown A Probabilistic BreakdownValue. the conclusion of 10 is weakened to br ' himedlELblr.0002 . we can calculate a probabilistic breakdownvalue (compare Donoho and Huber 1982.0 n = 27 0 .respectively. If nL = value is approximatelyequal to A. nL = nR = nl 3.x1/. Remark.Xrb(XrE R) with TL(b)occur at br.013 . yi) atf (ignoring a) in the dual plot. However.04 .5 Resistant Line and Repeated Median Regression The repeatedmedian regressionestimate of Siegel (1982). . f.LMS Brown-Mood Resistantline Theil-Sen Least squares . Let e(Lr)(fl)denote the rth smallest residual in the left group.99998 n = 63 Repeated median.If we restrictattentionto gross errors in y (so thatthe x-values areheld fixed) andassumethatextreme y-values occur independentlywith probabilitya (but continue to assume that they deviate in the same direction). December 1985 probabilities close to and exceeding its nominal breakdown point of 29%. Consider Model a of Section 2. For practicalapplications.56 . and in Section 6. then the breakdownprobabilitybecomes 1 .we use the terms lomed xi and himed xi for x. For the resistantline (includingBrown-Mood).AssumethatnL = nR-=m (say) for con- venience: There is no difficulty in extendingthe theoryor com- This content downloaded from 132. nM = nR. resistantpencil-and-papermethodsdiscussedhere can only aim for moderateefficiency levels.LMSis least medianof squares. respectively. If nL = Proof J0. it is instructiveto note the trade-offs among our criteria of speed. breakdownoccurs if there are more than [((nL 1)/2] (respectively. it follows (since nL is odd) that br = medlELblr. and assume that the ei have a continuous distribution. Proof 20. respectively. lEL rER lEL a Method . Note thatthe Theil-Sen methodis much more sensitiveto gross error - Statistical Association. IllustrativeBreakdownProbabilities and nR are both odd.10 .20 . which yields a larger value of k and hence lower breakdownprobabilities.3. For the resistantline. or repeatedmedianregression(50%). rER bRL C himed himed blr.003 .1.0 3. then bRL medrER medlEL blr. In these cases. LMS Brown-Mood Resistantline Theil-Sen Least squares 0 .003 .9996 .002 . it is poorer than comparable values for Brown-Mood (25%). for example.02 -1. Table 1 gives some values for the breakdownprobability underthe preceding randomlocalized gross error model.Krasker-Welsch) or sacrifice resistance(least squares). breakdownconcentrateson worst-casebehavior. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions .High-breakdown lines such as least median of squares (Rousseeuw 1984) can requireextensive computationor the solution of a minimizationproblem that may possess many local minima.14 -1.0002 . as br bRL. for the line to break down.33 Repeated median.37 -1.. We also give results for repeated median and Theil-Sen regression for comparison. 2. [((nR 1)/ 2]) extreme y values in the left (right) groups that deviate in the same direction.

Switching to almost surely. The lomedian and himedianof a distributionfunction G are defined by lomed G = sup(x: G(x) < 2) and himed G = inf(x: G(x) > ). given x. Otherrelateddistribution-freetests for both slope and intercept were given by Brown and Mood (1951). Let fi. When rc is not an integer. Finiteness of the set {81i. where med(e | X) = med e the (mn .B3in analog of (3. Daniels (1954). As a convention. using normalapproximations. A less revealingdemonstrationcan be based directlyon the negative hypergeometricprobabilitiesusing the (local) normal approximationgiven in Feller (1968. instead of Models a or b. finding fir amounts to locating the line in (x. #8) hasexactcoverageprobability (m-x+-1)(m-r+x)/(2m). 1. we can interpolatefor the edges of the interval. The quantityUm = the numberof values in the right group that exceed C(r)is an exceedance statisticand has probabilitymass function P(UM = x) = 1045 Methods iid Xi = 1 if heads. Define the resistantline slope functional/3RL(H) as the crossing point of zero of the function h(fl) = med(Y - f. If.then an estimate of the kind consideredby Hogg (1975) is obtained (with nM not restrictedto be zero) by solving for. then r = 7.H. Pp(fs > fi} = P(U'm< r) also. if m = 20 and a = .be the unique solution of efr)(fi) = e(S)(fi).PR < 1 and that the partitioningis symmetric: PL = PR. and we take med G = (lomed G + himed G)/2. if m is odd then f[(i+ 1)/2] = fiRL.Now P{fl{ > fir} = P(U'm < r).(F(xV2). The latterexpression has been tabulatedby Epstein (1954) for m = 2(1)15(5)20.90 (Epstein 1954).(XR . Let XL < XR be lower PLthand upperpRth percentiles of the marginal distributionof X such that PL = P(X C XL) andPR = P(X ? XR). . (3.r + l} so that if one sets r = (m + 2 - {X + Y = x /-)/2.Johnstone and Velleman: The Resistant Line and Related Regression putationsto unequalnLand nR. The functiong(fi) = eR)(fi) . with Sm = El Xi. along with extensionsto percentileregression. For example. a + fix is the (IOOr/m)thpercentileof the Y distribution.fAX| X 2 XR) = med(a} + This content downloaded from 132.3)] yields the formula rc= (m + 1 . the coverage probabilityis then no longer exactly distribution-free.X I X 2 XR) - med(Y . Indeed.m.4) for a two-sided approximate100(1 .3) where W?(t)is a standardBrownianbridgeon [0. (3. Theinterval(fl3. 194). . x = 0. ASYMPTOTIC PROPERTIES OF THE RESISTANT LINE Remarks 1. 3.This can overcome the restrictedchoice of a values imposed by integer r. S2m+I = 1} and {X < r} = {(Sm + m)/2 ? m . 4.m.1 ): e(r)(fi) = e(r)(fl). (See remark1 of the previoussection. Pp{fi > fir} = Pp{4e) (fi) > e()(fi)l} = P{L) > Rj} R where e(l) c and e(l) c ) are the ordered j ce() in the outer groups. and by a symmetric argument (since s = m + 1 . by Hogg (1975).1) versus rln appears(from data analysis) to be satisfactory..1) Existence and uniqueness of /ir follow from the dual plot as before.2) as may be seen from a simple combinatorialargument. we assume that. we have m} = {S2m = 0. Asymptotic Approximation for P( Um< r) To study asymptoticproperties. respectively. The approximateconfidence intervalsthat result have coverage probabilities close to the nominal level even for moderatem. This can lead to large gaps between successive possible confidence levels if m is small. (3. The standardcontinuity correction [replacex by x .Vm/ 2z1) (3.9. Interpolationof log P(Un < r . 1] (Billingsley 1968). Y) with values in R2. with an appropriatechoice of intercept a. and assume that either med(X IX < XL) < XL or med(X IX 2 XR) > XR or both.248. set s = m + 1 . Thus the interval [fissfir] covers fi with probability1 . Existence and uniquenessare clear because the slope of h(fl) is always less than . Of course.X IX c XL). ? fim. which roundsto 8.1 if tails. . and more extensively by Bechtel (1982)..XL) and h(fl)/fl+ [med(X| X . . 1 . Assume that 0 < PL S 1 . and hence if the model yi = a + fixi + ei obtains.f.and we now show that pairs Clearly fil ?2 fi..(1/V-) in Equation(3.89 instead of . Bechtel (1982) has shown that values of rc from Equation(3.? (fi fir) (r < s) yield distribution-freeconfidence intervalsfor fi. fim}limits the numberof possible confidence IX ?XR)] asfl+ oo. althoughthe latteris done for notationalconvenience only.05.we extend the definition of the resistant line slope estimate from bivariatescatterplotsto arbitrarybivariatedistributions. med(Y ..r).) Of course.2P(U20 s 7) = .200 on Tue. y) space such that r data points in the left grouplie on or below the line and r points in the rightgroup lie on or above the line. and Adichie (1967). of randomvariables(X. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . C XL) The expression(3. then P(Um < r) = P(Sm l/Vi 2 -> x/iV< P(W0Q') > xI\/<) IS2mIV27 = 0) = 1 .1 Consistency and as such may be thoughtof in terms of fair coin-tossing as P[X = x |X + Y = in].2) is a negativehypergeometricprobability 4.2a)% confidence interval. and.4) round to the exact values even at the (fi) is strictlydecreasing. 2.r. This test is discussed in the case nM = 0. r .2 P(Ur' < r). p. where X and Y are independent FisherconsistencyofflRL(H)holdsforlinearregressionmodels negativebinomialvariablesrecordingthe numberof tails before of the form Y = at + fiX + e. The precedingconfidence intervalcan be invertedin the usual way to give a test of Ho: fi = fioagainst general alternatives.228.r + 1)st and rth heads. The test is based on the numberof data points in the left groupthatlie on or above the line with slope fioandintercept chosen so that m points in L U R lie on or above it.

Therefore. 1's). theorem2). If H E XIr.fiX IX : XL)so that fiRL(H) = fi. . 0. (-1.ULwith probabilityPL i= 1. (1. The fi(*)<b. For samplesH.. .1) to obtain AsVar\/ [fRi(H) - fi.n. We mention without proof only one such result in the spiritof Neyman and Scott (1951.n PR. y).2f. for this implies that ultimately uated at fi determines the direction of change in slope. Let the convex hull of PR + PL - This content downloaded from 132. and e are independent.1) accords with intuitionfrom dual plots.. in contrast to. X c Remark1. h(fi) = 1 . strong consistency propertiesof /JRL(H). be a sample from the structuralmodel X = + LI U + ?1. right if x > XR) median trace evalh*(b) > 0 for any b < fi*.n and XR. then fiRL(HJ) . be(X.n permit choice of the partition points based on the sample itself. PR. Huber1981) assertsthat if Hn denotes the empirical distributionfunction of a sample (Xi.bX IX > XR. Y)) = fi* of the respective functions satisfy fi* 2 fiRL ' fi*.n.fiX X XL).n. the form of the for XL.1) if x ' XL. Qualitatively. for example. of size n fromH it is clearlypossible bution of X.fiX IX 2 XR) Several authors(Mallows 1973.n< XR.bX IX > XRJ) togetherwith the correspondingversion in location problemsis pursuedfurtherin Section 5. Huber 1973. Suppose that Hn -4 H. Suppose and a at its that EIXI< oo. Then h*(f) = 2 = E(X IX C XL) and . where XGdenotes the groupedrandomvariabledefined by Issues of identifiabilityand consistency arisejust as for Wald's XG = PR with probabilityPR estimator. 0).fRi(H). To describethe analogous XR .(PR + PL) PR-P iQ and e have smooth positive densities. for example. Bhapkar(1961) has given some partialconsistencyresultsfor Brown-Mood regressionin the case of fixed xi. Proposition 4. 4. points {(. .pL)]-1 fi* _ lim inffi2 s lim sup fi2 fi*. then we use (4. 1)}. 0). For an arbitrarybivariate distributionH. .fix -a) = +CHYO(y if x ' XR and resistant line estimatorsfil'2. (4. H.fix) with respect to the fi*.V ' U < XL = P(XR .248. PR. Y1)of size n from H.f* = 1. Note that the conditions on XL. .Xi ' XL}.1046 Journal of the American I Statistical Association. Appendix A shows that 2(Y .y)is a point probabilitymass at (x.fiX IX 2 XR) absolute residual regression. (x.V < U consistent if lomed G = himed G. HO) analogy between this influence functionand thatof the median -' (Y . The resistantline influence function is so bounded. data point with an extreme x value. . (X. R = {X'. L = {Xi. Let density.RL(H)]= 1 2pLg2(0)(1pR - L) 1 1 Remark2. Then fiRL is a consistent estimatorof fi if In location problems the median of iid samples from G is and only if P(XL . Let (Xi. positive median.y).the resistantline may be fitted to any bivariatescatterplot. f E R.RL(H)and asymptotic variance given by EH[IC(fiRL.we introduce 4.himed(Q)] influence function remains the same except that CH takes on two values (given in Appendix B) according as x 2 XR or for weak convergence of distribution.1. Let Ha = (1 . ~.g. n.if fiRL(H)is uniquelydefined and Hn-4 H (which occurs with probability1 if Hn is the empirical measureof iid observationsfrom H). The notationsHn -4 H and 2(Xn) S-. y)) sponding partitionpoints XL. v]. thate has Example4. Consequently. X independent} and either 0 is the unique median of e or X has an everywhere positive density. where U. Y = flU + c. X < XL. We prove thatfi* _ lim inf fil) _ lim sup fi*(n) C The position of the point (fi.5)H + Clearly h*(fi) 2 h(fi) 2 h*(fi).IC(fiRL.n. h*(fi) = . The influence function of fiRL is defined by IC(fRL.2.R . - lomed(Y . and the unique zeros fi*. Krasker and Welsch 1982) have asserted the importanceof boundingthe influence function of a regression and estimatorin both y and x. As with Wald's estimator. to have fiRL(HJ) = 0 or 1. .fiX | X XL). Then of the median and CH = [2g(O)pL(. = -CH /O(Y . probabilitiesPL.R = E(X IX ' XR). fi* = 0.4 ) = 0.. Y))]2.if H lies in the family of linearregressionmodels XC= {H: Y = a + fiX + e for some a.1.fix . (1.includingthose in which var XG 4g2(0) the data are believed to follow an errors-in-variablesmodel. Velleman and Ypelaar 1980. Proof. December 1985 I IX c XL) = med(Y the supportof q be the interval[. Then Appendix AL .228. then fiRL is stronglyconsistentfor fi.3 Asymptotic Normality A standardheuristicargument(e.(X) denote convergence in distribution-of distri= 0 if XL < X < XRL bution functions and randomvariables. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions .and with probability1 . H. fiRL = B shows that under mild smoothness conditions on the distri2.u. and. The result is thus a consequenceof the lower (upper) semicontinuityof the functionlomed(-) [respectively. least h*(fi) = lomed(Y . thatH E Xf. H. fIRL is not unduly affected by a .where e(x.n.200 on Tue.0 [fiRL(H6) fi&(H)]/5.a) (4.n = > < where I{x 0} I{x 0} is the influence function y/O(x) PL. fiRL.2 Influence Function h*(fi) = himed(Y . Xi 2 XR}. -1). It is enough to check.2f. and that PL. then the statisticflRi(Hj) is asymptotically normal with mean f. and I X > XR) = med(a + e) = med(a + .2fi. In particular. respectively. y . SupposethatH places equalmass on the four lim. g. Let Hn be a sequence of bivariatedistributionswith corre. (X. that lim inf 41 (b) ' appropriate(left if x C XL.himed(Y .

the optimalchoice of I. Thereforeit is naturalto considerthe ratio = {X Xi < XL}.L icallyN(0.1k). whereasif X is varianceby maximizing PL(/UR.12Xi. Inspectionof theproofrevealsthatof all choicesof sets and having variance equal to (. AL = E(Xo(X) IX ? XL)I/L. (IUR .butionwithveryheavy quantiles.oo theresistantline slopeestimator.whereX andc arestill assumedindependent. If X is uniformlydistributed givenin AppendixC.efficient)estimateof to therightgroup.for a c 2. .f(Tn. the sets composed of the smallest nL and largest nR Xi's are If EX2 < 00.(Tn .cv'pL (UL. nLIn 4 PL. n.3) can also be madearbitrarily distribution (W.27). to minimize efficiency of the resistantline (with XRalso)canbe determined XL(andfromsymmetry A. whereTL = E(a(X) | {2PL(J2R = local asymptoticminimaxresultsof The + f. these being regardedas iid copies of e .3) sityg.2) (1942). gi) T. (see Section5). i = 1./?0) > C} is equivalent (4.l[AsVar fl] = {varXG/varX} {4g2(0)/Ig}. efficiencybecomes = the PL and 1) N(0. + e) . For ThusP(\/5(fBlU .Win. 2 and F(xo) be unimodal that is f iid sample from H E 5f(. who obtained2ai [F(XL)(1uR .0. then (4. If X is uniform. the in described rule" "27% the yields ela(X) obtains. the marginaldistribution on someinterval. Furtherdetailsare inverseof F. by a factor efficiency increases of observations group (4. +(x) [where 02(xL)I'I(xL) becomes fixedratherthanrandom. 3)(a . models 4.provedin AppendixC forthe [comparetheremarksfollowing(4. andR = {X':X ? XR}. - - - This content downloaded from 132. to order The specialformof L allowsWnLto be approximated of thedenwherelg = E[g(6)Ig(Q6)]2 is the Fisherinformation by the median.corresponding independent location(compareSection5). PR). 0) of (4. the partitionpoint Giventhe marginaldistribution PL = A) is 56.2702.Z) -N(. maximized F(XL) interceptin theBrown-Moodcaseandby Kildea(1981)in the then(4. h(XL) = {fl SdF(s)}l of aboutone-fifth. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions .228.Y1).2) middle F(XL). medieR (Ei . andthatequalitycan occuronly if \/. lim inf nEa. as Section5 shows that asymptotically.Tocoverthesecasesit is convenient In the familiarcase in which{ei} are iid N(O.l)x-a Illxl > 1}. 42. andthis taking example. of m = [flPLI valuesof e.) > mediEL (ei ..3) r2X1) = WnL}.(X1. 2 YJ) YI). X < XL).4 AsymptoticEfficiency AsVar PL(JR -i)2/ flRL flLs/AsVar ically efficient. andit suffices metricabout0 andPL = PR.2) theorem Hijek (1972. nR'n and Casley (1958). . .WhenEX2 < XoandF moregeneralsettingof Section5. thatVmV(Wm.(xi .or equispaced. .cn 112X.directproofis easy. if PL 50.200 on Tue. Supposethat(X1.8% (PL regression for Brown-Mood efficiencies to maximize an appropriate neglecting that so and 40. respectively. then h(x) fails to converge to zero as x .1) undernormaltheoryassumptions. n--oo optimal(in the senseof minimizingasymptoticvariance). . correspondingly.flu) > c) = P(Z > W) + o(l). and Gibsonand Barton (1949).Let{Tj}be anysequence lineestimator sistant to showthatVn(flRL conditions) of theregularity modifications estimatesof . anycriticalpointxo E (-.6%. and a sufficient condition for xo to A. then supXL<O leadsto the claimedconclusion.248.J).ZZm)-4 densities for X with tails of orderx-(3+e) for e > 0. to usetheBeny.AL)g(O)} )varianceof the Waldestimator(1. by NairandShrivastava Specificcaseswereconsidered continuouspositivedensityat its median. is an /tL(XO) = 2X0.(x = 4.{2PLCLR . Similarlyone iL)2g2(O)} 1[TL + R-2]/2. If F(x) points. The ratiodependingon the X theoryformediansof iid variablesextendsto show asymptotic small. /P. If a heteroscedastic Introduction.fl)22 {varX E[g'(g)/g(6)]2}-I = U2.ULis interpreted of I only on the marginaldistribution PL'1fPL F-L(p) dp. Similarresultswereobtainedby Hill(1962)forslopeand the choice to lends which support = by A. .thenthe Gaussian F of X.1)(a . .UR- PL)Y PLPR/(PL + PR) or L and R (with L < R) such that nLIn -_ PL and nRIn-_ PR. butratherby usingsample there tails. XL by imized modelof the formY = a + fiX + 3. partition the as "thirds" of contextof weightedmedianswhen the Xi are consideredas is maxwhich = ?'(x)]. 1). Considernow the asymptoticrelativeefficiencyof the rethen the argumentof AppendixC extends(with approriate formodelsin 5Cr.2) is points. and = .2) satisfies maximize Theorem4. thenJUR = = 2) are47.IiL)2].Standard replacing of Wm. in theory. This situationcould.Bbasedon samples(X1. however. groups can be a loss of efficiencywhenthe carrierhas a distrireferenceto fixedvaluesXL remediedby choice where2(XL) = ?(X I X C XL).5%. [4g2()] 12). the least Esseenmethodof AppendixC.3.6121. Suppose also that E|XI < cc.PiL)2 PLI2 in the symmetric h(XL)O* 0 as XL We statea versionof the asymptoticnormalitytheoremfor lxi. Thus uL)2/(2 varX) = (a L andR arenotchosenwith PL(IR the - Inpractice. to (P(XL) = corresponding . and(forconvenience only)PL = PR.Johnstone and Velleman: The Resistant Line and Related Regression 1047 Methods inequalityshowsthat thenthe Cauchy-Schwartz > tails of F vary as the if However. whereF -L(p) denotesthe left continuous divisionpointsdepends residuals the to applied M estimator the particular on not and of X. andZR. a + fiX. oo* case whenPL = PR.* If F is symasymptotic The corresponding = . Bartlett PR.We list a couple of these cases and remarkthat Remarks.if fx(x) = A(a . spaced equally at mass equal or places 2.. e has a (4.then. Tn(Xi. invariant regression of fi) is asymptoticallycenteredGaussianwith variance of size n. that is asymptotIn the specialcase thatXLandXRarechosenin advance. in Xf. op(n"/2) appropriate small by made arbitrarily be can (4. of g. Then -> Jowett(1957a).6% UL. If PL $ P(X < XL) for someXL. Clearly cn.Raredefined smooth implythatforapproximately 4. hasa densityf.with (r varX).2)-2/4. the medianwith an (adaptive. x)2 is asymptotsquaresestimateI3LS = .as the (fhRL /1)-* Gau(O.2) = ?I(x). (Xn.cna2.x)yIY.however.5% (rising to 51. 6) to {Z"R= (compareSection The event {\/(fRl.

The proof conditions on the observed X's and uses a Berry-Esseen theorem to obtain a uniform normal approximation. untilconvergence. recomputea. Y). In practice. and e has density g with t(&Xe))= 0. December 1985 analogousto that of flRL. where TR(Y -bX) is a mnemonic for t(FR. for example. (We make some remarkslater concerning redescending y/. Let t(F) be the M estimatorof location obtainedfrom a monotoneincreasingfunction V(x) by solving A(F.Underappropriate regularityconditions. if known (even approximately).For example. t) = fy. might then be replacedby TL ( x) TR ( bx) (5.248. then A class of regressionestimatesthat generalizesthe resistant andBrown-Moodmethods line.TL(yL. use of y/(x) = x{IxI< k} + k sign x{IxI> k} [which is minimax over contaminationneighborhoodsof the Gaussiandistribution(Huber 1964)] with k = 1. one would either estimate a once and for all. or iterativelysolve for b. a was recomputedwith each choice of b with no apparentill effects. so the choice of yg could be tailored to the distributionof e.bxi. yi). are available.This facilitatesthe proofthatT*(F) and T*(F) are semicontinuous(see AppendixA). and so on. A similar treatmentcan in principle be based on Lor R-estimators. .2 remain valid (with the obvious changes) underthe extra technical restrictionthat ypbe bounded.B. t) > O}.5)H+ c5I{yl. . i = 1. X. and write FL.bxi) . It is apparentfrom (5. In the presence of heteroscedasticerrors. then fJB. and so forth.1).228. and care is requiredin estimatingthe scale.4) The regularityconditionsand proof are detailedin Appendix C. The numberof points nL (respectively. So=0. a [compare(5. then asymptoticnormalityof fl&can be establishedunderfairly generalconditionson V. and the partitioningrule.B --a) a for W( g dyi JUL)f PL(PR = +_ )_ _ x0 0 forXL < X < XR.1).5 raises the ARE with respect to least squares(for uniform X and Gaussian Y) from 57% to 85%. Locating the zero of such a functionis possible using (say) the ZEROINalgorithm as described in Section 2 and thus requiresthe same order of computationeffort as computationof T.V(y-fix . if H E NCwith T(Q) uniquely defined. can be definedby replacingthe mediansappliedto the residuals in the definition of the resistant line with an M estimatorof location.bxi). and EIXI< 00. A(t) = EVi(e -t) be differentiableat as t --> o. andeitherEX2< ooor A(t) = O(ItI) not the most general possible. because y/ is monotone) soThe use of nonmonotone("redescending")V-functionshas lution to been advocatedto afford greaterprotectionfrom outliers. If yg is bounded. is one third of that of the correspondingM = n13.1) where TL and TR indicate that T is applied to the set of those residuals. The gross-errorbreakdown value of fJq.b X IX ' XL and Y . t) < O}> T* (F) = sup{t: A(F. with indexes i correspondingto the nL smallest and nR largest Xi.) As before. we use T*(F) = inf{t: A(F. we proceed as follows: Let XL < XR be lower pLth and upperpRth percentilesof the distribution of X. VT/varXG). The statementand proof of Proposition4. yi . in homoscedastic situations. can be discussed in a fashion exactly This content downloaded from 132. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions .2) frxx(x>2 XR (X _< x) XL) y.B)-0Gau(O. i. For the preceding existence/uniquenessresult to hold. however.(x . Define fl.TL(Y -bX) h(b) = TR(Y -bX) lutions are no longerguaranteedto exist or be unique. Let T be an M estimatorof location with monotoneinfluence function yg.(5.however..b = FR.To replace himedF and lomed F. THECLASS OF v/-RESISTANT Statistical Association. and (5. Let VT denote the asymptoticvarianceof {Tj}. these hypotheses cover the extremesin which Tnis the medianor the mean (at least for most e). If the observations (Xi. either unimodalityassumptionson the errordistribuWithout The consistency of fly.b(H) and FR. estimatorif nL = - Remark. (x.b = FL. Then B- . the argumentin Appendix2B shows that if H E . respectively. The monotonicityof y/ and translationinvarianceof T ensure that h is monotone decreasingwithslopeat most X(nL) X(nnR+l ) whichassures the existence of a unique solution to (5.b). one would have to estimatethe scale of the residuals also. (5. In particular.In the simulationof Section 6.2)].3) Thus the influence function is (for x values in the outer thirds) a constant multiple of the influence function of the location estimate T. Let H8 = (1 . suppose that bivariatedata (xi.b X IX 2 XR. To define the yi-resistantline as a functionalon theoretical distributionsH of the pair (X.(Tn)as the zero of the function IC(.is strongly consistent for .4) that the asymptotic relative efficiency (ARE) of two V-resistantlines is the relativeefficiency of the two correspondinglocation estimators.b(H) for the respective distributionfunctions of Y .1048 Journal of the American LINES 5.t) dF (x) = 0. can be based on all the data. y)) h(b) = TR(yL. = fly. a could be replaced by CL and CR estimated separatelyin the two groups. where 'a is a robust estimate of scale that. Y1)come from a model H E 5C. fnR) in the left and right groups need only to be chosen so that nLIn (nR/n) 4 PL (PR). . Essentially the proof requiresthat V be monotone. H. solve for b. possessed by the ordinary resistantline.(H) is defined as the (unique. (5. V'(e t) have third moments. The yt-resistantline estimator Redescending v b. n. Although t = 0. .Bi..200 on Tue. the V-resistantline retainsthe propertyof bounded influence in both x and y. Wald-Nair-Shrivastava-Bartlett. This class is introducedbecause of the substantial improvementsin efficiency that can be achieved without significantlyaffectingbreakdownor (in some cases) computational expense. . while retaininggood breakdownbehavior.

g. Technical details of the experimentare reportedin Appendix E. The LAR line performssimilarly to the RLBW. In this sense.69 and .g.. function and local scale (RLHU3).73. The "slash" distribution. using a Monte Carlo experiment.1) and 10% Gaussian(0. this estimatorcan be computed by hand if a programmablecalculatoris available. The methodof Theil and Sen. .1). The efficiencies have been computedrelative to the Cramer-Rao lower bound for each situation. Rousseeuw. XI((Section 4. the resulting estimators need not even be consistent (e..has infinite variancebut is not as peakedas the Cauchy density. but we will pay about 5% in efficiency if that assumptionis in fact true. a mixture of 90% Gaussian(O. RL. Nevertheless. RLBW.generated as a unit Gaussian variate divided by an independentuniform (0. This increase appearsto derive from the similarbehavior of the median. and "slash"]. 6.200 on Tue. The actual efficiencies would be obtained from the variance of the Pitman estimatorfor each situation. The methods studied fall into three categories: I. least absolute residual (LAR) [algorithmfrom InternationalMathematicaland Statistical Libraries(IMSL) 1980].228. Siegel (1982) reportedefficiencies for the RM at sample size 10 and 20 (for fixed x. 1)] by three errordistributions [Gaussian(O.and their computationsare usually too complex for hand calculationand can be too expensive to apply to large data sets.RLBW3. We also developed a new swindle for this studyto improveperformanceon non-Gaussian errordistributions. which is 100% efficient (relative to the mean) in Gaussian samples of one or two and declines in efficiency to an ARE of 2/yr. SMALL-SAMPLE EFFICIENCIES We studied small-samplepropertiesof 10 simple regression methodsunderthe standardlinearmodel assumptions. Furthermore. usually exceeding 60% efficiency.9). the standarderrors of our efficiency estimates are as small as those of a naive Methods 1049 simulationexperimentwith between 3 and 8 times the number of replications. which could be well estimatedby the score functionswindle (compareJohnstoneandVelleman1985. rewardthe 0(n2) computingexpense with higher Gaussianefficiencies and (for the repeated median) higher breakdown. RLHU3). The Gaussianmixtureis one for which the mean and median have approximatelyequal asymptotic variances. The reverse trend for slash errorsappears to be due to breakdownof the median for the samples of nl3 points in the outer partitionsof small samples. It clearly dominatesat moderatesample sizes among estimatorswith boundedinfluence and inexpensive (0(n)) computationorder. resistant line (RL).Johnstone and Velleman: The Resistant Line and Related Regression tion or a lower bound on the tuning constant. perhaps 20. A heuristicargument(outlined in Appendix D) shows.the Gaussian efficiency of the biweight is close to that of the optimal "hyperbolictangent"redescendingestimators(Hampel.53 and . For its favorablecombinationof properties. typically by factorsof 3-8. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions .v-resistantline using a fully iterated Huber y. section 4.-resistantline using one biweight step (c = 6) to improve each median and the overall median absolutedeviationfrom the median (MAD) as a scale estimate (RLBW). 22. The increasein efficiency over asymptoticvalues it exhibits for Gaussianand mixed Gaussian errors at smaller sample sizes is a pleasant benefit in a technique often computed by hand for small data sets.4) is preservedfor a wide class of nonmonotone yi-functionsincludingthe biweight. Their efficiency advantage. III. under regularityconditions ensuring asymptoticuniquenessof the solution.and Bartlett(NSB). The swindles increase the effective numberof replications. the biweight '-resistantline (RLBW) performs quite well.248. The experiment design was a complete crossing of three sample sizes (n = 10. for Gaussian X. Table 2 presentsthe small-sampleefficiencies for fixed equispaced x. is eroded at the more severe error distributions.) Few small-sample results have been published for any of these methods. Among the estimatorsordinarilyfound only with the aid of a computer. The experiment employed a "Gaussianover independent" variance reduction "swindle" (for example. resistantmethods with computingeffort O(n2):repeated median (RM) and Sen's version of the overall medianpairwise slope (SEN). the y. resistant methods with computing effort 0(n): BrownMood regression (BM).-resistantline using one biweight step but a local MAD computed only within each partition(RLBW3). II. BLHU3) partition-based were computedusing a modified version of the code published by VellemanandHoaglin(1981) for the resistantline. Except for slash-distributederrorsat sample size 10 (where all estimators-predictably-did poorly). and the partitionedline method of Nair. (Thatis. and the . We have included a iy-resistantline using a onestep biweight (e. see Simon 1976 or Goodfellow and Martin 1976). preferably40) if very heavytailed errordistributionsare suspected.we preferthis estimatoramong those studied here. the efficiencies reported for the mixed Gaussian and slash errordistributionsare conservative. we can relax the assumptionof homoscedasticerrors. The efficiencies reportedin Table 3 for randomGaussianX This content downloaded from 132.The swindle is defined in Appendix E and discussed more extensively in Johnstoneand Velleman (1984. Shrivastava. so 100% efficiency will only be attainablein the Gaussiancase here. 1985). and 40) by two x-configurationspecifications [fixed equispaced and randomGaussian(0. and Ronchetti 1981). such estimatorsperform well in practice. The resistantline is thus appropriatefor hand calculationon small samples with the caveat that data sets should be a bit larger (at least 15. Because the biweight is a single-step improvement of the medians in the resistantline. It has the highest maximin efficiency ("triefficiency") at sample sizes 22 and 40 among methods studied here. and Siegel's repeatedmedian. BM. the resistantline performs well.61. Diaconis and Freedman 1982). All of the lines (NSB. see Mosteller and Tukey 1977) in the experimentdiscussed in Section 6. y. nonresistant(breakdownvalue = 0) methods with computing effort 0(n): least squares (LS). thatthe asymptoticnormalityresult(5.2). . however.)Occasionally(especially for the more efficient estimators)swindle gain factors reached 160. respectively) that agree with ours. Its primary weakness is that it is not resistantto the effects of large fluctuations at extremex values. 1) variate. When the scale estimate is computed only from the local partition(RLBW3.

y .3) 0 n= 0 77. global MAD.2 (.6 (. Xpi.0454 . RLBW is w-resistant line. Yj) are available on p carriersand one response variable. italicized values are derived from theory. local MAD.5) 100 Triefficiency Slash MixedGaussian 0 0 62. parentheses.9 (.7) 68.9 (. scaled to 1.2) 72.0 62.4 (.5 (1.8 (.1) 59. The method clearly depends on the orderingof the x-varimultiple-regressionfit to have zero partition-basedline slopes against each carrier:for p carriers(includingthe constantcar. c = 6.10 .9) 16.7 61. .0 86.5) 68.7) 73.4 (1.Rk.3) 67. D = [XI. Yaren-dimensionalcolumnvectors.0 86.5) 70.2 81.4854 .some of the loss is due to the suboptimal placement of XL and XR at the 33% ratherthan the 27% points in x.5 72. 50% (RM).025 56.5) 69.3 (.2 (.7 75.but they are somewhat lower than the correspondingfixed-x efficiencies.2 (.0 (.1 62. R2.3 (.bkXk. with ax. . If desired. local MAD.0 (. SEN is Sen (1968).0314 52. IIl-breakdown = 25% (BM).2 (.8 ( 7) 77.6 0 36.0 (.5 (.4) 91. Journal of the Amercan 1050 1985 Table2.8 54.1) requires residuals from a process can be iterated.6) 82.December Statistical Association. results for n = 40 are based on 2.5) 63.1 (.8 62.2) 53.5) 77.9 (. follow the same general pattern.5 37. so the one-dimensionalresistance and breakdownbound are p retained.2 79.7 52.5 (.g.5 (.000 replications. All the datapoints are used in each sweep or projection.1) 62. a (7.5) 79.8) X n = 10 n = 22 n =40 69.1 (. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . boldface values are triefficient (minimums at sample sizes across three distributions).6 (.7 81.0 58. 16% (others).8 69. We seek p ieRj T(ej) = 0 (7.0 60.5 58.0 (1.7) 65.7) 93.0 46. MAD is median absolute deviation derived from the median.6 (.4) 100 100 63. Various internalchecks suggest that the other entries in the tables are reliable. MULTIPLE rier) and coefficient vector.ables. whereXV and Y(c)are the currentcontentsof D.9 88.6) 71.261n .7 62. Theil (1950).8 67. c = 6.5) 8. cost = 0(n).0 (.9 (A4) 53.000 replications. and it is not affimelyequivariant.3) 51.8 1. RL is resistant line (Tukey 1970).2 (1.7 (.2 (. RLBW3 is w-resistant line.3 (.0 (. one-step biweight.4 ( 7) 56.8 111 BM RL RLHU3 RLBW3 RLBW CRLBvariance .4 (. Suppose that n observations(X1I.9 (.4 79.8 (1. .8 58.6) .B.6) 95. .2) 61.. Rj) defined by the rank orderingof carrierxj.6 85. where bjk and bk are regression coefficients derived by'some method such as a yv-resistantline.1) 29.6) 47.200 on Tue. Andrews's sweeping method is one way to approximatea solution to (7.1) 37.8 (. Although hand computationis no longer feasible.1256 .3 62.4 (A4) 76.0 (compare Appendix E). RM is repeated median (Siegel 1982).0) 37. NSB is Nair-Shrivastava (1942).3 (A4) 73.Usuallyone mustchoose This content downloaded from 132.6 (1.0 ( 9) 35.7 (1.8 NSB 89. RLHU3 is w-resistant line.5) 64. LAR is least absolute residual.7 (. through Rp in turn [a total of p(p .g.3 (.7) 92. For the resistant line family estimators.1) 62.2) 72. The tentative proposals here have not yet been evaluatedin detail comparableto the one-dimensionalcase. Applying RI..1 88. define e to satisfy the p simultaneousconditions T(ej) = T(e1).6) 60.1) 0 n = 22 n = 40 X 0 0 60.9 56.x.9 88.0 (.2 (.6 (1.5) 69.0 (.8 86.6 0 0 59.8 (2.3) 75.5) 68.6) 63.0 n = 10 n = 22 n = 40 n =X 63.4 ( 7) 69.2 68.6 (2.5) 73.248.5 (1.8) 46. Beaton and Tukey 1974) or projection pursuit (Friedmanand Tukey 1974).9 (.2 1/n 51. standard errors are in one-step biweight.1 (.3) 68.9 63.. 1k-breakdown = 29% (SEN). .0 (1. c = 1.9 (1.0 (.5 (.8) 58.9 (1.851n NOTE: I-breakdown = 0%. Efficiencyof Slope EstimatesRelativeto the Cramer-RaoLowerBound of Residuals Distribution Gaussian Fixed Equin = 10 n = 22 n = 40 n = spaced x LS LAR 100 61.9 68.5.9 70. Yn.1 (1.1) 54. when applied to D replaces each Xj(j > k) and Y by Xj . x2.0 (A4) 65. The kth sweep operator.3 0 0 il SEN 95. cost = 0(n). Bartlett (1949).5 81. Arrange the data as rows in an n x (p + 1) matrix.l.1) for M-estimatorT and partition sets (Lj.8) 78. X.0571 . Gibson and Jowett 1957b).228.1)/2 + p univariatefits] and retainingthe coefficients bk leads to the fit Multiple-regressionmodels can be estimatedwith partitionbased lines by using the univariatemethodsas a primitivefrom which a multiple regression is built (e.4) 87.4) 89.6) 61.3) 65.000 replications.4) 49.3) 52.1 (2.9 (1.2 (1.p.2 RM 69. by "sweeping" (Andrews 1974.4) 50.7 51. the One extension of Equation (2.8) 70.4 (.2206 . Huber v/ (e.2 (. for example.1214 4. BM is Brown-Mood (1950).6 (. Huber 1981).2) 73.2 (1. .8) 66..8 (A4) 49. respectively. relatively simple computationalalgorithmwith fair efficiency k= 1 is possible. Results for n = 10 are based on 10.. ieLj = j = 2.1 (.8) 51.whereXk. cost = 0(n2). We have been unableto accountfor the discrepancybetween the trend of the small-sample efficiencies and the asymptotic values for the Huberestimates for Gaussianerrorsin both situations. results for n = 22 are based on 5. REGRESSION 7.1).8) 63.9 (.8 69.2 49.7) 17.6) 65.bjkXk and Y . .4 65.2) Yi = Y(C)+ I bkX(kc). CRLB is (naxl(g))-1.9 62.5) 16.

-resistantmultiple regression will * probablyprovidea betterstartingvalue for the Krasker-Welsch iterationthanleast squaresor least absoluteresidualregression.000 replications.8 (1. and simply H.9) 54.4 (.. FH.2 (1.9 (. RLHU3 is w-resistant line.0 (.4 58. Bartlett (1949).4 51. y) .5) 79.7 (.0) 91.3 49. then t 0 NF. bk .) -+ H(A. y) .2 (9) 70.8) 50. .(X < XL.)-2(Y point of the latter.6 (.4 (. x.H.8) 54. global MAD.8) 70. which might be a trimmed sum of squareddeviations (Yi . projectionpursuitgenerally requires more computationthan sweeping. italicized values are derived from theory. Second. obtained by conditioning on X.6) 105.7) 1.8 44.8 (.) = 0.H(A.) . Theil (1950).5 16.0) 62. and XL. it will suffice to show that A(Fnst) -* A(F.(x.7) 78. H. t) > 0} is lower semicontinuousif V is boundedand monotone. is countable.7 55.Hn(A. results for n = 40 are based on 2..7 46.4 (1.8) 64.248.5 (.Whengreater efficiency is desired.(A.2) 43. RL is resistant line (Tukey 1970). x. where N. Results for n = 10 are based on 10. No such multipleregressionwill compete in efficiency with.4 (. If F(N.0 (1. .0 (-9) 63.261n .] X R}).0 (1.Yi(u))2. and let A = {Y .8) 69.1 43.0661 44. RLBW is w-resistant line. . The requiredconvergence follows from the bound IH.7) 80.8) 60.The direction u* is chosen to minimize a resistant "figure of merit" I(u).0 11 RM SEN 91.9) 64.200 on Tue.5 (.2554 .6 (.5.3 (.0) 69.n-+ PL and thatAj( .8 (.8) 54.0) 64.5 56. Suppose thatFn-4 F.7) 53. 50% (RM). t) for t E R\NF. Huber v/ (e.9) 55.1 (1.y) .6) 100. for example.0 (1.8) 72. one-step biweight. This content downloaded from 132.1 32. x.0) 85.3 (1. y) is a continuity set of H.(xn) when HJ(A) = 1.H. xn)j + IHn(A.5 34.5) 52.3 56.3 (1.the breakdowndoes not decrease with increasing p (Huber 1981. c = 6.7 (. and x for XL.oo. For this purpose write x.(A.4 Triefficiency n = 40 n = 0 0 0 X n = 22 n = 40 0 0 53.7 63.0.5 77.(A.0 (.1 (. H).7 0 69. .1) 46. 2(Y .1 50.7 (. boldface values are triefficient (minimums at sample sizes across three distributions).9 (1. X CXL). Now N.. they provide a less expensive exploratory-qualitybounded-influence multiple-regressionestimatethatwill be sufficientfor many applications.5 67.5 (1.7) 52.4 55.9) 40. where NF is a countable set possibly dependingon F.2 X Slash n = 10 n = 22 n = 40 n = 100 80. c = 6. choose y > x.H(A.2 71.2 71. 8).7) 35.4 (. RM is repeated median (Siegel 1982).2 63. Efficiencyof Slope EstimatesRelativeto the Cramer-RaoLowerBound Distribution of Residuals Random GaussianX LS LAR NSB Gaussian n = 10 n = 22 MixedGaussian n = 40 n = 114.t) is bounded and continuous on R\N. Let z be a continuity bX IX S xL.2) 30.6) 7.6 (1.8) 82. so it remainsto check that H"(A.8) 65.)l ' IH.1 (.2.8 40.8 (..7 (.3 (.8) 50. y)l + JH(A. NSB is Nair-Shrivastava (1942).e.0 (. cost = 0(n2). so thatH(x) < H(y) < H(x) + c.8) 63.8 (.3 40.5 .8) 62.7 60.2) obtainedfrom a resistantline primitive are n consistent [i. one-step biweight.3 (1. Huber 1981).4 55. .8) 79.5 58._) = PL.0 (.4 (.7) 60.1429 .3 (.5 111 BM RL RLHU3 RLBW3 RLBW CRLBvariance .5 51.0) 73.0) 71.9 61.5 62. RLBW3 is w-resistant line.8 ( 9) 67. In contrast with some more elaboratemethodsbased on robustcovariances for the carriers.8) 64.4 (1. the boundedinfluence regressionsof Kraskerand Welsch (1982).7 44.1312 4. X C XL.0) 46.3 0 0 fixes a unit vector u E RP and finds a univariateresistantline Yi = bRL(u)(u*Xi) + aRL(u).bX C z}. local MAD.8) 42. y) .3 50.8 77.7 (-8) 84.000 replications.and this implies that NF C {t: F(N. since V is translationinvariant.(A.0) 73. LAR is least absolute residual. In the projection-pursuitapproach.7) 62.2 (1. an orderingfor the variables.7) 61. APPENDIXA: ADDENDATO CONSISTENCYRESULTS We show first that under the condition of Proposition4. x)l C IHn(A.2 62. local MAD. under mild nondegeneracyassumptionson (XI.0526 . note that the monotone function x -+ V(x . y a continuitypoint of H.Johnstone and Velleman: The Resistant Line and Related Regression Methods 1051 Table3. Ill-breakdown = 25% (BM)..8) 88.8 (. However.Of course.0) 67.3 (8) 69.4 (.6 (.4 (.7) 7.0270 1/n .3 56. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions .7 45. c = 1.5 27. SEN is Sen (1968).0339 1.6 (.n -+ PL = H(X CXL).6) 56.9) 7. we prove that T* (F) = sup{t: A(F.) = F(NO + t) > 0} is at most countable. XJ) that the estimatedcoefficients in (7. standard errors are in parentheses.000 replications.3 58.8 41.0 56.9) 54.4 (1.0 (1.1 (.9) 71. for example.8 37. The resultingcoefficient estimates are bRL(u*)u* and have the advantageof being orthogonally equivariant.0 ( 3) 62.0 (.6) 56.7 77.0 (.H(A.0 n = 10 n =22 X 69.6 (.8) 58.0 32.) . CRLB is ((n .3)aXI(g))-1' with ax = 1.bX IX < xL.5 (1.1794 . To see this. .0 (-4) 62.g. It can be shown.7 (1.4 42. = NO + t. Xk) is independentof e. cost = O(n). xn) for Hn(A nf {(-oo. and given c > 0.0 ( 9) 63.6 (. Emerson and Hoaglin (1985) discuss a version of the swept resistant-linemultiple regression and present some examples.3 76.3) 50. results for n = 22 are based on 5. chap.Sweepinghas the samebreakdown as its univariateprimitive(6 if resistantline is used).6) 0 53.851n NOTE: I-breakdown = 0%. where the carrier(XI. x)l together with the bound IH. BM is Brown-Mood (1950).6934 .0 62. the y.9) 78.1 (.(y)l and the assumptionsthatPL. 16% (others). . Now let the observationsform an iid sample from the model Y = 'l gkXk + 8.6 (. By hypothesisH.0 (1-6) 27.9 30.4 0 0 46. cost = 0(n).13k = Op(1/ V)]. 1k-breakdown = 29% (SEN).2 (.228. Using right continuity of H(x).0 50. MAD is median absolute deviation derived from the median.2) 0 63.0 71.

b(O).tL(3. EIV(c + w + cX)|3<<0 for all w.i = A[n-n12(w + Z. I1R) and covariancematrix [A'(0)]-2Ef x .B) > c} = {TL(ei . /3. tL(5. b) = 0. is 0. t) and hence = D3gR D3gL at (0. Write FR(x) = P(X <x |XX xR) and UR for P(V TL S? w.. 1).{ti - n j (w + cX)} < 0 (A. = {x = v(v)dFR.y} implies that t X ? XR) = (1 . E2( 3. - n- This content downloaded from 132. It IC(b.i = var(Zni). ?TR S Z) + E(X I X 2 xR). Analogous results y(y DlgR( b(O)). To evaluate (A. In this case. 1) leads to the inapproximate (A.where Tn* < 0.cR)FR. (Al) (O.b(Z).3)H + 3{X.I. b(O)) = tL(O. If I(v)FR. and we give only the resultfor the ordinaryresistantline. When PL = P(X < XL) it may be shown that /1L = E(X I X E(X I X 2 XR)when PP = P(X > xP). TR(ci - ) cn. 2iEL a straightforwardcalculationshows that < Thus D2gR(0. Write We use freely the notation of Sections 4 and 5. t).b(tO)1 fL.(H6) is defined implicitly as the solution to h(. Suppose that (X. It will be + v + seen thatit suffices to consider(A.t) > 0} and T** = inf{t = sup{t O}. we have IVn(/ . . Conditions 1 and 2 involve y and the errordistributionand are relatedto conditionstraditionallyused for asymptoticnormalityof M-estimatesof location.b(V A calculation shows that D6aR|1=0 > XR}IPR X P(TR( i- b. Either EX2 < X or A(t) = O(|tJ) as t -> oo. Assume for now that h E 5C.i =EIZni . z and that the necessarydifferentiationsand interchangesare valid. pR1).7) .. b(O)) = _ DlgR [D2gR D2gL] D3gR LD3gR D3gL tR(O.4).cn-112Xi) < w/I XJ) O as v + t)] + X. In the case of the y function correspondingto T(F) = med(F).g. b. b) = t(FR. = and set n-12(w + cX)}. for some is the probabilitythat a wild point is drawngiven that it lies in R. /3.'ID(-unSn)J 6rnIsn. 1). where as always L = /1. which the result follows easily.b(tO) JfR. and pn. < From the Berry-Esseen theorem follows JPL(Xn) ..i and impose the following conditions Yi)is an iid samplefromH E SCQ. b) .3). to) a){x 2 XR}IPR.Pni|3. cXi)].. t0) = (pRL H. nRIn --> PR2.Journal of the American Statistical Association. TLand TR are independentso that It remainsto find D2gL and D2gR. Conditions3-5 are convenient for obtaininga result for randomcarriersand partitionsof data-dependentsize.6.b(o)) LDlgL V(Zi . from = /3 -C(P1L.= Tn(Z1s. dFRb. < XL). From the monotonicity of h [see (5. (Z- 0 < w) C {Tn W} C { V(Z.we show for the V-resistantline estimator = . eiThe influence function takes a more complicatedform if H 0 S.6. and {x 2 XR}I[(l a= + 3{X 2 XR}] -)PR b(O). where we write {A} instead of I({A}). y)) V dFR(z)zg((b -)z dV(v) f' fl.n. g(v) dV(v). In turn.b(t) = HA(Y.fl-PR FL(p) dp.2) may be sandwichedbetween + t) X cn-112Xi) < zl/- f + [- f FR.200 on Tue. Conditionalon Xn. Note that all of these quantities are functions of Xn. fluence functions (4. then AR= = XL.b. yI is monotone. P J.O A = t = tR(O. (A. p. which in view of (A.b(Z).(A. b(O)) = t(2(c + a)) = a. with u(x) 2 0 for x > 0 and u(x) s 0 for x sZn) is any point in [Tn*.12X)) and hence that converges in probabilityto a jointly normal distributionwith mean 2diag(p[. to). V p/(Zi . -t) is continuousat 0. so it is easy to check that b. then FR. + n-1'2?i~)}2. Underthese assumptions. If Tn.b(H6) b)]. and similarly Proof.4).1) and (5.2) t) is continuously differentiable at t = 0. note that H6 = (1 . { b). (x. Apply the Berry-Esseentheorem and a similar result a).(z) = D3gL where to = tL(0. o.Tn**]. hold for DlgL. c. t) = f ig(z . APPENDIXC: ASYMPTOTIC NORMALITY { i X(i) X(nL)} 112X) < TR(ci We show \/-(TL. b(O).b and hence gL(.bo(to) XR J] 1fLx 5(t) DbFL. sn = YiEL on. (x.S) and /n/ Sn -4 .(O) = fJ and FR.b(v t) -*0 as v-* -00 and V (v)[1 = E[P(TL(ci . t) = t- - D2gR(0.i = EZ. nLn -4 PL.4) iE-L and the correspondingexpression with nonstrictinequality.3) can be handled analogously. defined implicitly by gR( orderto simplify (A. In this case 2(c + a). December 1985 1052 APPENDIXB: DERIVATION OF INFLUENCE FUNCTION For brevity in what follows we shall assume that the functions andy(z) of Section5 aresufficientlysmoothin b and FL. (ii) the function t - Commentson the Assumptions. = and rn = YiEL pn. Assume now that c has a density g and that E|X| < 0. This implies that gL(0. .6) 2(Eyi2)-}I/2 We begin by showing that SnlIn --pLE V2. where FL(P) is the left continuous inverse of the distributionof X.. snIn = (nLIn)Ef. 5. bo(to) J where bo = bRL(O) and Db denotes partialdifferentiationwith respect to b.b(V f = v(o)- + t) dV(v) FR. The estimate b(3) = b. 4.2 + n-1 E E[VI2(ci + n-'2ai~) on X. Here AL = PL1 fr FL(p) dp and P1R = PR-. b.1)] and the model Y a + fiX + c.bX bx s t}. - TR) = \(TL(ci cn X- cn-112X).t) follows by implicit differentiationthat tR(3..6) and = FRb6 [and similarly for FR. Feller 1971. and the yi function correspondingto Tn. (i) A(t) = EV(c- with A(0) = 0. (A. the results will hold except on an easily identified null set.Now the second factor in (A. t) = gR(O. A similarexpression holds for FLa6. y)) = b'(O) = [ -DlhID2h](o./3) is asymptoticallycentered Gaussian with variance [PL + PRI] (PUR -L2E V2//[i(O)V] . t) gR(O. that \/(fl. OfA = FL. Insertingthese expressions in (A. so an appeal to the bounded convergence theoremreduces the proof to showing that rnIsI -4 0. -AR for independentrandomvariables(rv's) (e. Now write Pu. pu. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions z ieL {EV(c.b(V o PL(Xf) + t)] dV(v). 2E) ieL 1. where tR(5. (y-box-to) 2 ~ f(X [ L DbFR.3) Consider for now the first term.(W + CIiL){pL(IV(0)) (A. b. c E R.t) < - W) c< O (A. b) = tR(3.b(t) + c4{y FR. Let E denote conditional expectation given Xn. 544) to holds for D2gL. FR.b(V I XJ)].

and Casley. . y. This content downloaded from 132. i = 1. so comparisons of efficiencies among estimatorsin the same run are somewhatmore reliable than the reportedstandarderrorswould indicate. Convergenceof ProbabilityMeasures. .803) and the BoxMuller Gaussian transformationin GGNOR. except that one appeals to the continuity of A(t) at 0 [and A(0) = 0] instead of condition 2 (ii). J.. ElYg(n-'2Y)I . Suppose thateither (i) Ey2 < X0 or (ii) E|YI <0 and g(y) is bounded. I Barton. Assumptions 2 and 3 ensure that the lemma applies to the present situation. 00) -* R that locates the minimum in [-M. "Some NonparametricMedianProcedures. that APPENDIXD: HEURISTIC ARGUMENTFOR REDESCENDINGl Tests for Regression Parameters. . Then (1/n) Ln Yig(n-1/2Yi) -4 0. Reading. .) that nLIn . nL-I ILn L) E 0(1). Let g(y) be a function such that g(y) -O 0 as y -O 0. 884-893. . . It uses a variancedecompositionfor regressioninvariant estimatorsb(y) of /3: var b(Y) = E(X'A2X)-I + var b(e). Bechtel. using Huber Adichie. 431-435. M] 73. be iid rv's.) and set a = 1 in (5. of y -| x(y)l (with conventions to handle nonuniqueness and jump Beaton. .yn-1/2xi).0. a2 conditional on W.). For a simple linear regression model Yi = a + fl(Z. The Design and with an analogous result valid for TR-. [ReceivedMay 1983. V2(c + /N/\)I[ V2(c + tii/N. W. M.t'(s) it(0)] ds + nx l (0) (W + cX). "A Quick Estimateof the Regression to the Gaussian with paths in D( -0. 147-185. a value of t minimizingIm y(yi . x2. P. errorswere generatedusing the IMSL (1980) linearcongruentialran. (1968).2). Revised March 1985.u)(log g)'(ci). L. Then under further smoothness and integrabilityconditions. Underappropriateconditions Andrews. . Xn. Sn = z7 I (Xi- (A.YiJ n-1/2 -4 0. F.. Trials were run on Cornell University's IBM 370/168 computerunder a modified OS/MVT operatingsystem using CMS.. 523-531. n}. "AsymptoticEfficiency of a Class of Non-Parametric (1967).t)I in a prespecifiedinterval[-M.+ var b(Y -L f where L = naX (gX)2/g.Johnstone and Velleman: where qi = - The Resistant Line and Related Regression (w + cXi). Xn(y) Subject to Error.A'(0)] ds. Bhapkar. (1958). 5. (1974). "AsymptoticTheory of Least AbsoluteErrorRegression.200 on Tue. maxI_i. Equation(A.8) where A = diag(Wi) and e = (Y .. .7) tends to 0 in probabilityif W== E2(+c + n -112)A2 EV2(c) = W. Aho. .. J. thesis. whereas in case ii. P. we conditionon the observed sequenceof carriers details and discussion of the relative merits of (A. (1967). D. "TheResistanceof ThreeRegresN/_nTL 8i - yn 112(xi - A-L)) -4 N(0. and Ullman. Define T(y. this is a consequenceof the next lemma applied to Yi = w + cXi and g(y) = fy [{'(s) . . A. Polynomials.The continuityproperty2 (ii) implies a.9) are X = (xI. for sample size 40. A. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . and Koenker.. and var is computedat /3 = 0. and nL L (Xi REFERENCES has a unique zero at t= 0 with i'(0) # 0. and Emerson. These resultshold for each value of y. The Bassett." Journalof theAmericanStatisticalAssociation. so long as b(y) is unbiasedfor (almost all) matricesX.8) and (A. with ei iid from a smooth density g andXi eitherfixed (x = X) or iid (. (1974). N. -4 W and (b) W. N. .for sample size 22. Further In the following.J. D. . remarking only that in case i. .f) = h(Xn(y))-4 h(X(y)) = (2PL (VT)Z/IR L).) + ei. Roger (1978). MA: Addison-Wesley. which is uniformlyboundedin n by virtue of assumption4 and monotonicity of V. Gaussian variates were takendirectlyfrom GGNOR. 207-212. the score function swindle yields this variancedecompositionfor a regressioninvariantestimatorb(y) of /3: var b(y) = L. E. where Fn is the empirical df of XI.235.V. is uniformly integrable. "TheFittingof PowerSeries. is op(n31/2). This follows if (a) W." Technometrics. one can show. which is OP(l) because Ep." Annals of APPENDIXE: MONTECARLOEXPERIMENT DETAILS MathematicalStatistics..0 with probability.PJL)Y."Annals of MathematicalStatistics. Mixed Gaussianvariateswere generated by multiplyingunit Gaussianvalues by 3. and let YI. All programswere writtenin Fortran66. n.N/nTRJ(ci .. Suppose for the sequence (xI. The Markov inequality shows that the second term in (A.N(0. . processX(y) = (2pL VT) Z . Hopcroft.8) and (A. both at optimization level 3. pLjVT). r In < n-I pni. We omit the proof. continuousmappingtheorem.The thirdterm of (A. 846-863. The Gaussianoverindependentswindle for a linear model Y = X = IE Jnfl 2(w+cX) - [.8) remainsvalid if the rows of X are iid random vectors. the randomprocess Bartlett. Meaning of so from the on is continuous the discontinuities) support {X(y)}.7) is o.i = E E |I (c + n-1/2) -A(n-1/2') I. "A Robust Method for Multiple LinearRegression. In each run.Bingham. J. To establish (A. M. The terms on the right sides of (A. Consequently(A. D. Y) as given in Johnstoneand Velleman (1982. andTukey. 1984. 38. whereasb follows frommonotonicityof g andcondition4.1. 32. and the latteris integrable.000. X fixed.S. Let Lnbe a (possibly random)subset of {1.(l) for similar reasons.) > a] s {JV2(c) + V2(c + q)}I{i2(c) V V2(6 + i) > a}. a2) independentof the iid positive Wi. Analysis of ComputerAlgorithms. .000 replications. Y2.> PL. J.16.u = EX. (lly) Lemma. N1nTLJPi . (A. 1985). the same samples were presentedto all estimatorsin parallel."Biometrics. converges weakly degenerate Coefficient. That the second term converges in probability to PLA(0)(w + C/L) may be seen by writing n-I 1iEL X as fgLFn Fn-L(p) dp. D.6) we write Methods 1053 dom numbergeneratorGGUBS (seed = 714. . The first term is op(l).whereZ . 1). . S. "A Confidence Intervalfor the Slope Coefficient of . Slash variates were generated as the quotient of a random Gaussian from GGNOR and an independentrandomuniformfrom GGUBS. E. Indeed. compiled eitherby IBM's FortranH compiler or by IBM's FORTVS compiler. (1983).000.248. . /3 + nxp px l .. Cornell University.] MI. mapping h: D( -00.. nx l assumes that ci = ZilWi. 2. J. . and that A(t) = Eig(c Xi -* t) L .45. "Fitting a Straight Line When Both Variables Are = y E R.228.(PUR . (1949).yn-1'2xi) . . Illustratedon Band-SpectroscopicData.9) togetherwith their variabilities are estimated by standardmoment estimators. . Pseudo-random Wiley. For sample size 10 we performed 10.S) can be establishedby showing thatr. Billingsley. 5. 618-622. (1974). the ResistantLine.9) -ISX). Gilbert. New York:John Without loss of generalitywe set the true /3 = 0. (1982)." unpublishedM. with Zi iid N(0. Indeed. E." Technometrics. (1961). J.Xflaw)/6 is a residual vector constructedusing weighted least squaresestimatesof /3. 00). D. x2.16." Biometrika. R.

(1985). 881-889. 9.G. P. Introductionto the Theory of Statistics. H. Nair. A. G. 75. P."ElectricalEngineeringTechnicalReport. F." Annals of MathematicalStatistics. F." Communicationsin Statistics. and Martin. W. Hill. "A Rank-InvariantMethodof Linearand PolynomialRegression Analysis. Algorithmsfor MinimizationWithoutDerivatives. An Introductionto Probability Theory and Its Applications (Vol. Basics and Computing of ExploratoryData Analysis. R. 6. (1950). (1981). 121-132. "The Change-ofVariance Curve and Optimal Redescending Estimators. TX: Author. Dekker.. D. (1962). 1). 42."Technical Report STAN-CS-67-60. 25. P. (1973). 514-519." Festschiftfor Erich Lehmann. eds. MA: Addison-Wesley. D." Applied Statistics. C. 377-404. "GraphicalMethods in NonparametricStatistics:A Review and AnnotationalBibliography. P. 11 Aug 2015 18:18:38 UTC All use subject to JSTOR Terms and Conditions . Mood. Houston." in Proceedings of the Statistical ComputingSection. F... Henrici. M." Journal of the AmericanStatisticalAssociation. McCabe. "Robust Regression: Asymptotics. P. Velleman. (1951). Doksom.P. M. E. and J. 595604. "ResistantLines for y-versusx. Diaconis. 839-844." Technometrics. Mosteller. Data Analysisand Regression. 851-862. (1967). 9." in Proceedings of the Sixth Berkeley Symposiumon Mathematical Statistics and Probability (Vol. "Tukey'sResistantLine andRelated Methods: Asymptotics and Algorithms. J. (1964). and Mood. (1942). D. M. Applications." Annals of MathematicalStatistics. G. LimitedPreliminaryEdition. (1981).I. Hoaglin." in Proceedings of the Second BerkeleySymposiumon Mathematical Statistics and Probability. "Tables for the Distribution of the Number of Exceedances. Huber. With Applications to Estimatesof Location and Dispersion.). (1984). P. J. "Robust Regression Using Repeated Medians. A. Free University of Brussels." in Proceedings of the Fifth Berkeley Symposiumon MathematicalStatistics and Probability (Vol. H. Mosteller. K. (1968). -~ (1985). L. and Hoaglin. A. eds. "Monte Carlo Swindles Based on Variance Decompositions. and Scott. NJ: Prentice-Hall." in Proceedings of the Statistical ComputingSection. G. Emerson. Brown. 643-648. (1982). and Freedman. New York:John Wiley.Ser. (1980).. 477-481. (1977).. (1954). "The Fittingof StraightLines of Both VariablesAre Subject to Error." in ExploringTables. J." The Annals of Statistics."Journal of the AmericanStatisticalAssociation. D. CA: WadsworthPress. V. P." Annals of MathematicalStatistics. 221-234. A. 23. Tukey. (1983)." Biometrika. InternationalMathematicaland StatisticalLibraries."paperpresentedat the National Bureau of Economic Research Conference on Robust Regression in Cambridge. 159-166. Donoho." Journal of the AmericanStatisticalAssociation. P. pp. CA: University of CaliforniaPress. . J. 352361. (1954). J. "Estimatesof the RegressionCoefficientBased on Kendall's Tau. and Shrivastava. Englewood Cliffs. 499-513. M. pp. Conjectures and Monte Carlo. (1982). Velleman. (1974). 80. CenterumVoor Statistieken OperationeelOnderzoek." Applied Statistics. (1983). "EfficientScores. 1397-1412 (part3). 56-59."Annals of MathematicalStatistics. Neyman. J." Annals of MathematicalStatistics. (1982). "On CertainMethodsof Estimatingthe Linear StructuralRelation. "Local Asymptotic Minimax and Admissibility in Estimation. (1971). 871-880. "The Notion of Breakdown Point. Simon. This content downloaded from 132. F.J. and Hoaglin. L. F. "GraphicalProceduresfor Fitting the Best Line to a Set of Points. V. DeuxiemeSupplementdla TheorieAnalytique des Probabilites."Annals of MathematicalStatistics. (1969). I. W. Hogg. Boston. B. VarianceDecompositions. Epstein. Daniels.. Mass. "On InconsistentM-Estimators. P. Kildea. 762-768. andJowett. 30. N. P. New York:Wiley-Interscience. H. (1976).35. and J."InternationalStatistical Review.eds." Journal of EducationalPsychology. "Two Algorithms Based on Successive Linear Interpolation." in Proceedings of the Social Statistics Section."Journal of the AmericanStatisticalAssociation. "Three-GroupRegressionAnalysis: PartI. 1379-1389. pp..Dept. (1982).CA: Stanford University. 17. ComputerScience Dept. (1976). Robust Statistics. H.. D. (1957b). Mosteller. 77. P.2. 1). G. Wilkinson. Goodfellow. J. "ThreeGroupRegression Analysis: Part2. "A ProjectionPursuitAlgorithm for ExploratoryData Analysis. A.PierreSimon de (1818). R.1054 Journal of the American Statistical Association. Belmont. Berkeley. B. 799-821.. and Yohai. and Huber. Rousseeuw. M. F. (1975). Kelley. T. 114-122. New York: McGraw-Hill. Tukey.. and Tukey. "A QualitativeDefinition of Robustness. (1957a). 438-442. P. 22. 1096-1123." Applied Statistics.. B. "ResistantMultiple Regression One Variableat a Time. R. and Velleman." Journal of the AmericanStatisticalAssociation. D. F. 10. and Ronchetti. (1946). T. New York:John Wiley. M. E. (1983). 386-392 (part 1). J. Trends. Brent. P."Annals of MathematicalStatistics. Hajek. (1940). 33.228. (1950). Ann Arbor.and Monte Carlo Swindles. 266-274. Dolby. pp. CA: University of CaliforniaPress. 454-461. "A Distribution-Free Test for Regression Parameters."Journal of the AmericanStatisticalAssociation. Boston. W. 1. Paris: Courcier. "On a Simple Method of Curve Fitting.University of Washington. Wald. 69. K. W. (1967). Part A-Theory and Methods.." The Annals of Statistics. 53. New York:John Wiley.6. 25-58. Theil. "ConstructingRegressionsWith ControlledFeatures:A Methodof ProbingRegressionPerformance. J. (1981)."Annals of MathematicalStatistics. "TheBehaviorof MaximumLikelihoodEstimatesUnderNonstandardConditions. New York:John Wiley. New York:John Wiley. Hampel. Robust Regression by Means of SEstimators. J. Gibson. (1984). "Use of the 27% Rule in ExperimentalDesign. K. A. eds. 73-101. C. W. 1. R. 2). 79. Hodges.D. Feller. "InfluenceFunctions.). (1960). 175-194. R. (1980). Rousseeuw. "ComputerSimulation Swindles. J. 157-184. "A Test of Linearity Versus Convexity of a Median Regression Curve. technical report. 765-776. "Brown-Mood Type Median Estimatorsfor Simple Regression Models. Tukey. CA: University of California Press. C. W. "On Some Useful 'Inefficient' Statistics. An Introductionto ProbabilityTheoryand Its Applications (Vol." KoninklijkeNederlandseAkademievar WetenschappenProceedings. (1973). "LeastMedianof SquaresRegression. (1973). Mosteller. D. Berkeley.. Berkeley. 3rd ed." in UnderstandingRobust and ExploratoryData Analysis. F.. J. (1939).M. (1971). Rousseeuw. Johnstone. (1981)." IEEE Transactionson Computers. Krasker. 282-300. "On Median Tests for Linear Hypotheses.. 70. F. W. R. F. J.200 on Tue. E. Dejon and P. R. Bickel. 25. American StatisticalAssociation (1982). 189-197. (1968). andTukey. 52-57. (1970). "MonteCarlo Swindle Techniquesin RobustEstimation." Journal of the AmericanStatisticalAssociation."in ConstructiveAspects of the FundamentalTheoremof Algebra. 25. C." Sankhaya-. ExploratoryData Analysis.W. Siegel. (1951). 63. J. 51. Hoaglin. 11. MA: DuxburyPress. Laplace. 242-244. (1972). MultipleRegression Analysis. 6. P. and Ypelaar. and Welsch. Friedman.and Shapes. (1980).Inc. "The Selection of Upper and Lower Groups for the Validationof Test Items." The Annals of Statistics. and J. of ElectricalEngineering. C. L. 17-24.. 76. (1982). Hampel. Stanford.M. American StatisticalAssociation (1983). D. Fisher. "RobustEstimationof a LocationParameter. Sen. December 1985 sion Methods. "Estimatesof PercentileRegression Lines Using Salary Data. D. 521-525 (part 2). Simple Regression Analysis.248. C. IMSLLibrary (8th ed. E. W. 1887-1896. Mallows. AmericanStatisticalAssociation (1981). 218-223. "EfficientBounded-InfluenceRegression Estimation. A. "Finding a Zero by Means of Successive Linear Interpolation.. MI: University Microfilms. L.