
Trimmed mean and other things

Ioannis Kalogridis

November 27, 2022

Exercise 5
1. Find the functional form of the trimmed mean.

2. Prove that for the location model, the trimmed mean is Fisher consistent if F is symmetric.

3. Derive the influence function (IF) of the trimmed mean.

Solution:

We do 1. first. Set $m = [(n-1)\alpha]$ for $\alpha \in [0, 0.5)$. The sample version of the trimmed mean is
$$T_\alpha = \frac{1}{n-2m}\sum_{i=m+1}^{n-m} X_{(i)},$$
with $X_{(i)}$ denoting the ordered observations (order statistics).
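As a quick numerical illustration: with $n = 10$ and $\alpha = 0.2$ we get $m = [9 \cdot 0.2] = [1.8] = 1$, so that
$$T_{0.2} = \frac{1}{8}\sum_{i=2}^{9} X_{(i)},$$
i.e., the smallest and largest observations are discarded and the remaining eight are averaged.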

We want to write $T_\alpha$ as a function of $F_n$ and then replace $F_n$ with the population CDF $F$, which for simplicity we assume is continuous and strictly increasing. Thus, we want
$$T_\alpha(F_n) = \frac{1}{n-2m}\sum_{i=m+1}^{n-m} X_{(i)}.$$

But,
$$\sum_{i=m+1}^{n-m} X_{(i)} = n \int_{F_n^{-1}(l)}^{F_n^{-1}(\mu)} x \, dF_n(x), \tag{1}$$

for all $l$ and $\mu$ satisfying
$$l : F_n^{-1}(l) \in \big(X_{(m)}, X_{(m+1)}\big], \qquad \mu : F_n^{-1}(\mu) \in \big[X_{(n-m)}, X_{(n-m+1)}\big).$$

Recall that $F_n$ has jumps of size $n^{-1}$ at each $X_{(i)}$ and that $F_n^{-1}(x) = \inf\{t : F_n(t) \geq x\}$. This implies that the range of $F_n^{-1}$ consists of the order statistics $X_{(1)}, \ldots, X_{(n)}$. Thus, it must be that
$$\frac{m}{n} < l \leq \frac{m+1}{n}$$

and
$$\frac{n-m-1}{n} < \mu \leq \frac{n-m}{n},$$
so that $F_n^{-1}(l) = X_{(m+1)}$ and $F_n^{-1}(\mu) = X_{(n-m)}$. We now need to find the asymptotic (functional) form of
$$T_\alpha(F_n) = \frac{n}{n-2m}\int_{X_{(m+1)}}^{X_{(n-m)}} x \, dF_n(x).$$

Replacing $F_n$ with $F$ and taking $n \to \infty$ we arrive at
$$T_\alpha(F) = \frac{1}{1-2\alpha}\int_{F^{-1}(\alpha)}^{F^{-1}(1-\alpha)} x \, dF(x),$$
as the order statistics $X_{(n-m)}$ and $X_{(m+1)}$ converge to $F^{-1}(1-\alpha)$ and $F^{-1}(\alpha)$ respectively (see below for formal justification).
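For instance, if $F$ is the uniform distribution on $[0,1]$, then $F^{-1}(s) = s$ and
$$T_\alpha(F) = \frac{1}{1-2\alpha}\int_{\alpha}^{1-\alpha} x \, dx = \frac{(1-\alpha)^2 - \alpha^2}{2(1-2\alpha)} = \frac{1}{2},$$
so for this symmetric distribution the trimming does not move the functional away from the mean.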
To prove 2., observe that for a continuous, strictly increasing distribution that is symmetric about zero we have $1 - F(x) = F(-x)$ for all $x \in \mathbb{R}$. Therefore,
$$x \mapsto x f(x),$$
with $f$ the density of $F$, is an odd function, and it follows from $F^{-1}(\alpha) = -F^{-1}(1-\alpha)$ that $T_\alpha(F) = 0$. In the location model $F_\theta(x) = F(x - \theta)$ with $F$ symmetric about zero, the translation equivariance of $T_\alpha$ (its defining integral shifts by $\theta$ when $F$ is shifted by $\theta$) then yields $T_\alpha(F_\theta) = \theta$, which is Fisher consistency.


To derive the IF of the trimmed mean in 3. we first derive the IF of quantiles, i.e., estimators of the form
$$T_s(F) = F^{-1}(s), \qquad s \in (0,1).$$

As before, we restrict attention to CDFs that are continuous and strictly increasing (so that quantiles are unique). For such $F$,
$$F(F^{-1}(s)) = s. \tag{2}$$

Insert the contaminated distribution $F_\epsilon = (1-\epsilon)F + \epsilon\delta_{x_0}$ in (2) to get
$$s = F_\epsilon(F_\epsilon^{-1}(s)) = (1-\epsilon)F(F_\epsilon^{-1}(s)) + \epsilon\,\delta_{x_0}(F_\epsilon^{-1}(s)),$$
where $\delta_{x_0}$ denotes the CDF of a point mass at $x_0$, i.e., $\delta_{x_0}(y) = \mathbb{1}\{x_0 \leq y\}$.

Differentiating both sides with respect to $\epsilon$ and setting $\epsilon = 0$ we get
$$0 = -s + f(F^{-1}(s))\,\left.\frac{\partial F_\epsilon^{-1}(s)}{\partial \epsilon}\right|_{\epsilon=0} + \delta_{x_0}(F^{-1}(s)).$$

Equivalently,

$$\mathrm{IF}(x_0, T_s, F) = \begin{cases} \dfrac{s-1}{f(F^{-1}(s))}, & x_0 \leq F^{-1}(s), \\[2ex] \dfrac{s}{f(F^{-1}(s))}, & x_0 > F^{-1}(s). \end{cases}$$
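As a sanity check, taking $s = 1/2$ gives, for $x_0 \neq F^{-1}(1/2)$, the familiar influence function of the median:
$$\mathrm{IF}(x_0, T_{1/2}, F) = \frac{\operatorname{sign}\big(x_0 - F^{-1}(1/2)\big)}{2 f(F^{-1}(1/2))},$$
which is bounded but jumps at the median itself.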

Consider now the functional form of the trimmed mean at $F_\epsilon$, written in terms of the quantile function (via the substitution $x = F_\epsilon^{-1}(s)$):
$$T_\alpha(F_\epsilon) = \frac{1}{1-2\alpha}\int_{\alpha}^{1-\alpha} F_\epsilon^{-1}(s)\, ds.$$

Differentiating with respect to $\epsilon$, interchanging integration and differentiation, and evaluating at $\epsilon = 0$ yields
$$\left.\frac{\partial T_\alpha(F_\epsilon)}{\partial \epsilon}\right|_{\epsilon=0} = \frac{1}{1-2\alpha}\int_{\alpha}^{1-\alpha} \mathrm{IF}(x_0, T_s, F)\, ds.$$

We discern three cases:

• $x_0 < F^{-1}(\alpha)$,

• $F^{-1}(\alpha) \leq x_0 \leq F^{-1}(1-\alpha)$,

• $x_0 > F^{-1}(1-\alpha)$.

For $x_0 < F^{-1}(\alpha)$, integration by parts shows that
$$\int_{\alpha}^{1-\alpha} \frac{s-1}{f(F^{-1}(s))}\, ds = \int_{\alpha}^{1-\alpha} (s-1)\, dF^{-1}(s) = \ldots = F^{-1}(\alpha) - W(\alpha),$$

with
$$W(\alpha) = \int_{\alpha}^{1-\alpha} F^{-1}(t)\, dt + \alpha F^{-1}(\alpha) + \alpha F^{-1}(1-\alpha).$$
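One way to fill in the ellipsis is the standard integration by parts (using $dF^{-1}(s) = ds/f(F^{-1}(s))$):
$$\int_{\alpha}^{1-\alpha} (s-1)\, dF^{-1}(s) = \Big[(s-1)F^{-1}(s)\Big]_{\alpha}^{1-\alpha} - \int_{\alpha}^{1-\alpha} F^{-1}(s)\, ds = (1-\alpha)F^{-1}(\alpha) - \alpha F^{-1}(1-\alpha) - \int_{\alpha}^{1-\alpha} F^{-1}(s)\, ds,$$
which is exactly $F^{-1}(\alpha) - W(\alpha)$.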

For $F^{-1}(\alpha) \leq x_0 \leq F^{-1}(1-\alpha)$, we have
$$\int_{\alpha}^{1-\alpha} \frac{s - \delta_{x_0}(F^{-1}(s))}{f(F^{-1}(s))}\, ds = \int_{\alpha}^{F(x_0)} \frac{s}{f(F^{-1}(s))}\, ds + \int_{F(x_0)}^{1-\alpha} \frac{s-1}{f(F^{-1}(s))}\, ds = \ldots = x_0 - W(\alpha).$$
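Here the omitted steps are the same integration by parts applied to each piece; the boundary terms at $F(x_0)$ combine to give $x_0$:
$$\Big[s F^{-1}(s)\Big]_{\alpha}^{F(x_0)} + \Big[(s-1)F^{-1}(s)\Big]_{F(x_0)}^{1-\alpha} - \int_{\alpha}^{1-\alpha} F^{-1}(s)\, ds = x_0 - \alpha F^{-1}(\alpha) - \alpha F^{-1}(1-\alpha) - \int_{\alpha}^{1-\alpha} F^{-1}(s)\, ds = x_0 - W(\alpha).$$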

Finally, for $x_0 > F^{-1}(1-\alpha)$ we have
$$\int_{\alpha}^{1-\alpha} \mathrm{IF}(x_0, T_s, F)\, ds = \int_{\alpha}^{1-\alpha} \frac{s}{f(F^{-1}(s))}\, ds = \ldots = F^{-1}(1-\alpha) - W(\alpha).$$

Putting everything together,
$$\mathrm{IF}(x_0, T_\alpha, F) = \frac{1}{1-2\alpha}\begin{cases} F^{-1}(\alpha) - W(\alpha), & x_0 < F^{-1}(\alpha), \\ x_0 - W(\alpha), & F^{-1}(\alpha) \leq x_0 \leq F^{-1}(1-\alpha), \\ F^{-1}(1-\alpha) - W(\alpha), & x_0 > F^{-1}(1-\alpha). \end{cases}$$
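Note, in passing, that when $F$ is symmetric about zero (as in part 2.), the odd-function argument gives $W(\alpha) = 0$, so the influence function reduces to
$$\mathrm{IF}(x_0, T_\alpha, F) = \frac{1}{1-2\alpha}\begin{cases} F^{-1}(\alpha), & x_0 < F^{-1}(\alpha), \\ x_0, & F^{-1}(\alpha) \leq x_0 \leq F^{-1}(1-\alpha), \\ F^{-1}(1-\alpha), & x_0 > F^{-1}(1-\alpha), \end{cases}$$
which is bounded and piecewise linear in $x_0$.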

Compare this IF with the IF of the Huber M-estimator of location. What do you observe?

Left-continuity of the quantile function
For a right-continuous nondecreasing cumulative distribution function $F$, define
$$F^{-1}(y) = \inf\{x \in \mathbb{R} : F(x) \geq y\}.$$
Prove that $F^{-1}$ is left-continuous.
Solution:

Observe first that since $F$ is monotone nondecreasing, $F^{-1}$ inherits this property and we have $F^{-1}(y) \leq F^{-1}(y')$ for any $y \leq y'$. Furthermore, monotone functions can only have jump discontinuities, so it suffices to show that, for every $y_0$,
$$\sup_{y < y_0} F^{-1}(y) = F^{-1}(y_0). \tag{3}$$

To prove (3) notice that by the monotonicity of $F^{-1}$ we have $F^{-1}(y) \leq F^{-1}(y_0)$ for any $y < y_0$. Hence, also
$$\sup_{y < y_0} F^{-1}(y) \leq F^{-1}(y_0).$$

It therefore suffices to prove $\sup_{y < y_0} F^{-1}(y) \geq F^{-1}(y_0)$ in order to establish (3). Observe next that, for every $\epsilon > 0$, $F(F^{-1}(y) + \epsilon) \geq y$. By right continuity of $F$, we thus have
$$F(F^{-1}(y)) \geq y.$$

By monotonicity we also have
$$F\Big(\sup_{y < y_0} F^{-1}(y)\Big) \geq y, \qquad \text{for } y < y_0.$$

Letting $y \uparrow y_0$, we get
$$F\Big(\sup_{y < y_0} F^{-1}(y)\Big) \geq y_0,$$

so that
$$F^{-1}(y_0) \leq \sup_{y < y_0} F^{-1}(y),$$
as $\sup_{y < y_0} F^{-1}(y) \in \{x : F(x) \geq y_0\}$ and $F^{-1}(y_0)$ is the infimum of this set. The proof is complete.
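For intuition, left-continuity is the most one can hope for: if $F$ is flat on an interval, $F^{-1}$ jumps. Take, for example, $F$ the CDF of the uniform distribution on $[0,1] \cup [2,3]$, for which
$$F^{-1}(y) = \begin{cases} 2y, & 0 < y \leq 1/2, \\ 2y + 1, & 1/2 < y \leq 1. \end{cases}$$
At $y_0 = 1/2$ we have $\lim_{y \uparrow 1/2} F^{-1}(y) = 1 = F^{-1}(1/2)$, but $\lim_{y \downarrow 1/2} F^{-1}(y) = 2$, so $F^{-1}$ is left-continuous yet not right-continuous there.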

Convergence of order statistics


Suppose that $F$ is strictly increasing at $F^{-1}(p)$ for $0 < p < 1$, which means that for all $\epsilon > 0$,
$$F(F^{-1}(p) + \epsilon) > p, \qquad F(F^{-1}(p) - \epsilon) < p.$$

Then, we have that $X_{(\lceil np\rceil)}$ is a consistent estimator of $F^{-1}(p)$, i.e.,
$$X_{(\lceil np\rceil)} \xrightarrow{P} F^{-1}(p).$$
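For example, with $p = 1/2$ this says that the sample median $X_{(\lceil n/2\rceil)}$ is consistent for the population median $F^{-1}(1/2)$ whenever $F$ is strictly increasing there; the same argument covers the order statistics $X_{(m+1)}$ and $X_{(n-m)}$ used for the trimmed mean above, since their indices behave like $\lceil n\alpha\rceil$ and $\lceil n(1-\alpha)\rceil$ for large $n$.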

Solution:

Recall, by the definition of $F_n$, that for any $m$,
$$X_{(m)} \leq y \iff n F_n(y) \geq m, \tag{4}$$
since $nF_n(y)$ counts the number of observations that are at most $y$, and $X_{(m)} \leq y$ precisely when at least $m$ of them are.

We need to prove that for all $\epsilon > 0$,
$$\lim_{n\to\infty} \Pr\big(|X_{(\lceil np\rceil)} - F^{-1}(p)| > \epsilon\big) = 0,$$
or, equivalently,
$$\lim_{n\to\infty} \Pr\big(X_{(\lceil np\rceil)} > F^{-1}(p) + \epsilon\big) = 0 \tag{5}$$
$$\lim_{n\to\infty} \Pr\big(X_{(\lceil np\rceil)} < F^{-1}(p) - \epsilon\big) = 0 \tag{6}$$

We only prove (6), as (5) may be proved with similar arguments. Using (4) we have
$$\Pr\big(X_{(\lceil np\rceil)} < F^{-1}(p) - \epsilon\big) \leq \Pr\big(X_{(\lceil np\rceil)} \leq F^{-1}(p) - \epsilon\big) = \Pr\Big(F_n(F^{-1}(p) - \epsilon) \geq \frac{\lceil np\rceil}{n}\Big).$$

Now,
$$\Pr\Big(F_n(F^{-1}(p) - \epsilon) \geq \frac{\lceil np\rceil}{n}\Big) = \Pr\Big(F_n(F^{-1}(p) - \epsilon) - F(F^{-1}(p) - \epsilon) \geq \frac{\lceil np\rceil}{n} - F(F^{-1}(p) - \epsilon)\Big),$$
and $\lceil np\rceil/n \to p$ as $n \to \infty$, while
$$F_n(F^{-1}(p) - \epsilon) \xrightarrow{P} F(F^{-1}(p) - \epsilon),$$

by the law of large numbers. At the same time, by our assumptions,

$$\delta = p - F(F^{-1}(p) - \epsilon) > 0.$$
Therefore, for all large $n$,
$$\frac{\lceil np\rceil}{n} - F(F^{-1}(p) - \epsilon) \geq \frac{\delta}{2} > 0.$$
Hence, for all large $n$,
$$\Pr\Big(F_n(F^{-1}(p) - \epsilon) - F(F^{-1}(p) - \epsilon) \geq \frac{\lceil np\rceil}{n} - F(F^{-1}(p) - \epsilon)\Big) \leq \Pr\Big(F_n(F^{-1}(p) - \epsilon) - F(F^{-1}(p) - \epsilon) \geq \frac{\delta}{2}\Big) \to 0$$
as $n \to \infty$, completing the proof.
