Professional Documents
Culture Documents
PII: S0167-7152(08)00576-2
DOI: 10.1016/j.spl.2008.12.016
Reference: STAPRO 5310
Please cite this article as: Laksaci, A., Lemdani, M., Ould-Saı̈d, E., A generalized L 1 -approach
for a kernel estimator of conditional quantile with functional regressors: Consistency and
asymptotic normality. Statistics and Probability Letters (2008), doi:10.1016/j.spl.2008.12.016
This is a PDF file of an unedited manuscript that has been accepted for publication. As a
service to our customers we are providing this early version of the manuscript. The manuscript
will undergo copyediting, typesetting, and review of the resulting proof before it is published in
its final form. Please note that during the production process errors may be discovered which
could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
IPT
CR
A generalized L1-approach for a kernel estimator of
US
and asymptotic normality
∗
Ali Laksaci Mohamed Lemdani
AN
Dép. de Mathématiques Lab. de Biomathématiques
Elias Ould-Saïd
TE
e-mail: ouldsaid@lmpa.univ-littoral.fr
EP
Keywords Asymptotic distribution · functional data · kernel estimate · robust estimation · small balls
probability.
AMS Subject classification Primary, 62G20 ; secondary 62G08 · 62G35 · 62E20.
∗
Corresponding author, Tel: (33) 320 964 933, Fax: (33) 320 964 704.
1
ACCEPTED MANUSCRIPT
IPT
1 Introduction
Let (X, Y ) be a pair of random variables (rv) in F × IR, where the space F is endowed with a semi-metric
d(·, ·) (this covers the case of semi-normed spaces of possibly infinite dimension). In this context, X can be
CR
a functional random variable. In the following, we fix a point x ∈ F and we consider a given neighborhood
Nx of x. We assume that the regular version F z of the conditional distribution function of Y given X = z
exists for any z ∈ Nx . Moreover we suppose that F x has a continuous density f x with respect to (w.r.t.)
US
Lebesgue’s measure over IR.
Now for p ∈ (0, 1), we consider the conditional quantile of order p of F x
t∈IR
where
Ψp (x, t) = IE [ψp (Y − t) |X = x]
During the last decade, thanks to progress in computer tools, it has become easier to deal with increasingly
bulky data. Therefore statistical problems related to the modelization of functional random variables have
received an increasing interest in the recent literature (see, e.g., Bosq, 2000, Ramsay and Silverman, 2002
EP
and 2005 for the linear model and Ferraty and Vieu (2006) in the nonparametric case). It is well
known that traditional statistical methods fail when dealing with functional data. Then, the goal
of this paper is to study a nonparametric estimator of tp (x) when the explanatory variable X is
C
functional.
Conditional quantiles are widely studied when the explanatory variable lies within a finite dimen-
AC
sional space and there are many references on this topic (see, e.g., Samanta (1989) for previous
results and Gannoun et al. (2003) for recent advances and references). Let us also mention Zhou
and Liang (2000) and Lin and Li (2007) where the asymptotic normality is considered in the case
of the conditional median (p = 1/2) under α-mixing and association conditions, respectively. Note
here that in our approach we get, for the median, the well-known L1 -norm estimator.
2
ACCEPTED MANUSCRIPT
IPT
However the number of papers dealing with this model, in the case of functional X, is fairly limited.
For instance, a B-spline approach is used in Cardot et al. (2004) to study the linear model of
regression on quantiles with explanatory variable taking values in a Hilbert space. The authors
CR
obtained the estimator’s L2 -convergence. In the nonparametric context, the almost complete (a.co.)1
convergence of the conditional quantile’s kernel estimator is established in Ferraty et al. (2006)
when the observations are independent and identically distributed (iid) whereas the dependent case
US
is considered in Ferraty et al. (2005) with an application to climatologic data. The asymptotic
normality of this estimator was studied in both cases (iid and strong mixing) by Ezzahrioui and
Ould-Saïd (2008). Recently, Dabo-Niang and Laksaci (2008) stated the convergence in Lp -norm
AN
under less restrictive conditions. They considered the concentration property on small balls of the
probability measure of the underlying explanatory variable, in the iid case. All the works mentioned
above use an estimation procedure based on the double-kernel method.
DM
In this paper, we estimate the conditional quantile nonparametrically, by adapting the L1 -norm
method which enjoys some robustness properties. This is done by replacing the absolute value
function by ψp (·) and is motivated by the fact that L1 is the natural space where the rvs live. To
our knowledge, this kind of result has not so far been addressed. We establish, under mild assump-
tions (which include the concentration property), the almost complete consistency and asymptotic
TE
normality of this estimator. These results complete those obtained in Ferraty et al. (2006) where
consistency with rate (but not asymptotic normality) is given for a smoothed estimator. On the
EP
other hand, using a robust approach allows us to avoid the classical small-departure-related prob-
lems which may occur for the double-kernel method. Moreover we do not have to care, in our case,
for a second smoothing parameter.
C
The paper is organized as follows. We present our estimation procedure in Section 2 before giving
the hypotheses and stating the main results in Section 3. The proofs of the auxiliary results are
AC
relegated to Section 4.
P
1
We say that a sequence Zn converges a.co. to Z if and only if, for any > 0, n IP (|Zn − Z| > ) < ∞.
3
ACCEPTED MANUSCRIPT
IPT
2 Nonparametric estimator of the conditional quantile
Let (X1 , Y1 ), . . . (Xn , Yn ) be n independent random pairs in F × IR which are identically distributed
CR
as (X, Y ). Based on this n-sample, a kernel estimate of Ψp (x, t) is given by
n
X
b p (x, t) =
Ψ Wni (x)ψp (Yi − t), ∀t ∈ IR
i=1
−1
K(h d(x, Xi )) 2
US
where Wni (x) = Pn −1
, K is a kernel function and h := hn is a sequence of positive
j=1 K(h d(x, Xj ))
real numbers which goes to zero as n goes to infinity. A natural estimator of tp (x) is then
b b p (x, t).
tp (x) = arg min Ψ (3)
t∈IR
AN
b p (x, ·) being convex with limit +∞ at both −∞ and +∞ (with great probability), it
Note here that Ψ
b p (x, ·) is piecewise linear (with increasing
has at least one minimizing value over IR. Note also that Ψ
slope) and may have a flat (zero-slope) part where it takes its minimum value. In that case, we
DM
take b
tp (x) as the smallest minimizing value (thus the lowest abscissa of the flat part). Otherwise
b
tp (x) is unique and locates the turning point from negative to positive slope. It can be shown that
(see the proof in the appendix)
tp (x) = bb
b tp (x) := inf{t ∈ IR : Fbx (t) ≥ p} (4)
TE
i=1
3 Main results
AC
In the following, the ball of center x and radius r > 0 is denoted B(x, r). Moreover, for i = 1, . . . , n,
let Ki (x) = K(h−1 d(x, Xi )) and, in view of (5), put
Fbx (t)
Fbx (t) = N (6)
FbDx
2
In the case where the denominator of Wni (x) is zero, we use the convention 0/0 = 0. We here point out that
this event has small probability under suitable condition (see Lemma 3.4 below).
4
ACCEPTED MANUSCRIPT
IPT
where
X n X n
1 1
FbNx (t) = Ki (x)1I{Yi ≤t} and FbDx = Ki (x). (7)
n IE[K1 (x)] i=1 n IE[K1 (x)] i=1
CR
We first study the consistency of our estimator before addressing the asymptotic normality issue.
3.1 Consistency
US
In this subsection, we establish the almost complete convergence of b
tp (x) to tp (x). To do that, we
consider the following hypotheses (hereafter C1 , C2 , . . . denote positive constants). Recall that we
assume that F z exists for any z ∈ Nx and that F x admits a continuous density f x .
AN
(H1) IP(X ∈ B(x, r)) =: φx (r) > 0 for all r > 0 and limr→0 φx (r) = 0.
|F x1 (t1 ) − F x2 (t2 )| ≤ C1 db (x1 , x2 ) + |t1 − t2 |k , for b, k > 0.
(H3) K is a measurable function with support [0, 1] and satisfies 0 < C2 ≤ K(·) ≤ C3 < ∞.
mild regularity condition (H2) is assumed for the distribution function. Hypotheses (H3)-(H4) are
technical conditions imposed for the brievity of proofs.
Now, in order to state the a.co. convergence of our estimate, we need the following result.
C
1/2 !
log n
sup |Fbx (t) − F x (t)| = O hb + O a.co.
t∈[tp (x)−δ, tp (x)+δ] n φx (h)
Theorem 3.1 Under the hypotheses of Proposition 3.1, if f x (tp (x)) > 0, then
1/2 !
log n
|tbp (x) − tp (x)| = O hb + O a.co.
n φx (h)
5
ACCEPTED MANUSCRIPT
IPT
Proof of Theorem 3.1. As F x is continuously differentiable with f x (tp (x)) > 0, we have for any
small enough > 0
!
X X
CR
IP tbp (x) > tp (x) + ≤ IP sup |Fbx (t) − F x (t)| ≥ f x (ξ1 ()) (8)
n n t∈[tp (x), tp (x)+]
and !
X X
P tbp (x) < tp (x) − ≤ IP sup |Fbx (t) − F x (t)| ≥ f x (ξ2 ()) (9)
US
n n t∈[tp (x)−, tp (x)]
where ξ1 () (resp. ξ2 ()) is between tp (x) and tp (x) + (resp. tp (x) − and tp (x)). Moreover, there
exists δ0 ∈]0, δ] such that
inf f x (t) ≥ C4 > 0.
AN
t∈[tp (x)−δ0 , tp (x)+δ0 ]
1/2 !
b log n
Then applying (8) and (9) with = h + , we get for a large enough n0
n φx (h)
!
DM
X X
P |tbp (x) − tp (x)| > ≤ IP sup |Fbx (t) − F x (t)| ≥ C4 < ∞.
n≥n0 n≥n0 t∈[tp (x)−δ0 , tp (x)+δ0 ]
1 h bx h i h i i F x (t) h h ii
Fbx (t) − F x (t) = FN (t) − IE FbNx (t) − F x (t)) − IE FbNx (t) − FbDx − IE FbDx
FbDx FbDx
EP
for all t ∈ IR. The proof is achieved through the following lemmas.
Lemma 3.1 (see Ferraty et al., 2006) Under Hypotheses (H1), (H3) and (H4), we have
h i 1/2 !
C
log n
FbDx − IE FbDx = O a.co.
n φx (h)
AC
Moreover
X 1
b x
IP FD < < ∞.
n
2
6
ACCEPTED MANUSCRIPT
IPT
Lemma 3.3 Under Hypotheses (H1), (H3) and (H4), we have,
h i 1/2 !
bx log n
sup FN (t) − IE FbNx (t) = O , a.co.
t∈[tp (x)−δ, tp (x)+δ] n φx (h)
CR
3.2 Asymptotic normality
Now, we study the asymptotic normality of tbp (x). We replace (H1), (H3) and (H4) by the following
US
hypotheses, respectively.
(H10 ) The concentration property (H1) holds. Moreover, there exists a function βx (·) such that
AN
∀s ∈ [0, 1], lim φx (sr)/φx (r) = βx (s).
r→0
(H30 ) The kernel K satisfies (H3) and is a differentiable function on ]0, 1[ with derivative K 0 such
DM
In what follows, we assume that x is in A = {z ∈ F, f z (tp (z)) 6= 0 and βz (·) is not identically zero}.
TE
Remark 3.1 - The function βx (·) defined in (H10 ) plays a fundamental role in the asymptotic
normality result. It permits to give the variance term explicitly. Noting that (H10 ) is fulfilled by
several small ball probability functions, we quote the following cases (which can be found in Ferraty
EP
et al. (2007)):
i) φx (h) = Cx hγ for some γ > 0 with βx (u) = uγ ,
ii) φx (h) = Cx hγ exp −Ch−p for some γ > 0 and p > 0 with βx (u) = δ1 (u) where δ1 (·) is Dirac’s
C
function,
iii) φx (h) = Cx | ln h|−1 with βx (u) = 1I]0,1] (u).
AC
- The first condition defining the set A is classical and related to a nonvanishing conditional density.
The second condition means that a small amount a concentration is needed in order to ensure
asymptotic normality.
- Finally we point out that the second rate in our Hypothesis (H40 ) is the same as the one in Masry
(2005, Corollary 2) for infinite-dimension spaces. More precisely, this assumption is used to remove
7
ACCEPTED MANUSCRIPT
IPT
the bias term. In other words, asymptotic normality could be established without this assumption
but, in that case, the result would be perturbed by the presence of the bias term.
CR
Now we are in position to give our main result.
Theorem 3.2 Under Hypotheses (H10 ), (H2), (H30 ) and (H40 ), we have
1/2
nφx (h) D
tbp (x) − tp (x) −→ N (0, 1) as n → ∞
US
σ 2 (x)
np o
IP nφx (h)σ −1 (x) tbp (x) − tp (x) ≤ u = IP tbp (x) ≤ τp (u, x)
n o n o
= IP p ≤ Fbx (τp (u, x)) + IP FbDx = 0 ∩ tbp (x) ≤ τp (u, x)
TE
n h i h io
b x) − IE Φ(u,
= IP Φ(u, b x) ≤ IE −Φ(u,
b x)
n o
+ IP FbDx = 0 ∩ tbp (x) ≤ τp (u, x) .
EP
Lemma 3.5 Under the hypotheses of Theorem 3.2, we have, for all u ∈ IR,
p h i
b x) = u + o(1)
nφx (h) [f x (tp (x))σ(x)]−1 IE −Φ(u, as n −→ ∞.
Lemma 3.6 Under the hypotheses of Theorem 3.2, we have, for all u ∈ IR,
p h i
x −1 b b D
nφx (h) [f (tp (x))σ(x)] Φ(u, x) − IE Φ(u, x) −→ N (0, 1) as n −→ ∞.
8
ACCEPTED MANUSCRIPT
IPT
Remark 3.2 In Lemma 3.5, Hypothesis (H40 ) allows to get a result with no asymptotic bias term.
Weaking this hypothesis into nφx (h) → ∞ and proceeding as in Ferraty et al. (2007) we consider
the function ζ(r) = IE F X (tp (x)) − p d(x, X) = r and suppose it to be differentiable at zero.
CR
Moreover we assume that F z is absolutely continuous for z ∈ Nx with density f z . In that case we
get the bias term
a0 (x)
ζ 0 (0) h
a1 (x)
US
where Z 1
a0 (x) = K(1) − (sK(s))0 βx (s)ds.
0
AN
4 Auxiliary results and proofs
Proof of (4)
n
X
DM
i=1
Since b
tp (x) minimizes D(·) then for any t ≥ b
tp (x) we have D(t)−D(b
tp (x)) ≥ 0 which by (11) implies
tp (x))(Fbx (t) − p) ≥ 0 and thus
(t − b
C
∀t > b
tp (x) Fbx (t) ≥ p. (12)
AC
∀s < b
tp (x) Fbx (s) < p. (13)
The equivalence between (3) and (4) is then a direct consequence of (12) and (13).
Proof of Lemma 3.2
9
ACCEPTED MANUSCRIPT
IPT
First note that K1 (x) = K1 (x)1IB(x,h) (X1 ), under (H3). We deduce from (7) and (H1)
h i 1
x b x
F (t) − IE FN (t) = IE K1 (x)1IB(x,h) (X1 ) F x (t) − F X1 (t) .
IE [K1 (x)]
CR
Then by (H2) we have
1IB(x,h) (X1 )|F x (t) − F X1 (t)| ≤ C1 hb ,
this last inequality completes the proof, since C1 does not depend on t ∈ [tp (x) − δ, tp (x) + δ].
US
Proof of Lemma 3.3
Using the compactness of [tp (x) − δ, tp (x) + δ], we can write
dn
[
[tp (x) − δ, tp (x) + δ] ⊂ ]yj − ln , yj + ln [ (14)
AN
j=1
with ln = n−1/2k and dn = O n1/2k . The monotony of IE[FNx (·)] and FbNx (·) gives, for 1 ≤ j ≤ dn
Now, from (H2) and (7) we have, for any t1 , t2 ∈ [tp (x) − δ, tp (x) + δ]
bx
IEFN (t1 ) − IEFbNx (t2 ) ≤ C1 |t1 − t2 |k .
TE
(16)
sup max
t∈[tp (x)−δ, tp (x)+δ] 1≤j≤dn z∈{yj −ln ,yj +ln }
! s
log n
Since ln = n−1/2k , it follows that, under (H4), lnk = o . Thus, it remains to show that
n φx (h)
C
s !
log n
bx
max max FN (z) − IEFbNx (z) = O , a.co. (17)
AC
10
ACCEPTED MANUSCRIPT
IPT
Set
Ki (x)1I{Yi ≤z} − IE K1 (x)1I{Y1 ≤z}
Λi (z) = .
IE [K1 (x)]
From (H1), (H3) and (H4) we get
CR
|Λi (z)| ≤ C3 /C2 φx (h) and V ar [Λi (z)] ≤ C3 /C22 φx (h).
US
s !
log n
bx
IP FN (z) − IEFbNx (z) > η ≤ 2 exp{−C7 η 2 log n}, z ∈ {yj − ln , yj + ln }, 1 ≤ j ≤ dn .
n φx (h)
AN
Therefore, using dn = O n1/2k and choosing η such that C7 η 2 = 1 + 1/k, the right-hand side of
the last inequality is the general term of a convergent series which, in view of (17), completes the
proof.
DM
i=1
= (1 − φx (h))n ≤ exp(−nφx (h)).
1
b
IE −Φ(u, x) = IE K1 (x) F X1 (τp (u, x)) − F x (tp (x))
IE[K1 (x)]
AC
11
ACCEPTED MANUSCRIPT
IPT
which yields
I1 (x) = O hb .
CR
I2 (x) = u [nφx (h)]−1/2 σ(x)f x (tp (x)) + o [nφx (h)]−1/2 .
US
Proof of Lemma 3.6.
We use Liapounov’s theorem (see Loève (1963, p. 275). In the first step we consider the asymptotics
of the variance term
h i
AN
b x) = φx (h)
nφx (h)V ar Φ(u, 2 V ar K1 (x) 1I{Y1 ≤τp (u,x)} − p
IE [K1 (x)]
2 " #
2
IE[K1 (x)] K1 (x)(1I{Y1 ≤τp (u,x)} − p)2 K 1 (x) 1I {Y1 ≤τp (u,x)} − p
= φx (h) 2 IE 2
− φx (h)IE2
IE [K1 (x)] IE[K 1 (x)] IE[K 1 (x)]
DM
IE[K12 (x)]
=: φx (h) J1 (x) + J2 (x). (19)
IE2 [K1 (x)]
Conditioning wrt X1 , we prove using the triangle inequality that, under (H2)
1I{d(x,X1 )≤h} IE 1I{Y1 ≤τp (u,x)} kX1 − p = 1I{d(x,X1 )≤h} F X1 (τp (u, x)) − F x (tp (x))
TE
where t?p (x) is between tp (x) and τp (u, x). Since f x is continuous, we deduce from Theorem 3.1 that,
under (H10 ) and (H40 )
EP
J1 (x) = (1 − 2p) + (p − p2 ).
IE[K12 (x)]
Using the same analytic arguments as for (20) allows us to obtain
AC
Next, by integrating wrt the distribution of the real rv Z := d(x, X1 ), we can show that, under
(H10 ) and (H30 ) (see Ferraty and Vieu, 2006, p. 44),
Z 1
j j
IE K1 (x) = K (1)φx (h) − (K j )0 (s)φx (sh) ds + o (φx (h)) , j = 1, 2. (23)
0
12
ACCEPTED MANUSCRIPT
IPT
From (10) it follows that
CR
Then, using (19), (21), (22) and (24), we get
h i
b x) −→ p(1 − p) a2 (x) = [f x (tp (x))σ(x)]2
nφx (h)V ar Φ(u, as n → ∞. (25)
a21 (x)
US
In the second step we focus on the asymptotic normality. For this, let
1
Mi (x) = Ki (x) 1I{Yi ≤τp (u,x)} − p .
nIE[K1 (x)]
AN
It is enough to show that for some δ > 0
n
X
IE |Mi (x) − IE [Mi (x)] |2+δ
i=1
!!(2+δ)/2 −→ 0 as n → ∞. (26)
DM
n
X
V ar Mi (x)
i=1
P
From (25) it is clear that nφx (h)V ar ( ni=1 Mi (x)) converges to [f x (tp (x))σ(x)]2 as n tends to
infinity. Therefore, to conclude the proof, it is enough to show that, after normalization, the
TE
numerator in (26) converges to 0. Since the observations are iid, using the Cr -inequality (see Loève
(1963, p. 155), we get
n
X 2+δ
(nφ(h)) (2+δ)/2
IE Mi (x) − IE [Mi (x)] ≤ 21+δ n (nφ(h))(2+δ)/2 IE |M1 (x)|2+δ + |IE [M1 (x)] |2+δ
EP
i=1
=: N1 (x) + N2 (x). (27)
Observe that, according to (23), we have, for j = 1, 2, under (H10 ), IE K1j (x) = O(φx (h)). Then,
C
φ (h) 2+δ
1+δ −δ/2 x 2+δ
≤ 2 (nφx (h)) IE K1 (x) /φx (h) −→ 0 as n → ∞
IE[K1 (x)]
13
ACCEPTED MANUSCRIPT
IPT
which concludes the proof.
Acknowledgment. The authors are grateful to an anonymous referee whose careful reading and
appropriate remarks gave us the opportunity to improve the quality of the paper and add some
CR
theoretical results.
References
US
[1] Bosq, D. 2000. Linear Processes in Function Spaces: Theory and Applications, Lecture Notes
in Statistics, 149, Springer-Verlag, New York.
AN
[2] Cardot, H., Crambes, H., Sarda, P. 2004. Estimation spline de quantiles conditionnels pour
variables explicatives fonctionnelles, C. R. Math., Paris 339, 141–144.
[3] Dabo-Niang, S., Laksaci, A. 2008. Nonparametric estimation of conditional quantile when the
DM
[4] Ezzahrioui, M., Ould-Saïd, E. 2008. Asymptotic results of a nonparametric conditional quantile
estimator for functional time series data, Comm in Statist. Theory & Methods 37 2335–2759.
TE
[5] Ferraty, F., Rabhi, A., Vieu, P. 2005. Conditional quantiles for functional dependent data with
application to the climatic El Niño phenomenon, Sankhyā 67 378–398.
EP
[6] Ferraty, F., Laksaci, A., Vieu, P. 2006. Estimating some characteristics of the conditional
distribution in nonparametric functional models, Statist. Inf. for Stoch. Proc. 9 47–76.
[7] Ferraty, F., Vieu, P. 2006. Nonparametric Functional Data Analysis. Theory and Practice,
C
[8] Ferraty, F., Mas, A. Vieu, P. 2007. Nonparametric regression on functional data: inference and
practical aspects. Aust. and New Zeal. J. of Statist. 49 267–286.
[9] Gannoun, A., Saracco, J., Yu, K. 2003. Nonparametric prediction by conditional median and
quantiles, J. Statist. Plann. Inference 117 207–223.
14
ACCEPTED MANUSCRIPT
IPT
[10] Gasser, T., Hall, P., Presnell, B. 1998. Nonparametric estimation of the mode of a distribution
of random curves. J. R. Stat. Soc., Ser. B, 60 681–691.
CR
[11] Lin, Z., Li, D. 2007. Asymptotic normality for L1 -norm kernel estimator of conditional median
under association dependence, J. Multivariate Anal. 98 1214–1230.
[12] Loève, M. 1963. Probability Theory, Third Edition, Van Nostrand, Princeton.
US
[13] Masry, E. 2005. Nonparametric regression estimation for dependent functional data: Asymp-
totic normality, Stoch. Proc. and their Appl. 115 155–177.
AN
[14] Ramsay, J.O., Silverman, B.W. 2002. Applied Functional Data Analysis: Methods and Case
Studies, Springer, New York.
[15] Ramsay, J.O., Silverman, B.W. 2005. Functional Data Analysis, Second Edition, Springer, New
DM
York.
[16] Samanta, M. 1989. Non-parametric estimation of conditional quantiles, Statist. Probab. Lett.
7 407–412.
TE
[17] Zhou, Y., Liang, H. 2000. Asymptotic normality for L1 -norm kernel estimator of conditional
median under α-mixing dependence, J. Multivariate Anal. 73 136–154.
C EP
AC
15