doi: 10.1017/S0266466608080304
BRUCE E. HANSEN
University of Wisconsin
This paper presents a set of rate of uniform consistency results for kernel estimators of density functions and regression functions. We generalize the existing literature by allowing for stationary strong mixing multivariate data with infinite support, kernels with unbounded support, and general bandwidth sequences. These results are useful for semiparametric estimation based on a first-stage nonparametric estimator.
1. INTRODUCTION
This paper presents a set of rate of uniform consistency results for kernel estimators of density functions and regression functions. We generalize the existing literature by allowing for stationary strong mixing multivariate data with infinite support, kernels with unbounded support, and general bandwidth sequences.
Kernel estimators were first introduced by Rosenblatt (1956) for density estimation and by Nadaraya (1964) and Watson (1964) for regression estimation. The local linear estimator was introduced by Stone (1977) and came into prominence through the work of Fan (1992, 1993).
Uniform convergence for kernel averages has been previously considered in a number of papers, including Peligrad (1991), Newey (1994), Andrews (1995), Liebscher (1996), Masry (1996), Bosq (1998), Fan and Yao (2003), and Ango Nze and Doukhan (2004).
In this paper we provide a general set of results with broad applicability. Our main results are the weak and strong uniform convergence of a sample average functional. The conditions imposed on the functional are general. The data are assumed to be a stationary strong mixing time series. The support for the data is allowed to be infinite, and our convergence is uniform over compact sets, expanding sets, or unrestricted Euclidean space. We do not require the regression function or its derivatives to be bounded, and we allow for kernels with unbounded support. The rate of decay for the bandwidth is flexible and includes the optimal convergence rate as a special case. Our applications include estimation of multivariate densities and their derivatives, Nadaraya–Watson regression estimates, and local linear regression estimates. We do not consider local polynomial regression, although our main results could be applied to this application also.

This research was supported by the National Science Foundation. I thank three referees and Oliver Linton for helpful comments. Address correspondence to Bruce E. Hansen, Department of Economics, University of Wisconsin, 1180 Observatory Drive, Madison, WI 53706-1393, USA; e-mail: bhansen@ssc.wisc.edu.
These features are useful generalizations of the existing literature. Most papers assume that the kernel function has truncated support, which excludes the popular Gaussian kernel. It is also typical to demonstrate uniform convergence only over fixed compact sets, which is sufficient for many estimation purposes but insufficient for many semiparametric applications. Some papers assume that the regression function, or certain derivatives of the regression function, is bounded. This may appear innocent when convergence is limited to fixed compact sets but is unsatisfactory when convergence is extended to expanding or unbounded sets. Some papers present convergence rates only under optimal bandwidth rates, which is inappropriate for many semiparametric applications where the bandwidth sequences may not satisfy these conditions. Our paper avoids these deficiencies.
Our proof method is a generalization of those in Liebscher (1996) and Bosq (1998).
Section 2 presents results for a general class of functions, including a variance bound, weak uniform convergence, strong uniform convergence, and convergence over unbounded sets. Section 3 presents applications to density estimation, Nadaraya–Watson regression, and local linear regression. The proofs are in the Appendix.

Regarding notation, for $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$ we set $\|x\| = \max(|x_1|, \ldots, |x_d|)$.
2. GENERAL RESULTS
$$\hat{C}(x) = \frac{1}{nh^d} \sum_{i=1}^{n} Y_i K\left(\frac{x - X_i}{h}\right), \qquad (1)$$
and
$$\beta > \frac{2s - 2}{s - 2}. \qquad (4)$$
$$\sup_x f(x) \le B_0 < \infty \qquad (5)$$
and
$$\operatorname{Var}(\hat{C}(x)) \le \frac{Q}{nh^d}. \qquad (8)$$
Theorem 1 implies that $|\hat{C}(x) - E\hat{C}(x)| = O_p((nh^d)^{-1/2})$ pointwise in $x \in \mathbb{R}^d$.

We are now interested in uniform rates. We start by considering uniformity over values of $x$ in expanding sets of the form $\{x : \|x\| \le c_n\}$ for sequences $c_n$ that are either bounded or diverging slowly to infinity. To establish uniform convergence, we need the function $K(u)$ to be smooth. We require that $K$ either has truncated support and is Lipschitz or has a bounded derivative with an integrable tail.
Assumption 3 allows for most commonly used kernels, including the polynomial kernel class $c_p(1 - x^2)^p$, the higher-order polynomial kernels of Müller (1984) and Granovsky and Müller (1991), the normal kernel, and the higher-order Gaussian kernels of Wand and Schucany (1990) and Marron and Wand (1992). Assumption 3 excludes, however, the uniform kernel. It is unlikely that this is a necessary exclusion, as Tran (1994) established uniform convergence
THEOREM 2. Suppose that Assumptions 1–3 hold and for some $q > 0$ the mixing exponent $\beta$ satisfies
$$\beta > \frac{1 + (s-1)\left(1 + \dfrac{d}{q} + d\right)}{s - 2} \qquad (10)$$
and for
$$\theta = \frac{\beta - 1 - d - \dfrac{d}{q} - (1+\beta)/(s-1)}{\beta + 3 - d - (1+\beta)/(s-1)}. \qquad (11)$$
Then for
$$c_n = O((\ln n)^{1/d} n^{1/(2q)}) \qquad (13)$$
and
$$a_n = \left(\frac{\ln n}{nh^d}\right)^{1/2}, \qquad (14)$$
$$\sup_{\|x\| \le c_n} |\hat{C}(x) - E\hat{C}(x)| = O_p(a_n). \qquad (15)$$
$$\beta > \frac{2 + s\left(3 + \dfrac{d}{q}\right) + d}{s - 2} \qquad (16)$$
and for
$$\theta = \frac{\beta\left(1 - \dfrac{2}{s}\right) - \dfrac{2}{s} - 3 - \dfrac{d}{q} - d}{\beta + 3 - d} \qquad (17)$$
and
$$\frac{\phi_n^2}{n^{\theta} h^d} = O(1). \qquad (18)$$
Then for
$$c_n = O(\phi_n^{1/d} n^{1/(2q)}), \qquad (19)$$
$$\sup_{\|x\| \le c_n} |\hat{C}(x) - E\hat{C}(x)| = O(a_n) \qquad (20)$$
almost surely.
$$\sup_{x \in \mathbb{R}^d} |\hat{C}(x) - E\hat{C}(x)| = O_p(a_n).$$
$$\sup_{x \in \mathbb{R}^d} |\hat{C}(x) - E\hat{C}(x)| = O(a_n)$$
almost surely.
Theorems 4 and 5 show that the extension to uniformity over unrestricted Euclidean space can be made with minimal additional assumptions. Equation (21) is a mild tail restriction on the conditional mean and density function. The kernel tail restriction (22) is satisfied by the kernels discussed in Section 2.2 for all $q > 0$.
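To fix ideas, the sample average functional $\hat{C}(x)$ of (1) is simple to compute directly. The following is a minimal sketch (the function and variable names are ours, not the paper's), using a Gaussian product kernel, which has the unbounded support the results allow:

```python
import numpy as np

def kernel_average(x, X, Y, h):
    """Sample average functional C_hat(x) = (n h^d)^{-1} sum_i Y_i K((x - X_i)/h).

    x : point in R^d, X : (n, d) data matrix, Y : (n,) responses, h : bandwidth.
    K is taken as the Gaussian product kernel.
    """
    n, d = X.shape
    u = (x - X) / h                                            # (n, d) scaled differences
    K = np.prod(np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi), axis=1)
    return float(np.sum(Y * K) / (n * h ** d))

# With Y_i = 1 the functional reduces to a density estimate; f(0) = 1/(2*pi) for N(0, I_2).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
c_hat = kernel_average(np.zeros(2), X, np.ones(500), h=0.5)
```

With $Y_i \equiv 1$ this is the density estimator of Section 3.1; with general $Y_i$ it is the numerator of the Nadaraya–Watson estimator.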
3. APPLICATIONS
3.1. Density Estimation
Let $X_i \in \mathbb{R}^d$ be a strictly stationary time series with density $f(x)$. Consider the estimation of $f(x)$ and its derivatives $f^{(r)}(x)$. Let $k(u) : \mathbb{R}^d \to \mathbb{R}$ denote a multivariate $p$th-order kernel function for which $k^{(r)}(u)$ satisfies Assumption 1 and $\int |u|^{p+r} |k(u)| \, du < \infty$. The Rosenblatt (1956) estimator of the $r$th derivative $f^{(r)}(x)$ is
$$\hat{f}^{(r)}(x) = \frac{1}{nh^{d+r}} \sum_{i=1}^{n} k^{(r)}\left(\frac{x - X_i}{h}\right),$$
where $h$ is a bandwidth.
We first consider uniform convergence in probability.
$$\beta > 1 + \frac{d}{q} + d, \qquad (23)$$
Suppose that $\sup_x f(x) < \infty$ and there is some $j^* < \infty$ such that for all $j \ge j^*$, $\sup_{x_0, x_j} f_j(x_0, x_j) < \infty$, where $f_j(x_0, x_j)$ denotes the joint density of $(X_0, X_j)$. Assume that the $p$th derivative of $f^{(r)}(x)$ is uniformly continuous. Then for any sequence $c_n$ satisfying (13),
The optimal convergence rate (obtained by selecting the bandwidth $h$ optimally) can be achieved when
$$\beta > 1 + d + \frac{d}{q} + \frac{d}{p+r}\left(2 + \frac{d}{2q}\right) \qquad (26)$$
and is
$$\sup_{\|x\| \le c_n} |\hat{f}^{(r)}(x) - f^{(r)}(x)| = O_p\left(\left(\frac{\ln n}{n}\right)^{p/(d+2p+2r)}\right). \qquad (27)$$
and when
$$\beta > 3 + d + \frac{d}{q} + \frac{d}{p+r}\left(3 + \frac{d}{2q}\right),$$
the rate is
$$\sup_{\|x\| \le c_n} |\hat{f}^{(r)}(x) - f^{(r)}(x)| = O\left(\left(\frac{\ln n}{n}\right)^{p/(d+2p+2r)}\right) \qquad (28)$$
almost surely.
Alternative results for strong uniform convergence of kernel density estimates have been provided by Peligrad (1991), Liebscher (1996, Thms. 4.2 and 4.3), Bosq (1998, Thm. 2.2 and Cor. 2.2), and Ango Nze and Doukhan (2004). Theorem 6 contains Liebscher's result as the special case $r = 0$ and $q = \infty$, and he restricts attention to kernels with bounded support. Peligrad imposes $\rho$-mixing and bounded $c_n$. Bosq restricts attention to geometric strong mixing.
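As an illustration of the simplest case $r = 0$, the following sketch (names ours) computes the Rosenblatt density estimate with a Gaussian product kernel and compares it with the true density of simulated $N(0, I_2)$ data:

```python
import numpy as np

def rosenblatt_density(x, X, h):
    """Rosenblatt density estimate f_hat(x), the r = 0 case of f_hat^{(r)}."""
    n, d = X.shape
    u = (x - X) / h
    K = np.prod(np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi), axis=1)
    return float(K.sum() / (n * h ** d))

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 2))        # f is the N(0, I_2) density
f_hat = rosenblatt_density(np.zeros(2), X, h=0.4)
f_true = 1.0 / (2.0 * np.pi)          # f(0) is about 0.159
```

The estimate carries an $O(h^2)$ smoothing bias (here the kernel is second order, $p = 2$) plus sampling noise of order $(nh^d)^{-1/2}$, so with these values it should land close to $f(0)$.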
$$m(x) = E(Y_i \mid X_i = x).$$
$$\hat{m}(x) = \frac{\displaystyle\sum_{i=1}^{n} Y_i \, k\left(\frac{x - X_i}{h}\right)}{\displaystyle\sum_{i=1}^{n} k\left(\frac{x - X_i}{h}\right)},$$
where $h$ is a bandwidth.
$$\delta_n = \inf_{\|x\| \le c_n} f(x) > 0,$$
$$a_n^* = \left(\frac{\log n}{nh^d}\right)^{1/2} + h^2, \qquad (29)$$
then
$$\sup_{\|x\| \le c_n} |\hat{m}(x) - m(x)| = O_p(\delta_n^{-1} a_n^*). \qquad (30)$$
$$\sup_{\|x\| \le c_n} |\hat{m}(x) - m(x)| = O_p\left(\delta_n^{-1} \left(\frac{\ln n}{n}\right)^{2/(d+4)}\right). \qquad (31)$$
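The object of (30)–(31) is a ratio of two kernel averages and is a one-liner to compute. A minimal simulation sketch (names and the simulated design are ours), again with a Gaussian kernel:

```python
import numpy as np

def nadaraya_watson(x, X, Y, h):
    """Nadaraya-Watson estimate: kernel-weighted average of the Y_i near x."""
    u = (x - X) / h
    k = np.prod(np.exp(-0.5 * u ** 2), axis=1)   # unnormalized Gaussian kernel
    return float(np.sum(Y * k) / np.sum(k))      # the normalizing constant cancels

# Simulated check with m(x) = x and d = 1: the estimate should be near m(0.5) = 0.5.
rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(2000, 1))
Y = X[:, 0] + 0.1 * rng.normal(size=2000)
m_hat = nadaraya_watson(np.array([0.5]), X, Y, h=0.1)
```

Since the kernel appears in both numerator and denominator, its normalizing constant can be dropped, as done above.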
$$\begin{pmatrix} \tilde{m}(x) \\ \tilde{m}^{(1)}(x) \end{pmatrix} = \begin{pmatrix} \displaystyle\sum_{i=1}^{n} k_i & \displaystyle\sum_{i=1}^{n} \xi_i' k_i \\ \displaystyle\sum_{i=1}^{n} \xi_i k_i & \displaystyle\sum_{i=1}^{n} \xi_i \xi_i' k_i \end{pmatrix}^{-1} \begin{pmatrix} \displaystyle\sum_{i=1}^{n} k_i Y_i \\ \displaystyle\sum_{i=1}^{n} \xi_i k_i Y_i \end{pmatrix}.$$
Let $k(u)$ be a multivariate symmetric kernel function for which $\int |u|^4 |k(u)| \, du < \infty$ and the functions $k(u)$, $uk(u)$, and $uu'k(u)$ satisfy Assumptions 1 and 3.
$$\sup_{\|x\| \le c_n} |\tilde{m}(x) - m(x)| = O_p(\delta_n^{-2} a_n^*).$$
$$\sup_{\|x\| \le c_n} |\tilde{m}(x) - m(x)| = O(\delta_n^{-2} a_n^*)$$
almost surely.

These are the same rates as for the Nadaraya–Watson estimator, except that the penalty term for expanding $c_n$ has been strengthened to $\delta_n^{-2}$. When $c_n$ is fixed, the convergence rate is Stone's optimal rate.
Alternative uniform convergence results for $p$th-order local polynomial estimators with fixed $c_n$ have been provided by Masry (1996) and Fan and Yao (2003, Thm. 6.5). Fan and Yao restrict attention to $d = 1$. Masry allows $d \ge 1$ but assumes that $(p+1)$ derivatives of $m(x)$ are uniformly bounded (second derivatives in the case of local linear estimation). Instead, we assume that the second derivatives of the product $f(x)m(x)$ are uniformly bounded, which is less restrictive for the case of local linear estimation.
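The matrix expression above is the familiar weighted least squares representation of the local linear estimator: $\tilde{m}(x)$ is the intercept from a kernel-weighted regression of $Y_i$ on $(1, \xi_i)$. A minimal sketch (names and simulated design ours):

```python
import numpy as np

def local_linear(x, X, Y, h):
    """Local linear estimate m_tilde(x): intercept of a kernel-weighted least
    squares regression of Y on (1, X - x)."""
    n, d = X.shape
    w = np.prod(np.exp(-0.5 * ((x - X) / h) ** 2), axis=1)  # kernel weights k_i
    Z = np.hstack([np.ones((n, 1)), X - x])                 # local regressors
    WZ = Z * w[:, None]
    beta = np.linalg.solve(Z.T @ WZ, Z.T @ (w * Y))         # (Z'WZ)^{-1} Z'WY
    return float(beta[0])

# Design-adaptivity check: for quadratic m the local linear fit has small bias.
rng = np.random.default_rng(3)
X = rng.uniform(-1.0, 1.0, size=(2000, 1))
Y = X[:, 0] ** 2 + 0.1 * rng.normal(size=2000)
m_tilde = local_linear(np.array([0.5]), X, Y, h=0.15)       # true m(0.5) = 0.25
```

Regressing on $X_i - x$ rather than $\xi_i = (x - X_i)/h$ rescales and re-signs the slope coefficient but leaves the intercept, and hence $\tilde{m}(x)$, unchanged.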
REFERENCES
Andrews, D.W.K. (1995) Nonparametric kernel estimation for semiparametric models. Econometric Theory 11, 560–596.
Ango Nze, P. & P. Doukhan (2004) Weak dependence: Models and applications to econometrics. Econometric Theory 20, 995–1045.
Bosq, D. (1998) Nonparametric Statistics for Stochastic Processes: Estimation and Prediction, 2nd ed. Lecture Notes in Statistics 110. Springer-Verlag.
Fan, J. (1992) Design-adaptive nonparametric regression. Journal of the American Statistical Association 87, 998–1004.
Fan, J. (1993) Local linear regression smoothers and their minimax efficiency. Annals of Statistics 21, 196–216.
Fan, J. & Q. Yao (2003) Nonlinear Time Series: Nonparametric and Parametric Methods. Springer-Verlag.
Granovsky, B.L. & H.-G. Müller (1991) Optimizing kernel methods: A unifying variational principle. International Statistical Review 59, 373–388.
Liebscher, E. (1996) Strong convergence of sums of α-mixing random variables with applications to density estimation. Stochastic Processes and Their Applications 65, 69–80.
Mack, Y.P. & B.W. Silverman (1982) Weak and strong uniform consistency of kernel regression estimates. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 61, 405–415.
Marron, J.S. & M.P. Wand (1992) Exact mean integrated squared error. Annals of Statistics 20, 712–736.
Masry, E. (1996) Multivariate local polynomial regression for time series: Uniform strong consistency and rates. Journal of Time Series Analysis 17, 571–599.
Müller, H.-G. (1984) Smooth optimum kernel estimators of densities, regression curves and modes. Annals of Statistics 12, 766–774.
Nadaraya, E.A. (1964) On estimating regression. Theory of Probability and Its Applications 9, 141–142.
Newey, W.K. (1994) Kernel estimation of partial means and a generalized variance estimator. Econometric Theory 10, 233–253.
Peligrad, M. (1991) Properties of uniform consistency of the kernel estimators of density and of regression functions under dependence conditions. Stochastics and Stochastic Reports 40, 147–168.
Rio, E. (1995) The functional law of the iterated logarithm for stationary strongly mixing sequences. Annals of Probability 23, 1188–1203.
Rosenblatt, M. (1956) Remarks on some non-parametric estimates of a density function. Annals of Mathematical Statistics 27, 832–837.
Stone, C.J. (1977) Consistent nonparametric regression. Annals of Statistics 5, 595–645.
Stone, C.J. (1982) Optimal global rates of convergence for nonparametric regression. Annals of Statistics 10, 1040–1053.
Tran, L.T. (1994) Density estimation for time series by histograms. Journal of Statistical Planning and Inference 40, 61–79.
Wand, M.P. & W.R. Schucany (1990) Gaussian-based kernels. Canadian Journal of Statistics 18, 197–204.
Watson, G.S. (1964) Smooth regression analysis. Sankhyā, Series A 26, 359–372.
APPENDIX
Proof of Theorem 1. We start with some preliminary bounds+ First note that Assump-
tion 1 implies that for any r ⱕ s,
冕 Rd
6K~u!6 r du ⱕ KP r⫺1 m ⱕ KP s⫺1 m+ (A.1)
Second, assuming without loss of generality that B0 ⱖ 1 and B1 ⱖ 1, note that the Lr
inequality, ~5!, and ~6! imply that for any 1 ⱕ r ⱕ s
ⱕ B1r0s B0
~s⫺r!0s
ⱕ B1 B0 + (A.2)
Zi ⫽ K 冉 冊
x ⫺ Xi
h
Yi +
Then for any $1 \le r \le s$, by iterated expectations, (A.2), a change of variables, and (A.1),
$$\begin{aligned}
h^{-d} E|Z_0|^r &= h^{-d} E\left(E\left(\left|K\left(\frac{x - X_0}{h}\right) Y_0\right|^r \Bigm| X_0\right)\right) \\
&= h^{-d} \int_{\mathbb{R}^d} \left|K\left(\frac{x - u}{h}\right)\right|^r E(|Y_0|^r \mid X_0 = u) f(u) \, du \\
&= \int_{\mathbb{R}^d} |K(u)|^r E(|Y_0|^r \mid X_0 = x - hu) f(x - hu) \, du \\
&\le \int_{\mathbb{R}^d} |K(u)|^r \, du \; B_1 B_0 \\
&\le \bar{K}^{s-1} \mu B_1 B_0 \\
&\equiv \bar{\mu} < \infty. \qquad (A.3)
\end{aligned}$$
Finally, for $j \ge j^*$, by iterated expectations, (7), two changes of variables, and Assumption 1,
$$\begin{aligned}
E|Z_0 Z_j| &= E\left(E\left(\left|K\left(\frac{x - X_0}{h}\right) K\left(\frac{x - X_j}{h}\right) Y_0 Y_j\right| \Bigm| X_0, X_j\right)\right) \\
&= \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \left|K\left(\frac{x - u_0}{h}\right) K\left(\frac{x - u_j}{h}\right)\right| E(|Y_0 Y_j| \mid X_0 = u_0, X_j = u_j) f_j(u_0, u_j) \, du_0 \, du_j \\
&= h^{2d} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |K(u_0) K(u_j)| E(|Y_0 Y_j| \mid X_0 = x - hu_0, X_j = x - hu_j) f_j(x - hu_0, x - hu_j) \, du_0 \, du_j \\
&\le h^{2d} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |K(u_0) K(u_j)| \, du_0 \, du_j \; B_2 \\
&\le h^{2d} \mu^2 B_2. \qquad (A.4)
\end{aligned}$$
Assume that $n$ is sufficiently large so that $h^{-d} \ge j^*$. We now bound the $C_j$ separately for $j \le j^*$, $j^* < j \le h^{-d}$, and $h^{-d} + 1 \le j < \infty$.
First, for $j \le j^*$, by the Cauchy–Schwarz inequality and (A.3) with $r = 2$,
$$|C_j| \le E(Z_0 - EZ_0)^2 \le EZ_0^2 \le \bar{\mu} h^d.$$
Third, for $j \ge h^{-d} + 1$, using Davydov's lemma, (2), and (A.3) with $r = s$, we obtain
$$|C_j| \le 6 A j^{-\beta(1 - 2/s)} (\bar{\mu} h^d)^{2/s}.$$
$$\begin{aligned}
n h^{2d} \operatorname{Var}(\hat{C}(x)) &= \frac{1}{n} E\left(\sum_{i=1}^{n} (Z_i - EZ_i)\right)^2 \\
&\le C_0 + 2 \sum_{j=1}^{j^*} |C_j| + 2 \sum_{j = j^* + 1}^{h^{-d}} |C_j| + 2 \sum_{j = h^{-d} + 1}^{\infty} |C_j| \\
&\le (1 + 2j^*) \bar{\mu} h^d + 2 \sum_{j = j^* + 1}^{h^{-d}} (\mu^2 B_2 + \bar{\mu}^2) h^{2d} + 2 \sum_{j = h^{-d} + 1}^{\infty} 6 A \bar{\mu}^{2/s} j^{-(2 - 2/s)} h^{2d/s} \\
&\le (1 + 2j^*) \bar{\mu} h^d + 2 (\mu^2 B_2 + \bar{\mu}^2) h^d + \frac{12 A \bar{\mu}^{2/s}}{(s-2)/s} h^d,
\end{aligned}$$
where the final inequality uses the fact that for $\delta > 1$ and $k \ge 1$
$$\sum_{j = k + 1}^{\infty} j^{-\delta} \le \int_{k}^{\infty} x^{-\delta} \, dx = \frac{k^{1 - \delta}}{\delta - 1}.$$
Defining
$$Q = (1 + 2j^*) \bar{\mu} + 2 (\mu^2 B_2 + \bar{\mu}^2) + \frac{12 A \bar{\mu}^{2/s} s}{s - 2}, \qquad (A.5)$$
the preceding bound yields (8). $\blacksquare$
$$P\left(\left|\sum_{i=1}^{n} Z_i\right| > \varepsilon\right) \le 4 \exp\left(-\frac{\varepsilon^2}{64 \sigma_m^2 \dfrac{n}{m} + \dfrac{8}{3} \varepsilon b m}\right) + 4 \frac{n}{m} \alpha_m,$$
$$\begin{aligned}
R_n(x) &= \hat{C}(x) - \frac{1}{nh^d} \sum_{i=1}^{n} Y_i K\left(\frac{x - X_i}{h}\right) 1(|Y_i| \le t_n) \\
&= \frac{1}{nh^d} \sum_{i=1}^{n} Y_i K\left(\frac{x - X_i}{h}\right) 1(|Y_i| > t_n). \qquad (A.6)
\end{aligned}$$
Then by a change of variables, using the region of integration, (6), and Assumption 1,
$$\begin{aligned}
|ER_n(x)| &\le \frac{1}{h^d} \int_{\mathbb{R}^d} \left|K\left(\frac{x - u}{h}\right)\right| E(|Y_0| 1(|Y_0| > t_n) \mid X_0 = u) f(u) \, du \\
&= \int_{\mathbb{R}^d} |K(u)| E(|Y_0| 1(|Y_0| > t_n) \mid X_0 = x - hu) f(x - hu) \, du \\
&\le \int_{\mathbb{R}^d} |K(u)| E(|Y_0|^s t_n^{-(s-1)} 1(|Y_0| > t_n) \mid X_0 = x - hu) f(x - hu) \, du \\
&\le t_n^{-(s-1)} \int_{\mathbb{R}^d} |K(u)| E(|Y_0|^s \mid X_0 = x - hu) f(x - hu) \, du
\end{aligned}$$
$$|R_n(x) - ER_n(x)| = O_p(t_n^{-(s-1)}) = O_p(a_n),$$
$$|K(x_2) - K(x_1)| \le \delta K^*(x_1), \qquad (A.8)$$
where $K^*(u)$ satisfies Assumption 1. Indeed, if $K(u)$ has compact support and is Lipschitz, then $K^*(u) = L_1 1(\|u\| \le 2L)$. On the other hand, if $K(u)$ satisfies the differentiability conditions of Assumption 3, then $K^*(u) = L_1 \left(1(\|u\| \le 2L) + (\|u\| - L)^{-\eta} 1(\|u\| > 2L)\right)$. In both cases $K^*(u)$ is bounded and integrable and therefore satisfies Assumption 1.
Note that for any $x \in A_j$, $\|x - x_j\|/h \le a_n$, and equation (A.8) implies that if $n$ is large enough so that $a_n \le L$,
$$\left|K\left(\frac{x - X_i}{h}\right) - K\left(\frac{x_j - X_i}{h}\right)\right| \le a_n K^*\left(\frac{x_j - X_i}{h}\right).$$
Now define
$$\tilde{C}(x) = \frac{1}{nh^d} \sum_{i=1}^{n} Y_i K^*\left(\frac{x - X_i}{h}\right), \qquad (A.9)$$
which satisfies
$$E|\tilde{C}(x)| \le B_1 B_0 \int_{\mathbb{R}^d} K^*(u) \, du < \infty.$$
Then
$$\begin{aligned}
\sup_{x \in A_j} |\hat{C}(x) - E\hat{C}(x)| &\le |\hat{C}(x_j) - E\hat{C}(x_j)| + a_n \left[|\tilde{C}(x_j)| + E|\tilde{C}(x_j)|\right] \\
&\le |\hat{C}(x_j) - E\hat{C}(x_j)| + a_n |\tilde{C}(x_j) - E\tilde{C}(x_j)| + 2 a_n E|\tilde{C}(x_j)| \\
&\le |\hat{C}(x_j) - E\hat{C}(x_j)| + |\tilde{C}(x_j) - E\tilde{C}(x_j)| + 2 a_n M,
\end{aligned}$$
the final inequality holding because $a_n \le 1$ for $n$ sufficiently large and for any $M > E|\tilde{C}(x)|$.
We find that
$$\begin{aligned}
P\left(\sup_{\|x\| \le c_n} |\hat{C}(x) - E\hat{C}(x)| > 3 M a_n\right) &\le N \max_{1 \le j \le N} P(|\hat{C}(x_j) - E\hat{C}(x_j)| > M a_n) \qquad (A.10) \\
&\quad + N \max_{1 \le j \le N} P(|\tilde{C}(x_j) - E\tilde{C}(x_j)| > M a_n). \qquad (A.11)
\end{aligned}$$
We now bound (A.10) and (A.11) using the same argument, as both $K(u)$ and $K^*(u)$ satisfy Assumption 1, and this is the only property we will use.
Let $Z_i(x) = Y_i K((x - X_i)/h) - E Y_i K((x - X_i)/h)$. Because $|Y_i| \le t_n$ and $|K((x - X_i)/h)| \le \bar{K}$, it follows that $|Z_i(x)| \le 2 t_n \bar{K} \equiv b_n$. Also from Theorem 1 we have (for $n$ sufficiently large) the bound
$$\sup_x E\left(\sum_{i=1}^{m} Z_i(x)\right)^2 \le Q m h^d.$$
Set $m = a_n^{-1} t_n^{-1}$ and note that $m < n$ and $m < \varepsilon b_n^{-1}/4$ for $\varepsilon = M a_n n h^d$, for $n$ sufficiently large.
$$\begin{aligned}
P(|\hat{C}(x) - E\hat{C}(x)| > M a_n) &= P\left(\left|\sum_{i=1}^{n} Z_i(x)\right| > M a_n n h^d\right) \\
&\le 4 \exp\left(-\frac{M^2 a_n^2 n^2 h^{2d}}{64 Q n h^d + 6 \bar{K} M n h^d}\right) + 4 \frac{n}{m} \alpha_m \\
&\le 4 \exp\left(-\frac{M^2 \ln n}{64 Q + 6 \bar{K} M}\right) + 4 A n m^{-1-\beta},
\end{aligned}$$
the second inequality using (2) and (14) and the last inequality taking $M > Q$. Recalling that $N \le c_n^d h^{-d} a_n^{-d}$, it follows from this and (A.10)–(A.11) that (A.12) holds, where
$$T_{1n} = c_n^d h^{-d} a_n^{-d} n^{-M/(64 + 6\bar{K})} \qquad (A.13)$$
and
Because
$$\sum_{n=1}^{\infty} P(|Y_n| > t_n) \le \sum_{n=1}^{\infty} t_n^{-s} E|Y_n|^s \le E|Y_0|^s \sum_{n=1}^{\infty} (n \phi_n)^{-1} < \infty,$$
using the fact that $\sum_{n=1}^{\infty} (n \phi_n)^{-1} < \infty$, for sufficiently large $n$, $|Y_n| \le t_n$ with probability one. Hence for sufficiently large $n$ and all $i \le n$, $|Y_i| \le t_n$, and thus $R_n(x) = 0$ with probability one. We have shown that
$$|R_n(x) - ER_n(x)| = O(a_n)$$
almost surely. Thus, as in the proof of Theorem 2, we can assume that $|Y_i| \le t_n$.
Equations (A.12)–(A.14) hold with $t_n = (n \phi_n)^{1/s}$ and $c_n = O(\phi_n^{1/d} n^{1/(2q)})$. Employing $h^{-d} = O(\phi_n^{-2} n^{\theta})$ and $a_n \le o(\phi_n^{-1/2} n^{-(1-\theta)/2})$, we find
$$T_{1n} = o\left(\phi_n^{-1} n^{d/(2q) + \theta + d(1-\theta)/2 - M/(64 + 6\bar{K})}\right) = o((n \phi_n)^{-1})$$
and
$$T_{2n} = O((n \phi_n)^{-1})$$
by (17) and the fact that $(1 + \beta)/s < (1 + \beta - d)/2$ is implied by (16). Thus
$$\sum_{n=1}^{\infty} (T_{1n} + T_{2n}) < \infty.$$
$$\tilde{C}(x) = \frac{1}{nh^d} \sum_{i=1}^{n} Y_i K\left(\frac{x - X_i}{h}\right) 1(\|X_i\| \le c_n). \qquad (A.15)$$
Observe that $c_n^{-q} \le O(a_n)$. Using the region of integration, a change of variables, (21), and Assumption 1,
$$\begin{aligned}
|E(\hat{C}(x) - \tilde{C}(x))| &\le h^{-d} E\left(|Y_0| \left|K\left(\frac{x - X_0}{h}\right)\right| 1(\|X_0\| > c_n)\right) \\
&= h^{-d} \int_{\|u\| > c_n} E(|Y_0| \mid X_0 = u) \left|K\left(\frac{x - u}{h}\right)\right| f(u) \, du \\
&\le h^{-d} c_n^{-q} \int_{\mathbb{R}^d} \|u\|^q E(|Y_0| \mid X_0 = u) \left|K\left(\frac{x - u}{h}\right)\right| f(u) \, du \\
&= c_n^{-q} \int_{\mathbb{R}^d} \|x - hu\|^q E(|Y_0| \mid X_0 = x - hu) f(x - hu) |K(u)| \, du \\
&\le c_n^{-q} B_3 \mu \\
&= O(a_n). \qquad (A.16)
\end{aligned}$$
By Markov’s inequality
Z x! ⫺ EC~
sup 6 C~ Z x!6 ⫽ sup 6 C~
E x! ⫺ EC~
E x!6 ⫹ Op ~a n !+ (A.17)
x x
Z x! with C~
This shows that the error in replacing C~ E x! is Op ~a n !+
Suppose that $c_n > L$, $\|x\| > 2 c_n$, and $\|X_i\| \le c_n$. Then $\|x - X_i\| \ge c_n$, and (22) and $q \ge d$ imply that
$$\left|K\left(\frac{x - X_i}{h}\right)\right| \le L_2 \left\|\frac{x - X_i}{h}\right\|^{-q} \le L_2 h^q c_n^{-q} \le L_2 h^d c_n^{-q}.$$
Therefore
$$\begin{aligned}
\sup_{\|x\| > 2 c_n} |\tilde{C}(x)| &\le \frac{1}{nh^d} \sum_{i=1}^{n} |Y_i| \sup_{\|x\| > 2 c_n} \left|K\left(\frac{x - X_i}{h}\right)\right| 1(\|X_i\| \le c_n) \\
&\le \frac{1}{n} \sum_{i=1}^{n} |Y_i| L_2 c_n^{-q} \\
&= O(a_n)
\end{aligned}$$
and
$$\sup_{\|x\| > 2 c_n} |\tilde{C}(x) - E\tilde{C}(x)| = O(a_n) \qquad (A.18)$$
$$\sup_{\|x\| \le 2 c_n} |\tilde{C}(x) - E\tilde{C}(x)| = O_p(a_n). \qquad (A.19)$$
$$\sup_x |\tilde{C}(x) - E\tilde{C}(x)| = O(a_n)$$
Proof of Theorem 6.
$$\sup_{\|x\| \le c_n} |\hat{f}^{(r)}(x) - E\hat{f}^{(r)}(x)| = O_p\left(h^{-r} \left(\frac{\log n}{nh^d}\right)^{1/2}\right) = O_p\left(\left(\frac{\log n}{nh^{d+2r}}\right)^{1/2}\right).$$
$$\begin{aligned}
E \hat{f}^{(r)}(x) &= \frac{1}{h^{d+r}} E\left(k^{(r)}\left(\frac{x - X_i}{h}\right)\right) \\
&= \frac{1}{h^{d+r}} \int k^{(r)}\left(\frac{x - u}{h}\right) f(u) \, du \\
&= \frac{1}{h^d} \int k\left(\frac{x - u}{h}\right) f^{(r)}(u) \, du \\
&= f^{(r)}(x) + O(h^p),
\end{aligned}$$
where the final equality follows from a $p$th-order Taylor series expansion and the assumed properties of the kernel and $f(x)$. Together we obtain (25). Equation (27) is obtained by setting $h = (\ln n / n)^{1/(d + 2p + 2r)}$, which is allowed when $\theta = d/(d + 2p + 2r)$. $\blacksquare$
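The bandwidth choice in this last step is simply the value that balances the two terms of the rate $(\log n/(nh^{d+2r}))^{1/2} + h^p$; a one-line check of the arithmetic:

```latex
\Bigl(\frac{\ln n}{n h^{d+2r}}\Bigr)^{1/2} = h^{p}
\;\Longleftrightarrow\;
h^{2p + d + 2r} = \frac{\ln n}{n}
\;\Longleftrightarrow\;
h = \Bigl(\frac{\ln n}{n}\Bigr)^{1/(d + 2p + 2r)},
```

at which bandwidth both the variance and bias terms are of order $(\ln n / n)^{p/(d + 2p + 2r)}$.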
Proof of Theorem 7. The argument is the same as for Theorem 6, except that Theo-
rem 3 is used so that the convergence holds almost surely+ 䡲
Proof of Theorem 8. Set g~ x! ⫽ m~ x! f ~ x!, g~ [ x! ⫽ ~nh d !⫺1 兺 i⫽1
n
Yi k~~ x ⫺ X i !0h!,
and f Z ~ x! ⫽ ~nh ! 兺 i⫽1 k~~ x ⫺ X i !0h!+ We can write
d ⫺1 n
[ x!
g~ [ x!0f ~ x!
g~
[ x! ⫽
m~ ⫽ + (A.20)
f Z ~ x! f Z ~ x!0f ~ x!
$$\sup_{\|x\| \le c_n} |\hat{f}(x) - f(x)| = O_p(a_n^*)$$
and therefore
$$\sup_{\|x\| \le c_n} \left|\frac{\hat{f}(x)}{f(x)} - 1\right| = \sup_{\|x\| \le c_n} \left|\frac{\hat{f}(x) - f(x)}{f(x)}\right| \le \frac{O_p(a_n^*)}{\inf_{\|x\| \le c_n} f(x)} \le O_p(\delta_n^{-1} a_n^*).$$
$$\sup_{\|x\| \le c_n} |\hat{g}(x) - E\hat{g}(x)| = O_p\left(\left(\frac{\log n}{nh^d}\right)^{1/2}\right).$$
We calculate that
$$\begin{aligned}
E \hat{g}(x) &= \frac{1}{h^d} E\left(E(Y_0 \mid X_0) \, k\left(\frac{x - X_0}{h}\right)\right) \\
&= \frac{1}{h^d} \int_{\mathbb{R}^d} k\left(\frac{x - u}{h}\right) m(u) f(u) \, du \\
&= \int_{\mathbb{R}^d} k(u) g(x - hu) \, du \\
&= g(x) + O(h^2)
\end{aligned}$$
and thus
$$\sup_{\|x\| \le c_n} |\hat{g}(x) - g(x)| = O_p(a_n^*).$$
$$\sup_{\|x\| \le c_n} \left|\frac{\hat{g}(x)}{f(x)} - m(x)\right| \le \frac{O_p(a_n^*)}{\inf_{\|x\| \le c_n} f(x)} \le O_p(\delta_n^{-1} a_n^*). \qquad (A.21)$$
$$\hat{m}(x) = \frac{\hat{g}(x)/f(x)}{\hat{f}(x)/f(x)} = \frac{m(x) + O_p(\delta_n^{-1} a_n^*)}{1 + O_p(\delta_n^{-1} a_n^*)} = m(x) + O_p(\delta_n^{-1} a_n^*)$$
as claimed.
The optimal rate is obtained by setting $h = (\ln n / n)^{1/(d+4)}$, which is allowed when $\theta = d/(d + 4)$, which is implied by (11) for sufficiently large $\beta$. $\blacksquare$
Proof of Theorem 9. The argument is the same as for Theorem 8, except that Theo-
rems 3 and 7 are used so that the convergence holds almost surely+ 䡲
Proof of Theorem 10. We can write
[ x! ⫺ S~ x! ' M~ x!⫺1 N~ x!
g~
K x! ⫽
m~ ,
f Z ~ x! ⫺ S~ x! ' M~ x!⫺1 S~ x!
where
S~ x! ⫽
nh
1
d 兺冉 冊冉 冊
n
i⫽1
x ⫺ Xi
h
k
x ⫺ Xi
h
,
M~ x! ⫽
nh
1
d 兺冉
n
i⫽1
冊冉 冊 冉 冊
x ⫺ Xi
h
x ⫺ Xi
h
'
k
x ⫺ Xi
h
,
N~ x! ⫽
nh
1
d 兺冉
n
i⫽1
冊冉 冊 x ⫺ Xi
h
k
x ⫺ Xi
h
Yi +
Defining V ⫽ 兰R d uu ' k~u! du, Theorem 2 and standard calculations imply that uni-
formly over 7 x7 ⱕ cn ,
S~ x! ⫽ hV f ~1! ~ x! ⫹ Op ~a n* !,
M~ x! ⫽ V f ~ x! ⫹ Op ~a n* !,
N~ x! ⫽ hVg ~1! ~ x! ⫹ Op ~a n* !+
f ~ x!⫺1 M~ x! ⫽ V ⫹ Op ~dn⫺1 a n* !,
and so
S~ x! ' M~ x!⫺1 S~ x!
⫽ Op ~dn⫺2 ~h ⫹ a n* ! 2 ! ⫽ Op ~dn⫺2 a n* !
f ~ x!
and
S~ x! ' M~ x!⫺1 N~ x!
⫽ Op ~dn⫺2 a n* !+
f ~ x!
Therefore
[ x! ⫺ S~ x! ' M~ x!⫺1 N~ x!
g~
f ~ x!
K x! ⫽
m~ ⫽ m~ x! ⫹ Op ~dn⫺2 a n* !
f Z ~ x! ⫺ S~ x! ' M~ x!⫺1 S~ x!
f ~ x!
uniformly over 7 x7 ⱕ cn + 䡲
Proof of Theorem 11. The argument is the same as for Theorem 10, except that
Theorems 3 and 7 are used so that the convergence holds almost surely+ 䡲