Bootstrap approximation of nearest neighbor regression function estimates (Q756890)

Let \((X,Y)\) be a random vector in the plane and denote by \(m(x)=E(Y \mid X=x)\) the corresponding regression function. Let \((X_i,Y_i)_{i=1}^n\) be an i.i.d. sample with distribution function \(H\), let \(F\) be the d.f. of \(X\), \(F_n\) the empirical d.f. of the \(X\) sample and \(H_n\) the bivariate empirical d.f. of the \((X,Y)\) sample. To estimate \(m(x)\) at a point \(x_0\), \textit{S. S. Yang} [J. Am. Stat. Assoc. 76, 658-662 (1981; Zbl 0475.62031)] introduced the smoothed nearest neighbor type estimate \[ m_n(x_0)=a_n^{-1}\int y K[a_n^{-1}(F_n(x_0)-F_n(x))]H_n(dx,dy), \] where \(\{a_n\}\) denotes a sequence of bandwidths and \(K\) is a kernel function, and \textit{W. Stute} [Ann. Stat. 12, 917-926 (1984; Zbl 0539.62026)] proved a central limit theorem for this estimate. In a small-sample situation, however, this result cannot be used, for example, to construct a confidence interval for \(m(x_0)\). To overcome this difficulty, the author considers the bootstrap version of \(m_n(x_0)\): \[ m^*_n(x_0)=a_n^{-1}\int y K[a_n^{-1}(F^*_n(x_0)-F^*_n(x))]H^*_n(dx,dy), \] where \((X^*_i,Y^*_i)_{i=1}^n\) is an i.i.d. sample with d.f. \(H_n\), \(H^*_n\) is the bivariate empirical d.f. of the \((X^*,Y^*)\) sample and \(F^*_n\) is the empirical d.f. of the \(X^*\) sample. The author shows, among other results, that \[ \sup_{z}| P^*[(na_n)^{1/2}(m^*_n(x_0)-m_n(x_0))\leq z]-P[(na_n)^{1/2}(m_n(x_0)-\bar m_n(x_0))\leq z]| \to 0 \] holds with probability one, where \[ \bar m_n(x_0)=a_n^{-1}\int y K[a_n^{-1}(F(x_0)-F(x))]H(dx,dy), \] and \(P^*\) is the probability measure corresponding to the bootstrap sample.
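
As an illustration only, the estimate and its bootstrap version can be sketched numerically. Since \(H_n\) puts mass \(1/n\) on each observation, the integral reduces to the sum \((na_n)^{-1}\sum_i Y_i K[a_n^{-1}(F_n(x_0)-F_n(X_i))]\). The sketch below is not taken from the reviewed paper: the Epanechnikov kernel, the function names and the quantile-based interval are assumptions made for the example. By the result quoted above, the interval is centered for \(\bar m_n(x_0)\); it covers \(m(x_0)\) only insofar as the bias \(\bar m_n(x_0)-m(x_0)\) is negligible.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel; an assumed choice, the review leaves K unspecified."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def ecdf(sample, t):
    """Empirical d.f. of `sample` evaluated at t (scalar or array)."""
    s = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(s, t, side="right") / len(s)


def nn_estimate(x, y, x0, a, kernel=epanechnikov):
    """m_n(x0) = (n a)^{-1} * sum_i Y_i K((F_n(x0) - F_n(X_i)) / a)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (ecdf(x, x0) - ecdf(x, x)) / a
    return np.sum(y * kernel(u)) / (len(x) * a)


def bootstrap_roots(x, y, x0, a, n_boot=500, seed=0):
    """Return m_n(x0) and draws of (n a)^{1/2} (m*_n(x0) - m_n(x0))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(x)
    m_hat = nn_estimate(x, y, x0, a)
    roots = np.empty(n_boot)
    for b in range(n_boot):
        # Resample pairs (X_i, Y_i) with replacement, i.e. i.i.d. from H_n.
        idx = rng.integers(0, n, size=n)
        m_star = nn_estimate(x[idx], y[idx], x0, a)
        roots[b] = np.sqrt(n * a) * (m_star - m_hat)
    return m_hat, roots


def bootstrap_interval(x, y, x0, a, level=0.95, n_boot=500, seed=0):
    """Interval for bar m_n(x0) obtained by inverting the bootstrap quantiles."""
    m_hat, roots = bootstrap_roots(x, y, x0, a, n_boot=n_boot, seed=seed)
    alpha = 1.0 - level
    q_lo, q_hi = np.quantile(roots, [alpha / 2.0, 1.0 - alpha / 2.0])
    scale = np.sqrt(len(x) * a)
    return m_hat - q_hi / scale, m_hat - q_lo / scale
```

A call such as `bootstrap_interval(x, y, x0=0.5, a=0.2)` on arrays `x`, `y` of equal length returns the lower and upper endpoints. The bandwidth `a` and the number of resamples `n_boot` are left to the user, and nothing in this sketch addresses how to choose them.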

reviewed by

Ken-ichi Yoshihara

zbMATH Keywords

bootstrap approximation

Monte Carlo

normal approximation

empirical distribution function