On the conditional distributions of low-dimensional projections from high-dimensional data

DOI: 10.1214/12-AOS1081; zbMATH Open: 1360.62371; arXiv: 1304.5943; OpenAlex: W2049955185; MaRDI QID: Q355082; FDO: Q355082

Publication date: 24 July 2013

Published in: The Annals of Statistics

Abstract: We study the conditional distribution of low-dimensional projections from high-dimensional data, where the conditioning is on other low-dimensional projections. To fix ideas, consider a random $d$-vector $Z$ that has a Lebesgue density and that is standardized so that $\mathbb{E} Z = 0$ and $\mathbb{E} Z Z' = I_d$. Moreover, consider two projections defined by unit-vectors $\alpha$ and $\beta$, namely a response $y = \alpha' Z$ and an explanatory variable $x = \beta' Z$. It has long been known that the conditional mean of $y$ given $x$ is approximately linear in $x$ under some regularity conditions; cf. Hall and Li [Ann. Statist. 21 (1993) 867-889]. However, a corresponding result for the conditional variance has not been available so far. We here show that the conditional variance of $y$ given $x$ is approximately constant in $x$ (again, under some regularity conditions). These results hold uniformly in $\alpha$ and for most $\beta$'s, provided only that the dimension of $Z$ is large. In that sense, we see that most linear submodels of a high-dimensional overall model are approximately correct. Our findings provide new insights in a variety of modeling scenarios. We discuss several examples, including sliced inverse regression, sliced average variance estimation, generalized linear models under potential link violation, and sparse linear modeling.
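The statement in the abstract can be probed numerically. The following is a minimal simulation sketch, not taken from the paper: it assumes a particular non-Gaussian $Z$ (i.i.d. centered exponential coordinates, which have mean 0, variance 1 and a Lebesgue density), draws $\beta$ at random from the unit sphere, and builds $\alpha$ so that $\alpha'\beta = \rho$. The function name `conditional_moments_by_bin` and the parameters `d`, `n`, `rho`, `n_bins` and `seed` are illustrative assumptions. Within quantile bins of $x$, the code compares the mean of $y$ with $\rho x$ and the variance of the residual $y - \rho x$ with $1 - \rho^2$, the values that hold exactly when $Z$ is Gaussian.

```python
import numpy as np


def conditional_moments_by_bin(d=200, n=40_000, rho=0.6, n_bins=10, seed=0):
    """Estimate E[y | x] and Var[y - rho*x | x] within quantile bins of x.

    All modeling choices here (centered exponential coordinates, random beta,
    alpha constructed with alpha'beta = rho) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Standardized non-Gaussian data: E Z = 0 and E ZZ' = I_d.
    Z = rng.exponential(1.0, size=(n, d)) - 1.0

    # beta: a direction drawn uniformly from the unit sphere.
    beta = rng.standard_normal(d)
    beta /= np.linalg.norm(beta)

    # g: a unit vector orthogonal to beta, used to build alpha.
    g = rng.standard_normal(d)
    g -= (g @ beta) * beta
    g /= np.linalg.norm(g)

    # alpha is a unit vector with alpha'beta = rho by construction.
    alpha = rho * beta + np.sqrt(1.0 - rho**2) * g

    x = Z @ beta   # explanatory variable
    y = Z @ alpha  # response

    # Assign each observation to one of n_bins equal-probability bins of x.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)

    resid = y - rho * x
    x_mid = np.array([x[idx == k].mean() for k in range(n_bins)])
    y_mean = np.array([y[idx == k].mean() for k in range(n_bins)])
    resid_var = np.array([resid[idx == k].var() for k in range(n_bins)])
    return x_mid, y_mean, resid_var
```

If the approximation described in the abstract is at work for the drawn $\beta$, `y_mean` should track `rho * x_mid` and `resid_var` should stay near `1 - rho**2` across bins. Rerunning with a small `d` (say `d=3`) is one way to see how the agreement can degrade when the dimension is not large.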

Full work available at URL: https://arxiv.org/abs/1304.5943

zbMATH Keywords

dimension reduction; regression; high-dimensional models; small sample size

Mathematics Subject Classification ID

Multivariate analysis (62H99); Asymptotic distribution theory in statistics (62E20); Estimation in multivariate analysis (62H12)

Cited In (7)
