Distributed statistical inference for massive data

From MaRDI portal
Publication:2054533

DOI: 10.1214/21-AOS2062; zbMATH Open: 1486.62123; arXiv: 1805.11214; OpenAlex: W3211347790; MaRDI QID: Q2054533; FDO: Q2054533


Authors: Liuhua Peng, Song Xi Chen


Publication date: 3 December 2021

Published in: The Annals of Statistics

Abstract: This paper considers distributed statistical inference for general symmetric statistics, which encompass U-statistics and M-estimators, in the context of massive data where the data can be stored on multiple platforms in different locations. To facilitate effective computation and to avoid expensive communication among platforms, we formulate distributed statistics that can be computed over smaller data blocks. The statistical properties of the distributed statistics are investigated in terms of the mean square error of estimation and their asymptotic distributions with respect to the number of data blocks. In addition, we propose two distributed bootstrap algorithms that are computationally effective and able to capture the underlying distribution of the distributed statistics. Numerical simulations and real data applications of the proposed approaches are provided to demonstrate their empirical performance.
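
Illustration (not part of the publication record): the idea of computing a statistic over smaller data blocks and then combining the results can be sketched as follows. This is a minimal sketch that assumes the combination rule is a block-size-weighted average of the per-block statistics. The helper names `split_into_blocks` and `distributed_statistic`, the simulated Gaussian data, and the choice of 20 blocks are all hypothetical, and the sample variance stands in as an example of a symmetric statistic.

```python
import random
import statistics


def split_into_blocks(data, k):
    """Partition data into k contiguous blocks of near-equal size (hypothetical helper)."""
    base, extra = divmod(len(data), k)
    blocks, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        blocks.append(data[start:start + size])
        start += size
    return blocks


def distributed_statistic(blocks, stat):
    """Size-weighted average of a statistic evaluated separately on each block.

    Assumption: this weighting is one simple way to combine block-level
    results; it is not claimed to be the paper's exact definition.
    """
    total = sum(len(b) for b in blocks)
    return sum(len(b) * stat(b) for b in blocks) / total


# Simulated data standing in for a data set too large to process at once.
rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

blocks = split_into_blocks(data, k=20)

# Each block could be processed on a different machine; only the
# block-level values and block sizes need to be communicated.
dist_var = distributed_statistic(blocks, statistics.variance)
full_var = statistics.variance(data)

dist_mean = distributed_statistic(blocks, statistics.fmean)
full_mean = statistics.fmean(data)
```

For a linear statistic such as the mean, the block-combined value matches the full-sample value up to floating-point error. For the sample variance the two generally differ slightly, which is the kind of gap the abstract's analysis of mean square error with respect to the number of blocks is concerned with.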


Full work available at URL: https://arxiv.org/abs/1805.11214






Cited In (19)




