On statistics, computation and scalability
Publication:373533
DOI: 10.3150/12-BEJSP17
zbMATH Open: 1273.62030
arXiv: 1309.7804
MaRDI QID: Q373533
FDO: Q373533
Authors: Michael Jordan
Publication date: 17 October 2013
Published in: Bernoulli
Abstract: How should statistical procedures be designed so as to be scalable computationally to the massive datasets that are increasingly the norm? When coupled with the requirement that an answer to an inferential question be delivered within a certain time budget, this question has significant repercussions for the field of statistics. With the goal of identifying "time-data tradeoffs," we investigate some of the statistical consequences of computational perspectives on scalability, in particular divide-and-conquer methodology and hierarchies of convex relaxations.
Full work available at URL: https://arxiv.org/abs/1309.7804
Cites Work
- Bootstrap methods: another look at the jackknife
- Title not available
- Minimax estimation via wavelet shrinkage
- A Simpler Approach to Matrix Completion
- Weak convergence of dependent empirical measures with application to subsampling in function spaces
- A note on methods of restoring consistency to the bootstrap
- A split-and-conquer approach for analysis of
- Computational and statistical tradeoffs via convex relaxation
- High-dimensional analysis of semidefinite relaxations for sparse principal components
- Quantum mechanical algorithms for the nonabelian hidden subgroup problem
Cited In (19)
- Relative efficiency of using summary versus individual data in random‐effects meta‐analysis
- Variational discriminant analysis with variable selection
- A general framework for penalized mixed-effects multitask learning with applications on DNA methylation surrogate biomarkers creation
- Optimal sampling designs for multidimensional streaming time series with application to power grid sensor data
- Variable selection for distributed sparse regression under memory constraints
- Distributed Censored Quantile Regression
- Eigenvector-based sparse canonical correlation analysis: fast computation for estimation of multiple canonical vectors
- A Large-Scale Constrained Joint Modeling Approach for Predicting User Activity, Engagement, and Churn With Application to Freemium Mobile Games
- A communication-efficient method for ℓ0 regularization linear regression models
- Statistical and computational tradeoff in genetic algorithm-based estimation
- A MOM-based ensemble method for robustness, subsampling and hyperparameter tuning
- Nonconvex Dantzig selector and its parallel computing algorithm
- Robust classification via MOM minimization
- Joint integrative analysis of multiple data sources with correlated vector outcomes
- Distributed sequential estimation procedures
- Discussion of "Analysis of spatio-temporal mobile phone data: a case study in the metropolitan area of Milan" by Piercesare Secchi, Simone Vantini and Valeria Vitelli
- Distributed statistical estimation and rates of convergence in normal approximation
- Parallel-and-stream accelerator for computationally fast supervised learning
- A divide-and-conquer algorithm for core-periphery identification in large networks