Small area estimation of the homeless in Los Angeles: an application of cost-sensitive stochastic gradient boosting
Publication:614137
DOI: 10.1214/10-AOAS328 | zbMATH Open: 1202.62178 | arXiv: 1011.2890 | MaRDI QID: Q614137 | FDO: Q614137
Authors: Brian Kriegler, Richard Berk
Publication date: 27 December 2010
Published in: The Annals of Applied Statistics
Abstract: In many metropolitan areas efforts are made to count the homeless to ensure proper provision of social services. Some areas are very large, which makes spatial sampling a viable alternative to an enumeration of the entire terrain. Counts are observed in sampled regions but must be imputed in unvisited areas. Along with the imputation process, the costs of underestimating and overestimating may be different. For example, if precise estimation in areas with large homeless counts is critical, then underestimation should be penalized more than overestimation in the loss function. We analyze data from the 2004–2005 Los Angeles County homeless study using an augmentation of stochastic gradient boosting that can weight overestimates and underestimates asymmetrically. We discuss our choice to utilize stochastic gradient boosting over other function estimation procedures. In-sample fitted and out-of-sample imputed values, as well as relationships between the response and predictors, are analyzed for various cost functions. Practical usage and policy implications of these results are discussed briefly.
Full work available at URL: https://arxiv.org/abs/1011.2890
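The abstract's idea of penalizing underestimates more than overestimates can be illustrated with a cost-weighted absolute error. The sketch below is not taken from the paper: it assumes the familiar quantile-style ("check") loss as one plausible way to express asymmetric costs, and the names `asymmetric_loss`, `negative_gradient`, `best_constant`, the weight `alpha`, and the counts are all hypothetical. In a gradient boosting procedure, each iteration would fit a small tree to the negative gradients (pseudo-residuals) shown here, and the stochastic variant would compute them on a random subsample of the data.

```python
def asymmetric_loss(y, f, alpha):
    """Cost-weighted absolute error for one observation.

    alpha in (0, 1) is the weight on underestimates (y > f);
    overestimates (y < f) receive weight 1 - alpha.
    This loss form is an assumption made for illustration.
    """
    residual = y - f
    if residual > 0:
        return alpha * residual
    return (1.0 - alpha) * (-residual)


def negative_gradient(y, f, alpha):
    """Negative gradient of the loss with respect to the fitted value f.

    These are the pseudo-residuals a boosting iteration would fit.
    """
    if y > f:
        return alpha
    if y < f:
        return -(1.0 - alpha)
    return 0.0


def total_loss(ys, f, alpha):
    """Sum of the asymmetric loss over all observations for a constant fit f."""
    return sum(asymmetric_loss(y, f, alpha) for y in ys)


def best_constant(ys, alpha):
    """Best constant fit, searched over the observed values.

    The total loss is convex and piecewise linear in f, so a minimizer
    can always be found among the data points.
    """
    return min(sorted(ys), key=lambda c: total_loss(ys, c, alpha))


# Hypothetical counts for nine sampled areas (not data from the study).
counts = [0, 0, 1, 2, 3, 5, 8, 13, 40]

symmetric = best_constant(counts, 0.5)  # equal costs: the median, 3
cautious = best_constant(counts, 0.8)   # underestimates cost 4x more: 13

print(symmetric, cautious)
```

Raising `alpha` pushes the fitted value upward, which is the behavior one would want when missing a large count is costlier than overstating a small one.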
Cites Work
- A decision-theoretic generalization of on-line learning and an application to boosting
- The elements of statistical learning. Data mining, inference, and prediction
- Greedy function approximation: A gradient boosting machine.
- Least angle regression. (With discussion)
- Title not available
- Random forests
- Bagging predictors
- Quantile regression.
- Additive logistic regression: a statistical view of boosting. (With discussion and a rejoinder by the authors)
- Boosting algorithms: regularization, prediction and model fitting
- Model selection and inference: facts and fiction
- Title not available
- Boosting with early stopping: convergence and consistency
- Quantile regression forests
- Boosting with the L2 loss
- Stochastic gradient boosting.
- Can one estimate the conditional distribution of post-model-selection estimators?
- Statistical learning from a regression perspective
- Title not available
- Boosted classification trees and class probability/quantile estimation
- Counting the homeless in Los Angeles County
Cited In (14)
- Mathematical optimization in classification and regression trees
- A General M-estimation Theory in Semi-Supervised Framework
- Counting the homeless in Los Angeles County
- Semi-Supervised Linear Regression
- Kernel machines with missing responses
- Inflection points in community-level homeless rates
- An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market
- 2-step gradient boosting approach to selectivity bias correction in tax audit: an application to the VAT gap in Italy
- Small Area Quantile Estimation
- Small area mean estimation after effect clustering
- Semi-supervised inference: general theory and estimation of means
- Nonparametric multiple expectile regression via ER-Boost
- Potential sales estimates of a new store
- Adaptive stochastic gradient boosting tree with composite criterion