Small area estimation of the homeless in Los Angeles: an application of cost-sensitive stochastic gradient boosting
Abstract: In many metropolitan areas efforts are made to count the homeless to ensure proper provision of social services. Some areas are very large, which makes spatial sampling a viable alternative to an enumeration of the entire terrain. Counts are observed in sampled regions but must be imputed in unvisited areas. Along with the imputation process, the costs of underestimating and overestimating may be different. For example, if precise estimation in areas with large homeless counts is critical, then underestimation should be penalized more than overestimation in the loss function. We analyze data from the 2004-2005 Los Angeles County homeless study using an augmentation of stochastic gradient boosting that can weight overestimates and underestimates asymmetrically. We discuss our choice to utilize stochastic gradient boosting over other function estimation procedures. In-sample fitted and out-of-sample imputed values, as well as relationships between the response and predictors, are analyzed for various cost functions. Practical usage and policy implications of these results are discussed briefly.
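The asymmetric weighting described in the abstract can be illustrated with a small sketch. The abstract does not state the exact form of the cost function, so the version below assumes a linear, quantile-style loss in which a hypothetical parameter `alpha` sets the relative cost of underestimating. The function names are illustrative only and are not taken from the publication.

```python
def asymmetric_loss(observed, predicted, alpha=0.9):
    """Linear asymmetric loss (assumed form, not taken from the paper).

    Underestimates are weighted by alpha, overestimates by (1 - alpha),
    so alpha > 0.5 penalizes underestimation more heavily.
    """
    if observed > predicted:  # prediction too low: underestimate
        return alpha * (observed - predicted)
    return (1.0 - alpha) * (predicted - observed)  # too high, or exact


def negative_gradient(observed, predicted, alpha=0.9):
    """Negative gradient of the loss with respect to the prediction.

    A gradient boosting step would fit its next base learner to these
    values. At observed == predicted the loss has a kink; this sketch
    arbitrarily returns the overestimate branch there.
    """
    return alpha if observed > predicted else -(1.0 - alpha)


# Same absolute error of 4, but the underestimate costs more.
under = asymmetric_loss(observed=10, predicted=6)
over = asymmetric_loss(observed=10, predicted=14)
print(under > over)
```

With `alpha = 0.5` the two kinds of error cost the same. Moving `alpha` toward 1 pushes fitted values upward, which corresponds to the case in the abstract where missing large counts is the more costly mistake.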
Cites work
- scientific article; zbMATH DE number 5957364
- scientific article; zbMATH DE number 3860199
- scientific article; zbMATH DE number 1907572
- A decision-theoretic generalization of on-line learning and an application to boosting
- Additive logistic regression: a statistical view of boosting. (With discussion and a rejoinder by the authors)
- Bagging predictors
- Boosted classification trees and class probability/quantile estimation
- Boosting With the L2 Loss
- Boosting algorithms: regularization, prediction and model fitting
- Boosting with early stopping: convergence and consistency
- Can one estimate the conditional distribution of post-model-selection estimators?
- Counting the homeless in Los Angeles County
- Greedy function approximation: A gradient boosting machine.
- Least angle regression. (With discussion)
- MODEL SELECTION AND INFERENCE: FACTS AND FICTION
- Quantile regression forests
- Quantile regression.
- Random forests
- Statistical learning from a regression perspective
- Stochastic gradient boosting.
- The elements of statistical learning. Data mining, inference, and prediction
Cited in (14)
- Adaptive stochastic gradient boosting tree with composite criterion
- Mathematical optimization in classification and regression trees
- A General M-estimation Theory in Semi-Supervised Framework
- Counting the homeless in Los Angeles County
- Semi-Supervised Linear Regression
- Kernel machines with missing responses
- Inflection points in community-level homeless rates
- An empirical comparison of classification algorithms for mortgage default prediction: evidence from a distressed mortgage market
- 2-step gradient boosting approach to selectivity bias correction in tax audit: an application to the VAT gap in Italy
- Small Area Quantile Estimation
- Small area mean estimation after effect clustering
- Semi-supervised inference: general theory and estimation of means
- Nonparametric multiple expectile regression via ER-Boost
- Potential sales estimates of a new store
This page was built for publication: Small area estimation of the homeless in Los Angeles: an application of cost-sensitive stochastic gradient boosting