Regularized target encoding outperforms traditional methods in supervised machine learning with high cardinality features
DOI: 10.1007/s00180-022-01207-6; zbMath: 1505.62313; arXiv: 2104.00629; OpenAlex: W3142848995; MaRDI QID: Q2095774
Florian Pfisterer, Bernd Bischl, Florian Pargent, Janek Thomas
Publication date: 15 November 2022
Published in: Computational Statistics
Full work available at URL: https://arxiv.org/abs/2104.00629
Keywords: benchmark; generalized linear mixed models; supervised machine learning; dummy encoding; high-cardinality categorical features; target encoding
Mathematics Subject Classification:
- Computational methods for problems pertaining to statistics (62-08)
- Classification and discrimination; cluster analysis (statistical aspects) (62H30)
- Learning and adaptive systems in artificial intelligence (68T05)
Cites Work
- Additive structure in qualitative data: An alternating least squares method with optimal scaling features
- Regression with qualitative and quantitative variables: An alternating least squares method with optimal scaling features
- Inference for the generalization error
- Benchmark for filter methods for feature selection in high-dimensional classification data
- On the necessity and design of studies comparing statistical methods