Regression diagnostics meets forecast evaluation: conditional calibration, reliability diagrams, and coefficient of determination
From MaRDI portal
Publication:6144424
Abstract: Model diagnostics and forecast evaluation are two sides of the same coin. A common principle is that fitted or predicted distributions ought to be calibrated or reliable, ideally in the sense of auto-calibration, where the outcome is a random draw from the posited distribution. For binary responses, this is the universal concept of reliability. For real-valued outcomes, a general theory of calibration has been elusive, despite a recent surge of interest in distributional regression and machine learning. We develop a framework rooted in probability theory, which gives rise to hierarchies of calibration, and applies to both predictive distributions and stand-alone point forecasts. In a nutshell, a prediction (distributional or single-valued) is conditionally T-calibrated if it can be taken at face value in terms of the functional T. Whenever T is defined via an identification function, as in the cases of threshold (non) exceedance probabilities, quantiles, expectiles, and moments, auto-calibration implies T-calibration. We introduce population versions of T-reliability diagrams and revisit a score decomposition into measures of miscalibration (MCB), discrimination (DSC), and uncertainty (UNC). In empirical settings, stable and efficient estimators of T-reliability diagrams and score components arise via nonparametric isotonic regression and the pool-adjacent-violators algorithm. For in-sample model diagnostics, we propose a universal coefficient of determination, $\text{R}^\ast = \frac{\text{DSC} - \text{MCB}}{\text{UNC}}$, that nests and reinterprets the classical $\text{R}^2$ in least squares (mean) regression and its natural analogue $\text{R}^1$ in quantile regression, yet applies to T-regression in general, with $\text{MCB} \geq 0$, $\text{DSC} \geq 0$, and $\text{R}^\ast \in [0, 1]$ under modest conditions.
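As an illustration of the score decomposition and the coefficient of determination described in the abstract, the following is a minimal sketch for the special case where T is the mean functional and the score is squared error. It is not the authors' implementation: the function names `pav_recalibrate` and `score_decomposition` are hypothetical, and the sketch assumes that the recalibrated forecast is obtained by isotonic regression of the outcomes on the forecast values via the pool-adjacent-violators algorithm, and that the reference forecast is the unconditional mean of the outcomes.

```python
def pav_recalibrate(x, y):
    """Isotonic (nondecreasing) least-squares regression of y on x via
    the pool-adjacent-violators algorithm. Returns recalibrated values
    in the original order of x. Tied forecast values are pooled first."""
    # Aggregate outcomes by distinct forecast value: x -> (sum of y, count).
    groups = {}
    for xi, yi in zip(x, y):
        s, n = groups.get(xi, (0.0, 0))
        groups[xi] = (s + yi, n + 1)
    keys = sorted(groups)

    # Each block holds [sum of y, count, number of distinct x values pooled].
    blocks = []
    for k in keys:
        s, n = groups[k]
        blocks.append([s, n, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s2, n2, m2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2
            blocks[-1][2] += m2

    # Map every distinct x value to the mean of its block.
    fitted = {}
    key_iter = iter(keys)
    for s, n, m in blocks:
        for _ in range(m):
            fitted[next(key_iter)] = s / n
    return [fitted[xi] for xi in x]


def score_decomposition(x, y):
    """Squared-error decomposition S = MCB - DSC + UNC for mean forecasts,
    together with R* = (DSC - MCB) / UNC. Assumes y is not constant."""
    n = len(y)
    mean_score = lambda f: sum((fi - yi) ** 2 for fi, yi in zip(f, y)) / n

    y_bar = sum(y) / n
    s_x = mean_score(x)                       # score of the original forecast
    s_rc = mean_score(pav_recalibrate(x, y))  # score of the recalibrated forecast
    s_mg = mean_score([y_bar] * n)            # score of the constant reference

    mcb = s_x - s_rc
    dsc = s_mg - s_rc
    unc = s_mg
    return {"MCB": mcb, "DSC": dsc, "UNC": unc, "R_star": (dsc - mcb) / unc}
```

For example, `score_decomposition([1, 2, 3, 4], [1, 3, 2, 4])` yields MCB = 0.375, DSC = 1.125, UNC = 1.25, and hence R* = 0.6, which in this squared-error setting coincides with one minus the ratio of the forecast's mean squared error to that of the constant mean forecast.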
Cites work
- scientific article; zbMATH DE number 3141644
- scientific article; zbMATH DE number 53676
- scientific article; zbMATH DE number 193111
- scientific article; zbMATH DE number 3390139
- A score regression approach to assess calibration of continuous probabilistic predictions
- Algorithms in order restricted statistical inference and the Cauchy mean value property
- An Empirical Distribution Function for Sampling with Incomplete Information
- Bias-corrected score decomposition for generalized quantiles
- Characterizing the optimal solutions to the isotonic regression problem for identifiable functionals
- Combining predictive distributions
- Conditional transformation models
- Cross-calibration of probabilistic forecasts
- Elicitability and backtesting: perspectives for banking regulation
- Elicitation of Personal Probabilities and Expectations
- Empirical Processes with Applications to Statistics
- Goodness of Fit and Related Inference Processes for Quantile Regression
- Higher order elicitability and Osband's principle
- Inconsistency of bootstrap: the Grenander estimator
- Inferences Under a Stochastic Ordering Constraint
- Isotonic Distributional Regression
- Krein condition in probabilistic moment problems
- Machine learning. The art and science of algorithms that make sense of data.
- Making and evaluating point forecasts
- Measurability of functionals and of ideal point forecasts
- Measure and probability
- Monotone least squares and isotonic quantiles
- Monotone percentile regression
- Nonparametric shape-restricted regression
- Of quantiles and expectiles: consistent scoring functions, Choquet representations and forecast rankings. With discussion and authors' reply
- On regression representations of stochastic processes
- On the distributional transform, Sklar's theorem, and the empirical copula process
- Order-sensitivity and equivariance of scoring functions
- Point forecasting and forecast evaluation with generalized Huber loss
- Predictive density and conditional confidence interval accuracy tests
- Predictive model assessment for count data
- Present Position and Potential Developments: Some Personal Views: Statistical Theory: The Prequential Approach
- Probabilistic Forecasts, Calibration and Sharpness
- Regression Quantiles
- Robust Estimation of a Location Parameter
- Sensitivity measures based on scoring functions
- Strictly Proper Scoring Rules, Prediction, and Estimation
- The asymptotic behavior of monotone percentile regression estimates
- The role of the information set for forecasting -- with applications to risk management
- Valid sequential inference on probability forecast performance
- Veridical data science
Cited in (7)
- Isotonic conditional laws
- Auto-calibration tests for discrete finite regression functions
- Decompositions of the mean continuous ranked probability score
- Tail Calibration of Probabilistic Forecasts
- Regression Recalibration by Learning PIT Map Values
- Isotonic Regression for Variance Estimation and Its Role in Mean Estimation and Model Validation
- Neural Networks for Insurance Pricing with Frequency and Severity Data: A Benchmark Study from Data Preprocessing to Technical Tariff