To explain or to predict?
Abstract: Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.
Cites work
- scientific article; zbMATH DE number 3483405
- scientific article; zbMATH DE number 3513162
- scientific article; zbMATH DE number 3549968
- scientific article; zbMATH DE number 735230
- scientific article; zbMATH DE number 2002520
- scientific article; zbMATH DE number 1834445
- scientific article; zbMATH DE number 918134
- A Predictive View of the Detection and Characterization of Influential Observations in Regression Analysis
- A conversation with Hirotugu Akaike
- An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias
- An investigation of missing data methods for classification trees applied to binary response data
- Bayes Model Averaging with Selection of Regressors
- Bayes not Bust! Why Simplicity is no Problem for Bayesians
- Bayesian data analysis.
- Boosting algorithms: regularization, prediction and model fitting
- Causal diagrams for empirical research
- Causation, prediction, and search
- Common risk factors in the returns on stocks and bonds
- Decomposition of Prediction Error in Multilevel Models
- Handling missing values when applying classification models
- Heuristics of instability and stabilization in model selection
- How to Tell When Simpler, More Unified, or Less Ad Hoc Theories will Provide More Accurate Predictions
- Information criteria and statistical modeling.
- Interaction effects in logistic regression
- Introduction to linear regression analysis.
- Investigating Causal Relations by Econometric Models and Cross-spectral Methods
- Methods and Criteria for Model Selection
- Modeling online auctions.
- Not even wrong. The failure of string theory and the search for unity in physical law.
- Predictive likelihood: A review. With comments and a rejoinder by the author
- Present Position and Potential Developments: Some Personal Views: Statistical Theory: The Prequential Approach
- Probability weights in rank-dependent utility with binary even-chance independence.
- Random forests
- Scientific method, statistical method and the speed of light.
- Simplicity, Inference and Modelling
- Specification Tests in Econometrics
- Statistical learning from a regression perspective
- Statistical modeling: The two cultures. (With comments and a rejoinder).
- The Predictive Sample Reuse Method with Applications
- The Statistical Research Group, 1942-1945
- The central role of the propensity score in observational studies for causal effects
Cited in (79)
- Methods to compute prediction intervals: a review and new results
- A random forest based approach for predicting spreads in the primary catastrophe bond market
- Parameter identifiability and model selection for sigmoid population growth models
- The balance property in neural network modelling
- Variable selection in time series forecasting using random forests
- The wisdom of crowds and transfer market values
- Omitted variable bias in GLMs of neural spiking activity
- Learning certifiably optimal rule lists for categorical data
- Variable Selection With Second-Generation P-Values
- Machine learning versus statistical modeling
- Detailed study of a moving average trading rule
- Robust estimation in canonical correlation analysis for multivariate functional data
- PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection
- Particle swarm optimization based ridge logistic estimator
- Robust multivariate functional discriminant coordinates
- Comment on: ``Models as approximations
- An endemic–epidemic beta model for time series of infectious disease proportions
- Monitoring systemic risk in the hedge fund sector
- Prediction of the Nash through penalized mixture of logistic regression models
- The heteroscedastic graded response model with a skewed latent trait: testing statistical and substantive hypotheses related to skewed item category functions
- Variable selection -- a review and recommendations for the practicing statistician
- Pitfalls and merits of cointegration-based mortality models
- Discriminant coordinates analysis for multivariate functional data
- RandGA: injecting randomness into parallel genetic algorithm for variable selection
- scientific article; zbMATH DE number 52015
- Quantifying simulator discrepancy in discrete-time dynamical simulators
- Measuring the Stability of Results From Supervised Statistical Learning
- On the exploration of regression dependence structures in multidimensional contingency tables with ordinal response variables
- Handling co-dependence issues in resampling-based variable selection procedures: a simulation study
- Sequential event prediction
- Controlling the error probabilities of model selection information criteria using bootstrapping
- scientific article; zbMATH DE number 7625186
- Multivariate analysis of variance for functional data
- The growing ubiquity of algorithms in society: implications, impacts and innovations
- Selected statistical methods of data analysis for multivariate functional data
- Explanation, prediction, description, and information theory
- ‘The COM‐Poisson model for count data: a survey of methods and applications’ by K. Sellers, S. Borle and G. Shmueli
- On exploratory analytic method for multi-way contingency tables with an ordinal response variable and categorical explanatory variables
- Distributional regression for demand forecasting in e-grocery
- On stability issues in deriving multivariable regression models
- The Need for More Emphasis on Prediction: A “Nondenominational” Model-Based Approach
- Interpretable classifiers using rules and Bayesian analysis: building a better stroke prediction model
- A Tale of Two Matrix Factorizations
- Games with second-order expected utility
- Rejoinder to: Probability estimation with machine learning methods for dichotomous and multicategory outcome
- What makes a VRP solution good? The generation of problem-specific knowledge for heuristics
- Models for understanding versus models for prediction
- A Bayesian perspective of statistical machine learning for big data
- An empirical comparison of popular structure learning algorithms with a view to gene network inference
- Bayesian hierarchical rule modeling for predicting medical conditions
- A novel bagging approach for variable ranking and selection via a mixed importance measure
- A novel completeness test for leakage models and its application to side channel attacks and responsibly engineered simulators
- An evolutionary estimation procedure for generalized semilinear regression trees
- A comparison of full model specification and backward elimination of potential confounders when estimating marginal and conditional causal effects on binary outcomes from observational data
- A neutral comparison of algorithms to minimize \(L_0\) penalties for high-dimensional variable selection
- Predicting class switch recombination in B-cells from antibody repertoire data
- Propensity-based standardization to enhance the validation and interpretation of prediction model discrimination for a target population
- Confidence, prediction, and tolerance in linear mixed models
- Some models are useful, but how do we know which ones? Towards a unified Bayesian model taxonomy
- A Survey of Differentially Private Regression for Clinical and Epidemiological Research
- The InterModel Vigorish as a Lens for understanding (and quantifying) the value of item response models for dichotomously coded items
- Flexible model-based non-negative matrix factorization with application to mutational signatures
- Explainable AI for operational research: a defining framework, methods, applications, and a research agenda
- Information criteria for model selection
- Post-estimation shrinkage in full and selected linear regression models in low-dimensional data revisited
- Statistical plasmode simulations -- potentials, challenges and recommendations
- SUBiNN: a stacked uni- and bivariate \(k\)NN sparse ensemble
- Estimating retail demand with Poisson mixtures and out-of-sample likelihood
- A special issue on: Actual impact and future perspectives on stochastic modelling in business and industry
- Differential equations in data analysis
- Bayesian approaches to variable selection: a comparative study from practical perspectives
- Explainable ensemble trees
- A zero-inflated endemic-epidemic model with an application to measles time series in Germany
- Classification model with weighted regularization to improve the reproducibility of neuroimaging signature selection
- Analytical Problem Solving Based on Causal, Correlational and Deductive Models
- Selection of variables for multivariable models: opportunities and limitations in quantifying model stability by resampling
- Using cross-validation methods to select time series models: promises and pitfalls
- Forbidden Knowledge and Specialized Training: A Versatile Solution for the Two Main Sources of Overfitting in Linear Regression
- Conditional intensity: A powerful tool for modelling and analysing point process data
This page was built for publication: To explain or to predict? (MaRDI item Q906529)