Abstract: Statistical modeling is a powerful tool for developing and testing theories by way of causal explanation, prediction, and description. In many disciplines there is near-exclusive use of statistical modeling for causal explanation and the assumption that models with high explanatory power are inherently of high predictive power. Conflation between explanation and prediction is common, yet the distinction must be understood for progressing scientific knowledge. While this distinction has been recognized in the philosophy of science, the statistical literature lacks a thorough discussion of the many differences that arise in the process of modeling for an explanatory versus a predictive goal. The purpose of this article is to clarify the distinction between explanatory and predictive modeling, to discuss its sources, and to reveal the practical implications of the distinction to each step in the modeling process.
Cites work
- scientific article; zbMATH DE number 3483405
- scientific article; zbMATH DE number 3513162
- scientific article; zbMATH DE number 3549968
- scientific article; zbMATH DE number 735230
- scientific article; zbMATH DE number 2002520
- scientific article; zbMATH DE number 1834445
- scientific article; zbMATH DE number 918134
- A Predictive View of the Detection and Characterization of Influential Observations in Regression Analysis
- A conversation with Hirotugu Akaike
- An Efficient Method of Estimating Seemingly Unrelated Regressions and Tests for Aggregation Bias
- An investigation of missing data methods for classification trees applied to binary response data
- Bayes Model Averaging with Selection of Regressors
- Bayes not Bust! Why Simplicity is no Problem for Bayesians
- Bayesian data analysis.
- Boosting algorithms: regularization, prediction and model fitting
- Causal diagrams for empirical research
- Causation, prediction, and search
- Common risk factors in the returns on stocks and bonds
- Decomposition of Prediction Error in Multilevel Models
- Handling missing values when applying classification models
- Heuristics of instability and stabilization in model selection
- How to Tell When Simpler, More Unified, or Less Ad Hoc Theories will Provide More Accurate Predictions
- Information criteria and statistical modeling.
- Interaction effects in logistic regression
- Introduction to linear regression analysis.
- Investigating Causal Relations by Econometric Models and Cross-spectral Methods
- Methods and Criteria for Model Selection
- Modeling online auctions.
- Not even wrong. The failure of string theory and the search for unity in physical law.
- Predictive likelihood: A review. With comments and a rejoinder by the author
- Present Position and Potential Developments: Some Personal Views: Statistical Theory: The Prequential Approach
- Probability weights in rank-dependent utility with binary even-chance independence.
- Random forests
- Scientific method, statistical method and the speed of light.
- Simplicity, Inference and Modelling
- Specification Tests in Econometrics
- Statistical learning from a regression perspective
- Statistical modeling: The two cultures. (With comments and a rejoinder).
- The Predictive Sample Reuse Method with Applications
- The Statistical Research Group, 1942-1945
- The central role of the propensity score in observational studies for causal effects
Cited in (79)
- Games with second-order expected utility
- A Bayesian perspective of statistical machine learning for big data
- Measuring the Stability of Results From Supervised Statistical Learning
- Particle swarm optimization based ridge logistic estimator
- Robust multivariate functional discriminant coordinates
- Explainable AI for operational research: a defining framework, methods, applications, and a research agenda
- The wisdom of crowds and transfer market values
- Handling co-dependence issues in resampling-based variable selection procedures: a simulation study
- On the exploration of regression dependence structures in multidimensional contingency tables with ordinal response variables
- Using cross-validation methods to select time series models: promises and pitfalls
- Forbidden Knowledge and Specialized Training: A Versatile Solution for the Two Main Sources of Overfitting in Linear Regression
- Bayesian approaches to variable selection: a comparative study from practical perspectives
- Models for understanding versus models for prediction
- What makes a VRP solution good? The generation of problem-specific knowledge for heuristics
- Learning certifiably optimal rule lists for categorical data
- Robust estimation in canonical correlation analysis for multivariate functional data
- Variable selection in time series forecasting using random forests
- The growing ubiquity of algorithms in society: implications, impacts and innovations
- Estimating retail demand with Poisson mixtures and out-of-sample likelihood
- A special issue on: Actual impact and future perspectives on stochastic modelling in business and industry
- An evolutionary estimation procedure for generalized semilinear regression trees
- Controlling the error probabilities of model selection information criteria using bootstrapping
- scientific article; zbMATH DE number 52015
- ‘The COM‐Poisson model for count data: a survey of methods and applications’ by K. Sellers, S. Borle and G. Shmueli
- Differential equations in data analysis
- Selected statistical methods of data analysis for multivariate functional data
- On stability issues in deriving multivariable regression models
- SUBiNN: a stacked uni- and bivariate \(k\)NN sparse ensemble
- On exploratory analytic method for multi-way contingency tables with an ordinal response variable and categorical explanatory variables
- Distributional regression for demand forecasting in e-grocery
- Variable selection -- a review and recommendations for the practicing statistician
- A novel bagging approach for variable ranking and selection via a mixed importance measure
- Prediction of the Nash through penalized mixture of logistic regression models
- A zero-inflated endemic-epidemic model with an application to measles time series in Germany
- Bayesian hierarchical rule modeling for predicting medical conditions
- Selection of variables for multivariable models: opportunities and limitations in quantifying model stability by resampling
- The heteroscedastic graded response model with a skewed latent trait: testing statistical and substantive hypotheses related to skewed item category functions
- Quantifying simulator discrepancy in discrete-time dynamical simulators
- The balance property in neural network modelling
- Information criteria for model selection
- Machine learning versus statistical modeling
- A Tale of Two Matrix Factorizations
- Some models are useful, but how do we know which ones? Towards a unified Bayesian model taxonomy
- A Survey of Differentially Private Regression for Clinical and Epidemiological Research
- Statistical plasmode simulations-potentials, challenges and recommendations
- Rejoinder to: Probability estimation with machine learning methods for dichotomous and multicategory outcome
- Pitfalls and merits of cointegration-based mortality models
- Conditional intensity: A powerful tool for modelling and analysing point process data
- A comparison of full model specification and backward elimination of potential confounders when estimating marginal and conditional causal effects on binary outcomes from observational data
- A neutral comparison of algorithms to minimize \(L_0\) penalties for high-dimensional variable selection
- Predicting class switch recombination in B-cells from antibody repertoire data
- Propensity-based standardization to enhance the validation and interpretation of prediction model discrimination for a target population
- Confidence, prediction, and tolerance in linear mixed models
- Omitted variable bias in GLMs of neural spiking activity
- PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection
- Classification model with weighted regularization to improve the reproducibility of neuroimaging signature selection
- A novel completeness test for leakage models and its application to side channel attacks and responsibly engineered simulators
- A random forest based approach for predicting spreads in the primary catastrophe bond market
- Parameter identifiability and model selection for sigmoid population growth models
- RandGA: injecting randomness into parallel genetic algorithm for variable selection
- An endemic–epidemic beta model for time series of infectious disease proportions
- The Need for More Emphasis on Prediction: A “Nondenominational” Model-Based Approach
- Monitoring systemic risk in the hedge fund sector
- Post-estimation shrinkage in full and selected linear regression models in low-dimensional data revisited
- Variable Selection With Second-Generation P-Values
- Analytical Problem Solving Based on Causal, Correlational and Deductive Models
- Sequential event prediction
- The InterModel Vigorish as a Lens for understanding (and quantifying) the value of item response models for dichotomously coded items
- An empirical comparison of popular structure learning algorithms with a view to gene network inference
- scientific article; zbMATH DE number 7625186
- Explanation, prediction, description, and information theory
- Comment on: ``Models as approximations''
- Detailed study of a moving average trading rule
- Explainable ensemble trees
- Multivariate analysis of variance for functional data
- Interpretable classifiers using rules and Bayesian analysis: building a better stroke prediction model
- Methods to compute prediction intervals: a review and new results
- Discriminant coordinates analysis for multivariate functional data
- Flexible model-based non-negative matrix factorization with application to mutational signatures
This page was built for publication: To explain or to predict?