On the marginal likelihood and cross-validation

DOI: 10.1093/BIOMET/ASZ077; zbMATH Open: 1441.62038; arXiv: 1905.08737; OpenAlex: W3001945457; Wikidata: Q126761267; Scholia: Q126761267; MaRDI QID: Q5113024; FDO: Q5113024

Authors: Ed Fong, Chris Holmes

Publication date: 9 June 2020

Published in: Biometrika

Abstract: In Bayesian statistics, the marginal likelihood, also known as the evidence, is used to evaluate model fit as it quantifies the joint probability of the data under the prior. In contrast, non-Bayesian models are typically compared using cross-validation on held-out data, either through $k$-fold partitioning or leave-$p$-out subsampling. We show that the marginal likelihood is formally equivalent to exhaustive leave-$p$-out cross-validation averaged over all values of $p$ and all held-out test sets when using the log posterior predictive probability as the scoring rule. Moreover, the log posterior predictive is the only coherent scoring rule under data exchangeability. This offers new insight into the marginal likelihood and cross-validation and highlights the potential sensitivity of the marginal likelihood to the choice of the prior. We suggest an alternative approach using cumulative cross-validation following a preparatory training phase. Our work has connections to prequential analysis and intrinsic Bayes factors but is motivated through a different course.
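The equivalence stated in the abstract can be checked numerically on a toy problem. The sketch below is not taken from the paper: it assumes a conjugate Beta-Bernoulli model with a Beta(1, 1) prior, and the function names (`log_marginal`, `log_predictive`, `leave_p_out_score`) are hypothetical. It reads "averaged over all held-out test sets" as follows: for each $p$, average the log posterior predictive of each held-out point given the $n-p$ training points, over all test sets of size $p$, then sum these averages over $p = 1, \dots, n$.

```python
from itertools import combinations
from math import lgamma, log


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(y, a=1.0, b=1.0):
    """Log evidence of a binary sequence under a Beta(a, b) prior (assumed model)."""
    s, n = sum(y), len(y)
    return log_beta(a + s, b + n - s) - log_beta(a, b)


def log_predictive(y_new, train, a=1.0, b=1.0):
    """Log posterior predictive of one binary point given the training points."""
    p_one = (a + sum(train)) / (a + b + len(train))
    return log(p_one) if y_new == 1 else log(1.0 - p_one)


def leave_p_out_score(y, p, a=1.0, b=1.0):
    """Exhaustive leave-p-out score: mean over all size-p test sets of the
    mean held-out log predictive, conditioning on the other n - p points."""
    n = len(y)
    total, count = 0.0, 0
    for test_idx in combinations(range(n), p):
        held = set(test_idx)
        train = [y[i] for i in range(n) if i not in held]
        total += sum(log_predictive(y[j], train, a, b) for j in test_idx) / p
        count += 1
    return total / count


# Illustrative data only; any short binary sequence works.
y = [1, 0, 1, 1, 0, 1]
cv_sum = sum(leave_p_out_score(y, p) for p in range(1, len(y) + 1))
evidence = log_marginal(y)
print(cv_sum, evidence)  # the two values should agree up to rounding error
```

Note that the $p = n$ term scores every point against the prior alone, with no training data, which is one way to see the sensitivity of the marginal likelihood to the prior that the abstract mentions. The exhaustive enumeration grows combinatorially in $n$, so this check is only practical for very small data sets.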

Full work available at URL: https://arxiv.org/abs/1905.08737

zbMATH Keywords

cross-validation marginal likelihood prequential scoring

Mathematics Subject Classification ID

Foundations and philosophical topics in statistics (62A01); Sufficient statistics and fields (62B05)

Cited In (15)
