The epic story of maximum likelihood
Publication:449800
DOI: 10.1214/07-STS249
zbMATH Open: 1246.01016
arXiv: 0804.2996
OpenAlex: W2056635176
MaRDI QID: Q449800
FDO: Q449800
Authors: Stephen M. Stigler
Publication date: 1 September 2012
Published in: Statistical Science
Abstract: At a superficial level, the idea of maximum likelihood must be prehistoric: early hunters and gatherers may not have used the words "method of maximum likelihood" to describe their choice of where and how to hunt and gather, but it is hard to believe they would have been surprised if their method had been described in those terms. It seems a simple, even unassailable idea: Who would rise to argue in favor of a method of minimum likelihood, or even mediocre likelihood? And yet the mathematical history of the topic shows this "simple idea" is really anything but simple. Joseph Louis Lagrange, Daniel Bernoulli, Leonhard Euler, Pierre Simon Laplace and Carl Friedrich Gauss are only some of those who explored the topic, not always in ways we would sanction today. In this article, that history is reviewed from well before Fisher to the time of Lucien Le Cam's dissertation. In the process Fisher's unpublished 1930 characterization of conditions for the consistency and efficiency of maximum likelihood estimates is presented, and the mathematical basis of his three proofs discussed. In particular, Fisher's derivation of the information inequality is seen to be derived from his work on the analysis of variance, and his later approach via estimating functions was derived from Euler's Relation for homogeneous functions. The reaction to Fisher's work is reviewed, and some lessons drawn.
Full work available at URL: https://arxiv.org/abs/0804.2996
Keywords: maximum likelihood; efficiency; sufficiency; Karl Pearson; history of statistics; Abraham Wald; Harold Hotelling; Jerzy Neyman; R. A. Fisher; superefficiency
Cites Work
- Asymptotic Statistics
- Defining the curvature of a statistical problem (with applications to second order efficiency)
- Consistent Estimates Based on Partially Consistent Observations
- Tests of Statistical Hypotheses Concerning Several Parameters When the Number of Observations is Large
- Consistency of the Maximum Likelihood Estimator in the Presence of Infinitely Many Incidental Parameters
- Principles of Statistical Inference
- Outline of a Theory of Statistical Estimation Based on the Classical Theory of Probability
- R. A. Fisher and the fiducial argument
- Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information
- R. A. Fisher and the making of maximum likelihood 1912--1922
- An invariant form for the prior probability in estimation problems
- Karl Pearson's theoretical errors and the advances they inspired
- Note on the Consistency of the Maximum Likelihood Estimate
- Fisher in 1921
- The geometry of asymptotic inference. With comments and a rejoinder by the author
- Maximum Likelihood: An Introduction
- The geometry of exponential families
- Laplace's 1774 memoir on inverse probability
- On rereading R. A. Fisher
- A history of parametric statistical inference from Bernoulli to Fisher, 1713--1935
- R. A. Fisher in the 21st century. Invited paper presented at the 1996 R. A. Fisher lecture. (With comments).
- R. A. Fisher: an appreciation
- Maximum likelihood and decision theory
- F. Y. Edgeworth and R. A. Fisher on the efficiency of maximum likelihood estimation
- Harold Hotelling 1895--1973
- Three early papers on efficient parametric estimation
- What did Fisher mean by inverse probability in 1912--1922?
- J. H. Lambert's work on probability
- Biometrika centenary: Sample surveys
- Probability and Statistics
- The History of Likelihood
- Thiele: Pioneer in Statistics
- Ancillary history
- On Fisher's Bound for Asymptotic Variances
- A Survey of Maximum Likelihood Estimation
- Statistical Estimation
- The Statistical Utilization of Multiple Measurements
- The Impact of R. A. Fisher on Statistics
Cited In (26)
- Monte Carlo gradient estimation in machine learning
- Big Bayes stories -- foreword
- Half a century of information geometry, part 1
- Interview with Myfanwy E. Evans: entanglements on and models of periodic minimal surfaces
- Lagrange and probability theory
- Clustering in Hilbert's projective geometry: the case studies of the probability simplex and the elliptope of correlation matrices
- R. A. Fisher and the making of maximum likelihood 1912--1922
- Suppression of metastasis by primary tumor and acceleration of metastasis following primary tumor resection: A natural law?
- Karl Pearson's theoretical errors and the advances they inspired
- Extreme value distributions: an overview of estimation and simulation
- Lectures on Entropy. I: Information-Theoretic Notions
- Maximum likelihood characterization of distributions
- On the history of maximum likelihood in relation to inverse probability and least squares.
- Stein 1956: Efficient nonparametric testing and estimation
- Superefficiency from the vantage point of computability
- What did Fisher mean by inverse probability in 1912--1922?
- Doob at Lyon: Bringing Martingales Back to France
- On closed-form expressions for the Fisher-Rao distance
- Model-based geostatistics from a Bayesian perspective: investigating area-to-point Kriging with small data sets
- On Fitting Probability Distribution to Univariate Grouped Actuarial Data with Both Group Mean and Relative Frequencies
- Analytic posteriors for Pearson's correlation coefficient
- A conversation with Stephen M. Stigler
- Maximum Likelihood Estimation and Graph Matching in Errorfully Observed Networks
- Mere Renovation is Too Little Too Late: We Need to Rethink our Undergraduate Curriculum from the Ground Up
- Testing that a local optimum of the likelihood is globally optimum using reparameterized embeddings. Applications to wavefront sensing
- The double Gaussian approximation for high frequency data