Tools for statistical inference. Observed data and data augmentation methods (Q1188819)

From MaRDI portal
scientific article

    Statements

    Tools for statistical inference. Observed data and data augmentation methods (English)
    17 September 1992
    The purpose of the book under review is to survey methods for the Bayesian or likelihood-based analysis of data. The author distinguishes between two types of methods: observed data methods and data augmentation methods. The observed data methods are applied directly to the likelihood or posterior density of the observed data. The data augmentation methods exploit the special ``missing'' data structure of the problem: they rely on an augmentation of the data which simplifies the likelihood or posterior density.

    The book consists of six sections. The first, Introduction, presents examples involving censored regression data, randomized response, latent class analysis and hierarchical models as motivation for the problems, and mentions the techniques considered in the book. Section 2, Observed data techniques -- normal approximation, discusses and illustrates the likelihood function, the posterior density function and the maximum likelihood method. Next, normal-based inference is considered from both the frequentist and the Bayesian points of view. Finally, the highest posterior density region of a given content is defined and used to motivate the significance level from the Bayesian point of view.

    ``Observed data techniques'' is the title of Section 3. Here approximations based on numerical integration, Laplace expansions, Monte Carlo, composition and importance sampling are studied. The method of composition, in particular, is useful for constructing samples distributed according to \(J(y)=\int f(y| x)g(x)\,dx\), where \(g(x)\) and \(f(y| x)\) are given densities. This method is illustrated by constructing the predictive distribution. The importance sampling method is used to approximate \(J(y)\) when one cannot sample directly from \(g(x)\).

    Sections 4--6 (whose titles are, respectively, The EM algorithm; Data augmentation; and The Gibbs sampler) review the data augmentation methods. The principle of data augmentation states: ``Augment the observed data \(Y\) with latent data \(Z\) so that the augmented posterior distribution \(p(\theta| Y,Z)\) is `simple'. Make use of this simplicity in maximizing/marginalizing, calculating/sampling the observed posterior \(p(\theta| Y)\).'' Several algorithms make use of this principle. The simplest is the EM algorithm, which provides the mean of a normal approximation to the likelihood or posterior density, while the Louis modification specifies the scale. The Poor Man's Data Augmentation algorithm allows for a non-normal approximation to the likelihood or posterior density. The Data Augmentation and Gibbs sampler approaches are iterative algorithms which, under certain regularity conditions, provide a way of improving inference based on the entire posterior distribution. The SIR algorithm is a noniterative algorithm based on importance sampling ideas. All stated results are illustrated by examples.
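    The method of composition mentioned above can be sketched in a few lines: draw \(x\sim g\), then \(y\sim f(\cdot| x)\); the resulting \(y\) is a draw from \(J(y)\). Here is a minimal sketch in Python, using a toy normal example chosen for illustration and not taken from the book (with \(g=N(0,1)\) and \(f(y| x)=N(x,1)\), the marginal is \(N(0,2)\)):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_by_composition(m):
    """Draw m samples from J(y) = int f(y|x) g(x) dx by composition.

    Toy densities (assumptions, not from the book):
    g(x) = N(0, 1) and f(y|x) = N(x, 1), so marginally y ~ N(0, 2).
    """
    x = rng.normal(0.0, 1.0, size=m)   # x ~ g(x)
    y = rng.normal(x, 1.0)             # y | x ~ f(y | x)
    return y

y = sample_by_composition(100_000)
print(y.mean(), y.var())  # close to 0 and 2, the moments of N(0, 2)
```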
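    When \(g(x)\) cannot be sampled directly, importance sampling instead draws from a tractable proposal \(h\) and reweights by \(g/h\); the SIR algorithm then resamples those draws with probabilities proportional to the weights. A hedged sketch under the same toy densities as above (the proposal \(h=N(0,2)\) is likewise an assumption for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

def importance_estimate(y, m=50_000):
    """Approximate J(y) = int f(y|x) g(x) dx using draws from a proposal h.

    Toy densities (assumptions, not from the book):
    g(x) = N(0, 1), f(y|x) = N(x, 1), proposal h(x) = N(0, 2).
    """
    x = rng.normal(0.0, np.sqrt(2.0), size=m)                     # x ~ h
    w = norm.pdf(x, 0.0, 1.0) / norm.pdf(x, 0.0, np.sqrt(2.0))    # g(x)/h(x)
    return np.mean(norm.pdf(y, x, 1.0) * w)   # average of f(y|x) w(x)

print(importance_estimate(0.5))              # approximates the N(0, 2) density at 0.5
print(norm.pdf(0.5, 0.0, np.sqrt(2.0)))      # exact value, for comparison

# SIR idea: resample the proposal draws with probability proportional to
# the weights to obtain an approximately unweighted sample from g.
x = rng.normal(0.0, np.sqrt(2.0), size=50_000)
w = norm.pdf(x, 0.0, 1.0) / norm.pdf(x, 0.0, np.sqrt(2.0))
resampled = rng.choice(x, size=5_000, replace=True, p=w / w.sum())
```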
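    As a concrete instance of the data augmentation principle, here is a minimal EM sketch for a two-component normal mixture with unit variances, a standard textbook setting chosen here for brevity rather than taken from the book: the latent data \(Z\) are the component labels, the E-step computes their posterior probabilities, and the M-step maximizes the resulting expected complete-data log-likelihood.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Simulated data from a two-component mixture (means -2 and 2, sd 1).
y = np.concatenate([rng.normal(-2, 1, 300), rng.normal(2, 1, 700)])

def em_mixture(y, n_iter=100):
    """EM for a mixture pi*N(mu1, 1) + (1-pi)*N(mu2, 1)."""
    pi, mu1, mu2 = 0.5, -1.0, 1.0                   # crude starting values
    for _ in range(n_iter):
        # E-step: posterior probability each y_i came from component 1.
        d1 = pi * norm.pdf(y, mu1, 1.0)
        d2 = (1 - pi) * norm.pdf(y, mu2, 1.0)
        r = d1 / (d1 + d2)
        # M-step: maximize the expected complete-data log-likelihood.
        pi = r.mean()
        mu1 = np.sum(r * y) / np.sum(r)
        mu2 = np.sum((1 - r) * y) / np.sum(1 - r)
    return pi, mu1, mu2

print(em_mixture(y))  # roughly (0.3, -2, 2)
```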
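    Finally, the Gibbs sampler alternates draws from the full conditional distributions. A minimal sketch for an assumed example, again not from the book: a normal sample with unknown mean and variance under the standard noninformative prior \(p(\mu,\sigma^2)\propto 1/\sigma^2\), whose full conditionals are \(\mu| \sigma^2\sim N(\bar y,\sigma^2/n)\) and \(\sigma^2| \mu\sim\) Inv-Gamma\((n/2,\sum(y_i-\mu)^2/2)\).

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(5.0, 2.0, size=200)   # simulated data
n, ybar = len(y), y.mean()

def gibbs(n_iter=5_000):
    """Gibbs sampler for (mu, sigma^2) under p(mu, sigma^2) propto 1/sigma^2."""
    mu, sigma2 = ybar, y.var()
    draws = np.empty((n_iter, 2))
    for t in range(n_iter):
        # mu | sigma^2, y ~ N(ybar, sigma^2 / n)
        mu = rng.normal(ybar, np.sqrt(sigma2 / n))
        # sigma^2 | mu, y ~ Inv-Gamma(n/2, ss/2): draw Gamma, then invert.
        ss = np.sum((y - mu) ** 2)
        sigma2 = 1.0 / rng.gamma(n / 2.0, 2.0 / ss)
        draws[t] = mu, sigma2
    return draws

draws = gibbs()
print(draws[1000:].mean(axis=0))   # posterior means, near (5, 4)
```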
    observed data methods
    data augmentation methods
    censored regression data
    randomized response
    latent class analysis
    hierarchical models
    likelihood function
    posterior density
    maximum likelihood method
    normal based inference
    highest posterior density region
    significance level
    approximations
    numerical integration
    Laplace expansions
    Monte Carlo
    composition
    importance sampling
    predictive distribution
    EM algorithm
    Gibbs sampler
    latent data
    normal approximation
    Poor Man's Data Augmentation algorithm
    non-normal approximation
    iterative algorithms
    SIR algorithm