Nonlinear principal component analysis and its applications (Q338803)

From MaRDI portal
scientific article

    Statements

    Nonlinear principal component analysis and its applications (English)
    7 November 2016
    This book is concerned with the principles and applications of nonlinear principal component analysis (PCA) and multiple correspondence analysis (MCA), which are useful methods for analyzing data with mixed measurement levels, i.e. data that jointly include categorical (nominal and ordinal) and numerical variables. The monograph is divided into two parts. The first part (Nonlinear principal component analysis) consists of Chapters 2--3, while the second part (Applications and related topics) consists of the last four chapters (Chapters 4--7). Chapter 1 (Introduction) gives a brief introduction and an outline of the monograph. The subject of each of Chapters 2--7 is briefly presented in what follows.
Chapter 2 (Nonlinear principal component analysis) begins with a brief introduction to ordinary PCA. Ordinary PCA, which is used to reduce a large number of variables to a small number of composites with as little loss of information as possible, suffers, among other things, from the limitation that all variables are assumed to be scaled at the numerical level. However, it can be extended to deal with mixed measurement level data. Such an extension requires a quantification of the qualitative data in order to obtain optimally scaled data. PCA with optimal scaling is referred to as nonlinear PCA. The remainder of Chapter 2 discusses the quantification of qualitative data and introduces nonlinear PCA. It is shown that nonlinear PCA can find solutions by minimizing two types of loss functions: a lower-rank approximation and homogeneity analysis with restrictions. Accordingly, two algorithms based on alternating least squares (ALS) are provided: PRINCIPALS of \textit{F. W. Young} et al. [Psychometrika 43, 279--281 (1978; Zbl 0383.92001)] and PRINCALS of \textit{A. Gifi} [Nonlinear multivariate analysis. Chichester etc.: John Wiley \& Sons (1990; Zbl 0697.62048)]. An illustration of nonlinear PCA is presented in the last section of Chapter 2.
MCA is a widely used technique for analyzing categorical data, which aims to reduce large sets of variables to smaller sets of components that summarize the information contained in the data. Since its purpose is similar to that of PCA, MCA can be regarded as an adaptation of PCA to categorical data. Motivated by this fact, Chapter 3 (Multiple correspondence analysis) introduces MCA as a special case of nonlinear PCA. A formulation is given in which the quantified data matrix is approximated by a lower-rank matrix using a quantification technique. A demonstration of MCA is given in the last section of Chapter 3.
The applications part of the book (Part II) consists of four chapters, each of which discusses one application of nonlinear PCA. Chapter 4 (Variable selection in nonlinear principal component analysis) describes variable selection for mixed measurement level data. Specifically, since multivariate data of any measurement level can be treated uniformly as numerical data in the context of PCA by means of ALS with optimal scaling (as shown in Chapter 2), variable selection in nonlinear PCA is presented. Chapter 5 (Sparse multiple correspondence analysis) illustrates sparse MCA. A real data example demonstrates that sparse MCA can provide simple solutions. Chapter 6 (Joint dimension reduction and clustering) illustrates joint dimension reduction and clustering. It reviews techniques that combine MCA and \(k\)-means clustering to uncover relationships among qualitative variables, and gives an illustration based on a real data example.
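As a rough illustration of the MCA that Chapters 3, 5 and 6 build on, the following sketch computes MCA as a correspondence analysis of the indicator matrix via a singular value decomposition. This is the textbook indicator-matrix formulation, not necessarily the lower-rank formulation used in the book; the function names `indicator_matrix` and `mca` and the toy data are assumptions made for this example only.

```python
import numpy as np

def indicator_matrix(data):
    """One-hot encode each integer-coded categorical column (hypothetical helper)."""
    blocks = []
    for j in range(data.shape[1]):
        cats = np.unique(data[:, j])
        # Broadcasting (n, 1) == (m,) gives an (n, m) 0/1 block for variable j.
        blocks.append((data[:, [j]] == cats).astype(float))
    return np.hstack(blocks)

def mca(data, n_components=2):
    """Indicator-matrix MCA: SVD of the standardized residuals of Z."""
    Z = indicator_matrix(data)
    P = Z / Z.sum()                  # correspondence matrix
    r = P.sum(axis=1)                # row masses
    c = P.sum(axis=0)                # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates of objects (rows) and categories (columns).
    rows = (U[:, :n_components] * s[:n_components]) / np.sqrt(r)[:, None]
    cols = (Vt[:n_components].T * s[:n_components]) / np.sqrt(c)[:, None]
    return rows, cols, s ** 2        # s**2 are the principal inertias

# Toy data (made up): 6 objects, 3 categorical variables with 2, 3 and 2 categories.
data = np.array([
    [0, 0, 0],
    [0, 1, 0],
    [0, 2, 1],
    [1, 0, 1],
    [1, 1, 1],
    [1, 2, 0],
])
rows, cols, inertias = mca(data)
```

The object coordinates in `rows` could then be passed to a \(k\)-means routine, which would give a sequential ("tandem") analysis; the joint methods reviewed in Chapter 6 instead combine the two steps.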
Finally, Chapter 7 (Acceleration of convergence of the alternating least squares algorithm for nonlinear principal component analysis) presents an acceleration algorithm for ALS. The algorithm uses an idea from numerical analysis (the vector \(\varepsilon\) algorithm) and aims to generate a linearly convergent sequence that converges faster than the one produced by the algorithms presented in Chapter 2. In summary, this book endeavors to demonstrate the usefulness of the theory and applications of nonlinear PCA and MCA. The authors have written an interesting and highly valuable book, which gives an excellent overview of the mathematical foundations and statistical principles of its themes. Each chapter ends with a short list of references, which will help readers wishing to pursue this area further.
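The vector \(\varepsilon\) algorithm mentioned for Chapter 7 can be illustrated with a minimal sketch. It applies one second-order extrapolation step, \(x_{k+1} + [(\Delta x_{k+1})^{-1} - (\Delta x_k)^{-1}]^{-1}\) with the Samelson inverse \(v^{-1} = v/\|v\|^2\), to three consecutive iterates. The toy sequence and the function names are assumptions for illustration; the sequence is not an ALS run, and how the book embeds the step in ALS is not shown here.

```python
import numpy as np

def samelson_inverse(v):
    """Samelson inverse of a nonzero vector: v / ||v||^2."""
    return v / np.dot(v, v)

def vector_epsilon_step(x0, x1, x2):
    """Second-order vector epsilon extrapolation from three consecutive iterates."""
    d1 = x1 - x0
    d2 = x2 - x1
    return x1 + samelson_inverse(samelson_inverse(d2) - samelson_inverse(d1))

# Toy linearly convergent sequence x_k = x_star + lam**k * v (made-up values).
x_star = np.array([1.0, -2.0, 0.5])
v = np.array([0.3, 0.1, -0.4])
lam = 0.9
iterates = [x_star + lam ** k * v for k in range(3)]

accelerated = vector_epsilon_step(*iterates)
plain_error = np.linalg.norm(iterates[2] - x_star)
accelerated_error = np.linalg.norm(accelerated - x_star)
```

For a sequence with a single geometric error term, as in this toy case, the extrapolated point recovers the limit up to rounding error, whereas the last plain iterate is still far from it. Real ALS iterates only approximately follow such a pattern, so the gain in practice is smaller.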
    nonlinear principal components
    quantification of qualitative data
    alternating least squares algorithm
    multiple correspondence analysis
    variable selection in nonlinear PCA
    sparse MCA
    dimension reduction
    clustering