Persistence curves: a canonical framework for summarizing persistence diagrams (Q2667927)

From MaRDI portal
scientific article

    Statements

    Persistence curves: a canonical framework for summarizing persistence diagrams (English)
    2 March 2022
    This paper is concerned with topological summaries of persistence diagrams obtained from the computation of persistent homology (PH). PH extracts topological information from a dataset by tracking the changes in topological features over a varying parameter; this information is then stored as a persistence diagram (PD). However, the PD space is not a Hilbert space, as machine learning algorithms would require. Researchers have therefore proposed various methods for mapping PDs into a Hilbert space, a step known as summarizing PDs, which yields topological summaries. Two techniques are employed for summarizing PDs, namely kernel functions and PD vectorization. The paper adopts the latter and proposes a unifying framework for vectorizing PDs called persistence curves (PCs). Once the PC framework is established, the authors show how existing PH summaries fit into it. Specifically, they express the following PD summaries as PCs: lifespan curve, life entropy curve, PD thresholding, persistence landscape, persistence silhouette and Euler characteristic curve. For stability analysis, the authors establish Theorem 1, which can be applied to any specific PC except the persistence landscape. Using this theorem, stability analysis is performed for numerous PCs; for example, the lifespan PC is reported to be conditionally stable with respect to the Wasserstein distance (\(W_1\)) and unstable under the bottleneck distance (\(W_{\infty}\)). To obtain a stable persistence summary, the authors propose two variations of PCs, called normalized and entropy-based persistence curves. With the lifespan PC as an example, they show that the normalized lifespan PC is stable under \(W_1\) and conditionally stable under \(W_{\infty}\). This finding leads to another question: will a PC become stable under normalization?
    The authors propose three conditions under which a PC is stable after normalization, with the caveat that it is conditionally stable prior to normalization. In a similar fashion, they show that the life entropy curve is conditionally stable with respect to \(W_1\) and \(W_{\infty}\). A Python implementation of PCs is available on GitHub: \url{https://github.com/azlawson/PersistenceCurves}. Next, the authors discuss the computational efficiency, efficacy and experimental stability of the proposed PCs using two applications: parameter determination for a discrete dynamical system and image texture classification. Comparisons of random forest classification results obtained with PCs and with other TDA methods are also included. The numerical experiments suggest that stability and classification performance might not be highly correlated. For example, the persistence landscape (which has been proven to be stable) underperformed compared with the Betti curve and persistence statistics. Under Gaussian noise, the persistence image outperformed the PCs, whereas persistence statistics underperformed. In the final section, the authors caution users by exhibiting two distinct images that produce similar PDs, which in turn result in similar PCs. In addition, they advise choosing a suitable vector size so as to retain the information of the input. They also recommend the normalized life curve for practical applications on the basis of its performance, stability and computational efficiency.
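    To make the idea of vectorizing a PD by a curve concrete, the following is a minimal sketch, not the authors' implementation and not the API of the repository linked above. It assumes that a feature (b, d) contributes at parameter value t exactly when b <= t < d, and that the per-point weight psi does not depend on t; the function names (persistence_curve, betti, lifespan, normalized_lifespan, life_entropy) and the toy diagram are hypothetical and chosen only for illustration.

```python
import math


def total_lifespan(diagram):
    """Sum of lifespans (d - b) over all points of the diagram."""
    return sum(d - b for b, d in diagram)


def persistence_curve(diagram, psi, grid, stat=sum):
    """Evaluate a curve-type summary of a persistence diagram on a grid.

    Assumption (illustrative): a point (b, d) is 'alive' at t when b <= t < d.
    psi assigns one weight to each diagram point; stat aggregates the
    weights of the points alive at t (here a plain sum by default).
    """
    weights = [psi(b, d, diagram) for b, d in diagram]
    curve = []
    for t in grid:
        alive = [w for (b, d), w in zip(diagram, weights) if b <= t < d]
        curve.append(stat(alive) if alive else 0.0)
    return curve


# Hypothetical weight functions for a few of the summaries named above.
def betti(b, d, diagram):
    return 1.0  # counting alive points gives a Betti-curve-like summary


def lifespan(b, d, diagram):
    return d - b


def normalized_lifespan(b, d, diagram):
    return (d - b) / total_lifespan(diagram)


def life_entropy(b, d, diagram):
    p = (d - b) / total_lifespan(diagram)  # requires d > b for every point
    return -p * math.log(p)


# Toy diagram and evaluation grid (made-up values, for illustration only).
diagram = [(0.0, 1.0), (0.5, 2.0), (1.5, 1.75)]
grid = [0.25, 0.75, 1.6]

betti_curve = persistence_curve(diagram, betti, grid)        # [1.0, 2.0, 2.0]
lifespan_curve = persistence_curve(diagram, lifespan, grid)  # [1.0, 2.5, 1.75]
normalized_curve = persistence_curve(diagram, normalized_lifespan, grid)
entropy_curve = persistence_curve(diagram, life_entropy, grid)
```

    In this sketch the normalized curve is the lifespan curve divided by the total lifespan of the diagram, so its values do not grow with the number of points. That is only an informal illustration of why normalization is a natural candidate for improving stability, not a restatement of the paper's theorems.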
    persistent homology
    topological data analysis
    persistence diagram
    persistence curves
    image classification