Functional summaries of persistence diagrams (Q2195550): Difference between revisions
From MaRDI portal
Removed claim: author (P16): Item:Q482884
Changed an Item
Property / author: Brittany Terese Fasy (rank: Normal rank)
Revision as of 07:47, 15 February 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Functional summaries of persistence diagrams | scientific article | |
Statements
Functional summaries of persistence diagrams (English)
26 August 2020
Persistence diagrams (barcodes) are not natural objects for statistics and machine learning, so it is desirable to map them into functional summaries and vector representations. This paper first briefly reviews the functional summaries that have appeared in the literature. In the later part, some of these summaries are applied, and the results compared, in two data applications: simulated architectural morphology of prostate cancer, and fibrin networks. The theoretical part develops a common foundation for the different functional summaries and focuses in particular on a generalization of one of them, the generalized landscape function. This generalization allows landscape functions to be constructed with kernels other than the standard triangle kernel. Such kernels come with bandwidth parameters that control how the features of a persistence diagram are captured, and the claim is that the important features of the diagram are captured with fewer generalized landscape functions, which amounts to a form of dimensionality reduction. The theoretical part also establishes statistical results for functional summaries, such as convergence of the sample mean to the population mean and confidence bands for generalized landscape functions. Two-sample hypothesis testing, classification, and clustering are addressed as well. Interestingly, in the application part, the generalized landscape function is shown to outperform, in classification, some other summaries as well as generalized landscape functions with different bandwidth parameters. This demonstrates the validity of the more flexible method, in which classification performance can be optimized by parameter tuning. This of course comes at the extra cost of having to choose a kernel and find the optimal parameter, but that is standard machine learning practice.
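To make the idea of a kernel-based landscape concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's definition: each diagram point (birth, death) is assumed to contribute a bump centred at its midpoint, with peak height equal to half its persistence and a shape given by a kernel with a single shared bandwidth, and the k-th function is taken as the k-th largest bump value at each grid point. The names `generalized_landscape`, `triangle_kernel`, and `epanechnikov_kernel` are hypothetical, and the paper's exact normalization may differ.

```python
import numpy as np


def triangle_kernel(u):
    """Triangle kernel supported on [-1, 1]."""
    return np.maximum(0.0, 1.0 - np.abs(u))


def epanechnikov_kernel(u):
    """Epanechnikov kernel supported on [-1, 1]."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def generalized_landscape(diagram, grid, kernel, bandwidth, k_max=3):
    """Evaluate kernel-based landscape functions on a grid.

    Assumed form (illustrative only): the feature (b, d) contributes
    half_persistence * K((t - midpoint) / bandwidth) / K(0),
    and row k of the result holds the (k+1)-th largest contribution at each t.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)

    midpoint = (diagram[:, 0] + diagram[:, 1]) / 2.0
    half_persistence = (diagram[:, 1] - diagram[:, 0]) / 2.0

    # One row per feature, one column per grid point.
    u = (grid[None, :] - midpoint[:, None]) / bandwidth
    bumps = half_persistence[:, None] * kernel(u) / kernel(0.0)

    # Sort each column in descending order so row k is the k-th largest value.
    ordered = -np.sort(-bumps, axis=0)

    # Pad with zeros when the diagram has fewer than k_max features.
    out = np.zeros((k_max, grid.size))
    n = min(k_max, ordered.shape[0])
    out[:n] = ordered[:n]
    return out


if __name__ == "__main__":
    toy_diagram = [(0.0, 2.0), (1.0, 5.0)]  # hypothetical toy data
    t = np.linspace(0.0, 6.0, 7)
    for h in (1.0, 2.0):
        print(h, generalized_landscape(toy_diagram, t, triangle_kernel, h)[0])
```

In this sketch a larger bandwidth spreads each feature over more of the grid, so the first few functions carry more of the diagram. That is the intuition behind the dimensionality-reduction claim; choosing the kernel and the bandwidth remains a tuning step.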
topological data analysis
persistent homology
persistence diagrams
functional data analysis
functional summaries
statistical inference