Clustering by Compression
DOI: 10.1109/TIT.2005.844059 · zbMATH Open: 1297.68097 · Wikidata: Q56269115 · Scholia: Q56269115 · MaRDI QID: Q3546722 · FDO: Q3546722
Authors: Rudi L. Cilibrasi, Paul M. B. Vitányi
Publication date: 21 December 2008
Published in: IEEE Transactions on Information Theory
Keywords: Kolmogorov complexity; normalized compression distance; heterogenous data analysis; hierarchical unsupervised clustering; parameter-free data mining; quartet tree method; universal dissimilarity distance
Learning and adaptive systems in artificial intelligence (68T05); Information theory (general) (94A15); Image processing (compression, reconstruction, etc.) in information and communication theory (94A08); Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science) (68P30); Algorithmic information theory (Kolmogorov complexity, etc.) (68Q30)
Cited In (71)
- INFORMATION DISTANCE AND ITS APPLICATIONS
- An information theory approach to stock market liquidity
- Approximating (k,ℓ)-Median Clustering for Polygonal Curves
- Theoretical computer science: computational complexity
- Realism and Texture: Benchmark Problems for Natural Computation
- An all-or-nothing flavor to the Church-Turing hypothesis
- A linguistic approach to classification of bacterial genomes
- Quantum information distance
- Comparative genomics with succinct colored de Bruijn graphs
- Kolmogorov Complexity-Based Similarity Measures to Website Classification Problems: Leveraging Normalized Compression Distance
- Preliminary results on masquerader detection using compression based similarity metrics
- Pattern classification of phylogeny signals
- Artificial sequences and complexity measures
- Mining Compressing Sequential Patterns
- On Universal Transfer Learning
- Probing the quantum-classical boundary with compression software
- A philosophical treatise of universal induction
- Evaluating the Impact of Information Distortion on Normalized Compression Distance
- Application of Kolmogorov complexity and universal codes to identity testing and nonparametric testing of serial independence for time series
- An automatic and parameter-free information-based method for sparse representation in wavelet bases
- Exploring programmable self-assembly in non-DNA based molecular computing
- Algorithmic complexity bounds on future prediction errors
- On the complexity and dimension of continuous finite-dimensional maps
- Using data compressors to construct order tests for homogeneity and component independence
- The Application of Data Compression-Based Distances to Biological Sequences
- Sequence distance via parsing complexity: heartbeat signals
- Nonapproximability of the normalized information distance
- Grammar-based compression and its use in symbolic music analysis
- Open problems in universal induction & intelligence
- A copula entropy approach to correlation measurement at the country level
- Information-theoretic method for classification of texts
- Improved metaheuristics for the quartet method of hierarchical clustering
- Compression based homogeneity testing
- Implementation and Application of Automata
- Aspects in classification learning – review of recent developments in learning vector quantization
- A linearly computable measure of string complexity
- Between order and chaos: The quest for meaningful information
- Indefinite proximity learning: a review
- Compression-based distance between string data and its application to literary work classification based on authorship
- An exact algorithm for the minimum quartet tree cost problem
- Summarizing and understanding large graphs
- Sublinear algorithms for approximating string compressibility
- An extension of the Burrows-Wheeler transform
- A new combinatorial approach to sequence comparison
- On universal prediction and Bayesian confirmation
- Expanding the algorithmic information theory frame for applications to Earth observation
- Hydrozip: how hydrological knowledge can be used to improve compression of hydrological data
- On universal transfer learning
- Textual data compression in computational biology: algorithmic techniques
- Computable model discovery and high-level-programming approximations to algorithmic complexity
- A fast quartet tree heuristic for hierarchical clustering
- Hierarchical clustering of text documents
- Clustering with respect to the information distance
- A parametrized family of Tversky metrics connecting the Jaccard distance to an analogue of the normalized information distance
- Notes on sum-tests and independence tests
- Ranking inter-relationships between clusters
- Using ideas of Kolmogorov complexity for studying biological texts
- Similarity and denoising
- Solovay functions and their applications in algorithmic randomness
- Topographic mapping of large dissimilarity data sets
- A really simple approximation of smallest grammar
- Normalized information-based divergences
- Distance measures for biological sequences: some recent approaches
- Clustering the normalized compression distance for influenza virus data
- The Similarity Metric
- Universal codes as a basis for time series testing
- Algorithmic relative complexity
- Temporal clustering of time series via threshold autoregressive models: application to commodity prices
- Application of data compression methods to nonparametric estimation of characteristics of discrete-time stochastic processes
- The Normalized Compression Distance Is Resistant to Noise
- Detecting life signatures with RNA sequence similarity measures