A novel hybrid dimension reduction technique for undersized high dimensional gene expression data sets using information complexity criterion for cancer classification (Q308795)
From MaRDI portal
Property / review text: Summary: Gene expression data are typically large, complex, and highly noisy. Their dimension is high, with several thousand genes (i.e., features) but only a limited number of observations (i.e., samples). Although classical principal component analysis (PCA) is widely used as a standard first step in dimension reduction and in supervised and unsupervised classification, it suffers from several shortcomings on data sets with undersized samples, since the sample covariance matrix degenerates and becomes singular. In this paper, we address these limitations within the context of probabilistic PCA (PPCA) by introducing and developing a novel approach that uses the maximum entropy covariance matrix and its hybridized smoothed covariance estimators. To reduce the dimensionality of the data and to choose the number of probabilistic PCs (PPCs) to be retained, we further employ the celebrated Akaike information criterion (AIC), the consistent Akaike information criterion (CAIC), and Bozdogan's information-theoretic measure of complexity (ICOMP) criterion. Six publicly available undersized benchmark data sets were analyzed to show the utility, flexibility, and versatility of our approach: the hybridized smoothed covariance matrix estimators do not degenerate, allowing PPCA to be performed to reduce the dimension and to carry out supervised classification of cancer groups in high dimensions.
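The summary above describes the core pipeline: replace the singular sample covariance of an undersized (n << p) data set with a smoothed, full-rank estimator, fit PPCA to it, and choose the number of PPCs with an information criterion. Below is a minimal Python sketch of that pipeline, not the authors' code: the identity-target shrinkage estimator (in the spirit of the cited Ledoit-Wolf work) stands in for the paper's hybridized maximum entropy estimators, and `alpha`, the simulated data, and the search range for `q` are all illustrative assumptions.

```python
# Hypothetical sketch (not the authors' implementation): PPCA on a smoothed
# covariance matrix for an undersized data set, with AIC choosing the
# number of probabilistic PCs.
import numpy as np

rng = np.random.default_rng(0)
n, p = 30, 200                       # n << p: sample covariance is singular
X = rng.standard_normal((n, p))
Xc = X - X.mean(axis=0)

S = Xc.T @ Xc / n                    # sample covariance, rank <= n - 1 < p

# Smoothed (shrinkage) covariance: convex combination with a scaled
# identity target; alpha is a hand-picked illustrative weight.
alpha = 0.2
target = np.trace(S) / p * np.eye(p)
S_smooth = (1 - alpha) * S + alpha * target   # full rank, well conditioned

# Eigenvalues in decreasing order drive the PPCA profile likelihood
# (Tipping & Bishop form).
lam = np.sort(np.linalg.eigvalsh(S_smooth))[::-1]

def ppca_aic(q):
    sigma2 = lam[q:].mean()                    # ML estimate of noise variance
    loglik = -0.5 * n * (p * np.log(2 * np.pi)
                         + np.log(lam[:q]).sum()
                         + (p - q) * np.log(sigma2)
                         + p)
    k = p * q - q * (q - 1) // 2 + 1           # free parameters in W and sigma2
    return -2 * loglik + 2 * k                 # AIC = -2 log L + 2k

q_best = min(range(1, 11), key=ppca_aic)
print("chosen number of PPCs:", q_best)
```

CAIC and ICOMP, as used in the paper, would swap the `2 * k` penalty for `k * (log(n) + 1)` and for a covariance-complexity term, respectively; the rest of the pipeline is unchanged.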
Property / Mathematics Subject Classification ID: 92B15
Property / Mathematics Subject Classification ID: 62P10
Property / Mathematics Subject Classification ID: 62-07
Property / zbMATH DE Number: 6623996
Property / zbMATH Keywords: principal component analysis
Property / zbMATH Keywords: maximum entropy covariance matrix
Property / zbMATH Keywords: hybridized smoothed covariance estimators
Property / zbMATH Keywords: Akaike's information criterion
Property / Wikidata QID: Q35594708
Property / describes a project that uses: boost
Property / MaRDI profile type: MaRDI publication profile
Property / full work available at URL: https://doi.org/10.1155/2015/370640
Property / OpenAlex ID: W2036324035
Property / cites work: Principal component analysis.
Property / cites work: On the maximum-entropy approach to undersized samples
Property / cites work: Probabilistic Principal Component Analysis
Property / cites work: Q4769776
Property / cites work: Model selection and Akaike's information criterion (AIC): The general theory and its analytical extensions
Property / cites work: On the information-based measure of covariance complexity and its application to the evaluation of multivariate linear models
Property / cites work: Q3286740
Property / cites work: Q3185327
Property / cites work: Q2974127
Property / cites work: Empirical Bayes estimation of the multivariate normal covariance matrix
Property / cites work: Q3878557
Property / cites work: A well-conditioned estimator for large-dimensional covariance matrices
Property / cites work: Q5632131
Property / cites work: Informational complexity criteria for regression models.
Property / cites work: Akaike's information criterion and recent developments in information complexity
Property / cites work: Q4139463
Property / DBLP publication ID: journals/cmmm/PamukcuBC15
Latest revision as of 01:26, 14 November 2024
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | A novel hybrid dimension reduction technique for undersized high dimensional gene expression data sets using information complexity criterion for cancer classification | scientific article | |
Statements
A novel hybrid dimension reduction technique for undersized high dimensional gene expression data sets using information complexity criterion for cancer classification (English)
6 September 2016