Multiple suboptimal solutions for prediction rules in gene expression data (Q382664)
From MaRDI portal
Property / Mathematics Subject Classification ID: 92D10 / rank: Normal rank
Property / Mathematics Subject Classification ID: 68T10 / rank: Normal rank
Property / Mathematics Subject Classification ID: 62H20 / rank: Normal rank
Property / Mathematics Subject Classification ID: 62H30 / rank: Normal rank
Property / zbMATH DE Number: 6231305 / rank: Normal rank
Property / Wikidata QID: Q30624006 / rank: Normal rank
Property / describes a project that uses: ElemStatLearn / rank: Normal rank
Property / describes a project that uses: PoiClaClu / rank: Normal rank
Property / MaRDI profile type: MaRDI publication profile / rank: Normal rank
Property / full work available at URL: https://doi.org/10.1155/2013/798189 / rank: Normal rank
Property / OpenAlex ID: W2117073647 / rank: Normal rank
Property / cites work: Q4864293 / rank: Normal rank
Property / cites work: Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ¹ minimization / rank: Normal rank
Property / cites work: Stable signal recovery from incomplete and inaccurate measurements / rank: Normal rank
Property / cites work: Exploration, normalization, and summaries of high density oligonucleotide array probe level data / rank: Normal rank
Property / cites work: Gene selection for cancer classification using support vector machines / rank: Normal rank
Property / cites work: Regularization and Variable Selection Via the Elastic Net / rank: Normal rank
Property / cites work: Q3093381 / rank: Normal rank
Property / cites work: The elements of statistical learning. Data mining, inference, and prediction / rank: Normal rank
Property / cites work: Q4792072 / rank: Normal rank
Property / cites work: A boosting method for maximization of the area under the ROC curve / rank: Normal rank
Property / cites work: On biological validity indices for soft clustering algorithms for gene expression data / rank: Normal rank
Property / cites work: Sparse and Redundant Representations / rank: Normal rank
Property / cites work: Classification and clustering of sequencing data using a Poisson model / rank: Normal rank
Latest revision as of 01:42, 7 July 2024
Language | Label | Description | Also known as |
---|---|---|---|
English | Multiple suboptimal solutions for prediction rules in gene expression data | scientific article | |
Statements
Multiple suboptimal solutions for prediction rules in gene expression data (English)
21 November 2013
Summary: This paper discusses mathematical and statistical aspects of analysis methods applied to microarray gene expression data. We focus on pattern recognition to extract informative features embedded in the data for the prediction of phenotypes. It has been pointed out that severe difficulties arise from the imbalance between the number of observed genes and the number of observed subjects. We reanalyze published microarray gene expression data and detect many other gene sets with almost the same performance. We conclude that, at the current stage, it is not possible to extract only the informative genes with high performance from among all observed genes. We investigate why this difficulty persists even though analysis methods and learning algorithms are actively being proposed in statistical machine learning. We focus on the mutual coherence, or the absolute value of the Pearson correlation between two genes, and describe the distributions of the correlation for the selected set of genes and for the total set. We show that the problem of finding informative genes in high-dimensional data is ill-posed and that the difficulty is closely related to the mutual coherence.
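The quantity the summary centres on, the absolute Pearson correlation between pairs of genes, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that mutual coherence means the largest absolute pairwise correlation within a gene set (the usual definition for standardized columns), which may differ from the paper's exact definition. The function names and the toy expression values are hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def abs_correlations(expression, genes):
    """Absolute Pearson correlation for every unordered pair of the given genes."""
    return {
        (g, h): abs(pearson(expression[g], expression[h]))
        for g, h in combinations(genes, 2)
    }


def mutual_coherence(expression, genes):
    """Largest absolute pairwise correlation among the given genes (assumed definition)."""
    return max(abs_correlations(expression, genes).values())


# Toy data, made up for illustration: gene name -> expression value per subject.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],    # exact multiple of g1, so fully correlated
    "g3": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with g1 and g2
}

all_genes = list(expression)
selected = ["g1", "g3"]  # hypothetical "selected set" of genes

coherence_all = mutual_coherence(expression, all_genes)
coherence_selected = mutual_coherence(expression, selected)
```

In this toy case the total set contains a pair of interchangeable genes (`g1` and `g2`), so its coherence is high, while the selected set has none. Comparing the full dictionaries returned by `abs_correlations` for the two sets corresponds to comparing the correlation distributions the summary mentions.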