Efficient multi-class cancer diagnosis algorithm, using a global similarity pattern

Authors: Tae Young Yang
Affiliations: Department of Mathematics, Myongji University, Kyonggi 449-728, Republic of Korea
Venue: Computational Statistics & Data Analysis
Year: 2009

Abstract

Because different subtypes of a cancer respond differently to the same therapy, it is important to diagnose a patient's cancer subtype correctly and then tailor the treatment to that patient. DNA microarrays have recently received a great deal of attention in cancer diagnosis. Given a microarray dataset covering multiple cancer subtypes, the proposed procedure combines, in sequence, a gene-rank algorithm for detecting significant genes with a pattern-based classifier for diagnosing a query test sample. In detail, genes are ranked within each cancer subtype to determine a characteristic pattern, and the classifier measures the similarity between the sample and each subtype, based on the selected top-ranked genes. The sample is then assigned to the subtype to which it is most similar. This differs from the widely applied k-nearest neighbor approaches, which rely on local similarity patterns: the procedure instead uses reliable global patterns to classify test samples. Empirical studies on public datasets show that the top-ranked genes in each subtype provide a clear means of discrimination, and that the classifier needs only a few significant genes to classify the test samples correctly. Owing to its simplicity, ease of use, and robustness, the procedure is an excellent alternative to more complex approaches.
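
The two-stage idea in the abstract (rank genes per subtype, then assign a query sample to the subtype whose pattern it matches best) can be illustrated with a small sketch. The abstract does not state the ranking statistic or the similarity measure, so everything specific here is an assumption made for illustration: the one-vs-rest scaled mean difference used for ranking, the "fraction of pattern genes on the subtype's side of a midpoint" similarity, the function names `subtype_pattern`, `similarity`, and `classify`, and the toy expression values. It is not the paper's algorithm.

```python
from statistics import mean, pstdev


def subtype_pattern(samples, labels, subtype, top_k):
    """Build a global pattern for one subtype from its top-ranked genes.

    Ranking statistic (an assumption, not taken from the paper): one-vs-rest
    difference of means, scaled by the summed standard deviations.
    """
    inside = [s for s, y in zip(samples, labels) if y == subtype]
    outside = [s for s, y in zip(samples, labels) if y != subtype]
    scored = []
    for g in range(len(samples[0])):
        a = [s[g] for s in inside]
        b = [s[g] for s in outside]
        score = (mean(a) - mean(b)) / (pstdev(a) + pstdev(b) + 1e-9)
        midpoint = (mean(a) + mean(b)) / 2
        direction = 1 if score > 0 else -1  # up- or down-regulated in subtype
        scored.append((abs(score), g, direction, midpoint))
    scored.sort(reverse=True)  # strongest discriminators first
    return [(g, direction, mid) for _, g, direction, mid in scored[:top_k]]


def similarity(query, pattern):
    """Fraction of pattern genes where the query lies on the subtype's side
    of the midpoint (a hypothetical similarity measure)."""
    hits = sum(1 for g, direction, mid in pattern
               if (query[g] - mid) * direction > 0)
    return hits / len(pattern)


def classify(query, samples, labels, top_k=2):
    """Assign the query to the subtype with the highest pattern similarity."""
    scores = {}
    for subtype in sorted(set(labels)):
        pattern = subtype_pattern(samples, labels, subtype, top_k)
        scores[subtype] = similarity(query, pattern)
    return max(scores, key=scores.get), scores


# Toy data (invented): rows are samples, columns are four genes.
SAMPLES = [
    [9.0, 1.0, 1.0, 5.0], [8.0, 2.0, 1.5, 5.2],   # subtype A
    [1.0, 9.0, 1.0, 5.1], [2.0, 8.0, 2.5, 4.9],   # subtype B
    [1.5, 1.0, 9.0, 5.0], [1.0, 2.0, 8.0, 5.3],   # subtype C
]
LABELS = ["A", "A", "B", "B", "C", "C"]

if __name__ == "__main__":
    print(classify([8.5, 1.5, 1.2, 5.0], SAMPLES, LABELS))
```

Unlike a k-nearest neighbor rule, which compares the query with individual training samples, this sketch compares it with one summary pattern per subtype built from all training samples, which is the sense of "global" the abstract contrasts with "local".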