APPLYING DATA MINING TECHNIQUES FOR CANCER CLASSIFICATION ON GENE EXPRESSION DATA
Cybernetics and Systems
Identifying significant genes with FM/CM-GA
MACMESE'09 Proceedings of the 11th WSEAS international conference on Mathematical and computational methods in science and engineering
Recent studies on molecular-level classification of tissues have produced remarkable results and indicate that gene expression assays could significantly aid the development of efficient cancer diagnosis and classification platforms. However, cancer classification based on DNA array data remains a difficult problem. The main challenge is that the number of genes is overwhelmingly large relative to the number of training samples, which makes accurate classification harder. This paper applies a genetic algorithm (GA) whose initial solution is provided by t-statistics (t-GA) to select a group of relevant genes from cancer microarray data. A decision-tree-based cancer classifier is then built on the selected genes. The approach is evaluated by comparing it with other gene selection methods on publicly available gene expression datasets. Experimental results indicate that t-GA achieves the highest accuracy among the compared methods. The Z-score figure also shows that the gene selection performed by t-GA is reproducible.
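
The abstract does not give the details of t-GA, so the following is only a minimal sketch of the general idea, under several assumptions: genes are ranked by the absolute value of Welch's t-statistic, the top-ranked genes seed the initial GA population, and an elitist GA then searches for a better subset. All function names (`welch_t`, `rank_genes`, `loo_accuracy`, `t_ga`, `make_toy_data`), the GA operators, and the parameter values are hypothetical. To keep the example dependency-free, the fitness function is a leave-one-out nearest-centroid accuracy, which stands in for the decision tree classifier used in the paper, and the data are synthetic.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic for one gene across two classes."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_genes(X, y):
    """Return gene indices sorted by |t|, most discriminative first."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append(abs(welch_t(a, b)))
    return sorted(range(len(scores)), key=lambda g: -scores[g])


def loo_accuracy(X, y, genes):
    """Stand-in fitness (assumption): leave-one-out nearest-centroid
    accuracy on the chosen genes. This is NOT the paper's decision tree."""
    if not genes:
        return 0.0
    hits = 0
    for i in range(len(X)):
        dists = {}
        for c in (0, 1):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == c]
            dists[c] = sum((X[i][g] - mean(r[g] for r in rows)) ** 2
                           for g in genes)
        hits += min(dists, key=dists.get) == y[i]
    return hits / len(X)


def t_ga(X, y, k=5, pop_size=12, generations=15, seed=0):
    """Elitist GA over gene subsets, seeded with the top-k t-ranked genes."""
    rng = random.Random(seed)
    n_genes = len(X[0])
    top = rank_genes(X, y)[:k]

    def mutate(ind):
        out = set(ind)
        out.discard(rng.choice(sorted(out)))  # drop one selected gene
        out.add(rng.randrange(n_genes))       # add one random gene
        return frozenset(out)

    def crossover(p, q):
        pool = sorted(p | q)                  # genes carried by either parent
        return frozenset(rng.sample(pool, min(k, len(pool))))

    def fitness(ind):
        return loo_accuracy(X, y, sorted(ind))

    # Initial population: the t-statistic solution plus mutated copies of it.
    pop = [frozenset(top)] + [mutate(top) for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]          # the best half always survives
        children = [mutate(crossover(*rng.sample(elite, 2)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    best = max(pop, key=fitness)
    return sorted(best), fitness(best)


def make_toy_data(n_samples=20, n_genes=30, seed=1):
    """Synthetic data (not from the paper): genes 0-2 differ between classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [[rng.gauss(3.0 * label if g < 3 else 0.0, 1.0)
          for g in range(n_genes)] for label in y]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    genes, acc = t_ga(X, y)
    print("selected genes:", genes, "LOO accuracy:", acc)
```

Because the best half of each generation is carried over unchanged, the returned subset in this sketch can never score below the pure t-statistic seed under the chosen fitness. Swapping `loo_accuracy` for a cross-validated decision tree would bring the sketch closer to the pipeline the abstract describes.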