Serial Analysis of Gene Expression (SAGE) is a recently developed technology that simultaneously quantifies the expression levels of tens of thousands of genes in a population of cells. Unlike microarrays, which can measure only known genes, SAGE can monitor both known and unknown genes. Because cancers may be caused by genes that are not yet known, SAGE gene expression profiling is a better choice for cancer classification. Whereas a wide range of methods has been applied to traditional microarray-based cancer classification, relatively few studies have addressed SAGE-based cancer classification. In this study we evaluate popular machine learning methods (SVM, Naive Bayes, Nearest Neighbor, C4.5 and RIPPER) for classifying cancers from SAGE data. To cope with the high dimensionality of the data, we propose using the Chi-square statistic for tag/gene selection. Both binary and multicategory classification are investigated. The experiments use two human SAGE datasets: brain and breast. The results show that SVM and Naive Bayes are the top-performing SAGE classifiers and that Chi-square-based gene selection can improve the performance of all five classifiers investigated.
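As a rough illustration of Chi-square tag selection, the sketch below scores each tag by the Chi-square statistic of independence between tag presence and class label, then keeps the highest-scoring tags. It is a minimal sketch under stated assumptions, not the procedure used in the study: the binarization of tag counts into present/absent, the function names `chi_square_score` and `select_top_tags`, the `threshold` parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Chi-square statistic of independence between a discrete feature and class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # joint counts (feature value, class)
    feature_totals = Counter(feature)          # marginal counts per feature value
    class_totals = Counter(labels)             # marginal counts per class
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for c_value, c_count in class_totals.items():
            expected = f_count * c_count / n   # count expected under independence
            diff = observed.get((f_value, c_value), 0) - expected
            score += diff * diff / expected
    return score


def select_top_tags(expression, labels, k, threshold=0):
    """Rank tags by Chi-square score and return the k best.

    expression maps a tag name to its list of counts, one per library.
    Assumption: a tag is treated as 'present' when its count exceeds threshold.
    """
    scores = {
        tag: chi_square_score([count > threshold for count in counts], labels)
        for tag, counts in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): three tags measured in six libraries.
expression = {
    "TAG_A": [12, 9, 15, 0, 0, 0],
    "TAG_B": [3, 0, 4, 0, 5, 2],
    "TAG_C": [0, 0, 1, 7, 8, 6],
}
labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]

print(select_top_tags(expression, labels, k=2))  # ['TAG_A', 'TAG_C']
```

The selected tags would then serve as the reduced feature set passed to a classifier. Because the statistic handles any number of classes, the same scoring applies to both the binary and the multicategory setting.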