In feature selection for cancer classification, many existing methods consider genes individually, choosing the top-ranked genes by signal-to-noise statistic or correlation coefficient. However, the class-distinction information provided by such genes may overlap heavily, since their expression patterns are similar. Including many genes with similar expression patterns is redundant and results in highly complex classifiers. According to the principle of Occam's razor, simple models are preferable to complex ones if they achieve comparable prediction performance. In this paper, we introduce a new method for learning accurate, low-complexity classifiers from gene expression profiles. Our method uses mutual information to measure the relation between a set of genes, called a gene vector, and the class attribute of the samples. Gene vectors lie in higher-dimensional spaces than individual genes, so they are more diverse and can carry more information about the class attribute. Gene vectors are therefore preferable to individual genes for describing the class distinctions between samples. We validate our method on three gene expression profiles. Compared with results from the literature and with other well-known classification methods, our method achieves better or comparable prediction performance while producing lower-complexity models.
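To illustrate the idea of scoring a gene vector against the class attribute, the following is a minimal sketch that computes the mutual information I(X; Y) = H(X) + H(Y) - H(X, Y) for discretized expression values. It is not the authors' implementation: the abstract does not describe the discretization or the search over gene subsets, so the function names (`entropy`, `mutual_information`), the binary expression levels, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(samples, labels, gene_indices):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), where X is the gene vector
    formed by the selected gene indices and Y is the class attribute."""
    vectors = [tuple(sample[i] for i in gene_indices) for sample in samples]
    joint = list(zip(vectors, labels))
    return entropy(vectors) + entropy(labels) - entropy(joint)


# Hypothetical toy data (not from the paper): four samples, two genes
# discretized to 0/1, and a class that depends on both genes jointly.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

mi_gene0 = mutual_information(samples, labels, [0])    # single gene
mi_gene1 = mutual_information(samples, labels, [1])    # single gene
mi_pair = mutual_information(samples, labels, [0, 1])  # gene vector
print(mi_gene0, mi_gene1, mi_pair)
```

In this toy case each gene on its own shares no information with the class, while the two-gene vector determines it completely, which is the kind of situation where ranking genes individually would miss a compact, informative gene vector.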