Inferring decision trees using the minimum description length principle
Information and Computation
Tissue classification with gene expression profiles
RECOMB '00 Proceedings of the fourth annual international conference on Computational molecular biology
Machine Learning
Machine Learning
Machine Learning
On Estimating Probabilities in Tree Pruning
EWSL '91 Proceedings of the European Working Session on Machine Learning
Combining Divide-and-Conquer and Separate-and-Conquer for Efficient and Effective Rule Induction
ILP '99 Proceedings of the 9th International Workshop on Inductive Logic Programming
Covering vs divide-and-conquer for top-down induction of logic programs
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Hi-index | 0.01 |
Gene expression array technology has rapidly become a standard tool for biologists. Its use within areas such as diagnostics, toxicology, and genetics, calls for good methods for finding patterns and prediction models from the generated data. Rule induction is one promising candidate method due to several attractive properties such as high level of expressiveness and interpretability. In this work we investigate the use of rule induction methods for mining gene expression patterns from various cancer types. Three different rule induction methods are evalu-tedoon two public tumor tissue data sets. The methods are shown to obtain as good prediction accuracy as the best current methods, at the same time allowing for straightforward interpretation of the prediction models. These models typically consist of small sets of simple rules, which associate a few genes and expression levels with specific types of cancer. We also show that information gain is a useful measure for ranked feature selection in this domain.