The literature on protein function prediction is currently dominated by work that aims to maximize predictive accuracy while ignoring the validation and interpretation of discovered knowledge. These neglected issues matter because they can lead to new, biologically meaningful insights and hypotheses that advance biologists' understanding of protein functions. This paper critically evaluates the accuracy-centered approach and offers a different perspective, one that considers not only predictive accuracy but also the comprehensibility of the induced protein function prediction models. It makes two main contributions to the area of protein function prediction. First, it presents the case for discovering comprehensible protein function prediction models from data and discusses in detail the advantages of such models: increasing the biologist's confidence in the system's predictions, leading to new insights about the data and to the formulation of new biological hypotheses, and detecting errors in the data. Second, it presents a critical review of the pros and cons of several knowledge representations that can be used to support the discovery of comprehensible protein function prediction models.