A partially supervised classification approach to dominant and recessive human disease gene prediction

Authors:
Borja Calvo;Núria López-Bigas;Simon J. Furney;Pedro Larrañaga;Jose A. Lozano
Affiliations:
Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV-EHU, Paseo Manuel de Lardizabal 1, E-20018 San Sebastián, Spain;Research Unit on Biomedical Informatics, Universitat Pompeu Fabra, Dr. Aiguader 88, E-08003 Barcelona, Spain;Research Unit on Biomedical Informatics, Universitat Pompeu Fabra, Dr. Aiguader 88, E-08003 Barcelona, Spain;Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV-EHU, Paseo Manuel de Lardizabal 1, E-20018 San Sebastián, Spain;Intelligent Systems Group, Department of Computer Science and Artificial Intelligence, University of the Basque Country UPV-EHU, Paseo Manuel de Lardizabal 1, E-20018 San Sebastián, Spain
Venue:
Computer Methods and Programs in Biomedicine
Year:
2007

Abstract

The discovery of the genes involved in genetic diseases is an important step towards understanding the nature of these diseases. Laboratory identification is difficult and time-consuming, so computational methods can be very useful: in silico identification algorithms can guide future studies. Previous work on this topic has not taken into account that no reliable set of negative examples is available, since it is not possible to ensure that a given gene is unrelated to every genetic disease. In this paper, we take this characteristic of the problem into account and approach identification as a partially supervised classification problem. In addition, we make the prediction more specific by classifying, for the first time, genes causing dominant and recessive diseases separately. We base this separation on previous results showing that these two types of genes differ in their sequence properties. We apply a new model averaging algorithm to the identification of human genes associated with both dominant and recessive Mendelian diseases.
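
To make the partially supervised setting concrete, the following minimal sketch shows one generic way to score unlabeled examples when only positive examples are reliable. It is not the algorithm proposed in the paper, whose details the abstract does not give. The Gaussian naive Bayes scorer, the bagging-style resampling of provisional negatives, and every name and parameter (`fit_gaussian_nb`, `pu_average_scores`, `rounds`, `seed`) are assumptions made for illustration only.

```python
import math
import random
from statistics import mean, pvariance


def fit_gaussian_nb(rows):
    """Estimate a (mean, variance) pair per feature from a list of feature vectors."""
    columns = list(zip(*rows))
    # A small constant keeps the variance strictly positive.
    return [(mean(col), pvariance(col) + 1e-6) for col in columns]


def log_likelihood(params, x):
    """Log-density of x under independent per-feature Gaussians."""
    return sum(
        -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
        for (mu, var), value in zip(params, x)
    )


def pu_average_scores(positives, unlabeled, rounds=50, seed=0):
    """Average, over several rounds, the score of each unlabeled example.

    In each round a random subset of the unlabeled set is treated as
    provisional negatives (an assumption, since true negatives are unknown),
    and only the examples left out of that subset are scored.
    """
    rng = random.Random(seed)
    pos_params = fit_gaussian_nb(positives)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    subset_size = min(len(positives), len(unlabeled) - 1)

    for _ in range(rounds):
        chosen = set(rng.sample(range(len(unlabeled)), subset_size))
        neg_params = fit_gaussian_nb([unlabeled[i] for i in chosen])
        for i, x in enumerate(unlabeled):
            if i in chosen:
                continue  # score only examples not used as provisional negatives
            diff = log_likelihood(pos_params, x) - log_likelihood(neg_params, x)
            diff = max(-50.0, min(50.0, diff))  # clamp to avoid overflow in exp
            totals[i] += 1.0 / (1.0 + math.exp(-diff))
            counts[i] += 1

    # Examples never left out of the provisional negatives get NaN.
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]
```

Calling `pu_average_scores(known_disease_genes, remaining_genes)` with two lists of numeric feature vectors (hypothetical variable names) returns one averaged score per unlabeled example; higher scores indicate examples that look more like the known positives. Under the separation described in the abstract, such a procedure would be run once with dominant disease genes as positives and once with recessive ones.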