A comparison of machine learning techniques for detection of drug target articles

Authors:
Roxana Danger;Isabel Segura-Bedmar;Paloma Martínez;Paolo Rosso
Affiliations:
Natural Language Engineering Lab. - ELiRF. Dpto. de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain;Dpto. de Informática, Universidad Carlos III de Madrid, Leganés, Madrid, Spain;Dpto. de Informática, Universidad Carlos III de Madrid, Leganés, Madrid, Spain;Natural Language Engineering Lab. - ELiRF. Dpto. de Sistemas Informáticos y Computación, Universidad Politécnica de Valencia, Valencia, Spain
Venue:
Journal of Biomedical Informatics
Year:
2010

Abstract

Important progress in treating diseases has been made possible by the identification of drug targets. Drug targets are the molecular structures whose abnormal activity, associated with a disease, can be modified by drugs, improving the health of patients. The pharmaceutical industry needs to prioritize their identification and validation in order to reduce long and costly drug development times. In the last two decades, our knowledge about drugs, their mechanisms of action, and drug targets has increased rapidly. Nevertheless, most of this knowledge is hidden in millions of medical articles and textbooks. Extracting knowledge from this large amount of unstructured information is laborious, even for human experts. The aim of this paper is the identification of drug target articles, a crucial first step toward the automatic extraction of information from texts. Several machine learning techniques have been compared in order to obtain a satisfactory classifier for detecting drug target articles using semantic information from biomedical resources such as the Unified Medical Language System. The best result was achieved by a Fuzzy Lattice Reasoning classifier, which reaches an area under the ROC curve of 98%.