Automatic extraction of protein-protein interaction (PPI) information from scientific literature is important for building PPI databases, studying biological networks, and discovering new biological knowledge through automatic hypothesis generation. In this paper, we present a new method for PPI extraction based on a mixture of logistic models. The method automatically clusters interaction words (words that describe the interactions of protein pairs) into groups with similar grammatical properties, and a logistic model is fitted for each cluster. The directionality of an interaction is essential information for many protein interactions and is important for building directed biological networks. Most current PPI extraction methods do not extract this directional information, in part because corpora annotated with directionality are lacking. We introduce a new corpus, PICAD, that includes directional annotation for evaluating PPI extraction tools. The corpus is available at http://stat.fsu.edu/~jinfeng/resources/PICAD.txt. In addition, we propose an ensemble approach that uses logistic regression, Bayesian networks, and SVM to identify PPIs. We show that an ensemble of classifiers captures different features in the text, and we report an F-measure of 75.7% on the new corpus.
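To make the ensemble idea and the F-measure concrete, here is a minimal sketch in Python. It rests on assumptions: the abstract does not say how the three classifiers are combined, so simple majority voting stands in for the combination rule, and the function names and toy labels are hypothetical and not taken from PICAD. The F-measure is the usual harmonic mean of precision and recall.

```python
def majority_vote(predictions_per_classifier):
    """Combine binary labels (1 = interaction, 0 = no interaction).

    Assumption: plain majority voting; the abstract does not specify
    the actual combination rule used by the ensemble.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        # A candidate pair is labeled 1 when more than half the classifiers say 1.
        combined.append(1 if 2 * sum(votes) > len(votes) else 0)
    return combined


def f_measure(gold, predicted):
    """Balanced F-measure: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels for six candidate protein pairs (not real data).
gold = [1, 0, 1, 1, 0, 1]
logistic = [1, 0, 0, 1, 0, 1]   # stand-in for logistic regression output
bayes_net = [1, 1, 1, 0, 0, 1]  # stand-in for Bayesian network output
svm = [0, 0, 1, 1, 1, 1]        # stand-in for SVM output

ensemble = majority_vote([logistic, bayes_net, svm])
for name, pred in [("logistic", logistic), ("bayes_net", bayes_net),
                   ("svm", svm), ("ensemble", ensemble)]:
    print(f"{name}: F = {f_measure(gold, pred):.3f}")
```

In this toy case each classifier makes errors on different pairs, so the vote corrects them; whether that holds on real text depends on how correlated the classifiers' errors are.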