Motivation: Automatic knowledge discovery and efficient information access, such as named entity recognition and the extraction of relations between entities, have recently become critical issues for the biomedical literature. However, the inherent difficulty of the relation extraction task, mainly caused by the diversity of natural language, is further compounded in the biomedical domain because biomedical sentences are commonly long and complex. In addition, relation extraction often involves modeling long-range dependencies, discontiguous word patterns and semantic relations, to which the pattern-based methodology is not directly applicable.

Results: In this article, we shift the focus of biomedical relation extraction from the problem of pattern extraction to the problem of kernel construction. We suggest four kernels (predicate, walk, dependency and hybrid) to adequately encapsulate the information required for relation prediction, based on the sentential structures that involve the two entities. For this purpose, we view the dependency structure of a sentence as a graph, which allows the system to isolate the essential part of a complex syntactic structure by finding the shortest path between the entities. The kernels we suggest are augmented gradually, from flat feature descriptions to structural descriptions of the shortest paths. As a result, we obtain a very promising F-score of 77.5 with the walk kernel on the Language Learning in Logic (LLL) 05 genic interaction shared task.

Availability: The algorithms used are free for academic research and are available from our Web site http://mllab.sogang.ac.kr/~shkim/LLL05.tar.gz.

Contact: shkim@lex.yonsei.ac.kr
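The idea of treating a dependency parse as a graph and keeping only the shortest path between two entities can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy sentence, the relation labels and the use of three-element windows as "walks" are all assumptions made for illustration, and words are used directly as node identifiers (a real system would presumably use token positions so that repeated words do not collide).

```python
from collections import Counter, deque


def shortest_path(edges, source, target):
    """Breadth-first search over an undirected view of a dependency graph.

    edges: list of (head, relation, dependent) triples (hypothetical format).
    Returns an alternating list [node, relation, node, ...] or None.
    """
    adjacency = {}
    for head, relation, dependent in edges:
        adjacency.setdefault(head, []).append((dependent, relation))
        adjacency.setdefault(dependent, []).append((head, relation))

    previous = {source: None}  # node -> (parent, relation) used to reach it
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour, relation in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = (node, relation)
                queue.append(neighbour)

    if target not in previous:
        return None

    # Walk back from the target to the source, then reverse.
    path = [target]
    node = target
    while previous[node] is not None:
        parent, relation = previous[node]
        path.extend([relation, parent])
        node = parent
    path.reverse()
    return path


def walk_features(path):
    """Count every window of three consecutive path elements.

    Windows alternate between node-relation-node and relation-node-relation.
    This is an illustrative feature choice, not the paper's exact definition.
    """
    return Counter(tuple(path[i:i + 3]) for i in range(len(path) - 2))


def walk_kernel(path_a, path_b):
    """Dot product of the two walk-count vectors."""
    features_a = walk_features(path_a)
    features_b = walk_features(path_b)
    return sum(count * features_b[walk] for walk, count in features_a.items())


# Toy parse of "GerE stimulates cotD transcription" (labels are assumed).
EDGES = [
    ("stimulates", "subj", "GerE"),
    ("stimulates", "obj", "transcription"),
    ("transcription", "nmod", "cotD"),
]

if __name__ == "__main__":
    path = shortest_path(EDGES, "GerE", "cotD")
    print(path)
    print(walk_kernel(path, path))
```

Because the kernel is a plain dot product over counts, it could be passed to any kernel-based learner that accepts a precomputed Gram matrix. In practice the entity names would likely be replaced by placeholders before counting, so that two sentences with different genes but the same syntactic connection score as similar.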