In this paper, we construct a biomedical semantic role labeling (SRL) system that can be used to facilitate relation extraction. First, we construct a proposition bank on top of the popular biomedical GENIA treebank, following the PropBank annotation scheme. We annotate only the predicate-argument structures (PASs) of thirty frequently used biomedical predicates and their corresponding arguments. Second, we use our proposition bank to train a biomedical SRL system based on a maximum entropy (ME) model. Third, we automatically generate argument-type templates that can be used to improve the classification of biomedical argument types. Our experimental results show that a newswire SRL system that achieves an F-score of 86.29% in the newswire domain maintains an F-score of 64.64% when ported to the biomedical domain. Training on our annotated biomedical corpus increases that F-score by 22.9%. Adding automatically generated template features further increases the overall F-score by 0.47% and the F-score for adjunct arguments (AM) by 1.57%.
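To make the ME classification step concrete, the following is a minimal sketch of a maximum entropy (multinomial logistic) classifier that assigns an argument type to a candidate constituent. It is an illustration under stated assumptions, not the system described above: the feature names (`pred=`, `phrase=`, `position=`, `head=`), the toy training examples, the label set, and the plain gradient-ascent training loop are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict

# Hypothetical label set: two core arguments and one adjunct (AM) type.
LABELS = ["Arg0", "Arg1", "AM-LOC"]

# Hypothetical toy training data: (features of a constituent, gold label).
TRAIN = [
    (["pred=activate", "phrase=NP", "position=before"], "Arg0"),
    (["pred=activate", "phrase=NP", "position=after"], "Arg1"),
    (["pred=activate", "phrase=PP", "position=after", "head=in"], "AM-LOC"),
    (["pred=bind", "phrase=NP", "position=before"], "Arg0"),
    (["pred=bind", "phrase=NP", "position=after"], "Arg1"),
    (["pred=bind", "phrase=PP", "position=after", "head=in"], "AM-LOC"),
]


def predict_proba(weights, feats, labels):
    """Return P(label | feats) as a softmax over summed feature weights."""
    scores = {y: sum(weights.get((f, y), 0.0) for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train_maxent(examples, labels, epochs=200, lr=0.5):
    """Fit (feature, label) weights by batch gradient ascent on log-likelihood.

    The gradient for each weight is the empirical feature count minus the
    count expected under the current model. No regularization is applied.
    """
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in examples:
            probs = predict_proba(weights, feats, labels)
            for f in feats:
                grad[(f, gold)] += 1.0        # empirical count
                for y in labels:
                    grad[(f, y)] -= probs[y]  # expected count
        for key, g in grad.items():
            weights[key] += lr * g / len(examples)
    return weights


def classify(weights, feats, labels=LABELS):
    """Pick the most probable argument type for one constituent."""
    probs = predict_proba(weights, feats, labels)
    return max(probs, key=probs.get)


MODEL = train_maxent(TRAIN, LABELS)

if __name__ == "__main__":
    # A constituent of a predicate that never appeared in training.
    unseen = ["pred=express", "phrase=PP", "position=after", "head=in"]
    print(classify(MODEL, unseen), predict_proba(MODEL, unseen, LABELS))
```

Because the weights are attached to (feature, label) pairs, a template-derived feature could be added to a constituent's feature list in the same way as any other string feature; how the templates themselves are generated is not shown here.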