BIOSMILE: adapting semantic role labeling for biomedical verbs: an exponential model coupled with automatically generated template features

Authors:
Richard Tzong-Han Tsai;Wen-Chi Chou;Yu-Chun Lin;Cheng-Lung Sung;Wei Ku;Ying-Shan Su;Ting-Yi Sung;Wen-Lian Hsu
Affiliations:
Academia Sinica and National Taiwan University;Academia Sinica;Academia Sinica and National Taiwan University;Academia Sinica;Academia Sinica and National Taiwan University;Academia Sinica and National Taiwan University;Academia Sinica;Academia Sinica
Venue:
LNLBioNLP '06 Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology
Year:
2006




Abstract

In this paper, we construct a biomedical semantic role labeling (SRL) system that can be used to facilitate relation extraction. First, we construct a proposition bank on top of the popular biomedical GENIA treebank, following the PropBank annotation scheme. We annotate only the predicate-argument structures (PASs) of thirty frequently used biomedical predicates and their corresponding arguments. Second, we use our proposition bank to train a biomedical SRL system based on a maximum entropy (ME) model. Third, we automatically generate argument-type templates that can be used to improve the classification of biomedical argument types. Our experimental results show that a newswire SRL system that achieves an F-score of 86.29% in the newswire domain can maintain an F-score of 64.64% when ported to the biomedical domain. By using our annotated biomedical corpus, we can increase that F-score by 22.9%. Adding automatically generated template features further increases the overall F-score by 0.47% and the F-score for adjunct arguments (AM) by 1.57%.
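
As a rough illustration of what a maximum entropy (exponential) model computes when it classifies an argument, the sketch below scores each candidate role by summing the weights of the active features and then normalizes the exponentiated scores. It is only a toy under stated assumptions: the feature names, role labels, and weight values are hypothetical and are not taken from the paper, and no training step is shown.

```python
import math

# Hypothetical (feature, role) weights. The values are illustrative only
# and are NOT taken from the paper or learned from any corpus.
WEIGHTS = {
    ("phrase_type=NP", "Arg0"): 1.2,
    ("phrase_type=NP", "Arg1"): 0.9,
    ("position=before", "Arg0"): 0.8,
    ("position=after", "Arg1"): 1.1,
    ("phrase_type=PP", "AM-LOC"): 1.5,
}

# Hypothetical label set in the PropBank naming style.
ROLES = ["Arg0", "Arg1", "AM-LOC"]


def role_probabilities(features):
    """Return p(role | features) under an exponential (maximum entropy) model.

    Each role's score is the sum of the weights of its active features;
    the probability is exp(score) divided by the sum over all roles.
    """
    scores = {
        role: sum(WEIGHTS.get((feat, role), 0.0) for feat in features)
        for role in ROLES
    }
    normalizer = sum(math.exp(score) for score in scores.values())
    return {role: math.exp(score) / normalizer for role, score in scores.items()}


def best_role(features):
    """Pick the role with the highest model probability."""
    probs = role_probabilities(features)
    return max(probs, key=probs.get)
```

Calling `best_role(["phrase_type=NP", "position=before"])` selects the role whose summed feature weights are largest. In a setup like the one the abstract describes, automatically generated template features would presumably enter as additional entries in the feature list, but how the authors encode them is not specified here.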