A semi-automatic method for annotating a biomedical proposition bank

Authors:
Wen-Chi Chou;Richard Tzong-Han Tsai;Ying-Shan Su;Wei Ku;Ting-Yi Sung;Wen-Lian Hsu
Affiliations:
Academia Sinica, Taiwan, ROC;Academia Sinica, Taiwan, ROC and National Taiwan University, Taiwan, ROC;Academia Sinica, Taiwan, ROC;Academia Sinica, Taiwan, ROC and National Taiwan University, Taiwan, ROC;Academia Sinica, Taiwan, ROC;Academia Sinica, Taiwan, ROC
Venue:
LAC '06 Proceedings of the Workshop on Frontiers in Linguistically Annotated Corpora 2006
Year:
2006

Abstract

In this paper, we present a semi-automatic approach for annotating semantic information in biomedical texts. The information is used to construct a biomedical proposition bank called BioProp. Like PropBank in the newswire domain, BioProp contains annotations of predicate-argument structures and semantic roles in a treebank schema. To construct BioProp, a semantic role labeling (SRL) system trained on PropBank first annotates the corpus automatically; human annotators then correct the incorrect tagging results. To suit the needs of the biomedical domain, we modify the PropBank annotation guidelines and characterize semantic roles as components of biological events. The method can substantially reduce annotation effort, and we introduce a measure of the upper bound on the effort saved. Thus far, the method has been applied experimentally to a 4,389-sentence treebank corpus for the construction of BioProp. Inter-annotator agreement, measured by the kappa statistic, reaches 0.95 for the combined decision of role identification and classification when all argument labels are considered. In addition, we show that, when trained on BioProp, our biomedical SRL system, BIOSMILE, achieves an F-score of 87%.
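
The two evaluation figures quoted in the abstract, the kappa statistic and the F-score, can be illustrated with a minimal sketch. The abstract does not say which kappa variant was used, so Cohen's kappa for two annotators is assumed here; the function names and the label strings in the usage note are hypothetical and are not taken from BioProp or BIOSMILE.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items.

    Assumption: two annotators, one categorical label per item.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions,
    # summed over labels (a Counter returns 0 for unseen labels).
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def f_score(num_correct, num_predicted, num_gold):
    """Balanced F-score from counts of correct, predicted, and gold arguments."""
    precision = num_correct / num_predicted if num_predicted else 0.0
    recall = num_correct / num_gold if num_gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, calling `cohen_kappa(["Arg0", "Arg1", "NONE"], ["Arg0", "Arg1", "NONE"])` on two identical label sequences gives perfect agreement, and `f_score` takes the number of correctly labeled arguments together with the totals predicted by the system and present in the gold annotation.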