BioNLP '08 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Overview of BioNLP'09 shared task on event extraction
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task
Learning the scope of hedge cues in biomedical texts
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Using hedges to enhance a disease outbreak report text mining system
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
The CoNLL-2010 shared task: learning to detect hedges and their scope in natural language text
CoNLL '10: Shared Task Proceedings of the Fourteenth Conference on Computational Natural Language Learning --- Shared Task
Cross-genre and cross-domain detection of semantic uncertainty
Computational Linguistics
Hi-index | 0.00 |
With the dramatic growth of scientific publishing, Information Extraction (IE) systems are becoming an increasingly important tool for large scale data analysis. Hedge detection and uncertainty classification are important components of a high precision IE system. This paper describes a two part supervised system which classifies words as hedge or non-hedged and sentences as certain or uncertain in biomedical and Wikipedia data. In the first stage, our system trains a logistic regression classifier to detect hedges based on lexical and Part-of-Speech collocation features. In the second stage, we use the output of the hedge classifier to generate sentence level features based on the number of hedge cues, the identity of hedge cues, and a Bag-of-Words feature vector to train a logistic regression classifier for sentence level uncertainty. With the resulting classification, an IE system can then discard facts and relations extracted from these sentences or treat them as appropriately doubtful. We present results for in domain training and testing and cross domain training and testing based on a simple union of training sets.