Cross-genre and cross-domain detection of semantic uncertainty

Authors:
György Szarvas;Veronika Vincze;Richárd Farkas;György Móra;Iryna Gurevych
Affiliations:
Technische Universität Darmstadt;Hungarian Academy of Sciences;Universität Stuttgart;University of Szeged;Technische Universität Darmstadt
Venue:
Computational Linguistics
Year:
2012

Abstract

Uncertainty is an important linguistic phenomenon that is relevant in various Natural Language Processing applications, in diverse genres from medical to community-generated, newswire, or scientific discourse, and in domains from science to the humanities. The semantic uncertainty of a proposition can in most cases be identified using a finite dictionary (i.e., lexical cues), and the key steps of uncertainty detection in an application are locating the (genre- and domain-specific) lexical cues, disambiguating them, and linking them with the units of interest for the particular application (e.g., identified events in information extraction). In this study, we focus on the genre and domain differences of the context-dependent semantic uncertainty cue recognition task. We introduce a unified subcategorization of semantic uncertainty, as different domain applications can apply different uncertainty categories. Based on this categorization, we normalized the annotation of three corpora and present results with a state-of-the-art uncertainty cue recognition model for four fine-grained categories of semantic uncertainty. Our results reveal the domain and genre dependence of the problem; nevertheless, we also show that even a distant source domain data set can contribute to the recognition and disambiguation of uncertainty cues, efficiently reducing the annotation costs needed to cover a new domain. Thus, the unified subcategorization and domain adaptation for training the models offer an efficient solution for cross-domain and cross-genre semantic uncertainty recognition.
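
The first two steps named in the abstract, locating lexical cues and disambiguating them, can be illustrated with a minimal Python sketch. It is not the model evaluated in the paper: the cue lexicon `CUE_LEXICON`, the function names, and the single disambiguation rule (a capitalized "May" inside a sentence is read as the month) are all hypothetical, chosen only to show why a dictionary lookup alone is not enough. The third step, linking cues to application-specific units such as events, is left out.

```python
import re

# Hypothetical toy lexicon; the paper's actual cue lists are not given here.
CUE_LEXICON = {"may", "might", "suggest", "suggests", "possible", "probably"}


def locate_cues(sentence):
    """Return (token, start, end) for every word found in the lexicon."""
    return [
        (m.group(0), m.start(), m.end())
        for m in re.finditer(r"[A-Za-z]+", sentence)
        if m.group(0).lower() in CUE_LEXICON
    ]


def is_uncertain_use(sentence, start, end):
    """Toy disambiguation rule (an assumption, not the paper's classifier):
    a capitalized 'May' that is not sentence-initial is taken to be the month."""
    token = sentence[start:end]
    return not (token == "May" and start > 0)


def detect_cues(sentence):
    """Locate candidate cues, then keep only those judged to express uncertainty."""
    return [
        token
        for token, start, end in locate_cues(sentence)
        if is_uncertain_use(sentence, start, end)
    ]


if __name__ == "__main__":
    print(detect_cues("These results suggest that the protein may bind DNA."))
    print(detect_cues("The trial started in May 2010."))
```

In a realistic system the hand-written rule would be replaced by a learned, context-dependent classifier, which is where the genre and domain differences discussed in the abstract come into play.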