Combining labeled and unlabeled data with co-training
COLT' 98 Proceedings of the eleventh annual conference on Computational learning theory
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Table extraction using conditional random fields
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval
Contextual word similarity and estimation from sparse data
ACL '93 Proceedings of the 31st annual meeting on Association for Computational Linguistics
Journal of Biomedical Informatics
Exploring hedge identification in biomedical literature
Journal of Biomedical Informatics
Multi-dimensional classification of biomedical text
Bioinformatics
BioNLP '06 Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis
Exploring text and image features to classify images in bioscience literature
BioNLP '06 Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis
The BioScope corpus: annotation for negation, uncertainty and their scope in biomedical texts
BioNLP '08 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Learning the scope of hedge cues in biomedical texts
BioNLP '09 Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
ConText: an algorithm for identifying contextual features from clinical text
BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
IJCAI'89 Proceedings of the 11th international joint conference on Artificial intelligence - Volume 1
IWCS '11 Proceedings of the Ninth International Conference on Computational Semantics
Hi-index | 0.00 |
Objective: Hedging is frequently used in both the biological literature and clinical notes to denote uncertainty or speculation. It is important for text-mining applications to detect hedge cues and their scope; otherwise, uncertain events are incorrectly identified as factual events. However, due to the complexity of language, identifying hedge cues and their scope in a sentence is not a trivial task. Our objective was to develop an algorithm that would automatically detect hedge cues and their scope in biomedical literature. Methodology: We used conditional random fields (CRFs), a supervised machine-learning algorithm, to train models to detect hedge cue phrases and their scope in biomedical literature. The models were trained on the publicly available BioScope corpus. We evaluated the performance of the CRF models in identifying hedge cue phrases and their scope by calculating recall, precision and F1-score. We compared our models with three competitive baseline systems. Results: Our best CRF-based model performed statistically better than the baseline systems, achieving an F1-score of 88% and 86% in detecting hedge cue phrases and their scope in biological literature and an F1-score of 93% and 90% in detecting hedge cue phrases and their scope in clinical notes. Conclusions: Our approach is robust, as it can identify hedge cues and their scope in both biological and clinical text. To benefit text-mining applications, our system is publicly available as a Java API and as an online application at http://hedgescope.askhermes.org. To our knowledge, this is the first publicly available system to detect hedge cues and their scope in biomedical literature.