Exploring hedge identification in biomedical literature

Authors:
Ben Medlock
Affiliations:
University of Cambridge, Computer Laboratory, William Gates Building, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
Venue:
Journal of Biomedical Informatics
Year:
2008

Abstract

We investigate automatic identification of speculative language, or 'hedging', in scientific literature from the biomedical domain. Our contributions include a precise description of the task including annotation guidelines, theoretical analysis and discussion. We show that good agreement can be achieved using our guidelines and present a publicly available benchmark dataset for the task. We argue for separation of the acquisition and classification phases in semi-supervised machine learning, and present a probabilistic acquisition model which is evaluated both theoretically and experimentally. We explore the impact of different sample representations on classification accuracy across the learning curve and demonstrate the effectiveness of using machine learning for the hedge identification task. Finally, we examine the errors made by our approach and point toward avenues for future research.