AIM: Given a set of pre-defined medical categories used in Evidence Based Medicine, we aim to automatically annotate sentences in medical abstracts with these labels.

METHOD: We construct a corpus of 1,000 medical abstracts annotated by hand with medical categories (e.g. "Intervention", "Outcome"). We explore the use of various features based on lexical, semantic, structural, and sequential information in the data, using Conditional Random Fields (CRF) for classification.

RESULT: For the classification tasks over all labels, our systems achieved micro-averaged F-scores of 80.9% and 66.9% on the structured and unstructured datasets respectively, using sequential features. In labeling only key sentences, our systems produced F-scores of 89.3% and 74.0% on the structured and unstructured datasets respectively, using the same sequential features. The results over an external dataset were lower (F-scores of 63.1% for all labels, and 83.8% for key sentences).

CONCLUSION: Of the features we used, the best for classifying any given sentence in an abstract are based on unigrams, section headings, and sequential information from preceding sentences. These features resulted in improved performance over a simple bag-of-words approach, and outperformed feature sets used in previous work.
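To make the three feature families named in the conclusion concrete, the following is a minimal sketch of how a per-sentence feature dictionary might be built from unigrams, the section heading, and information from the preceding sentence. It is an illustration under stated assumptions, not the authors' implementation: the function names, the feature-name prefixes (`w=`, `heading=`, `prev_w=`), the whitespace tokenizer, and the relative-position feature are all hypothetical choices made here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical input type: (section heading or None, sentence text).
Sentence = Tuple[Optional[str], str]


def tokenize(text: str) -> List[str]:
    """Lowercase, split on whitespace, strip edge punctuation (assumed tokenizer)."""
    tokens = (tok.strip(".,;:()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def sentence_features(abstract: List[Sentence], i: int) -> Dict[str, float]:
    """Build a feature dictionary for the i-th sentence of one abstract."""
    heading, text = abstract[i]
    feats: Dict[str, float] = {"bias": 1.0}

    # Lexical: unigrams of the current sentence.
    for tok in tokenize(text):
        feats[f"w={tok}"] = 1.0

    # Structural: section heading, with a placeholder for unstructured abstracts.
    feats[f"heading={(heading or 'NONE').upper()}"] = 1.0

    # Sequential: relative position and unigrams of the preceding sentence.
    feats["position"] = i / max(len(abstract) - 1, 1)
    if i == 0:
        feats["BOS"] = 1.0  # marks the first sentence of the abstract
    else:
        for tok in tokenize(abstract[i - 1][1]):
            feats[f"prev_w={tok}"] = 1.0

    return feats


def abstract_features(abstract: List[Sentence]) -> List[Dict[str, float]]:
    """One feature dictionary per sentence, in order, for sequence labeling."""
    return [sentence_features(abstract, i) for i in range(len(abstract))]
```

A sequence of such dictionaries, one per sentence, is the kind of input a CRF toolkit typically consumes alongside the per-sentence labels; which toolkit and which exact feature templates were used is not stated in the abstract.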