A shared task involving multi-label classification of clinical free text

Authors:
John P. Pestian;Christopher Brew;Paweł Matykiewicz;D. J. Hovermale;Neil Johnson;K. Bretonnel Cohen;Włodzisław Duch
Affiliations:
University of Cincinnati;Ohio State University;University of Cincinnati and Nicolaus Copernicus University, Toruń, Poland;Ohio State University;University of Cincinnati;University of Colorado;Nicolaus Copernicus University, Toruń, Poland
Venue:
BioNLP '07 Proceedings of the Workshop on BioNLP 2007: Biological, Translational, and Clinical Language Processing
Year:
2007

Abstract

This paper reports on a shared task involving the assignment of ICD-9-CM codes to radiology reports. Two features distinguished this task from previous shared tasks in the biomedical domain. One is that it resulted in the first freely distributable corpus of fully anonymized clinical text. This resource is permanently available and will (we hope) facilitate future research. The other key feature of the task is that it required categorization with respect to a large and commercially significant set of labels. The number of participants was larger than in any previous biomedical challenge task. We describe the data production process and the evaluation measures, and give a preliminary analysis of the results. Many systems performed at levels approaching the inter-coder agreement, suggesting that human-like performance on this task is within the reach of currently available technologies.
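
The abstract does not say which evaluation measures were used. As a minimal sketch, assuming micro-averaged F1 (a common choice for multi-label tasks, not confirmed by this text), the following shows how predicted code sets could be scored against gold-standard code sets. The function name `micro_f1` and the example code assignments are illustrative only and are not taken from the shared-task data.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over per-document label sets.

    gold and predicted are equal-length lists of sets; each set holds the
    codes assigned to one document. Counts are pooled over all documents
    before computing F1, so frequent codes weigh more than rare ones.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, predicted):
        tp += len(g & p)   # codes assigned by both
        fp += len(p - g)   # codes predicted but not in the gold set
        fn += len(g - p)   # gold codes the system missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical code assignments for three reports (illustration only).
gold = [{"786.2"}, {"599.0", "593.70"}, {"486"}]
predicted = [{"786.2"}, {"599.0"}, {"486", "786.2"}]

score = micro_f1(gold, predicted)
print(f"micro-F1 = {score:.2f}")
```

Applying the same function to two human coders' assignments, treating one as gold, would give an agreement figure on the same scale, which is one way a comparison between system performance and inter-coder agreement could be made.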