Artificial Intelligence in Medicine
Motivation: Progress notes are narrative summaries of a patient's status during the course of treatment or care. Under time and efficiency pressures, clinicians continue to prefer unstructured text over form-based data entry when composing progress notes. The ability to extract meaningful data from the unstructured text in these notes is invaluable for retrospective analysis and decision support. Automatic extraction, however, has been largely hindered by the complexity of handling abbreviations, misspellings, punctuation errors and other types of noise.

Objective: We present a robust system for cleaning noisy progress notes in real time, with a focus on abbreviations and misspellings.

Methods: The system uses statistical semantic analysis based on Web data, together with the occasional participation of clinicians, to automatically replace abbreviations with their intended senses and misspellings with the correct words.

Results: An accuracy as high as 88.73% was achieved using statistical semantic analysis of Web data alone. With the caching mechanism enabled, the response time of the system is 1.5-2 s per word, which is about the same as the average typing speed of clinicians.

Conclusions: The overall accuracy and response time of the system will improve with time, especially when the confidence mechanism is activated through clinicians' interactions with the system. This system will be implemented in a clinical information system to drive interactive decision support and analysis functions, leading to improved patient care and outcomes.
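
The abstract does not specify how the statistical semantic analysis is computed, so the following is only a minimal sketch of one plausible reading of the Methods: each candidate sense of an abbreviation is scored by how often it co-occurs with the surrounding context words, and the counts are cached. Everything here is an assumption made for illustration, not the authors' implementation: the toy corpus stands in for Web data, and the names `TOY_CORPUS`, `CANDIDATES`, `hits`, `score` and `expand` are hypothetical.

```python
from functools import lru_cache

# Assumption: a tiny in-memory corpus stands in for Web data.
# A real system would obtain counts from a search engine or Web-scale corpus.
TOY_CORPUS = [
    "patient blood pressure elevated hypertension",
    "blood pressure stable after medication",
    "bipolar disorder mood lithium",
    "bipolar disorder patient mood swings",
]

# Assumption: a hypothetical lookup table of candidate senses per abbreviation.
CANDIDATES = {"bp": ["blood pressure", "bipolar disorder"]}


@lru_cache(maxsize=None)
def hits(*terms):
    """Count toy documents containing every term (stand-in for a Web hit count).

    The cache avoids recomputing counts for repeated queries, loosely
    mirroring the caching mechanism mentioned in the abstract.
    """
    return sum(all(t in doc for t in terms) for doc in TOY_CORPUS)


def score(sense, context):
    """Sum, over context words, the fraction of the sense's documents that also contain the word."""
    base = hits(sense)
    if base == 0:
        return 0.0
    return sum(hits(sense, word) / base for word in context)


def expand(token, context):
    """Replace an abbreviation with its best-scoring sense; leave unknown tokens unchanged."""
    senses = CANDIDATES.get(token.lower())
    if not senses:
        return token
    return max(senses, key=lambda s: score(s, context))


if __name__ == "__main__":
    print(expand("bp", ["hypertension", "medication"]))
    print(expand("bp", ["mood", "lithium"]))
```

In this sketch the same abbreviation resolves differently depending on context: with "hypertension" and "medication" nearby, "blood pressure" scores higher, while "mood" and "lithium" favour "bipolar disorder". Misspelling correction and the clinician-driven confidence mechanism are not modelled here.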