Converting Semi-structured Clinical Medical Records into Information and Knowledge

Authors:
Xiaohua Zhou;Hyoil Han;Isaac Chankai;Ann A. Prestrud;Ari D. Brooks
Affiliations:
College of Information Science and Technology, Drexel University;College of Information Science and Technology, Drexel University;College of Medicine, Drexel University;College of Medicine, Drexel University;College of Medicine, Drexel University
Venue:
ICDEW '05 Proceedings of the 21st International Conference on Data Engineering Workshops
Year:
2005

Citing 0
Cited 6

Approaches to text mining for clinical medical records
Proceedings of the 2006 ACM symposium on Applied computing

MaxMatcher: biological concept extraction using approximate dictionary lookup
PRICAI'06 Proceedings of the 9th Pacific Rim international conference on Artificial intelligence

Adapting recommender systems to the requirements of personal health record systems
Proceedings of the 1st ACM International Health Informatics Symposium

Relation-Based document retrieval for biomedical literature databases
DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications

Relation-Based document retrieval for biomedical IR
Transactions on Computational Systems Biology V

Using concept-based indexing to improve language modeling approach to genomic IR
ECIR'06 Proceedings of the 28th European conference on Advances in Information Retrieval

Abstract

Clinical medical records contain a wealth of information, largely in free-text form. Extracting structured information from free-text records has therefore become an important research endeavor. In this paper, we propose and implement an information extraction system that extracts three types of information (numeric values, medical terms, and categorical values) from semi-structured patient records. We propose a separate approach for each of the three types of values, and each achieves very good performance in terms of precision and recall. A novel link-grammar-based approach associates each feature with its number in a sentence and achieves extremely high accuracy. A simple but efficient approach, using POS-based patterns and a domain ontology, extracts medical terms of interest. Finally, an NLP-based feature extraction method coupled with an ID3-based decision tree classifies and extracts categorical values. This preliminary approach to categorical fields has, so far, proven quite effective.
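The abstract does not describe how the ID3-based decision tree is built, so the following is only a minimal sketch of the attribute-selection step at the core of the standard ID3 algorithm: choose the feature whose split yields the largest information gain. The feature names (`keyword_present`, `negated`), the labels, and the toy rows are hypothetical and invented for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy reduction obtained by splitting the rows on one categorical feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_split(rows, labels, features):
    """ID3 selection step: the feature with the highest information gain."""
    return max(features, key=lambda f: information_gain(rows, labels, f))


# Hypothetical toy data: each row holds features extracted from one sentence,
# and each label is the value of an assumed categorical field.
rows = [
    {"keyword_present": "yes", "negated": "no"},
    {"keyword_present": "yes", "negated": "yes"},
    {"keyword_present": "no", "negated": "no"},
    {"keyword_present": "yes", "negated": "no"},
    {"keyword_present": "no", "negated": "no"},
    {"keyword_present": "no", "negated": "no"},
]
labels = ["positive", "negative", "negative", "positive", "negative", "negative"]

root_feature = best_split(rows, labels, ["keyword_present", "negated"])
```

On this toy data the selection step picks `keyword_present` as the root split. A full ID3 tree would apply the same step recursively to each resulting subset until the labels in a subset are pure or no features remain.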