A coherent biomedical literature clustering and summarization approach through ontology-enriched graphical representations

  • Authors:
  • Illhoi Yoo; Xiaohua Hu; Il-Yeol Song

  • Affiliations:
  • Department of Health Management and Informatics, School of Medicine, University of Missouri-Columbia, Columbia, MO; College of Information Science and Technology, Drexel University, Philadelphia, PA; College of Information Science and Technology, Drexel University, Philadelphia, PA

  • Venue:
  • DaWaK'06: Proceedings of the 8th International Conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2006

Abstract

In this paper, we introduce a coherent biomedical literature clustering and summarization approach that employs a graphical representation method for text based on a biomedical ontology. The key to the approach is constructing document cluster models as semantic chunks that capture the core semantic relationships in the ontology-enriched, scale-free graphical representation of documents. These document cluster models are used for both document clustering and text summarization by constructing a Text Semantic Interaction Network (TSIN). Our extensive experimental results indicate that our approach achieves a 45% improvement in cluster quality and a 72% improvement in clustering reliability, in terms of misclassification index, over Bisecting K-means, a leading document clustering approach. In addition, our approach provides concise but rich text summaries consisting of key concepts and sentences. The primary contribution of this paper is a coherent biomedical literature clustering and summarization approach that takes advantage of ontology-enriched graphical representations. The approach significantly improves the quality of document clusters and the understandability of documents through summaries.
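
The abstract reports its results in terms of misclassification index but does not define it. As a minimal sketch, assuming the common definition (the fraction of documents that fall outside the majority class of their assigned cluster, so lower is better), the measure could be computed as follows. The function name `misclassification_index` and its inputs are hypothetical and not taken from the paper, and the paper's exact formulation may differ.

```python
from collections import Counter


def misclassification_index(cluster_ids, true_labels):
    """Fraction of documents outside the majority class of their cluster.

    ASSUMPTION: this is a commonly used definition of the measure; the
    abstract itself does not state the formula.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")

    # Count the true labels that occur inside each assigned cluster.
    by_cluster = {}
    for cid, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cid, Counter())[label] += 1

    # A document counts as correct if it carries its cluster's majority label.
    correct = sum(max(counts.values()) for counts in by_cluster.values())
    return 1 - correct / len(true_labels)
```

Under this definition, a clustering whose clusters each contain documents from a single class scores 0, and the score rises as clusters mix documents from different classes.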