Utilization of global ranking information in graph-based biomedical literature clustering

  • Authors:
  • Xiaodan Zhang;Xiaohua Hu;Jiali Xia;Xiaohua Zhou;Palakorn Achananuparp

  • Affiliations:
  • College of Information Science and Technology, Drexel University 3141 Chestnut street, Philadelphia, PA;College of Information Science and Technology, Drexel University 3141 Chestnut street, Philadelphia, PA and UFSoft School of Software, Jiangxi University of Finance and Economics Nanchang, Jiangxi ...;UFSoft School of Software, Jiangxi University of Finance and Economics Nanchang, Jiangxi, China;College of Information Science and Technology, Drexel University 3141 Chestnut street, Philadelphia, PA;College of Information Science and Technology, Drexel University 3141 Chestnut street, Philadelphia, PA

  • Venue:
  • DaWaK'07 Proceedings of the 9th international conference on Data Warehousing and Knowledge Discovery
  • Year:
  • 2007

Abstract

In this paper, we explore how a global ranking method, in conjunction with a local density method, helps identify meaningful term clusters from an ontology-enriched graph representation of a biomedical literature corpus. A major problem in document clustering is how to discount the effects of class-unspecific general terms and strengthen the effects of class-specific core terms. We claim that a well-constructed term graph can help improve the global ranking of class-specific core terms. We first apply PageRank and HITS to a directed abstract-title term graph to target class-specific core terms. Then k dense term clusters (graphs) are identified from these terms. Finally, each document is assigned to its closest core term graph. A series of experiments is conducted on a document corpus collected from PubMed. Experimental results show that our approach is very effective at identifying class-specific core terms and thus helps document clustering.
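
The ranking and assignment steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: edges run from each abstract term to each title term of the same document, PageRank is computed by plain power iteration with a damping factor of 0.85, and a document is assigned to the core term group with which it shares the most terms. HITS, ontology enrichment, and the detection of dense term clusters are omitted, so the core term groups are supplied by hand. The function names (`build_term_graph`, `pagerank`, `assign`) and the toy documents are hypothetical.

```python
from collections import defaultdict


def build_term_graph(docs):
    """Build a directed term graph (assumed direction: abstract term -> title term)."""
    out_links = defaultdict(set)
    nodes = set()
    for title_terms, abstract_terms in docs:
        nodes.update(title_terms)
        nodes.update(abstract_terms)
        for a in abstract_terms:
            for t in title_terms:
                if a != t:  # skip self-loops
                    out_links[a].add(t)
    return nodes, out_links


def pagerank(nodes, out_links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over the term graph."""
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            targets = out_links.get(v)
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


def assign(doc_terms, core_groups):
    """Return the index of the core term group sharing the most terms (assumed closeness)."""
    overlaps = [len(set(doc_terms) & group) for group in core_groups]
    return overlaps.index(max(overlaps))


# Toy corpus (hypothetical): each document is (title terms, abstract terms).
docs = [
    (["diabetes", "insulin"], ["diabetes", "insulin", "patient", "study"]),
    (["asthma", "inhaler"], ["asthma", "inhaler", "patient", "study"]),
]
nodes, out_links = build_term_graph(docs)
rank = pagerank(nodes, out_links)

# Hand-specified core term groups stand in for the dense term clusters.
core_groups = [{"diabetes", "insulin"}, {"asthma", "inhaler"}]
cluster = assign(["insulin", "patient", "diabetes"], core_groups)
```

In this toy graph the general terms "patient" and "study" receive no in-links, so they rank below the title terms, which is the kind of separation between general and class-specific terms that the abstract is after.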