Incremental Document Clustering Based on Graph Model

  • Authors:
  • Tu-Anh Nguyen-Hoang;Kiem Hoang;Danh Bui-Thi;Anh-Thy Nguyen

  • Affiliations:
  • Faculty of Information Technology, University of Science, VNU-HCM, Vietnam 70000;Faculty of Computer Science, University of Information Technology, VNU-HCM, Vietnam 70000;Faculty of Information Technology, University of Science, VNU-HCM, Vietnam 70000;Faculty of Information Technology, University of Science, VNU-HCM, Vietnam 70000

  • Venue:
  • ADMA '09 Proceedings of the 5th International Conference on Advanced Data Mining and Applications
  • Year:
  • 2009

Abstract

In this paper, we propose a new approach, based on a graph model and an enhanced IncrementalDBSCAN, to solve the incremental document clustering problem. Instead of the traditional vector-based model, a graph-based model is used for document representation. With a graph model, the graph structure can be updated easily when a new document is added to the database. IncrementalDBSCAN, in turn, is an effective incremental clustering algorithm suited to mining dynamically changing databases. Similarity between two documents is measured by a hybrid similarity that combines their adapting feature vectors with shared-phrase information. Our experimental results demonstrate the effectiveness of the proposed method.
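
As an illustration only, the hybrid similarity mentioned in the abstract could be sketched as a weighted combination of a feature-vector score and a shared-phrase score. The abstract does not give the formula, so everything specific here is an assumption: the weight `alpha`, the use of plain term-frequency cosine similarity, the use of word bigrams as "phrases", the Jaccard overlap, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between the term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(v * v for v in tf_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def phrases(tokens, n=2):
    """Set of contiguous n-word phrases (assumed stand-in for phrases shared in a document graph)."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_phrase_similarity(tokens_a, tokens_b, n=2):
    """Jaccard overlap of the two documents' phrase sets (an assumed measure)."""
    pa, pb = phrases(tokens_a, n), phrases(tokens_b, n)
    if not pa or not pb:
        return 0.0
    return len(pa & pb) / len(pa | pb)


def hybrid_similarity(tokens_a, tokens_b, alpha=0.5):
    """Weighted blend of phrase overlap and vector similarity; alpha is a hypothetical weight."""
    phrase_score = shared_phrase_similarity(tokens_a, tokens_b)
    vector_score = cosine_similarity(tokens_a, tokens_b)
    return alpha * phrase_score + (1 - alpha) * vector_score
```

In this sketch, two documents containing the same words in a different order score high on the vector term but low on the phrase term, which is the kind of distinction shared-phrase information is meant to capture.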