In this paper, we propose a new approach to the incremental document clustering problem, based on a graph model and an enhanced IncrementalDBSCAN. Instead of the traditional vector-based model, a graph-based model is used for document representation. With a graph model, the graph structure can be updated easily when a new document is added to the database. IncrementalDBSCAN, in turn, is an effective incremental clustering algorithm suited to mining in dynamically changing databases. Similarity between two documents is measured by a hybrid similarity that combines their adapting feature vectors with shared-phrase information. Our experimental results demonstrate the effectiveness of the proposed method.
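The hybrid similarity mentioned above can be illustrated with a minimal Python sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: raw term-frequency counts stand in for the adapting feature vectors, word bigrams stand in for phrases, phrase overlap is scored with a Jaccard ratio, and a hypothetical weight `alpha` blends the two parts linearly.

```python
import math
from collections import Counter


def phrases(text, n=2):
    """Return the set of word n-grams, used here as a stand-in for phrases (assumption)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def hybrid_similarity(doc1, doc2, alpha=0.5):
    """Blend vector similarity and shared-phrase overlap.

    alpha is a hypothetical weight, not a parameter taken from the paper.
    """
    tf1 = Counter(doc1.lower().split())
    tf2 = Counter(doc2.lower().split())
    p1, p2 = phrases(doc1), phrases(doc2)
    union = p1 | p2
    # Jaccard overlap of shared phrases (an assumed scoring choice).
    phrase_sim = len(p1 & p2) / len(union) if union else 0.0
    return alpha * cosine(tf1, tf2) + (1 - alpha) * phrase_sim
```

For example, `hybrid_similarity("incremental document clustering", "document clustering with graphs")` rewards both the shared terms and the shared phrase "document clustering", whereas two documents with the same words in a different order would score high on the vector part only. In a density-based setting such as IncrementalDBSCAN, a distance could then be taken as one minus this similarity, which is again an assumption and not something the abstract states.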