Information Retrieval Systems: Theory and Implementation
Information Retrieval
Self-Organizing Maps
Using a Hash-Based Method with Transaction Trimming for Mining Association Rules
IEEE Transactions on Knowledge and Data Engineering
Self organization of a massive document collection
IEEE Transactions on Neural Networks
With the rapid development of global networking, more and more information is accessible online, which makes document clustering techniques increasingly indispensable. Clustering lets users browse large collections of information efficiently. In this paper, we focus on a Chinese document clustering process that combines a data mining technique with a neural network model. The process has two main phases: a preprocessing phase and a clustering phase. In the preprocessing phase, we propose an alternative Chinese sentence segmentation method based on a hash-based data mining technique. In the clustering phase, we adopt a dynamic SOM model so that data can be clustered dynamically. Furthermore, we cluster term vectors instead of document vectors. Our experiments demonstrate that term clustering yields a better precision rate and becomes more efficient as the number of documents gradually grows.
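The abstract does not spell out the hash-based segmentation step, so the following is only a minimal sketch of the general idea, assuming that candidate terms are character bigrams and that a hash table of bucket counts is used to prune infrequent candidates before exact counting. The function name, the toy hash, and the parameters min_support and num_buckets are hypothetical, and transaction trimming is omitted.

```python
from collections import Counter


def frequent_bigrams(sentences, min_support=2, num_buckets=1024):
    """Find frequent character bigrams with hash-based candidate pruning.

    Illustrative assumption only: this is not the paper's actual procedure.
    """
    def bucket(bigram):
        # Simple deterministic hash of the two characters (hypothetical choice).
        return (ord(bigram[0]) * 31 + ord(bigram[1])) % num_buckets

    def bigrams(sentence):
        return (sentence[i:i + 2] for i in range(len(sentence) - 1))

    # Pass 1: count bucket hits only. A bucket count is an upper bound on
    # the count of every bigram that hashes into that bucket.
    buckets = [0] * num_buckets
    for sentence in sentences:
        for bg in bigrams(sentence):
            buckets[bucket(bg)] += 1

    # Pass 2: count exactly, but only bigrams whose bucket passed the
    # threshold, so most infrequent candidates are never stored.
    counts = Counter()
    for sentence in sentences:
        for bg in bigrams(sentence):
            if buckets[bucket(bg)] >= min_support:
                counts[bg] += 1

    # Hash collisions can let infrequent bigrams through pass 2,
    # so filter once more on the exact counts.
    return {bg: c for bg, c in counts.items() if c >= min_support}
```

Because the bucket count never underestimates a bigram's true frequency, the pruning in the second pass cannot discard a bigram that meets the threshold; it only reduces how many candidates need exact counters.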