With the rapid growth of text documents, document clustering has become one of the main techniques for organizing large numbers of documents into a small number of meaningful clusters. However, document clustering still faces several challenges, such as high dimensionality, scalability, accuracy, meaningful cluster labels, and the extraction of semantics from texts. To improve the quality of document clustering results, we propose an effective Fuzzy Frequent Itemset-based Document Clustering (F2IDC) approach that combines fuzzy association rule mining with the background knowledge embedded in WordNet. A term hierarchy generated from WordNet is used to discover fuzzy frequent itemsets, which serve as candidate cluster labels for grouping documents. We have conducted experiments to evaluate our approach on the Reuters-21578 dataset. The experimental results show that our proposed method achieves higher accuracy than FIHC, HFTC, and UPGMA.
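The general idea of using fuzzy frequent itemsets as candidate cluster labels can be illustrated with a minimal sketch. It is not the F2IDC algorithm itself, whose details the abstract does not give. Everything here is an assumption made for illustration: the hand-written HYPERNYMS map stands in for a WordNet lookup, the membership function (count divided by a saturation value, capped at 1), the use of the minimum to combine memberships, the averaged support, the threshold of 0.4, and the toy documents are all hypothetical.

```python
from itertools import combinations

# Hypothetical hand-written hypernym map standing in for a WordNet lookup.
HYPERNYMS = {"wheat": "grain", "corn": "grain", "dollar": "currency", "yen": "currency"}


def fuzzify(term_counts, saturation=3):
    """Add hypernym terms, then map raw counts to membership degrees in [0, 1].

    The membership function min(1, count / saturation) is an assumption.
    """
    enriched = dict(term_counts)
    for term, count in term_counts.items():
        parent = HYPERNYMS.get(term)
        if parent is not None:
            enriched[parent] = enriched.get(parent, 0) + count
    return {term: min(1.0, count / saturation) for term, count in enriched.items()}


def fuzzy_support(itemset, fuzzy_docs):
    """Average, over all documents, of the smallest membership among the itemset's terms."""
    total = sum(min(doc.get(term, 0.0) for term in itemset) for doc in fuzzy_docs)
    return total / len(fuzzy_docs)


def frequent_itemsets(fuzzy_docs, min_support, max_size=2):
    """Enumerate itemsets up to max_size and keep those meeting min_support."""
    vocab = sorted({term for doc in fuzzy_docs for term in doc})
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(vocab, size):
            support = fuzzy_support(itemset, fuzzy_docs)
            if support >= min_support:
                result[itemset] = support
    return result


def assign(fuzzy_docs, candidates):
    """Give each document the candidate label it matches most strongly.

    Ties are broken in favour of the longer (more specific) itemset.
    """
    def score(doc, itemset):
        return (min(doc.get(term, 0.0) for term in itemset), len(itemset))

    return [max(candidates, key=lambda itemset: score(doc, itemset)) for doc in fuzzy_docs]


# Toy term-count vectors (hypothetical, not taken from Reuters-21578).
docs = [
    {"wheat": 3, "export": 2},
    {"corn": 3, "export": 3},
    {"dollar": 3, "bank": 1},
    {"yen": 2, "bank": 3},
]
fuzzy_docs = [fuzzify(doc) for doc in docs]
candidates = frequent_itemsets(fuzzy_docs, min_support=0.4)
labels = assign(fuzzy_docs, candidates)

for doc, label in zip(docs, labels):
    print(sorted(doc), "->", label)
```

In this toy setting neither "wheat" nor "corn" is frequent on its own, but their shared hypernym "grain" is, which is the kind of effect a term hierarchy is meant to provide when generating cluster labels.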