An Integration of Fuzzy Association Rules and WordNet for Document Clustering

Authors:
Chun-Ling Chen; Frank S. Tseng; Tyne Liang
Affiliations:
Dept. of Computer Science, National Chiao Tung University, Taiwan, ROC; Dept. of Information Management, National Kaohsiung 1st Univ. of Sci. & Tech., Taiwan, ROC; Dept. of Computer Science, National Chiao Tung University, Taiwan, ROC
Venue:
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Year:
2009

Abstract

With the rapid growth in the number of text documents, document clustering has become one of the main techniques for organizing large numbers of documents into a small number of meaningful clusters. However, document clustering still faces several challenges, such as high dimensionality, scalability, accuracy, meaningful cluster labels, and the extraction of semantics from text. To improve the quality of document clustering results, we propose an effective Fuzzy Frequent Itemset-based Document Clustering (F2IDC) approach that combines fuzzy association rule mining with the background knowledge embedded in WordNet. A term hierarchy generated from WordNet is used to discover fuzzy frequent itemsets, which serve as candidate cluster labels for grouping documents. We have conducted experiments to evaluate our approach on the Reuters-21578 dataset. The experimental results show that our proposed method achieves higher accuracy than FIHC, HFTC, and UPGMA.
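
The abstract does not give algorithmic details, so the following is only a minimal Python sketch of the general idea of mining fuzzy frequent itemsets over hypernym-enriched term counts. It is not the authors' F2IDC implementation. Everything named here is an assumption made for illustration: the `HYPERNYMS` dictionary is a hypothetical stand-in for a WordNet hypernym lookup, the Low/Mid/High membership shapes are arbitrary, each term is represented by the single region with the largest total membership, and the fuzzy support of an itemset is taken as the mean over documents of the minimum membership of its items.

```python
from itertools import combinations

# Hypothetical stand-in for a WordNet hypernym lookup (illustration only).
HYPERNYMS = {"wheat": "grain", "corn": "grain", "gold": "metal", "silver": "metal"}
REGIONS = ("Low", "Mid", "High")


def enrich(term_counts):
    """Add each term's count to its hypernym so related terms share a key."""
    enriched = dict(term_counts)
    for term, count in term_counts.items():
        parent = HYPERNYMS.get(term)
        if parent is not None:
            enriched[parent] = enriched.get(parent, 0) + count
    return enriched


def memberships(count):
    """Membership of a term count in the assumed Low/Mid/High fuzzy regions."""
    if count <= 0:
        # An absent term belongs to no region.
        return {region: 0.0 for region in REGIONS}
    low = max(0.0, min(1.0, (3 - count) / 2))    # 1 at count 1, 0 from count 3
    mid = max(0.0, 1 - abs(count - 3) / 2)       # peaks at count 3
    high = max(0.0, min(1.0, (count - 3) / 2))   # 0 up to count 3, 1 from count 5
    return {"Low": low, "Mid": mid, "High": high}


def mine_fuzzy_itemsets(docs, min_support, max_size=2):
    """Return {itemset: fuzzy support} for itemsets meeting min_support.

    An itemset is a tuple of (term, region) pairs. Larger itemsets are built
    by brute force from the frequent single items, without further pruning.
    """
    enriched = [enrich(doc) for doc in docs]
    vocab = sorted({term for doc in enriched for term in doc})
    n_docs = len(enriched)

    # For every term, keep only the region with the largest total membership.
    items = []
    for term in vocab:
        totals = {region: 0.0 for region in REGIONS}
        for doc in enriched:
            for region, mu in memberships(doc.get(term, 0)).items():
                totals[region] += mu
        items.append((term, max(REGIONS, key=totals.get)))

    def support(itemset):
        # Mean over documents of the weakest membership within the itemset.
        total = 0.0
        for doc in enriched:
            total += min(memberships(doc.get(t, 0))[r] for t, r in itemset)
        return total / n_docs

    frequent = {}
    for item in items:
        s = support((item,))
        if s >= min_support:
            frequent[(item,)] = s

    singles = [itemset[0] for itemset in frequent]
    for size in range(2, max_size + 1):
        for candidate in combinations(singles, size):
            s = support(candidate)
            if s >= min_support:
                frequent[candidate] = s
    return frequent
```

Calling `mine_fuzzy_itemsets` on a list of per-document term-count dictionaries returns candidate itemsets that could serve as cluster labels. In this sketch, a hypernym such as `grain` can become frequent even when `wheat` and `corn` individually are not, which illustrates why a term hierarchy may help surface shared topics.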