Editorial: An integration of WordNet and fuzzy association rule mining for multi-label document clustering

Authors:
Chun-Ling Chen; Frank S. C. Tseng; Tyne Liang
Affiliations:
Department of Computer Science, National Chiao Tung University, HsinChu 300, Taiwan, ROC; Dept. of Information Management, National Kaohsiung 1st University of Science & Technology, YanChao, Kaohsiung 824, Taiwan, ROC; Department of Computer Science, National Chiao Tung University, HsinChu 300, Taiwan, ROC
Venue:
Data & Knowledge Engineering
Year:
2010

Abstract

With the rapid growth in the volume of text documents, document clustering has become one of the main techniques for organizing large numbers of documents into a small number of meaningful clusters. However, document clustering still faces several challenges, such as high dimensionality, scalability, accuracy, meaningful cluster labels, overlapping clusters, and extracting semantics from texts. To improve the quality of document clustering results, we propose an effective Fuzzy-based Multi-label Document Clustering (FMDC) approach that integrates fuzzy association rule mining with an existing ontology, WordNet, to alleviate these problems. In our approach, key terms are extracted from the document set, and the initial representation of each document is then enriched with WordNet hypernyms in order to exploit the semantic relations between terms. Next, a fuzzy association rule mining algorithm for texts is employed to discover a set of highly related fuzzy frequent itemsets, whose key terms are regarded as the labels of the candidate clusters. Finally, each document is dispatched into more than one target cluster by referring to these candidate clusters, and the highly similar target clusters are then merged. We conducted experiments on the Classic, Re0, R8, and WebKB datasets to evaluate performance. The experimental results show that our approach achieves higher accuracy than influential document clustering methods. Our approach therefore not only provides more general and meaningful labels for documents, but also effectively generates overlapping clusters.
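
The pipeline described in the abstract (hypernym enrichment, fuzzy frequent itemsets as cluster labels, overlapping assignment, merging of similar clusters) can be illustrated with a minimal Python sketch. This is not the authors' FMDC implementation. The abstract does not specify the membership functions, thresholds, or merge criterion, so everything below is an assumption made for illustration: `TOY_HYPERNYMS` is a hypothetical stand-in for WordNet lookups, `membership` is a single ramp function, fuzzy support is taken as the mean of per-document minimum memberships, and clusters are merged by Jaccard overlap.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stand-in for WordNet hypernym lookups (not real WordNet data).
TOY_HYPERNYMS = {"apple": "fruit", "banana": "fruit",
                 "car": "vehicle", "truck": "vehicle"}

def enrich(tokens):
    """Count terms, then credit each term's count to its hypernym as well."""
    counts = Counter(tokens)
    for term, n in list(counts.items()):
        parent = TOY_HYPERNYMS.get(term)
        if parent:
            counts[parent] += n
    return counts

def membership(count, saturation=3):
    """Assumed ramp membership: degree to which a term is 'frequent' in a doc."""
    return min(1.0, count / saturation)

def fuzzy_support(itemset, doc_counts):
    """Mean over documents of the minimum membership among the itemset's terms."""
    total = sum(min(membership(c[t]) for t in itemset) for c in doc_counts)
    return total / len(doc_counts)

def candidate_labels(doc_counts, min_support=0.3, max_size=2):
    """Fuzzy frequent itemsets (up to max_size terms) used as cluster labels."""
    vocab = sorted({t for c in doc_counts for t in c})
    terms = [t for t in vocab if fuzzy_support((t,), doc_counts) >= min_support]
    labels = [(t,) for t in terms]
    for size in range(2, max_size + 1):
        for combo in combinations(terms, size):
            if fuzzy_support(combo, doc_counts) >= min_support:
                labels.append(combo)
    return labels

def assign(doc_counts, labels, min_degree=0.3):
    """Place each document in every cluster whose label it matches (overlap allowed)."""
    clusters = {label: [] for label in labels}
    for i, c in enumerate(doc_counts):
        for label in labels:
            if min(membership(c[t]) for t in label) >= min_degree:
                clusters[label].append(i)
    return clusters

def merge_similar(clusters, threshold=0.8):
    """Greedily merge clusters whose document sets have high Jaccard overlap."""
    merged = []  # list of (label terms, document ids)
    for label, members in clusters.items():
        members = set(members)
        for terms, doc_ids in merged:
            union = members | doc_ids
            if union and len(members & doc_ids) / len(union) >= threshold:
                terms.update(label)
                doc_ids.update(members)
                break
        else:
            merged.append((set(label), members))
    return merged

docs = ["apple banana apple", "banana apple fruit",
        "car truck car", "truck car apple"]
doc_counts = [enrich(d.split()) for d in docs]
labels = candidate_labels(doc_counts)
clusters = assign(doc_counts, labels)
merged = merge_similar(clusters)
for terms, doc_ids in merged:
    print(sorted(terms), sorted(doc_ids))
```

In this toy run, the hypernym "fruit" becomes a cluster label even though it appears literally in only one document, and the last document ends up in both the fruit-related and the vehicle-related cluster, which is the overlapping behaviour the abstract describes.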