High dimensionality is a major challenge in document clustering. Some recent algorithms address this problem by using frequent itemsets for clustering, but most of them neglect the semantic relationships between words. On the other hand, there are algorithms that capture these semantic relationships by drawing on external knowledge sources such as WordNet, MeSH, and Wikipedia, but they do not handle the high dimensionality. In this paper we present an efficient solution that addresses both problems. We propose a hierarchical clustering algorithm based on closed frequent itemsets that uses Wikipedia as an external knowledge source to enhance the document representation. We evaluate our methods using F-score on standard datasets and show that our results are better than those of existing approaches.
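To make the notion of a closed frequent itemset concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It treats each document as a set of terms, finds every term set that occurs in at least `min_support` documents, and keeps only the closed ones, that is, those with no proper superset of equal support. The function name `closed_frequent_itemsets`, the toy documents, and the brute-force enumeration are all assumptions made for illustration; the Wikipedia-based enhancement and the hierarchical clustering step are not shown.

```python
from itertools import combinations


def closed_frequent_itemsets(docs, min_support):
    """Return {term set: support} for the closed frequent term sets.

    docs: list of sets of terms (one set per document).
    min_support: minimum number of documents a term set must occur in.
    Hypothetical helper for illustration; brute force, toy data only.
    """
    vocab = sorted(set().union(*docs))
    frequent = {}
    for size in range(1, len(vocab) + 1):
        found = False
        for combo in combinations(vocab, size):
            items = frozenset(combo)
            # Support = number of documents containing every term in the set.
            support = sum(1 for d in docs if items <= d)
            if support >= min_support:
                frequent[items] = support
                found = True
        if not found:
            # No frequent set of this size, so no larger set can be frequent.
            break
    # A frequent set is closed if no proper superset has the same support.
    return {
        items: sup
        for items, sup in frequent.items()
        if not any(items < other and sup == frequent[other] for other in frequent)
    }


# Toy corpus (made up for this example).
docs = [
    {"apple", "fruit", "tree"},
    {"apple", "fruit"},
    {"apple", "computer"},
    {"computer", "software"},
]
closed = closed_frequent_itemsets(docs, min_support=2)
for items, sup in sorted(closed.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(items), sup)
```

In this toy corpus `{"fruit"}` is frequent but not closed, because `{"apple", "fruit"}` occurs in exactly the same two documents. Keeping only closed sets is what shrinks the candidate cluster descriptions relative to using all frequent itemsets.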