The growing use of very short documents, such as blog posts, text messages, and news items, has increased interest in automatic processing techniques that can handle documents with these characteristics. In this context, short-text clustering is an important research field in which new clustering algorithms have recently been proposed to address this difficult problem. This work proposes ITSA*, an iterative method based on the bio-inspired method PAntSA*, for this task. ITSA* takes as input the results obtained by arbitrary clustering algorithms and refines them by iteratively applying the PAntSA* algorithm. The proposal improves the results obtained with different algorithms on several short-text collections. Moreover, ITSA* is not limited to serving as an improvement method: starting from random initial clusterings, it outperforms well-known clustering algorithms in most of the experimental instances.
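The refinement loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify PAntSA* or a stopping rule, so the sketch assumes a simple greedy reassignment step (`greedy_reassign`, a hypothetical stand-in for PAntSA*) and assumes that iteration stops when the global silhouette coefficient no longer improves. All function names are illustrative.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Labels = List[int]
Dist = Sequence[Sequence[float]]


def group(labels: Labels) -> Dict[int, List[int]]:
    """Map each cluster label to the indices of its members."""
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    return clusters


def mean_dist(i: int, members: List[int], dist: Dist) -> float:
    """Mean distance from point i to the members, excluding i itself."""
    others = [j for j in members if j != i]
    return sum(dist[i][j] for j in others) / len(others) if others else 0.0


def silhouette(labels: Labels, dist: Dist) -> float:
    """Global silhouette coefficient: the mean of s(i) over all points."""
    clusters = group(labels)
    if len(clusters) < 2:
        return 0.0  # convention chosen here; undefined for a single cluster
    total = 0.0
    for i, lab in enumerate(labels):
        if len(clusters[lab]) == 1:
            continue  # s(i) = 0 for singleton clusters
        a = mean_dist(i, clusters[lab], dist)
        b = min(mean_dist(i, m, dist) for l, m in clusters.items() if l != lab)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(labels)


def greedy_reassign(labels: Labels, dist: Dist) -> Labels:
    """Assumed placeholder step (NOT PAntSA*): move each point to the
    cluster whose other members are closest to it on average."""
    clusters = group(labels)
    new_labels = []
    for i, lab in enumerate(labels):
        # Only consider clusters that contain at least one point other than i.
        options = {l: m for l, m in clusters.items() if any(j != i for j in m)}
        if options:
            lab = min(options, key=lambda l: mean_dist(i, options[l], dist))
        new_labels.append(lab)
    return new_labels


def iterative_refine(
    labels: Labels,
    dist: Dist,
    refine_step: Callable[[Labels, Dist], Labels] = greedy_reassign,
    max_iter: int = 50,
) -> Tuple[Labels, float]:
    """Apply refine_step repeatedly while the silhouette improves (assumed rule)."""
    best, best_score = list(labels), silhouette(labels, dist)
    for _ in range(max_iter):
        candidate = refine_step(best, dist)
        score = silhouette(candidate, dist)
        if score <= best_score:
            break  # no improvement: keep the previous clustering
        best, best_score = candidate, score
    return best, best_score


# Toy data: two well-separated groups on a line, with a poor initial clustering.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
dist = [[abs(p - q) for q in points] for p in points]
start = [0, 1, 0, 1, 0, 1]
refined, score = iterative_refine(start, dist)
print(refined, round(score, 3))
```

Because `iterative_refine` accepts any initial labeling and any `refine_step`, the same loop applies whether the input comes from another clustering algorithm or from a random assignment, which mirrors the two usage modes the abstract reports.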