Short Text Clustering for Search Results

Authors:
Xingliang Ni;Zhi Lu;Xiaojun Quan;Wenyin Liu;Bei Hua
Affiliations:
Dept. of Computer Sci. and Tech., University of Sci. and Tech. of China, Hefei, China and Department of Computer Science, City University of Hong Kong, HKSAR, China and Joint Research Lab of Excel ...;Department of Computer Science, City University of Hong Kong, HKSAR, China;Department of Computer Science, City University of Hong Kong, HKSAR, China;Department of Computer Science, City University of Hong Kong, HKSAR, China and Joint Research Lab of Excellence, CityU-USTC Advanced Research Institute, Suzhou, China;Dept. of Computer Sci. and Tech., University of Sci. and Tech. of China, Hefei, China and Joint Research Lab of Excellence, CityU-USTC Advanced Research Institute, Suzhou, China
Venue:
APWeb/WAIM '09 Proceedings of the Joint International Conferences on Advances in Data and Web Management
Year:
2009

Abstract

An approach to clustering short text snippets is proposed, which can be used to cluster search results into a few relevant groups so that users can quickly locate the groups of results that interest them. Specifically, the collection of search result snippets is implicitly regarded as a similarity graph, in which each snippet is a vertex and each edge between two vertices is weighted by the similarity between the corresponding snippets. TermCut, the proposed clustering algorithm, is then applied to recursively bisect the similarity graph by selecting the current core term such that one cluster contains the term and the other does not. Experimental results show that the proposed algorithm outperforms the KMeans algorithm by about 0.3 on the FScore criterion.
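
The recursive bisection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the criterion used to pick the core term, or the stopping rule, so the sketch assumes Jaccard similarity over term sets, a min-max-cut-style ratio as the selection score, and two hypothetical parameters (`min_size`, `max_score`) to stop the recursion. All function names are illustrative.

```python
from itertools import combinations


def jaccard(a, b):
    """Similarity of two snippets given as sets of terms (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cut_score(with_term, without_term):
    """Cross-cluster similarity relative to within-cluster similarity.

    Lower is better. This min-max-cut-style objective is an assumption
    made for illustration, not the criterion defined in the paper.
    """
    cross = sum(jaccard(a, b) for a in with_term for b in without_term)

    def within(cluster):
        # Count a self-similarity of 1.0 per snippet so that a
        # single-snippet cluster never causes a division by zero.
        pairs = sum(jaccard(a, b) for a, b in combinations(cluster, 2))
        return len(cluster) + 2 * pairs

    return cross / within(with_term) + cross / within(without_term)


def best_core_term(snippets):
    """Return (term, score) for the best splitting term, or None."""
    best = None
    for term in sorted(set().union(*snippets)):
        with_term = [s for s in snippets if term in s]
        without_term = [s for s in snippets if term not in s]
        if not with_term or not without_term:
            continue  # the term does not split this cluster
        score = cut_score(with_term, without_term)
        if best is None or score < best[1]:
            best = (term, score)
    return best


def term_bisect(snippets, min_size=2, max_score=1.0):
    """Recursively split snippets on a core term; returns a list of clusters.

    min_size and max_score are hypothetical stopping parameters.
    """
    if len(snippets) <= min_size:
        return [snippets]
    best = best_core_term(snippets)
    if best is None or best[1] > max_score:
        return [snippets]
    term = best[0]
    with_term = [s for s in snippets if term in s]
    without_term = [s for s in snippets if term not in s]
    return (term_bisect(with_term, min_size, max_score)
            + term_bisect(without_term, min_size, max_score))


if __name__ == "__main__":
    results = [
        {"jaguar", "car", "price"},
        {"jaguar", "car", "dealer"},
        {"jaguar", "cat", "habitat"},
        {"jaguar", "cat", "jungle"},
    ]
    for cluster in term_bisect(results):
        print(cluster)
```

In this toy input, a term shared by every snippet (such as the query word itself) can never be chosen, because it leaves one side of the split empty. The similarity graph stays implicit here: edge weights are computed on demand from the term sets and are never stored as a matrix.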