Results clustering in Web searching is useful for providing users with overviews of the results, thus allowing them to restrict their focus to the desired parts. However, the task of deriving single-word or multiple-word names for the clusters (usually referred to as cluster labeling) is difficult, because the names have to be syntactically correct and predictive. Moreover, efficiency is an important requirement, since results clustering is an online task. Suffix Tree Clustering (STC) is a clustering technique in which search results (mainly snippets) can be clustered fast (in linear time) and incrementally, and in which each cluster is labeled with a phrase. In this paper we introduce: (a) a variation of STC, called STC+, with a scoring formula that favors phrases that occur in document titles and that differs in the way base clusters are merged, and (b) a novel non-merging algorithm, called NM-STC, that results in hierarchically organized clusters. The comparative user evaluation showed that both STC+ and NM-STC are significantly preferred over STC, and that NM-STC is about twice as fast as STC and STC+.
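To make the idea of phrase-labeled base clusters concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes each result is a (title, snippet) pair, and it enumerates word n-grams instead of building a suffix tree, so it does not achieve linear time. The names `phrases` and `base_clusters`, the score (number of documents times phrase length) and the `title_boost` factor are hypothetical choices made only for illustration; the abstract does not give the actual STC+ scoring formula, and the merging step of STC and STC+ is not shown.

```python
from collections import defaultdict


def phrases(text, max_len=3):
    """Yield every word n-gram of the text, for n = 1..max_len."""
    words = text.lower().split()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            yield tuple(words[i:i + n])


def base_clusters(docs, max_len=3, min_docs=2, title_boost=2.0):
    """Group documents by shared phrases and rank the groups.

    docs is a list of (title, snippet) pairs. The score used here is an
    illustrative assumption, not the formula from the paper.
    """
    members = defaultdict(set)   # phrase -> ids of documents containing it
    in_title = set()             # phrases seen in at least one title
    for doc_id, (title, snippet) in enumerate(docs):
        for p in phrases(title, max_len):
            members[p].add(doc_id)
            in_title.add(p)
        for p in phrases(snippet, max_len):
            members[p].add(doc_id)

    clusters = []
    for p, ids in members.items():
        if len(ids) < min_docs:
            continue  # a phrase found in a single document groups nothing
        score = float(len(ids) * len(p))
        if p in in_title:
            score *= title_boost  # hypothetical preference for title phrases
        clusters.append((score, " ".join(p), sorted(ids)))

    # Highest score first; ties are broken alphabetically by label.
    clusters.sort(key=lambda c: (-c[0], c[1]))
    return clusters


if __name__ == "__main__":
    results = [
        ("Suffix tree clustering", "fast clustering of snippets"),
        ("Notes", "suffix tree clustering in linear time"),
        ("Other", "cooking recipes"),
    ]
    for score, label, ids in base_clusters(results):
        print(score, label, ids)
```

Each returned tuple holds a score, the phrase that serves as the cluster label, and the ids of the results sharing that phrase. A real implementation would obtain the shared phrases from a suffix tree over the snippets, which is what makes the linear-time and incremental behavior possible.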