This paper describes Armil, a meta-search engine that groups the Web snippets returned by auxiliary search engines into disjoint labelled clusters. The cluster labels generated by Armil give the user a compact guide for assessing the relevance of each cluster to her information need. Striking the right balance between running time and cluster well-formedness was a key consideration in the design of our system. Both the clustering and the labelling tasks are performed on the fly, processing only the snippets provided by the auxiliary search engines and using no external sources of knowledge. Clustering is performed by means of a fast version of the furthest-point-first algorithm for metric k-center clustering. Cluster labelling is achieved by combining intra-cluster and inter-cluster term extraction based on a variant of the information gain measure. We have tested the clustering effectiveness of Armil against Vivisimo, the de facto industrial standard in Web snippet clustering, using as a benchmark a comprehensive set of snippets obtained from the Open Directory Project hierarchy. According to two widely accepted “external” metrics of clustering quality, Armil outperforms Vivisimo by 10%. We also report the results of a thorough user evaluation of both the clustering and the cluster labelling algorithms.
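To make the clustering step concrete, here is a minimal sketch of the plain furthest-point-first heuristic for metric k-center clustering. It is the textbook greedy procedure, not the fast version used in Armil, whose details the abstract does not give. The function name `furthest_point_first`, the choice of the first point as the initial center, and the use of Euclidean distance on toy 2-D points are all assumptions made for illustration. A real snippet clusterer would substitute a distance defined over snippet representations.

```python
import math


def furthest_point_first(points, k, dist):
    """Greedy furthest-point-first heuristic for metric k-center.

    Returns (centers, assign): the indices of the chosen centers and, for
    each point, the position in `centers` of its closest center.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    # Assumption: the first point is used as the arbitrary initial center.
    centers = [0]
    # nearest[i] is the distance from point i to its closest chosen center.
    nearest = [dist(points[0], p) for p in points]
    assign = [0] * len(points)

    while len(centers) < k:
        # The next center is the point furthest from all current centers.
        far = max(range(len(points)), key=nearest.__getitem__)
        centers.append(far)
        # Reassign every point that is closer to the new center.
        for i, p in enumerate(points):
            d = dist(points[far], p)
            if d < nearest[i]:
                nearest[i] = d
                assign[i] = len(centers) - 1
    return centers, assign


# Toy usage: three visually separated groups of 2-D points.
pts = [(0, 0), (0, 1), (10, 0), (10, 1), (5, 20)]
centers, assign = furthest_point_first(pts, 3, math.dist)
print(centers)  # [0, 4, 3]
print(assign)   # [0, 0, 2, 2, 1]
```

Each round costs one pass over the points, so this plain version takes time proportional to the number of points times k. The resulting clusters are disjoint by construction, since every point is assigned to exactly one center.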