The State and University Library of Denmark is developing an integrated search system called Summa, and the Summa project includes a clustering module and a facet module. Simple clusters have been created for a collection of more than six and a half million library metadata records using a linear clustering algorithm. The resulting clusters are used to enrich the metadata records, and search results are presented to the user in a faceted browsing interface alongside a ranked result list. The most frequent tags in each facet of a search result can be calculated and presented at a rate of approximately three million records per second per machine.
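The facet step described above, finding the most frequent tags per facet within a search result, can be illustrated with a minimal sketch. This is not Summa's implementation. The function name `top_facet_tags`, the record layout (a dict mapping facet names to lists of tags), and the sample data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def top_facet_tags(records, hit_ids, facets, k=3):
    """Return the k most frequent tags per facet among the hit records.

    records: hypothetical mapping of record id -> {facet name: [tags]}
    hit_ids: ids of the records in the search result
    facets:  facet names to count, e.g. "author" or "cluster"
    """
    counts = defaultdict(Counter)
    for rid in hit_ids:
        record = records[rid]
        for facet in facets:
            # A record may lack a facet entirely; treat that as no tags.
            counts[facet].update(record.get(facet, ()))
    return {facet: counts[facet].most_common(k) for facet in facets}


# Illustrative data only; the "cluster" facet stands in for the
# cluster labels used to enrich the metadata records.
records = {
    1: {"author": ["Hansen"], "cluster": ["history", "denmark"]},
    2: {"author": ["Hansen"], "cluster": ["history"]},
    3: {"author": ["Jensen"], "cluster": ["biology"]},
}
result = top_facet_tags(records, [1, 2], ["author", "cluster"], k=2)
```

The sketch makes a single pass over the hit records, so its cost grows linearly with the size of the search result. A system working at the scale reported above would presumably rely on more compact data structures than Python dictionaries.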