Applications of a Lightweight, Web-Based Retrieval, Clustering, and Visualisation Framework

  • Authors:
  • Vedran Sabol;W. Kienreich;M. Granitzer;J. Becker;Klaus Tochtermann;Keith Andrews

  • Venue:
  • PAKM '02 Proceedings of the 4th International Conference on Practical Aspects of Knowledge Management
  • Year:
  • 2002

Abstract

Today's web search engines return very large result sets for queries consisting of a few specific keywords. Results are presented as ranked lists containing textual descriptions of the found items. Such representations do not allow identification of topical clusters, and consequently make it difficult for users to refine queries efficiently.

In this paper, we present WebRat, a framework for web-based retrieval, clustering and visualisation which enables parallel querying of multiple search engines, merging of retrieved result sets, automatic identification of topical clusters, and interactive visualisation of the result sets and clusters for query refinement. This framework is lightweight in the sense that it consists of a small, platform-independent component which can be easily integrated into existing Internet or Intranet search forms without requiring specific system environments, server resources or precalculation efforts.

The WebRat system extends existing approaches to web search result visualisation in many aspects: found results are added incrementally as they arrive, labelling is performed in 2-dimensional space on clusters the user can see, and rendering is optimised to provide sufficient performance on standard office machines.

The WebRat framework has been used to implement a variety of applications. We have provided enhanced web search capabilities for users doing scientific research. Overview and refinement capabilities have been implemented for the environmental domain. Finally, abstracts generated on the fly by a knowledge management system have been used to provide topical navigation capabilities to developers searching for technical information in mailing list archives.
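
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of querying several engines in parallel and merging their results incrementally as they arrive. It is not WebRat's actual code. The engine functions, the (title, URL) result shape and the URL normalisation rule are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.parse import urlsplit


def normalise(url):
    """Comparison key for a URL: lowercased host plus path without a trailing slash."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")


def merged_results(engines, query):
    """Query all engines in parallel and yield unseen results as each engine answers."""
    seen = set()
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = [pool.submit(engine, query) for engine in engines]
        # as_completed hands back each engine's answer as soon as it is ready,
        # so results can be passed on without waiting for the slowest engine.
        for future in as_completed(futures):
            for title, url in future.result():
                key = normalise(url)
                if key not in seen:  # drop items already reported by another engine
                    seen.add(key)
                    yield title, url


# Hypothetical stand-ins for real search engine back ends (assumption).
def engine_a(query):
    return [("Intro to " + query, "http://example.org/intro/"),
            ("Survey", "http://example.org/survey")]


def engine_b(query):
    return [("Introduction", "http://EXAMPLE.org/intro"),
            ("Tools", "http://example.net/tools")]


results = list(merged_results([engine_a, engine_b], "clustering"))
```

In this sketch the two stub engines both return the same introduction page under slightly different URLs, and the merged list keeps only one of them. Which copy survives depends on which engine answers first, so the order of the merged list is not fixed.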