Data discretization for novel relationship discovery in information retrieval

  • Authors: G. Benoît
  • Affiliations: Univ. of Kentucky, Lexington
  • Venue: Journal of the American Society for Information Science and Technology
  • Year: 2002

Abstract

This article describes a model for information retrieval, visualization, and manipulation. After terms or phrases have been entered as a query, a system built on this model offers the user multiple ways to exploit the retrieval set through an interactive interface. The retrieved data are clustered into thematic concepts related to the query and represented on screen as a grid of nodes. Users may manipulate the retrieval set to explore document-document, document-concept, and concept-concept relationships that might otherwise be masked. They do so by altering (a) the discrete grid size of the display, (b) the influence, or weight, of various document terms and properties, and (c) mixed levels of granularity. As these factors are reweighted, the display is updated in real time to expose unanticipated document relationships and shifts in cluster membership. The article outlines the mathematical model and then describes an information-retrieval application built on the model to search structured and full-text files. The application, written in Java, uses a small test collection of Dialog and Swiss-Prot documents.
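
The interplay of grid size and property weights can be illustrated with a minimal sketch. It is not the article's mathematical model, which this abstract does not specify. It assumes that each document carries per-property scores in [0, 1], that each display axis is a weighted average of those scores, and that a document's node is the grid cell containing its position. All names (`project`, `to_cell`, `cluster`, the `title` and `body` properties) are hypothetical, and Python is used only for brevity.

```python
from collections import defaultdict


def project(props, weights):
    """Weighted average of a document's property scores (assumed to lie in [0, 1])."""
    total = sum(weights.values())
    return sum(w * props.get(name, 0.0) for name, w in weights.items()) / total


def to_cell(x, y, grid_size):
    """Discretize a position in the unit square into a (row, col) grid cell."""
    col = min(int(x * grid_size), grid_size - 1)  # clamp so 1.0 falls in the last cell
    row = min(int(y * grid_size), grid_size - 1)
    return row, col


def cluster(docs, weights_x, weights_y, grid_size):
    """Group document ids by the grid cell their weighted position falls into."""
    cells = defaultdict(list)
    for doc_id, props in docs.items():
        x = project(props, weights_x)
        y = project(props, weights_y)
        cells[to_cell(x, y, grid_size)].append(doc_id)
    return dict(cells)


# Hypothetical scores for three documents.
docs = {
    "d1": {"title": 0.9, "body": 0.1},
    "d2": {"title": 0.8, "body": 0.6},
    "d3": {"title": 0.2, "body": 0.7},
}

# Title drives the x axis and body drives the y axis: three separate cells.
by_property = cluster(docs, {"title": 1.0}, {"body": 1.0}, grid_size=2)

# Giving body more influence on the x axis moves d3 into d2's cell.
reweighted = cluster(docs, {"title": 1.0, "body": 3.0}, {"body": 1.0}, grid_size=2)

# A coarser grid merges every document into a single node.
coarse = cluster(docs, {"title": 1.0}, {"body": 1.0}, grid_size=1)
```

Under these assumptions, changing either the weights or `grid_size` and calling `cluster` again is enough to change cluster membership, which is the kind of shift the abstract describes the display exposing.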