Exploration of textual document archives using a fuzzy hierarchical clustering algorithm in the GAMBAL system

  • Authors:
  • Vicenç Torra; Sadaaki Miyamoto; Sergi Lanau

  • Affiliations:
  • Institut d'Investigació en Intelligència Artificial - CSIC, Campus UAB s/n, 08193 Bellaterra, Catalonia, Spain; Institute of Engineering Mechanics and Systems, University of Tsukuba, Ibaraki 305-8573, Japan; Institut d'Investigació en Intelligència Artificial - CSIC, Campus UAB s/n, 08193 Bellaterra, Catalonia, Spain

  • Venue:
  • Information Processing and Management: an International Journal - Special issue: Cross-language information retrieval
  • Year:
  • 2005

Abstract

The Internet, together with the large amount of textual information available in document archives, has increased the relevance of information retrieval tools. In this work we present an extension of the Gambal system for clustering and visualizing documents, based on fuzzy clustering techniques. The tool structures the set of documents hierarchically (using a fuzzy hierarchical structure) and represents this structure in a graphical interface (a 3D sphere) that the user can navigate. Gambal supports the analysis of the documents and the computation of their similarity based not only on the syntactic similarity between words but also on a dictionary (WordNet 1.7) and latent semantic analysis.
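
To make the idea of fuzzy clustering over documents concrete, the following is a minimal sketch and not the Gambal implementation. It assumes a plain bag-of-words representation, cosine similarity as the word-based similarity, and the standard fuzzy c-means membership formula with fuzzifier m; the function names (`term_vector`, `cosine_similarity`, `fuzzy_memberships`) and the toy texts are hypothetical. The dictionary-based and latent semantic similarities mentioned in the abstract are not modelled here.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts (lowercased, whitespace-tokenized)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def fuzzy_memberships(doc, centers, m=2.0):
    """Fuzzy c-means style membership of one document in each cluster.

    Assumption: distance is 1 - cosine similarity. The returned
    memberships are non-negative and sum to 1.
    """
    dists = [1.0 - cosine_similarity(doc, c) for c in centers]
    # A document that coincides with one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d <= 1e-12]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(dists))]
    exponent = 2.0 / (m - 1.0)
    weights = [d ** -exponent for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy data: two cluster centers and one document.
centers = [
    term_vector("fuzzy clustering of documents"),
    term_vector("wordnet dictionary semantics"),
]
doc = term_vector("fuzzy clustering with wordnet")
memberships = fuzzy_memberships(doc, centers)
```

Unlike a crisp clustering, the document receives a degree of membership in every cluster, which is what allows a document to appear under more than one branch of a fuzzy hierarchy.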