Fuzzy Clustering for Topic Analysis and Summarization of Document Collections

  • Authors:
  • René Witte; Sabine Bergler

  • Affiliations:
  • Institut für Programmstrukturen und Datenorganisation (IPD), Universität Karlsruhe (TH), Germany; Department of Computer Science and Software Engineering, Concordia University, Montréal, Canada

  • Venue:
  • CAI '07 Proceedings of the 20th conference of the Canadian Society for Computational Studies of Intelligence on Advances in Artificial Intelligence
  • Year:
  • 2007

Abstract

Large document collections, such as those delivered by Internet search engines, are difficult and time-consuming for users to read and analyse. The detection of common and distinctive topics within a document set, together with the generation of multi-document summaries, can greatly ease the burden of information management. We show how this can be achieved with a clustering algorithm based on fuzzy set theory, which (i) is easy to implement and integrate into a personal information system, (ii) generates a highly flexible data structure for topic analysis and summarization, and (iii) also delivers excellent performance.