Comparative behaviour of recent incremental and non-incremental clustering methods on text: an extended study

  • Authors:
  • Jean-Charles Lamirel; Raghvendra Mall; Mumtaz Ahmad

  • Affiliations:
  • LORIA, Vandoeuvre-lès-Nancy, France; Center of Data Engineering, IIIT Hyderabad, Hyderabad, Andhra Pradesh, India; LORIA, Vandoeuvre-lès-Nancy, France

  • Venue:
  • IEA/AIE'11 Proceedings of the 24th international conference on Industrial engineering and other applications of applied intelligent systems conference on Modern approaches in applied intelligence - Volume Part I
  • Year:
  • 2011

Abstract

This paper attempts to shed light on the qualities and defects of some recent clustering methods, whether incremental or not, on "real world data". An extended evaluation of the methods is carried out using textual datasets of increasing complexity. The third test dataset is a highly polythematic dataset that provides a static simulation of evolving data. It thus represents an interesting benchmark for comparing the behaviour of incremental and non-incremental methods. The focus is on neural clustering methods, but the standard K-means method is included as a reference in the comparison. Generic quality measures are used for quality evaluation.