Concept-based clustering of textual documents using SOM

  • Authors:
  • Abdelmalek Amine; Zakaria Elberrichi; Ladjel Bellatreche; Michel Simonet; Mimoun Malki

  • Affiliations:
  • EEDIS Laboratory, Department of computer science, Djillali Liabes University, Sidi Belabbes - Algeria; EEDIS Laboratory, Department of computer science, Djillali Liabes University, Sidi Belabbes - Algeria; LISI/ENSMA University of Poitiers, Futuroscope 86960 France; TIMC-IMAG Laboratory, IN3S, University Joseph Fourier, Grenoble - France; EEDIS Laboratory, Department of computer science, Djillali Liabes University, Sidi Belabbes - Algeria

  • Venue:
  • AICCSA '08 Proceedings of the 2008 IEEE/ACS International Conference on Computer Systems and Applications
  • Year:
  • 2008

Abstract

The classification of textual documents has been widely studied. Most classification approaches rely on supervised learning, which works for fairly small corpora where experts can produce representative training sets, but is not feasible for large flows of data. Unsupervised classification methods discover latent (hidden) classes automatically while minimizing human intervention. Many such methods exist, among them Kohonen self-organizing maps (SOMs), which group similar objects without prior information. In this paper, we evaluate and compare the use of SOMs for the classification of textual documents under two text representations: a conceptual representation and a representation based on n-grams.
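
As a rough illustration of the n-gram side of this comparison, the sketch below turns documents into character n-gram frequency vectors and trains a small self-organizing map on them. It is not the authors' implementation: the abstract does not give the n-gram length, the map size, or the learning schedule, so every function name and parameter here (`char_ngrams`, `vectorize`, `train_som`, the 2x2 grid, the linear decay) is an assumption chosen for brevity. The conceptual representation is not covered.

```python
import math
import random
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased, whitespace-normalised text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def vectorize(docs, n=3):
    """Map each document to an L2-normalised n-gram frequency vector."""
    counts = [char_ngrams(d, n) for d in docs]
    vocab = sorted(set().union(*counts))
    vectors = []
    for c in counts:
        v = [float(c.get(g, 0)) for g in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, v):
    """Return the grid coordinate of the map unit whose weights are closest to v."""
    return min(weights, key=lambda pos: sq_dist(weights[pos], v))


def train_som(vectors, rows=2, cols=2, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                   # assumed linear learning-rate decay
        sigma = max(sigma0 * (1 - frac), 0.1)   # assumed shrinking neighbourhood radius
        order = list(vectors)
        rng.shuffle(order)
        for v in order:
            br, bc = best_matching_unit(weights, v)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights
```

After training, each document can be assigned to a cluster by calling `best_matching_unit(weights, vector)`; documents that land on the same or neighbouring units are treated as similar.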