Self-organising maps in document classification: a comparison with six machine learning methods

Authors:
Jyri Saarikoski; Jorma Laurikkala; Kalervo Järvelin; Martti Juhola
Affiliations:
Department of Computer Sciences, University of Tampere, Finland; Department of Computer Sciences, University of Tampere, Finland; Department of Information Studies and Interactive Media, University of Tampere, Finland; Department of Computer Sciences, University of Tampere, Finland
Venue:
ICANNGA'11 Proceedings of the 10th international conference on Adaptive and natural computing algorithms - Volume Part I
Year:
2011

Abstract

This paper focuses on the use of self-organising maps, also known as Kohonen maps, for classifying text documents. The aim is to classify documents automatically and effectively into separate classes based on their topics. Classification with the self-organising map was tested on three data sets, and the results were compared with those of six well-known baseline methods: k-means clustering, Ward's clustering, k-nearest-neighbour searching, discriminant analysis, the Naïve Bayes classifier and the classification tree. The self-organising map yielded the highest accuracies among the tested unsupervised methods in classifying the Reuters news collection and the Spanish CLEF 2003 news collection, and accuracies comparable to those of some of the supervised methods on all three data sets.
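
As an illustration of the general idea, the following Python sketch trains a small self-organising map on toy term-count vectors, labels each map node by majority vote of the training documents that land on it, and classifies a new document by the label of its nearest labelled node. The abstract does not describe the authors' procedure, so the node-labelling scheme, the function names (train_som, label_nodes, classify), the hyperparameters and the toy data are all assumptions made for this example, not the method evaluated in the paper.

```python
import numpy as np
from collections import Counter


def train_som(X, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a rows x cols SOM on the row vectors of X (illustrative settings)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of every node, used to measure neighbourhood distance.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise node weights from randomly chosen training vectors.
    W = X[rng.integers(0, len(X), size=rows * cols)].astype(float)
    n_steps = epochs * len(X)
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.1     # shrinking neighbourhood radius
            # Best-matching unit: the node whose weight vector is closest to the sample.
            bmu = np.argmin(np.linalg.norm(W - X[i], axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            W += lr * h[:, None] * (X[i] - W)
            step += 1
    return W


def label_nodes(W, X, y):
    """Assumed labelling scheme: each node takes the majority class of its documents."""
    votes = {}
    for x, label in zip(X, y):
        bmu = int(np.argmin(np.linalg.norm(W - x, axis=1)))
        votes.setdefault(bmu, Counter())[label] += 1
    return {node: counts.most_common(1)[0][0] for node, counts in votes.items()}


def classify(W, node_labels, x):
    """Return the label of the nearest node that received a label."""
    x = np.asarray(x, dtype=float)
    for node in np.argsort(np.linalg.norm(W - x, axis=1)):
        if int(node) in node_labels:
            return node_labels[int(node)]
    return None


# Toy term-count vectors (hypothetical data): two topics, four terms.
X = np.array([[3, 2, 0, 0], [2, 3, 0, 1], [4, 1, 0, 0],
              [0, 0, 3, 2], [0, 1, 2, 3], [0, 0, 4, 1]], dtype=float)
y = ["sport", "sport", "sport", "finance", "finance", "finance"]

W = train_som(X)
node_labels = label_nodes(W, X, y)
print(classify(W, node_labels, [3, 3, 0, 0]))
print(classify(W, node_labels, [0, 0, 3, 3]))
```

Note that the map itself is trained without class labels; the labels are used only afterwards to name the nodes, which is one common way of turning an unsupervised map into a classifier.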