HDGSOMr: A High Dimensional Growing Self-Organizing Map Using Randomness for Efficient Web and Text Mining

  • Authors: Rasika Amarasiri; Damminda Alahakoon; Kate Smith; Malin Premaratne
  • Affiliations: Monash University (all authors)
  • Venue: WI '05 Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence
  • Year: 2005

Abstract

Mining text data from the web has become a necessity because of the sheer volume of data available there. Search engines are the popular way to look up information on the web, but feature map techniques remain widely used for analyzing the content of large collections of web pages. One problem with applying feature map techniques to large collections of web text is the time taken to cluster them. This paper presents HDGSOMr, an algorithm based on a growing variant of the Self-Organizing Map. This novel algorithm incorporates randomness into the self-organizing process to produce higher-quality clusters within a few epochs while using smaller neighborhood sizes, resulting in a significant reduction in overall processing time. Details of the HDGSOMr algorithm are presented, along with results from processing large collections of text data that demonstrate its efficiency.