Table Based Single Pass Algorithm for Clustering News Articles in NewsPage.com

  • Authors:
  • Gwyduk Yeom; Taeho Jo; Youngik Yeom

  • Affiliations:
  • School of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea 440-746; School of Information and Information Engineering, Inha University, and School of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea 440-746; School of Information and Communication Engineering, Sungkyunkwan University, Suwon, Korea 440-746

  • Venue:
  • ICCSA '08 Proceedings of the international conference on Computational Science and Its Applications, Part II
  • Year:
  • 2008

Abstract

This research proposes a modified version of the single pass algorithm specialized for text clustering. Encoding documents as numerical vectors, as the traditional single pass algorithm requires, causes two main problems: huge dimensionality and sparse distribution. To address both problems, this research modifies the single pass algorithm so that documents are encoded in a form other than numerical vectors. In the proposed version, documents are mapped into tables, and an operation on two tables is defined so that the single pass algorithm can be applied to them. The goal of this modification is to improve the performance of the single pass algorithm for text clustering.
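
The idea of clustering documents as tables in a single pass can be illustrated with a minimal Python sketch. The abstract does not specify what a table contains or how the operation on two tables is defined, so everything here is an assumption made for illustration: a table is taken to be a word-to-frequency map, the operation `table_similarity` is a hypothetical shared-weight ratio, and the `threshold` value is arbitrary. It should not be read as the authors' actual method.

```python
import re
from collections import Counter


def to_table(text):
    """Map a document to a word -> frequency table (assumed encoding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def table_similarity(a, b):
    """Hypothetical operation on two tables.

    Returns the weight of the words shared by both tables divided by the
    total weight of both tables, giving a value between 0.0 and 1.0.
    """
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        return 0.0
    shared = set(a) & set(b)
    return sum(a[w] + b[w] for w in shared) / total


def single_pass(docs, threshold=0.5):
    """Cluster documents in one pass over the input.

    Each document joins the most similar existing cluster if the
    similarity reaches the threshold; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"table": Counter, "members": [indices]}
    for i, text in enumerate(docs):
        table = to_table(text)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = table_similarity(table, cluster["table"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(i)
            best["table"].update(table)  # merge counts into the cluster table
        else:
            clusters.append({"table": table, "members": [i]})
    return [c["members"] for c in clusters]


docs = ["stock market falls", "stock market rises", "football match tonight"]
print(single_pass(docs))
```

Because each table stores only the words that actually occur in a document, this representation avoids the large, mostly zero vectors that the abstract identifies as the problem with numerical encoding.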