A Comparative Study of Feature Vector-Based Topic Detection Schemes

  • Authors:
  • Masafumi Hamamoto; Hiroyuki Kitagawa; Jia-Yu Pan; Christos Faloutsos

  • Affiliations:
  • Graduate School of Systems and Information Engineering; Center for Computational Science, University of Tsukuba; Computer Science Department, Carnegie Mellon University; Computer Science Department, Carnegie Mellon University

  • Venue:
  • WIRI '05 Proceedings of the International Workshop on Challenges in Web Information Retrieval and Integration
  • Year:
  • 2005

Abstract

Topic detection is an important task when a large volume of text data is sent continuously to a user. We examine methods that detect topics in text data using feature vectors. Feature vectors represent the main distribution of the data and can be obtained by various data analysis methods. This paper examines three such methods: Singular Value Decomposition (SVD), clustering, and Independent Component Analysis (ICA). SVD and clustering are popular existing methods; clustering in particular is used in many topic detection methods. ICA was recently developed in signal processing research. In applications to text data, however, ICA has not been compared with SVD and clustering, nor has its relationship with them been explored. This paper reports comparative experiments on these three methods and shows their properties as they apply to text data.
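
To make the idea of a feature vector more concrete, the following is a minimal sketch, not the authors' implementation, of how two of the three methods named in the abstract could produce feature vectors from a document-term matrix. The toy counts, the term names, the function names, and the choice of power iteration and plain k-means with fixed seed documents are all assumptions made for illustration. ICA is left out because it needs considerably more machinery than fits in a short example.

```python
import math

# Toy document-term counts (rows: documents, columns: terms). All values are
# made up for illustration; they are not data from the paper.
TERMS = ["election", "vote", "goal", "match"]
DOCS = [
    [3.0, 2.0, 0.0, 0.0],
    [2.0, 3.0, 0.0, 1.0],
    [0.0, 0.0, 3.0, 2.0],
    [0.0, 1.0, 2.0, 3.0],
]


def normalize(v):
    """Scale v to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def leading_singular_vector(rows, iterations=200):
    """Approximate the first right singular vector by power iteration on A^T A."""
    n_terms = len(rows[0])
    v = normalize([1.0] * n_terms)
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in rows]  # w = A v
        v = normalize([sum(row[j] * w_i for row, w_i in zip(rows, w))
                       for j in range(n_terms)])  # v = A^T w, renormalized
    return v


def kmeans(rows, seeds, iterations=10):
    """Plain k-means with squared Euclidean distance; centroids act as feature vectors."""
    centroids = [list(rows[s]) for s in seeds]  # hypothetical fixed seeds
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each document goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its documents.
        for c in range(len(centroids)):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


svd_vector = leading_singular_vector(DOCS)
centroids, labels = kmeans(DOCS, seeds=(0, 3))
print("SVD feature vector:", [round(x, 3) for x in svd_vector])
print("Cluster centroids:", centroids)
print("Document labels:", labels)
```

On this toy input, k-means groups the first two and the last two documents into separate clusters, so each centroid emphasizes one group of terms, while the leading singular vector gives positive weight to every term. This only illustrates that different analysis methods can yield different feature vectors from the same data; it says nothing about the experimental results reported in the paper.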