Text categorization based on combination of modified back propagation neural network and latent semantic analysis

  • Authors:
  • Wei Wang;Bo Yu

  • Affiliations:
  • School of Electronics and Information, Sichuan University, Institute of Image and Information, 610065, Chengdu, China;Xi’an Jiaotong University, School of Electronic and Information Engineering, 710049, Xi’an, China

  • Venue:
  • Neural Computing and Applications
  • Year:
  • 2009

Abstract

This paper proposes a new text categorization model that combines a modified back propagation neural network (MBPNN) with latent semantic analysis (LSA). The traditional back propagation neural network (BPNN) trains slowly and is prone to becoming trapped in local minima, which degrades both its performance and its efficiency. We propose the MBPNN to speed up BPNN training and improve categorization accuracy. LSA overcomes the problems of matching on individual words by using statistically derived conceptual indices instead. It constructs a conceptual vector space in which each term or document is represented as a vector. This not only greatly reduces the dimensionality but also uncovers important associative relationships between terms. We test our categorization model on the 20 Newsgroups corpus and the Reuters-21578 corpus. Experimental results show that the MBPNN is much faster than the traditional BPNN and also improves on its categorization performance. Applying LSA in our system yields dramatic dimensionality reduction while still achieving good classification results.
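The LSA step described in the abstract can be illustrated with a minimal sketch. It assumes the common formulation of LSA as a truncated singular value decomposition of a term-by-document matrix; the abstract does not give the paper's weighting scheme, rank, or preprocessing, so the toy matrix, the choice k=2, and the helper names `lsa_project` and `cosine` are hypothetical and used for illustration only. The MBPNN itself is not sketched because the abstract does not describe its modifications.

```python
import numpy as np

def lsa_project(term_doc, k):
    """Project terms and documents into a k-dimensional latent space.

    Assumes LSA as a rank-k truncated SVD of the term-by-document matrix.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k largest singular triplets.
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    term_vectors = U_k * s_k     # one k-dimensional row per term
    doc_vectors = Vt_k.T * s_k   # one k-dimensional row per document
    return term_vectors, doc_vectors

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy counts (rows: terms, columns: documents).
terms = ["car", "automobile", "engine", "apple", "fruit"]
X = np.array([
    [2, 0, 1, 0, 0],  # car
    [0, 2, 1, 0, 0],  # automobile
    [1, 1, 2, 0, 0],  # engine
    [0, 0, 0, 2, 1],  # apple
    [0, 0, 0, 1, 2],  # fruit
], dtype=float)

term_vecs, doc_vecs = lsa_project(X, k=2)

# "car" and "automobile" share only one document in the raw counts,
# but both co-occur with "engine", so they move together in the latent space.
raw_sim = cosine(X[0], X[1])
latent_sim = cosine(term_vecs[0], term_vecs[1])
unrelated_sim = cosine(term_vecs[0], term_vecs[3])  # "car" vs "apple"
```

In this toy case each term is described by 2 values instead of 5, and the similarity between "car" and "automobile" rises in the latent space while "car" and "apple" remain unrelated. The reduced document vectors (`doc_vecs`) are the kind of low-dimensional input a classifier such as a neural network could be trained on.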