A probabilistic model for compact document topic representation

  • Authors:
  • Zsolt Berényi; István Vajk

  • Affiliations:
  • Budapest University of Technology and Economics, Department of Automation and Applied Informatics, Budapest, Hungary; Budapest University of Technology and Economics, Department of Automation and Applied Informatics, Budapest, Hungary

  • Venue:
  • SMO'09 Proceedings of the 9th WSEAS international conference on Simulation, modelling and optimization
  • Year:
  • 2009

Abstract

When building a document categorization system in distributed mobile environments, feature selection methods are needed to obtain a compact representation of each document topic and to reduce noise during classification. When nodes interact, the locally retrieved features representing a document topic, together with their attributes, have to be shared so that every node can estimate the global classifier more accurately. Network traffic should be kept to a minimum to reduce costs. We propose a probabilistic model for a keyword selection method; the model makes a more thorough analysis possible and can serve as a basis for sharing information. It can be used to build up the local document topic representations incrementally while ensuring minimal network traffic. The description of the probabilistic model is complemented by experimental results.