Exploiting Attribute-Wise Distribution of Keywords and Category Dependent Attributes for E-Catalog Classification

  • Authors:
  • Young-Gon Kim; Taehee Lee; Sang-Goo Lee; Jong-Heung Park

  • Affiliations:
  • Postal Technology Research Center, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea 305-700; Electrical and Computer Engineering Department, Carnegie Mellon University, Pittsburgh; School of Computer Science and Engineering / Center for E-Business Research, Seoul National University, Seoul, Republic of Korea 151-742; Postal Technology Research Center, Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea 305-700

  • Venue:
  • ICIC '08 Proceedings of the 4th International Conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Theoretical and Methodological Issues
  • Year:
  • 2008

Abstract

E-catalogs are semi-structured documents consisting of multiple attributes and their values. Conventional text classification techniques can be applied to e-catalog classification, but they cannot use the attribute information effectively to improve classification accuracy. In this paper, we propose an e-catalog classification algorithm that extends the Naïve Bayesian Classifier to use the attribute information. Specifically, we exploit two characteristics specific to e-catalogs: the attribute-wise distribution of keywords and category-dependent attributes. Experiments on real data validate the proposed method.
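
To make the idea of an attribute-wise keyword distribution concrete, the following is a minimal illustrative sketch in Python, not the authors' formulation, which the abstract does not spell out. It assumes a multinomial Naïve Bayes model in which keyword counts are kept separately for each (category, attribute) pair and smoothed with a Laplace constant `alpha`. The class name `AttributeWiseNaiveBayes`, the attribute names `name` and `maker`, and the toy catalogs are all hypothetical. The category-dependent attribute component mentioned in the abstract is not modeled here.

```python
import math
from collections import Counter, defaultdict


class AttributeWiseNaiveBayes:
    """Toy Naive Bayes that estimates keyword likelihoods per attribute.

    Hypothetical sketch: a keyword is scored against the counts of the
    attribute it appears in, rather than against one flat bag of words.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant (assumed)
        self.category_counts = Counter()         # category -> number of training catalogs
        self.word_counts = defaultdict(Counter)  # (category, attribute) -> keyword counts
        self.vocabulary = defaultdict(set)       # attribute -> keywords seen in training

    def fit(self, catalogs, categories):
        for catalog, category in zip(catalogs, categories):
            self.category_counts[category] += 1
            for attribute, value in catalog.items():
                words = value.lower().split()
                self.word_counts[(category, attribute)].update(words)
                self.vocabulary[attribute].update(words)
        return self

    def log_scores(self, catalog):
        n_catalogs = sum(self.category_counts.values())
        scores = {}
        for category, n_in_category in self.category_counts.items():
            score = math.log(n_in_category / n_catalogs)  # log prior
            for attribute, value in catalog.items():
                vocabulary = self.vocabulary.get(attribute)
                if not vocabulary:
                    continue  # attribute never seen in training: no evidence
                counts = self.word_counts.get((category, attribute), Counter())
                denominator = sum(counts.values()) + self.alpha * len(vocabulary)
                for word in value.lower().split():
                    score += math.log((counts[word] + self.alpha) / denominator)
            scores[category] = score
        return scores

    def predict(self, catalog):
        scores = self.log_scores(catalog)
        return max(scores, key=scores.get)


# Toy data: "apple" is ambiguous in the name attribute, informative in maker.
training_catalogs = [
    {"name": "apple laptop", "maker": "apple"},
    {"name": "apple juice", "maker": "orchard"},
]
training_categories = ["computers", "food"]

model = AttributeWiseNaiveBayes().fit(training_catalogs, training_categories)
print(model.predict({"name": "fresh apple", "maker": "apple"}))
print(model.predict({"name": "apple", "maker": "orchard"}))
```

In this toy setup the keyword "apple" occurs equally often in the `name` attribute of both categories, so only its occurrence in the `maker` attribute separates them; a flat bag-of-words model would pool those counts and lose that distinction.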