Modified Naïve Bayes classifier for e-catalog classification
DEECS'06 Proceedings of the Second international conference on Data Engineering Issues in E-Commerce and Services
E-catalogs are semi-structured documents that consist of multiple attributes and their values. Although conventional text classification techniques can be applied to e-catalog classification, they cannot use the attribute information effectively to improve classification accuracy. In this paper, we propose an e-catalog classification algorithm that extends the Naïve Bayes classifier to use attribute information. Specifically, we exploit two e-catalog-specific characteristics: the attribute-wise keyword distribution and category-dependent attributes. Experiments on real data validate the proposed method.
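
To illustrate the idea of an attribute-wise keyword distribution, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a multinomial Naïve Bayes model that keeps a separate word distribution for every (category, attribute) pair, with Laplace smoothing. The class name `AttributeWiseNB`, the parameter `alpha`, and the toy catalog entries are all hypothetical, and the sketch does not model category-dependent attributes beyond keeping per-category counts.

```python
import math
from collections import Counter, defaultdict


class AttributeWiseNB:
    """Illustrative multinomial Naive Bayes with one word distribution
    per (category, attribute) pair. Hypothetical sketch only."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing constant (assumed)
        self.class_counts = Counter()            # category -> number of training catalogs
        self.word_counts = defaultdict(Counter)  # (category, attribute) -> word counts
        self.vocab = defaultdict(set)            # attribute -> words seen in training

    def fit(self, catalogs, labels):
        for catalog, label in zip(catalogs, labels):
            self.class_counts[label] += 1
            for attribute, value in catalog.items():
                words = value.lower().split()
                self.word_counts[(label, attribute)].update(words)
                self.vocab[attribute].update(words)
        return self

    def log_scores(self, catalog):
        total = sum(self.class_counts.values())
        scores = {}
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior of the category
            for attribute, value in catalog.items():
                counts = self.word_counts[(label, attribute)]
                # The +1 reserves probability mass for words unseen in this attribute.
                denom = sum(counts.values()) + self.alpha * (len(self.vocab[attribute]) + 1)
                for word in value.lower().split():
                    score += math.log((counts[word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, catalog):
        scores = self.log_scores(catalog)
        return max(scores, key=scores.get)


# Toy data (invented for illustration): "printer" appears in the name of
# printers but in the maker field of a furniture item.
train = [
    {"name": "laser printer", "maker": "acme"},
    {"name": "inkjet printer", "maker": "acme"},
    {"name": "office chair", "maker": "printer house"},
]
labels = ["printers", "printers", "furniture"]

model = AttributeWiseNB().fit(train, labels)
first = model.predict({"name": "laser printer", "maker": "acme"})
second = model.predict({"name": "office chair", "maker": "printer house"})
print(first, second)
```

Because counts are kept per attribute, the same keyword can carry different evidence depending on where it occurs, which a flat bag-of-words model over the whole catalog entry cannot express.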