Confidence-Based Incremental Classification for Objects with Limited Attributes in Vertical Search

  • Authors:
  • Ozer Ozdikis;Pinar Senkul;Siyamed Sinir

  • Affiliations:
  • Computer Engineering Department, Middle East Technical University, Ankara, Turkey;Computer Engineering Department, Middle East Technical University, Ankara, Turkey;Karniyarik Ltd., Ankara, Turkey

  • Venue:
  • IEA/AIE'12 Proceedings of the 25th international conference on Industrial Engineering and Other Applications of Applied Intelligent Systems: advanced research in applied artificial intelligence
  • Year:
  • 2012

Abstract

Vertical search engines make it possible to search web pages in a specific domain, such as products, restaurants, or academic papers, and to present users with only the information of interest. Gathering and integrating such objects from multiple web pages into a single system provides a useful facility for users. Placing objects extracted from multiple data sources into a single hierarchical structure is a challenging classification problem, especially when the objects have few attributes. In this work, we propose a confidence-based incremental Naïve Bayesian approach for categorization, focusing on the product domain. The incremental approach extends the training set and retrains the classifier as new objects are assigned to a category with high confidence. The ordering of product data is taken into account as well. The proposed approach is applied to a vertical search engine that collects product data from several online stores.
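
The incremental idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the multinomial Naive Bayes model with Laplace smoothing, the function names (train, posterior, classify_incrementally), the 0.8 confidence threshold, and the toy product titles are all assumptions made for illustration. Each incoming product is classified in stream order, and when the top posterior probability reaches the threshold the product is added to the training set and the classifier is retrained.

```python
import math
from collections import Counter, defaultdict

def train(labeled):
    """Fit a multinomial Naive Bayes model from (tokens, category) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, cat in labeled:
        class_counts[cat] += 1
        word_counts[cat].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def posterior(model, tokens):
    """Return {category: P(category | tokens)} using Laplace smoothing."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for cat, n in class_counts.items():
        denom = sum(word_counts[cat].values()) + len(vocab)
        score = math.log(n / total)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[cat][t] + 1) / denom)
        log_scores[cat] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    exp = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def classify_incrementally(seed, stream, threshold=0.8):
    """Classify items in stream order; confident ones extend the training set."""
    labeled = list(seed)
    model = train(labeled)
    results = []
    for tokens in stream:
        probs = posterior(model, tokens)
        cat = max(probs, key=probs.get)
        confident = probs[cat] >= threshold
        results.append((cat, probs[cat], confident))
        if confident:
            labeled.append((tokens, cat))
            model = train(labeled)  # retrain on the extended training set
    return results

# Hypothetical seed data: product titles split into tokens, with categories.
seed = [
    (["apple", "iphone", "smartphone"], "phones"),
    (["samsung", "galaxy", "smartphone"], "phones"),
    (["dell", "inspiron", "laptop"], "laptops"),
    (["lenovo", "thinkpad", "laptop"], "laptops"),
]
# Hypothetical unlabeled products, processed in this order.
stream = [
    ["lenovo", "ideapad", "laptop"],
    ["ideapad", "flex"],
]
results = classify_incrementally(seed, stream)
for cat, prob, confident in results:
    print(cat, round(prob, 3), confident)
```

In this toy run the first product is assigned to "laptops" with enough confidence to join the training set, which teaches the model the token "ideapad". The second product, which shares only that token with the training data, is then also placed under "laptops", although below the threshold, so it is not added. Reversing the stream would change the outcome, which is one way to see why the order in which products are processed can matter for an incremental scheme of this kind.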