Item categorization in the e-commerce domain

  • Authors:
  • Dan Shen; Jean David Ruvini; Manas Somaiya; Neel Sundaresan

  • Affiliations:
  • eBay Research Labs, Shanghai, China; eBay Research Labs, San Jose, CA, USA; eBay Research Labs, San Jose, CA, USA; eBay Research Labs, San Jose, CA, USA

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Abstract

Hierarchical classification is a challenging problem with broad application in real-world tasks. Item categorization in the e-commerce domain is one such example. In a large-scale industrial setting such as eBay, a vast number of items must be categorized into a large number of leaf categories, on top of which a complex topic hierarchy is defined. Beyond the challenges of scale, item data is extremely sparse, is distributed unevenly across categories, and exhibits heterogeneous characteristics from one category to another. A common strategy for hierarchical classification is the "gates-and-experts" method, in which a high-level classification is made first (the gates), followed by a low-level distinction (the experts). In this paper, we propose to leverage domain-specific feature generation and modeling techniques to greatly enhance the classification accuracy of the experts. In particular, we derive features that encode rich domain knowledge and linguistic hints, and then adapt an SVM-based model to distinguish several highly confusable category groups that appear as the performance bottleneck of a currently deployed live system at eBay. We use illustrative examples and empirical results to demonstrate the effectiveness of our approach, particularly the merit of carefully designed domain-specific features.
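
The two-stage "gates-and-experts" strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the abstract mentions an SVM-based model and domain-specific features, whereas the code below substitutes a toy token-count scorer so that it runs with the Python standard library alone. The class names (`KeywordClassifier`, `GatesAndExperts`), the category labels, and the item titles are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


class KeywordClassifier:
    """Toy stand-in for a real model (e.g. an SVM): scores each label by
    how often the title's tokens co-occurred with it in training."""

    def __init__(self):
        self.token_counts = defaultdict(Counter)

    def fit(self, titles, labels):
        for title, label in zip(titles, labels):
            self.token_counts[label].update(title.lower().split())
        return self

    def predict(self, title):
        tokens = title.lower().split()
        scores = {
            label: sum(counts[t] for t in tokens)
            for label, counts in self.token_counts.items()
        }
        return max(scores, key=scores.get)


class GatesAndExperts:
    """Gate picks a high-level group; that group's expert picks the leaf."""

    def __init__(self):
        self.gate = KeywordClassifier()
        self.experts = {}

    def fit(self, titles, groups, leaves):
        self.gate.fit(titles, groups)
        # Each expert is trained only on the items of its own group.
        by_group = defaultdict(lambda: ([], []))
        for title, group, leaf in zip(titles, groups, leaves):
            by_group[group][0].append(title)
            by_group[group][1].append(leaf)
        for group, (g_titles, g_leaves) in by_group.items():
            self.experts[group] = KeywordClassifier().fit(g_titles, g_leaves)
        return self

    def predict(self, title):
        group = self.gate.predict(title)
        return group, self.experts[group].predict(title)


# Hypothetical training data: (title, high-level group, leaf category).
TRAIN = [
    ("apple iphone 4 16gb smartphone", "Electronics", "Cell Phones"),
    ("iphone 4 silicone case cover", "Electronics", "Phone Accessories"),
    ("samsung galaxy smartphone unlocked", "Electronics", "Cell Phones"),
    ("leather case cover for galaxy", "Electronics", "Phone Accessories"),
    ("mens cotton t-shirt large", "Clothing", "Shirts"),
    ("womens running shoes size 8", "Clothing", "Shoes"),
]

titles, groups, leaves = zip(*TRAIN)
model = GatesAndExperts().fit(titles, groups, leaves)
print(model.predict("silicone cover for iphone"))
```

In this sketch a phone and a phone case share many tokens, so the gate routes both to the same group and the expert has to separate them. That is the kind of confusable group the abstract identifies as the bottleneck, and it is where richer, domain-specific features would replace the raw token counts used here.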