Product query classification

  • Authors:
  • Dou Shen; Ying Li; Xiao Li; Dengyong Zhou

  • Affiliations:
  • Microsoft, Redmond, WA, USA; Microsoft, Redmond, WA, USA; Microsoft Research, Redmond, WA, USA; Microsoft Research, Redmond, WA, USA

  • Venue:
  • Proceedings of the 18th ACM Conference on Information and Knowledge Management
  • Year:
  • 2009

Abstract

Web query classification is an effective way to understand Web user intent, which in turn can improve the relevance of Web search and online advertising. However, Web queries are usually very short and therefore cannot fully reflect their meanings. Moreover, it is hard to obtain enough training data to train accurate classifiers. Previous work on query classification has therefore focused on two issues: how to represent Web queries through query expansion, and how to increase the amount of training data. In this paper, we took product query classification, the task of classifying Web queries into a predefined product taxonomy, as an example and systematically studied the impact of query expansion and training data size. We proposed two methods for enriching Web queries and three approaches to collecting training data. We then conducted a series of experiments on a real data set to compare the classification performance of different combinations of training data and query representations. The data set consists of hundreds of thousands of queries collected from a popular commercial search engine. From the experiments, we made several observations that had not been discussed in previous work. Finally, based on these observations, we proposed an effective and efficient product query classification method.