Tagging web product titles based on hidden Markov model

  • Authors:
  • Peng Wang;Baowen Xu;Yue You;Lu Chen

  • Affiliations:
  • Southeast University, China;Nanjing University, China;Southeast University, China;Southeast University, China

  • Venue:
  • Proceedings of the 2011 ACM Symposium on Applied Computing
  • Year:
  • 2011

Abstract

E-commerce web sites usually have to maintain a large amount of product information. A feasible way to organize this information is to add semantic tags to it. However, Web product information often consists of irregular statements published by users, so it is difficult to find rules for tagging it automatically. This paper focuses on the problem of tagging Web product titles and proposes a tagging method based on the hidden Markov model (HMM). The method first trains the HMM with the maximum likelihood (ML) algorithm, then employs the Viterbi algorithm to tag product titles. In addition, several strategies are used to improve the quality of the results: a smoothing process, background knowledge, extraction rules, and simplification of the HMM output observations. Experimental results on a real-world dataset show that our method achieves more than 51% precision and 60% recall.
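
The train-then-decode pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes supervised training data (titles already labeled word by word), estimates the HMM parameters by counting, and uses add-alpha smoothing as a stand-in for the paper's unspecified smoothing process. The tag names (`BRAND`, `MODEL`, `COLOR`), the `<UNK>` token, the toy titles, and the function names `train_hmm` and `viterbi` are all hypothetical. The background knowledge, extraction rules, and observation simplification mentioned in the abstract are not modeled here.

```python
import math
from collections import defaultdict

UNK = "<UNK>"  # hypothetical placeholder for words unseen in training


def train_hmm(tagged_titles, alpha=1.0):
    """Estimate HMM log-probabilities by counting, with add-alpha smoothing."""
    tags = sorted({t for title in tagged_titles for _, t in title})
    vocab = sorted({w for title in tagged_titles for w, _ in title} | {UNK})
    start, trans, emit = defaultdict(float), defaultdict(float), defaultdict(float)
    for title in tagged_titles:
        prev = None
        for word, tag in title:
            if prev is None:
                start[tag] += 1          # tag of the first word in a title
            else:
                trans[(prev, tag)] += 1  # tag-to-tag transition
            emit[(tag, word)] += 1       # tag emits word
            prev = tag

    def log_norm(counts, keys):
        # Smoothed relative frequencies over one distribution, in log space.
        total = sum(counts[k] for k in keys) + alpha * len(keys)
        return {k: math.log((counts[k] + alpha) / total) for k in keys}

    log_trans, log_emit = {}, {}
    for p in tags:
        log_trans.update(log_norm(trans, [(p, t) for t in tags]))
        log_emit.update(log_norm(emit, [(p, w) for w in vocab]))
    return {"tags": tags, "vocab": set(vocab), "start": log_norm(start, tags),
            "trans": log_trans, "emit": log_emit}


def viterbi(words, model):
    """Return the most likely tag sequence for a list of title words."""
    if not words:
        return []
    tags, trans, emit = model["tags"], model["trans"], model["emit"]
    obs = [w if w in model["vocab"] else UNK for w in words]
    score = {t: model["start"][t] + emit[(t, obs[0])] for t in tags}
    back = []  # back[i][t] = best previous tag for tag t at position i + 1
    for w in obs[1:]:
        new_score, pointers = {}, {}
        for t in tags:
            best = max(tags, key=lambda p: score[p] + trans[(p, t)])
            new_score[t] = score[best] + trans[(best, t)] + emit[(t, w)]
            pointers[t] = best
        score = new_score
        back.append(pointers)
    path = [max(tags, key=lambda t: score[t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy labeled titles, invented for illustration only.
titles = [
    [("apple", "BRAND"), ("iphone", "MODEL"), ("black", "COLOR")],
    [("samsung", "BRAND"), ("galaxy", "MODEL"), ("white", "COLOR")],
    [("nokia", "BRAND"), ("lumia", "MODEL"), ("black", "COLOR")],
]
model = train_hmm(titles)
print(viterbi(["samsung", "lumia", "white"], model))  # ['BRAND', 'MODEL', 'COLOR']
```

Working in log space avoids numeric underflow on long titles, and smoothing keeps unseen transitions and words from receiving zero probability, which would otherwise rule out every path through them.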