A practical system of keyphrase extraction for web pages

  • Authors:
  • Mo Chen;Jian-Tao Sun;Hua-Jun Zeng;Kwok-Yan Lam

  • Affiliations:
  • Tsinghua University, Beijing, China;Tsinghua University, Beijing, China;Microsoft Research Asia, Beijing, P.R. China;Tsinghua University, Beijing, China

  • Venue:
  • Proceedings of the 14th ACM international conference on Information and knowledge management
  • Year:
  • 2005

Abstract

Keyphrases help Web users grasp the main topic(s) of a Web page. We present a practical system for automatic keyphrase extraction from Web pages. In this system, a regression model is first trained on a set of human-labeled documents and is then used to extract keyphrases from new pages automatically. This paper makes three contributions. First, the structural information in a Web page is investigated for the keyphrase extraction task. Second, query log data associated with a Web page, collected by a search engine server, are used to aid keyphrase extraction. Third, a method is proposed for evaluating the similarity of phrases.
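
The train-then-extract pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature set, the regression form, or the training procedure. The three features below (body frequency, a title flag standing in for structure information, and a query-log flag), the function names, and the toy training data are all assumptions made for illustration. The model is fitted with ordinary least squares by batch gradient descent.

```python
from typing import List, Sequence


def phrase_features(phrase: str, title: str, body: str,
                    queries: Sequence[str]) -> List[float]:
    """Hypothetical feature vector: [bias, body count, in title, in query log]."""
    p = phrase.lower()
    body_count = float(body.lower().count(p))
    in_title = 1.0 if p in title.lower() else 0.0
    in_queries = 1.0 if any(p in q.lower() for q in queries) else 0.0
    return [1.0, body_count, in_title, in_queries]


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float],
               lr: float = 0.05, epochs: int = 2000) -> List[float]:
    """Fit least-squares linear regression weights by batch gradient descent."""
    n = len(X)
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def rank_phrases(candidates: Sequence[str], w: Sequence[float], title: str,
                 body: str, queries: Sequence[str], k: int = 3) -> List[str]:
    """Score each candidate with the fitted model and return the top k."""
    scored = []
    for c in candidates:
        x = phrase_features(c, title, body, queries)
        scored.append((sum(wj * xj for wj, xj in zip(w, x)), c))
    scored.sort(key=lambda t: t[0], reverse=True)
    return [c for _, c in scored[:k]]


if __name__ == "__main__":
    # Toy "human-labeled" data (assumed): feature rows and 0/1 keyphrase labels.
    X_train = [[1.0, 3.0, 1.0, 1.0], [1.0, 2.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0],
               [1.0, 0.0, 0.0, 0.0], [1.0, 2.0, 0.0, 1.0]]
    y_train = [1.0, 1.0, 0.0, 0.0, 1.0]
    weights = fit_linear(X_train, y_train)

    # Apply the trained model to a new, made-up page.
    page_title = "Keyphrase extraction for Web pages"
    page_body = ("Keyphrase extraction helps users. Keyphrase extraction "
                 "uses a regression model. Click here to log in.")
    page_queries = ["keyphrase extraction tool"]
    print(rank_phrases(["keyphrase extraction", "regression model", "click here"],
                       weights, page_title, page_body, page_queries, k=1))
```

A real system would generate candidate phrases from the page itself and would need far more labeled data; the sketch only shows how a trained regression score can be used to rank candidates.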