Keyword extraction based on sequential pattern mining

  • Authors: Jiajia Feng, Fei Xie, Xuegang Hu, Peipei Li (Hefei University of Technology, Hefei, China); Jie Cao (Nanjing University of Finance and Economics, Nanjing, China); Xindong Wu (University of Vermont, Burlington)
  • Venue: Proceedings of the Third International Conference on Internet Multimedia Computing and Service
  • Year: 2011

Abstract

Keyword extraction aims to automatically extract keywords that capture the main topic discussed in a given document. This paper proposes a new keyword extraction algorithm based on sequential patterns. After preprocessing, a document is represented as a set of word sequences, to which a sequential pattern mining algorithm is applied; the important sequential patterns that are mined reflect the semantic relatedness between words. Both statistical features and pattern features of words are used to build the keyword extraction model. The algorithm is language-independent and does not require a semantic dictionary to obtain semantic features. Experimental results on Chinese journal articles show that the proposed algorithm consistently outperforms the baseline method KEA.
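The idea of mining sequential patterns from word sequences and turning them into a per-word feature can be illustrated with a minimal sketch. The abstract does not say which mining algorithm or which pattern features the authors use, so everything below is an assumption: the miner is a simplified PrefixSpan-style search, and `mine_patterns`, `pattern_feature`, `min_support`, and `max_len` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict


def mine_patterns(sequences, min_support, max_len=3):
    """Simplified PrefixSpan-style miner (an assumption, not the paper's method).

    Returns {pattern: support}, where a pattern is a tuple of words that occurs
    as a (possibly gapped) subsequence in at least `min_support` sequences.
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start position) pairs: the suffix
        # of each sequence that remains after matching `prefix`.
        supporters = defaultdict(set)
        for sid, start in projected:
            for word in set(sequences[sid][start:]):
                supporters[word].add(sid)
        for word, sids in supporters.items():
            if len(sids) < min_support:
                continue
            pattern = prefix + (word,)
            results[pattern] = len(sids)
            if len(pattern) < max_len:
                next_projected = []
                for sid, start in projected:
                    seq = sequences[sid]
                    if word in seq[start:]:
                        # Continue after the first occurrence of `word`.
                        next_projected.append((sid, seq.index(word, start) + 1))
                grow(pattern, next_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


def pattern_feature(word, patterns):
    """Hypothetical pattern feature: total support of the multi-word
    patterns that contain `word`."""
    return sum(s for p, s in patterns.items() if len(p) > 1 and word in p)


# Toy "document": each sentence is one word sequence after preprocessing.
sentences = [
    ["keyword", "extraction", "algorithm"],
    ["keyword", "extraction", "model"],
    ["pattern", "mining", "algorithm"],
]
patterns = mine_patterns(sentences, min_support=2)
for pattern, support in sorted(patterns.items()):
    print(pattern, support)
```

In a full system, a score such as `pattern_feature` would be combined with statistical features (for example, term frequency or first-occurrence position) as inputs to a classifier; how the paper actually combines them is not stated in the abstract.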