Keyword extraction based on PageRank

  • Authors:
  • Jinghua Wang; Jianyi Liu; Cong Wang

  • Affiliations:
  • Beijing University of Posts and Telecommunications, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China; Beijing University of Posts and Telecommunications, Beijing, China

  • Venue:
  • PAKDD'07 Proceedings of the 11th Pacific-Asia conference on Advances in knowledge discovery and data mining
  • Year:
  • 2007

Abstract

Keywords are the words that represent the topic and content of a whole text. Keyword extraction is an important technique in many areas of document processing, such as text clustering, text summarization, and text retrieval. This paper presents a keyword extraction algorithm based on WordNet and PageRank. First, a text is represented as a rough undirected weighted semantic graph built with WordNet: synsets are the vertices, relations between synsets are the edges, and each edge is weighted by the relatedness of the synsets it connects. UW-PageRank is then applied to the rough graph to perform word sense disambiguation, the graph is pruned, and UW-PageRank is applied again to the pruned graph to extract keywords. Experimental results show that the algorithm is practical and effective.
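
The abstract does not give the UW-PageRank formula, so the following is only a minimal sketch of one common way to run PageRank on an undirected weighted graph: each vertex passes its score to its neighbours in proportion to edge weight. The function name uw_pagerank, the damping factor 0.85, the convergence tolerance, and the toy vertices and weights are all assumptions made for illustration. They are not taken from the paper, and the WordNet graph construction, word sense disambiguation, and pruning steps are not shown.

```python
from collections import defaultdict


def uw_pagerank(edges, damping=0.85, tol=1e-8, max_iter=200):
    """Rank the vertices of an undirected weighted graph.

    edges: iterable of (u, v, weight) triples with positive weights.
    Assumed update rule (not confirmed by the abstract):
        score(v) = (1 - d) / n + d * sum_u score(u) * w(u, v) / strength(u)
    where strength(u) is the total weight of the edges incident to u.
    """
    # Store each undirected edge in both directions.
    adj = defaultdict(dict)
    for u, v, w in edges:
        adj[u][v] = w
        adj[v][u] = w

    n = len(adj)
    strength = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    scores = {u: 1.0 / n for u in adj}

    for _ in range(max_iter):
        new_scores = {}
        for v, nbrs in adj.items():
            # Each neighbour u contributes the share of its score
            # that corresponds to the weight of the edge (u, v).
            incoming = sum(scores[u] * w / strength[u] for u, w in nbrs.items())
            new_scores[v] = (1.0 - damping) / n + damping * incoming
        delta = sum(abs(new_scores[v] - scores[v]) for v in adj)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Hypothetical toy graph: vertices stand in for synsets and the
# weights stand in for relatedness values (made-up numbers).
toy_edges = [
    ("keyword", "text", 0.9),
    ("keyword", "topic", 0.8),
    ("keyword", "graph", 0.4),
    ("graph", "vertex", 0.7),
]

scores = uw_pagerank(toy_edges)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In a pipeline like the one the abstract outlines, a ranking of this kind would be computed twice: once on the rough graph to choose among candidate senses, and once on the pruned graph, where the top-ranked vertices would be reported as keywords.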