A semi-supervised key phrase extraction approach: learning from title phrases through a document semantic network

Authors:
Decong Li; Sujian Li; Wenjie Li; Wei Wang; Weiguang Qu
Affiliations:
Peking University; Peking University; The Hong Kong Polytechnic University; Peking University; Nanjing Normal University
Venue:
ACLShort '10 Proceedings of the ACL 2010 Conference Short Papers
Year:
2010

Abstract

Extracting key phrases from documents is a fundamental task. Phrases in a document are generally not independent in conveying its content. To capture and exploit their relationships in key phrase extraction, we explore Wikipedia knowledge to model a document as a semantic network in which both n-ary and binary relationships among phrases are represented. Based on the commonly accepted assumption that the title of a document is crafted to reflect its content, and that key phrases consequently tend to be semantically close to the title, we propose a novel semi-supervised key phrase extraction approach that computes phrase importance in the semantic network, through which the influence of title phrases is propagated iteratively to the other phrases. Experimental results demonstrate the strong performance of this approach.
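
The iterative propagation of title-phrase influence described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the update rule, and the paper's network also includes n-ary relationships, which are omitted here. The sketch assumes a plain graph with binary, symmetric, positive relatedness weights and a random-walk-with-restart update seeded at the title phrases. The function name, the damping value, the iteration count, and the toy weights are all hypothetical.

```python
def propagate_title_influence(edges, title_phrases, damping=0.85, iterations=50):
    """Spread importance from title phrases over a weighted phrase graph.

    Assumption: `edges` maps (phrase_a, phrase_b) pairs to positive,
    symmetric relatedness weights. The update is a generic random walk
    with restart, not the rule used in the paper.
    """
    if not title_phrases:
        raise ValueError("at least one title phrase is required as a seed")

    # Build a symmetric adjacency map: phrase -> {neighbor: weight}.
    neighbors = {}
    for (a, b), weight in edges.items():
        neighbors.setdefault(a, {})[b] = weight
        neighbors.setdefault(b, {})[a] = weight
    for phrase in title_phrases:
        neighbors.setdefault(phrase, {})

    # Title phrases share the seed mass equally; all other phrases start at 0.
    seed_mass = 1.0 / len(title_phrases)
    seeds = {p: (seed_mass if p in title_phrases else 0.0) for p in neighbors}
    scores = dict(seeds)

    for _ in range(iterations):
        updated = {}
        for phrase, linked in neighbors.items():
            # Each neighbor passes on its score in proportion to edge weight.
            incoming = sum(
                scores[q] * w / sum(neighbors[q].values())
                for q, w in linked.items()
            )
            updated[phrase] = (1 - damping) * seeds[phrase] + damping * incoming
        scores = updated
    return scores


# Toy example with made-up relatedness weights (hypothetical values).
edges = {
    ("key phrase extraction", "semantic network"): 0.8,
    ("semantic network", "wikipedia"): 0.6,
    ("wikipedia", "weather"): 0.1,
}
scores = propagate_title_influence(edges, {"key phrase extraction"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph, phrases strongly linked to the title phrase end up with higher scores than weakly linked ones, which matches the intuition in the abstract that key phrases tend to be semantically close to the title.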