Likey: Unsupervised language-independent keyphrase extraction

Authors: Mari-Sanna Paukkeri; Timo Honkela
Affiliations: Aalto University School of Science and Technology, AALTO, Finland (both authors)
Venue: SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation
Year: 2010

Abstract

Likey is an unsupervised statistical approach for keyphrase extraction. The method itself is language-independent: the only language-dependent component is the reference corpus against which the documents to be analyzed are compared. In this study, we have also used a second language-dependent component, an English-specific Porter stemmer, as a preprocessing step. In our experiments on keyphrase extraction from scientific articles, the Likey method outperforms both supervised and unsupervised baseline methods.
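
The idea of comparing a document against a reference corpus can be illustrated with a minimal sketch. The abstract does not specify the scoring function, so the rank ratio used here (a word's frequency rank in the document divided by its frequency rank in the reference corpus, lower being better) is an assumption, as are the function names, the treatment of words unseen in the reference corpus, and the toy data. Only single words are scored, and the Porter stemming step is omitted.

```python
from collections import Counter


def frequency_ranks(tokens):
    """Map each word to its frequency rank (1 = most frequent).

    Words with equal counts share the same rank (dense ranking).
    """
    counts = Counter(tokens)
    ranks = {}
    rank = 0
    previous_count = None
    # Sort by descending count, then alphabetically for determinism.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count != previous_count:
            rank += 1
            previous_count = count
        ranks[word] = rank
    return ranks


def rank_ratio_keywords(doc_tokens, ref_tokens, top_k=3):
    """Return the top_k words with the lowest document/reference rank ratio.

    Assumption: words absent from the reference corpus get a rank one
    past the worst rank observed there, so they count as very rare.
    """
    doc_ranks = frequency_ranks(doc_tokens)
    ref_ranks = frequency_ranks(ref_tokens)
    unseen_rank = max(ref_ranks.values(), default=0) + 1
    scores = {
        word: rank / ref_ranks.get(word, unseen_rank)
        for word, rank in doc_ranks.items()
    }
    # Lowest ratio first; ties broken alphabetically.
    return sorted(scores, key=lambda w: (scores[w], w))[:top_k]


# Toy data (hypothetical): a short "document" and a tiny "reference corpus".
doc = "the keyphrase extraction method ranks the keyphrase candidates".split()
ref = "the cat sat on the mat and the dog sat too".split()
print(rank_ratio_keywords(doc, ref, top_k=1))
```

A word that is frequent in the document but rare or absent in the reference corpus gets a low ratio and is ranked first, while a word that is common in both, such as "the", is pushed down the list. Only the reference corpus needs to be swapped to apply the same scoring to another language.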