BUAP: An unsupervised approach to automatic keyphrase extraction from scientific articles

Authors:
Roberto Ortiz; David Pinto; Mireya Tovar; Héctor Jiménez-Salazar
Affiliations:
BUAP, Puebla, Mexico; BUAP, Puebla, Mexico; BUAP, Puebla, Mexico; UAM DF, Mexico
Venue:
SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation
Year:
2010

Citing 7

The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7

KEA: practical automatic keyphrase extraction
Proceedings of the fourth ACM conference on Digital libraries

Finding topic words for hierarchical summarization
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval

Using Noun Phrase Heads to Extract Document Keyphrases
AI '00 Proceedings of the 13th Biennial Conference of the Canadian Society on Computational Studies of Intelligence: Advances in Artificial Intelligence

Domain-independent automatic keyphrase indexing with small training sets
Journal of the American Society for Information Science and Technology

Domain-specific keyphrase extraction
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 2

SemEval-2010 task 5: Automatic keyphrase extraction from scientific articles
SemEval '10 Proceedings of the 5th International Workshop on Semantic Evaluation

Cited 2

MFSRank: an unsupervised method to extract keyphrases using semantic information
MICAI'11 Proceedings of the 10th Mexican international conference on Advances in Artificial Intelligence - Volume Part I

Automatic keyphrase extraction from scientific articles
Language Resources and Evaluation

Abstract

This paper presents an unsupervised approach to automatically discovering the latent keyphrases contained in scientific articles. The proposed technique combines two methods: maximal frequent sequences and PageRank-style ranking. We evaluated the results using micro-averaged precision, recall and F-score against two gold standards: 1) reader-assigned keyphrases, and 2) a combined set of author-assigned and reader-assigned keyphrases. The results were also compared against three baselines: one unsupervised (TF-IDF based) and two supervised (Naïve Bayes and Maximum Entropy).
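
To make the evaluation measure concrete, the following is a minimal sketch of micro-averaged precision, recall and F-score, in which matches and totals are pooled over all documents before the ratios are taken. It is not the authors' or the shared task's scoring script: the function name micro_prf, the document ids, and the example keyphrases are all hypothetical, and exact string matching is assumed, whereas an official evaluation may normalize phrases (for example by stemming) before comparing them.

```python
from typing import Dict, List, Set, Tuple


def micro_prf(
    predicted: Dict[str, List[str]],
    gold: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall and F1 over a document collection.

    Assumption: a predicted keyphrase counts as correct only if it matches
    a gold keyphrase exactly. Only documents present in `gold` are scored.
    """
    matched = proposed = expected = 0
    for doc_id, gold_phrases in gold.items():
        # Duplicated predictions for a document are counted once.
        candidates = set(predicted.get(doc_id, []))
        matched += len(candidates & gold_phrases)
        proposed += len(candidates)
        expected += len(gold_phrases)

    # Ratios are computed once, on the pooled counts (micro-averaging).
    precision = matched / proposed if proposed else 0.0
    recall = matched / expected if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data: two documents with made-up keyphrases.
predicted = {
    "doc1": ["keyphrase extraction", "pagerank", "semantic web"],
    "doc2": ["maximal frequent sequences", "naive bayes"],
}
gold = {
    "doc1": {"keyphrase extraction", "pagerank"},
    "doc2": {"maximal frequent sequences", "text mining", "clustering"},
}

precision, recall, f1 = micro_prf(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

In this toy example 3 of the 5 proposed keyphrases are correct and 3 of the 5 gold keyphrases are found, so precision, recall and F1 all come out at 0.60. Scoring against a second gold standard, such as the combined author and reader set, would only require passing a different gold dictionary.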