Keyphrase extraction in biomedical publications using MeSH and intraphrase word co-occurrence information

Authors:
Seong-Yong Bong;Kyu-Baek Hwang
Affiliations:
Soongsil University, Seoul, South Korea;Soongsil University, Seoul, South Korea
Venue:
Proceedings of the ACM fifth international workshop on Data and text mining in biomedical informatics
Year:
2011

Abstract

Document keyphrases are used for various tasks such as indexing, clustering, and summarization. For documents without keyphrases, an automatic extraction method can be applied. In this paper, we propose an enhanced method of extracting keyphrases from biomedical papers using MeSH (Medical Subject Headings) and intraphrase word co-occurrence information. MeSH terms assigned to biomedical papers can serve not only as important features for keyphrase extraction but also as a source for expanding the set of keyphrase candidates. Intraphrase word co-occurrence information can be exploited for re-ranking keyphrase candidates. Through an experimental evaluation on 1,799 articles from three academic journals in the biomedical literature, we show that the candidate expansion and re-ranking steps of our approach are highly effective for improving the performance of keyphrase extraction.
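
The abstract names two steps, candidate expansion with MeSH terms and re-ranking by intraphrase word co-occurrence, without giving their formulas. The following is only a minimal illustrative sketch of how such steps might be wired together; it is not the authors' method. The function names, the additive bonus, and the `weight` parameter are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def expand_candidates(candidates, mesh_terms):
    """Append MeSH terms that are not already candidates (case-insensitive)."""
    seen = {c.lower() for c in candidates}
    expanded = list(candidates)
    for term in mesh_terms:
        if term.lower() not in seen:
            expanded.append(term)
            seen.add(term.lower())
    return expanded


def _word_pairs(phrase):
    """Unordered pairs of distinct words that occur inside one phrase."""
    words = sorted(set(phrase.lower().split()))
    return combinations(words, 2)


def cooccurrence_counts(phrases):
    """Count how often each word pair appears together within a phrase."""
    counts = Counter()
    for phrase in phrases:
        counts.update(_word_pairs(phrase))
    return counts


def rerank(scored, counts, weight=0.1):
    """Re-rank (phrase, base_score) pairs.

    Assumption: the new score is the base score plus `weight` times the
    summed co-occurrence counts of the phrase's word pairs. The real
    scoring function is not specified in the abstract.
    """
    reranked = []
    for phrase, score in scored:
        bonus = sum(counts[pair] for pair in _word_pairs(phrase))
        reranked.append((phrase, score + weight * bonus))
    return sorted(reranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    cands = expand_candidates(["gene expression", "protein binding"],
                              ["Gene Expression", "Neoplasms"])
    counts = cooccurrence_counts(
        ["gene expression", "gene expression profiling", "protein binding"])
    print(cands)
    print(rerank([("protein binding", 0.5), ("gene expression", 0.5)], counts))
```

In this toy setup, a phrase whose words frequently appear together inside other candidate phrases receives a larger bonus, so it moves up among candidates with equal base scores.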