Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents to both human readers and information retrieval systems. This article describes a machine learning-based keyphrase annotation method for scientific documents that uses Wikipedia as a thesaurus for selecting candidate phrases from the documents' content. We have devised a set of 20 statistical, positional, and semantic features that capture the properties of the candidates most likely to be keyphrases. We first introduce a simple unsupervised method for ranking and filtering the most probable keyphrases, and then evolve it into a novel supervised method using genetic algorithms. We evaluated both methods on a third-party dataset of research papers. The experimental results show that the performance of the proposed methods, measured in terms of consistency with human annotators, is on a par with that achieved by humans and exceeds that of rival supervised and unsupervised methods.
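The two-stage idea in the abstract (rank candidates from their features, then tune the ranking with a genetic algorithm) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the weighted-sum scoring, the Dice-style overlap 2C / (A + B) used as a stand-in for consistency with human annotators, the truncation selection, one-point crossover and Gaussian mutation, and all function names, parameters, and toy feature values. The article's actual 20 features and genetic operators are not specified in this section.

```python
import random


def rank(candidates, weights, top_n):
    """Return the top_n phrases ordered by a weighted sum of their features.

    candidates maps each phrase to a list of feature values (hypothetical).
    """
    scored = sorted(
        candidates.items(),
        key=lambda item: sum(w * f for w, f in zip(weights, item[1])),
        reverse=True,
    )
    return [phrase for phrase, _ in scored[:top_n]]


def consistency(predicted, gold):
    """Dice-style overlap 2C / (A + B) between two keyphrase sets (assumed measure)."""
    a, b = set(predicted), set(gold)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def evolve_weights(train_docs, n_features, top_n=2, pop_size=20,
                   generations=30, seed=0):
    """Search for feature weights with a small genetic algorithm (illustrative).

    train_docs is a list of (candidates, gold_keyphrases) pairs.
    Requires n_features >= 2 and pop_size >= 4.
    """
    rng = random.Random(seed)

    def fitness(weights):
        # Mean overlap with the human-assigned keyphrases over the training set.
        total = sum(consistency(rank(c, weights, top_n), gold)
                    for c, gold in train_docs)
        return total / len(train_docs)

    population = [[rng.random() for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the better half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_features)        # mutate a single weight
            child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Toy usage with three made-up features per candidate phrase.
toy_doc = (
    {"keyphrase extraction": [0.9, 0.8, 0.7],
     "genetic algorithm": [0.6, 0.2, 0.9],
     "experimental results": [0.8, 0.9, 0.1]},
    ["keyphrase extraction", "genetic algorithm"],
)
unsupervised = rank(toy_doc[0], [1.0, 1.0, 1.0], top_n=2)  # equal weights
learned = evolve_weights([toy_doc], n_features=3)
supervised = rank(toy_doc[0], learned, top_n=2)
```

In this sketch the unsupervised stage is simply the equal-weight ranking, and the supervised stage reuses the same ranking function with weights chosen to maximize agreement with the human-assigned keyphrases on training documents.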