Automatic extraction of document keyphrases for use in digital libraries: evaluation and applications

Authors:
Steve Jones;Gordon W. Paynter
Affiliations:
Univ. of Waikato, Hamilton, New Zealand;Univ. of Waikato, Hamilton, New Zealand
Venue:
Journal of the American Society for Information Science and Technology
Year:
2002

Citing 23
Cited 19

The use of phrases and structured queries in information retrieval

SIGIR '91 Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval
The HCI Bibliography project

ACM SIGCHI Bulletin - Special issue: Computer supported cooperative work
C4.5: programs for machine learning

C4.5: programs for machine learning
Exploiting clustering and phrases for context-based information retrieval

Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
A public library based on full-text retrieval

Communications of the ACM
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss

Machine Learning - Special issue on learning with probabilistic representations
Inductive learning algorithms and representations for text categorization

Proceedings of the seventh international conference on Information and knowledge management
Web document clustering: a feasibility demonstration

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
Phrase-based information retrieval

Information Processing and Management: an International Journal
Phrasier: a system for interactive document retrieval using keyphrases

Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Grouper: a dynamic clustering interface to Web search results

WWW '99 Proceedings of the eighth international conference on World Wide Web
Topic-based browsing within a digital library using keyphrases

Proceedings of the fourth ACM conference on Digital libraries
A patent search and classification system

Proceedings of the fourth ACM conference on Digital libraries
KEA: practical automatic keyphrase extraction

Proceedings of the fourth ACM conference on Digital libraries
Comparing noun phrasing techniques for use with medical digital library tools

Journal of the American Society for Information Science - Special topic issue on digital libraries: part 2
Improving browsing in digital libraries with keyphrase indexes

Decision Support Systems - From information retrieval to knowledge management: enabling technologies and best practices
Learning Algorithms for Keyphrase Extraction

Information Retrieval
Managing Complexity in a Distributed Digital Library

Computer
The InfoFinder Agent: Learning User Interests through Heuristic Phrase Extraction

IEEE Expert: Intelligent Systems and Their Applications
Domain-Specific Keyphrase Extraction

IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
A simple rule-based part of speech tagger

ANLC '92 Proceedings of the third conference on Applied natural language processing
Data Mining

Data Mining
User-chosen phrases in interactive query formulation for information retrieval

IRSG'98 Proceedings of the 20th Annual BCS-IRSG conference on Information Retrieval Research

Using keyphrases as search result surrogates on small screen devices

Personal and Ubiquitous Computing
Findex: search result categories help users when document ranking fails

Proceedings of the SIGCHI Conference on Human Factors in Computing Systems
Developing practical automatic metadata assignment and evaluation tools for internet resources

Proceedings of the 5th ACM/IEEE-CS joint conference on Digital libraries
Discovering "title-like" terms

Information Processing and Management: an International Journal
Finding nuggets in documents: A machine learning approach

Journal of the American Society for Information Science and Technology
Identifying important concepts from medical documents

Journal of Biomedical Informatics
Keywords given by authors of scientific articles in database descriptors

Journal of the American Society for Information Science and Technology
Functionalities for automatic metadata generation applications: a survey of metadata experts' opinions

International Journal of Metadata, Semantics and Ontologies
Document keyphrases as subject metadata: incorporating document key concepts in search results

Information Retrieval
Automatic machine learning of keyphrase extraction from short HTML documents written in Hebrew

Cybernetics and Systems
An Approach to a Visual Semantic Query for Document Retrieval

Edutainment '08 Proceedings of the 3rd international conference on Technologies for E-Learning and Digital Entertainment
Coherent keyphrase extraction via web mining

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Developing a holistic model for digital library evaluation

Journal of the American Society for Information Science and Technology
Automatic Keyphrase Extraction from Medical Documents

PReMI '09 Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence
Automatic keyphrases extraction from document using neural network

ICMLC'05 Proceedings of the 4th international conference on Advances in Machine Learning and Cybernetics
Ensemble learning for keyphrases extraction from scientific document

ISNN'06 Proceedings of the Third international conference on Advances in Neural Networks - Volume Part I
Automatic extraction and learning of keyphrases from scientific articles

CICLing'05 Proceedings of the 6th international conference on Computational Linguistics and Intelligent Text Processing
An approach to social recommendation for context-aware mobile services

ACM Transactions on Intelligent Systems and Technology (TIST) - Special section on twitter and microblogging services, social recommender systems, and CAMRa2010: Movie recommendation in context
Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms

Journal of Information Science

Abstract

This article describes an evaluation of the Kea automatic keyphrase extraction algorithm. Document keyphrases are conventionally used as concise descriptors of document content, and are increasingly used in novel ways, including document clustering, searching and browsing interfaces, and retrieval engines. However, it is costly and time-consuming to manually assign keyphrases to documents, motivating the development of tools that automatically perform this function. Previous studies have evaluated Kea's performance by measuring its ability to identify author keywords and keyphrases, but this methodology has a number of well-known limitations. The results presented in this article are based on evaluations by human assessors of the quality and appropriateness of Kea keyphrases. The results indicate that, in general, Kea produces keyphrases that are rated positively by human assessors. However, typical Kea settings can degrade performance, particularly those relating to keyphrase length and domain specificity. We found that for some settings, Kea's performance is better than that of similar systems, and that Kea's ranking of extracted keyphrases is effective. We also determined that author-specified keyphrases appear to exhibit an inherent ranking, and that they are rated highly and therefore suitable for use in training and evaluation of automatic keyphrasing systems.