Topical indexing of documents with keyphrases is a common way of revealing the subject of scientific and research documents to both human readers and information retrieval tools such as search engines. However, scientific documents that are manually indexed with keyphrases are still in the minority. This article describes a new unsupervised method for automatic keyphrase extraction from scientific documents which yields performance on a par with that of human indexers. The method identifies the references cited in the document to be indexed and uses the keyphrases assigned to those references to generate a set of high-likelihood keyphrases for the document. We evaluated the proposed method by using it to automatically index a third-party test set of research documents. The reported experimental results show that its performance, measured in terms of consistency with human indexers, is competitive with that achieved by state-of-the-art supervised methods.
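The core idea, deriving keyphrases for a document from the keyphrases already assigned to the references it cites, can be illustrated with a minimal sketch. The abstract does not say how candidates are scored, so the ranking used here (the number of cited references that carry a phrase) is only an assumption made for illustration, and the function name `keyphrases_from_citations`, the `top_n` parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def keyphrases_from_citations(reference_keyphrases, top_n=5):
    """Rank candidate keyphrases by how many cited references carry them.

    reference_keyphrases: one list of keyphrases per cited reference.
    The frequency-based ranking is an assumed stand-in, not the
    scoring described in the article.
    """
    counts = Counter()
    for phrases in reference_keyphrases:
        # Normalise case and whitespace; count each phrase once per reference.
        counts.update({p.strip().lower() for p in phrases})
    # Sort by descending count, then alphabetically for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [phrase for phrase, _ in ranked[:top_n]]


# Hypothetical keyphrases assigned to three cited references.
cited = [
    ["information retrieval", "keyphrase extraction"],
    ["Keyphrase Extraction", "digital libraries"],
    ["keyphrase extraction", "information retrieval", "citation analysis"],
]

print(keyphrases_from_citations(cited, top_n=2))
# → ['keyphrase extraction', 'information retrieval']
```

A real system would also need to resolve each citation to an indexed record before its keyphrases can be collected; that step is outside the scope of this sketch.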