SemEval Task 5 was an opportunity to experiment with the key term extraction module of GROBID, a system for extracting and generating bibliographical information from technical and scientific documents. The tool first uses GROBID's facilities for analyzing the structure of scientific articles, which yields a first set of structural features. A second set of features captures content properties based on phraseness, informativeness and keywordness measures. Two knowledge bases, GRISP and Wikipedia, are then exploited to produce a last set of lexical/semantic features. Bagged decision trees appeared to be the most effective machine learning algorithm for generating a ranked list of key term candidates. Finally, a post-ranking step was applied, based on statistics of the co-usage of keywords in HAL, a large Open Access publication repository.
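The abstract names bagged decision trees as the ranking model but gives no implementation detail. The following is a minimal sketch of the general technique only, not GROBID's code: it bags depth-1 decision trees (stumps) over bootstrap samples and ranks candidates by the averaged prediction. The three features (`in_title`, `tf_idf`, `in_knowledge_base`), the toy training data, and all function names are assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple

# A stump is (feature index, threshold, score if value <= threshold, score otherwise).
Stump = Tuple[int, float, float, float]


def fit_stump(X: Sequence[Sequence[float]], y: Sequence[int]) -> Stump:
    """Fit a depth-1 regression tree by minimizing squared error."""
    best, best_err = None, float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            pl, pr = sum(left) / len(left), sum(right) / len(right)
            err = sum((yi - pl) ** 2 for yi in left) + sum((yi - pr) ** 2 for yi in right)
            if err < best_err:
                best, best_err = (j, t, pl, pr), err
    if best is None:  # no feature varies in this sample: constant predictor
        p = sum(y) / len(y)
        best = (0, float("inf"), p, p)
    return best


def fit_bagged(X, y, n_trees: int = 25, seed: int = 0) -> List[Stump]:
    """Train one stump per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    model = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        model.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return model


def score(model: List[Stump], row: Sequence[float]) -> float:
    """Average the stump predictions; higher means more key-term-like."""
    preds = [pl if row[j] <= t else pr for j, t, pl, pr in model]
    return sum(preds) / len(preds)


def rank_candidates(model, candidates: Dict[str, Sequence[float]]):
    """Return (candidate, score) pairs sorted by descending score."""
    scored = [(name, score(model, feats)) for name, feats in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical features: [in_title, tf_idf, in_knowledge_base]; label 1 = key term.
TRAIN_X = [
    [1, 0.80, 1], [1, 0.70, 1], [0, 0.90, 1], [1, 0.60, 0],
    [0, 0.30, 0], [0, 0.20, 0], [0, 0.40, 0], [0, 0.10, 0],
]
TRAIN_Y = [1, 1, 1, 1, 0, 0, 0, 0]

CANDIDATES = {
    "keyphrase extraction": [1, 0.95, 1],
    "digital library": [0, 0.50, 1],
    "this paper": [0, 0.05, 0],
}

if __name__ == "__main__":
    model = fit_bagged(TRAIN_X, TRAIN_Y)
    for name, s in rank_candidates(model, CANDIDATES):
        print(f"{s:.2f}  {name}")
```

Averaging over bootstrap-trained trees gives each candidate a graded score rather than a hard yes/no label, which is what makes the ensemble usable for producing a ranked list. A real system would use deeper trees and many more features than this sketch.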