Keyphrases aid the exploration of text collections by communicating salient aspects of documents and are often used to create effective visualizations of text. While prior work in HCI and visualization has proposed a variety of ways of presenting keyphrases, less attention has been paid to selecting the best descriptive terms. In this article, we investigate the statistical and linguistic properties of keyphrases chosen by human judges and determine which features are most predictive of high-quality descriptive phrases. Based on 5,611 responses from 69 graduate students describing a corpus of dissertation abstracts, we analyze characteristics of human-generated keyphrases, including phrase length, commonness, position, and part of speech. Next, we systematically assess the contribution of each feature within statistical models of keyphrase quality. We then introduce a method for grouping similar terms and varying the specificity of displayed phrases so that applications can select phrases dynamically based on the available screen space and current context of interaction. Precision-recall measures find that our technique generates keyphrases that match those selected by human judges. Crowdsourced ratings of tag cloud visualizations rank our approach above other automatic techniques. Finally, we discuss the role of HCI methods in developing new algorithmic techniques suitable for user-facing applications.
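As a minimal sketch of the precision-recall comparison mentioned above, the following scores a list of automatically extracted phrases against a list chosen by human judges. The function name `precision_recall`, the exact-match criterion after lowercasing, and the sample phrases are all assumptions made for illustration; the abstract does not specify how matches are counted in the actual evaluation.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted phrases against reference phrases.

    Assumption: two phrases match only if they are identical after
    trimming whitespace and lowercasing. Real evaluations often also
    stem or otherwise normalize phrases before comparing them.
    """
    pred = {p.strip().lower() for p in predicted}
    ref = {r.strip().lower() for r in reference}
    hits = len(pred & ref)
    # Guard against empty inputs to avoid division by zero.
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical data, not taken from the study.
human = ["keyphrase extraction", "text visualization", "tag clouds"]
auto = ["Keyphrase Extraction", "tag clouds", "graduate students", "screen space"]

p, r = precision_recall(auto, human)
print(f"precision={p:.2f} recall={r:.2f}")
```

Precision here is the share of extracted phrases that a human also chose, and recall is the share of human-chosen phrases that the extractor recovered. Sweeping the number of phrases returned by the extractor would trace out a precision-recall curve.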