Tag clouds, also known as word clouds, are very useful for quickly perceiving the most prominent terms in a text collection and their relative prominence. How well a tag cloud conceptualizes a text corpus depends directly on the quality of the keyphrases extracted from that corpus. Authors of scientific publications usually provide a list of about five to ten keywords that map each publication to its domain. Non-scientific documents on the World Wide Web, whose number is growing exponentially, carry no such lists, so an automatic mechanism is needed to identify the keyphrases embedded within them for tag cloud generation. In this paper, we propose a web content mining technique to extract keyphrases from web documents for tag cloud generation. Instead of using partial or full parsing, the proposed method applies an n-gram technique followed by various heuristics-based refinements to identify a set of lexical and semantic features from text documents. We propose a rich set of domain-independent features to model candidate keyphrases and establish their keyphraseness using classification models. We also propose a font-determination function to determine the relative font size of keyphrases for tag cloud generation. Experiments establish the efficacy of the proposed method, which outperforms the popular keyphrase extraction system KEA.
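The abstract does not spell out the n-gram heuristics or the font-determination function, so the following Python sketch only illustrates the general shape of such a pipeline. Everything in it is an assumption made for illustration: the names `candidate_ngrams` and `font_size`, the tiny stopword list, the rule that a candidate may not start or end with a stopword, and the linear scaling of scores to point sizes. Raw frequency stands in for the classifier-based keyphraseness score described in the paper.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for", "to",
             "and", "or", "is", "are", "with", "from"}


def candidate_ngrams(text, max_n=3):
    """Count word n-grams (1..max_n) that neither start nor end with a stopword."""
    counts = Counter()
    # Split on punctuation so that n-grams never cross a phrase boundary.
    for segment in re.split(r"[.,;:!?()\n]+", text.lower()):
        tokens = re.findall(r"[a-z][a-z0-9-]*", segment)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                gram = tokens[i:i + n]
                # Hypothetical refinement: drop grams with a stopword at either edge.
                if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                    continue
                counts[" ".join(gram)] += 1
    return counts


def font_size(score, min_score, max_score, min_pt=10, max_pt=36):
    """Linearly map a score in [min_score, max_score] onto [min_pt, max_pt]."""
    if max_score == min_score:
        # All phrases score the same, so use the middle of the size range.
        return (min_pt + max_pt) // 2
    ratio = (score - min_score) / (max_score - min_score)
    return round(min_pt + ratio * (max_pt - min_pt))


if __name__ == "__main__":
    sample = "Tag clouds summarize text. Tag clouds rank the terms of a text."
    counts = candidate_ngrams(sample)
    lo, hi = min(counts.values()), max(counts.values())
    for phrase, freq in counts.most_common(5):
        print(phrase, font_size(freq, lo, hi))
```

In a fuller system the frequency passed to `font_size` would be replaced by the score produced by the trained classification model, and a logarithmic rather than linear mapping may be preferable when scores are heavily skewed.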