We introduce the concept of keyqueries as dynamic content descriptors for documents. Keyqueries are defined implicitly by the index and the retrieval model of a reference search engine: the keyqueries for a document are the minimal queries that return the document in the top result ranks. Besides applications in information retrieval and data mining, keyqueries have the potential to form the basis of a dynamic classification system for future digital libraries, serving as the modern version of keywords for content description. To determine the keyqueries for a document, we present an exhaustive search algorithm along with effective pruning strategies. For applications where a small number of diverse keyqueries is sufficient, we propose two tailored search strategies. Our experiments emphasize the role of the reference search engine and show the potential of keyqueries as innovative document descriptors for large, fast-evolving bodies of digital content such as the web.
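The definition above (minimal queries that return the document in the top result ranks) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: `make_toy_engine` is a hypothetical stand-in for the reference search engine (conjunctive matching, ranked by summed term frequency), and `keyqueries`, `k`, and `max_len` are names invented here. The only pruning shown is the one that follows directly from minimality, namely skipping any superset of a keyquery already found; the abstract does not spell out the paper's own pruning strategies, so this should not be read as the authors' algorithm.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Iterable, List

Query = FrozenSet[str]
SearchFn = Callable[[Query], List[str]]  # maps a query to ranked document ids


def make_toy_engine(index: Dict[str, Dict[str, int]]) -> SearchFn:
    """Hypothetical reference search engine (an assumption, not from the paper).

    A document matches if it contains every query term; matches are ranked
    by summed term frequency, with ties broken by document id.
    """
    def search(query: Query) -> List[str]:
        hits = [
            (-sum(terms[t] for t in query), doc_id)
            for doc_id, terms in index.items()
            if all(t in terms for t in query)
        ]
        return [doc_id for _, doc_id in sorted(hits)]
    return search


def keyqueries(doc_id: str, vocabulary: Iterable[str], search: SearchFn,
               k: int = 1, max_len: int = 3) -> List[Query]:
    """Level-wise exhaustive search for the keyqueries of one document.

    Queries are enumerated by increasing length, so every proper subset of a
    candidate has been examined before the candidate itself.
    """
    found: List[Query] = []
    for size in range(1, max_len + 1):
        for combo in combinations(sorted(vocabulary), size):
            query = frozenset(combo)
            # Minimality pruning: a superset of a keyquery cannot be minimal.
            if any(kq <= query for kq in found):
                continue
            if doc_id in search(query)[:k]:
                found.append(query)
    return found


# Toy index: document id -> term frequencies (made-up data).
index = {
    "d1": {"keyquery": 1, "search": 1},
    "d2": {"keyquery": 2},
    "d3": {"search": 2},
}
engine = make_toy_engine(index)
result_d1 = keyqueries("d1", index["d1"], engine, k=1)
result_d2 = keyqueries("d2", index["d2"], engine, k=1)
```

In this toy index neither single term ranks `d1` first, so its only keyquery at `k=1` is the two-term query, whereas `d2` is already described by the single term `keyquery`. Swapping in a different engine or a different `k` changes the result, which is the sense in which keyqueries depend on the reference search engine.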