On content-driven search-keyword suggesters for literature digital libraries

Authors:
Sulieman A. Bani-Ahmad;Gultekin Ozsoyoglu
Affiliations:
Case Western Reserve University, Cleveland, OH, USA;Case Western Reserve University, Cleveland, OH, USA
Venue:
Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries
Year:
2008

Citing 10
Cited 0

The anatomy of a large-scale hypertextual Web search engine
WWW7 Proceedings of the seventh international conference on World Wide Web 7

How can we investigate citation behavior?: a study of reasons for citing literature in communication
Journal of the American Society for Information Science

SimRank: a measure of structural-context similarity
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

Text Retrieval by Using k-word Proximity Search
DANTE '99 Proceedings of the 1999 International Symposium on Database Applications in Non-Traditional Environments

Homonymy and polysemy in information retrieval
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics

Automatic identification of user goals in Web search
WWW '05 Proceedings of the 14th international conference on World Wide Web

Type less, find more: fast autocompletion search with a succinct index
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval

Efficient interactive query expansion with complete search
Proceedings of the sixteenth ACM Conference on information and knowledge management

elGiza, a research-pyramid based search tool for vertical literature digital libraries
ICDEW '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering Workshop

Improved publication scores for online digital libraries via research pyramids
ECDL'07 Proceedings of the 11th European conference on Research and Advanced Technology for Digital Libraries


Abstract

We propose and evaluate a "content-driven search keyword suggester" for keyword-based search in literature digital libraries. Suggesting search keywords at an early stage, i.e., while the user is still entering search terms, helps the user construct more accurate, less ambiguous, and more focused queries. Our keyword suggestion approach is based on an a priori analysis of the publication collection in the digital library at hand, and consists of the following steps. We (i) parse the document collection using the Link Grammar parser, a syntactic parser of English, (ii) group publications based on their "most-specific" research topics, (iii) use the parser output to build a hierarchical structure of simple and compound tokens from which search terms are suggested, (iv) use TextRank, a text summarization tool, to assign topic-sensitive scores to keywords, and (v) use the identified research topics to help the user aggregate search keywords before the actual search query is executed. We show experimentally that the proposed framework, which is optimized for literature digital libraries, promises a more scalable, higher-quality, and more user-friendly search-keyword suggester than its competitors. The experiments use a subset of the ACM SIGMOD Anthology digital library as a testbed and employ the research-pyramid model to identify the "most-specific" research topics.
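
To make steps (iv) and (v) more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that publications have already been grouped by topic and tokenized, and it swaps in a plain PageRank-style iteration over a word co-occurrence graph in place of the paper's Link Grammar parsing and its actual TextRank configuration. The function names `textrank_scores` and `suggest`, the `window` and `damping` parameters, and the toy documents are all hypothetical.

```python
from collections import defaultdict


def textrank_scores(docs, window=2, damping=0.85, iterations=50):
    """Score words with a PageRank-style iteration over a co-occurrence graph.

    `docs` is a list of token lists belonging to ONE topic group, so the
    resulting scores are specific to that topic (an assumption of this sketch).
    """
    # Undirected graph: two words are linked if they occur within
    # `window` positions of each other in some document.
    neighbors = defaultdict(set)
    for tokens in docs:
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    neighbors[word].add(other)
                    neighbors[other].add(word)

    # Synchronous updates: each round reads the previous round's scores.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return scores


def suggest(prefix, topic_scores, limit=3):
    """Return up to `limit` keywords starting with `prefix`, best score first."""
    matches = [(kw, s) for kw, s in topic_scores.items() if kw.startswith(prefix)]
    # Sort by descending score; break ties alphabetically for stable output.
    matches.sort(key=lambda pair: (-pair[1], pair[0]))
    return [kw for kw, _ in matches[:limit]]


# Hypothetical toy topic group: tokenized titles, not data from the paper.
query_processing_docs = [
    ["query", "optimization", "join", "ordering"],
    ["query", "processing", "join", "algorithms"],
    ["query", "optimization", "cost", "model"],
]
scores = textrank_scores(query_processing_docs)
print(suggest("q", scores))
print(suggest("o", scores))
```

Computing one score table per topic group is what would make the ranking topic-sensitive in this sketch: the same prefix can yield different suggestions depending on which topic the user has selected. A production suggester would likely replace the linear prefix scan with a trie or similar index over simple and compound tokens.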