On content-driven search-keyword suggesters for literature digital libraries

  • Authors: Sulieman A. Bani-Ahmad; Gultekin Ozsoyoglu
  • Affiliations: Case Western Reserve University, Cleveland, OH, USA (both authors)
  • Venue: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries
  • Year: 2008

Abstract

We propose and evaluate a "content-driven search keyword suggester" for keyword-based search in literature digital libraries. Suggesting search keywords at an early stage, i.e., while the user is entering search terms, helps the user construct more accurate, less ambiguous, and more focused queries. Our approach is based on an a priori analysis of the publication collection in the digital library at hand and consists of the following steps. We (i) parse the document collection using the Link Grammar parser, a syntactic parser of English, (ii) group publications based on their "most-specific" research topics, (iii) use the parser output to build a hierarchical structure of simple and compound tokens from which search terms are suggested, (iv) use TextRank, a text summarization tool, to assign topic-sensitive scores to keywords, and (v) use the identified research topics to help the user aggregate search keywords prior to actual query execution. We experimentally show that the proposed framework, which is optimized for literature digital libraries, promises a more scalable, higher-quality, and more user-friendly search-keyword suggester than its competitors. We validate our proposal experimentally, using a subset of the ACM SIGMOD Anthology digital library as a testbed and employing the research-pyramid model to identify the "most-specific" research topics.
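
To make the scoring and suggestion steps more concrete, the following is a minimal sketch of a TextRank-style keyword scorer feeding a prefix-based suggester. It is not the authors' implementation. It assumes plain whitespace tokenization in place of the Link Grammar parser, a flat prefix match in place of the hierarchical token structure, and a single topic-agnostic score in place of topic-sensitive scores. All function names, parameters, and the sample text are hypothetical.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Link each token to the distinct tokens that follow it within `window` positions."""
    graph = defaultdict(set)
    for i, tok in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != tok:
                graph[tok].add(tokens[j])
                graph[tokens[j]].add(tok)
    return graph


def textrank_scores(graph, damping=0.85, iterations=50):
    """Run a fixed number of PageRank-style updates on an undirected graph."""
    scores = {node: 1.0 for node in graph}
    for _ in range(iterations):
        new_scores = {}
        for node in graph:
            # Each neighbour shares its score equally among its own neighbours.
            rank = sum(scores[nb] / len(graph[nb]) for nb in graph[node])
            new_scores[node] = (1 - damping) + damping * rank
        scores = new_scores
    return scores


def suggest(prefix, scores, limit=5):
    """Return up to `limit` tokens starting with `prefix`, highest score first."""
    matches = [tok for tok in scores if tok.startswith(prefix)]
    return sorted(matches, key=lambda tok: (-scores[tok], tok))[:limit]


# Hypothetical sample text standing in for a parsed publication collection.
tokens = "query optimization for join query processing in query engines".split()
graph = build_cooccurrence_graph(tokens)
scores = textrank_scores(graph)
suggestions = suggest("q", scores)
```

In this sketch, `suggest` would be called on each keystroke with the partial term typed so far. A topic-sensitive variant, as the abstract describes, would presumably keep one score table per research topic rather than a single global one.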