Taking chemistry to the task: personalized queries for chemical digital libraries

  • Authors:
  • Sascha Tönnies;Benjamin Köhncke;Wolf-Tilo Balke

  • Affiliations:
  • L3S Research Center, Hannover, Germany;L3S Research Center, Hannover, Germany;TU Braunschweig, Braunschweig, Germany

  • Venue:
  • Proceedings of the 11th annual international ACM/IEEE joint conference on Digital libraries
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Nowadays, the information access is conducted almost exclusively using the Web. Simple keyword based Web search engines, e.g. Google or Yahoo!, offer suitable retrieval and ranking features. In contrast, for highly specialized domains, represented by digital libraries, these features are insufficient. Considering the domain of chemistry, where searching for relevant literature is essentially centered on chemical entities. Beside commercial information providers such as Chemical Abstract Service (CAS) numerous groups are working on building free chemical search engines to overcome the expensive access to chemical literature. However, due to the nature of chemical queries these are often overspecialized. Often we need meaningful similarity measures for chemical entities for query relaxation. In chemistry, the similarity measures are vast; more than 40 similarity measures are available and focus on different aspects of chemical entities. This vast number of similarity measures is obvious, because the desired search results highly depend on the working field of the chemist. In this paper we present a personalized retrieval system for chemical documents taking into account the background knowledge of the individual chemist. This is done by a query relaxation for chemical entities using similar substances. We evaluate our approach extensively by analyzing the correlation of commonly used chemical similarity measures and fingerprint representations. All uncorrelated measures are finally used by our feedback engine to learn preferred similarity measures for each user. We also conducted a user study with domain experts showing that our system can assign a unique similarity measure for 75% of the users after only 10 feedback cycles.