The PHASAR search engine

  • Authors:
  • Cornelis H. A. Koster; Olaf Seibert; Marc Seutter

  • Affiliations:
  • Department of Computer Science, Radboud University Nijmegen, Nijmegen, The Netherlands (all authors)

  • Venue:
  • NLDB'06: Proceedings of the 11th International Conference on Applications of Natural Language to Information Systems
  • Year:
  • 2006

Abstract

This article describes the rationale behind the PHASAR system (Phrase-based Accurate Search And Retrieval), a professional Information Retrieval and Text Mining system under development for collecting information about metabolites from the biological literature. The system is generic in nature and applicable (given suitable linguistic resources and thesauri) to many other forms of professional search. Instead of keywords, the PHASAR search engine uses Dependency Triples as terms. Both the documents and the queries are parsed, transduced to Dependency Triples and lemmatized. A query consists of a set of Dependency Triples, whose elements may be generalized or specialized to achieve the desired precision and recall. To support interactive exploration, the search process draws on document frequency information from the index, both for terms from the query and for terms from the thesaurus.
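
The idea of searching on Dependency Triples rather than keywords can be illustrated with a minimal sketch. It is not the PHASAR implementation: the triple layout (head, relation, dependent), the relation names SUBJ and OBJ, the class TripleIndex, and the use of None as a wildcard standing in for thesaurus-based generalization are all assumptions made for illustration. The triples are written by hand in lemmatized form, since the abstract does not specify the parser or the transduction step.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """A lemmatized dependency triple (field names are hypothetical)."""
    head: str
    relation: str
    dependent: str


class TripleIndex:
    """Toy inverted index whose terms are triples instead of keywords."""

    def __init__(self):
        # Maps each triple to the set of document ids containing it.
        self._postings = defaultdict(set)

    def add(self, doc_id, triples):
        for triple in triples:
            self._postings[triple].add(doc_id)

    def match(self, pattern):
        """Return documents with a triple matching the pattern.

        A pattern is a 3-tuple; None in any position is a wildcard,
        a crude stand-in for generalizing that element.
        """
        docs = set()
        for triple, doc_ids in self._postings.items():
            if all(p is None or p == v for p, v in zip(pattern, triple)):
                docs |= doc_ids
        return docs

    def document_frequency(self, pattern):
        """Number of documents matching the pattern."""
        return len(self.match(pattern))

    def search(self, query):
        """A query is a collection of patterns; all of them must match."""
        result = None
        for pattern in query:
            docs = self.match(pattern)
            result = docs if result is None else result & docs
        return result or set()


# Hand-written, already lemmatized triples (illustrative data only).
index = TripleIndex()
index.add("doc1", [Triple("produce", "SUBJ", "yeast"),
                   Triple("produce", "OBJ", "ethanol")])
index.add("doc2", [Triple("produce", "SUBJ", "bacterium"),
                   Triple("produce", "OBJ", "ethanol")])
index.add("doc3", [Triple("inhibit", "OBJ", "enzyme")])

# Specific query: yeast produces ethanol.
specific = index.search([("produce", "SUBJ", "yeast"),
                         ("produce", "OBJ", "ethanol")])

# Generalized query: anything produces ethanol (wider recall).
general = index.search([("produce", "SUBJ", None),
                        ("produce", "OBJ", "ethanol")])

# Document frequency of a single term, as feedback during exploration.
df_ethanol = index.document_frequency(("produce", "OBJ", "ethanol"))
```

In this sketch the specific query matches only doc1, while replacing the subject with a wildcard widens the result to doc1 and doc2. Showing the document frequency of each pattern before running the full query is one way a searcher could judge whether to generalize or specialize a term.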