The PHASAR search engine

Authors:
Cornelis H. A. Koster; Olaf Seibert; Marc Seutter
Affiliation (all authors):
Department of Computer Science, Radboud University Nijmegen, Nijmegen, The Netherlands
Venue:
NLDB'06 Proceedings of the 11th international conference on Applications of Natural Language to Information Systems
Year:
2006

Abstract

This article describes the rationale behind the PHASAR system (Phrase-based Accurate Search And Retrieval), a professional Information Retrieval and Text Mining system under development for the collection of information about metabolites from the biological literature. The system is generic in nature and applicable (given suitable linguistic resources and thesauri) to many other forms of professional search. Instead of keywords, the PHASAR search engine uses Dependency Triples as terms. Both the documents and the queries are parsed, transduced to Dependency Triples and lemmatized. Queries consist of a set of Dependency Triples, whose elements may be generalized or specialized in order to achieve the desired precision and recall. In order to help in interactive exploration, the search process is supported by document frequency information from the index, both for terms from the query and for terms from the thesaurus.
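
The idea of searching with Dependency Triples rather than keywords can be illustrated with a minimal sketch. It is not the PHASAR implementation: the class name TripleIndex, its methods, the (head, relation, dependent) layout and the relation labels are all assumptions made for illustration, and parsing and lemmatization are assumed to have happened already. A query is a set of triple patterns; replacing an element by None generalizes the pattern (raising recall), filling it in specializes it (raising precision), and doc_freq reports how many documents a pattern would match, as an aid to interactive exploration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical representation: (head lemma, relation, dependent lemma).
Triple = Tuple[str, str, str]
# A pattern is a triple in which None acts as a wildcard (generalization).
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


class TripleIndex:
    """Toy inverted index that maps each dependency triple to document ids."""

    def __init__(self) -> None:
        self.postings: Dict[Triple, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, triples: Iterable[Triple]) -> None:
        # Triples are assumed to be lemmatized before indexing.
        for triple in triples:
            self.postings[triple].add(doc_id)

    def match(self, pattern: Pattern) -> Set[str]:
        # Linear scan for clarity; a real index would avoid this.
        docs: Set[str] = set()
        for triple, ids in self.postings.items():
            if all(p is None or p == t for p, t in zip(pattern, triple)):
                docs |= ids
        return docs

    def doc_freq(self, pattern: Pattern) -> int:
        # Number of documents containing at least one matching triple.
        return len(self.match(pattern))

    def search(self, query: Iterable[Pattern]) -> Set[str]:
        # A document must satisfy every pattern in the query.
        result: Optional[Set[str]] = None
        for pattern in query:
            docs = self.match(pattern)
            result = docs if result is None else result & docs
        return result or set()


# Made-up toy data, for illustration only.
index = TripleIndex()
index.add("d1", [("inhibit", "SUBJ", "aspirin"), ("inhibit", "OBJ", "enzyme")])
index.add("d2", [("inhibit", "SUBJ", "ibuprofen"), ("inhibit", "OBJ", "enzyme")])
index.add("d3", [("activate", "SUBJ", "aspirin")])

specific = index.search([("inhibit", "SUBJ", "aspirin"), ("inhibit", "OBJ", "enzyme")])
general = index.search([("inhibit", "SUBJ", None), ("inhibit", "OBJ", "enzyme")])
```

With this toy data the specific query matches only d1, while the generalized query, which leaves the subject open, also matches d2. Calling index.doc_freq on a candidate pattern before adding it to the query shows how strongly it would narrow or widen the result set.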