The patents retrieval prototype in the MOLTO project

Authors:
Milen Chechev;Meritxell Gonzàlez;Lluís Màrquez;Cristina España-Bonet
Affiliations:
Ontotext AD, Sofia, Bulgaria;Universitat Politècnica de Catalunya, Barcelona, Spain;Universitat Politècnica de Catalunya, Barcelona, Spain;Universitat Politècnica de Catalunya, Barcelona, Spain
Venue:
Proceedings of the 21st international conference companion on World Wide Web
Year:
2012

Abstract

This paper describes the patents retrieval prototype developed within the MOLTO project. The prototype provides a multilingual natural language interface for querying the content of patent documents. The system focuses on the biomedical and pharmaceutical domain and includes the translation of patent claims and abstracts into English, French and German. To obtain the best retrieval results over both the patent information and the text content, patent documents are preprocessed and semantically annotated. The annotations are then stored and indexed in an OWLIM semantic repository, which contains a patent-specific ontology along with ontologies from other domains. The prototype, accessible online at http://molto-patents.ontotext.com, presents a multilingual natural language interface for querying the retrieval system. In MOLTO, the multilingualism of the queries is addressed by means of the GF Tool, which provides an easy way to build and maintain controlled language grammars for interlingual translation in limited domains. The abstract representation obtained from GF is used to retrieve both the matching RDF instances and the list of patents semantically related to the user's search criteria. The online interface lets users browse the retrieved patents and highlights in the text the semantic annotations that explain why each patent matched the user's criteria.
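The query flow described in the abstract (a query in any supported language is reduced to one language-independent representation, which then drives retrieval from the RDF repository) can be illustrated with a minimal Python sketch. It is not the MOLTO implementation: the phrase templates are a toy stand-in for GF parsing, and the `pat:` namespace, the `pat:mentions` property, and the names `Query`, `parse` and `to_sparql` are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical prefixes; the actual patent ontology IRIs are not given in the abstract.
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX pat: <http://example.org/patent#>"
)


@dataclass(frozen=True)
class Query:
    """Toy language-independent representation: a function name and one argument."""
    fun: str
    arg: str


# Toy stand-in for a controlled-language grammar: one (prefix, suffix)
# template per language, all denoting the same abstract function.
TEMPLATES = {
    "eng": ("patents that mention ", ""),
    "fra": ("brevets qui mentionnent ", ""),
    "deu": ("Patente, die ", " erwähnen"),
}

# Assumed mapping from abstract function to ontology property.
PROPERTY_FOR = {"PatentsMentioning": "pat:mentions"}


def parse(lang: str, text: str) -> Query:
    """Map a controlled-language query to the shared abstract representation."""
    prefix, suffix = TEMPLATES[lang]
    if not (text.startswith(prefix) and text.endswith(suffix)):
        raise ValueError(f"query not covered by the {lang} template: {text!r}")
    end = len(text) - len(suffix)
    return Query("PatentsMentioning", text[len(prefix):end])


def to_sparql(query: Query) -> str:
    """Render the abstract representation as a SPARQL SELECT query."""
    prop = PROPERTY_FOR[query.fun]
    # Escape backslashes and double quotes so the literal stays well-formed.
    literal = query.arg.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"{PREFIXES}\n"
        "SELECT DISTINCT ?patent WHERE {\n"
        f"  ?patent {prop} ?entity .\n"
        f'  ?entity rdfs:label "{literal}" .\n'
        "}"
    )


if __name__ == "__main__":
    print(to_sparql(parse("deu", "Patente, die Aspirin erwähnen")))
```

Because all three templates produce the same `Query` value, the retrieval step never needs to know which language the user typed in. In this sketch that is the only property borrowed from the interlingual design the abstract describes.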