Handling noisy queries in cross language FAQ retrieval

  • Authors:
  • Danish Contractor;Govind Kothari;Tanveer A. Faruquie;L. Venkata Subramaniam;Sumit Negi

  • Affiliations:
  • IBM Research India, New Delhi, India;IBM Research India, New Delhi, India;IBM Research India, New Delhi, India;IBM Research India, New Delhi, India;IBM Research India, New Delhi, India

  • Venue:
  • EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
  • Year:
  • 2010

Abstract

Mobile data services that people access through the Short Message Service (SMS) have grown tremendously in recent years. In a multilingual society, it is essential that data services developed for a specific language also be made accessible through other local languages. In this paper, we present a service that allows a user to query a Frequently Asked Questions (FAQ) database built in a local language (Hindi) using noisy English SMS queries. The inherent noise in SMS queries, together with the language mismatch, makes this a challenging problem. We handle these two problems by formulating query similarity over FAQ questions as a combinatorial search problem in which the search space consists of combinations of dictionary variations of the noisy query and its top-N translations. We demonstrate the effectiveness of our approach on a real-life dataset.
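The search space described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm. It assumes a toy English vocabulary, a made-up English-to-Hindi translation table (romanized, with invented probabilities), and a three-question FAQ list. The string-similarity measure (`difflib.SequenceMatcher`), the thresholds, and the per-token scoring are stand-ins chosen for illustration, and all function names are hypothetical.

```python
from difflib import SequenceMatcher

# Toy data, invented for illustration only (not taken from the paper).
VOCAB = ["what", "is", "the", "train", "time", "fare", "ticket"]
TRANSLATIONS = {  # English word -> ranked (romanized Hindi term, probability)
    "train": [("rail", 0.6), ("gaadi", 0.4)],
    "time": [("samay", 0.7), ("vakt", 0.3)],
    "fare": [("kiraya", 1.0)],
    "ticket": [("tikat", 1.0)],
}
FAQ = [
    "rail ka samay kya hai",
    "rail ka kiraya kitna hai",
    "tikat kaise radd karen",
]


def variants(token, k=2, threshold=0.5):
    """Dictionary variations: the k vocabulary words most similar to a noisy token."""
    scored = [(w, SequenceMatcher(None, token, w).ratio()) for w in VOCAB]
    scored = [(w, s) for w, s in scored if s >= threshold]
    return sorted(scored, key=lambda pair: -pair[1])[:k]


def candidates(token, top_n=2):
    """Combine each variation with its top-N translations into weighted target terms."""
    weights = {}
    for word, similarity in variants(token):
        for term, prob in TRANSLATIONS.get(word, [])[:top_n]:
            weights[term] = max(weights.get(term, 0.0), similarity * prob)
    return weights


def score(query, question):
    """Sum, over query tokens, the best candidate weight that occurs in the question."""
    question_terms = set(question.split())
    total = 0.0
    for token in query.split():
        matches = [w for t, w in candidates(token).items() if t in question_terms]
        total += max(matches, default=0.0)
    return total


def best_faq(query):
    """Return the FAQ question with the highest score for the noisy query."""
    return max(FAQ, key=lambda question: score(query, question))


print(best_faq("trn tym"))
```

In this sketch each noisy token is expanded independently and the best-matching candidate per token is summed, which keeps the search cheap. Searching jointly over whole-query combinations, as the abstract describes, would require a more careful pruning strategy than shown here.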