Shedding Light on a Troublesome Issue in NLIDBs

  • Authors:
  • Rodolfo Pazos;René Santaolalaya S.;Juan C. Rojas P.;Joaquín Pérez O.

  • Affiliations:
  • Departamento de Ciencias Computacionales, Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET), Cuernavaca 62490, México AP 5-164 and Instituto Tecnológico de Ci ...;Departamento de Ciencias Computacionales, Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET), Cuernavaca 62490, México AP 5-164;Departamento de Ciencias Computacionales, Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET), Cuernavaca 62490, México AP 5-164;Departamento de Ciencias Computacionales, Centro Nacional de Investigación y Desarrollo Tecnológico (CENIDET), Cuernavaca 62490, México AP 5-164

  • Venue:
  • TSD '08 Proceedings of the 11th international conference on Text, Speech and Dialogue
  • Year:
  • 2008

Abstract

A natural language interface to databases (NLIDB) without help mechanisms that permit clarifying queries is prone to incorrect query translation. In this paper we draw attention to a problem in NLIDBs that has been overlooked and has not been dealt with systematically: word economy, i.e., the omission of words when expressing a query in natural language (NL). To get an idea of the magnitude of this problem, we conducted experiments with EnglishQuery on a corpus of economized-wording queries. The results show that the percentage of correctly answered queries is 18%, which is substantially lower than the percentages obtained with corpora of regular queries (53% to 83%). We describe a typification of the problems found in economized-wording queries, which has been used to implement domain-independent dialog processes for an NLIDB in Spanish. Incorporating dialog processes in an NLIDB permits users to clarify queries in NL, thus improving the percentage of correctly answered queries. This paper presents the tests of a dialog manager that deals with four types of query problems and improves the percentage of correctly answered queries from 60% to 91%. Due to the generality of our approach, we claim that it can be applied to other domain-dependent or domain-independent NLIDBs, as well as to other NLs such as English, French, and Italian.