Automatically generating structured queries in XML keyword search

  • Authors:
  • Felipe Da C. Hummel;Altigran S. Da Silva;Mirella M. Moro;Alberto H. F. Laender

  • Affiliations:
  • Departamento de Ciência da Computação, Universidade Federal do Amazonas, Manaus, Brazil;Departamento de Ciência da Computação, Universidade Federal do Amazonas, Manaus, Brazil;Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil;Departamento de Ciência da Computação, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil

  • Venue:
  • INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval
  • Year:
  • 2010

Abstract

In this paper, we present a novel method for automatically deriving structured XML queries from keyword-based queries and show how it was applied to the experimental tasks proposed for the INEX 2010 data-centric track. In our method, called StruX, the user specifies a schema-independent, unstructured keyword-based query, and the method automatically generates a top-k ranking of schema-aware queries based on a target XML database. One of the top-ranked structured queries can then be selected, either automatically or by the user, to be executed by an XML query engine. The generated structured queries are XPath expressions consisting of an entity path (e.g., /dblp/article) and predicates (e.g., /dblp/article[author="john" and title="xml"]). We use the concept of entity, commonly adopted in the XML keyword search literature, to define suitable root nodes for the query results. StruX also uses IR techniques to determine in which elements a term is more likely to occur.
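To make the idea of ranking candidate structured queries concrete, the following is a minimal sketch and not the StruX algorithm itself. It assumes a single entity path (/dblp/article), a hypothetical table of per-element term counts (TERM_COUNTS), and a simple add-one smoothed relative frequency as a stand-in for the IR techniques mentioned in the abstract. The function names term_score and top_k_queries are illustrative only.

```python
from itertools import product

# Hypothetical statistics (not from the paper): how often each term occurs
# in the text of each child element of the entity /dblp/article.
TERM_COUNTS = {
    "author": {"john": 40, "xml": 0, "smith": 25},
    "title": {"john": 2, "xml": 60, "smith": 1},
}


def term_score(term, element):
    """Add-one smoothed relative frequency of `term` within `element`.

    This is an assumed stand-in for whatever IR scoring StruX uses.
    """
    counts = TERM_COUNTS[element]
    total = sum(counts.values())
    return (counts.get(term, 0) + 1) / (total + len(counts))


def top_k_queries(keywords, entity_path="/dblp/article", k=3):
    """Return the k best (score, xpath) pairs for the given keywords.

    Every keyword is assigned to one element; each assignment yields one
    candidate XPath query whose score is the product of the term scores.
    """
    elements = sorted(TERM_COUNTS)
    candidates = []
    for assignment in product(elements, repeat=len(keywords)):
        score = 1.0
        for term, element in zip(keywords, assignment):
            score *= term_score(term, element)
        predicate = " and ".join(
            f'{element}="{term}"' for term, element in zip(keywords, assignment)
        )
        candidates.append((score, f"{entity_path}[{predicate}]"))
    # Highest-scoring candidates first.
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return candidates[:k]


if __name__ == "__main__":
    for score, xpath in top_k_queries(["john", "xml"]):
        print(f"{score:.4f}  {xpath}")
```

With these assumed counts, the keyword query "john xml" ranks /dblp/article[author="john" and title="xml"] first, matching the predicate example given in the abstract. The sketch enumerates every assignment of keywords to elements, which is exponential in the number of keywords, and it does not escape quotes inside terms; both would need attention in a real system.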