KESOSD: keyword search over structured data

  • Authors:
  • Jaime I. Lopez-Veyna;Victor J. Sosa-Sosa;Ivan Lopez-Arevalo

  • Affiliations:
  • Information Technology Laboratory, Center of Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Cd. Victoria, Tamaulipas, Mexico;Information Technology Laboratory, Center of Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Cd. Victoria, Tamaulipas, Mexico;Information Technology Laboratory, Center of Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV), Cd. Victoria, Tamaulipas, Mexico

  • Venue:
  • KEYS '12 Proceedings of the Third International Workshop on Keyword Search on Structured Data
  • Year:
  • 2012

Abstract

Most information on the Web can currently be classified by its structure into three forms: unstructured (plain text), semi-structured (XML files), and structured (tables in a relational database). Web search is currently the primary way to access massive amounts of information. Keyword search has also become an alternative way of querying relational databases and XML documents, one that is simple for people who are familiar with Web search engines. There are several approaches to keyword search over relational databases, such as Steiner Trees, Candidate Networks, and Tuple Units. However, these methods have some constraints. Finding Steiner Trees is an NP-hard problem; moreover, a real database can produce a large number of Steiner Trees, which are difficult to identify and index. The Candidate Network approach must first generate the candidate networks and then evaluate them to find the best answer. The problem is that, for a keyword query, the number of Candidate Networks can be very large, and finding a common join expression to evaluate all of them can require considerable computational effort. Finally, Tuple Units, in their general conception, produce very large structures that most of the time store redundant information. To address these problems, we propose a novel approach for keyword search over structured data (KESOSD). KESOSD models structured information as graphs and proposes the use of a keyword-structure-aware index, called KSAI, that captures the implicit structural relationships of the information, producing fast and accurate search responses. We have conducted experiments, and the results show that KESOSD achieves high search efficiency and high accuracy for keyword search over structured data.
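
The general idea of graph-based keyword search can be illustrated with a minimal sketch. It is not the authors' KSAI index, whose design the abstract does not describe. It assumes that each tuple is a graph node, that foreign-key references are undirected edges, and that answers are ranked by summed hop distance to the nearest match of each keyword. The toy data and the function names (`build_graph`, `build_inverted_index`, `distances_from`, `search`) are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical toy data: each tuple is a graph node carrying some text.
tuples = {
    "author:1": "Jaime Lopez",
    "author:2": "Victor Sosa",
    "paper:1": "keyword search over structured data",
    "paper:2": "graph indexing for XML",
}
# Hypothetical foreign-key references, treated as undirected edges.
links = [
    ("author:1", "paper:1"),
    ("author:2", "paper:1"),
    ("author:2", "paper:2"),
]


def build_graph(links):
    """Adjacency sets for an undirected tuple graph."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def build_inverted_index(tuples):
    """Map each lowercased word to the set of nodes whose text contains it."""
    index = defaultdict(set)
    for node, text in tuples.items():
        for word in text.lower().split():
            index[word].add(node)
    return index


def distances_from(graph, sources):
    """Multi-source BFS: hop distance from each node to its nearest source."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def search(keywords, graph, index):
    """Return (score, node) pairs sorted by score; lower means closer.

    A node's score is the sum, over all keywords, of its hop distance
    to the nearest node matching that keyword (an assumed ranking).
    """
    per_keyword = []
    for kw in keywords:
        matches = index.get(kw.lower(), set())
        if not matches:
            return []  # a keyword with no match yields no answer
        per_keyword.append(distances_from(graph, matches))
    # Keep only nodes that can reach a match for every keyword.
    common = set(per_keyword[0])
    for dist in per_keyword[1:]:
        common &= set(dist)
    return sorted((sum(d[n] for d in per_keyword), n) for n in common)


graph = build_graph(links)
index = build_inverted_index(tuples)
results = search(["sosa", "xml"], graph, index)
```

For the query `["sosa", "xml"]`, the two top-ranked nodes are the matching author tuple and the matching paper tuple, which are directly linked. A real system would return connected substructures rather than single nodes and would precompute far more than this per-query breadth-first search.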