Efficient keyword proximity search using a frontier-reduce strategy based on d-distance graph index

Authors:
Ming Zhong;Mengchi Liu
Affiliations:
Wuhan University, Wuhan, China;Carleton University, Ottawa, Canada
Venue:
IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
Year:
2009

Abstract

Current keyword proximity search approaches on general graphs lack effective means of reducing the search space, and therefore become inefficient when that space is large. In this paper, we present a novel approach to address this problem. Our approach employs a best-effort frontier-reduce strategy that aims to find a set of subgraphs containing the best answers, so that only these small subgraphs need to be searched to obtain the top-k answers, which significantly improves efficiency. To realize this strategy, we define a d-distance subgraph with an upper bound on its size, and extract such subgraphs from the graph to build a new index structure that combines the mappings between keywords, vertices, and subgraphs, allowing the target subgraphs for a given query to be looked up quickly. We then apply an efficient algorithm to find the top-k answers that overcomes the subgraph overlap problem and supports existing optimal prioritization techniques. Finally, we evaluate the effectiveness and efficiency of our approach with extensive experiments. The experimental results show that our approach outperforms existing approaches by a large margin with little or no loss of answer quality.
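
The indexing idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formal definitions, so everything below is an assumption made for illustration: a "d-distance subgraph" is taken to be the set of vertices within d hops of a center vertex, truncated in BFS order to at most `max_size` vertices, and the index is modeled as two plain mappings (keyword to vertices, vertex to subgraphs). The function names `d_distance_subgraph`, `build_index`, and `candidate_subgraphs` are hypothetical, and the paper's top-k search and its handling of subgraph overlap are not reproduced here.

```python
from collections import defaultdict, deque

def d_distance_subgraph(adj, center, d, max_size):
    """Assumed definition: vertices within d hops of `center`,
    cut off in BFS order once max_size vertices are collected."""
    seen = {center}
    order = [center]
    queue = deque([(center, 0)])
    while queue and len(order) < max_size:
        v, dist = queue.popleft()
        if dist == d:
            continue  # do not expand beyond d hops
        for w in adj.get(v, ()):
            if w not in seen and len(order) < max_size:
                seen.add(w)
                order.append(w)
                queue.append((w, dist + 1))
    return frozenset(order)

def build_index(adj, labels, d, max_size):
    """Build the two mappings: keyword -> vertices and vertex -> subgraph ids.
    A subgraph is identified by its center vertex (an assumption)."""
    subgraphs = {}
    vertex_to_subgraphs = defaultdict(set)
    keyword_to_vertices = defaultdict(set)
    for center in adj:
        members = d_distance_subgraph(adj, center, d, max_size)
        subgraphs[center] = members
        for u in members:
            vertex_to_subgraphs[u].add(center)
    for v, words in labels.items():
        for word in words:
            keyword_to_vertices[word].add(v)
    return subgraphs, vertex_to_subgraphs, keyword_to_vertices

def candidate_subgraphs(query, vertex_to_subgraphs, keyword_to_vertices):
    """Ids of subgraphs that contain at least one vertex for every query keyword."""
    result = None
    for keyword in query:
        ids = set()
        for v in keyword_to_vertices.get(keyword, ()):
            ids |= vertex_to_subgraphs.get(v, set())
        result = ids if result is None else result & ids
    return result or set()

# Toy undirected path graph a - b - c - d - e with keyword labels on three vertices.
adj = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
labels = {"a": {"xml"}, "c": {"graph"}, "e": {"index"}}

subgraphs, v2s, k2v = build_index(adj, labels, d=1, max_size=3)
near = candidate_subgraphs(["xml", "graph"], v2s, k2v)
far = candidate_subgraphs(["xml", "index"], v2s, k2v)
```

In this toy graph, only the subgraph centered at `b` covers both "xml" and "graph", so a subsequent search would be confined to that one small subgraph. The keywords "xml" and "index" are four hops apart, so no 1-distance subgraph covers both and the candidate set is empty.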