Flexible and efficient XML search with complex full-text predicates

Authors:
Sihem Amer-Yahia;Emiran Curtmola;Alin Deutsch
Affiliations:
AT&T Labs Research;University of California, San Diego;University of California, San Diego
Venue:
Proceedings of the 2006 ACM SIGMOD international conference on Management of data
Year:
2006

Citing 18
Cited 20

Fast evaluation of structured queries for information retrieval

SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
A probabilistic relational algebra for the integration of information retrieval and database systems

ACM Transactions on Information Systems (TOIS)
Algebras for querying text regions: expressive power and optimization

Journal of Computer and System Sciences - Fourteenth ACM SIGACT-SIGMOD-SIGART symposium on principles of database systems
Integrating keyword search into XML query processing

Proceedings of the 9th international World Wide Web conference on Computer networks : the international journal of computer and telecommunications networking
On supporting containment queries in relational database management systems

SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Modern Information Retrieval

Modern Information Retrieval
The Index-Based XXL Search Engine for Querying XML Data with Relevance Ranking

EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
Querying XML Documents Made Easy: Nearest Concept Queries

Proceedings of the 17th International Conference on Data Engineering
Searching XML documents via XML fragments

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Querying structured text in an XML database

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
XRANK: ranked keyword search over XML documents

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
TOSS: an extension of TAX with ontologies and similarity queries

SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
Efficient keyword search for smallest LCAs in XML databases

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Controlling overlap in content-oriented XML retrieval

Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Structure and content scoring for XML

VLDB '05 Proceedings of the 31st international conference on Very large data bases
XSEarch: a semantic search engine for XML

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Schema-free XQuery

VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
An algebra for structured queries in Bayesian networks

INEX'04 Proceedings of the Third international conference on Initiative for the Evaluation of XML Retrieval

XML search: languages, INEX and scoring

ACM SIGMOD Record
Enabling structural summaries for efficient update and workload adaptation

Data & Knowledge Engineering
AJAXSearch: crawling, indexing and searching web 2.0 applications

Proceedings of the VLDB Endowment
XTreeNet: democratic community search

Proceedings of the VLDB Endowment
Finding frequent co-occurring terms in relational keyword search

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
SAIL: Structure-aware indexing for effective and progressive top-k keyword search over XML documents

Information Sciences: an International Journal
Space-economical partial gram indices for exact substring matching

Proceedings of the 18th ACM conference on Information and knowledge management
Finding and ranking compact connected trees for effective keyword proximity search in XML documents

Information Systems
Efficient keyword search over data-centric XML documents

APWeb/WAIM'07 Proceedings of the joint 9th Asia-Pacific web and 8th international conference on web-age information management conference on Advances in data and web management
OOXsearch: a search engine for answering loosely structured XML queries using OO programming

BNCOD'07 Proceedings of the 24th British national conference on Databases
Query and update through XML views

DNIS'07 Proceedings of the 5th international conference on Databases in networked information systems
A ranking scheme for XML information retrieval based on benefit and reading effort

ICADL'07 Proceedings of the 10th international conference on Asian digital libraries: looking back 10 years and forging new frontiers
Structural consistency: enabling XML keyword search to eliminate spurious results consistently

The VLDB Journal — The International Journal on Very Large Data Bases
Predicate-based indexing for desktop search

The VLDB Journal — The International Journal on Very Large Data Bases
BusSEngine: a business search engine

Knowledge and Information Systems
Updating XML views and querying XML views with update syntax

International Journal of Computational Science and Engineering
Score-consistent algebraic optimization of full-text search queries with GRAFT

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
BUAP: a first approach to the data-centric track of INEX 2010

INEX'10 Proceedings of the 9th international conference on Initiative for the evaluation of XML retrieval: comparative evaluation of focused retrieval
Kikori-KS: an effective and efficient keyword search system for digital libraries in XML

ICADL'06 Proceedings of the 9th international conference on Asian Digital Libraries: achievements, Challenges and Opportunities
Optimizing XML twig queries with full-text predicates

ACM SIGMOD Record

Quantified Score

Hi-index	0.01

Abstract

Recently, there has been extensive research that generated a wealth of new XML full-text query languages, ranging from simple Boolean search to combining sophisticated proximity and order predicates on keywords. While computing least common ancestors of query terms was proposed for efficient evaluation of conjunctive keyword queries by exploiting the document structure, no such solution was developed to evaluate complex full-text queries. We present efficient evaluation algorithms based on a formalization of XML queries in terms of keyword patterns and an algebra which manipulates pattern matches. Our algebra captures most existing languages and their varying semantics, and our algorithms combine relational query evaluation techniques with the exploitation of document structure to process queries with complex full-text predicates. We show how scoring can be incorporated into our framework without compromising the algorithms' complexity. Our experiments show that considering element nesting dramatically improves the performance of queries with complex full-text predicates.
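
The least-common-ancestor idea that the abstract mentions for conjunctive keyword queries can be illustrated with a minimal sketch. This is not the paper's algebra or its evaluation algorithms. It assumes each keyword occurrence is identified by a Dewey-style path (a tuple of child positions from the root), and the names `lca`, `conjunctive_lcas`, and `postings` are hypothetical, introduced only for this illustration.

```python
from itertools import product


def lca(a, b):
    """Return the longest common prefix of two Dewey paths.

    With Dewey-style identifiers (an assumption of this sketch), the
    common prefix of two paths identifies their least common ancestor.
    """
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return tuple(prefix)


def conjunctive_lcas(postings, keywords):
    """Naively collect the LCA of every combination of one match per keyword.

    `postings` maps a keyword to the Dewey paths of the nodes containing it.
    If any keyword has no matches, no combination exists and the result is
    empty. This brute-force enumeration is for illustration only.
    """
    if not keywords:
        return []
    lists = [postings.get(k, []) for k in keywords]
    result = set()
    for combo in product(*lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        result.add(node)
    return sorted(result)


# Hypothetical toy index: two occurrences of each keyword.
postings = {
    "xml": [(1, 1, 2), (1, 3, 1)],
    "search": [(1, 1, 3), (1, 2, 1)],
}
print(conjunctive_lcas(postings, ["xml", "search"]))
```

The sketch enumerates every combination of matches, which is quadratic or worse in the posting-list sizes. It shows only what document structure contributes to a conjunctive query, and does not cover the proximity and order predicates that the paper targets.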