Using structural information in XML keyword search effectively

Authors:
Arash Termehchy; Marianne Winslett
Affiliations:
University of Illinois at Urbana-Champaign, Urbana, IL; University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2011

Abstract

The popularity of XML has exacerbated the need for an easy-to-use, high precision query interface for XML data. When traditional document-oriented keyword search techniques do not suffice, natural language interfaces and keyword search techniques that take advantage of XML structure make it very easy for ordinary users to query XML databases. Unfortunately, current approaches to processing these queries rely heavily on heuristics that are intuitively appealing but ultimately ad hoc. These approaches often retrieve false positive answers, overlook correct answers, and cannot rank answers appropriately. To address these problems for data-centric XML, we propose coherency ranking (CR), a domain- and database design-independent ranking method for XML keyword queries that is based on an extension of the concepts of data dependencies and mutual information. With coherency ranking, the results of a keyword query are invariant under a class of equivalency-preserving schema reorganizations. We analyze the way in which previous approaches to XML keyword search approximate coherency ranking, and present efficient algorithms to process queries and rank their answers using coherency ranking. Our empirical evaluation with two real-world XML data sets shows that coherency ranking has better precision and recall and provides better ranking than all previous approaches.