Querying structured text in an XML database

Authors:
Shurug Al-Khalifa;Cong Yu;H. V. Jagadish
Affiliations:
University of Michigan, Ann Arbor, MI;University of Michigan, Ann Arbor, MI;University of Michigan, Ann Arbor, MI
Venue:
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Year:
2003

Abstract

XML databases often contain documents comprising structured text. Therefore, it is important to integrate "information retrieval style" query evaluation, which is well-suited for natural language text, with standard "database style" query evaluation, which handles structured queries efficiently. Relevance scoring is central to information retrieval. In the case of XML, this operation becomes more complex because the data required for scoring may reside not only directly in an element itself but also in its descendant elements.

In this paper, we propose a bulk algebra, TIX, and describe how it can be used as a basis for integrating information retrieval techniques into a standard pipelined database query evaluation engine. We develop new evaluation strategies essential to obtaining good performance, including a stack-based TermJoin algorithm for efficiently scoring composite elements. We report results from an extensive experimental evaluation, which show, among other things, that the new TermJoin access method outperforms a direct implementation of the same functionality using standard operators by a large factor.
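The abstract does not spell out how TermJoin works, so the following is only an illustrative sketch of the general idea of stack-based scoring of composite elements, not the authors' algorithm. It assumes a hypothetical region encoding in which every element carries `(name, start, end)` positions, and a position-sorted list of query-term occurrences such as an inverted index might supply. The function name `term_counts` and the sample data are invented for illustration.

```python
from typing import Dict, List, Tuple

# Assumed region encoding (hypothetical): an element (name, start, end)
# contains position p when start < p < end; nesting follows the intervals.
Element = Tuple[str, int, int]


def term_counts(elements: List[Element],
                occurrences: List[Tuple[int, str]]) -> Dict[Element, Dict[str, int]]:
    """Count term occurrences inside each element, descendants included.

    `elements` must be sorted by start and `occurrences` by position.
    A stack holds the chain of currently open elements, so both inputs
    are scanned exactly once.
    """
    counts: Dict[Element, Dict[str, int]] = {}
    stack: List[Element] = []

    def close_until(pos: float) -> None:
        # Pop every open element that ends before `pos` and fold its
        # totals into its parent, which is the next element on the stack.
        while stack and stack[-1][2] < pos:
            done = stack.pop()
            if stack:
                parent = counts[stack[-1]]
                for term, n in counts[done].items():
                    parent[term] = parent.get(term, 0) + n

    def open_element(elem: Element) -> None:
        close_until(elem[1])
        stack.append(elem)
        counts[elem] = {}

    i = 0
    for pos, term in occurrences:
        # Open all elements that start before this occurrence.
        while i < len(elements) and elements[i][1] < pos:
            open_element(elements[i])
            i += 1
        close_until(pos)
        if stack:
            # Credit the innermost open element; ancestors are credited
            # later, when this element is popped.
            innermost = counts[stack[-1]]
            innermost[term] = innermost.get(term, 0) + 1

    # Open any elements that follow the last occurrence, then flush.
    while i < len(elements):
        open_element(elements[i])
        i += 1
    close_until(float("inf"))
    return counts


article = ("article", 1, 20)
section_a = ("section", 2, 9)
para = ("para", 3, 6)
section_b = ("section", 10, 19)

result = term_counts(
    [article, section_a, para, section_b],
    [(4, "xml"), (7, "query"), (12, "xml"), (15, "xml")],
)
```

In this sketch each occurrence is credited once to its innermost enclosing element, and totals are pushed to the parent only when an element closes. A relevance score for a composite element could then be computed from its per-term totals, for example as a weighted sum. The weighting scheme is left open here because the abstract does not specify one.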