HID: an efficient path index for complex XML collections with arbitrary links

Authors:
Awny Sayed;Rainer Unland
Affiliations:
Institute for Computer Science and Business Information Systems, University of Duisburg-Essen, Essen, Germany;Institute for Computer Science and Business Information Systems, University of Duisburg-Essen, Essen, Germany
Venue:
DNIS'05 Proceedings of the 4th international conference on Databases in Networked Information Systems
Year:
2005

Citing 19
Cited 2

Data on the Web: from relations to semistructured data and XML
Compact labeling schemes for ancestor queries. SODA '01 Proceedings of the twelfth annual ACM-SIAM symposium on Discrete algorithms
On supporting containment queries in relational database management systems. SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
XRel: a path-based approach to storage and retrieval of XML documents using relational databases. ACM Transactions on Internet Technology (TOIT)
Labeling dynamic XML trees. Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Reachability and distance queries via 2-hop labels. SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
A comparison of labeling schemes for ancestor queries. SODA '02 Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms
APEX: an adaptive path index for XML data. Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Covering indexes for branching path queries. Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Storing and querying ordered XML using a relational database system. Proceedings of the 2002 ACM SIGMOD international conference on Management of data
Introduction to Algorithms
Index Structures for Path Expressions. ICDT '99 Proceedings of the 7th International Conference on Database Theory
DataGuides: Enabling Query Formulation and Optimization in Semistructured Databases. VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Indexing and Querying XML Data for Regular Path Expressions. Proceedings of the 27th International Conference on Very Large Data Bases
A Fast Index for Semistructured Data. Proceedings of the 27th International Conference on Very Large Data Bases
Short and Simple Labels for Small Distances and Other Functions. WADS '01 Proceedings of the 7th International Workshop on Algorithms and Data Structures
D(k)-index: an adaptive structural summary for graph-structured data. Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Exploiting Local Similarity for Indexing Paths in Graph-Structured Data. ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Efficient structural joins on indexed XML documents. VLDB '02 Proceedings of the 28th international conference on Very Large Data Bases

Compact reachability labeling for graph-structured data. Proceedings of the 14th ACM international conference on Information and knowledge management
Two-Phase path retrieval method for similar XML document retrieval. KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I

Abstract

The increasing popularity of XML has generated considerable interest in query processing over graph-structured data. To support efficient evaluation of path expressions, structural indexes have been proposed. However, most variants of structural indexes ignore inter- or intra-document references and assume that XML documents are tree-structured. Extending these indexes to work with large XML graphs and to support intra- or inter-document links requires substantial computing power to build the indexes and substantial space to store them. Moreover, efficiently evaluating ancestor-descendant queries over arbitrary graphs with long paths is a severe problem. In this paper, we propose a scalable connection index based on the concept of 2-hop covers introduced by Cohen et al. The proposed index-creation algorithm substantially reduces the size of the original graph, yielding a directed acyclic graph with fewer nodes and edges. This reduces the number of computing steps required to build the index, and thus the computing time and space as well. The index also permits efficient evaluation of ancestor-descendant relationships. Moreover, in contrast to most other work, the proposed index is optimized for descendant-or-self queries on arbitrary graphs with link relationships.
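
The 2-hop cover idea the abstract builds on can be illustrated with a small sketch. Each node u keeps two label sets, L_out(u) and L_in(u), and u reaches v exactly when the two sets share a "hub" node. The code below is not the HID construction from the paper: it assumes a plain edge list as input and builds the labels with a pruned breadth-first sweep, a simple swapped-in technique chosen only for illustration. The names `build_two_hop_labels` and `reaches` are hypothetical.

```python
from collections import deque


def build_two_hop_labels(nodes, edges):
    """Build 2-hop reachability labels (illustrative, not the paper's algorithm).

    Returns (l_out, l_in), where l_out[u] holds hubs reachable from u and
    l_in[v] holds hubs that reach v. Every node is a hub for itself.
    """
    succ = {n: [] for n in nodes}
    pred = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        pred[v].append(u)

    l_out = {n: set() for n in nodes}
    l_in = {n: set() for n in nodes}

    def covered(u, v):
        # True if the labels built so far already prove that u reaches v.
        return not l_out[u].isdisjoint(l_in[v])

    def sweep(center, adjacency, forward):
        # Breadth-first sweep from `center`; stop expanding at nodes whose
        # connection to `center` is already covered by an earlier hub.
        queue = deque([center])
        seen = {center}
        while queue:
            x = queue.popleft()
            if x != center:
                already = covered(center, x) if forward else covered(x, center)
                if already:
                    continue
            (l_in if forward else l_out)[x].add(center)
            for y in adjacency[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)

    # Assumption: process high-degree nodes first so they become shared hubs.
    order = sorted(nodes, key=lambda n: len(succ[n]) + len(pred[n]), reverse=True)
    for center in order:
        sweep(center, succ, forward=True)
        sweep(center, pred, forward=False)
    return l_out, l_in


def reaches(l_out, l_in, u, v):
    """Descendant-or-self test: does u reach v (including u == v)?"""
    return not l_out[u].isdisjoint(l_in[v])


if __name__ == "__main__":
    # Hypothetical element graph: tree edges plus a link c -> a forming a cycle.
    nodes = ["a", "b", "c", "d", "e"]
    edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("e", "d")]
    l_out, l_in = build_two_hop_labels(nodes, edges)
    print(reaches(l_out, l_in, "a", "d"))
    print(reaches(l_out, l_in, "d", "a"))
```

A query touches only the two label sets involved, so no graph traversal is needed at query time. How compact the labels are depends on the hub order, which is where a real construction such as the one proposed in the paper would differ from this sketch.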