Efficient indexing and querying of XML data using modified Prüfer sequences

Authors:
K. Hima Prasad; P. Sreenivasa Kumar
Affiliations:
Indian Institute of Technology Madras, Chennai, India (both authors)
Venue:
Proceedings of the 14th ACM International Conference on Information and Knowledge Management
Year:
2005

Abstract

With the advent of XML as the new standard for information representation and exchange, indexing and querying XML data have become a major concern. In this paper, we propose a method for representing an XML document as a sequence based on a variation of Prüfer sequences. We incorporate new components into the node encodings, such as the level and the number of a certain kind of descendants, and develop methods for holistic processing of tree pattern queries. Query processing involves converting the query into a sequence as well and performing subsequence matching on the document sequence. We establish certain properties of the proposed sequencing method that give rise to a new, efficient pattern matching algorithm. The sequence data is stored in two-level B+-trees to support query processing. We also propose an optimization for the parent-child axis to speed up query processing. Our approach does not require any post-processing and guarantees results that are free of false positives and duplicates. Experimental results show that our system performs significantly better than previous systems in a large number of cases.
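
The general idea of sequencing a tree and matching a query by subsequence search can be illustrated with a minimal sketch. It uses the classical Prüfer-style construction (repeatedly delete the smallest-numbered leaf and record its parent), continued until only the root is left. It is not the modified encoding proposed in the paper, whose extra components (level, descendant counts) are not specified in the abstract. The function names `prufer_sequence` and `is_subsequence`, the dictionary-based tree representation, and the postorder numbering of the example are all assumptions made for illustration.

```python
import heapq


def prufer_sequence(parent):
    """Return a Prüfer-style sequence for a rooted tree.

    `parent` maps each node number to its parent's number; the root
    maps to None. This is an illustrative representation, not the
    paper's. Leaves are deleted in increasing order of their number
    until only the root remains, so the result has n - 1 entries.
    """
    child_count = {node: 0 for node in parent}
    for node, par in parent.items():
        if par is not None:
            child_count[par] += 1

    # Min-heap of current leaves, so the smallest-numbered leaf goes first.
    leaves = [node for node, count in child_count.items() if count == 0]
    heapq.heapify(leaves)

    sequence = []
    while leaves:
        node = heapq.heappop(leaves)
        par = parent[node]
        if par is None:
            break  # only the root is left
        sequence.append(par)
        child_count[par] -= 1
        if child_count[par] == 0:
            heapq.heappush(leaves, par)  # parent has become a leaf
    return sequence


def is_subsequence(query, document):
    """True if `query` occurs in `document` as a (not necessarily
    contiguous) subsequence, preserving order."""
    remaining = iter(document)
    return all(any(q == d for d in remaining) for q in query)


# Hypothetical document tree, numbered in postorder:
# r(5) has children a(3) and d(4); a(3) has children b(1) and c(2).
doc_parent = {1: 3, 2: 3, 3: 5, 4: 5, 5: None}
doc_tag = {1: "b", 2: "c", 3: "a", 4: "d", 5: "r"}
doc_numbers = prufer_sequence(doc_parent)
doc_labels = [doc_tag[n] for n in doc_numbers]

# Hypothetical query twig r/a/b, also numbered in postorder.
query_parent = {1: 2, 2: 3, 3: None}
query_tag = {1: "b", 2: "a", 3: "r"}
query_labels = [query_tag[n] for n in prufer_sequence(query_parent)]

print(doc_numbers, doc_labels, query_labels)
print(is_subsequence(query_labels, doc_labels))
```

In this sketch a subsequence hit on tag labels is only a necessary condition for a structural match, so a plain implementation like this would still need a verification step. The abstract states that the paper's richer node encodings remove the need for such post-processing.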