Storing and querying of XML documents without redundant path information

Authors:
Byeong-Soo Jeong;Young-Koo Lee
Affiliations:
College of Electronics and Information, Kyung Hee University, Kyung-gi, Korea;College of Electronics and Information, Kyung Hee University, Kyung-gi, Korea
Venue:
ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part II
Year:
2006

Abstract

We propose an improved approach for storing and querying large volumes of XML documents in a relational database. The approach removes redundant path information and builds an inverted index over the reduced path information. To store an XML document in a relational database, the document is decomposed into nodes according to its tree structure, and each node is stored in relational tables together with the path from the root node to that node. Existing XML storage methods that use the relational data model usually store path information for every node, which increases storage overhead and degrades query processing performance as the data volume grows. Our approach stores path information only for the leaf nodes of the XML tree and derives the path information of internal nodes from the leaf node paths. In this way, it reduces the data volume for large XML document collections to some extent, and it also shrinks the inverted index on path information because the posting lists for its keywords become smaller. We show the effectiveness of the approach through several experiments that compare its XPath query performance against existing methods.
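
The core idea of the abstract, keeping only leaf node paths and recovering internal node paths from them, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the relational schema or the index layout, so the function names (`leaf_paths`, `internal_paths`, `build_inverted_index`), the sample document, and the choice of tag names as index keywords are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def leaf_paths(xml_text):
    """Return the root-to-leaf label paths of a document, in document order."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        children = list(node)
        if not children:
            # Only leaf nodes contribute a stored path.
            paths.append(path)
        for child in children:
            walk(child, path)

    walk(root, "")
    return paths


def internal_paths(leaf_path_list):
    """Derive internal node paths as the proper prefixes of stored leaf paths."""
    derived = set()
    for path in leaf_path_list:
        steps = path.strip("/").split("/")
        for i in range(1, len(steps)):
            derived.add("/" + "/".join(steps[:i]))
    return derived


def build_inverted_index(leaf_path_list):
    """Map each tag name (assumed keyword) to the distinct leaf paths containing it."""
    index = defaultdict(set)
    for path in set(leaf_path_list):
        for tag in path.strip("/").split("/"):
            index[tag].add(path)
    return index


# Hypothetical sample document, not taken from the paper.
DOC = (
    "<book><title>XML</title>"
    "<author><name>Lee</name><email>a@b</email></author></book>"
)

stored = leaf_paths(DOC)              # what would be kept in the database
derived = internal_paths(stored)      # recovered on demand, never stored
index = build_inverted_index(stored)  # built over leaf paths only
```

In this sketch the five-node document yields three stored paths, and the two internal paths (`/book` and `/book/author`) are reconstructed as prefixes when a query needs them. A query step such as `author` is answered by looking up the leaf paths that pass through that tag, which is one plausible reading of how a reduced path index could still serve XPath queries over internal nodes.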