XSym'07 Proceedings of the 5th international conference on Database and XML Technologies
Data in many industrial application systems are neither completely structured nor completely unstructured. Consequently, semi-structured data models such as XML have become popular as a lowest common denominator for managing such data. Although XML is adequate for representing the flexible portion of the data, it fails to exploit the highly structured portion. XML normalization theory could be used to factor out the structured portion at the schema level; however, queries written against the original schema no longer run on the normalized XML data. In this paper, we propose a new approach called eXtricate that stores XML documents in a space-efficient, decomposed way while supporting efficient processing of the original queries. Our method exploits the fact that a considerable amount of information is shared among similar XML documents. By regarding each document as consisting of a shared framework and a small diff script, we can leverage the strengths of both the relational and XML data models at the same time to handle such data effectively. We prototyped our approach on top of DB2 9 pureXML, a commercial hybrid relational-XML DBMS. Our experiments confirm the amount of redundancy in real e-catalog data and show the effectiveness of our method.
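The idea of a shared framework plus a small per-document diff can be illustrated with a minimal Python sketch. It is not the eXtricate algorithm or any DB2 pureXML API; it only shows the general decomposition under simplifying assumptions: each document is flattened to leaf paths, sibling tag names are unique, and the helper names (`flatten`, `decompose`, `reconstruct`) and the sample catalog entries are hypothetical.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Map each leaf element's path (e.g. 'product/ram') to its text.

    Assumption: sibling tags are unique, so every path identifies one leaf.
    """
    root = ET.fromstring(xml_text)
    leaves = {}

    def walk(elem, path):
        children = list(elem)
        if not children:
            leaves[path] = (elem.text or "").strip()
        for child in children:
            walk(child, path + "/" + child.tag)

    walk(root, root.tag)
    return leaves


def decompose(docs):
    """Split documents into one shared framework and one diff per document."""
    flat = [flatten(d) for d in docs]
    # The framework holds the (path, value) pairs present in every document.
    shared = dict(set.intersection(*(set(f.items()) for f in flat)))
    # Each diff holds only what the framework does not already cover.
    diffs = [{k: v for k, v in f.items() if k not in shared} for f in flat]
    return shared, diffs


def reconstruct(shared, diff):
    """Rebuild a document's flattened view from the framework and its diff."""
    return {**shared, **diff}


# Hypothetical e-catalog entries that share most of their content.
doc_a = ("<product><category>laptop</category><warranty>1y</warranty>"
         "<ram>8GB</ram></product>")
doc_b = ("<product><category>laptop</category><warranty>1y</warranty>"
         "<ram>16GB</ram><touchscreen>yes</touchscreen></product>")

shared, diffs = decompose([doc_a, doc_b])
print(shared)  # pairs common to both documents
print(diffs)   # the small per-document remainder
```

In this sketch the framework is stored once and each document keeps only its diff, which is where the space saving would come from; answering a query against the original document shape then amounts to consulting the framework and the diff together, as `reconstruct` does for the flattened view.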