The VLDB Journal — The International Journal on Very Large Data Bases
XML is widely used to describe the semi-structured data commonly generated and consumed by modern information systems. XML database management systems (XDBMSs) are thus essential platforms in this context. Most XDBMS architectures proposed so far aim at reproducing functionality found in relational systems. As such, these architectures inherit the same deficiency as traditional systems in dealing with less-structured data. What is needed is efficient support for common database operations under the similarity matching paradigm. In this paper, we present an engineering approach to incorporating similarity joins into XDBMSs, which exploits XDBMS components, the storage layer in particular, to design efficient algorithms. We experimentally confirm the accuracy, performance, and scalability of our approach.