Existing three-dimensional bitmap indexing techniques perform poorly when clustering similar documents because they cannot detect similar paths. An existing path clustering method based on path construction similarity is one approach to this problem. That method, however, spends too much time measuring the similarities between paths for clustering. This paper defines the expected path construction similarity and proposes a two-phase path retrieval method that uses it to cluster paths effectively. The proposed method solves the performance degradation problem in path clustering by using the expected path construction similarity to filter the paths whose similarity is to be measured.
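The filter-then-measure idea can be illustrated with a minimal sketch. The abstract does not give the definitions of the two similarity measures, so both functions below are stand-in assumptions chosen only for illustration: `construction_similarity` is a hypothetical order-sensitive measure (longest common subsequence of labels over the longer path length), and `expected_similarity` is a hypothetical cheap estimate (shared labels, ignoring order, over the longer path length). The names, the path representation as lists of labels, and the threshold are likewise assumptions, not the paper's formulation.

```python
from collections import Counter


def expected_similarity(p, q):
    """Cheap estimate (assumed, not the paper's definition): number of labels
    the two non-empty paths share, ignoring order, over the longer length."""
    shared = sum((Counter(p) & Counter(q)).values())
    return shared / max(len(p), len(q))


def construction_similarity(p, q):
    """Costlier stand-in measure (assumed): longest common subsequence of
    labels over the longer path length, computed by dynamic programming."""
    prev = [0] * (len(q) + 1)
    for a in p:
        cur = [0]
        for j, b in enumerate(q, start=1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / max(len(p), len(q))


def similar_paths(query, paths, threshold):
    """Two-phase retrieval: filter with the cheap estimate, then measure."""
    # Phase 1: discard paths whose estimate is already below the threshold.
    candidates = [p for p in paths if expected_similarity(query, p) >= threshold]
    # Phase 2: run the costlier measure only on the surviving candidates.
    return [p for p in candidates if construction_similarity(query, p) >= threshold]


query = ["book", "author", "name"]
paths = [
    ["book", "author", "name"],
    ["book", "editor", "name"],
    ["name", "author", "book"],
    ["journal", "title"],
]
result = similar_paths(query, paths, threshold=0.6)
```

In this sketch the estimate never falls below the stand-in measure, because a common subsequence cannot contain more labels than the two paths share. Phase 1 therefore drops only paths that phase 2 would also reject, and the saving comes from skipping the dynamic-programming step for them. Whether the paper's expected path construction similarity has the same bounding property is not stated in the abstract.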