Pattern matching algorithms
Data integration using similarity joins and a word-based information representation language
ACM Transactions on Information Systems (TOIS)
EDBT '02 Proceedings of the 8th International Conference on Extending Database Technology: Advances in Database Technology
EDBT '02 Proceedings of the Workshops XMLDM, MDDE, and YRWS on XML-Based Data Management and Multimedia Engineering-Revised Papers
A Primitive Operator for Similarity Joins in Data Cleaning
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
A Path-sequence Based Discrimination for Subtree Matching in Approximate XML Joins
ICDEW '06 Proceedings of the 22nd International Conference on Data Engineering Workshops
The volume of XML data is growing explosively, and as a consequence a large amount of XML data has emerged in which similar content is described using different tag names and structures. In such a situation, a user cannot write a query against the data without knowing its structure. In this research, we propose a scheme to cope with this problem. Specifically, we expand XPath queries by replacing tag names with similar ones with the help of ontologies. In addition, we realize (structural) proximity matching of path expressions using edit similarity, a similarity measure based on edit distance. We also discuss how SSJoin, an operator that supports similarity joins in relational database systems, can be applied to speed up the proposed scheme. Finally, we show the effectiveness of the proposed method through a series of experiments.
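As a rough illustration of the two ideas in the abstract, the following Python sketch expands the tag names of a simple path using a synonym table and then ranks data paths by edit similarity. It is not the authors' implementation. Several things are assumptions made for the example: the synonym dictionary stands in for an ontology lookup, paths are restricted to plain child steps separated by `/`, edit distance is computed over whole tag names rather than characters, and edit similarity is taken to be the common normalization 1 - ED(a, b) / max(|a|, |b|). The function names (`expand_path`, `match_paths`, and so on) are hypothetical.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of tags)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def edit_similarity(a, b):
    """Assumed normalization: 1 - ED(a, b) / max(|a|, |b|)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def path_steps(path):
    """Split a simple path such as '/book/author' into its tag names."""
    return [step for step in path.split("/") if step]


def expand_path(path, synonyms):
    """Generate every variant of the path with tags replaced by synonyms.

    `synonyms` is a hypothetical stand-in for an ontology lookup.
    """
    choices = [[step] + sorted(synonyms.get(step, []))
               for step in path_steps(path)]
    return ["/" + "/".join(combo) for combo in product(*choices)]


def match_paths(query, data_paths, synonyms, threshold):
    """Return (data_path, best_similarity) pairs at or above the threshold."""
    best = {}
    for variant in expand_path(query, synonyms):
        variant_steps = path_steps(variant)
        for data_path in data_paths:
            sim = edit_similarity(variant_steps, path_steps(data_path))
            if sim >= threshold and sim > best.get(data_path, -1.0):
                best[data_path] = sim
    # Highest similarity first; ties broken by path for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    synonyms = {"author": ["writer"]}
    data_paths = ["/book/writer", "/library/book/writer", "/cd/title"]
    for path, sim in match_paths("/book/author", data_paths, synonyms, 0.6):
        print(path, round(sim, 3))
```

This sketch compares every expanded query against every data path, which is quadratic in the number of paths. That exhaustive comparison is the kind of cost a similarity-join operator such as SSJoin is meant to reduce; the operator itself is not reproduced here.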