Querying tree-structured data using dimension graphs
CAiSE'05 Proceedings of the 17th international conference on Advanced Information Systems Engineering
Huge volumes of Web data are now organized or exported in tree-structured form. Common examples are the product catalogs of e-market stores, taxonomies of thematic categories, and XML data encodings. Even within a single knowledge domain, name mismatches, structural differences and structural inconsistencies make it difficult to integrate many data sources and query them in a uniform way. In this paper, we present a method for semantically integrating tree-structured data. We introduce dimensions, which are sets of semantically related nodes in tree structures. Based on dimensions, we propose dimension graphs. Dimension graphs can be extracted automatically from trees, and they abstract the structural information of those trees. They are semantically rich constructs that guide query formulation, assist query evaluation and support the integration of tree-structured data. We design a query language for tree-structured data. A query in this language may specify the structure of the underlying trees fully, partially or not at all, so queries are not restricted by the structure of the trees. We provide necessary and sufficient conditions for checking query satisfiability, and we present a technique for evaluating satisfiable queries. Finally, we report several experiments that compare our method for integrating tree-structured data with one that does not exploit dimension graphs. In these experiments, the approach based on dimension graphs performed better.
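To make the idea of a dimension graph more concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract does not say how nodes are assigned to dimensions or how edges are defined. The sketch assumes that each node label is already mapped to a dimension by hand, and that a dimension graph has an edge from one dimension to another whenever some tree contains a parent in the first and a child in the second. The function name `dimension_graph`, the tuple encoding of trees, and the sample catalogs are all hypothetical.

```python
from collections import defaultdict

# Hypothetical encoding: a tree node is a (label, children) tuple.
catalog_a = ("Computers", [
    ("Laptops", [("Sony", [])]),
    ("Desktops", [("Dell", [])]),
])
catalog_b = ("PCs", [
    ("Dell", [("Notebooks", [])]),
])

# Assumed, hand-written assignment of node labels to dimensions.
# "Computers"/"PCs" and "Laptops"/"Notebooks" illustrate name mismatches.
dimension_of = {
    "Computers": "Category", "PCs": "Category",
    "Laptops": "Type", "Notebooks": "Type", "Desktops": "Type",
    "Sony": "Brand", "Dell": "Brand",
}


def dimension_graph(trees, dimension_of):
    """Return {dimension: set of child dimensions} over all given trees.

    Assumption: an edge D1 -> D2 exists if any tree has a parent node in
    D1 with a child node in D2. The paper's definition may differ.
    """
    edges = defaultdict(set)

    def visit(node):
        label, children = node
        for child in children:
            edges[dimension_of[label]].add(dimension_of[child[0]])
            visit(child)

    for tree in trees:
        visit(tree)
    return dict(edges)


graph = dimension_graph([catalog_a, catalog_b], dimension_of)
for source in sorted(graph):
    print(source, "->", sorted(graph[source]))
```

Under these assumptions, the two catalogs nest brands and product types in opposite orders, and the resulting graph records both orders. A query stated over dimensions rather than concrete paths could therefore be matched against either catalog.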