Abstract. Most research on attribute identification in database integration has focused on integrating attributes using schema information and summary information derived from the attribute values. No research has attempted to fully explore the use of attribute values to perform attribute identification. We propose an attribute identification method that employs schema and summary instance information as well as properties of attributes derived from their instances. Unlike other attribute identification methods, which match only single attributes, our method matches attribute groups for integration. Because our method fully explores data instances, it can identify corresponding attributes to be integrated even when schema information is misleading. Three experiments were performed to validate the method. In the first experiment, the heuristic rules derived for attribute classification were evaluated on 119 attributes from nine public-domain data sets. The second was a controlled experiment that validated the robustness of the method by introducing erroneous data. The third experiment evaluated the method on five data sets extracted from online music stores. The results demonstrated the viability of the proposed method.