In modern organizations, decision makers often need quick access to information from diverse sources in order to make timely decisions. A critical problem facing many such organizations is the difficulty of reconciling the information held in heterogeneous data sources. To overcome this limitation, an organization must resolve several types of heterogeneity that may exist across sources. In this paper, we examine one of them, the entity heterogeneity problem, which arises when the same real-world entity type is represented using different identifiers in different applications. We propose a decision-theoretic model to resolve the problem. The model uses a distance measure to express the similarity between two entity instances. We have implemented the model and tested it on real-world data. The results indicate that the model performs well at predicting whether two entity instances should be matched. The model is also shown to be computationally efficient, and its prediction accuracy holds up on large relations. Overall, the test results suggest that this is a viable approach in practical situations.
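The abstract does not give the model's details, so the following is only a minimal sketch of the general idea it names: compute a distance between two entity instances, then pick the decision with the lower expected cost. Everything here is an assumption made for illustration, not the paper's formulation: the exact-match attribute distance, the attribute weights, the function names, the error costs, and the stand-in probability function `linear_match_probability`.

```python
from typing import Callable, Dict


def attribute_distance(a: str, b: str) -> float:
    """Crude stand-in: 0.0 if values agree after normalization, else 1.0."""
    return 0.0 if a.strip().lower() == b.strip().lower() else 1.0


def record_distance(r1: Dict[str, str], r2: Dict[str, str],
                    weights: Dict[str, float]) -> float:
    """Weighted average of per-attribute distances, in the range [0, 1]."""
    total = sum(weights.values())
    weighted = sum(w * attribute_distance(r1[k], r2[k])
                   for k, w in weights.items())
    return weighted / total


def linear_match_probability(distance: float) -> float:
    """Hypothetical P(match | distance); a real model would estimate this from data."""
    return 1.0 - distance


def decide(distance: float,
           cost_false_match: float,
           cost_false_nonmatch: float,
           p_match: Callable[[float], float] = linear_match_probability) -> str:
    """Return the decision with the lower expected cost."""
    p = p_match(distance)
    expected_cost_if_match = (1.0 - p) * cost_false_match
    expected_cost_if_nonmatch = p * cost_false_nonmatch
    if expected_cost_if_match <= expected_cost_if_nonmatch:
        return "match"
    return "non-match"


# Illustrative records (invented for this sketch, not from the paper's data).
r1 = {"name": "Acme Corp", "city": "Austin", "phone": "512-555-0100"}
r2 = {"name": "ACME Corp ", "city": "Austin", "phone": "512-555-0199"}
weights = {"name": 0.5, "city": 0.2, "phone": 0.3}

d = record_distance(r1, r2, weights)
print(d, decide(d, cost_false_match=1.0, cost_false_nonmatch=1.0))
print(d, decide(d, cost_false_match=5.0, cost_false_nonmatch=1.0))
```

In this sketch the two records differ only on the phone attribute. With equal error costs the pair is declared a match; when a false match is assumed to be five times as costly as a missed match, the same distance leads to a non-match. That cost sensitivity is the point of framing the decision as expected-cost minimization rather than a fixed similarity cutoff.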