Entity resolution is the process of determining whether, in a specific context, two or more references correspond to the same entity. In this work, we address this problem for references to persons as they are found in bibliographic data, specifically when consolidating multiple datasets. Our solution follows the extract, transform, load (ETL) process typical of data warehouses. It computes the similarities of the attribute values of the references and employs a decision tree to decide when the references match. We describe the characteristics of these references within bibliographic datasets, and how we explored those characteristics by developing new similarity metrics to improve the quality of the consolidation process. We evaluated our work in an experiment with data from four national libraries. The results show that the proposed similarity metrics contribute significantly to the consolidation process.
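The two-step matching idea described above (compute per-attribute similarities, then let a decision tree decide) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the attributes (`name`, `birth`), the similarity functions, the thresholds, and the hand-written rules standing in for a learned decision tree. None of it is taken from the paper, which does not specify its metrics or tree in this abstract.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Order-insensitive string similarity in [0, 1] between two person names."""
    def norm(s: str) -> str:
        # Lowercase, drop commas, and sort tokens so that
        # "Pessoa, Fernando" and "Fernando Pessoa" normalize identically.
        return " ".join(sorted(s.lower().replace(",", " ").split()))
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def year_similarity(a, b) -> float:
    """1.0 for equal years, 0.0 for different years, 0.5 if either is unknown."""
    if a is None or b is None:
        return 0.5
    return 1.0 if a == b else 0.0


def similarity_vector(r1: dict, r2: dict) -> dict:
    """Compare two person references attribute by attribute (hypothetical schema)."""
    return {
        "name": name_similarity(r1["name"], r2["name"]),
        "birth": year_similarity(r1.get("birth"), r2.get("birth")),
    }


def is_match(sim: dict) -> bool:
    """Hand-written stand-in for a learned decision tree; thresholds are illustrative."""
    if sim["name"] >= 0.9:
        # Near-identical names: accept unless the birth years contradict.
        return sim["birth"] >= 0.5
    if sim["name"] >= 0.7:
        # Similar names: require confirming evidence from the birth year.
        return sim["birth"] == 1.0
    return False


a = {"name": "Pessoa, Fernando", "birth": 1888}
b = {"name": "Fernando Pessoa", "birth": 1888}
print(is_match(similarity_vector(a, b)))
```

In practice the rules in `is_match` would be learned from labeled pairs of references rather than written by hand; the sketch only shows how a vector of attribute similarities feeds a tree-shaped decision.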