Linked Data has emerged as a powerful way of interconnecting structured data on the Web. However, the cross-linkage between Linked Data sources is not as extensive as one would hope for. In this paper, we formalize the task of automatically creating "sameAs" links across data sources in a globally consistent manner. Our algorithm, presented in a multi-core as well as a distributed version, achieves this link generation by accounting for joint evidence of a match. Experiments confirm that our system scales beyond 100 million entities and delivers highly accurate results despite the vast heterogeneity and daunting scale.
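As a rough illustration of what "globally consistent" link creation can mean, the sketch below treats "sameAs" as an equivalence relation: candidate matches are accepted in descending score order and merged with a union-find structure, so the emitted links are always transitively closed. This is only one possible reading and not the algorithm of the paper, which the abstract does not spell out. The function name `consistent_same_as`, the `threshold` parameter, the score values, and the entity identifiers are all hypothetical.

```python
from itertools import combinations


def consistent_same_as(candidates, threshold=0.5):
    """Return a transitively closed set of sameAs links.

    candidates: iterable of (entity_a, entity_b, score) tuples.
    threshold:  hypothetical cut-off below which a candidate is ignored.
    """
    parent = {}

    def find(x):
        # Locate the cluster representative, halving paths on the way up.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Accept the strongest evidence first; stop once scores fall below the cut-off.
    for a, b, score in sorted(candidates, key=lambda c: -c[2]):
        if score < threshold:
            break
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Group entities by representative.
    clusters = {}
    for entity in parent:
        clusters.setdefault(find(entity), set()).add(entity)

    # Emit every pair inside a cluster, so the link set is closed under transitivity.
    links = set()
    for members in clusters.values():
        links.update(combinations(sorted(members), 2))
    return links


if __name__ == "__main__":
    # Made-up identifiers from three imaginary sources a, b, and c.
    example = [
        ("a:berlin", "b:berlin", 0.9),
        ("b:berlin", "c:berlin", 0.8),
        ("a:paris", "b:berlin", 0.2),
    ]
    for link in sorted(consistent_same_as(example)):
        print(link)
```

In this toy input the pair `a:berlin` / `c:berlin` is never proposed directly, yet it appears in the output because both entities are linked to `b:berlin`. A system that weighs joint evidence of a match, as the abstract describes, would go further and let accepted matches influence the scores of other candidates; that step is deliberately left out here.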