Time-efficient algorithms are essential to address the complex linking tasks that arise when trying to discover links on the Web of Data. Although several lossless approaches have been developed for this purpose, they do not offer theoretical guarantees with respect to their performance. In this paper, we address this drawback by presenting the first Link Discovery approach with theoretical quality guarantees. In particular, we prove that, given an achievable reduction ratio r, our Link Discovery approach $\mathcal{HR}^3$ can achieve a reduction ratio r′ ≤ r in a metric space where distances are measured by means of a Minkowski metric of any order p ≥ 2. We compare $\mathcal{HR}^3$ and the HYPPO algorithm implemented in LIMES 0.5 with respect to the number of comparisons they carry out. In addition, we compare our approach with the algorithms implemented in the state-of-the-art frameworks LIMES 0.5 and SILK 2.5 with respect to runtime. We show that $\mathcal{HR}^3$ outperforms these previous approaches with respect to runtime in each of our four experimental setups.
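To make the notions of Minkowski distance and number of comparisons concrete, the following is a minimal sketch of a generic lossless grid blocking scheme. It is not the $\mathcal{HR}^3$ algorithm, nor the HYPPO or LIMES implementation. The function names (`minkowski`, `brute_force_links`, `grid_links`, `saved_fraction`), the choice of cell width equal to the threshold, and the "saved fraction" measure 1 − comparisons / (|S|·|T|) are all assumptions made for illustration; the paper may normalise its reduction ratio differently.

```python
from collections import defaultdict
from itertools import product
from math import floor


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def brute_force_links(source, target, theta, p):
    """Compare every source point with every target point."""
    links = {(s, t) for s in source for t in target
             if minkowski(s, t, p) <= theta}
    return links, len(source) * len(target)


def grid_links(source, target, theta, p):
    """Hypothetical grid blocking: compare only points in neighbouring cells.

    Cells have width theta. The Minkowski distance (p >= 1) is never smaller
    than the largest per-coordinate difference, so two points within theta
    of each other lie in cells whose indices differ by at most 1 per
    dimension. No link is lost, up to floating-point effects at cell borders.
    """
    def cell(point):
        return tuple(floor(c / theta) for c in point)

    index = defaultdict(list)
    for t in target:
        index[cell(t)].append(t)

    links, comparisons = set(), 0
    for s in source:
        base = cell(s)
        for offset in product((-1, 0, 1), repeat=len(base)):
            neighbour = tuple(b + o for b, o in zip(base, offset))
            for t in index.get(neighbour, ()):
                comparisons += 1
                if minkowski(s, t, p) <= theta:
                    links.add((s, t))
    return links, comparisons


def saved_fraction(comparisons, source, target):
    """Assumed measure: share of the |S|*|T| comparisons that were avoided."""
    return 1.0 - comparisons / (len(source) * len(target))


if __name__ == "__main__":
    points = [(i, j) for i in range(10) for j in range(10)]
    exact, n_brute = brute_force_links(points, points, theta=1, p=2)
    found, n_grid = grid_links(points, points, theta=1, p=2)
    print(found == exact, n_brute, n_grid,
          saved_fraction(n_grid, points, points))
```

Both functions return the same set of links on this toy input; the grid variant simply skips pairs whose cells are too far apart to satisfy the threshold. How close such a scheme can get to the minimal number of comparisons is exactly the kind of question the guarantee stated in the abstract addresses.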