A central problem in the Web of Linked Data, as in data integration in general, is identifying entities in different data sources that describe the same real-world object. Many existing entity matching methods rely on explicit linkage rules, which specify the conditions that must hold for two entities to be interlinked. Because writing good linkage rules by hand is non-trivial, the burden of generating links between data sources remains high. To reduce the effort and expertise required to write linkage rules, we present ActiveGenLink, an algorithm that combines genetic programming and active learning to generate expressive linkage rules interactively. ActiveGenLink automates the generation of linkage rules and only requires the user to confirm or decline a number of link candidates. Its query strategy minimizes user involvement by selecting the link candidates that yield a high information gain. Our evaluation shows that ActiveGenLink can generate high-quality linkage rules from a small number of labeled candidate links, and that our query strategy for selecting link candidates outperforms the query-by-vote-entropy baseline.
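To make the baseline concrete, the following is a minimal sketch of query-by-vote-entropy selection, not of ActiveGenLink's own query strategy, which the abstract does not specify. A committee of candidate linkage rules votes on each unlabeled link candidate, and the candidate on which the votes are most evenly split is shown to the user. The names `Rule`, `vote_entropy`, and `select_query`, the toy rules, and the dictionary-based entity representation are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical types: an entity is a property dictionary, and a linkage
# rule is a predicate that decides whether two entities should be linked.
Entity = Dict[str, object]
Pair = Tuple[Entity, Entity]
Rule = Callable[[Entity, Entity], bool]


def vote_entropy(rules: Sequence[Rule], pair: Pair) -> float:
    """Binary entropy (in bits) of the committee's link / no-link votes."""
    a, b = pair
    p = sum(1 for rule in rules if rule(a, b)) / len(rules)
    if p == 0.0 or p == 1.0:
        return 0.0  # unanimous committee: labeling this pair tells us nothing
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_query(rules: Sequence[Rule], candidates: Sequence[Pair]) -> Pair:
    """Return the candidate on which the committee disagrees the most."""
    return max(candidates, key=lambda pair: vote_entropy(rules, pair))


# Toy committee of three hand-written rules (assumed for this example only).
rules = [
    lambda a, b: a["label"] == b["label"],
    lambda a, b: str(a["label"]).lower() == str(b["label"]).lower(),
    lambda a, b: a["year"] == b["year"],
]

candidates = [
    ({"label": "Berlin", "year": 1237}, {"label": "Berlin", "year": 1237}),
    ({"label": "Berlin", "year": 1237}, {"label": "berlin", "year": 1300}),
    ({"label": "Paris", "year": 1}, {"label": "Rome", "year": 2}),
]

query = select_query(rules, candidates)
```

In this toy setup the committee is unanimous on the first and third candidates, so only the second one (one vote in three for a link) has non-zero entropy and is selected. In a genetic programming setting the committee would plausibly be drawn from the current population of learned rules, but that detail is an assumption here.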