EAGLE: efficient active learning of link specifications using genetic programming

Authors:
Axel-Cyrille Ngonga Ngomo; Klaus Lyko
Affiliations:
Department of Computer Science, University of Leipzig, Leipzig, Germany; Department of Computer Science, University of Leipzig, Leipzig, Germany
Venue:
ESWC'12 Proceedings of the 9th international conference on The Semantic Web: research and applications
Year:
2012

Abstract

With the growth of the Linked Data Web, time-efficient approaches for computing links between data sources have become indispensable. Most Link Discovery frameworks require two main computational steps: first, the user must state a link specification explicitly; then, this specification must be executed. While several approaches for the time-efficient execution of link specifications have been developed over the last few years, discovering accurate link specifications remains tedious. In this paper, we present EAGLE, an active learning approach based on genetic programming. EAGLE generates highly accurate link specifications while reducing the annotation burden on the user. We evaluate EAGLE against batch learning on three different data sets and show that our algorithm can detect specifications with an F-measure above 90% while requiring only a small number of questions.
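
To make the terms "link specification" and "F-measure" concrete, the following minimal Python sketch may help. It is not EAGLE's implementation: it assumes the simplest possible specification (one string similarity on one property, compared against a threshold), uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and all names (`similarity`, `execute_spec`, `f_measure`) and the toy data are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (stand-in measure, an assumption)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def execute_spec(source, target, prop, threshold):
    """Execute a single-measure specification.

    Returns every (source_id, target_id) pair whose `prop` values reach
    the similarity threshold. This is a naive all-pairs comparison.
    """
    return {
        (s_id, t_id)
        for s_id, s in source.items()
        for t_id, t in target.items()
        if similarity(s[prop], t[prop]) >= threshold
    }


def f_measure(predicted, reference):
    """Harmonic mean of precision and recall of predicted links against reference links."""
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Toy data sets (hypothetical), keyed by resource identifier.
source = {"s1": {"label": "Leipzig"}, "s2": {"label": "Berlin"}}
target = {"t1": {"label": "leipzig"}, "t2": {"label": "Dresden"}}

links = execute_spec(source, target, prop="label", threshold=0.9)
score = f_measure(links, reference={("s1", "t1")})
```

In a learning setting, a score of this kind would serve as the fitness of a candidate specification with respect to the links the user has labelled so far; the all-pairs loop above is only for illustration and ignores the time-efficient execution strategies the abstract refers to.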