Identifying candidate datasets for data interlinking

Authors:
Luiz André P. Paes Leme;Giseli Rabello Lopes;Bernardo Pereira Nunes;Marco Antonio Casanova;Stefan Dietze
Affiliations:
Computer Science Institute, Fluminense Federal University, Niterói, RJ, Brazil;Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, RJ, Brazil;Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, RJ, Brazil, L3S Research Center, Leibniz University Hannover, Hannover, Germany;Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, RJ, Brazil;L3S Research Center, Leibniz University Hannover, Hannover, Germany
Venue:
ICWE'13 Proceedings of the 13th international conference on Web Engineering
Year:
2013

Abstract

One of the design principles that can stimulate the growth and increase the usefulness of the Web of data is URI linkage. However, related URIs are typically spread across different datasets managed by different publishers. Hence, the designer of a new dataset must be aware of the existing datasets and inspect their content to define sameAs links. This paper proposes a technique based on probabilistic classifiers that, given a dataset S to be published and a set T of known published datasets, ranks each Ti ∈ T according to the probability that links between S and Ti can be found, so that only the most relevant datasets need to be inspected. Results show that the technique can reduce the search space by up to 85%, thereby greatly decreasing the computational effort.
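
The abstract does not say which classifier or features are used, so the following is only a minimal sketch of the ranking idea under stated assumptions. It assumes a naive Bayes scorer, assumes that the features of S are the datasets S is already known to link to, and assumes that the training data is the set of existing links among published datasets. The function name `rank_candidates`, the smoothing parameter `alpha`, and the dataset names `D1`..`D4` and `T1`..`T3` are all hypothetical toy values, not taken from the paper.

```python
import math


def rank_candidates(known_links, s_features, alpha=1.0):
    """Rank candidate target datasets Ti for a new dataset S.

    known_links: dict mapping each published dataset to the set of datasets
                 it already links to (assumed training data).
    s_features:  set of datasets S is already known to link to
                 (assumed feature representation of S).
    alpha:       Laplace smoothing constant (assumption).
    Returns a list of (target, log_score) pairs, best candidate first.
    """
    n = len(known_links)
    # Every dataset that appears as a link target is a candidate,
    # except those S already links to.
    candidates = set().union(*known_links.values()) - set(s_features)

    scored = []
    for t in candidates:
        # Link sets of the published datasets that link to t.
        linkers = [links for links in known_links.values() if t in links]
        # Smoothed prior: fraction of published datasets that link to t.
        score = math.log((len(linkers) + alpha) / (n + 2 * alpha))
        for f in s_features:
            # Smoothed likelihood: among datasets linking to t,
            # the fraction that also link to f.
            both = sum(1 for links in linkers if f in links)
            score += math.log((both + alpha) / (len(linkers) + 2 * alpha))
        scored.append((t, score))

    # Highest score first; ties broken by name for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy link data (hypothetical): four published datasets and their link targets.
known_links = {
    "D1": {"T1", "T2"},
    "D2": {"T1", "T2"},
    "D3": {"T1", "T3"},
    "D4": {"T3"},
}

# Assume S is already known to link to T1.
ranking = rank_candidates(known_links, {"T1"})
top_k = [target for target, _ in ranking[:1]]  # inspect only the best candidate
```

Truncating the ranking to its first few entries is what shrinks the search space: only the retained datasets need to be inspected for sameAs links. How many entries to keep is a separate tuning choice that this sketch does not address.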