Discovering missing links in large-scale linked data

Authors:
Nam Hau;Ryutaro Ichise;Bac Le
Affiliations:
University of Technology, Ho Chi Minh City, Vietnam;National Institute of Informatics, Tokyo, Japan;National Science University, Ho Chi Minh City, Vietnam
Venue:
ACIIDS'13 Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part II
Year:
2013

Abstract

The rapid growth of linked data is producing sparsely connected networks, primarily because data sources are developed asynchronously and independently, which leaves more and more links missing among different data sources. DHR was proposed in earlier research to discover these links. However, DHR has limitations in a distributed environment: for example, when it is deployed on a distributed SPARQL server, data transfer usually causes overhead on the network. We therefore propose a new method for detecting missing links that is based on DHR. The method consists of two stages: finding the frequent graph and matching by similarity. In this paper, we enhance some features of both stages to reduce the data flow before querying. We conduct an experiment on geographic data sources with a large number of triples to discover the missing links, and we compare the accuracy of our proposed matching method with that of DHR and the primitive mix similarity method. The experimental results show that our method greatly reduces the data flow on the network and increases the accuracy of discovering missing links.
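
The abstract names two stages but gives no algorithmic detail, so the following is only a rough sketch of the general idea and not the authors' method or DHR itself. It assumes that "finding the frequent graph" can be approximated by keeping the predicates most instances share, and that "matching the similarity" can be approximated by averaging string similarity over aligned predicates. The toy data, the function names, the predicate alignment, and the thresholds are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical toy data: instance id -> {predicate: literal value}.
source_a = {
    "a:1": {"name": "Ho Chi Minh City", "country": "Vietnam", "population": "8993082"},
    "a:2": {"name": "Tokyo", "country": "Japan"},
}
source_b = {
    "b:1": {"label": "Ho Chi Minh city", "country": "Vietnam"},
    "b:2": {"label": "Tokio", "country": "Japan", "area": "2194"},
}


def frequent_predicates(source, min_support):
    """Stage 1 stand-in: predicates used by at least min_support of the instances."""
    counts = Counter(p for props in source.values() for p in props)
    total = len(source)
    return {p for p, c in counts.items() if c / total >= min_support}


def project(source, predicates):
    """Keep only the selected predicates, so less data would need to be transferred."""
    return {
        iid: {p: v for p, v in props.items() if p in predicates}
        for iid, props in source.items()
    }


def similarity(x, y):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def match(a, b, aligned, threshold):
    """Stage 2 stand-in: link pairs whose mean similarity over aligned predicates passes threshold."""
    links = []
    for id_a, props_a in a.items():
        for id_b, props_b in b.items():
            scores = [
                similarity(props_a[p], props_b[q])
                for p, q in aligned
                if p in props_a and q in props_b
            ]
            if scores and sum(scores) / len(scores) >= threshold:
                links.append((id_a, id_b))
    return links


freq_a = frequent_predicates(source_a, min_support=1.0)
freq_b = frequent_predicates(source_b, min_support=1.0)

# Assumed alignment between the frequent predicates of the two sources.
aligned = [("name", "label"), ("country", "country")]

links = match(project(source_a, freq_a), project(source_b, freq_b), aligned, threshold=0.8)
print(links)
```

In this sketch, rare predicates such as `population` and `area` are dropped before any pairwise comparison, which mirrors in miniature the stated goal of reducing the data flow before querying.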