Distributed human computation framework for linked data co-reference resolution

  • Authors:
  • Yang Yang;Priyanka Singh;Jiadi Yao;Ching-man Au Yeung;Amir Zareian;Xiaowei Wang;Zhonglun Cai;Manuel Salvadores;Nicholas Gibbins;Wendy Hall;Nigel Shadbolt

  • Affiliations:
  • Intelligence, Agents, Multimedia (IAM) Group, School of Electronics and Computer Science, University of Southampton, UK (all authors except Ching-man Au Yeung);NTT Communication Science Laboratories, Kyoto, Japan (Ching-man Au Yeung)

  • Venue:
  • ESWC'11 Proceedings of the 8th extended semantic web conference on The semantic web: research and applications - Volume Part I
  • Year:
  • 2011

Abstract

Distributed Human Computation (DHC) solves computational problems by harnessing the collaborative effort of a large number of people. It also offers a way to tackle AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, has many research problems that are considered AI-complete. For example, co-reference resolution, which involves determining whether different URIs refer to the same entity, is a significant hurdle to overcome in the realisation of large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we focus on co-reference resolution when integrating distributed datasets. Traditionally, machine-learning algorithms are used for this task, but they are often computationally expensive and error-prone, and they do not scale. We designed a DHC system named iamResearcher, which solves the author identity co-reference problem for scientific publications when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are dereferenceable in the Open Linked Data Cloud.
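
As an illustration only, the idea of users aligning author identities across repositories can be sketched as merging user-confirmed URI pairs into co-reference sets. This is a minimal sketch and not the iamResearcher implementation: the function name `resolve_coreferences`, the input format (pairs of URIs that a user has confirmed refer to the same author), and the `example.org` URIs are all assumptions made for the example. It uses a standard union-find structure so that confirmations made independently by different users combine transitively.

```python
def resolve_coreferences(confirmed_pairs):
    """Merge user-confirmed URI pairs into co-reference sets.

    confirmed_pairs: iterable of (uri_a, uri_b) tuples, each meaning a user
    confirmed that both URIs denote the same author (hypothetical input format).
    Returns a sorted list of sorted URI lists, one per co-reference set.
    """
    parent = {}  # union-find forest: URI -> parent URI

    def find(uri):
        # Register unseen URIs as their own root, then walk up to the root,
        # shortening the path as we go (path halving).
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for uri_a, uri_b in confirmed_pairs:
        root_a, root_b = find(uri_a), find(uri_b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two sets

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical confirmations collected from signed-in users.
pairs = [
    ("http://example.org/repoA/author/1", "http://example.org/repoB/author/9"),
    ("http://example.org/repoB/author/9", "http://example.org/repoC/author/4"),
    ("http://example.org/repoA/author/2", "http://example.org/repoD/author/7"),
]
coreference_sets = resolve_coreferences(pairs)
```

Each resulting set could then be published as equivalence links between the URIs it contains (for instance with `owl:sameAs`, which is an assumption here, not something the abstract states).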