Minimally-supervised extraction of domain-specific part-whole relations using Wikipedia as knowledge-base

  • Authors:
  • Ashwin Ittoo; Gosse Bouma

  • Affiliations:
  • Operations, Faculty of Economics and Business, University of Groningen, Nettelbosje 2, 9747 AE Groningen, The Netherlands; Computational Linguistics (Information Science), Faculty of Arts, University of Groningen, Oude Kijk in 't Jatstraat 26, 9712 EK Groningen, The Netherlands

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2013

Abstract

We present a minimally-supervised approach for learning part-whole relations from text. Unlike previous techniques, we focus on sparse, domain-specific texts. The novelty of our approach lies in the use of Wikipedia as a knowledge-base, from which we first acquire a set of reliable patterns that express part-whole relations, using a minimally-supervised algorithm. We then use the acquired patterns to extract part-whole relation triples from a collection of sparse, domain-specific texts. Our strategy of learning in one domain and applying the knowledge in another is based on the notion of domain adaptation. It allows us to overcome the challenges of learning the relations directly from the sparse, domain-specific corpus. Our experimental evaluations reveal that, despite its general-purpose nature, Wikipedia can be exploited as a source of knowledge for improving the performance of domain-specific part-whole relation extraction. As further contributions, we propose a mechanism that mitigates the negative impact of semantic drift on minimally-supervised algorithms. We also represent the patterns in the extracted relations using sophisticated syntactic structures that avoid the limitations of traditional surface-string representations. In addition, we show that domain-specific part-whole relations cannot be conclusively classified in existing taxonomies.
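
The two-stage strategy described in the abstract (learn patterns from a general corpus, then apply them to a domain corpus) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy sentences, the seed pairs, the function names `learn_patterns` and `extract_triples`, and the `min_support` threshold. The sketch also uses plain surface strings as patterns, whereas the paper uses syntactic structures, and the frequency threshold is only a crude stand-in for the paper's semantic-drift mechanism.

```python
import re
from collections import Counter

ARTICLES = {"a", "an", "the"}


def normalise(sentence):
    """Lowercase, tokenise, and drop articles (toy preprocessing)."""
    tokens = re.findall(r"[a-z0-9-]+", sentence.lower())
    return " ".join(t for t in tokens if t not in ARTICLES)


def learn_patterns(sentences, seeds, min_support=2):
    """Collect the text linking known (whole, part) seed pairs.

    A connector is kept only if it links at least `min_support`
    seed-pair occurrences (hypothetical filter, not the paper's).
    """
    counts = Counter()
    for sentence in map(normalise, sentences):
        for whole, part in seeds:
            m = re.search(
                rf"\b{re.escape(whole)}\b (.+?) \b{re.escape(part)}\b",
                sentence,
            )
            if m:
                counts[m.group(1)] += 1
    return {p for p, c in counts.items() if c >= min_support}


def extract_triples(sentences, patterns):
    """Apply learned connectors to new text; return (whole, pattern, part)."""
    triples = []
    for sentence in map(normalise, sentences):
        for pattern in sorted(patterns):
            for m in re.finditer(rf"(\S+) {re.escape(pattern)} (\S+)", sentence):
                triples.append((m.group(1), pattern, m.group(2)))
    return triples


# Toy stand-in for the general-purpose corpus (Wikipedia in the paper).
GENERAL = [
    "The car consists of an engine.",
    "The house consists of a roof.",
    "The car drove past the engine.",  # noisy co-occurrence of a seed pair
]
SEEDS = [("car", "engine"), ("house", "roof")]

# Toy stand-in for the sparse, domain-specific corpus.
DOMAIN = [
    "The scanner consists of a detector.",
    "The detector failed during calibration.",
]

patterns = learn_patterns(GENERAL, SEEDS)
triples = extract_triples(DOMAIN, patterns)
print(patterns)  # {'consists of'}
print(triples)   # [('scanner', 'consists of', 'detector')]
```

In this toy run the noisy connector "drove past" occurs with only one seed pair and is discarded, while "consists of" carries over to the domain corpus even though the domain text contains none of the seed pairs.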