Beyond term clusters: assigning Wikipedia concepts to scientific documents

  • Authors:
  • Ozge Yeloglu;Evangelos Milios;A. Nur Zincir-Heywood

  • Affiliations:
  • Dalhousie University, Halifax, NS, Canada;Dalhousie University, Halifax, NS, Canada;Dalhousie University, Halifax, NS, Canada

  • Venue:
  • Proceedings of the 2013 ACM symposium on Document engineering
  • Year:
  • 2013

Abstract

We propose a model for assigning Wikipedia Concepts as scientific category labels to scientific documents. The terms of each document are first grouped using Latent Dirichlet Allocation (LDA), a well-known topic modelling method, and the terms of the resulting topic model are then wikified to extract related concepts from Wikipedia. We experiment on two datasets: abstracts of documents from the ACM Digital Library and full papers from the UvT Collection. The ACM dataset consists of Computer Science publications, whereas UvT includes scientific publications on a range of topics. Domain-specific taxonomies are used for evaluation. Results show that our approach can assign Wikipedia Concepts to scientific publications automatically, without human supervision.
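
The second stage of the pipeline described in the abstract, mapping topic-model terms to Wikipedia concepts, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the LDA step has already produced weighted terms for a document, and it replaces a real wikifier with a small hand-written dictionary. The names `ANCHORS` and `assign_concepts`, the dictionary entries, the example weights, and the rule of splitting a term's weight evenly across its candidate concepts are all hypothetical.

```python
from collections import defaultdict

# Hypothetical anchor dictionary: surface term -> candidate Wikipedia concept titles.
# A real wikifier would derive such links from Wikipedia itself; these entries
# are invented for illustration only.
ANCHORS = {
    "cluster": ["Cluster analysis"],
    "clustering": ["Cluster analysis"],
    "topic": ["Topic model"],
    "dirichlet": ["Latent Dirichlet allocation", "Dirichlet distribution"],
    "retrieval": ["Information retrieval"],
}


def assign_concepts(topic_terms, anchors=ANCHORS, top_k=3):
    """Rank concepts by the summed weight of the topic terms that link to them."""
    scores = defaultdict(float)
    for term, weight in topic_terms:
        candidates = anchors.get(term.lower(), [])
        for concept in candidates:
            # Assumed heuristic: an ambiguous term splits its weight evenly
            # across all of its candidate concepts.
            scores[concept] += weight / len(candidates)
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [concept for concept, _ in ranked[:top_k]]


# Assumed LDA output for one document: (term, probability) pairs.
topic = [("clustering", 0.30), ("dirichlet", 0.20), ("topic", 0.15), ("corpus", 0.05)]
labels = assign_concepts(topic)
print(labels)
```

Terms with no dictionary entry, such as "corpus" here, contribute nothing to the ranking. A real system would need a disambiguation step for ambiguous terms rather than the even split assumed above.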