Using Wikipedia to learn semantic feature representations of concrete concepts in neuroimaging experiments

  • Authors:
  • Francisco Pereira; Matthew Botvinick; Greg Detre

  • Affiliations:
  • Psychology Department and Princeton Neuroscience Institute, Princeton University, Princeton, NJ 08540, United States (all three authors)

  • Venue:
  • Artificial Intelligence
  • Year:
  • 2013

Abstract

In this paper we show that a corpus of a few thousand Wikipedia articles about concrete or visualizable concepts can be used to produce a low-dimensional semantic feature representation of those concepts. The purpose of such a representation is to serve as a model of the mental context of a subject during functional magnetic resonance imaging (fMRI) experiments. A recent study by Mitchell et al. (2008) [19] showed that it was possible to predict fMRI data acquired while subjects thought about a concrete concept, given a representation of those concepts in terms of semantic features obtained with human supervision. We use topic models on our corpus to learn semantic features from text in an unsupervised manner, and show that these features can outperform those in Mitchell et al. (2008) [19] in demanding 12-way and 60-way classification tasks. We also show that these features can be used to uncover similarity relations in brain activation for different concepts which parallel those relations in behavioral data from human subjects.
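
To make the n-way classification tasks mentioned above concrete, here is a minimal sketch of one way such a task could be scored. It is not taken from the paper: matching an observed activation pattern to the most similar predicted pattern by cosine similarity is an assumption, the function names `cosine`, `classify_n_way`, and `accuracy` are hypothetical, and the three-dimensional vectors are made-up toy data rather than fMRI results.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def classify_n_way(observed, candidates):
    """Return the label whose predicted pattern best matches `observed`.

    `candidates` maps each concept label to a predicted activation pattern.
    With 60 labels this is a 60-way decision; with 12 labels, a 12-way one.
    (Assumption: cosine similarity is the matching criterion.)
    """
    return max(candidates, key=lambda label: cosine(observed, candidates[label]))


def accuracy(observed_by_label, candidates):
    """Fraction of observed patterns assigned to their true label."""
    hits = sum(
        classify_n_way(pattern, candidates) == label
        for label, pattern in observed_by_label.items()
    )
    return hits / len(observed_by_label)


# Toy, made-up patterns for a 3-way task (illustrative only).
predicted = {
    "hammer": [1.0, 0.1, 0.0],
    "dog": [0.0, 1.0, 0.2],
    "house": [0.1, 0.0, 1.0],
}
observed = {
    "hammer": [0.9, 0.2, 0.1],
    "dog": [0.1, 0.8, 0.3],
    "house": [0.2, 0.1, 0.7],
}

print(classify_n_way(observed["dog"], predicted))
print(accuracy(observed, predicted))
```

In this framing, chance accuracy is 1/n, which is why the 60-way task is more demanding than the 12-way one: each observed pattern must beat 59 competitors rather than 11.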