Combining labelled and unlabelled data: a case study on Fisher kernels and transductive inference for biological entity recognition

  • Authors:
  • Cyril Goutte;Hervé Déjean;Eric Gaussier;Nicola Cancedda;Jean-Michel Renders

  • Affiliations:
  • Xerox Research Center Europe, Meylan, France (all authors)

  • Venue:
  • COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
  • Year:
  • 2002

Abstract

We address the problem of using partially labelled data, e.g. large collections where only a small part of the data is annotated, for extracting biological entities. Our approach relies on a combination of probabilistic models, which we use to model the generation of entities and their context, and kernel machines, which implement powerful categorisers based on a similarity measure and some labelled data. This combination takes the form of so-called Fisher kernels, which implement a similarity measure based on an underlying probabilistic model. Such kernels are compared with transductive inference, an alternative approach to combining labelled and unlabelled data, again coupled with Support Vector Machines. Experiments are performed on a database of abstracts extracted from Medline.
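
As an illustration of the Fisher kernel idea mentioned in the abstract, the following minimal sketch computes K(x, y) = U_x^T I^{-1} U_y, where U_x is the gradient of the log-likelihood of x with respect to the model parameters and I is the Fisher information matrix. The abstract does not specify the generative model used in the paper, so the independent Bernoulli model over binary features, the add-one smoothing, and all function names here are assumptions chosen for simplicity. The model parameters can be estimated from all available examples, labelled or not, which is how unlabelled data enters the kernel.

```python
from typing import List, Sequence


def fit_bernoulli(data: Sequence[Sequence[int]], smoothing: float = 1.0) -> List[float]:
    """Estimate per-feature probabilities p_j from binary feature vectors.

    Labels are not needed, so labelled and unlabelled examples can both be used.
    Smoothing keeps every p_j strictly between 0 and 1 (an assumed choice).
    """
    n = len(data)
    d = len(data[0])
    return [
        (sum(row[j] for row in data) + smoothing) / (n + 2.0 * smoothing)
        for j in range(d)
    ]


def fisher_score(x: Sequence[int], p: Sequence[float]) -> List[float]:
    """Gradient of log p(x | p) with respect to each p_j.

    For a Bernoulli feature: d/dp_j [x_j log p_j + (1 - x_j) log(1 - p_j)]
    = (x_j - p_j) / (p_j (1 - p_j)).
    """
    return [(xj - pj) / (pj * (1.0 - pj)) for xj, pj in zip(x, p)]


def fisher_kernel(x: Sequence[int], y: Sequence[int], p: Sequence[float]) -> float:
    """Fisher kernel U_x^T I^{-1} U_y for the independent Bernoulli model.

    The Fisher information is diagonal with entries 1 / (p_j (1 - p_j)),
    so its inverse is diagonal with entries p_j (1 - p_j).
    """
    ux = fisher_score(x, p)
    uy = fisher_score(y, p)
    return sum(a * pj * (1.0 - pj) * b for a, b, pj in zip(ux, uy, p))


# Hypothetical toy data: two binary context features per example.
pool = [[1, 0], [0, 1]]          # labelled and unlabelled examples together
params = fit_bernoulli(pool)     # smoothed estimates of p_j
k_same = fisher_kernel([1, 0], [1, 0], params)
k_diff = fisher_kernel([1, 0], [0, 1], params)
```

A kernel matrix built this way could then be passed to any kernel machine that accepts precomputed kernels, such as a Support Vector Machine; that step is omitted here.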