Semi-supervised learning of the hidden vector state model for extracting protein-protein interactions

  • Authors:
  • Deyu Zhou;Yulan He;Chee Keong Kwoh

  • Affiliations:
  • School of Computer Engineering, Nanyang Technological University, Block N4, Nanyang Avenue, Singapore 639798, Singapore;Informatics Research Centre, The University of Reading, Whiteknights Reading, Berkshire RG6 6BX, UK;School of Computer Engineering, Nanyang Technological University, Block N4, Nanyang Avenue, Singapore 639798, Singapore

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Objective: The hidden vector state (HVS) model is an extension of the basic discrete Markov model in which context is encoded as a stack-oriented state vector. It has been applied successfully for protein-protein interactions extraction. However, the HVS model, being a statistically based approach, requires large-scale annotated corpora in order to reliably estimate model parameters. This is normally difficult to obtain in practical applications. Methods and materials: In this paper, we present two novel semi-supervised learning approaches, one based on classification and the other based on expectation-maximization, to train the HVS model from both annotated and un-annotated corpora. Results and conclusion: Experimental results show the improved performance over the baseline system using the HVS model trained solely from the annotated corpus, which gives the support to the feasibility and efficiency of our approaches.