BioPPISVMExtractor: A protein-protein interaction extractor for biomedical literature using SVM and rich feature sets

  • Authors:
  • Zhihao Yang; Hongfei Lin; Yanpeng Li

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2010

Abstract

Protein-protein interactions (PPIs) play a key role in various aspects of the structural and functional organization of the cell, and knowledge about them reveals the molecular mechanisms of biological processes. However, the biomedical literature on protein interactions is growing rapidly, making it difficult for interaction database curators to detect and curate protein interaction information manually. This paper presents an SVM-based system, named BioPPISVMExtractor, to identify protein-protein interactions in biomedical literature. The system uses rich feature sets for SVM classification, including word features, a keyword feature, a protein name distance feature, and a Link path feature. In addition, a Link Grammar extraction result feature is introduced to improve precision. Experimental comparisons with other state-of-the-art PPI extraction systems on the DIP corpus indicate that BioPPISVMExtractor substantially improves recall at the cost of a moderate decline in precision.
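
To make the feature sets named in the abstract more concrete, the following is a minimal sketch of how word, keyword, and protein name distance features might be computed for one candidate protein pair in a tokenized sentence. It is an illustration under stated assumptions, not the authors' implementation: the function name `extract_features`, the keyword list, the context window size, and the feature naming scheme are all hypothetical, and the Link path and Link Grammar features are omitted because they require a Link Grammar parser.

```python
# Hypothetical keyword list; the abstract does not give the paper's actual list.
INTERACTION_KEYWORDS = {"interacts", "binds", "activates", "inhibits", "phosphorylates"}


def extract_features(tokens, i, j, window=2):
    """Build a sparse feature dict for the candidate protein pair at token
    indices i and j (i < j). Names and window size are assumptions."""
    if not 0 <= i < j < len(tokens):
        raise ValueError("expected 0 <= i < j < len(tokens)")

    feats = {}

    # Word features: words between the two protein names.
    between = [w.lower() for w in tokens[i + 1:j]]
    for w in between:
        feats["between=" + w] = 1.0

    # Word features: a small context window before the first protein
    # and after the second protein.
    for w in tokens[max(0, i - window):i]:
        feats["before=" + w.lower()] = 1.0
    for w in tokens[j + 1:j + 1 + window]:
        feats["after=" + w.lower()] = 1.0

    # Keyword feature: does an interaction keyword occur between the names?
    feats["has_keyword"] = 1.0 if set(between) & INTERACTION_KEYWORDS else 0.0

    # Protein name distance feature: number of tokens separating the names.
    feats["distance"] = float(j - i - 1)

    return feats


tokens = "PROT1 directly binds PROT2 in yeast cells".split()
features = extract_features(tokens, 0, 3)
```

Each resulting dictionary could then be vectorized and passed to any SVM implementation; which kernel and feature weighting the paper uses is not stated in the abstract.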