The role of syntactic features in protein interaction extraction

  • Authors:
  • Timur Fayruzov; Martine De Cock; Chris Cornelis; Veronique Hoste

  • Affiliations:
  • Ghent University, Ghent, Belgium; Ghent University, Ghent, Belgium; Ghent University, Ghent, Belgium; University College Ghent, Ghent, Belgium

  • Venue:
  • Proceedings of the 2nd international workshop on Data and text mining in bioinformatics
  • Year:
  • 2008

Abstract

Most approaches to protein interaction mining from biomedical texts use both lexical and syntactic features. However, the individual impact of these two kinds of features on the effectiveness of the mining process has not yet been thoroughly studied. In this paper, we perform such a study on a recently published state-of-the-art support vector machine approach that uses both lexical and syntactic features. To this end, we strip this approach down to an algorithm that uses only a subset of the initial syntactic features. Next, we compare the original and the stripped-down method by evaluating them on 5 benchmark datasets as well as by performing 5 additional cross-dataset experiments. Although the original method exploits a very rich feature set including words, parts of speech, and grammatical relations, it is not significantly better than the stripped-down version; in fact, the former does not even consistently outperform the latter.