Sentence filtering for BioNLP: searching for renaming acts

  • Authors:
  • Pierre Warnier;Claire Nédellec

  • Affiliations:
  • MIG INRA UR, Jouy-en-Josas, France and LIG Université de Grenoble, France;MIG INRA UR, Jouy-en-Josas, France

  • Venue:
  • BioNLP Shared Task '11 Proceedings of the BioNLP Shared Task 2011 Workshop
  • Year:
  • 2011

Abstract

The Bacteria Gene Renaming (RENAME) task is a supporting task in the BioNLP Shared Task 2011 (BioNLP-ST'11). The task consists of extracting gene renaming acts and gene synonymy reminders from scientific texts about bacteria. In this paper, we present our method in detail, in three main steps: 1) segmentation of the document into sentences, 2) removal of the sentences that contain no renaming act (false positives), using both a gene nomenclature and supervised machine learning (feature selection and SVM), 3) linking of gene names by the target renaming relation in each sentence. Our system ranked third in the official test with an F-measure of 64.4%. We also present an effective post-competition improvement: representing, as SVM features, regular expressions that detect combinations of trigger words. This increases the F-measure to 73.1%.
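
The sentence filtering idea and the regular-expression features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the actual expressions, the nomenclature, or the feature set, so the gene-name pattern, the trigger patterns, and all function names below are hypothetical stand-ins chosen for illustration. The sketch splits a text into sentences, keeps only sentences that mention at least two gene-like names, and encodes each sentence as a binary vector with one entry per trigger pattern. Such a vector could then be passed to an SVM.

```python
import re

# ASSUMPTION: a crude stand-in for a gene nomenclature, matching names made of
# three lowercase letters followed by one uppercase letter (e.g. "yfhP").
GENE_RE = re.compile(r"\b[a-z]{3}[A-Z]\b")

# ASSUMPTION: hypothetical trigger-word combinations, not the paper's own.
TRIGGER_PATTERNS = [
    re.compile(r"\brenamed\b", re.IGNORECASE),
    re.compile(
        r"\b(?:formerly|previously|originally)\s+"
        r"(?:called|named|designated|known\s+as)\b",
        re.IGNORECASE,
    ),
    re.compile(r"\(\s*(?:formerly|previously)\b", re.IGNORECASE),
]

# Naive segmentation: split after ., ! or ? when followed by whitespace and
# an uppercase letter. Real segmenters handle abbreviations and more.
SENTENCE_SPLIT_RE = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(text):
    """Split a text into sentences with the naive rule above."""
    return [s.strip() for s in SENTENCE_SPLIT_RE.split(text) if s.strip()]


def gene_mentions(sentence):
    """Return the gene-like names found in a sentence."""
    return GENE_RE.findall(sentence)


def trigger_features(sentence):
    """Encode a sentence as a binary vector, one entry per trigger pattern."""
    return [1 if p.search(sentence) else 0 for p in TRIGGER_PATTERNS]


def candidate_sentences(text):
    """Keep sentences with at least two gene-like names, since a renaming
    relation links two names."""
    return [s for s in split_sentences(text) if len(gene_mentions(s)) >= 2]


if __name__ == "__main__":
    text = (
        "The yfhP gene was renamed sigX. "
        "We studied growth at 37 degrees. "
        "The ykvP gene (formerly called spoA) is essential."
    )
    for sentence in candidate_sentences(text):
        print(trigger_features(sentence), sentence)
```

In this sketch the two-name filter plays the role of the nomenclature-based removal step, and the binary vectors play the role of the trigger-combination features. A real system would combine them with other features and learn the decision boundary from annotated data.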