Improving term extraction with terminological resources

  • Authors:
  • Sophie Aubin;Thierry Hamon

  • Affiliations:
  • UMR CNRS 7030, LIPN, Villetaneuse;UMR CNRS 7030, LIPN, Villetaneuse

  • Venue:
  • FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Studies of different term extractors on a corpus of the biomedical domain revealed decreasing performances when applied to highly technical texts. Facing the difficulty or impossibility to customize existing tools, we developed a tunable term extractor. It exploits linguistic-based rules in combination with the reuse of existing terminologies, i.e. exogenous disambiguation. Experiments reported here show that the combination of the two strategies allows the extraction of a greater number of term candidates with a higher level of reliability. We further describe the extraction process involving both endogenous and exogenous disambiguation implemented in the term extractor $\rm Y\kern-.36em \lower.7ex\hbox{A}\kern-.25em T\kern-.1667em\lower.7ex\hbox{E}\kern-.08emA$.