Exclusion-inclusion based text categorization of biomedical articles

  • Authors: Nadia Zerida; Nadine Lucas; Bruno Crémilleux

  • Affiliations: University of Caen; University of Caen; University of Caen

  • Venue: Proceedings of the 2007 ACM Symposium on Document Engineering

  • Year: 2007

Abstract

In this paper, we propose a new approach to categorizing biomedical articles, based on two original principles. First, we combine linguistic, structural, and metric descriptors to build patterns using data mining techniques. Second, we take into account the importance of absent patterns for the categorization task by using an exclusion-inclusion method. To avoid a sharp cutoff between the absence and the presence of a pattern, the exclusion-inclusion method uses two regret measures that quantify the interest of a weak pattern relative to the other classes and relative to the other patterns of the same class. The global decision generalizes the local patterns: it first applies the patterns that exclude classes, then relies on the regret ratios. Experiments show the effectiveness of the approach.
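
The two-stage decision described in the abstract (exclude classes first, then rank the remaining ones) can be illustrated with a minimal sketch. The abstract does not define the regret measures, so the score below is a simple stand-in (the matched fraction of a class's pattern weight) and is not the authors' formula. The function name `classify`, the data layout, and all pattern and class names are hypothetical.

```python
def classify(doc_patterns, exclusion_patterns, class_patterns, classes):
    """Two-stage decision: exclude classes first, then score the survivors.

    doc_patterns:       set of patterns found in the document.
    exclusion_patterns: dict mapping a pattern to the set of classes it rules out.
    class_patterns:     dict mapping a class to {pattern: weight}
                        (weights are hypothetical pattern strengths).
    classes:            list of all candidate class labels.
    """
    # Stage 1 (exclusion): drop every class ruled out by a pattern in the document.
    excluded = set()
    for pattern in doc_patterns:
        excluded |= exclusion_patterns.get(pattern, set())
    candidates = [c for c in classes if c not in excluded]
    if not candidates:
        # Assumption: if every class was excluded, fall back to all classes.
        candidates = list(classes)

    # Stage 2 (inclusion): placeholder score, NOT the paper's regret ratio.
    # It is the fraction of a class's total pattern weight matched by the document.
    scores = {}
    for c in candidates:
        weights = class_patterns.get(c, {})
        total = sum(weights.values())
        matched = sum(w for p, w in weights.items() if p in doc_patterns)
        scores[c] = matched / total if total else 0.0

    best = max(candidates, key=lambda c: scores[c])
    return best, scores


# Hypothetical toy data, for illustration only.
classes = ["review", "research", "clinical"]
exclusion_patterns = {"no_methods_section": {"research"}}
class_patterns = {
    "review": {"survey_verb": 2.0, "many_refs": 1.0},
    "research": {"methods_heading": 3.0},
    "clinical": {"patient_term": 1.0, "many_refs": 1.0},
}
doc = {"no_methods_section", "many_refs", "survey_verb"}

best, scores = classify(doc, exclusion_patterns, class_patterns, classes)
print(best, scores)
```

In this toy run the class "research" is removed during the exclusion stage and never scored, so weak evidence for the remaining classes is only compared among candidates that were not ruled out.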