Combining linguistic and structural descriptors for mining biomedical literature
Proceedings of the 2006 ACM symposium on Document engineering
In this paper, we propose a new approach to categorizing biomedical articles, based on two original principles. First, we combine linguistic, structural and metric descriptors to build patterns using data mining techniques. Second, we take into account the importance of absent patterns for the categorization task by using an exclusion-inclusion method. To avoid an abrupt (crisp) transition between the absence and the presence of a pattern, the exclusion-inclusion method uses two regret measures to quantify the interest of a weak pattern: one relative to the other classes, and one relative to the other patterns of the same class. The global decision generalizes the local patterns, first by applying the patterns that exclude classes, then by using the regret ratios. Experiments show the effectiveness of the approach.