Evaluating and optimizing autonomous text classification systems
SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
An Evaluation of Statistical Approaches to Text Categorization
Information Retrieval
A study of thresholding strategies for text categorization
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
Combining machine learning and hierarchical structures for text categorization
RCV1: A New Benchmark Collection for Text Categorization Research
The Journal of Machine Learning Research
Developing Multi-Agent Systems with JADE (Wiley Series in Agent Technology)
Classifying web documents in a hierarchy of categories: a comprehensive study
Journal of Intelligent Information Systems
A comparative study of thresholding strategies in progressive filtering
AI*IA'11 Proceedings of the 12th international conference on Artificial intelligence around man and beyond
Most of the research on text categorization has focused on mapping text documents to a set of categories among which structural relationships hold, i.e., on hierarchical text categorization. In solutions to a hierarchical problem that use an ensemble of classifiers, the behavior of each classifier typically depends on an acceptance threshold, which turns a degree of membership into a dichotomous (accept or reject) decision. In principle, finding the best acceptance thresholds for a set of classifiers linked by taxonomic relationships is a hard problem. Hence, effective ways of finding suboptimal solutions to this problem may be of great importance. In this paper, we assess a greedy threshold selection algorithm aimed at finding a suboptimal combination of thresholds in a hierarchical text categorization setting. Comparative experiments, performed on Reuters data, compare the proposed threshold selection algorithm with a relaxed brute-force algorithm and with two state-of-the-art algorithms. The results highlight the effectiveness of the approach.
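To make the notion of an acceptance threshold concrete, the following is a minimal Python sketch of a generic greedy, level-by-level threshold search. It is not the algorithm assessed in the paper, whose details the abstract does not give. The names `pipeline_accepts`, `f1` and `greedy_thresholds`, the use of F1 as the utility function, the fixed candidate grid, and the assumption that each document carries one membership score per classifier on a single root-to-leaf path are all illustrative assumptions.

```python
from typing import List, Sequence


def pipeline_accepts(scores: Sequence[float], thresholds: Sequence[float]) -> bool:
    """Accept a document only if every classifier on the path accepts it.

    Each threshold turns a degree of membership into an accept/reject decision.
    """
    return all(s >= t for s, t in zip(scores, thresholds))


def f1(docs: Sequence[Sequence[float]], labels: Sequence[bool],
       thresholds: Sequence[float]) -> float:
    """F1 of the whole pipeline (assumed utility function, not from the paper)."""
    tp = fp = fn = 0
    for scores, label in zip(docs, labels):
        predicted = pipeline_accepts(scores, thresholds)
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def greedy_thresholds(docs: Sequence[Sequence[float]], labels: Sequence[bool],
                      grid: Sequence[float]) -> List[float]:
    """Tune one threshold at a time, top level first, keeping the others fixed.

    This examines len(grid) * depth combinations instead of len(grid) ** depth,
    so the result is in general suboptimal.
    """
    depth = len(docs[0])
    thresholds = [grid[0]] * depth
    for level in range(depth):
        best_t, best_score = thresholds[level], -1.0
        for t in grid:
            candidate = thresholds[:level] + [t] + thresholds[level + 1:]
            score = f1(docs, labels, candidate)
            if score > best_score:  # ties keep the lowest candidate
                best_t, best_score = t, score
        thresholds[level] = best_t
    return thresholds


# Toy data: (score at level 0, score at level 1) per document.
docs = [(0.9, 0.8), (0.9, 0.3), (0.4, 0.9), (0.8, 0.7)]
labels = [True, False, False, True]
grid = [0.0, 0.25, 0.5, 0.75]
best = greedy_thresholds(docs, labels, grid)
```

On this toy data the search settles on a threshold of 0.5 at both levels, which rejects the two negative documents and keeps the two positive ones. An exhaustive search would evaluate every combination in the grid, which is what makes a greedy shortcut attractive as the taxonomy grows deeper.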