A comparative study of thresholding strategies in progressive filtering

Authors:
Andrea Addis;Giuliano Armano;Eloisa Vargiu
Affiliations:
University of Cagliari, Department of Electrical and Electronic Engineering;University of Cagliari, Department of Electrical and Electronic Engineering;University of Cagliari, Department of Electrical and Electronic Engineering
Venue:
AI*IA'11 Proceedings of the 12th international conference on Artificial intelligence around man and beyond
Year:
2011

Abstract

Thresholding strategies in automated text categorization are an underexplored area of research. They are often treated as a post-processing step of minor importance, on the assumptions that they make no difference to the performance of a classifier and that finding the optimal thresholding strategy for a given classifier is trivial. Neither of these assumptions is true. In this paper, we concentrate on progressive filtering, a hierarchical text categorization technique that relies on a local-classifier-per-node approach and thus mirrors the underlying taxonomy of categories. The paper assesses TSA, a greedy threshold selection algorithm, against a relaxed brute-force algorithm and the most relevant state-of-the-art algorithms. Experiments performed on Reuters confirm the validity of TSA.
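
To make the role of per-node thresholds concrete, the following is a minimal illustrative sketch of a local-classifier-per-node filter over a small taxonomy. It is not the authors' implementation and it does not implement TSA. The names `Node`, `keyword_scorer`, `progressive_filter`, the toy taxonomy, and the threshold values are all assumptions introduced for illustration. The only idea taken from the abstract is that each node has its own classifier and that a document reaches a child category only if the parent's classifier accepts it.

```python
from typing import Callable, List, Optional, Tuple

Scorer = Callable[[str], float]


class Node:
    """One category in the taxonomy, with its own (hypothetical) classifier."""

    def __init__(self, name: str, scorer: Scorer, threshold: float,
                 children: Optional[List["Node"]] = None) -> None:
        self.name = name
        self.scorer = scorer          # returns a score in [0, 1]
        self.threshold = threshold    # accept when score >= threshold
        self.children = children or []


def keyword_scorer(keywords: List[str]) -> Scorer:
    """Toy stand-in for a trained classifier: fraction of keywords present."""
    def score(doc: str) -> float:
        words = set(doc.lower().split())
        return sum(k in words for k in keywords) / len(keywords)
    return score


def progressive_filter(doc: str, node: Node,
                       path: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return every category path whose classifiers all accept the document."""
    if node.scorer(doc) < node.threshold:
        return []                     # rejected here: descendants never see it
    here = path + (node.name,)
    accepted = [here]
    for child in node.children:
        accepted.extend(progressive_filter(doc, child, here))
    return accepted


# Hypothetical two-level taxonomy with hand-picked thresholds.
taxonomy = Node("economy", keyword_scorer(["market", "trade"]), 0.5, [
    Node("markets", keyword_scorer(["stock", "shares"]), 0.5),
    Node("agriculture", keyword_scorer(["wheat", "crop"]), 0.5),
])
doc = "stock market shares fell"
print(progressive_filter(doc, taxonomy))
```

In this sketch, raising the root threshold from 0.5 to 0.6 causes the same document to be rejected at the root, so no descendant category can accept it. This sensitivity to threshold values is why the choice of thresholds matters in a hierarchical setting.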