Ensemble of binary learners for reliable text categorization with a reject option

Authors:
Giuliano Armano;Camelia Chira;Nima Hatami
Affiliations:
Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy;Department of Computer Science, Babes-Bolyai University, Cluj-Napoca, Romania;Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari, Italy
Venue:
HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I
Year:
2012

Abstract

Text categorization is a key task in information retrieval and natural language processing. Attaching a reliability measure to the assignment of a text document to a category can improve the recognition rate and tells the user how much confidence to place in the output. A novel reliability measure is proposed, derived from the outputs of different binary classifiers run in the Error-Correcting Output Codes (ECOC) framework. The main idea is that a document assigned to a category is more reliably classified when the ECOC-computed distance separating that category from the next-ranked category is larger. This idea underlies the proposed ECOC-based text classifier with a reject option. Experiments on several commonly used text categorization benchmark datasets demonstrate the potential of the proposed method.
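
The decoding idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance function or the exact reliability formula, so this example makes several assumptions: Hamming distance is used for decoding, reliability is taken as the gap between the distances to the best and second-best codewords, and the code matrix, category names, and the `threshold` parameter are hypothetical values chosen only for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple

# Hypothetical 3-class code matrix: one 6-bit codeword per category.
# Each bit position corresponds to one binary learner (assumed, not from the paper).
CODE_MATRIX: Dict[str, Tuple[int, ...]] = {
    "sport":    (1, 1, 0, 0, 1, 0),
    "politics": (0, 1, 1, 0, 0, 1),
    "tech":     (1, 0, 1, 1, 0, 0),
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def decode_with_reject(
    outputs: Sequence[int],
    code_matrix: Dict[str, Tuple[int, ...]],
    threshold: int,
) -> Tuple[Optional[str], int]:
    """Decode binary-learner outputs; reject when the margin is too small.

    Returns (label, margin), where label is None if the document is rejected.
    The margin is the distance to the second-ranked codeword minus the
    distance to the top-ranked one (an assumed reliability proxy).
    """
    ranked = sorted(
        (hamming(outputs, codeword), label)
        for label, codeword in code_matrix.items()
    )
    (best_dist, best_label), (second_dist, _) = ranked[0], ranked[1]
    margin = second_dist - best_dist
    if margin < threshold:
        return None, margin  # too close to the next-ranked category
    return best_label, margin


if __name__ == "__main__":
    # One learner disagrees with the "sport" codeword: still accepted.
    print(decode_with_reject((1, 1, 0, 0, 1, 1), CODE_MATRIX, threshold=2))
    # Equidistant from all codewords: rejected.
    print(decode_with_reject((1, 1, 1, 0, 0, 0), CODE_MATRIX, threshold=2))
```

Raising `threshold` rejects more documents and keeps only those whose top-ranked category is well separated from the runner-up, which is the trade-off a reject option is meant to expose.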