Text representation in multi-label classification: two new input representations

Authors:
Rodrigo Alfaro;Héctor Allende
Affiliations:
Universidad Técnica Federico Santa María, Chile and Pontificia Universidad Católica de Valparaíso, Chile;Universidad Técnica Federico Santa María, Chile and Universidad Adolfo Ibáñez, Chile
Venue:
ICANNGA'11 Proceedings of the 10th international conference on Adaptive and natural computing algorithms - Volume Part II
Year:
2011

Abstract

Automatic text classification is the task of assigning unseen documents to a predefined set of classes. Text representation for classification has traditionally relied on the vector space model because of its simplicity and good performance. Multi-label automatic text classification, in turn, has typically been addressed either by transforming the problem so that binary techniques can be applied or by adapting binary algorithms to work with multiple labels. In this paper we present two new representations for text documents, based on label-dependent term weighting, for multi-label classification. We focus on modifying the input representation. Performance was tested on a well-known dataset and compared with alternative techniques. Experimental results based on Hamming loss analysis show an improvement over the alternative approaches.
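
The abstract reports results in terms of Hamming loss but does not define it. As a minimal sketch, assuming the standard definition (the fraction of document-label pairs that are predicted incorrectly), it can be computed as below. The function name, the label names, and the toy data are illustrative assumptions and do not come from the paper.

```python
from typing import Sequence, Set


def hamming_loss(
    true_labels: Sequence[Set[str]],
    predicted_labels: Sequence[Set[str]],
    num_labels: int,
) -> float:
    """Fraction of (document, label) pairs that are misclassified.

    Assumes the standard multi-label definition: for each document, count
    the labels in the symmetric difference between the true and predicted
    label sets, then normalize by documents * labels.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted label lists differ in length")
    if not true_labels or num_labels <= 0:
        raise ValueError("need at least one document and one label")
    # t ^ p is the symmetric difference: labels wrongly added or wrongly missed.
    errors = sum(len(t ^ p) for t, p in zip(true_labels, predicted_labels))
    return errors / (len(true_labels) * num_labels)


# Toy example (hypothetical data): 3 documents, 3 possible labels.
y_true = [{"sports"}, {"politics", "economy"}, {"economy"}]
y_pred = [{"sports"}, {"politics"}, {"economy", "sports"}]
loss = hamming_loss(y_true, y_pred, num_labels=3)
print(f"Hamming loss: {loss:.4f}")
```

In this toy case one label is missed for the second document and one is wrongly added for the third, so two of the nine document-label pairs are wrong. Lower values are better, and a loss of 0 means every label set was predicted exactly.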