Improving Text Classification Performance with Incremental Background Knowledge

Authors:
Catarina Silva; Bernardete Ribeiro
Affiliations:
School of Technology and Management, Polytechnic Institute of Leiria, Portugal and Dep. Informatics Eng., Center Informatics and Systems, Univ. of Coimbra, Portugal; Dep. Informatics Eng., Center Informatics and Systems, Univ. of Coimbra, Portugal
Venue:
ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
Year:
2009

Abstract

Text classification is generally the process of extracting interesting and non-trivial information and knowledge from text. One of the main problems with text classification systems is the lack of labeled data, compounded by the cost of labeling unlabeled data. Thus, there is growing interest in using unlabeled data to improve text classification performance. The ready availability of this kind of data in most applications makes it an appealing source of information. In this work, we propose an Incremental Background Knowledge (IBK) technique that introduces unlabeled data into the training set, expanding it by using initial classifiers to deliver oracle decisions. The resulting incremental, margin-based SVM method was tested on the Reuters-21578 benchmark, showing promising results.
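
The abstract does not spell out the IBK procedure, so the following is only a minimal sketch of one plausible reading: an initial classifier trained on the labeled data acts as the oracle, unlabeled examples whose margin magnitude reaches a threshold are added to the training set with their predicted labels, and the classifier is retrained until no further examples qualify. The threshold `delta`, the round limit, every function name, and the toy linear SVM (plain subgradient descent on the hinge loss, standing in for a real SVM package) are assumptions made for illustration and are not taken from the paper.

```python
def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=100):
    """Toy linear SVM: full-batch subgradient descent on the L2-regularized
    hinge loss. A stand-in for a real SVM library (assumption)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1:  # example violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j]
                grad_b -= yi
        w = [wj - lr * (lam * wj + gj / n) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def decision(w, b, x):
    """Signed score f(x) = w.x + b; its magnitude is used as the margin."""
    return dot(w, x) + b


def predict(w, b, x):
    return 1 if decision(w, b, x) > 0 else -1


def incremental_background_knowledge(X_lab, y_lab, X_unl, delta=1.0, max_rounds=10):
    """Hypothetical IBK loop: grow the training set with unlabeled examples
    that the current classifier scores with |f(x)| >= delta (assumed rule)."""
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unl)
    w, b = train_linear_svm(X_train, y_train)
    for _ in range(max_rounds):
        scores = [decision(w, b, x) for x in pool]
        confident = [(x, s) for x, s in zip(pool, scores) if abs(s) >= delta]
        if not confident:
            break  # no unlabeled example is classified confidently enough
        for x, s in confident:
            X_train.append(x)
            y_train.append(1 if s > 0 else -1)  # classifier acts as the oracle
        pool = [x for x, s in zip(pool, scores) if abs(s) < delta]
        w, b = train_linear_svm(X_train, y_train)  # retrain on expanded set
    return (w, b), X_train, y_train, pool


if __name__ == "__main__":
    # Toy 2-D data, invented for this sketch (not Reuters-21578).
    X_lab, y_lab = [[2.0, 2.0], [-2.0, -2.0]], [1, -1]
    X_unl = [[3.0, 3.0], [-3.0, -3.0], [0.2, -0.1]]
    model, X_train, y_train, pool = incremental_background_knowledge(X_lab, y_lab, X_unl)
    print(len(X_train), "training examples;", len(pool), "left unlabeled")
```

In this reading, examples near the decision boundary are never auto-labeled, which limits the risk of reinforcing the initial classifier's mistakes; a lower `delta` admits more unlabeled data at the price of noisier labels.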