Using Clustering and Co-Training to Boost Classification Performance

Authors:
Antonia Kyriakopoulou
Affiliations:
-
Venue:
ICTAI '07 Proceedings of the 19th IEEE International Conference on Tools with Artificial Intelligence - Volume 02
Year:
2007

Citing 0
Cited 2

Unsupervised and supervised learning in cascade for petroleum geology
Expert Systems with Applications: An International Journal

The impact of semi-supervised clustering on text classification
Proceedings of the 17th Panhellenic Conference on Informatics

Abstract

This paper shows that the performance of a linear SVM classifier can be improved by utilizing meta-information derived from clustering. Clustering aims at discovering extra knowledge about the structure of the whole dataset (both the training and the testing set). A co-training algorithm is introduced that uses clustering as a complementary step to text classification. At each iteration of the algorithm, the clustering phase augments the feature space with a new meta-feature that reflects each document's cluster membership, and the classification phase introduces another meta-feature that indicates class membership. Experimental results obtained on widely used datasets demonstrate the effectiveness of the proposed approaches, especially for small training sets.
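
The iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cotrain_with_clustering`, the use of k-means as the clustering method, the encoding of each meta-feature as a single numeric column, the use of predicted (rather than true) classes for training documents, and all parameter values are assumptions made for illustration. It relies on NumPy and scikit-learn and assumes integer class labels.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC


def cotrain_with_clustering(X_train, y_train, X_test,
                            n_clusters=2, n_iter=3, seed=0):
    """Hypothetical sketch: alternate a clustering phase and a
    classification phase, each appending one meta-feature column."""
    if n_iter < 1:
        raise ValueError("n_iter must be at least 1")
    n_train = X_train.shape[0]
    # Clustering sees the whole dataset: training and test documents together.
    X_all = np.vstack([X_train, X_test]).astype(float)
    class_ids = None
    for it in range(n_iter):
        # Clustering phase: meta-feature = cluster id of each document
        # (k-means is an assumed choice of clustering method).
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed + it)
        cluster_ids = km.fit_predict(X_all)
        X_all = np.hstack([X_all, cluster_ids[:, None]])
        # Classification phase: fit a linear SVM on the labeled rows only,
        # then use its predicted class for every document as a meta-feature.
        clf = LinearSVC(random_state=seed).fit(X_all[:n_train], y_train)
        class_ids = clf.predict(X_all)
        X_all = np.hstack([X_all, class_ids[:, None]])
    # Predictions for the test documents from the last iteration,
    # plus the augmented feature matrix for inspection.
    return class_ids[n_train:], X_all


# Toy usage on two synthetic, well-separated groups with a small training set.
rng = np.random.default_rng(0)
X_tr = np.vstack([rng.normal(0, 1, (5, 4)), rng.normal(4, 1, (5, 4))])
y_tr = np.array([0] * 5 + [1] * 5)
X_te = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(4, 1, (20, 4))])
y_te = np.array([0] * 20 + [1] * 20)

pred, X_aug = cotrain_with_clustering(X_tr, y_tr, X_te)
print("test accuracy:", float(np.mean(pred == y_te)))
print("feature columns after augmentation:", X_aug.shape[1])
```

Each iteration adds two columns, so the augmented matrix has the original number of features plus `2 * n_iter`. Whether the paper encodes membership as a single column or in some other form is not stated in the abstract.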