The impact of semi-supervised clustering on text classification

Authors:
Antonia Kyriakopoulou; Theodore Kalamboukis
Affiliations:
Athens University of Economics and Business, Athens, Greece; Athens University of Economics and Business, Athens, Greece
Venue:
Proceedings of the 17th Panhellenic Conference on Informatics
Year:
2013

Abstract

This paper addresses the problem of learning to classify texts by exploiting information derived from clustering both the training and testing sets. Incorporating the knowledge obtained from clustering into the feature space representation of the texts is expected to boost the performance of a classifier. Two approaches to clustering are described, one unsupervised and one semi-supervised. We present an empirical study of the proposed algorithms on a variety of datasets. The results are encouraging and indicate that information derived from clustering can yield high-accuracy text classifiers.
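
The general idea of adding clustering information to the feature space can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the clustering method or how cluster knowledge is encoded, so the sketch assumes plain k-means over the union of training and testing vectors, one-hot cluster indicators appended as extra features, and, for the semi-supervised variant, centroids seeded from the class means of the labelled training documents. All function names and the toy vectors are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda k: sq_dist(point, centroids[k]))


def kmeans(points, init_centroids, n_iter=20):
    """Lloyd's k-means from the given initial centroids (assumed method)."""
    centroids = [list(c) for c in init_centroids]
    for _ in range(n_iter):
        assign = [nearest(p, centroids) for p in points]
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = mean(members)
    return [nearest(p, centroids) for p in points]


def cluster_features(train_x, train_y, test_x, semi_supervised=True, seed=0):
    """Cluster train and test vectors together, then append one-hot cluster ids."""
    all_x = train_x + test_x
    classes = sorted(set(train_y))
    k = len(classes)
    if semi_supervised:
        # Assumption: labels guide clustering by seeding one centroid per class.
        init = [mean([x for x, y in zip(train_x, train_y) if y == c]) for c in classes]
    else:
        # Unsupervised variant: start from k randomly chosen documents.
        init = random.Random(seed).sample(all_x, k)
    assign = kmeans(all_x, init)
    augmented = [
        list(p) + [1.0 if a == j else 0.0 for j in range(k)]
        for p, a in zip(all_x, assign)
    ]
    return augmented[: len(train_x)], augmented[len(train_x):]


# Toy two-term document vectors; values are illustrative only.
train_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_y = ["sport", "sport", "politics", "politics"]
test_x = [[0.8, 0.2], [0.2, 0.8]]

aug_train, aug_test = cluster_features(train_x, train_y, test_x)
unsup_train, unsup_test = cluster_features(train_x, train_y, test_x, semi_supervised=False)
```

The augmented training vectors would then be passed to any standard classifier, and the augmented testing vectors used at prediction time. Because the testing set takes part in clustering, this setup is transductive: the test documents must be available before the classifier is trained.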