Email classification with co-training

Authors:
Svetlana Kiritchenko; Stan Matwin
Affiliations:
University of Ottawa, Ottawa, ON, Canada; University of Ottawa, Ottawa, ON, Canada
Venue:
Proceedings of the 2011 Conference of the Center for Advanced Studies on Collaborative Research
Year:
2011

Abstract

The main problems in text classification are the scarcity of labeled data and the cost of labeling unlabeled data. We address these problems by exploring co-training, an algorithm that uses unlabeled data together with a few labeled examples to boost the performance of a classifier. We experiment with co-training in the email domain. Our results show that the performance of co-training depends on the learning algorithm it uses. In particular, Support Vector Machines significantly outperform Naive Bayes on email classification.
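
The co-training loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which two views of an email are used, so the split into subject tokens and body tokens is hypothetical, as are the names `TinyNB` and `co_train`, the `min_conf` threshold, and the toy messages. The base learner is passed in through `make_model`, so any learner exposing the same `fit` and `predict_proba` methods could be swapped in to compare learning algorithms.

```python
import math
from collections import Counter


class TinyNB:
    """Multinomial Naive Bayes with add-one smoothing over token lists."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc)
        self.vocab = set().union(*self.counts.values())
        return self

    def predict_proba(self, doc):
        n, v = sum(self.prior.values()), len(self.vocab)
        logp = {}
        for c in self.classes:
            total = sum(self.counts[c].values())
            lp = math.log(self.prior[c] / n)
            for w in doc:
                if w in self.vocab:  # tokens never seen in training are skipped
                    lp += math.log((self.counts[c][w] + 1) / (total + v))
            logp[c] = lp
        m = max(logp.values())
        z = sum(math.exp(l - m) for l in logp.values())
        return {c: math.exp(l - m) / z for c, l in logp.items()}


def co_train(labeled, unlabeled, make_model=TinyNB, rounds=10, min_conf=0.6):
    """labeled: [((view0, view1), label)]; unlabeled: [(view0, view1)]."""
    labeled, pool = list(labeled), list(unlabeled)

    def train():
        # One classifier per view, both trained on the current labeled set.
        ys = [y for _, y in labeled]
        return [make_model().fit([x[v] for x, _ in labeled], ys) for v in (0, 1)]

    for _ in range(rounds):
        if not pool:
            break
        picked = {}  # pool index -> label assigned by a classifier
        for v, model in enumerate(train()):
            best = {}  # class -> (confidence, pool index) of the top example
            for i, x in enumerate(pool):
                probs = model.predict_proba(x[v])
                label = max(probs, key=probs.get)
                conf = probs[label]
                if conf >= min_conf and (label not in best or conf > best[label][0]):
                    best[label] = (conf, i)
            for label, (_, i) in best.items():
                picked.setdefault(i, label)
        if not picked:
            break
        # Move the self-labeled examples from the pool into the labeled set.
        labeled += [(pool[i], y) for i, y in picked.items()]
        pool = [x for i, x in enumerate(pool) if i not in picked]
    return train(), labeled


# Hypothetical toy data: each email is (subject tokens, body tokens).
labeled = [
    ((["meeting", "agenda"], ["please", "review", "the", "agenda"]), "work"),
    ((["cheap", "offer"], ["buy", "now", "cheap", "pills"]), "spam"),
]
unlabeled = [
    (["meeting", "notes"], ["review", "notes", "attached"]),
    (["offer", "expires"], ["buy", "now", "limited", "deal"]),
    (["project", "notes"], ["attached", "project", "plan"]),
    (["limited", "deal"], ["cheap", "deal", "pills"]),
]
(subject_model, body_model), grown = co_train(labeled, unlabeled)
```

In each round, every view's classifier contributes at most one confidently labeled example per class, which keeps the growing training set from drifting toward a single class. That per-class selection is a design choice of this sketch.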