Email answering assistance by semi-supervised text classification

  • Authors: Tobias Scheffer
  • Affiliations: Humboldt-Universität zu Berlin, Department of Computer Science, Unter den Linden 6, 10099 Berlin, Germany. E-mail: scheffer@informatik.hu-berlin.de
  • Venue: Intelligent Data Analysis
  • Year: 2004

Abstract

Many individuals, organizations, and companies have to answer large numbers of emails. Often, many of these emails contain variations of relatively few frequently asked questions. We address the problem of predicting which of several frequently used answers a user will choose to respond to an email. We map this problem to a semi-supervised text classification problem. In a case study with emails sent to a corporate customer service department, we investigate the ability of naive Bayes and support vector classifiers to identify the appropriate answers to emails. We study how effectively the transductive Support Vector Machine and the co-training algorithm utilize unlabeled data, and investigate why co-training is only beneficial when very few labeled data are available. In addition, we describe a practical assistance system.
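
As a rough illustration of the mapping described in the abstract, the sketch below treats each frequently used answer as a class label and each incoming email as a document to classify. It is a minimal multinomial naive Bayes with Laplace smoothing, not the system or experimental setup from the paper. The class name `NaiveBayesAnswerPredictor`, the answer IDs, and the toy emails are all invented for illustration, and only the supervised part is shown; the transductive SVM and co-training steps that exploit unlabeled emails are not covered.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


class NaiveBayesAnswerPredictor:
    """Multinomial naive Bayes with Laplace smoothing.

    Each class is the ID of a frequently used answer (hypothetical labels).
    """

    def fit(self, emails, answer_ids):
        self.class_counts = Counter(answer_ids)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(emails, answer_ids):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total = len(answer_ids)
        return self

    def predict(self, email):
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(email) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(count / self.total)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented training data: email text paired with the answer that was sent.
emails = [
    "how do i reset my password",
    "i forgot my password please help",
    "where is my order",
    "my order has not arrived yet",
]
answers = ["password_reset", "password_reset", "order_status", "order_status"]

model = NaiveBayesAnswerPredictor().fit(emails, answers)
prediction = model.predict("please help me reset the password")
print(prediction)
```

In an assistance setting, the predicted answer ID would be used to suggest a stored reply that the user can accept or edit; how such a suggestion is presented is outside what this sketch covers.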