Refinement method of post-processing and training for improvement of automated text classification

  • Authors:
  • Yun Jeong Choi;Seung Soo Park

  • Affiliations:
  • Department of Computer Science & Engineering, Ewha Womans University, Seoul, Korea;Department of Computer Science & Engineering, Ewha Womans University, Seoul, Korea

  • Venue:
  • ICCSA'06 Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part II
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

The paper presents a method for improving text classification by using examples that are difficult to classify. Generally, researches to improve the text categorization performance are focused on enhancing existing classification models and algorithms itself, but the range of which has been limited by the feature-based statistical methodology. In this paper, we propose a new method to improve the accuracy and the performance using refinement training and post-processing. Especially, we focused on complex documents that are generally considered to be hard to classify. Our proposed method has a different style from traditional classification methods, and take a data mining strategy and fault tolerant system approaches. In experiments, we applied our system to documents which usually get low classification accuracy because they are laid on a decision boundary. The result shows that our system has high accuracy and stability in actual conditions.