Organising documents based on standard-example split test

  • Authors:
  • Kenta Fukuoka; Tomofumi Nakano; Nobuhiro Inuzuka

  • Affiliations:
  • Graduate School of Engineering, Nagoya Institute of Technology, Nagoya, Japan (all authors)

  • Venue:
  • KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
  • Year:
  • 2005

Abstract

One purpose of text mining is to summarise a large collection of documents. This paper proposes a new method for viewing a summary of a large document set. It consists of two techniques: the first constructs classification trees using a split test called the standard-example (standard-document) split test, and the second displays the features of each class of documents produced by the trees. The standard-example split test divides examples by their distance (or similarity) from a standard example, which is selected by a criterion. This is the first method to apply this test to text mining. The display method shows representative words for each document class, chosen to emphasise the features of that class.
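
To make the idea of a standard-example split concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify the document representation, the similarity measure, the threshold, or the selection criterion, so everything here is an assumption made for illustration: documents are bag-of-words term counts, similarity is cosine similarity, the threshold is fixed by hand, and the hypothetical `choose_standard` function simply prefers the document that yields the most balanced split.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Assumed representation: lower-cased whitespace tokens as term counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def standard_example_split(vectors, standard_index, threshold):
    """Divide document indices by similarity to the standard document."""
    standard = vectors[standard_index]
    near, far = [], []
    for i, vec in enumerate(vectors):
        if cosine_similarity(standard, vec) >= threshold:
            near.append(i)
        else:
            far.append(i)
    return near, far


def choose_standard(vectors, threshold):
    """Hypothetical criterion (not from the paper): most balanced split."""
    def imbalance(i):
        near, far = standard_example_split(vectors, i, threshold)
        return abs(len(near) - len(far))
    return min(range(len(vectors)), key=imbalance)


docs = [
    "decision tree split test",
    "decision tree classification",
    "text mining summary words",
    "text mining document summary",
]
vectors = [bag_of_words(d) for d in docs]
best = choose_standard(vectors, threshold=0.3)
near, far = standard_example_split(vectors, best, threshold=0.3)
print(best, near, far)
```

Applying the same split recursively to the `near` and `far` groups would yield a tree whose internal nodes are each labelled by a standard document, which is the general shape of classification tree the abstract describes.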