Organising documents based on standard-example split test

Authors:
Kenta Fukuoka;Tomofumi Nakano;Nobuhiro Inuzuka
Affiliations:
Graduate School of Engineering, Nagoya Institute of Technology, Nagoya, Japan;Graduate School of Engineering, Nagoya Institute of Technology, Nagoya, Japan;Graduate School of Engineering, Nagoya Institute of Technology, Nagoya, Japan
Venue:
KES'05 Proceedings of the 9th international conference on Knowledge-Based Intelligent Information and Engineering Systems - Volume Part I
Year:
2005

Citing 6
Cited 0

Ranking algorithms. In: Information retrieval
C4.5: programs for machine learning
A re-examination of text categorization methods. In: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
Machine learning in automated text categorization. In: ACM Computing Surveys (CSUR)
Induction of Decision Trees. In: Machine Learning
Text Categorization with Support Vector Machines: Learning with Many Relevant Features. In: ECML '98 Proceedings of the 10th European Conference on Machine Learning

Abstract

One purpose of text mining is to summarise a large collection of documents. This paper proposes a new method for viewing a summary of a large document set. It consists of two techniques: one constructs classification trees using a split test called the standard-example (standard-document) split test, and the other displays the features of each class of documents classified by the trees. The standard-example split test divides examples by their distance (or similarity) from a standard example, which is selected by a criterion. This is the first method to apply this test to text mining. The display method exhibits representative words of each document class that emphasise its features.
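
The abstract does not give the details of the split test, so the following is only a minimal sketch of the general idea of dividing documents by their similarity to a chosen standard document. Everything specific in it is an assumption made for illustration and is not taken from the paper: the bag-of-words representation, cosine similarity as the similarity measure, the fixed list of candidate thresholds, information gain as the selection criterion, and all function names.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Assumed representation: lower-cased whitespace tokens with counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def split_by_standard(docs, standard, threshold):
    """Divide document indices into those near to and far from the standard."""
    near, far = [], []
    for i, doc in enumerate(docs):
        if cosine_similarity(doc, standard) >= threshold:
            near.append(i)
        else:
            far.append(i)
    return near, far


def best_standard_split(docs, labels, thresholds=(0.2, 0.4, 0.6, 0.8)):
    """Try every document as the standard and keep the split with the
    highest information gain (an assumed criterion, not the paper's)."""
    base = entropy(labels)
    best = None
    for s_idx, standard in enumerate(docs):
        for t in thresholds:
            near, far = split_by_standard(docs, standard, t)
            if not near or not far:
                continue  # a split must produce two non-empty groups
            remainder = sum(
                len(part) / len(docs) * entropy([labels[i] for i in part])
                for part in (near, far)
            )
            gain = base - remainder
            if best is None or gain > best[0]:
                best = (gain, s_idx, t, near, far)
    return best


# Hypothetical toy data, for illustration only.
docs = [bag_of_words(t) for t in (
    "cat dog pet",
    "dog pet food",
    "stock market trade",
    "market trade price",
)]
labels = ["animals", "animals", "finance", "finance"]
result = best_standard_split(docs, labels)
```

Applying `best_standard_split` recursively to the near and far groups would grow a classification tree whose internal nodes each hold a standard document and a threshold. How the paper actually selects the standard document and the cut point is not stated in the abstract.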