Analysis and Design of a Decision Tree Based on Entropy Reduction and Its Application to Large Character Set Recognition

Authors:
Qing Ren Wang; Ching Y. Suen
Affiliations:
Department of Computer Science, Concordia University, Montreal, P. Q., Canada H3G 1M8 (both authors)
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
1984

Abstract

Based on a recursive process of reducing the entropy, the general decision tree classifier with overlap has been analyzed. Several theorems have been proposed and proved. When the number of pattern classes is very large, the theorems can reveal both the advantages of a tree classifier and the main difficulties in its implementation. Suppose H is Shannon's entropy measure of the given problem. The theoretical results indicate that the tree searching time can be minimized to the order O(H), but the error rate is also in the same order O(H) due to error accumulation. However, the memory requirement is in the order O(H exp(H)) which poses serious problems in the implementation of a tree classifier for a large number of classes. To solve these problems, several theorems related to the bounds on the search time, error rate, memory requirement and overlap factor in the design of a decision tree have been proposed and some principles have been established to analyze the behaviors of the decision tree. When applied to classify sets of 64, 450, and 3200 Chinese characters, respectively, the experimental results support the theoretical predictions. For 3200 classes, a very high recognition rate of 99.88 percent was achieved at a high speed of 873 samples/s when the experiment was conducted on a Cyber 172 computer using a high-level language.
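
To give a feel for the orders of growth quoted in the abstract, the following minimal Python sketch evaluates H and H exp(H) for the three class counts used in the experiments. It is an illustration, not the authors' procedure. It assumes equiprobable classes and natural logarithms (so that exp(H) equals the number of classes), and it ignores the constant factors hidden by the O(...) notation. The function names are hypothetical.

```python
import math


def shannon_entropy(priors):
    """Shannon entropy in nats; zero-probability classes contribute nothing."""
    return -sum(p * math.log(p) for p in priors if p > 0)


def uniform_priors(n_classes):
    """Assumption for illustration only: every class is equally likely."""
    return [1.0 / n_classes] * n_classes


def growth_terms(n_classes):
    """Return H and the two growth terms named in the abstract.

    These are the bare terms inside O(H) and O(H exp(H)); the constants
    are unknown, so the values are only meaningful relative to each other.
    """
    h = shannon_entropy(uniform_priors(n_classes))
    return {
        "H": h,                           # search time and error rate scale with this
        "memory_term": h * math.exp(h),   # memory requirement scales with this
    }


if __name__ == "__main__":
    for n in (64, 450, 3200):  # class counts from the reported experiments
        terms = growth_terms(n)
        print(f"{n:5d} classes: H = {terms['H']:.3f}, "
              f"H*exp(H) = {terms['memory_term']:.1f}")
```

Under these assumptions H grows only logarithmically with the number of classes, while the memory term grows slightly faster than linearly, which is the tension the abstract describes for large character sets.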