Induced states in a decision tree constructed by Q-learning

  • Authors:
  • Kao-Shing Hwang; Yu-Jen Chen; Wei-Cheng Jiang; Tsung-Wen Yang

  • Affiliations:
  • Department of Electrical Engineering, National Sun Yat-sen University, Taiwan; Department of Electrical Engineering, National Chung Cheng University, Taiwan; Department of Electrical Engineering, National Chung Cheng University, Taiwan; Department of Electrical Engineering, National Chung Cheng University, Taiwan

  • Venue:
  • Information Sciences: an International Journal

  • Year:
  • 2012

Abstract

This paper develops a tree-construction method based on the framework of reinforcement learning (RL). The induction of a decision tree is regarded as an RL problem, in which the optimal policy is the one that obtains the maximal accumulated information gain. The proposed approach consists of two stages. In the first, the emulation/demonstration stage, sensory-action data from a mechatronic system, or samples of training patterns, are generated by an operator or a simulator. The records of these emulation data are aggregated into components of the state space represented by a decision tree. State aggregation for discretization of the state space consists of two phases: split estimation and tree growing. In the split-estimation phase, an inducer estimates long-term evaluations of splits at visited nodes. In the tree-growing phase, the inducer grows the tree according to the predicted long-term evaluations, which are approximated by a neural network model. In the second stage, the learned behavior or classifier is shaped by the RL scheme over the discretized state space constructed by the decision tree derived from the first stage. Unlike the conventional greedy procedure for constructing and pruning a tree, the proposed method casts the sequential process of tree induction as policy iteration, in which policies for node splitting are evaluated and improved repeatedly until an optimal or near-optimal policy is obtained. The splitting criterion, regarded as an action policy, is based on long-term evaluations of payoff rather than on immediate evaluations of impurity. A comparison with CART (classification and regression trees) and C4.5 on several benchmark datasets is presented. Furthermore, to demonstrate its application to learning control, the proposed method is used to construct a tree-based reinforcement learning method, in which the learning mechanism operates on a discrete state space derived from the proposed method. The results show the feasibility and high performance of the proposed system as a state-partitioning mechanism in comparison with the well-known Adaptive Heuristic Critic (AHC) model.
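
The abstract describes the split-selection mechanism only verbally; below is a minimal sketch, in Python, of the core idea of treating candidate node splits as Q-learning actions whose reward is information gain. It assumes axis-aligned binary splits on numeric features, a tabular Q-function, and a one-step lookahead at the children standing in for the paper's neural-network approximation of long-term evaluations; all names and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np
from collections import defaultdict

def entropy(y):
    """Shannon entropy of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def information_gain(X, y, feature, threshold):
    """Immediate reward for a binary split: reduction in entropy."""
    left = X[:, feature] <= threshold
    if left.all() or not left.any():
        return 0.0
    w = left.mean()
    return entropy(y) - w * entropy(y[left]) - (1 - w) * entropy(y[~left])

def candidate_splits(X):
    """Candidate actions: midpoints between consecutive feature values."""
    actions = []
    for f in range(X.shape[1]):
        v = np.unique(X[:, f])
        actions += [(f, t) for t in (v[:-1] + v[1:]) / 2]
    return actions

def q_learned_split(X, y, episodes=100, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Pick a split at one node by Q-learning over (feature, threshold)
    actions. A one-step lookahead at the children substitutes for the
    paper's neural-network estimate of long-term evaluations."""
    rng = np.random.default_rng(seed)
    actions = candidate_splits(X)
    Q = defaultdict(float)
    for _ in range(episodes):
        # epsilon-greedy selection of a split action at this node
        if rng.random() < eps:
            f, t = actions[rng.integers(len(actions))]
        else:
            f, t = max(actions, key=lambda a: Q[a])
        r = information_gain(X, y, f, t)   # immediate payoff of the split
        left = X[:, f] <= t
        future = 0.0                       # best gain achievable in children
        for m in (left, ~left):
            if m.sum() > 1:
                gains = [information_gain(X[m], y[m], g, u)
                         for g, u in candidate_splits(X[m])]
                if gains:
                    future = max(future, max(gains))
        # Q-learning update toward the accumulated (discounted) gain
        Q[(f, t)] += alpha * (r + gamma * future - Q[(f, t)])
    return max(actions, key=lambda a: Q[a])
```

Calling q_learned_split(X, y) at a node returns the (feature, threshold) pair with the highest learned Q-value. A greedy inducer such as CART or C4.5 would instead take the split maximizing information_gain alone; folding the discounted future term into the update is what distinguishes the long-term payoff criterion sketched here from the immediate impurity criterion the abstract contrasts it with.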