This paper develops a tree-construction method based on the framework of reinforcement learning (RL). The induction of a decision tree is cast as an RL problem in which the optimal policy maximizes the accumulated information gain. The proposed approach consists of two stages. In the first, the emulation/demonstration stage, sensory-action data from a mechatronic system, or samples of training patterns, are generated by an operator or a simulator. The records of these emulation data are aggregated into components of a state space represented by a decision tree. State aggregation for discretization of the state space proceeds in two phases: split estimation and tree growing. In the split-estimation phase, an inducer estimates long-term evaluations of splits at visited nodes. In the tree-growing phase, the inducer grows the tree according to the predicted long-term evaluations, which are approximated by a neural network model. In the second stage, the learned behavior or classifier is shaped by an RL scheme operating on the discretized state space constructed from the decision tree derived in the first stage. Unlike the conventional greedy procedure for constructing and pruning a tree, the proposed method casts the sequential process of tree induction as policy iteration, in which policies for node splitting are evaluated and improved repeatedly until an optimal or near-optimal policy is obtained. The splitting criterion, regarded as an action policy, is based on long-term evaluations of payoff rather than on immediate impurity measures. A comparison with CART (classification and regression trees) and C4.5 on several benchmark datasets is presented. Furthermore, to demonstrate its application to learning control, the proposed method is used to construct a tree-based reinforcement learning scheme that operates on the discrete state space it derives.
The results show the feasibility and strong performance of the proposed approach as a state-partitioning method in comparison with the well-known Adaptive Heuristic Critic (AHC) model.
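The contrast between greedy impurity-based splitting and long-term evaluation of splits can be illustrated with a minimal sketch. This is not the authors' implementation: the one-step-lookahead Q-value with a hypothetical discount factor `gamma` stands in for the paper's neural-network-approximated long-term evaluation, and the reward is the immediate information gain of a split.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def split(data, feat, thresh):
    """Partition (x, y) pairs by thresholding feature `feat`."""
    left = [(x, y) for x, y in data if x[feat] <= thresh]
    right = [(x, y) for x, y in data if x[feat] > thresh]
    return left, right

def info_gain(data, feat, thresh):
    """Immediate reward: reduction in entropy produced by the split."""
    left, right = split(data, feat, thresh)
    if not left or not right:
        return 0.0
    n = len(data)
    return entropy([y for _, y in data]) \
        - (len(left) / n) * entropy([y for _, y in left]) \
        - (len(right) / n) * entropy([y for _, y in right])

def candidates(data, feat):
    """Candidate thresholds: midpoints between consecutive feature values."""
    vals = sorted({x[feat] for x, _ in data})
    return [(a + b) / 2 for a, b in zip(vals, vals[1:])]

def best_immediate(data):
    """Greedy criterion: split maximizing immediate information gain."""
    best = (0.0, None)
    for feat in range(len(data[0][0])):
        for t in candidates(data, feat):
            g = info_gain(data, feat, t)
            if g > best[0]:
                best = (g, (feat, t))
    return best

def q_value(data, feat, thresh, gamma=0.9):
    """Long-term evaluation (illustrative): immediate gain plus a
    discounted one-step lookahead of the best gain in each child."""
    g = info_gain(data, feat, thresh)
    left, right = split(data, feat, thresh)
    future = 0.0
    for child in (left, right):
        if len(child) > 1:
            future += (len(child) / len(data)) * best_immediate(child)[0]
    return g + gamma * future

def best_split_rl(data, gamma=0.9):
    """Split selection by long-term evaluation instead of immediate gain."""
    best = (-1.0, None)
    for feat in range(len(data[0][0])):
        for t in candidates(data, feat):
            q = q_value(data, feat, t, gamma)
            if q > best[0]:
                best = (q, (feat, t))
    return best
```

On an XOR-style dataset every single split has zero immediate gain, so the greedy criterion is blind, while the long-term evaluation still prefers a split whose children become perfectly separable; this is the kind of pathology the paper's lookahead-based policy addresses.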