A reinforcement neuro-fuzzy combiner for multiobjective control

Authors:
Chin-Teng Lin;I-Fang Chung
Affiliations:
Dept. of Electr. & Control Eng., Nat. Chiao Tung Univ., Hsinchu;-
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Year:
1999

Abstract

This paper proposes a neuro-fuzzy combiner (NFC) with reinforcement learning capability for solving multiobjective control problems. The proposed NFC can combine n existing low-level controllers in a hierarchical way to form a multiobjective fuzzy controller. It is assumed that each low-level (fuzzy or nonfuzzy) controller has been well designed to serve a particular objective. The role of the NFC is to fuse the n actions decided by the n low-level controllers and determine a proper action to apply to the environment (plant) at each time step. Hence, the NFC can combine low-level controllers and achieve multiple objectives (goals) at once. The NFC acts like a switch that chooses a proper action from the actions of the low-level controllers according to feedback information from the environment. In fact, the NFC is a soft switch; it allows more than one low-level action to be active, each to a different degree, through fuzzy combination at each time step. An NFC can be designed by trial and error if enough a priori knowledge is available, or it can be obtained by supervised learning if precise input/output training data are available. In the more practical case where no instructive teaching information is available, the NFC can learn by itself using the proposed reinforcement learning scheme. Equipped with reinforcement learning capability, the NFC can learn to achieve the desired multiple objectives simultaneously from rough reinforcement feedback from the environment, which contains only critic information such as “success (good)” or “failure (bad)” for each desired objective. Computer simulations have been conducted to illustrate the performance and applicability of the proposed architecture and learning scheme.
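
The difference between a hard switch and the soft switch described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the NFC's internal structure, so everything here is an assumption made for illustration: the function names `soft_combine` and `hard_switch` are hypothetical, the activation degrees are supplied by hand rather than produced by a learned neuro-fuzzy network, and the normalized weighted average is one common form of fuzzy combination, not necessarily the one used in the paper.

```python
from typing import Sequence


def soft_combine(actions: Sequence[float], degrees: Sequence[float]) -> float:
    """Blend n low-level actions by their activation degrees (assumed form).

    Each degree is a nonnegative weight saying how active the matching
    low-level controller is at this time step. The result is the
    normalized weighted average of the actions.
    """
    if len(actions) != len(degrees):
        raise ValueError("actions and degrees must have the same length")
    if any(d < 0 for d in degrees):
        raise ValueError("activation degrees must be nonnegative")
    total = sum(degrees)
    if total == 0:
        raise ValueError("at least one activation degree must be positive")
    return sum(a * d for a, d in zip(actions, degrees)) / total


def hard_switch(actions: Sequence[float], degrees: Sequence[float]) -> float:
    """Pick the single action whose controller has the largest degree."""
    best = max(range(len(actions)), key=lambda i: degrees[i])
    return actions[best]


# Two hypothetical low-level controllers propose opposite actions.
proposed_actions = [1.0, -1.0]
activation = [0.75, 0.25]  # hand-picked degrees, not learned

blended = soft_combine(proposed_actions, activation)
selected = hard_switch(proposed_actions, activation)
```

In this sketch the hard switch discards the second controller entirely, while the soft switch lets both controllers contribute in proportion to their degrees. In the paper's setting, the degrees would be adapted from the per-objective success or failure signals rather than fixed by hand.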