Quad-Q-learning

Authors:
C. Clausen;H. Wechsler
Affiliations:
Dept. of Comput. Sci., George Mason Univ., Fairfax, VA;-
Venue:
IEEE Transactions on Neural Networks
Year:
2000

Citing 0
Cited 7

Multiresolution state-space discretization method for Q-learning
ACC'09 Proceedings of the 2009 conference on American Control Conference

Multiresolution state-space discretization method for Q-learning with function approximation and policy iteration
SMC'09 Proceedings of the 2009 IEEE international conference on Systems, Man and Cybernetics

Hexagon-based q-learning to find a hidden target object
CIS'05 Proceedings of the 2005 international conference on Computational Intelligence and Security - Volume Part I

A hybrid cognitive/reactive intelligent agent autonomous path planning technique in a networked-distributed unstructured environment for reinforcement learning
The Journal of Supercomputing

Hexagon-Based q-learning for object search with multiple robots
ICNC'05 Proceedings of the First international conference on Advances in Natural Computation - Volume Part III

Towards proactive web service adaptation
CAiSE'12 Proceedings of the 24th international conference on Advanced Information Systems Engineering

Towards a Multiple-Lookahead-Levels agent reinforcement-learning technique and its implementation in integrated circuits
The Journal of Supercomputing

Abstract

Develops the theory of quad-Q-learning, a learning algorithm that evolved from Q-learning. Quad-Q-learning is applicable to problems that can be solved by “divide and conquer” techniques. It concerns an autonomous agent that learns without supervision to act optimally to achieve specified goals. The learning agent acts in an environment that can be characterized by a state. In the Q-learning environment, when an action is taken, a reward is received and a single new state results. The objective of Q-learning is to learn a policy function that maps states to actions so as to maximize a function of the rewards, such as their sum. In quad-Q-learning, however, taking an action from a state has one of two outcomes: either an immediate reward is received and no new state results, or no reward is received and four new states result. The environment in which quad-Q-learning operates can thus be viewed as a hierarchy of states in which lower-level states are the children of higher-level states. This hierarchical aspect leads to a bottom-up view of learning that improves the efficiency of learning at higher levels in the hierarchy. The objective of quad-Q-learning is to maximize the sum of rewards obtained from each of the environments that result as actions are taken. Two versions of quad-Q-learning are discussed: discrete state, and mixed discrete and continuous state. The discrete state version is applicable only to problems with small numbers of states. Scaling up to problems with practical numbers of states requires a continuous state learning method, which can be realized using function approximation methods. The application of quad-Q-learning to image compression is briefly described.
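
The "reward and stop, or no reward and four new states" structure described in the abstract can be illustrated with a small discrete-state sketch. This is not the authors' algorithm or update rule; it is a toy under stated assumptions: the state is just a level in the hierarchy, the rewards in STOP_REWARD are invented for illustration, all four children of a state are identical, rewards are deterministic, and the names train, children and greedy_policy are hypothetical.

```python
# Toy sketch of the quad-Q-learning state structure (illustrative assumptions only).
STOP, SPLIT = "stop", "split"

# Hypothetical immediate reward for stopping at each level of the hierarchy.
STOP_REWARD = {0: 1.0, 1: 3.0, 2: 20.0, 3: 70.0}


def actions(level):
    """Level-0 states are leaves and can only stop; higher levels may also split."""
    return [STOP] if level == 0 else [STOP, SPLIT]


def children(level):
    """Splitting yields four states one level down (identical in this toy)."""
    return [level - 1] * 4


def train(levels=4, sweeps=60, alpha=0.5):
    """Learn Q[level][action] bottom-up, so child values exist before parents use them."""
    q = {lvl: {a: 0.0 for a in actions(lvl)} for lvl in range(levels)}
    for lvl in range(levels):
        for _ in range(sweeps):
            for a in actions(lvl):
                if a == STOP:
                    # Immediate reward is received and no new state results.
                    target = STOP_REWARD[lvl]
                else:
                    # No reward; the target is the sum of the best
                    # Q-values of the four resulting child states.
                    target = sum(max(q[c].values()) for c in children(lvl))
                q[lvl][a] += alpha * (target - q[lvl][a])
    return q


def greedy_policy(q):
    """Pick the highest-valued action at each level."""
    return {lvl: max(qs, key=qs.get) for lvl, qs in q.items()}


q = train()
policy = greedy_policy(q)
print(policy)
```

With these made-up rewards, splitting is preferred at levels 1 and 3 and stopping at level 2, because the split target is the sum over four children rather than the value of a single successor state as in ordinary Q-learning.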