Reinforcement learning (RL) and multi-agent reinforcement learning (MARL) are disciplines concerned with automatically defining the behaviour of an agent, or of a set of interacting agents, through reward signals coming from the environment. An important research issue in RL and MARL is how to combine the knowledge of multiple learning agents so as to improve the overall performance of the multi-agent system (MAS). This paper illustrates how RL and MARL algorithms can be improved with tools from multilinear algebra, namely tensors and tensor factorizations. In particular, the focus is on modifying existing algorithms from the literature to include a tensor-factorization step applied to the Q-tables learned by the individual agents, generalizing the knowledge about the actions performed in the environment. The modified algorithms are evaluated in three RL and MARL scenarios against their unmodified versions to show the benefits of the tensor-factorization step.
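The core idea above can be sketched as follows: stack the agents' Q-tables into a 3-way tensor (agents × states × actions), compute a low-rank factorization, and use the reconstruction as a smoothed, shared Q-estimate. The abstract does not specify which factorization the paper uses; the sketch below assumes a plain CP (CANDECOMP/PARAFAC) decomposition fitted by alternating least squares, implemented with NumPy only. All function names (`cp_als`, `khatri_rao`, `unfold`) and the choice of rank are illustrative, not the paper's method.

```python
import numpy as np

def unfold(T, mode):
    # Mode-n unfolding: move the chosen mode to the front, flatten the rest.
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    # Column-wise Kronecker product of A (I x r) and B (J x r) -> (I*J x r).
    r = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, r)

def cp_als(T, rank, n_iter=200, seed=0):
    # Fit a rank-`rank` CP model T[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r]
    # by alternating least squares over the three factor matrices.
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Factors of the other two modes, kept in original mode order,
            # which matches the column ordering produced by `unfold`.
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            # Hadamard product of the Gram matrices gives kr.T @ kr cheaply.
            G = (others[0].T @ others[0]) * (others[1].T @ others[1])
            factors[mode] = unfold(T, mode) @ kr @ np.linalg.pinv(G)
    return factors

# Toy demo: an exactly rank-2 "Q-tensor" for 3 agents, 5 states, 4 actions.
rng = np.random.default_rng(1)
A = rng.standard_normal((3, 2))   # agent factors
B = rng.standard_normal((5, 2))   # state factors
C = rng.standard_normal((4, 2))   # action factors
Q = np.einsum('ir,jr,kr->ijk', A, B, C)

Fa, Fb, Fc = cp_als(Q, rank=2)
Q_hat = np.einsum('ir,jr,kr->ijk', Fa, Fb, Fc)   # reconstructed Q-tables
err = np.linalg.norm(Q - Q_hat) / np.linalg.norm(Q)  # should be small
```

In an actual RL loop, each agent would periodically write its Q-table into its slice of `Q`, and read action values back from the reconstruction `Q_hat`, so that structure learned by one agent transfers to the others through the shared low-rank factors.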