Relay stations are an important component of heterogeneous networks (HetNets), introduced in LTE-Advanced as a means to provide very high capacity and QoS across the whole cell area. This paper develops a self-organizing network (SON) feature that optimally allocates resources between the backhaul links and the station-to-mobile links. Both static and dynamic resource sharing mechanisms are investigated. In the static case, we provide a queuing model to calculate analytically the optimal resource sharing strategy and the maximal capacity of the network. The influence of relay planning and of the number of deployed relays is investigated, and the gains resulting from good planning are evaluated analytically. Self-optimizing dynamic resource allocation is tackled using a Markov Decision Process (MDP) model. We consider stability in the infinite-buffer case, and blocking rate and file transfer time in the finite-buffer case. To obtain a solution that scales to a large number of relays, we restrict attention to a well-chosen parametrized family of policies, which serves as expert knowledge. Finally, a model-free approach is presented in which the network can learn the optimal parametrized policy, and convergence to a local optimum is proven.
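To make the static resource-sharing question concrete, the sketch below works through a deliberately simplified two-hop example. It is not the queuing model of the paper: it assumes a single relay, a single shared resource pool, and fixed link rates, and the names `optimal_backhaul_share`, `end_to_end_rate`, `backhaul_rate`, and `access_rate` are hypothetical. Under these assumptions, traffic must cross the backhaul link and then the station-to-mobile (access) link, so the end-to-end rate is limited by the slower hop, and the best split is the one that equalizes the two.

```python
# Illustrative two-hop model (an assumption, not the paper's queuing model):
# a fraction x of the resources serves the backhaul link and 1 - x serves the
# station-to-mobile (access) link. Rates are in arbitrary units, e.g. Mbit/s.


def end_to_end_rate(x, backhaul_rate, access_rate):
    """Throughput through the relay when a share x goes to the backhaul."""
    # Data must traverse both hops, so the slower hop is the bottleneck.
    return min(x * backhaul_rate, (1.0 - x) * access_rate)


def optimal_backhaul_share(backhaul_rate, access_rate):
    """Share x that balances both hops: x * backhaul = (1 - x) * access."""
    return access_rate / (backhaul_rate + access_rate)


# Hypothetical link rates chosen only for illustration.
backhaul_rate = 30.0
access_rate = 10.0

x_star = optimal_backhaul_share(backhaul_rate, access_rate)
capacity = end_to_end_rate(x_star, backhaul_rate, access_rate)

# Brute-force check over a grid of shares: no split should beat x_star.
grid_best = max(
    end_to_end_rate(i / 1000.0, backhaul_rate, access_rate) for i in range(1001)
)
```

In this toy setting the balanced split gives a capacity of `backhaul_rate * access_rate / (backhaul_rate + access_rate)`, which shows why a faster backhaul (for instance, through better relay placement) raises the achievable capacity even when the access link is unchanged.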