On-line learning and optimization for wireless video transmission

Authors:
Yu Zhang;Fangwen Fu;Mihaela van der Schaar
Affiliations:
Department of Electrical Engineering, University of California, Los Angeles, CA;Department of Electrical Engineering, University of California, Los Angeles, CA;Department of Electrical Engineering, University of California, Los Angeles, CA
Venue:
IEEE Transactions on Signal Processing
Year:
2010



Abstract

In this paper, we address the problem of optimizing the cross-layer transmission policy for delay-sensitive video streaming over slowly varying flat-fading wireless channels on-line, at transmission time, when the environment dynamics are unknown. We first formulate the cross-layer optimization using a systematic layered Markov decision process (MDP) framework, which complies with the layered architecture of the OSI stack. Subsequently, considering the unknown dynamics of the video sources and underlying wireless channels, we propose a layered real-time dynamic programming (LRTDP) algorithm, which requires no a priori knowledge about the source and network dynamics. LRTDP allows each layer to learn the dynamics on-the-fly and to adjust its policy autonomously, based on its experienced dynamics as well as limited message exchanges with other layers. Unlike existing cross-layer methods, LRTDP optimizes the cross-layer policy in a layered and on-line fashion, exhibits low computational complexity, requires limited message exchanges among layers, and can adapt on-the-fly to the experienced environment dynamics. Finally, we prove that LRTDP converges to the optimal cross-layer policy asymptotically. Our numerical experiments show that LRTDP provides performance comparable to that of the idealized optimal cross-layer solutions based on complete knowledge.
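As a rough illustration of the real-time dynamic programming idea the abstract builds on, the sketch below runs Bellman backups only at the states visited along a single trajectory, using transition frequencies estimated from experience instead of a known model. It is not the paper's layered algorithm: it is a single-layer toy, and the buffer model, the cost function, the function name `rtdp_sketch`, and every parameter (`buffer_size`, `max_send`, `gamma`, `arrival_prob`, the exploration rate) are assumptions made only for this example.

```python
import random


def rtdp_sketch(num_steps=5000, buffer_size=4, max_send=2,
                gamma=0.9, arrival_prob=0.5, seed=0):
    """Toy on-line RTDP for a transmit buffer (hypothetical model, not LRTDP)."""
    rng = random.Random(seed)
    states = list(range(buffer_size + 1))   # number of queued packets
    actions = list(range(max_send + 1))     # packets to attempt to send
    # Empirical transition counts: counts[(s, a)][s_next] = times observed.
    counts = {(s, a): {} for s in states for a in actions}
    value = [0.0] * len(states)             # estimated cost-to-go per state

    def cost(s, a):
        # Assumed cost: queueing delay plus a transmission energy penalty.
        return s + 0.5 * a

    def q_estimate(s, a):
        seen = counts[(s, a)]
        total = sum(seen.values())
        if total == 0:
            # Untried pair: assume zero future cost, which encourages trying it.
            return cost(s, a)
        expected = sum(n * value[s2] for s2, n in seen.items()) / total
        return cost(s, a) + gamma * expected

    s = 0
    for _ in range(num_steps):
        # Bellman backup at the currently visited state only.
        q = [q_estimate(s, a) for a in actions]
        value[s] = min(q)
        # Mostly greedy action choice with a little random exploration.
        if rng.random() < 0.1:
            a = rng.choice(actions)
        else:
            a = q.index(min(q))
        # Environment step; these dynamics are hidden from the learner.
        sent = min(a, s)
        arrival = 1 if rng.random() < arrival_prob else 0
        s_next = min(s - sent + arrival, buffer_size)
        counts[(s, a)][s_next] = counts[(s, a)].get(s_next, 0) + 1
        s = s_next

    # Greedy policy with respect to the learned estimates.
    policy = [min(actions, key=lambda act: q_estimate(st, act)) for st in states]
    return value, policy


if __name__ == "__main__":
    learned_value, learned_policy = rtdp_sketch()
    print("estimated cost-to-go:", learned_value)
    print("greedy policy (packets to send per buffer level):", learned_policy)
```

The learner never sees `arrival_prob`; it only observes the transitions it experiences, which mirrors the abstract's setting of unknown environment dynamics. The layered decomposition and inter-layer message exchange described in the abstract are not modeled here.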