Optimal Cross-layer Wireless Control Policies using TD-Learning

Sean Meyn, Wei Chen and Daniel O'Neill

These plots illustrate the convexity, symmetry and non-increasing properties of L* for both the single-flow and multiple-flow cases.

Quadratic cost

The distance between the relative value function computed by VIA and its approximation using the proposed basis. The small error demonstrates the effectiveness of the basis.

VIA convergence

The corresponding control policies. As anticipated, the policy is zero when the state is large.

Abstract: We present an on-line cross-layer control technique to characterize and approximate optimal policies for wireless networks. Our approach combines network utility maximization and adaptive modulation over an infinite discrete-time horizon using a class of performance measures we call time-smoothed utility functions. We model the system as an average-cost Markov decision problem. Model approximations are used to find suitable basis functions for application of least squares TD-learning techniques. The approach yields network control policies that learn the underlying characteristics of the random wireless channel and that approximately optimize network performance.
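To make the role of the basis concrete, here is a minimal sketch of least squares TD-learning for an average-cost problem. It is not the algorithm or model from the paper: the single-queue reflected random walk, the arrival and service probabilities `alpha` and `mu`, the linear cost c(x) = x, and the basis (x, x^2) are all assumptions chosen only so that the example is small and self-contained. The estimate `theta` approximates the relative value function h(x) = theta[0]*x + theta[1]*x^2 solving Poisson's equation, and `eta_hat` estimates the average cost.

```python
import numpy as np

def simulate_queue(T, alpha=0.2, mu=0.6, seed=0):
    """Toy model (an assumption, not the paper's): the queue grows by 1
    w.p. alpha, shrinks by 1 w.p. mu when nonempty, and otherwise stays."""
    rng = np.random.default_rng(seed)
    u = rng.random(T)
    x = np.zeros(T + 1, dtype=np.int64)
    cur = 0
    for t in range(T):
        if u[t] < alpha:
            cur += 1
        elif u[t] < alpha + mu and cur > 0:
            cur -= 1
        x[t + 1] = cur
    return x

def basis(x):
    """Hypothetical basis psi(x) = (x, x^2). No constant term is included:
    h is only defined up to an additive constant, and a constant basis
    function would make the LSTD matrix singular."""
    x = np.asarray(x, dtype=float)
    return np.stack([x, x ** 2], axis=-1)

def lstd_average_cost(x, cost):
    """LSTD(0) for average cost: solve A theta = b with
    A = mean psi(x_t) (psi(x_t) - psi(x_{t+1}))^T,
    b = mean psi(x_t) (c(x_t) - eta_hat)."""
    psi = basis(x[:-1])
    psi_next = basis(x[1:])
    c = cost(x[:-1])
    eta_hat = c.mean()                      # empirical average cost
    A = psi.T @ (psi - psi_next) / len(c)
    b = psi.T @ (c - eta_hat) / len(c)
    theta = np.linalg.solve(A, b)
    return theta, eta_hat

traj = simulate_queue(T=500_000)
theta, eta_hat = lstd_average_cost(traj, cost=lambda x: np.asarray(x, dtype=float))
print("theta:", theta, "eta_hat:", eta_hat)
```

For this toy model Poisson's equation can be solved by hand, giving h(x) = (x^2 + x) / (2(mu - alpha)) and average cost alpha / (mu - alpha), so with the default parameters the estimates should approach theta = (1.25, 1.25) and eta = 0.5 as T grows. The basis happens to contain the exact solution here; in general the quality of the approximation depends on how well the basis is matched to the model.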

Financial support from the National Science Foundation under CCF-0729031 and ITMANET DARPA RK 2006-07284 is gratefully acknowledged.

References

@unpublished{meycheone10,
Author = {Sean Meyn and Wei Chen and Daniel O'Neill},
Keywords = {adaptive modulation, radio networks, complex interfering networks, cross-layer design, TD learning, network utility maximization, wireless networks},
Note = {Submitted to the {IEEE} Conference on Decision and Control},
Title = {Optimal Cross-layer Wireless Control Policies using {TD}-Learning},
Year = {2010}}

See also,

@inproceedings{chehuakulunnzhumehmeywie09,
Author = {Chen, Wei and Huang, Dayu and Kulkarni, Ankur A. and Unnikrishnan, Jayakrishnan and Zhu, Quanyan and Mehta, Prashant and Meyn, Sean and Wierman, Adam},
Booktitle = {Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proceedings of the 48th IEEE Conference on},
Month = {Dec.},
Pages = {3575--3580},
Title = {Approximate dynamic programming using fluid and diffusion approximations with applications to power management},
Year = {2009}}

Chapter 11 of CTCN, and
@unpublished{mehmey09a,
Author = {Mehta, P. and Meyn, S.},
Month = {December 16-18},
Note = {48th IEEE Conference on Decision and Control},
Title = {{Q}-Learning and {Pontryagin's Minimum Principle}},
Year = {2009}}