Author(s): Steven Perkins | David S. Leslie
Journal: Stochastic Systems
ISSN 1946-5238
Volume: 2
Issue: 2
Start page: 409
Date: 2012
Keywords: Asynchronous stochastic approximation | differential inclusion | two-timescales
ABSTRACT
The asymptotic pseudo-trajectory approach to stochastic approximation of Benaïm, Hofbauer and Sorin is extended for asynchronous stochastic approximations with a set-valued mean field. The asynchronicity of the process is incorporated into the mean field to produce convergence results which remain similar to those of an equivalent synchronous process. In addition, this allows many of the restrictive assumptions previously associated with asynchronous stochastic approximation to be removed. The framework is extended for a coupled asynchronous stochastic approximation process with set-valued mean fields. Two-timescales arguments are used here in a similar manner to the original work in this area by Borkar. The applicability of this approach is demonstrated through learning in a Markov decision process.
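To make the kind of process the abstract describes concrete, the following is a minimal sketch of an asynchronous stochastic approximation: at each step only one randomly chosen coordinate is updated, with a step size driven by that coordinate's own update counter. The linear mean field F(x) = b − x, the variable names, and all numerical values are illustrative assumptions, not taken from the paper (whose results cover the far more general set-valued case).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target: find the zero of the mean field F(x) = b - x,
# whose unique zero is x* = b.  (Assumed example, not from the paper.)
b = np.array([1.0, -2.0, 0.5])

def noisy_mean_field(x, i):
    """Noisy observation of coordinate i of F(x) = b - x."""
    return (b[i] - x[i]) + rng.normal(scale=0.1)

x = np.zeros(3)
counts = np.zeros(3, dtype=int)  # per-coordinate update counters

for n in range(20000):
    i = rng.integers(3)           # asynchronicity: one coordinate per step
    counts[i] += 1
    gamma = 1.0 / counts[i]       # step size on the coordinate's own clock
    x[i] += gamma * noisy_mean_field(x, i)

print(np.round(x, 2))  # each coordinate converges toward b
```

The key asynchronous feature is that each coordinate runs on its own effective timescale (its counter), which is exactly the relative-update-rate effect that the paper folds into the mean field to recover synchronous-style convergence results.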