Reinforcement learning in the automatic control area

Yu Jiang and Zhong-Ping Jiang, "Global Adaptive Dynamic Programming for Continuous-Time Nonlinear Systems," IEEE Transactions on Automatic Control, vol. 60, no. 11, pp. 2917-2929, Nov. 2015, DOI: 10.1109/TAC.2015.2414811.

This paper presents a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems. The strategy relaxes the problem of solving the Hamilton-Jacobi-Bellman (HJB) equation to an optimization problem, which is solved via a new policy iteration method. The proposed method differs from previously known nonlinear ADP methods in that neural network approximation is avoided, yielding a significant computational improvement. Moreover, the resultant control policy is globally stabilizing for a general class of nonlinear polynomial systems, rather than only semiglobally or locally stabilizing. Furthermore, when a priori knowledge of the system dynamics is unavailable, an online learning method is devised to implement the proposed policy iteration technique by generalizing current ADP theory. Finally, three numerical examples are provided to validate the effectiveness of the proposed method.
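To make the policy iteration idea concrete, below is a minimal sketch of its linear-quadratic special case (Kleinman's algorithm), which the paper's method generalizes to polynomial systems: policy evaluation reduces to solving a Lyapunov equation, and policy improvement updates the feedback gain from the resulting value matrix. This is not the paper's algorithm itself (which relaxes the nonlinear evaluation step to a convex optimization problem); the system matrices, cost weights, and initial stabilizing gain are illustrative assumptions, not taken from the paper.

# Sketch of continuous-time policy iteration in its linear-quadratic
# special case (Kleinman's algorithm). System: dx/dt = A x + B u,
# cost: integral of x'Qx + u'Ru. K0 must be stabilizing.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def policy_iteration(A, B, Q, R, K0, tol=1e-10, max_iter=50):
    """Alternate policy evaluation (a Lyapunov equation) and
    policy improvement (a gain update) until P converges."""
    K = K0
    P_prev = None
    for _ in range(max_iter):
        Ak = A - B @ K
        # Policy evaluation: solve Ak' P + P Ak + Q + K' R K = 0.
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        # Policy improvement: K <- R^{-1} B' P.
        K = np.linalg.solve(R, B.T @ P)
        if P_prev is not None and np.max(np.abs(P - P_prev)) < tol:
            break
        P_prev = P
    return P, K

# Illustrative example: a double integrator.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # any stabilizing initial gain works
P, K = policy_iteration(A, B, Q, R, K0)
print("P =", P)
print("K =", K)

Starting from any stabilizing initial gain, the iterates converge to the solution of the algebraic Riccati equation. The paper's contribution can be read against this template: for polynomial systems the evaluation step has no closed-form Lyapunov solution, so it is relaxed to an optimization problem, and an online variant recovers the iteration from measured data when the dynamics are unknown.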
