Hierarchical Deep-RL for continuous and large state spaces

A. P. Pope et al. Hierarchical Reinforcement Learning for Air Combat at DARPA’s AlphaDogfight Trials, EEE Transactions on Artificial Intelligence, vol. 4, no. 6, pp. 1371-1385, Dec. 2023 DOI: 10.1109/TAI.2022.3222143.

Autonomous control in high-dimensional, continuous state spaces is a persistent and important challenge in the fields of robotics and artificial intelligence. Because of high risk and complexity, the adoption of AI for autonomous combat systems has been a long-standing difficulty. In order to address these issues, DARPA’s AlphaDogfight Trials (ADT) program sought to vet the feasibility of and increase trust in AI for autonomously piloting an F-16 in simulated air-to-air combat. Our submission to ADT solves the high-dimensional, continuous control problem using a novel hierarchical deep reinforcement learning approach consisting of a high-level policy selector and a set of separately trained low-level policies specialized for excelling in specific regions of the state space. Both levels of the hierarchy are trained using off-policy, maximum entropy methods with expert knowledge integrated through reward shaping. Our approach outperformed human expert pilots and achieved a second-place rank in the ADT championship event.

Comments are closed.

Post Navigation