POMDPs in robotics: QMDP-Net as a counterpart for the Partially Observable Markov Decision Process (POMDP) whose transition, observation, and reward functions are initially unknown

Collins N, Kurniawati H. Locally connected interrelated network: A forward propagation primitive, The International Journal of Robotics Research. 2023;42(6):371-384 DOI: 10.1177/02783649221093092.

End-to-end learning for planning is a promising approach for finding good robot strategies in situations where the state transition, observation, and reward functions are initially unknown. Many neural network architectures for this approach have shown positive results. Across these networks, seemingly small components have been used repeatedly in different architectures, which means improving the efficiency of these components has great potential to improve the overall performance of the network. This paper aims to improve one such component: The forward propagation module. In particular, we propose Locally Connected Interrelated Network (LCI-Net) \u2013 a novel type of locally connected layer with unshared but interrelated weights \u2013 to improve the efficiency of learning stochastic transition models for planning and propagating information via the learned transition models. LCI-Net is a small differentiable neural network module that can be plugged into various existing architectures. For evaluation purposes, we apply LCI-Net to VIN and QMDP-Net. VIN is an end-to-end neural network for solving Markov Decision Processes (MDPs) whose transition and reward functions are initially unknown, while QMDP-Net is its counterpart for the Partially Observable Markov Decision Process (POMDP) whose transition, observation, and reward functions are initially unknown. Simulation tests on benchmark problems involving 2D and 3D navigation and grasping indicate promising results: Changing only the forward propagation module alone with LCI-Net improves VIN\u2019s and QMDP-Net generalisation capability by more than 3� and 10�, respectively.

Comments are closed.

Post Navigation