Improving on-line Monte Carlo POMDP planning (DESPOT in particular) in discrete spaces through the use of importance sampling, with a nice summary of the problem and of current on-line POMDP approaches

Luo, Y., Bai, H., Hsu, D., & Lee, W. S., Importance sampling for online planning under uncertainty, The International Journal of Robotics Research, 38(2–3), 162–181, 2019, DOI: 10.1177/0278364918780322.

The partially observable Markov decision process (POMDP) provides a principled general framework for robot planning under uncertainty. Leveraging the idea of Monte Carlo sampling, recent POMDP planning algorithms have scaled up to various challenging robotic tasks, including real-time online planning for autonomous vehicles. To further improve online planning performance, this paper presents IS-DESPOT, which introduces importance sampling to DESPOT, a state-of-the-art sampling-based POMDP algorithm for planning under uncertainty. Importance sampling improves DESPOT’s performance when there are critical but rare events, which are difficult to sample. We prove that IS-DESPOT retains the theoretical guarantee of DESPOT. We demonstrate empirically that importance sampling significantly improves the performance of online POMDP planning for suitable tasks. We also present a general method for learning the importance sampling distribution.
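To make the "critical but rare events" point concrete, here is a minimal sketch of plain importance sampling on a toy one-step problem. It is not IS-DESPOT and not the authors' code: the event probability, the proposal probability, the payoffs, and every function name below are assumptions made up for illustration. The idea it shows is the generic one: sample the rare event more often than it really occurs, then reweight each sample by the ratio of nominal to proposal probability so that the estimate stays unbiased.

```python
import random

# All numbers below are hypothetical, chosen only to illustrate the idea.
P_EVENT = 1e-3          # nominal probability of the rare, critical event
Q_EVENT = 0.1           # proposal probability used to oversample the event
COST_EVENT = -100.0     # payoff when the rare event happens (e.g. a collision)
REWARD_NORMAL = 1.0     # payoff otherwise

# Exact expected payoff under the nominal distribution, for comparison.
TRUE_VALUE = P_EVENT * COST_EVENT + (1.0 - P_EVENT) * REWARD_NORMAL


def naive_estimate(n, seed=0):
    """Plain Monte Carlo: sample the event with its nominal probability."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        event = rng.random() < P_EVENT
        total += COST_EVENT if event else REWARD_NORMAL
    return total / n


def importance_estimate(n, seed=0):
    """Importance sampling: sample from the proposal, reweight by p/q."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        event = rng.random() < Q_EVENT
        if event:
            weight = P_EVENT / Q_EVENT
            value = COST_EVENT
        else:
            weight = (1.0 - P_EVENT) / (1.0 - Q_EVENT)
            value = REWARD_NORMAL
        total += weight * value
    return total / n


if __name__ == "__main__":
    n = 20000
    print("true value        :", TRUE_VALUE)
    print("naive estimate    :", naive_estimate(n))
    print("importance sampled:", importance_estimate(n))
```

With these made-up numbers the naive estimator sees the event only about once per thousand samples, so its estimate swings a lot from run to run, while the reweighted estimator sees it in roughly one sample out of ten and has a much smaller per-sample variance. The paper applies the same principle inside DESPOT's sampled scenarios and learns the proposal distribution rather than fixing it by hand as done here.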