Synthesizing a supervisor (a Finite State Machine) instead of finding a standard policy in MDPs, applied to multi-agent systems

September 16, 2019 15:23, Juan-Antonio Fernández-Madrigal

B. Wu, X. Zhang and H. Lin, Permissive Supervisor Synthesis for Markov Decision Processes Through Learning, IEEE Transactions on Automatic Control, vol. 64, no. 8, pp. 3332-3338, Aug. 2019. DOI: 10.1109/TAC.2018.2879505.

This paper considers the permissive supervisor synthesis for probabilistic systems modeled as Markov Decision Processes (MDP). Such systems are prevalent in power grids, transportation networks, communication networks, and robotics. We propose a novel supervisor synthesis framework using automata learning and compositional model checking to generate the permissive local supervisors in a distributed manner. With the recent advances in assume-guarantee reasoning verification for MDPs, constructing the composed system can be avoided to alleviate the state space explosion. Our framework learns the supervisors iteratively using counterexamples from the verification and is guaranteed to terminate in finite steps and to be correct.
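
To make the "verify, extract a counterexample, refine the supervisor" loop concrete, here is a minimal Python sketch on a single toy MDP. It is not the paper's method: there is no automata learning and no assume-guarantee reasoning, and the supervisor is simplified to a memoryless map from each state to its set of allowed actions. The MDP, the state and action names, the `BAD` state, and the rule "disable the allowed action with the highest probability of reaching `BAD`" are all assumptions made up for illustration.

```python
# Toy MDP (hypothetical): state -> action -> [(probability, next_state), ...]
MDP = {
    0: {"safe": [(1.0, 1)], "risky": [(0.5, 2), (0.5, 3)]},
    1: {"slow": [(0.9, 2), (0.1, 3)], "fast": [(0.6, 2), (0.4, 3)]},
    2: {"stay": [(1.0, 2)]},   # goal, absorbing
    3: {"stay": [(1.0, 3)]},   # bad, absorbing
}
BAD = 3


def q_value(values, outcomes):
    """Probability of eventually reaching BAD after taking one action."""
    return sum(p * values[s] for p, s in outcomes)


def max_reach_bad(allowed, sweeps=200):
    """'Verification' step: worst-case probability of reaching BAD from each
    state, over all choices the supervisor still allows (value iteration)."""
    values = {s: 1.0 if s == BAD else 0.0 for s in MDP}
    for _ in range(sweeps):
        for s in MDP:
            if s != BAD:
                values[s] = max(q_value(values, MDP[s][a]) for a in allowed[s])
    return values


def synthesize(threshold, init=0):
    """Start from the most permissive supervisor (everything allowed) and
    disable one action per failed verification until the bound holds."""
    allowed = {s: set(actions) for s, actions in MDP.items()}
    while True:
        values = max_reach_bad(allowed)
        if values[init] <= threshold + 1e-9:
            return allowed                      # specification satisfied
        # Counterexample (assumed heuristic): the riskiest action in a state
        # that still has an alternative, so no state is left without actions.
        candidates = [
            (q_value(values, MDP[s][a]), s, a)
            for s in allowed if len(allowed[s]) > 1
            for a in allowed[s]
        ]
        if not candidates:
            return None                         # nothing left to disable
        _, s, a = max(candidates)
        allowed[s].discard(a)


if __name__ == "__main__":
    print(synthesize(0.45))
    print(synthesize(0.2))
```

Each failed check removes exactly one action, so the loop stops after finitely many iterations. With a bound of 0.45 only `risky` is disabled in state 0 and state 1 keeps both actions, which is the sense in which such a supervisor is permissive: it restricts behaviour only as far as the specification requires.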

Posted in: Reinforcement learning in AI. Tagged: MDPs, Multiagent systems
