{"id":1484,"date":"2023-10-20T08:49:58","date_gmt":"2023-10-20T07:49:58","guid":{"rendered":"https:\/\/babel.isa.uma.es\/kipr\/?p=1484"},"modified":"2023-10-20T08:54:24","modified_gmt":"2023-10-20T07:54:24","slug":"hybridizing-model-free-and-model-based-in-continuous-rl-and-a-nice-review-of-current-research-and-benchmarks-in-robotics","status":"publish","type":"post","link":"https:\/\/babel.isa.uma.es\/kipr\/?p=1484","title":{"rendered":"Hybridizing model-free and model-based in continuous RL, and a nice review of current research and benchmarks in robotics"},"content":{"rendered":"<h4>Pinosky A, Abraham I, Broad A, Argall B, Murphey TD. <strong>Hybrid control for combining model-based and model-free reinforcement learning.<\/strong> The International Journal of Robotics Research. 2023;42(6):337-355. <a href=\"https:\/\/doi.org\/10.1177\/02783649221083331\" target=\"_blank\">DOI: 10.1177\/02783649221083331<\/a>.<\/h4>\n<blockquote><p>We develop an approach to improve the learning capabilities of robotic systems by combining learned predictive models with experience-based state-action policy mappings. Predictive models provide an understanding of the task and the dynamics, while experience-based (model-free) policy mappings encode favorable actions that override planned actions. We refer to our approach of systematically combining model-based and model-free learning methods as hybrid learning. Our approach efficiently learns motor skills and improves the performance of predictive models and experience-based policies. Moreover, our approach enables policies (both model-based and model-free) to be updated using any off-policy reinforcement learning method. We derive a deterministic method of hybrid learning by optimally switching between learning modalities. We adapt our method to a stochastic variation that relaxes some of the key assumptions in the original derivation. 
Our deterministic and stochastic variations are tested on a variety of robot control benchmark tasks in simulation as well as a hardware manipulation task. We extend our approach for use with imitation learning methods, where experience is provided through demonstrations, and we test the expanded capability with a real-world pick-and-place task. The results show that our method is capable of improving the performance and sample efficiency of learning motor skills in a variety of experimental domains.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Pinosky A, Abraham I, Broad A, Argall B, Murphey TD. Hybrid control for combining model-based and model-free reinforcement learning The <span class=\"ellipsis\">&hellip;<\/span> <span class=\"more-link-wrap\"><a href=\"https:\/\/babel.isa.uma.es\/kipr\/?p=1484\" class=\"more-link\"><span>Read More &rarr;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[173,205,493,494],"class_list":["post-1484","post","type-post","status-publish","format-standard","hentry","category-applications-of-reinforcement-learning-to-robots","tag-continuous-mdps","tag-model-based-reinforcement-learning","tag-model-free-reinforcement-learning","tag-rl-benchmarks"],"_links":{"self":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1484"}],"collection":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1484"}],"version-history":[{"count":3,"href":"https:\/\/babel.isa.uma
.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1484\/revisions"}],"predecessor-version":[{"id":1487,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1484\/revisions\/1487"}],"wp:attachment":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1484"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1484"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1484"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}