{"id":1629,"date":"2023-12-21T09:43:48","date_gmt":"2023-12-21T08:43:48","guid":{"rendered":"https:\/\/babel.isa.uma.es\/kipr\/?p=1629"},"modified":"2023-12-21T09:44:02","modified_gmt":"2023-12-21T08:44:02","slug":"meta-rl-given-a-distribution-of-tasks-learn-a-policy-capable-of-adapting-to-any-new-task-from-the-task-distribution-with-as-little-data-as-possible","status":"publish","type":"post","link":"https:\/\/babel.isa.uma.es\/kipr\/?p=1629","title":{"rendered":"Meta-RL: given a distribution of tasks, learn a policy capable of adapting to any new task from the task distribution with as little data as possible"},"content":{"rendered":"<h4>Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson, <strong>A Survey of Meta-Reinforcement Learning, <\/strong>   \tarXiv:2301.08028 [cs.LG], 2023 <a href=\"https:\/\/doi.org\/10.48550\/arXiv.2301.08028\" target=\"_blank\">DOI: 10.48550\/arXiv.2301.08028<\/a>.<\/h4>\n<blockquote><p>While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner. <\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson, A Survey of Meta-Reinforcement Learning, <span class=\"ellipsis\">&hellip;<\/span> <span class=\"more-link-wrap\"><a href=\"https:\/\/babel.isa.uma.es\/kipr\/?p=1629\" class=\"more-link\"><span>Read More &rarr;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84],"tags":[525,32],"class_list":["post-1629","post","type-post","status-publish","format-standard","hentry","category-reinforcement-learning-in-ai","tag-meta-rl","tag-survey"],"_links":{"self":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1629"}],"collection":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1629"}],"version-history":[{"count":1,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1629\/revisions"}],"predecessor-version":[{"id":1630,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1629\/revisions\/1630"}],"wp:attachment":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1629"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1629"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1629"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}