{"id":230,"date":"2015-07-21T11:24:35","date_gmt":"2015-07-21T10:24:35","guid":{"rendered":"http:\/\/babel.isa.uma.es\/kipr\/?p=230"},"modified":"2015-07-21T11:27:49","modified_gmt":"2015-07-21T10:27:49","slug":"transfer-learning-in-reinforcement-learning-through-case-based-and-the-use-of-heuristics-for-selecting-actions","status":"publish","type":"post","link":"https:\/\/babel.isa.uma.es\/kipr\/?p=230","title":{"rendered":"Transfer learning in reinforcement learning through case-based reasoning and the use of heuristics for selecting actions"},"content":{"rendered":"<h4>Reinaldo A.C. Bianchi, Luiz A. Celiberto Jr., Paulo E. Santos, Jackson P. Matsuura, Ramon Lopez de Mantaras, <strong>Transferring knowledge as heuristics in reinforcement learning: A case-based approach<\/strong>, Artificial Intelligence, Volume 226, September 2015, Pages 102-121, ISSN 0004-3702, <a href=\"http:\/\/dx.doi.org\/10.1016\/j.artint.2015.05.008\" target=\"_blank\">DOI: 10.1016\/j.artint.2015.05.008<\/a>.<\/h4>\n<blockquote><p>The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target) that are obtained from another (simpler) domain (the source domain). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it does an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain.<br \/>\nA set of empirical evaluations were conducted in two target domains: the 3D mountain car (using a learned case base from a 2D simulation) and stability learning for a humanoid robot in the Robocup 3D Soccer Simulator (that uses knowledge learned from the Acrobot domain). 
The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Reinaldo A.C. Bianchi, Luiz A. Celiberto Jr., Paulo E. Santos, Jackson P. Matsuura, Ramon Lopez de Mantaras, Transferring knowledge as <span class=\"ellipsis\">&hellip;<\/span> <span class=\"more-link-wrap\"><a href=\"https:\/\/babel.isa.uma.es\/kipr\/?p=230\" class=\"more-link\"><span>Read More &rarr;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[84,44],"tags":[107,108,51,15,105],"class_list":["post-230","post","type-post","status-publish","format-standard","hentry","category-reinforcement-learning-in-ai","category-reinforcement-learning-theory","tag-bootstrapped-learning","tag-case-based-learning","tag-neural-networks","tag-reinforcement-learning","tag-transfer-learning"],"_links":{"self":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/230"}],"collection":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=230"}],"version-history":[{"count":1,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/230\/revisions"}],"predecessor-version":[{"id":231,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/230\/revisions\/231"}],"wp:attachment":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=230"}],"wp:
term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=230"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=230"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}