{"id":477,"date":"2017-06-20T11:05:02","date_gmt":"2017-06-20T10:05:02","guid":{"rendered":"http:\/\/babel.isa.uma.es\/kipr\/?p=477"},"modified":"2017-06-20T11:06:43","modified_gmt":"2017-06-20T10:06:43","slug":"model-based-reinforcement-learning-with-a-reduced-number-of-kernels-and-a-study-of-its-convergence-guarantees","status":"publish","type":"post","link":"https:\/\/babel.isa.uma.es\/kipr\/?p=477","title":{"rendered":"Model-based reinforcement learning with a reduced number of basis functions to approximate the value function, a study of its convergence guarantees, and a nice state of the art on the use of (model-based) reinforcement learning for automatic control"},"content":{"rendered":"<h4>Rushikesh Kamalapurkar, Joel A. Rosenfeld, Warren E. Dixon, <strong>Efficient model-based reinforcement learning for approximate online optimal control,<\/strong> Automatica, Volume 74, 2016, Pages 247-258, ISSN 0005-1098, <a href=\"http:\/\/dx.doi.org\/10.1016\/j.automatica.2016.08.004\" target=\"_blank\">DOI: 10.1016\/j.automatica.2016.08.004<\/a>.<\/h4>\n<blockquote><p>An infinite horizon optimal regulation problem is solved online for a deterministic control-affine nonlinear dynamical system using a state following (StaF) kernel method to approximate the value function. Unlike traditional methods that aim to approximate a function over a large compact set, the StaF kernel method aims to approximate a function in a small neighborhood of a state that travels within a compact set. Simulation results demonstrate that stability and approximate optimality of the control system can be achieved with significantly fewer basis functions than may be required for global approximation methods.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Rushikesh Kamalapurkar, Joel A. Rosenfeld, Warren E. 
Dixon, Efficient model-based reinforcement learning for approximate online optimal control, Automatica, Volume 74, <span class=\"ellipsis\">&hellip;<\/span> <span class=\"more-link-wrap\"><a href=\"https:\/\/babel.isa.uma.es\/kipr\/?p=477\" class=\"more-link\"><span>Read More &rarr;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18],"tags":[139,205],"class_list":["post-477","post","type-post","status-publish","format-standard","hentry","category-applications-of-reinforcement-learning-to-control-engineering","tag-adaptive-dynamic-programming","tag-model-based-reinforcement-learning"],"_links":{"self":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/477"}],"collection":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=477"}],"version-history":[{"count":4,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/477\/revisions"}],"predecessor-version":[{"id":481,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/477\/revisions\/481"}],"wp:attachment":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=477"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=477"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=477"}],"curies":[{"name":"wp","href":"https:\/
\/api.w.org\/{rel}","templated":true}]}}