{"id":1060,"date":"2019-03-15T11:28:18","date_gmt":"2019-03-15T10:28:18","guid":{"rendered":"https:\/\/babel.isa.uma.es\/kipr\/?p=1060"},"modified":"2019-03-15T11:29:17","modified_gmt":"2019-03-15T10:29:17","slug":"improving-q-learning-by-initialization-of-the-q-matrix","status":"publish","type":"post","link":"https:\/\/babel.isa.uma.es\/kipr\/?p=1060","title":{"rendered":"Improving Q-learning by initialization of the Q matrix and a nice related work of that approach"},"content":{"rendered":"<h4>Ee Soong Low, Pauline Ong, Kah Chun Cheah, <strong>Solving the optimal path planning of a mobile robot using improved Q-learning<\/strong>, Robotics and Autonomous Systems, Volume 115, 2019, Pages 143-161, <a href=\"https:\/\/doi.org\/10.1016\/j.robot.2019.02.013\" target=\"_blank\">DOI: 10.1016\/j.robot.2019.02.013<\/a>.<\/h4>\n<blockquote><p>Q-learning, a type of reinforcement learning, has gained increasing popularity in autonomous mobile robot path planning recently, due to its self-learning ability without requiring a priori model of the environment. Yet, despite such advantage, Q-learning exhibits slow convergence to the optimal solution. In order to address this limitation, the concept of partially guided Q-learning is introduced wherein, the flower pollination algorithm (FPA) is utilized to improve the initialization of Q-learning. Experimental evaluation of the proposed improved Q-learning under the challenging environment with a different layout of obstacles shows that the convergence of Q-learning can be accelerated when Q-values are initialized appropriately using the FPA. Additionally, the effectiveness of the proposed algorithm is validated in a real-world experiment using a three-wheeled mobile robot.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Ee Soong Low, Pauline Ong, Kah Chun Cheah, Solving the optimal path planning of a mobile robot using improved Q-learning, <span class=\"ellipsis\">&hellip;<\/span> <span class=\"more-link-wrap\"><a href=\"https:\/\/babel.isa.uma.es\/kipr\/?p=1060\" class=\"more-link\"><span>Read More 
&rarr;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":["post-1060","post","type-post","status-publish","format-standard","hentry","category-applications-of-reinforcement-learning-to-robots"],"_links":{"self":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1060"}],"collection":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1060"}],"version-history":[{"count":2,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1060\/revisions"}],"predecessor-version":[{"id":1062,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/1060\/revisions\/1062"}],"wp:attachment":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1060"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1060"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1060"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}