{"id":964,"date":"2018-09-26T09:17:11","date_gmt":"2018-09-26T08:17:11","guid":{"rendered":"https:\/\/babel.isa.uma.es\/kipr\/?p=964"},"modified":"2018-09-26T09:21:26","modified_gmt":"2018-09-26T08:21:26","slug":"on-how-reinforcement-learning-depends-on-time-in-the-human-brain","status":"publish","type":"post","link":"https:\/\/babel.isa.uma.es\/kipr\/?p=964","title":{"rendered":"A very interesting analysis on how reinforcement learning depends on time, both for MDPs and for the psychological basis of RL in the human brain"},"content":{"rendered":"<h4>Elijah A. Petter, Samuel J. Gershman, Warren H. Meck, <strong>Integrating Models of Interval Timing and Reinforcement Learning<\/strong>, Trends in Cognitive Sciences, Volume 22, Issue 10, 2018, Pages 911-922 <a href=\"https:\/\/doi.org\/10.1016\/j.tics.2018.08.004\" target=\"_blank\">DOI: 10.1016\/j.tics.2018.08.004<\/a>.<\/h4>\n<blockquote><p>We present an integrated view of interval timing and reinforcement learning (RL) in the brain. The computational goal of RL is to maximize future rewards, and this depends crucially on a representation of time. Different RL systems in the brain process time in distinct ways. A model-based system learns \u2018what happens when\u2019, employing this internal model to generate action plans, while a model-free system learns to predict reward directly from a set of temporal basis functions. We describe how these systems are subserved by a computational division of labor between several brain regions, with a focus on the basal ganglia and the hippocampus, as well as how these regions are influenced by the neuromodulator dopamine.<\/p><\/blockquote>\n<h4>Some quotes beyond the abstract:<\/h4>\n<blockquote><p>The Markov assumption also makes explicit the requirements for temporal representation. All temporal dynamics must be captured by the state-transition function, which means that the state representation must encode the time-invariant structure of the environment.<\/p><\/blockquote>\n","protected":false},"excerpt":{"rendered":"<p>Elijah A. Petter, Samuel J. Gershman, Warren H. Meck, Integrating Models of Interval Timing and Reinforcement Learning, Trends in Cognitive <span class=\"ellipsis\">&hellip;<\/span> <span class=\"more-link-wrap\"><a href=\"https:\/\/babel.isa.uma.es\/kipr\/?p=964\" class=\"more-link\"><span>Read More &rarr;<\/span><\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[35],"tags":[15,373],"class_list":["post-964","post","type-post","status-publish","format-standard","hentry","category-psycho-physiological-bases-of-engineering","tag-reinforcement-learning","tag-time-in-the-brain"],"_links":{"self":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/964"}],"collection":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=964"}],"version-history":[{"count":3,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/964\/revisions"}],"predecessor-version":[{"id":967,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=\/wp\/v2\/posts\/964\/revisions\/967"}],"wp:attachment":[{"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=964"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=964"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/babel.isa.uma.es\/kipr\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=964"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}