Decision making | kipr

A good intro about actor-critic and decision making without model on MDPs

June 30, 2017 11:31 , Juan-Antonio Fernández-Madrigal

J. Wang and I. C. Paschalidis, “An Actor-Critic Algorithm With Second-Order Actor and Critic,” in IEEE Transactions on Automatic Control, vol. 62, no. 6, pp. 2689-2703, June 2017. DOI: 10.1109/TAC.2016.2616384.

Actor-critic algorithms solve dynamic decision making problems by optimizing a performance metric of interest over a user-specified parametric class of policies. They employ a combination of an actor, making policy improvement steps, and a critic, computing policy improvement directions. Many existing algorithms use a steepest ascent method to improve the policy, which is known to suffer from slow convergence for ill-conditioned problems. In this paper, we first develop an estimate of the (Hessian) matrix containing the second derivatives of the performance metric with respect to policy parameters. Using this estimate, we introduce a new second-order policy improvement method and couple it with a critic using a second-order learning method. We establish almost sure convergence of the new method to a neighborhood of a policy parameter stationary point. We compare the new algorithm with some existing algorithms in two applications and demonstrate that it leads to significantly faster convergence.
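The abstract's remark that steepest ascent converges slowly on ill-conditioned problems can be seen on a toy example. The sketch below is not the paper's algorithm: it assumes a deterministic concave quadratic objective with an exactly known diagonal Hessian (no critic, no sampling, no Hessian estimation), and the function names and curvature values are made up for illustration. It only compares how many updates a first-order step and a second-order (Newton) step need to reach the maximizer.

```python
# Toy objective (an assumption for illustration): J(theta) = -0.5 * sum(c_i * theta_i^2),
# maximized at theta = 0. Very different curvatures c_i make it ill-conditioned.

def gradient(theta, curvatures):
    """Gradient of J: dJ/dtheta_i = -c_i * theta_i."""
    return [-c * t for c, t in zip(curvatures, theta)]


def steepest_ascent_steps(theta, curvatures, step_size, tol=1e-6, max_steps=100_000):
    """Count first-order updates until every coordinate is within tol of 0."""
    steps = 0
    while max(abs(t) for t in theta) > tol and steps < max_steps:
        g = gradient(theta, curvatures)
        theta = [t + step_size * gi for t, gi in zip(theta, g)]
        steps += 1
    return steps


def newton_steps(theta, curvatures, tol=1e-6, max_steps=100_000):
    """Count second-order updates theta <- theta - H^{-1} g, with H = diag(-c_i)."""
    steps = 0
    while max(abs(t) for t in theta) > tol and steps < max_steps:
        g = gradient(theta, curvatures)
        # Dividing by -c_i and subtracting is the same as adding g_i / c_i.
        theta = [t + gi / c for t, gi, c in zip(theta, g, curvatures)]
        steps += 1
    return steps


if __name__ == "__main__":
    curvatures = [1.0, 100.0]      # condition number 100
    start = [1.0, 1.0]
    # The step size must stay below 2 / max(c_i) for the first-order method to be stable.
    print("steepest ascent:", steepest_ascent_steps(start, curvatures, step_size=0.01))
    print("newton:", newton_steps(start, curvatures))
```

The step size of the first-order method is capped by the largest curvature, so progress along the flat direction is slow; the second-order step rescales each direction by its own curvature, which is the intuition behind using Hessian information in the actor.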

Posted in: Artificial Intelligence , Tagged: Actor-critic, Decision making, MDPs

On how the calculus of utility of actions drives many human behaviours

June 8, 2017 12:11 , Juan-Antonio Fernández-Madrigal

Julian Jara-Ettinger, Hyowon Gweon, Laura E. Schulz, Joshua B. Tenenbaum, The Naïve Utility Calculus: Computational Principles Underlying Commonsense Psychology, Trends in Cognitive Sciences, Volume 20, Issue 8, 2016, Pages 589-604, ISSN 1364-6613, DOI: 10.1016/j.tics.2016.05.011.

We propose that human social cognition is structured around a basic understanding of ourselves and others as intuitive utility maximizers: from a young age, humans implicitly assume that agents choose goals and actions to maximize the rewards they expect to obtain relative to the costs they expect to incur. This ‘naïve utility calculus’ allows both children and adults to observe the behavior of others and infer their beliefs and desires, their longer-term knowledge and preferences, and even their character: who is knowledgeable or competent, who is praiseworthy or blameworthy, who is friendly, indifferent, or an enemy. We review studies providing support for the naïve utility calculus, and we show how it captures much of the rich social reasoning humans engage in from infancy.

Posted in: Psycho-physiological bases of engineering , Tagged: Decision making

Theoretical models for explaining the human (quick) decision-making process

April 20, 2016 16:42 , Juan-Antonio Fernández-Madrigal

Roger Ratcliff, Philip L. Smith, Scott D. Brown, Gail McKoon, Diffusion Decision Model: Current Issues and History, Trends in Cognitive Sciences, Volume 20, Issue 4, April 2016, Pages 260-281, ISSN 1364-6613, DOI: 10.1016/j.tics.2016.01.007.

There is growing interest in diffusion models to represent the cognitive and neural processes of speeded decision making. Sequential-sampling models like the diffusion model have a long history in psychology. They view decision making as a process of noisy accumulation of evidence from a stimulus. The standard model assumes that evidence accumulates at a constant rate during the second or two it takes to make a decision. This process can be linked to the behaviors of populations of neurons and to theories of optimality. Diffusion models have been used successfully in a range of cognitive tasks and as psychometric tools in clinical research to examine individual differences. In this review, we relate the models to both earlier and more recent research in psychology.
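The "noisy accumulation of evidence at a constant rate" described above can be simulated in a few lines. This is a minimal sketch of a two-boundary diffusion trial, assuming a simple Euler discretization; the parameter names (drift, boundary, start, noise_sd, dt) and their values are illustrative choices, not fitted values from the review, and extras such as non-decision time or across-trial variability are left out.

```python
import random


def simulate_trial(drift, boundary, start, noise_sd=1.0, dt=0.001,
                   max_time=10.0, rng=None):
    """Accumulate noisy evidence from `start` until it leaves (0, boundary).

    Returns (choice, decision_time), where choice is "upper", "lower",
    or None if no boundary was reached before max_time.
    """
    if rng is None:
        rng = random.Random()
    evidence, elapsed = start, 0.0
    while 0.0 < evidence < boundary and elapsed < max_time:
        # Constant drift plus Gaussian noise scaled by sqrt(dt).
        evidence += drift * dt + noise_sd * (dt ** 0.5) * rng.gauss(0.0, 1.0)
        elapsed += dt
    if evidence >= boundary:
        return "upper", elapsed
    if evidence <= 0.0:
        return "lower", elapsed
    return None, elapsed


def upper_fraction(n_trials, drift, boundary, start, seed=0):
    """Fraction of simulated trials that end at the upper boundary."""
    rng = random.Random(seed)
    hits = sum(
        simulate_trial(drift, boundary, start, rng=rng)[0] == "upper"
        for _ in range(n_trials)
    )
    return hits / n_trials


if __name__ == "__main__":
    print(simulate_trial(drift=2.0, boundary=1.0, start=0.5, rng=random.Random(1)))
    print(upper_fraction(200, drift=2.0, boundary=1.0, start=0.5))
```

A positive drift pushes most trials to the upper boundary, while the noise produces both occasional errors and a spread of decision times, which are the two kinds of data these models are usually fitted to.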

Posted in: Psycho-physiological bases of engineering , Tagged: Decision making

The quick-intuition vs. slow-deliberation dilemma from a decision-making perspective

December 22, 2015 17:36 , Juan-Antonio Fernández-Madrigal

Y-Lan Boureau, Peter Sokol-Hessner, Nathaniel D. Daw, Deciding How To Decide: Self-Control and Meta-Decision Making, Trends in Cognitive Sciences, Volume 19, Issue 11, November 2015, Pages 700-710, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.08.013.

Many different situations related to self control involve competition between two routes to decisions: default and frugal versus more resource-intensive. Examples include habits versus deliberative decisions, fatigue versus cognitive effort, and Pavlovian versus instrumental decision making. We propose that these situations are linked by a strikingly similar core dilemma, pitting the opportunity costs of monopolizing shared resources such as executive functions for some time, against the possibility of obtaining a better outcome. We offer a unifying normative perspective on this underlying rational meta-optimization, review how this may tie together recent advances in many separate areas, and connect several independent models. Finally, we suggest that the crucial mechanisms and meta-decision variables may be shared across domains.

Posted in: Cognitive sciences , Tagged: Decision making, Intuition vs. deliberation

Modelling emotions in adaptive agents through the action selection part of reinforcement learning, plus some references on the neurophysiological bases of RL and a good review of literature on emotions

October 5, 2015 09:08 , Juan-Antonio Fernández-Madrigal

Joost Broekens, Elmer Jacobs, Catholijn M. Jonker, A reinforcement learning model of joy, distress, hope and fear, Connection Science, Vol. 27, Iss. 3, 2015, DOI: 10.1080/09540091.2015.1031081.

In this paper we computationally study the relation between adaptive behaviour and emotion. Using the reinforcement learning framework, we propose that learned state utility, V(s), models fear (negative) and hope (positive) based on the fact that both signals are about anticipation of loss or gain. Further, we propose that joy/distress is a signal similar to the error signal. We present agent-based simulation experiments that show that this model replicates psychological and behavioural dynamics of emotion. This work distinguishes itself by assessing the dynamics of emotion in an adaptive agent framework – coupling it to the literature on habituation, development, extinction and hope theory. Our results support the idea that the function of emotion is to provide a complex feedback signal for an organism to adapt its behaviour. Our work is relevant for understanding the relation between emotion and adaptation in animals, as well as for human–robot interaction, in particular how emotional signals can be used to communicate between adaptive agents and humans.
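The mapping proposed in the abstract (state value V(s) as hope or fear, an error-like signal as joy or distress) can be illustrated with plain TD(0) learning. The sketch below is only an assumption-laden toy: the three-state chain, the threshold eps, the label functions and the learning constants are all hypothetical, and the paper's own model and experiments are richer than this.

```python
def label_anticipation(value, eps=1e-3):
    """Read the learned state value as hope (positive) or fear (negative)."""
    if value > eps:
        return "hope"
    if value < -eps:
        return "fear"
    return "neutral"


def label_outcome(td_error, eps=1e-3):
    """Read the temporal-difference error as joy (positive) or distress (negative)."""
    if td_error > eps:
        return "joy"
    if td_error < -eps:
        return "distress"
    return "neutral"


def run_episode(values, rewards, alpha=0.1, gamma=0.9):
    """Walk the chain once, updating `values` in place with TD(0).

    rewards[s] is received when leaving state s; the last state leads to a
    terminal state of value 0. Returns one (anticipation, outcome) pair per state.
    """
    signals = []
    n = len(values)
    for s in range(n):
        next_value = values[s + 1] if s + 1 < n else 0.0
        td_error = rewards[s] + gamma * next_value - values[s]
        signals.append((label_anticipation(values[s]), label_outcome(td_error)))
        values[s] += alpha * td_error
    return signals


def train(rewards, episodes):
    """Run several episodes from zero-initialized values; return values and signals."""
    values = [0.0] * len(rewards)
    history = [run_episode(values, rewards) for _ in range(episodes)]
    return values, history


if __name__ == "__main__":
    values, history = train([0.0, 0.0, 1.0], episodes=200)
    print("first episode:", history[0])
    print("last episode:", history[-1])
    print("learned values:", [round(v, 3) for v in values])
```

In this toy, the first rewarded episode produces "joy" at the rewarded state with no anticipation, while after many repetitions the same state carries "hope" and the outcome signal fades to neutral, a crude analogue of the habituation dynamics mentioned in the abstract.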

Posted in: Psycho-physiological bases of engineering, Reinforcement learning in AI , Tagged: Decision making, Directly bioinspired, Emotional robotics, Reinforcement learning, Survey, Useful for teaching

Quantum probability theory as an alternative to classical (Kolmogorov) probability theory for modelling human decision-making processes, and a curious description of the effect of a particular ordering of decisions in the complete result

July 21, 2015 11:45 , Juan-Antonio Fernández-Madrigal

Peter D. Bruza, Zheng Wang, Jerome R. Busemeyer, Quantum cognition: a new theoretical approach to psychology, Trends in Cognitive Sciences, Volume 19, Issue 7, July 2015, Pages 383-393, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.05.001.

What type of probability theory best describes the way humans make judgments under uncertainty and decisions under conflict? Although rational models of cognition have become prominent and have achieved much success, they adhere to the laws of classical probability theory despite the fact that human reasoning does not always conform to these laws. For this reason we have seen the recent emergence of models based on an alternative probabilistic framework drawn from quantum theory. These quantum models show promise in addressing cognitive phenomena that have proven recalcitrant to modeling by means of classical probability theory. This review compares and contrasts probabilistic models based on Bayesian or classical versus quantum principles, and highlights the advantages and disadvantages of each approach.
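The order effect mentioned in the post title can be reproduced with the smallest possible quantum-style model. In classical probability the joint event "A and B" does not depend on the order in which the questions are asked; with projections onto non-orthogonal, non-parallel directions it does. The sketch below is a generic textbook construction, not a model taken from the review: the belief state is a unit vector in the plane, each question's "yes" answer is a ray at a chosen angle, and the angles are arbitrary illustrative values.

```python
import math


def project(state, angle):
    """Project a 2-D real state vector onto the ray at `angle` (radians)."""
    ux, uy = math.cos(angle), math.sin(angle)
    amplitude = state[0] * ux + state[1] * uy
    return (amplitude * ux, amplitude * uy)


def prob_yes_then_yes(state, first_angle, second_angle):
    """Probability of answering "yes" to the first question and then to the second.

    Each "yes" projects the current state onto that question's ray; the
    squared length of the final vector is the sequential probability.
    """
    after_first = project(state, first_angle)
    after_both = project(after_first, second_angle)
    return after_both[0] ** 2 + after_both[1] ** 2


if __name__ == "__main__":
    belief = (1.0, 0.0)          # initial unit-length belief state
    question_a = math.pi / 4     # "yes" ray for question A
    question_b = math.pi / 2     # "yes" ray for question B
    print("A then B:", prob_yes_then_yes(belief, question_a, question_b))
    print("B then A:", prob_yes_then_yes(belief, question_b, question_a))
```

With these angles the initial state is orthogonal to the "yes" ray of B, so asking B first makes the sequence essentially impossible, whereas asking A first rotates the state toward B and leaves a clearly non-zero probability.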

Posted in: Probability theories and interpretations, Psycho-physiological bases of engineering , Tagged: Decision making, Directly bioinspired, Quantum probability, Survey

A new approach to solve POMDP-like problems through gradient descent and optimal control

July 16, 2015 16:31 , Juan-Antonio Fernández-Madrigal

Vadim Indelman, Luca Carlone, Frank Dellaert, Planning in the continuous domain: A generalized belief space approach for autonomous navigation in unknown environments, The International Journal of Robotics Research, vol. 34 no. 7, pp. 849-882, DOI: 10.1177/0278364914561102.

We investigate the problem of planning under uncertainty, with application to mobile robotics. We propose a probabilistic framework in which the robot bases its decisions on the generalized belief, which is a probabilistic description of its own state and of external variables of interest. The approach naturally leads to a dual-layer architecture: an inner estimation layer, which performs inference to predict the outcome of possible decisions; and an outer decisional layer which is in charge of deciding the best action to undertake. Decision making is entrusted to a model predictive control (MPC) scheme. The formulation is valid for general cost functions and does not discretize the state or control space, enabling planning in continuous domain. Moreover, it allows to relax the assumption of maximum likelihood observations: predicted measurements are treated as random variables, and binary random variables are used to model the event that a measurement is actually taken by the robot. We successfully apply our approach to the problem of uncertainty-constrained exploration, in which the robot has to perform tasks in an unknown environment, while maintaining localization uncertainty within given bounds. We present an extensive numerical analysis of the proposed approach and compare it against related work. In practice, our planning approach produces smooth and natural trajectories and is able to impose soft upper bounds on the uncertainty. Finally, we exploit the results of this analysis to identify current limitations and show that the proposed framework can accommodate several desirable extensions.

Posted in: Robot task planning , Tagged: Active exploration, Decision making, POMDP, Task planning

Reinforcement learning used for an adaptive attention mechanism, and integrated in an architecture with both top-down and bottom-up vision processing

April 24, 2015 11:26 , Juan-Antonio Fernández-Madrigal

Ognibene, D.; Baldassare, G., Ecological Active Vision: Four Bioinspired Principles to Integrate Bottom–Up and Adaptive Top–Down Attention Tested With a Simple Camera-Arm Robot, IEEE Transactions on Autonomous Mental Development, vol. 7, no. 1, pp. 3-25, March 2015. DOI: 10.1109/TAMD.2014.2341351.

Vision gives primates a wealth of information useful to manipulate the environment, but at the same time it can easily overwhelm their computational resources. Active vision is a key solution found by nature to solve this problem: a limited fovea actively displaced in space to collect only relevant information. Here we highlight that in ecological conditions this solution encounters four problems: 1) the agent needs to learn where to look based on its goals; 2) manipulation causes learning feedback in areas of space possibly outside the attention focus; 3) good visual actions are needed to guide manipulation actions, but only these can generate learning feedback; and 4) a limited fovea causes aliasing problems. We then propose a computational architecture (“BITPIC”) to overcome the four problems, integrating four bioinspired key ingredients: 1) reinforcement-learning fovea-based top-down attention; 2) a strong vision-manipulation coupling; 3) bottom-up periphery-based attention; and 4) a novel action-oriented memory. The system is tested with a simple simulated camera-arm robot solving a class of search-and-reach tasks involving color-blob “objects.” The results show that the architecture solves the problems, and hence the tasks, very efficiently, and highlight how the architecture principles can contribute to a full exploitation of the advantages of active vision in ecological conditions.

Posted in: Applications of reinforcement learning to robots, Computer vision, Psycho-physiological bases of engineering , Tagged: Decision making, Directly bioinspired, Manipulation, Q-learning, Reinforcement learning

On the role of emotions in cognition, in particular in cognitive control

March 11, 2015 15:14 , Juan-Antonio Fernández-Madrigal

Michael Inzlicht, Bruce D. Bartholow, Jacob B. Hirsh, 2015, Emotional foundations of cognitive control, Trends in Cognitive Sciences, Volume 19, Issue 3, March 2015, Pages 126-132, DOI: 10.1016/j.tics.2015.01.004.

Often seen as the paragon of higher cognition, here we suggest that cognitive control is dependent on emotion. Rather than asking whether control is influenced by emotion, we ask whether control itself can be understood as an emotional process. Reviewing converging evidence from cybernetics, animal research, cognitive neuroscience, and social and personality psychology, we suggest that cognitive control is initiated when goal conflicts evoke phasic changes to emotional primitives that both focus attention on the presence of goal conflicts and energize conflict resolution to support goal-directed behavior. Critically, we propose that emotion is not an inert byproduct of conflict but is instrumental in recruiting control. Appreciating the emotional foundations of control leads to testable predictions that can spur future research.

Posted in: Cognitive sciences , Tagged: Decision making, Useful for teaching

Solving the problem of the slow learning rate of reinforcement learning through the acquisition of the transition model from the data

January 23, 2015 18:03 , Juan-Antonio Fernández-Madrigal

Deisenroth, M.P.; Fox, D.; Rasmussen, C.E., Gaussian Processes for Data-Efficient Learning in Robotics and Control, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 2, pp. 408-423, Feb. 2015, DOI: 10.1109/TPAMI.2013.218.

Autonomous learning has been a promising direction in control and robotics for more than a decade since data-driven learning allows to reduce the amount of engineering knowledge, which is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the art RL our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.
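What a "probabilistic, non-parametric Gaussian process transition model" provides can be shown with a bare-bones GP regression on one-step transitions. This is only a sketch under strong assumptions: a scalar state, a squared-exponential kernel with fixed (not learned) hyperparameters, made-up training transitions, and no long-term uncertainty propagation or policy search, which are the parts that make the paper's method work. All function names are illustrative.

```python
import math


def rbf(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def cholesky(matrix):
    """Lower-triangular L with L * L^T = matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def solve_lower(lower, rhs):
    """Forward substitution: solve L y = rhs."""
    y = []
    for i, row in enumerate(lower):
        y.append((rhs[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y


def solve_lower_transposed(lower, rhs):
    """Back substitution: solve L^T x = rhs."""
    n = len(lower)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(lower[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (rhs[i] - s) / lower[i][i]
    return x


def gp_predict(train_x, train_y, query, noise_var=1e-4):
    """Posterior mean and variance of the next state at `query`."""
    n = len(train_x)
    gram = [[rbf(train_x[i], train_x[j]) + (noise_var if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    lower = cholesky(gram)
    alpha = solve_lower_transposed(lower, solve_lower(lower, train_y))
    k_star = [rbf(query, x) for x in train_x]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(lower, k_star)
    variance = rbf(query, query) - sum(vi * vi for vi in v)
    return mean, max(variance, 0.0)


if __name__ == "__main__":
    states = [-2.0, -1.0, 0.0, 1.0, 2.0]        # observed states x_t
    next_states = [math.sin(x) for x in states]  # hypothetical observed x_{t+1}
    print("near data:", gp_predict(states, next_states, 0.5))
    print("far from data:", gp_predict(states, next_states, 10.0))
```

The point of the model is the second returned number: close to observed transitions the predictive variance is small, while far from them it returns to the prior variance, so a planner that takes this variance into account can avoid trusting the model where it has seen no data.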

Posted in: Applications of reinforcement learning to robots , Tagged: Decision making, Reinforcement learning

Tag Archives: Decision Making
