Tag Archives: Decision Making

Modelling emotions in adaptive agents through the action selection part of reinforcement learning, plus some references on the neurophysiological bases of RL and a good review of literature on emotions

Joost Broekens, Elmer Jacobs, Catholijn M. Jonker, A reinforcement learning model of joy, distress, hope and fear, Connection Science, Vol. 27, Iss. 3, 2015, DOI: 10.1080/09540091.2015.1031081.

In this paper we computationally study the relation between adaptive behaviour and emotion. Using the reinforcement learning framework, we propose that learned state utility, V(s), models fear (negative) and hope (positive) based on the fact that both signals are about anticipation of loss or gain. Further, we propose that joy/distress is a signal similar to the error signal. We present agent-based simulation experiments that show that this model replicates psychological and behavioural dynamics of emotion. This work distinguishes itself by assessing the dynamics of emotion in an adaptive agent framework – coupling it to the literature on habituation, development, extinction and hope theory. Our results support the idea that the function of emotion is to provide a complex feedback signal for an organism to adapt its behaviour. Our work is relevant for understanding the relation between emotion and adaptation in animals, as well as for human–robot interaction, in particular how emotional signals can be used to communicate between adaptive agents and humans.
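The mapping described in the abstract can be illustrated with plain TD(0) learning. This is only a minimal sketch, not the authors' model: the chain environment, the learning parameters, and the function names `td_emotions` and `label` are assumptions made for illustration. The sign of the learned value V(s) is read as hope or fear, and the sign of the TD error as joy or distress.

```python
def td_emotions(rewards, episodes=50, alpha=0.1, gamma=0.9):
    """TD(0) on a deterministic chain (hypothetical toy environment).

    rewards[i] is the reward received when leaving state i; the state
    after the last one is terminal and keeps a value of 0.
    Returns the learned values and a trace of (state, td_error, value).
    """
    n = len(rewards)
    values = [0.0] * (n + 1)
    trace = []
    for _ in range(episodes):
        for s in range(n):
            # TD error: how much better or worse things went than predicted
            td_error = rewards[s] + gamma * values[s + 1] - values[s]
            values[s] += alpha * td_error
            trace.append((s, td_error, values[s]))
    return values, trace


def label(td_error, value, eps=1e-6):
    """Assumed reading of the abstract: error sign -> joy/distress, value sign -> hope/fear."""
    if td_error > eps:
        joy_distress = "joy"
    elif td_error < -eps:
        joy_distress = "distress"
    else:
        joy_distress = "neutral"
    if value > eps:
        hope_fear = "hope"
    elif value < -eps:
        hope_fear = "fear"
    else:
        hope_fear = "neutral"
    return joy_distress, hope_fear


if __name__ == "__main__":
    values, trace = td_emotions([0.0, 0.0, 1.0])
    goal_errors = [err for state, err, _ in trace if state == 2]
    # The error at the rewarded step shrinks as the reward becomes expected
    print(goal_errors[0], goal_errors[-1])
```

In this toy run the TD error at the rewarded state starts at its maximum and decays as the value estimate converges, which is one simple way to picture the habituation the abstract mentions.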

Quantum probability theory as an alternative to classical (Kolmogorov) probability theory for modelling human decision making processes, and a curious description of how the order in which decisions are made affects the overall result

Peter D. Bruza, Zheng Wang, Jerome R. Busemeyer, Quantum cognition: a new theoretical approach to psychology, Trends in Cognitive Sciences, Volume 19, Issue 7, July 2015, Pages 383-393, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.05.001.

What type of probability theory best describes the way humans make judgments under uncertainty and decisions under conflict? Although rational models of cognition have become prominent and have achieved much success, they adhere to the laws of classical probability theory despite the fact that human reasoning does not always conform to these laws. For this reason we have seen the recent emergence of models based on an alternative probabilistic framework drawn from quantum theory. These quantum models show promise in addressing cognitive phenomena that have proven recalcitrant to modeling by means of classical probability theory. This review compares and contrasts probabilistic models based on Bayesian or classical versus quantum principles, and highlights the advantages and disadvantages of each approach.
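The order effect mentioned above can be sketched with two non-commuting projectors in a two-dimensional real vector space. This is a toy illustration, not a model taken from the review: the angles, the initial state, and the function names are assumptions. In classical probability the joint event "A and B" has the same probability whichever question is asked first; with projectors, the probability of answering yes to both depends on the order.

```python
import math


def projector(theta):
    """Projector onto the unit vector at angle theta (one 'yes' answer subspace)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


def apply(matrix, vec):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [
        matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
        matrix[1][0] * vec[0] + matrix[1][1] * vec[1],
    ]


def prob_yes_then_yes(first, second, state):
    """Probability of 'yes' to the first question and then 'yes' to the second.

    Computed as the squared length of the state after both projections.
    """
    v = apply(second, apply(first, state))
    return v[0] ** 2 + v[1] ** 2


if __name__ == "__main__":
    state = [1.0, 0.0]                 # assumed initial belief state
    question_a = projector(math.pi / 6)  # hypothetical question A
    question_b = projector(math.pi / 3)  # hypothetical question B
    print(prob_yes_then_yes(question_a, question_b, state))
    print(prob_yes_then_yes(question_b, question_a, state))
```

With these assumed angles, asking A first gives a probability of about 0.56 for two yes answers, while asking B first gives about 0.19.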

A new approach to solving POMDP-like problems through gradient descent and optimal control

Vadim Indelman, Luca Carlone, Frank Dellaert, Planning in the continuous domain: A generalized belief space approach for autonomous navigation in unknown environments, The International Journal of Robotics Research, vol. 34 no. 7, pp. 849-882, DOI: 10.1177/0278364914561102.

We investigate the problem of planning under uncertainty, with application to mobile robotics. We propose a probabilistic framework in which the robot bases its decisions on the generalized belief, which is a probabilistic description of its own state and of external variables of interest. The approach naturally leads to a dual-layer architecture: an inner estimation layer, which performs inference to predict the outcome of possible decisions; and an outer decisional layer which is in charge of deciding the best action to undertake. Decision making is entrusted to a model predictive control (MPC) scheme. The formulation is valid for general cost functions and does not discretize the state or control space, enabling planning in continuous domain. Moreover, it allows to relax the assumption of maximum likelihood observations: predicted measurements are treated as random variables, and binary random variables are used to model the event that a measurement is actually taken by the robot. We successfully apply our approach to the problem of uncertainty-constrained exploration, in which the robot has to perform tasks in an unknown environment, while maintaining localization uncertainty within given bounds. We present an extensive numerical analysis of the proposed approach and compare it against related work. In practice, our planning approach produces smooth and natural trajectories and is able to impose soft upper bounds on the uncertainty. Finally, we exploit the results of this analysis to identify current limitations and show that the proposed framework can accommodate several desirable extensions.
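As a much simpler stand-in for the ideas in the abstract, the sketch below propagates the variance of a one-dimensional position estimate along a candidate plan, with a binary flag per step saying whether a measurement is actually obtained. It is not the paper's generalized belief or its MPC scheme: the scalar Kalman model, the noise values, and the names `propagate_variance` and `within_bound` are assumptions for illustration.

```python
def propagate_variance(var0, process_noise, meas_noise, got_measurement):
    """Scalar Kalman variance along a plan.

    got_measurement is a list of booleans, one per step, standing in for
    the binary 'measurement acquired' variables mentioned in the abstract.
    """
    var = var0
    history = []
    for measured in got_measurement:
        var = var + process_noise  # prediction: motion adds uncertainty
        if measured:
            # update for a direct noisy observation of the state
            var = var * meas_noise / (var + meas_noise)
        history.append(var)
    return history


def within_bound(history, bound):
    """Check an uncertainty constraint over the whole plan."""
    return all(v <= bound for v in history)


if __name__ == "__main__":
    blind = propagate_variance(1.0, 0.5, 1.0, [False, False, False])
    sensed = propagate_variance(1.0, 0.5, 1.0, [True, True, True])
    print(blind, within_bound(blind, 2.0))
    print(sensed, within_bound(sensed, 2.0))
```

A planner could score candidate action sequences by such predicted uncertainty; here the plan without measurements violates the assumed bound while the one with measurements stays inside it.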

Reinforcement learning used for an adaptive attention mechanism, and integrated in an architecture with both top-down and bottom-up vision processing

Ognibene, D.; Baldassarre, G., Ecological Active Vision: Four Bioinspired Principles to Integrate Bottom–Up and Adaptive Top–Down Attention Tested With a Simple Camera-Arm Robot, Autonomous Mental Development, IEEE Transactions on, vol.7, no.1, pp.3,25, March 2015. DOI: 10.1109/TAMD.2014.2341351.

Vision gives primates a wealth of information useful to manipulate the environment, but at the same time it can easily overwhelm their computational resources. Active vision is a key solution found by nature to solve this problem: a limited fovea actively displaced in space to collect only relevant information. Here we highlight that in ecological conditions this solution encounters four problems: 1) the agent needs to learn where to look based on its goals; 2) manipulation causes learning feedback in areas of space possibly outside the attention focus; 3) good visual actions are needed to guide manipulation actions, but only these can generate learning feedback; and 4) a limited fovea causes aliasing problems. We then propose a computational architecture (“BITPIC”) to overcome the four problems, integrating four bioinspired key ingredients: 1) reinforcement-learning fovea-based top-down attention; 2) a strong vision-manipulation coupling; 3) bottom-up periphery-based attention; and 4) a novel action-oriented memory. The system is tested with a simple simulated camera-arm robot solving a class of search-and-reach tasks involving color-blob “objects.” The results show that the architecture solves the problems, and hence the tasks, very efficiently, and highlight how the architecture principles can contribute to a full exploitation of the advantages of active vision in ecological conditions.

On the role of emotions in cognition, in particular in cognitive control

Michael Inzlicht, Bruce D. Bartholow, Jacob B. Hirsh, 2015, Emotional foundations of cognitive control, Trends in Cognitive Sciences, Volume 19, Issue 3, March 2015, Pages 126-132, DOI: 10.1016/j.tics.2015.01.004.

Often seen as the paragon of higher cognition, here we suggest that cognitive control is dependent on emotion. Rather than asking whether control is influenced by emotion, we ask whether control itself can be understood as an emotional process. Reviewing converging evidence from cybernetics, animal research, cognitive neuroscience, and social and personality psychology, we suggest that cognitive control is initiated when goal conflicts evoke phasic changes to emotional primitives that both focus attention on the presence of goal conflicts and energize conflict resolution to support goal-directed behavior. Critically, we propose that emotion is not an inert byproduct of conflict but is instrumental in recruiting control. Appreciating the emotional foundations of control leads to testable predictions that can spur future research.

Solving the problem of the slow learning rate of reinforcement learning by learning the transition model from data

Deisenroth, M.P.; Fox, D.; Rasmussen, C.E., Gaussian Processes for Data-Efficient Learning in Robotics and Control, Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.37, no.2, pp.408,423, Feb. 2015, DOI: 10.1109/TPAMI.2013.218

Autonomous learning has been a promising direction in control and robotics for more than a decade since data-driven learning allows to reduce the amount of engineering knowledge, which is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the art RL our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.

Partially observable reinforcement learning and the problem of representing the history of the learning process efficiently

Doshi-Velez, F.; Pfau, D.; Wood, F.; Roy, N., Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning, Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.37, no.2, pp.394,407, Feb. 2015, DOI: 10.1109/TPAMI.2013.191

Making intelligent decisions from incomplete information is critical in many applications: for example, robots must choose actions based on imperfect sensors, and speech-based interfaces must infer a user’s needs from noisy microphone inputs. What makes these tasks hard is that often we do not have a natural representation with which to model the domain and use for choosing actions; we must learn about the domain’s properties while simultaneously performing the task. Learning a representation also involves trade-offs between modeling the data that we have seen previously and being able to make predictions about new data. This article explores learning representations of stochastic systems using Bayesian nonparametric statistics. Bayesian nonparametric methods allow the sophistication of a representation to scale gracefully with the complexity in the data. Our main contribution is a careful empirical evaluation of how representations learned using Bayesian nonparametric methods compare to other standard learning approaches, especially in support of planning and control. We show that the Bayesian aspects of the methods result in achieving state-of-the-art performance in decision making with relatively few samples, while the nonparametric aspects often result in fewer computations. These results hold across a variety of different techniques for choosing actions given a representation.
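The idea that a nonparametric representation grows with the data can be pictured with the Chinese restaurant process, a standard Bayesian nonparametric prior over partitions. The abstract does not say which priors the paper uses, so treating this as representative is an assumption; the function name and parameters below are likewise chosen only for illustration.

```python
import random


def crp_table_counts(n_customers, alpha, seed=0):
    """Sample a partition from a Chinese restaurant process.

    Customer i (0-based) joins existing table k with probability
    count_k / (i + alpha) and opens a new table with probability
    alpha / (i + alpha). Returns the list of table sizes.
    """
    rng = random.Random(seed)
    tables = []
    for i in range(n_customers):
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        for k, count in enumerate(tables):
            cumulative += count
            if r < cumulative:
                tables[k] += 1
                break
        else:
            # r fell in the last alpha-wide slice: open a new table
            tables.append(1)
    return tables


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, len(crp_table_counts(n, alpha=1.0)))
```

The number of tables (components of the representation) is not fixed in advance; it can only grow as more customers (data points) arrive, and it tends to grow slowly compared with the data size.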

On the way humans reduce perceptual information during decision making, departing from statistically optimal behavior, in order to deal with the overwhelming sensory flow

Christopher Summerfield, Konstantinos Tsetsos, Do humans make good decisions?, Trends in Cognitive Sciences, Volume 19, Issue 1, January 2015, Pages 27-34, ISSN 1364-6613, DOI: 10.1016/j.tics.2014.11.005

Human performance on perceptual classification tasks approaches that of an ideal observer, but economic decisions are often inconsistent and intransitive, with preferences reversing according to the local context. We discuss the view that suboptimal choices may result from the efficient coding of decision-relevant information, a strategy that allows expected inputs to be processed with higher gain than unexpected inputs. Efficient coding leads to ‘robust’ decisions that depart from optimality but maximise the information transmitted by a limited-capacity system in a rapidly-changing world. We review recent work showing that when perceptual environments are variable or volatile, perceptual decisions exhibit the same suboptimal context-dependence as economic choices, and we propose a general computational framework that accounts for findings across the two domains.