Tag Archives: Survey

Survey of model-based reinforcement learning (and of reinforcement learning in general), for its application to improve learning time in robotics; a lot of references but not so many -or clear- explanations

Athanasios S. Polydoros, Lazaros Nalpantidis, Survey of Model-Based Reinforcement Learning: Applications on Robotics, Journal of Intelligent & Robotic Systems, May 2017, Volume 86, Issue 2, pp 153–173, DOI: 10.1007/s10846-017-0468-y.

Reinforcement learning is an appealing approach for allowing robots to learn new tasks. Relevant literature reveals a plethora of methods, but at the same time makes clear the lack of implementations for dealing with real life challenges. Current expectations raise the demand for adaptable robots. We argue that, by employing model-based reinforcement learning, the—now limited—adaptability characteristics of robotic systems can be expanded. Also, model-based reinforcement learning exhibits advantages that makes it more applicable to real life use-cases compared to model-free methods. Thus, in this survey, model-based methods that have been applied in robotics are covered. We categorize them based on the derivation of an optimal policy, the definition of the returns function, the type of the transition model and the learned task. Finally, we discuss the applicability of model-based reinforcement learning approaches in new applications, taking into consideration the state of the art in both algorithms and hardware.

Emergence of symbols in robotics as a “new” area of research in developmental robotics: a survey

Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, Hideki Asoh, Symbol Emergence in Robotics: A Survey, arXiv:1509.08973.

Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding human social interactions and developing a robot that can smoothly communicate with human users in the long term, requires an understanding of the dynamics of symbol systems and is crucially important. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-art research topics concerning SER, e.g., multimodal categorization, word discovery, and a double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory–motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER.

Modelling emotions in adaptive agents through the action selection part of reinforcement learning, plus some references on the neurophysiological bases of RL and a good review of literature on emotions

Joost Broekens , Elmer Jacobs , Catholijn M. Jonker, A reinforcement learning model of joy, distress, hope and fear, Connection Science, Vol. 27, Iss. 3, 2015, DOI: 10.1080/09540091.2015.1031081.

In this paper we computationally study the relation between adaptive behaviour and emotion. Using the reinforcement learning framework, we propose that learned state utility, V(s), models fear (negative) and hope (positive) based on the fact that both signals are about anticipation of loss or gain. Further, we propose that joy/distress is a signal similar to the error signal. We present agent-based simulation experiments that show that this model replicates psychological and behavioural dynamics of emotion. This work distinguishes itself by assessing the dynamics of emotion in an adaptive agent framework – coupling it to the literature on habituation, development, extinction and hope theory. Our results support the idea that the function of emotion is to provide a complex feedback signal for an organism to adapt its behaviour. Our work is relevant for understanding the relation between emotion and adaptation in animals, as well as for human–robot interaction, in particular how emotional signals can be used to communicate between adaptive agents and humans.

A clarification and systematization of UKF

Menegaz, H.M.T.; Ishihara, J.Y.; Borges, G.A.; Vargas, A.N., A Systematization of the Unscented Kalman Filter Theory, in Automatic Control, IEEE Transactions on , vol.60, no.10, pp.2583-2598, Oct. 2015 DOI: 10.1109/TAC.2015.2404511.

In this paper, we propose a systematization of the (discrete-time) Unscented Kalman Filter (UKF) theory. We gather all available UKF variants in the literature, present corrections to theoretical inconsistencies, and provide a tool for the construction of new UKF’s in a consistent way. This systematization is done, mainly, by revisiting the concepts of Sigma-Representation, Unscented Transformation (UT), Scaled Unscented Transformation (SUT), UKF, and Square-Root Unscented Kalman Filter (SRUKF). Inconsistencies are related to 1) matching the order of the transformed covariance and cross-covariance matrices of both the UT and the SUT; 2) multiple UKF definitions; 3) issue with some reduced sets of sigma points described in the literature; 4) the conservativeness of the SUT; 5) the scaling effect of the SUT on both its transformed covariance and cross-covariance matrices; and 6) possibly ill-conditioned results in SRUKF’s. With the proposed systematization, the symmetric sets of sigma points in the literature are formally justified, and we are able to provide new consistent variations for UKF’s, such as the Scaled SRUKF’s and the UKF’s composed by the minimum number of sigma points. Furthermore, our proposed SRUKF has improved computational properties when compared to state-of-the-art methods.

Survey on Model-Driven Software Engineering for real-time embedded systems and robotics

Brugali, D., Model-Driven Software Engineering in Robotics: Models Are Designed to Use the Relevant Things, Thereby Reducing the Complexity and Cost in the Field of Robotics, in Robotics & Automation Magazine, IEEE , vol.22, no.3, pp.155-166, Sept. 2015, DOI: 10.1109/MRA.2015.2452201.

A model is an abstract representation of a real system or phenomenon [1]. The idea of a model is to capture important properties of reality and to eglect irrelevant details. The properties that are relevant and that can be neglected depend on the purpose of creating a model. A model can make a particular system or phenomenon easier to understand, quantify, visualize, simulate, or predict.

On how the human cognition detects regularities in noisy sensory data (“Statistical learning” in psychology terms)

Annabelle Goujon, André Didierjean, Simon Thorpe, Investigating implicit statistical learning mechanisms through contextual cueing, Trends in Cognitive Sciences, Volume 19, Issue 9, September 2015, Pages 524-533, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.07.009.

Since its inception, the contextual cueing (CC) paradigm has generated considerable interest in various fields of cognitive sciences because it constitutes an elegant approach to understanding how statistical learning (SL) mechanisms can detect contextual regularities during a visual search. In this article we review and discuss five aspects of CC: (i) the implicit nature of learning, (ii) the mechanisms involved in CC, (iii) the mediating factors affecting CC, (iv) the generalization of CC phenomena, and (v) the dissociation between implicit and explicit CC phenomena. The findings suggest that implicit SL is an inherent component of ongoing processing which operates through clustering, associative, and reinforcement processes at various levels of sensory-motor processing, and might result from simple spike-timing-dependent plasticity.

A new algorithm for clock synchronization in wireless sensor networks with bounded delays, that includes interesting references to surveys

Emanuele Garone, Andrea Gasparri, Francesco Lamonaca, Clock synchronization protocol for wireless sensor networks with bounded communication delays, Automatica, Volume 59, September 2015, Pages 60-72, ISSN 0005-1098, DOI: 10.1016/j.automatica.2015.06.014.

In this paper, we address the clock synchronization problem for wireless sensor networks. In particular, we consider a wireless sensor network where nodes are equipped with a local clock and communicate in order to achieve a common sense of time. The proposed approach consists of two asynchronous consensus algorithms, the first of which synchronizes the clocks frequency and the second of which synchronizes the clocks offset. This work advances the state of the art by providing robustness against bounded communication delays. A theoretical characterization of the algorithm properties is provided. Simulations and experimental results are presented to corroborate the theoretical findings and show the effectiveness of the proposed algorithm.

A very interesting review of current approaches to SLAM based on smoothing (i.e., graph optimization) and in clustering the map into submaps

Jiantong Cheng, Jonghyuk Kim, Jinliang Shao, Weihua Zhang, Robust linear pose graph-based SLAM, Robotics and Autonomous Systems, Volume 72, October 2015, Pages 71-82, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.04.010.

This paper addresses a robust and efficient solution to eliminate false loop-closures in a pose-graph linear SLAM problem. Linear SLAM was recently demonstrated based on submap joining techniques in which a nonlinear coordinate transformation was performed separately out of the optimization loop, resulting in a convex optimization problem. This however introduces added complexities in dealing with false loop-closures, which mostly stems from two factors: (a) the limited local observations in map-joining stages and (b) the non block-diagonal nature of the information matrix of each submap. To address these problems, we propose a Robust Linear SLAM by (a) developing a delayed optimization for outlier candidates and (b) utilizing a Schur complement to efficiently eliminate corrupted information block. Based on this new strategy, we prove that the spread of outlier information does not compromise the optimization performance of inliers and can be fully filtered out from the corrupted information matrix. Experimental results based on public synthetic and real-world datasets in 2D and 3D environments show that this robust approach can cope with the incorrect loop-closures robustly and effectively.

Quantum probability theory as an alternative to classical (Kolgomorov) probability theory for modelling human decision making processes, and a curious description of the effect of a particular ordering of decisions in the complete result

Peter D. Bruza, Zheng Wang, Jerome R. Busemeyer, Quantum cognition: a new theoretical approach to psychology, Trends in Cognitive Sciences, Volume 19, Issue 7, July 2015, Pages 383-393, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.05.001.

What type of probability theory best describes the way humans make judgments under uncertainty and decisions under conflict? Although rational models of cognition have become prominent and have achieved much success, they adhere to the laws of classical probability theory despite the fact that human reasoning does not always conform to these laws. For this reason we have seen the recent emergence of models based on an alternative probabilistic framework drawn from quantum theory. These quantum models show promise in addressing cognitive phenomena that have proven recalcitrant to modeling by means of classical probability theory. This review compares and contrasts probabilistic models based on Bayesian or classical versus quantum principles, and highlights the advantages and disadvantages of each approach.

Developmental approach for a robot manipulator that learns in several bootstrapped stages, strongly inspired in infant development

Ugur, E.; Nagai, Y.; Sahin, E.; Oztop, E., Staged Development of Robot Skills: Behavior Formation, Affordance Learning and Imitation with Motionese, Autonomous Mental Development, IEEE Transactions on , vol.7, no.2, pp.119,139, June 2015, DOI: 10.1109/TAMD.2015.2426192.

Inspired by infant development, we propose a three staged developmental framework for an anthropomorphic robot manipulator. In the first stage, the robot is initialized with a basic reach-and- enclose-on-contact movement capability, and discovers a set of behavior primitives by exploring its movement parameter space. In the next stage, the robot exercises the discovered behaviors on different objects, and learns the caused effects; effectively building a library of affordances and associated predictors. Finally, in the third stage, the learned structures and predictors are used to bootstrap complex imitation and action learning with the help of a cooperative tutor. The main contribution of this paper is the realization of an integrated developmental system where the structures emerging from the sensorimotor experience of an interacting real robot are used as the sole building blocks of the subsequent stages that generate increasingly more complex cognitive capabilities. The proposed framework includes a number of common features with infant sensorimotor development. Furthermore, the findings obtained from the self-exploration and motionese guided human-robot interaction experiments allow us to reason about the underlying mechanisms of simple-to-complex sensorimotor skill progression in human infants.