Category Archives: Robotics

Learning concepts from graphs in robotics, through first-order logic and discovery of subgraphs, forming arbitrary hierarchies

Ana C. Tenorio-González, Eduardo F. Morales, Automatic discovery of relational concepts by an incremental graph-based representation, Robotics and Autonomous Systems, Volume 83, 2016, Pages 1-14, ISSN 0921-8890, DOI: 10.1016/j.robot.2016.06.012.

Automatic discovery of concepts has been an elusive area in machine learning. In this paper, we describe a system, called ADC, that automatically discovers concepts in a robotics domain, performing predicate invention. Unlike traditional approaches to concept discovery, our approach automatically finds and collects instances of potential relational concepts. An agent, using ADC, creates an incremental graph-based representation with the information it gathers while exploring its environment, from which common subgraphs are identified. The subgraphs discovered are instances of potential relational concepts, which are induced with Inductive Logic Programming and predicate invention. Several concepts can be induced concurrently, and the learned concepts can form arbitrary hierarchies. The approach was tested for learning concepts of polygons, furniture, and floors of buildings with a simulated robot, and compared with concepts suggested by users.
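
To make the first step concrete, here is a minimal Python sketch (not the authors' ADC implementation; observations and relation names are hypothetical): relational observations are collected as labelled edges, and recurring connected two-edge patterns, the simplest kind of common subgraph, are counted as candidate concepts to hand to an ILP learner.

```python
# Minimal sketch, not the authors' ADC: count recurring two-edge patterns
# in a growing relational graph as candidate concepts for ILP induction.
from collections import Counter
from itertools import combinations

def two_edge_patterns(edges):
    """Yield abstracted (rel1, rel2) patterns for every pair of edges
    that share a node, ignoring the concrete object identities."""
    for (a, r1, b), (c, r2, d) in combinations(edges, 2):
        if {a, b} & {c, d}:              # edges touch: a connected subgraph
            yield tuple(sorted((r1, r2)))

# Hypothetical observations gathered while a robot explores a room.
observations = [
    ("wall1", "perpendicular_to", "wall2"),
    ("wall2", "perpendicular_to", "wall3"),
    ("wall3", "parallel_to", "wall1"),
    ("leg1", "supports", "tabletop"),
    ("leg2", "supports", "tabletop"),
]

counts = Counter(two_edge_patterns(observations))
# Frequent patterns would seed predicate invention; here they are printed.
for pattern, n in counts.most_common():
    print(pattern, n)
```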

Survey of model-based reinforcement learning (and of reinforcement learning in general) and its application to improving learning time in robotics; plenty of references, but the explanations are neither abundant nor particularly clear

Athanasios S. Polydoros, Lazaros Nalpantidis, Survey of Model-Based Reinforcement Learning: Applications on Robotics, Journal of Intelligent & Robotic Systems, May 2017, Volume 86, Issue 2, pp 153–173, DOI: 10.1007/s10846-017-0468-y.

Reinforcement learning is an appealing approach for allowing robots to learn new tasks. Relevant literature reveals a plethora of methods, but at the same time makes clear the lack of implementations for dealing with real-life challenges. Current expectations raise the demand for adaptable robots. We argue that, by employing model-based reinforcement learning, the currently limited adaptability characteristics of robotic systems can be expanded. Also, model-based reinforcement learning exhibits advantages that make it more applicable to real-life use cases compared to model-free methods. Thus, in this survey, model-based methods that have been applied in robotics are covered. We categorize them based on the derivation of an optimal policy, the definition of the returns function, the type of the transition model and the learned task. Finally, we discuss the applicability of model-based reinforcement learning approaches in new applications, taking into consideration the state of the art in both algorithms and hardware.
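
For readers new to the topic, this is the loop the survey is about, in a minimal tabular sketch (toy environment and all parameters are hypothetical): fit a transition and reward model from experience, then plan on the learned model instead of learning values directly from raw interaction.

```python
# Minimal sketch of the model-based RL loop: estimate a model from counts,
# then plan (value iteration) on the learned model. Toy corridor domain.
import random
from collections import defaultdict

n_states, n_actions, gamma = 5, 2, 0.9
counts = defaultdict(lambda: defaultdict(int))   # (s, a) -> {s': visits}
rew_sum = defaultdict(float)                     # (s, a) -> summed reward

def true_step(s, a):
    """Hypothetical environment, unknown to the agent."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0)

# 1) Gather experience and fit the model from visit counts.
for _ in range(2000):
    s, a = random.randrange(n_states), random.randrange(n_actions)
    s2, r = true_step(s, a)
    counts[(s, a)][s2] += 1
    rew_sum[(s, a)] += r

# 2) Plan on the learned model with value iteration.
V = [0.0] * n_states
for _ in range(100):
    for s in range(n_states):
        q = []
        for a in range(n_actions):
            n = sum(counts[(s, a)].values())
            if n == 0:
                continue                         # unvisited pair: no estimate
            r = rew_sum[(s, a)] / n
            q.append(r + gamma * sum(c / n * V[s2]
                                     for s2, c in counts[(s, a)].items()))
        V[s] = max(q) if q else 0.0
print(V)
```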

Emergence of symbols in robotics as a “new” area of research in developmental robotics: a survey

Tadahiro Taniguchi, Takayuki Nagai, Tomoaki Nakamura, Naoto Iwahashi, Tetsuya Ogata, Hideki Asoh, Symbol Emergence in Robotics: A Survey, arXiv:1509.08973.

Humans can learn the use of language through physical interaction with their environment and semiotic communication with other people. It is very important to obtain a computational understanding of how humans can form a symbol system and obtain semiotic skills through their autonomous mental development. Recently, many studies have been conducted on the construction of robotic systems and machine-learning methods that can learn the use of language through embodied multimodal interaction with their environment and other systems. Understanding the dynamics of symbol systems is thus crucially important, both for understanding human social interactions and for developing robots that can smoothly communicate with human users in the long term. The embodied cognition and social interaction of participants gradually change a symbol system in a constructive manner. In this paper, we introduce a field of research called symbol emergence in robotics (SER). SER is a constructive approach towards an emergent symbol system. The emergent symbol system is socially self-organized through both semiotic communications and physical interactions with autonomous cognitive developmental agents, i.e., humans and developmental robots. Specifically, we describe some state-of-the-art research topics concerning SER, e.g., multimodal categorization, word discovery, and double articulation analysis, that enable a robot to obtain words and their embodied meanings from raw sensory–motor information, including visual information, haptic information, auditory information, and acoustic speech signals, in a totally unsupervised manner. Finally, we suggest future directions of research in SER.

Combination of several mobile robot localization methods to achieve high accuracy in industrial environments, with interesting figures for the localization accuracy currently achievable with standard solutions

Goran Vasiljević, Damjan Miklić, Ivica Draganjac, Zdenko Kovačić, Paolo Lista, High-accuracy vehicle localization for autonomous warehousing, Robotics and Computer-Integrated Manufacturing, Volume 42, December 2016, Pages 1-16, ISSN 0736-5845, DOI: 10.1016/j.rcim.2016.05.001.

The research presented in this paper aims to bridge the gap between the latest scientific advances in autonomous vehicle localization and the industrial state of the art in autonomous warehousing. Notwithstanding great scientific progress in the past decades, industrial autonomous warehousing systems still rely on external infrastructure for obtaining their precise location. This approach increases warehouse installation costs and decreases system reliability, as it is sensitive to measurement outliers and the external localization infrastructure can get dirty or damaged. Several approaches, well studied in the scientific literature, are capable of determining vehicle position based only on information provided by on-board sensors, most commonly wheel encoders and laser scanners. However, scientific results published to date either do not provide sufficient accuracy for industrial applications, or have not been extensively tested in realistic, industrial-like operating conditions. In this paper, we combine several well-established algorithms into a high-precision localization pipeline, capable of computing the pose of an autonomous forklift to sub-centimeter precision. The algorithms use only odometry information from wheel encoders and range readings from an on-board laser scanner. The effectiveness of the proposed solution is evaluated by an extensive experiment that lasted several days and was performed in a realistic industrial-like environment.
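
The paper's pipeline is much richer, but its basic shape, predicting the pose from wheel odometry and correcting it with a scan-matching fix, can be sketched with a per-axis Kalman-style update (all numbers below are hypothetical):

```python
# Minimal sketch, not the paper's pipeline: odometry predicts the pose,
# a scan-matching fix corrects it, weighted by the two variances.
import numpy as np

def fuse(pred, var_pred, meas, var_meas):
    """Scalar Kalman-style update applied independently to x, y, theta."""
    k = var_pred / (var_pred + var_meas)         # per-axis Kalman gain
    return pred + k * (meas - pred), (1 - k) * var_pred

pose = np.array([0.0, 0.0, 0.0])                 # x (m), y (m), theta (rad)
var = np.array([1e-4, 1e-4, 1e-4])

# Predict: apply a hypothetical encoder delta; odometry noise accumulates.
pose += np.array([0.10, 0.0, 0.01])
var += np.array([1e-5, 1e-5, 1e-6])

# Correct: a hypothetical scan-matching fix with much lower variance.
scan_fix = np.array([0.098, 0.001, 0.012])
pose, var = fuse(pose, var, scan_fix, np.array([1e-6, 1e-6, 1e-7]))
print(pose, var)
```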

Integration of the ICP algorithm with a Kalman filter to improve relative localization, with a good review of the state of the art in ICP algorithms

F. Aghili and C. Y. Su, “Robust Relative Navigation by Integration of ICP and Adaptive Kalman Filter Using Laser Scanner and IMU,” in IEEE/ASME Transactions on Mechatronics, vol. 21, no. 4, pp. 2015-2026, Aug. 2016. DOI: 10.1109/TMECH.2016.2547905.

This paper presents a robust six-degree-of-freedom relative navigation approach combining the iterative closest point (ICP) registration algorithm and a noise-adaptive Kalman filter in a closed-loop configuration, together with measurements from a laser scanner and an inertial measurement unit (IMU). In this approach, the fine-alignment phase of the registration is integrated with the filter innovation step for estimation correction, while the filter estimate propagation provides the coarse alignment needed to find the corresponding points at the beginning of the ICP iteration cycle. The convergence of the ICP point matching is monitored by a fault-detection logic, and the covariance associated with the ICP alignment error is estimated by a recursive algorithm. This ICP enhancement has proven to improve the robustness and accuracy of the pose-tracking performance and to automatically recover correct alignment whenever tracking is lost. The Kalman filter estimator is designed so as to identify the required parameters, such as IMU biases and the location of the spacecraft center of mass. The robustness and accuracy of the relative navigation algorithm are demonstrated through a hardware-in-the-loop simulation setting, in which actual vision data for the relative navigation are generated by a laser range finder scanning a spacecraft mockup attached to a robotic motion simulator.
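
A minimal 2D sketch of that closed loop (not the paper's 6-DOF implementation; the data are synthetic): the filter's predicted pose would supply ICP's coarse initial alignment, and the ICP result would be fed back as the measurement for the filter's innovation step.

```python
# Minimal 2D ICP sketch: the initial guess (R0, t0) plays the role of the
# filter's coarse alignment; the returned (R, t) would feed the innovation.
import numpy as np

def best_rigid_2d(P, Q):
    """Least-squares rotation+translation aligning points P onto Q (SVD)."""
    cp, cq = P.mean(0), Q.mean(0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:             # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cq - R @ cp

def icp(P, Q, R0, t0, iters=20):
    R, t = R0, t0
    for _ in range(iters):
        moved = P @ R.T + t
        # Nearest-neighbour correspondences (brute force for the sketch).
        idx = np.argmin(((moved[:, None] - Q[None]) ** 2).sum(-1), axis=1)
        R, t = best_rigid_2d(P, Q[idx])
    return R, t

# Hypothetical scans: Q is P rotated and translated by an unknown pose.
rng = np.random.default_rng(0)
P = rng.uniform(-1, 1, (100, 2))
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
Q = P @ R_true.T + np.array([0.5, -0.2])
R, t = icp(P, Q, np.eye(2), np.zeros(2))
print(R, t)                              # the pose fed back to the filter
```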

Massive parallelization of POMDP solving, with a very good state-of-the-art review

Taekhee Lee, Young J. Kim (2015), Massively parallel motion planning algorithms under uncertainty using POMDP, The International Journal of Robotics Research, Vol 35, Issue 8, pp. 928-942, DOI: 10.1177/0278364915594856.

We present new parallel algorithms that solve continuous-state partially observable Markov decision process (POMDP) problems using the GPU (gPOMDP) and a hybrid of the GPU and CPU (hPOMDP). We choose the Monte Carlo value iteration (MCVI) method as our base algorithm and parallelize this algorithm using the multi-level parallel formulation of MCVI. For each parallel level, we propose efficient algorithms to utilize the massive data parallelism available on modern GPUs. Our GPU-based method uses two workload distribution techniques, compute/data interleaving and workload balancing, in order to obtain the maximum parallel performance at the highest level. Here we also present a CPU–GPU hybrid method that takes advantage of both CPU and GPU parallelism in order to solve highly complex POMDP planning problems. The CPU is responsible for data preparation, while the GPU performs Monte Carlo simulations; these operations are performed concurrently using the compute/data overlap technique between the CPU and GPU. To the best of the authors’ knowledge, our algorithms are the first parallel algorithms that efficiently solve POMDPs in a massively parallel fashion utilizing the GPU or a hybrid of the GPU and CPU. Our algorithms outperform the existing CPU-based algorithm by a factor of 75–99 based on the chosen benchmark.
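
The CUDA kernels themselves are beyond a short sketch, but the structure being parallelized, many independent Monte Carlo rollouts whose returns back up a value estimate, can be illustrated with a process pool standing in for the GPU's data parallelism (toy problem, all names hypothetical):

```python
# Minimal sketch: independent Monte Carlo rollouts evaluated concurrently,
# the embarrassingly parallel core that MCVI-style methods exploit.
import random
from concurrent.futures import ProcessPoolExecutor

def rollout(seed, horizon=50, gamma=0.95):
    """One Monte Carlo rollout in a toy 1D corridor with noisy motion."""
    rng = random.Random(seed)
    s, ret, disc = 0, 0.0, 1.0
    for _ in range(horizon):
        a = rng.choice((-1, 1))                  # random policy, for the sketch
        s += a if rng.random() < 0.8 else -a     # noisy transition
        ret += disc * (1.0 if s >= 5 else 0.0)
        disc *= gamma
    return ret

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        returns = list(pool.map(rollout, range(10_000), chunksize=256))
    print(sum(returns) / len(returns))           # value estimate at the root
```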

Efficiently combining symbolic planning with geometric planning

Fabien Lagriffoul, Benjamin Andres (2016), Combining task and motion planning: A culprit detection problem, The International Journal of Robotics Research, Vol 35, Issue 8, pp. 890-927, DOI: 10.1177/0278364915619022.

Solving problems combining task and motion planning requires searching across a symbolic search space and a geometric search space. Because of the semantic gap between symbolic and geometric representations, symbolic sequences of actions are not guaranteed to be geometrically feasible. This compels us to search in the combined search space, in which frequent backtracks between symbolic and geometric levels make the search inefficient. We address this problem by guiding symbolic search with rich information extracted from the geometric level through culprit detection mechanisms.
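
A minimal sketch of that loop (all objects and constraints are hypothetical): the symbolic planner enumerates action orders, the geometric checker vetoes infeasible ones, and the culprit it reports, here an ordering constraint, prunes the symbolic search instead of forcing blind backtracking.

```python
# Minimal sketch of culprit-guided task-and-motion search.
from itertools import permutations

def geometric_check(order):
    """Stand-in for motion planning: the tray is unreachable until the
    plate stacked on it has been moved, a fact the symbolic level lacks."""
    if order.index("tray") < order.index("plate"):
        return ("plate", "tray")          # culprit: plate must precede tray
    return None

constraints = set()                        # culprits fed back symbolically
for order in permutations(["tray", "cup", "plate"]):
    if any(order.index(a) > order.index(b) for a, b in constraints):
        continue                           # pruned without a geometry call
    culprit = geometric_check(order)
    if culprit is None:
        print("feasible order:", order)
        break
    constraints.add(culprit)               # never propose this ordering again
```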

Interesting approach for communicating robots: through the passive recognition of others' patterns of motion

Barnali Das, Micael S. Couceiro, Patricia A. Vargas, MRoCS: A new multi-robot communication system based on passive action recognition, Robotics and Autonomous Systems, Volume 82, August 2016, Pages 46-60, ISSN 0921-8890, DOI: 10.1016/j.robot.2016.04.002.

Multi-robot search-and-rescue missions often face major challenges in adverse environments due to the limitations of traditional implicit and explicit communication. This paper proposes a novel multi-robot communication system (MRoCS), which uses a passive action recognition technique that overcomes the shortcomings of traditional models. The proposed MRoCS relies on individual motion, mimicking the waggle dance of honey bees and thus forming and recognising different patterns accordingly. The system was successfully designed and implemented in simulation and with real robots. Experimental results show that the pattern recognition process achieved high sensitivity with good precision in all cases for three different patterns, thus corroborating our hypothesis.
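
The idea in miniature (not the authors' MRoCS; the patterns and data are made up): one robot encodes a message as a motion pattern, and an observer classifies the observed heading sequence by nearest-template matching.

```python
# Minimal sketch of passive action recognition via template matching.
import numpy as np

templates = {                        # message -> idealized heading sequence
    "food_north": np.deg2rad([0, 45, 0, -45, 0, 45, 0, -45]),
    "food_east":  np.deg2rad([90, 135, 90, 45, 90, 135, 90, 45]),
    "danger":     np.deg2rad([0, 180, 0, 180, 0, 180, 0, 180]),
}

def classify(observed):
    """Pick the template with the smallest wrapped angular distance."""
    def dist(t):
        d = np.angle(np.exp(1j * (observed - t)))   # wrap to [-pi, pi]
        return float(np.sum(d ** 2))
    return min(templates, key=lambda k: dist(templates[k]))

# An observer tracks another robot's noisy headings and decodes the message.
noisy = templates["food_east"] + np.random.default_rng(1).normal(0, 0.2, 8)
print(classify(noisy))               # -> food_east
```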

A robot architecture composed of reinforcement learners for predicting and developing behaviors

Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White, and Doina Precup (2011), Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, Proc. of 10th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2011), Tumer, Yolum, Sonenberg and Stone (eds.), May 2-6, 2011, Taipei, Taiwan, pp. 761-768.

Maintaining accurate world knowledge in a complex and changing environment is a perennial problem for robots and other artificial intelligence systems. Our architecture for addressing this problem, called Horde, consists of a large number of independent reinforcement learning sub-agents, or demons. Each demon is responsible for answering a single predictive or goal-oriented question about the world, thereby contributing in a factored, modular way to the system’s overall knowledge. The questions are in the form of a value function, but each demon has its own policy, reward function, termination function, and terminal-reward function unrelated to those of the base problem. Learning proceeds in parallel by all demons simultaneously so as to extract the maximal training information from whatever actions are taken by the system as a whole. Gradient-based temporal-difference learning methods are used to learn efficiently and reliably with function approximation in this off-policy setting. Horde runs in constant time and memory per time step, and is thus suitable for learning online in real-time applications such as robotics. We present results using Horde on a multi-sensored mobile robot to successfully learn goal-oriented behaviors and long-term predictions from off-policy experience. Horde is a significant incremental step towards a real-time architecture for efficient learning of general knowledge from unsupervised sensorimotor interaction.
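
Horde's structure is easy to convey in miniature (this sketch uses plain TD where Horde uses gradient-TD methods for off-policy stability; sensors and features are hypothetical): many independent demons, each a value function with its own cumulant and discount, all updated from the same stream of experience.

```python
# Minimal sketch of Horde-style demons sharing one experience stream.
import random

class Demon:
    """One demon: a value function with its own cumulant ("reward") and
    discount, learned here with plain linear TD for brevity."""
    def __init__(self, cumulant, gamma, n_features, alpha=0.1):
        self.cumulant, self.gamma, self.alpha = cumulant, gamma, alpha
        self.w = [0.0] * n_features

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, x_next, obs):
        z = float(self.cumulant(obs))                # demon-specific signal
        delta = z + self.gamma * self.predict(x_next) - self.predict(x)
        self.w = [wi + self.alpha * delta * xi for wi, xi in zip(self.w, x)]

# Two hypothetical demons: one predicts the immediate bump sensor, the
# other a longer-term light reading.
demons = [Demon(lambda o: o["bump"], 0.0, 3),
          Demon(lambda o: o["light"], 0.9, 3)]

x = [1.0, 0.0, 0.0]
for _ in range(1000):
    obs = {"bump": random.random() < 0.1, "light": random.uniform(0, 1)}
    x_next = [1.0, random.random(), random.random()]
    for d in demons:                                 # all demons learn from
        d.update(x, x_next, obs)                     # the same transition
    x = x_next
print([round(d.predict([1.0, 0.5, 0.5]), 3) for d in demons])
```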

“Nexting” (predicting events that occur next, possibly at different time scales) implemented in a robot through temporal-difference learning and a large number of parallel learners

Joseph Modayil, Adam White, Richard S. Sutton (2011), Multi-timescale Nexting in a Reinforcement Learning Robot, arXiv:1112.1133 [cs.LG] (this version to appear in the Proceedings of the Conference on the Simulation of Adaptive Behavior, 2012).

The term “nexting” has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense. The ability to “next” constitutes a basic kind of awareness and knowledge of one’s environment. In this paper we present results with a robot that learns to next in real time, predicting thousands of features of the world’s state, including all sensory inputs, at timescales from 0.1 to 8 seconds. This was achieved by treating each state feature as a reward-like target and applying temporal-difference methods to learn a corresponding value function with a discount rate corresponding to the timescale. We show that two thousand predictions, each dependent on six thousand state features, can be learned and updated online at better than 10 Hz on a laptop computer, using the standard TD(λ) algorithm with linear function approximation. We show that this approach is efficient enough to be practical, with most of the learning complete within 30 minutes. We also show that a single tile-coded feature representation suffices to accurately predict many different signals at a significant range of timescales. Finally, we show that the accuracy of our learned predictions compares favorably with the optimal off-line solution.
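
A minimal sketch of nexting (far from the paper's 2000 predictions over 6000 tile-coded features; the signal and features here are synthetic): treat one sensor reading as its own reward and learn linear TD(λ) predictions of it at several timescales by varying the discount, using the standard relation that a timescale of roughly T steps corresponds to gamma = 1 - 1/T.

```python
# Minimal nexting sketch: one signal predicted at three timescales with
# linear TD(lambda), one weight vector and eligibility trace per scale.
import math

n_features, lam, alpha = 8, 0.9, 0.05
gammas = [1 - 1 / t for t in (1, 10, 80)]      # ~0.1 s, 1 s, 8 s at 10 Hz
w = [[0.0] * n_features for _ in gammas]       # one weight vector per scale
e = [[0.0] * n_features for _ in gammas]       # one eligibility trace each

def features(t):
    """Stand-in for the paper's tile coding: a few phase-shifted bumps."""
    return [max(0.0, math.sin(0.05 * t + k)) for k in range(n_features)]

x = features(0)
for t in range(1, 5000):
    signal = math.sin(0.05 * t)                # the sensor being "nexted"
    x_next = features(t)
    for i, g in enumerate(gammas):
        delta = (signal + g * sum(wi * xi for wi, xi in zip(w[i], x_next))
                 - sum(wi * xi for wi, xi in zip(w[i], x)))
        e[i] = [g * lam * ei + xi for ei, xi in zip(e[i], x)]   # traces
        w[i] = [wi + alpha * delta * ei for wi, ei in zip(w[i], e[i])]
    x = x_next
# Predictions at the three timescales for the current feature vector.
print([sum(wi * xi for wi, xi in zip(wv, x)) for wv in w])
```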