Monthly Archives: December 2018

You are browsing the site archives by month.

A nice review of visual SLAM with deep learning, and its evolution from non-learning visual SLAM

Ruihao Li, Sen Wang, DongBing Gu, Ongoing Evolution of Visual SLAM from Geometry to Deep Learning: Challenges and Opportunities, Cognitive Computation, December 2018, Volume 10, Issue 6, pp 875–889, DOI: 10.1007/s12559-018-9591-8.

Visual simultaneous localization and mapping (SLAM) has been investigated in the robotics community for decades. Significant progress and achievements on visual SLAM have been made, with geometric model-based techniques becoming increasingly mature and accurate. However, they tend to be fragile under challenging environments. Recently, there is a trend to develop data-driven approaches, e.g., deep learning, for visual SLAM problems with more robust performance. This paper aims to witness the ongoing evolution of visual SLAM techniques from geometric model-based to data-driven approaches by providing a comprehensive technical review. Our contribution is not only just a compilation of state-of-the-art end-to-end deep learning SLAM work, but also an insight into the underlying mechanism of deep learning SLAM. For such a purpose, we provide a concise overview of geometric model-based approaches first. Next, we identify visual depth estimation using deep learning is a starting point of the evolution. It is from depth estimation that ego-motion or pose estimation techniques using deep learning flourish rapidly. In addition, we strive to link semantic segmentation using deep learning with emergent semantic SLAM techniques to shed light on simultaneous estimation of ego-motion and high-level understanding. Finally, we visualize some further opportunities in this research direction.

A developmental architecture for sensory-motor skills based on predictors, and a nice state-of-the-art in cognitive architectures for sensory-motor skill learning

E. Wieser and G. Cheng, A Self-Verifying Cognitive Architecture for Robust Bootstrapping of Sensory-Motor Skills via Multipurpose Predictors, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1081-1095, DOI: 10.1109/TCDS.2018.2871857.

The autonomous acquisition of sensory-motor skills along multiple developmental stages is one of the current challenges in robotics. To this end, we propose a new developmental cognitive architecture that combines multipurpose predictors and principles of self-verification for the robust bootstrapping of sensory-motor skills. Our architecture operates with loops formed by both mental simulation of sensory-motor sequences and their subsequent physical trial on a robot. During these loops, verification algorithms monitor the predicted and the physically observed sensory-motor data. Multiple types of predictors are acquired through several developmental stages. As a result, the architecture can select and plan actions, adapt to various robot platforms by adjusting proprioceptive feedback, predict the risk of self-collision, learn from a previous interaction stage by validating and extracting sensory-motor data for training the predictor of a subsequent stage, and finally acquire an internal representation for evaluating the performance of its predictors. These cognitive capabilities in turn realize the bootstrapping of early hand-eye coordination and its improvement. We validate the cognitive capabilities experimentally and, in particular, show an improvement of reaching as an example skill.

Weighting relations between concepts to form (hierarchically) further concepts

T. Nakamura and T. Nagai, Ensemble-of-Concept Models for Unsupervised Formation of Multiple Categories, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1043-1057, DOI: 10.1109/TCDS.2017.2745502.

Recent studies have shown that robots can form concepts and understand the meanings of words through inference. The key idea underlying these studies is the “multimodal categorization” of a robot’s experiences. Despite the success in the formation of concepts by robots, a major drawback of previous studies stems from the fact that they have been mainly focused on object concepts. Obviously, human concepts are limited not only to object concepts but also to other kinds such as those connected to the tactile sense and color. In this paper, we propose a novel model called the ensemble-of-concept models (EoCMs) to form various kinds of concepts. In EoCMs, we introduce weights that represent the strength connecting modalities and concepts. By changing these weights, many concepts that are connected to particular modalities can be formed; however, meaningless concepts for humans are included in these concepts. To communicate with humans, robots are required to form meaningful concepts for us. Therefore, we utilize utterances taught by human users as the robot observes objects. The robot connects words included in the teaching utterances with formed concepts and selects meaningful concepts to communicate with users. The experimental results show that the robot can form not only object concepts but also others such as color-related concepts and haptic concepts. Furthermore, using word2vec, we compare the meanings of the words acquired by the robot in connecting them to the concepts formed.

A definition of emergence and its application to emergence in robots

R. L. Sturdivant and E. K. P. Chong, The Necessary and Sufficient Conditions for Emergence in Systems Applied to Symbol Emergence in Robots, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1035-1042, DOI: 10.1109/TCDS.2017.2731361.

A conceptual model for emergence with downward causation is developed. In addition, the necessary and sufficient conditions are identified for a phenomenon to be considered emergent in a complex system. It is then applied to symbol emergence in robots. This paper is motivated by the usefulness of emergence to explain a wide variety of phenomena in systems, and cognition in natural and artificial creatures. Downward causation is shown to be a critical requirement for potentially emergent phenomena to be considered actually emergent. Models of emergence with and without downward causation are described and how weak emergence can include downward causation. A process flow is developed for distinguishing emergence from nonemergence based upon the application of reductionism and detection of downward causation. Examples are shown for applying the necessary and sufficient conditions to filter out actually emergent phenomena from nonemergent ones. Finally, this approach for detecting emergence is applied to complex projects and symbol emergence in robots.

A cognitive architecture for self-development in robots that interact with humans, with a nice state-of-the-art of robot cognitive architectures

C. Moulin-Frier et al., DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1005-1022, DOI: 10.1109/TCDS.2017.2754143.

This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both human and robot. The framework, based on a biologically grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

A ROS module that improves real-time aspects of network communication among distributed ROS machines, and a nice analysis of wireless network characteristics and limitations

Danilo Tardioli, Ramviyas Parasuraman, Petter Ögren, Pound: A multi-master ROS node for reducing delay and jitter in wireless multi-robot networks, Robotics and Autonomous Systems, Volume 111, 2019, Pages 73-87, DOI: 10.1016/j.robot.2018.10.009.

The Robot Operating System (ROS) is a popular and widely used software framework for building robotics systems. With the growth of its popularity, it has started to be used in multi-robot systems as well. However, the TCP connections that the platform relies on for connecting the so-called ROS nodes presents several issues regarding limited-bandwidth, delays, and jitter, when used in wireless multi-hop networks. In this paper, we present a thorough analysis of the problem and propose a new ROS node called Pound to improve the wireless communication performance by reducing delay and jitter in data exchanges, especially in multi-hop networks. Pound allows the use of multiple ROS masters (roscores), features data compression, and importantly, introduces a priority scheme that allows favoring more important flows over less important ones. We compare Pound to the state-of-the-art solutions through extensive experiments and show that it performs equally well, or better in all the test cases, including a control-over-network example.

Interesting use of RL (deep-RL) for detection – reformulation of detection as a sequential decision process

F. Ghesu et al., Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 176-189, DOI: 10.1109/TPAMI.2017.2782687.

Robust and fast detection of anatomical structures is a prerequisite for both diagnostic and interventional medical image analysis. Current solutions for anatomy detection are typically based on machine learning techniques that exploit large annotated image databases in order to learn the appearance of the captured anatomy. These solutions are subject to several limitations, including the use of suboptimal feature engineering techniques and most importantly the use of computationally suboptimal search-schemes for anatomy detection. To address these issues, we propose a method that follows a new paradigm by reformulating the detection problem as a behavior learning task for an artificial agent. We couple the modeling of the anatomy appearance and the object search in a unified behavioral framework, using the capabilities of deep reinforcement learning and multi-scale image analysis. In other words, an artificial agent is trained not only to distinguish the target anatomical object from the rest of the body but also how to find the object by learning and following an optimal navigation path to the target object in the imaged volumetric space. We evaluated our approach on 1487 3D-CT volumes from 532 patients, totaling over 500,000 image slices and show that it significantly outperforms state-of-the-art solutions on detecting several anatomical structures with no failed cases from a clinical acceptance perspective, while also achieving a 20-30 percent higher detection accuracy. Most importantly, we improve the detection-speed of the reference methods by 2-3 orders of magnitude, achieving unmatched real-time performance on large 3D-CT scans.