Interesting mathematical study of the properties of graphs for graph-based SLAM and other graph-based estimation problems

Khosoussi, K., Giamou, M., Sukhatme, G. S., Huang, S., Dissanayake, G., & How, J. P., Reliable Graphs for SLAM, The International Journal of Robotics Research, 2019, DOI: 10.1177/0278364918823086.

Estimation-over-graphs (EoG) is a class of estimation problems that admit a natural graphical representation. Several key problems in robotics and sensor networks, including sensor network localization, synchronization over a group, and simultaneous localization and mapping (SLAM) fall into this category. We pursue two main goals in this work. First, we aim to characterize the impact of the graphical structure of SLAM and related problems on estimation reliability. We draw connections between several notions of graph connectivity and various properties of the underlying estimation problem. In particular, we establish results on the impact of the weighted number of spanning trees on the D-optimality criterion in 2D SLAM. These results enable agents to evaluate estimation reliability based only on the graphical representation of the EoG problem. We then use our findings and study the problem of designing sparse SLAM problems that lead to reliable maximum likelihood estimates through the synthesis of sparse graphs with the maximum weighted tree connectivity. Characterizing graphs with the maximum number of spanning trees is an open problem in general. To tackle this problem, we establish several new theoretical results, including the monotone log-submodularity of the weighted number of spanning trees. We exploit these structures and design a complementary greedy–convex pair of efficient approximation algorithms with provable guarantees. The proposed synthesis framework is applied to various forms of the measurement selection problem in resource-constrained SLAM. Our algorithms and theoretical findings are validated using random graphs, existing and new synthetic SLAM benchmarks, and publicly available real pose-graph SLAM datasets.
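
The "weighted number of spanning trees" in this abstract can be computed with Kirchhoff's matrix-tree theorem: it equals the determinant of the weighted graph Laplacian after deleting one row and the matching column. Below is a minimal sketch of that computation, followed by a plain greedy rule for choosing extra edges. The function names (`weighted_tree_count`, `greedy_add_edges`) and the edge format `(u, v, w)` are assumptions made for illustration; the greedy rule is only in the spirit of the approach described and is not the authors' algorithm.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination on a copy of the matrix."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def weighted_tree_count(n, edges):
    """Weighted number of spanning trees of a graph on vertices 0..n-1.

    edges holds (u, v, w) triples with weight w > 0. Each spanning tree
    contributes the product of its edge weights (matrix-tree theorem).
    """
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        w = Fraction(w)
        lap[u][u] += w
        lap[v][v] += w
        lap[u][v] -= w
        lap[v][u] -= w
    # Remove the last row and column to obtain the reduced Laplacian.
    reduced = [row[:-1] for row in lap[:-1]]
    return determinant(reduced)


def greedy_add_edges(n, base_edges, candidates, k):
    """Pick up to k candidate edges, each time keeping the one that
    yields the largest weighted tree count (hypothetical helper)."""
    chosen = list(base_edges)
    remaining = list(candidates)
    picked = []
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda e: weighted_tree_count(n, chosen + [e]))
        remaining.remove(best)
        chosen.append(best)
        picked.append(best)
    return picked


if __name__ == "__main__":
    path = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
    print(weighted_tree_count(4, path))
    print(greedy_add_edges(4, path, [(0, 2, 1), (0, 3, 1)], k=1))
```

On a 4-vertex path, closing the loop with edge (0, 3) produces a 4-cycle with four spanning trees, whereas edge (0, 2) produces only three, so the greedy rule prefers the longer loop closure. Exact rational arithmetic is used here to keep the small example free of rounding issues; a real SLAM-sized graph would call for a sparse log-determinant instead.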

SLAM based on submap joining that achieves linear cost through a novel choice of the reference frame of each submap, and an interesting related-work discussion on map joining, i.e., treating submaps as observations

Liang Zhao, Shoudong Huang, Gamini Dissanayake, Linear SLAM: Linearising the SLAM problems using submap joining, Automatica, Volume 100, 2019, Pages 231-246, DOI: 10.1016/j.automatica.2018.10.037.

The main contribution of this paper is a new submap joining based approach for solving large-scale Simultaneous Localization and Mapping (SLAM) problems. Each local submap is independently built using the local information through solving a small-scale SLAM; the joining of submaps mainly involves solving linear least squares and performing nonlinear coordinate transformations. Through approximating the local submap information as the state estimate and its corresponding information matrix, judiciously selecting the submap coordinate frames, and approximating the joining of a large number of submaps by joining only two maps at a time, either sequentially or in a more efficient Divide and Conquer manner, the nonlinear optimization process involved in most of the existing submap joining approaches is avoided. Thus the proposed submap joining algorithm does not require an initial guess or iterations since linear least squares problems have closed-form solutions. The proposed Linear SLAM technique is applicable to feature-based SLAM, pose graph SLAM and D-SLAM, in both two and three dimensions, and does not require any assumption on the character of the covariance matrices. Simulations and experiments are performed to evaluate the proposed Linear SLAM algorithm. Results using publicly available datasets in 2D and 3D show that Linear SLAM produces results that are very close to the best solutions that can be obtained using a full nonlinear optimization algorithm started from an accurate initial guess. The C/C++ and MATLAB source code of Linear SLAM is available on OpenSLAM.
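
To see why a linear least-squares join needs neither an initial guess nor iterations, consider the simplest case: two estimates x1 and x2 of the same state vector, each summarized by an information matrix I1 and I2. The least-squares fusion has the closed form x = (I1 + I2)^-1 (I1 x1 + I2 x2), with information I1 + I2. The sketch below implements only that step and assumes both estimates are already expressed in the same coordinate frame over the same variables; the paper's method also involves coordinate transformations and submaps with different states, which are not shown. The helper names `solve` and `fuse` are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting (a is a list of rows, b a list)."""
    n = len(b)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("information matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fuse(x1, info1, x2, info2):
    """Fuse two estimates of the same state given in information form.

    Returns the fused state and the fused information matrix. One linear
    solve is enough: no starting point and no iterations are needed.
    """
    n = len(x1)
    info = [[info1[i][j] + info2[i][j] for j in range(n)] for i in range(n)]
    rhs = [
        sum(info1[i][j] * x1[j] + info2[i][j] * x2[j] for j in range(n))
        for i in range(n)
    ]
    return solve(info, rhs), info


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    confident = [[3.0, 0.0], [0.0, 3.0]]
    state, info = fuse([0.0, 0.0], identity, [2.0, 4.0], confident)
    print(state, info)
```

The fused state is pulled toward the estimate with the larger information matrix, which is the expected behaviour of a weighted least-squares average.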

Detection of qualitative behaviours in signals

Ying Tang, Alessio Franci, Romain Postoyan, On-line detection of qualitative dynamical changes in nonlinear systems: The resting-oscillation case, Automatica, Volume 100, 2019, Pages 17-28, DOI: 10.1016/j.automatica.2018.10.058.

Motivated by neuroscience applications, we introduce the concept of qualitative detection, that is, the problem of determining on-line the current qualitative dynamical behavior (e.g., resting, oscillating, bursting, spiking etc.) of a nonlinear system. The approach is intended for systems characterized by i) large parameter variability and redundancy, ii) a small number of possible robust, qualitatively different dynamical behaviors and, iii) the presence of sharply different characteristic timescales. These properties are omnipresent in neurosciences and hamper quantitative modeling and fitting of experimental data. As a result, novel control theoretical strategies are needed to face neuroscience challenges like on-line epileptic seizure detection. The proposed approach aims at detecting the current dynamical behavior of the system and whether a qualitative change is likely to occur without quantitatively fitting any model or asymptotically estimating any parameter. We talk of qualitative detection. We rely on the qualitative properties of the system dynamics, extracted via singularity and singular perturbation theories, to design low dimensional qualitative detectors. We introduce this concept on a general class of singularly perturbed systems and then solve the problem for an analytically tractable class of two-dimensional systems with a single unknown sigmoidal nonlinearity and two sharply separated timescales. Numerical results are provided to show the performance of the designed qualitative detector.

A nice review of visual SLAM with deep learning, and its evolution from non-learning visual SLAM

Ruihao Li, Sen Wang, DongBing Gu, Ongoing Evolution of Visual SLAM from Geometry to Deep Learning: Challenges and Opportunities, Cognitive Computation, December 2018, Volume 10, Issue 6, pp 875–889, DOI: 10.1007/s12559-018-9591-8.

Visual simultaneous localization and mapping (SLAM) has been investigated in the robotics community for decades. Significant progress and achievements on visual SLAM have been made, with geometric model-based techniques becoming increasingly mature and accurate. However, they tend to be fragile under challenging environments. Recently, there is a trend to develop data-driven approaches, e.g., deep learning, for visual SLAM problems with more robust performance. This paper aims to witness the ongoing evolution of visual SLAM techniques from geometric model-based to data-driven approaches by providing a comprehensive technical review. Our contribution is not just a compilation of state-of-the-art end-to-end deep learning SLAM work, but also an insight into the underlying mechanism of deep learning SLAM. For such a purpose, we provide a concise overview of geometric model-based approaches first. Next, we identify visual depth estimation using deep learning as a starting point of the evolution. It is from depth estimation that ego-motion or pose estimation techniques using deep learning flourish rapidly. In addition, we strive to link semantic segmentation using deep learning with emergent semantic SLAM techniques to shed light on simultaneous estimation of ego-motion and high-level understanding. Finally, we visualize some further opportunities in this research direction.

A developmental architecture for sensory-motor skills based on predictors, and a nice survey of the state of the art in cognitive architectures for sensory-motor skill learning

E. Wieser and G. Cheng, A Self-Verifying Cognitive Architecture for Robust Bootstrapping of Sensory-Motor Skills via Multipurpose Predictors, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1081-1095, DOI: 10.1109/TCDS.2018.2871857.

The autonomous acquisition of sensory-motor skills along multiple developmental stages is one of the current challenges in robotics. To this end, we propose a new developmental cognitive architecture that combines multipurpose predictors and principles of self-verification for the robust bootstrapping of sensory-motor skills. Our architecture operates with loops formed by both mental simulation of sensory-motor sequences and their subsequent physical trial on a robot. During these loops, verification algorithms monitor the predicted and the physically observed sensory-motor data. Multiple types of predictors are acquired through several developmental stages. As a result, the architecture can select and plan actions, adapt to various robot platforms by adjusting proprioceptive feedback, predict the risk of self-collision, learn from a previous interaction stage by validating and extracting sensory-motor data for training the predictor of a subsequent stage, and finally acquire an internal representation for evaluating the performance of its predictors. These cognitive capabilities in turn realize the bootstrapping of early hand-eye coordination and its improvement. We validate the cognitive capabilities experimentally and, in particular, show an improvement of reaching as an example skill.

Weighting relations between concepts to form (hierarchically) further concepts

T. Nakamura and T. Nagai, Ensemble-of-Concept Models for Unsupervised Formation of Multiple Categories, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1043-1057, DOI: 10.1109/TCDS.2017.2745502.

Recent studies have shown that robots can form concepts and understand the meanings of words through inference. The key idea underlying these studies is the “multimodal categorization” of a robot’s experiences. Despite the success in the formation of concepts by robots, a major drawback of previous studies stems from the fact that they have been mainly focused on object concepts. Obviously, human concepts are limited not only to object concepts but also to other kinds such as those connected to the tactile sense and color. In this paper, we propose a novel model called the ensemble-of-concept models (EoCMs) to form various kinds of concepts. In EoCMs, we introduce weights that represent the strength connecting modalities and concepts. By changing these weights, many concepts that are connected to particular modalities can be formed; however, meaningless concepts for humans are included in these concepts. To communicate with humans, robots are required to form meaningful concepts for us. Therefore, we utilize utterances taught by human users as the robot observes objects. The robot connects words included in the teaching utterances with formed concepts and selects meaningful concepts to communicate with users. The experimental results show that the robot can form not only object concepts but also others such as color-related concepts and haptic concepts. Furthermore, using word2vec, we compare the meanings of the words acquired by the robot in connecting them to the concepts formed.

A definition of emergence and its application to emergence in robots

R. L. Sturdivant and E. K. P. Chong, The Necessary and Sufficient Conditions for Emergence in Systems Applied to Symbol Emergence in Robots, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1035-1042, DOI: 10.1109/TCDS.2017.2731361.

A conceptual model for emergence with downward causation is developed. In addition, the necessary and sufficient conditions are identified for a phenomenon to be considered emergent in a complex system. It is then applied to symbol emergence in robots. This paper is motivated by the usefulness of emergence to explain a wide variety of phenomena in systems, and cognition in natural and artificial creatures. Downward causation is shown to be a critical requirement for potentially emergent phenomena to be considered actually emergent. Models of emergence with and without downward causation are described and how weak emergence can include downward causation. A process flow is developed for distinguishing emergence from nonemergence based upon the application of reductionism and detection of downward causation. Examples are shown for applying the necessary and sufficient conditions to filter out actually emergent phenomena from nonemergent ones. Finally, this approach for detecting emergence is applied to complex projects and symbol emergence in robots.

A cognitive architecture for self-development in robots that interact with humans, with a nice survey of the state of the art in robot cognitive architectures

C. Moulin-Frier et al., DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1005-1022, DOI: 10.1109/TCDS.2017.2754143.

This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both human and robot. The framework, based on a biologically grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

A ROS module that improves real-time aspects of network communication among distributed ROS machines, and a nice analysis of wireless network characteristics and limitations

Danilo Tardioli, Ramviyas Parasuraman, Petter Ögren, Pound: A multi-master ROS node for reducing delay and jitter in wireless multi-robot networks, Robotics and Autonomous Systems, Volume 111, 2019, Pages 73-87, DOI: 10.1016/j.robot.2018.10.009.

The Robot Operating System (ROS) is a popular and widely used software framework for building robotics systems. With the growth of its popularity, it has started to be used in multi-robot systems as well. However, the TCP connections that the platform relies on for connecting the so-called ROS nodes present several issues regarding limited bandwidth, delays, and jitter when used in wireless multi-hop networks. In this paper, we present a thorough analysis of the problem and propose a new ROS node called Pound to improve the wireless communication performance by reducing delay and jitter in data exchanges, especially in multi-hop networks. Pound allows the use of multiple ROS masters (roscores), features data compression, and importantly, introduces a priority scheme that allows favoring more important flows over less important ones. We compare Pound to the state-of-the-art solutions through extensive experiments and show that it performs equally well, or better in all the test cases, including a control-over-network example.
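
The idea of a priority scheme that favors important flows can be illustrated with a small outgoing queue: messages waiting for the radio link are sent in order of flow priority, and in arrival order within the same priority. This is a generic sketch using Python's `heapq`, not Pound's actual implementation; the class name `PriorityOutbox`, the topic names, and the priority values are all made up for the example.

```python
import heapq
import itertools


class PriorityOutbox:
    """Outgoing message queue that serves high-priority flows first."""

    def __init__(self):
        self._heap = []
        # Monotonic counter keeps arrival order among equal priorities
        # and guarantees that payloads themselves are never compared.
        self._arrival = itertools.count()

    def push(self, priority, topic, payload):
        """Queue a message; a larger priority value is sent earlier."""
        entry = (-priority, next(self._arrival), topic, payload)
        heapq.heappush(self._heap, entry)

    def pop(self):
        """Return the (topic, payload) pair that should be sent next."""
        _, _, topic, payload = heapq.heappop(self._heap)
        return topic, payload

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    outbox = PriorityOutbox()
    outbox.push(1, "/camera/image", b"frame-0")
    outbox.push(10, "/cmd_vel", b"twist-0")
    outbox.push(5, "/scan", b"scan-0")
    outbox.push(10, "/cmd_vel", b"twist-1")
    while len(outbox):
        print(outbox.pop())
```

With this ordering, a burst of bulky camera frames cannot delay velocity commands, which is the kind of behaviour a control-over-network setup depends on.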

Interesting use of RL (deep-RL) for detection – reformulation of detection as a sequential decision process

F. Ghesu et al., Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 1, pp. 176-189, DOI: 10.1109/TPAMI.2017.2782687.

Robust and fast detection of anatomical structures is a prerequisite for both diagnostic and interventional medical image analysis. Current solutions for anatomy detection are typically based on machine learning techniques that exploit large annotated image databases in order to learn the appearance of the captured anatomy. These solutions are subject to several limitations, including the use of suboptimal feature engineering techniques and most importantly the use of computationally suboptimal search-schemes for anatomy detection. To address these issues, we propose a method that follows a new paradigm by reformulating the detection problem as a behavior learning task for an artificial agent. We couple the modeling of the anatomy appearance and the object search in a unified behavioral framework, using the capabilities of deep reinforcement learning and multi-scale image analysis. In other words, an artificial agent is trained not only to distinguish the target anatomical object from the rest of the body but also how to find the object by learning and following an optimal navigation path to the target object in the imaged volumetric space. We evaluated our approach on 1487 3D-CT volumes from 532 patients, totaling over 500,000 image slices and show that it significantly outperforms state-of-the-art solutions on detecting several anatomical structures with no failed cases from a clinical acceptance perspective, while also achieving a 20-30 percent higher detection accuracy. Most importantly, we improve the detection-speed of the reference methods by 2-3 orders of magnitude, achieving unmatched real-time performance on large 3D-CT scans.