Category Archives: Robotics

Selecting the most useful visual cues over a near-future horizon to reduce the computational cost of localization under limited computational resources

L. Carlone and S. Karaman, Attention and Anticipation in Fast Visual-Inertial Navigation, IEEE Transactions on Robotics, vol. 35, no. 1, pp. 1-20, Feb. 2019 DOI: 10.1109/TRO.2018.2872402.

We study a visual-inertial navigation (VIN) problem in which a robot needs to estimate its state using an on-board camera and an inertial sensor, without any prior knowledge of the external environment. We consider the case in which the robot can allocate limited resources to VIN, due to tight computational constraints. Therefore, we answer the following question: under limited resources, what are the most relevant visual cues to maximize the performance of VIN? Our approach has four key ingredients. First, it is task-driven, in that the selection of the visual cues is guided by a metric quantifying the VIN performance. Second, it exploits the notion of anticipation, since it uses a simplified model for forward-simulation of robot dynamics, predicting the utility of a set of visual cues over a future time horizon. Third, it is efficient and easy to implement, since it leads to a greedy algorithm for the selection of the most relevant visual cues. Fourth, it provides formal performance guarantees: we leverage submodularity to prove that the greedy selection cannot be far from the optimal (combinatorial) selection. Simulations and real experiments on agile drones show that our approach ensures state-of-the-art VIN performance while maintaining a lean processing time. In the easy scenarios, our approach outperforms appearance-based feature selection in terms of localization errors. In the most challenging scenarios, it enables accurate VIN while appearance-based feature selection fails to track the robot’s motion during aggressive maneuvers.
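The greedy selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each candidate visual cue contributes an additive information matrix and that VIN performance is scored by the log-determinant of the accumulated information; the function names (`logdet`, `greedy_select`) and the toy matrices are hypothetical and are not taken from the paper, which additionally predicts these quantities over a future horizon.

```python
import math


def logdet(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
                total += 2.0 * math.log(low[i][i])
            else:
                low[i][j] = s / low[j][j]
    return total


def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def greedy_select(prior_info, feature_infos, budget):
    """Pick up to `budget` features, each time taking the largest log-det gain."""
    selected = []
    info = prior_info
    remaining = list(range(len(feature_infos)))
    for _ in range(min(budget, len(remaining))):
        best = max(remaining, key=lambda i: logdet(add(info, feature_infos[i])))
        selected.append(best)
        info = add(info, feature_infos[best])
        remaining.remove(best)
    return selected, logdet(info)


# Toy 2D state (assumed data): two cues inform x, one informs y.
prior = [[1.0, 0.0], [0.0, 1.0]]
cues = [
    [[4.0, 0.0], [0.0, 0.0]],  # strong information about x
    [[3.0, 0.0], [0.0, 0.0]],  # redundant information about x
    [[0.0, 0.0], [0.0, 2.0]],  # information about y
]
chosen, score = greedy_select(prior, cues, budget=2)
print(chosen, score)
```

In this toy case the second pick is the cue informing y rather than the redundant x cue, which is the diminishing-returns behaviour that submodularity formalizes.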

An application of MDPs to UAV collision-free navigation with an interesting taxonomy of the state-of-the-art

Xiang Yu, Xiaobin Zhou, Youmin Zhang, Collision-Free Trajectory Generation and Tracking for UAVs Using Markov Decision Process in a Cluttered Environment, Journal of Intelligent & Robotic Systems, 2019, 93:17–32 DOI: 10.1007/s10846-018-0802-z.

A collision-free trajectory generation and tracking method capable of re-planning unmanned aerial vehicle (UAV) trajectories can increase flight safety and decrease the possibility of mission failures. In this paper, a Markov decision process (MDP) based algorithm combined with a backtracking method is presented to create a safe trajectory in the case of hostile environments. Subsequently, a differential flatness method is adopted to smooth the profile of the rerouted trajectory for satisfying the UAV physical constraints. Lastly, a flight controller based on passivity-based control (PBC) is designed to maintain the UAV’s stability and trajectory tracking performance. Simulation results demonstrate that the UAV with the proposed strategy is capable of avoiding obstacles in a hostile environment.
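To make the MDP part concrete, here is a minimal sketch of value iteration on a small grid with obstacles. It is only an illustration under assumed settings (deterministic moves, a step cost of -1, a goal reward of 100, a discount of 0.95); the grid, the rewards, and the names `plan` and `follow` are hypothetical, and the paper's backtracking, differential-flatness smoothing, and PBC controller are not modelled.

```python
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, move, free):
    """Apply a move; moves into obstacles or off the grid leave the state unchanged."""
    nxt = (state[0] + move[0], state[1] + move[1])
    return nxt if nxt in free else state


def plan(rows, cols, obstacles, goal, gamma=0.95, tol=1e-9):
    """Value iteration on a deterministic grid MDP; returns a greedy policy."""
    free = {(r, c) for r in range(rows) for c in range(cols)} - set(obstacles)
    values = {s: 0.0 for s in free}

    def q(s, m):
        nxt = step(s, m, free)
        reward = 100.0 if nxt == goal else -1.0  # assumed reward model
        return reward + gamma * values[nxt]

    while True:
        delta = 0.0
        for s in free:
            if s == goal:
                continue  # the goal is terminal and keeps value 0
            best = max(q(s, m) for m in MOVES)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(MOVES, key=lambda m: q(s, m)) for s in free if s != goal}
    return policy, free


def follow(policy, free, start, goal, max_steps=100):
    """Roll out the greedy policy from `start` until the goal is reached."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        path.append(step(path[-1], policy[path[-1]], free))
    return path


obstacles = {(1, 1), (1, 2), (2, 1)}
policy, free = plan(4, 4, obstacles, goal=(3, 3))
print(follow(policy, free, start=(0, 0), goal=(3, 3)))
```

Re-planning after a new obstacle appears amounts, in this sketch, to calling `plan` again with the updated obstacle set.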

Interesting mathematical study of the properties of graphs for graph-based SLAM and other graph-based estimation problems

Khosoussi, K., Giamou, M., Sukhatme, G. S., Huang, S., Dissanayake, G., & How, J. P., Reliable Graphs for SLAM, The International Journal of Robotics Research, 2019, DOI: 10.1177/0278364918823086.

Estimation-over-graphs (EoG) is a class of estimation problems that admit a natural graphical representation. Several key problems in robotics and sensor networks, including sensor network localization, synchronization over a group, and simultaneous localization and mapping (SLAM) fall into this category. We pursue two main goals in this work. First, we aim to characterize the impact of the graphical structure of SLAM and related problems on estimation reliability. We draw connections between several notions of graph connectivity and various properties of the underlying estimation problem. In particular, we establish results on the impact of the weighted number of spanning trees on the D-optimality criterion in 2D SLAM. These results enable agents to evaluate estimation reliability based only on the graphical representation of the EoG problem. We then use our findings and study the problem of designing sparse SLAM problems that lead to reliable maximum likelihood estimates through the synthesis of sparse graphs with the maximum weighted tree connectivity. Characterizing graphs with the maximum number of spanning trees is an open problem in general. To tackle this problem, we establish several new theoretical results, including the monotone log-submodularity of the weighted number of spanning trees. We exploit these structures and design a complementary greedy–convex pair of efficient approximation algorithms with provable guarantees. The proposed synthesis framework is applied to various forms of the measurement selection problem in resource-constrained SLAM. Our algorithms and theoretical findings are validated using random graphs, existing and new synthetic SLAM benchmarks, and publicly available real pose-graph SLAM datasets.
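The weighted number of spanning trees can be computed with Kirchhoff's matrix-tree theorem as the determinant of the reduced weighted Laplacian. The sketch below uses that fact to greedily add candidate edges (for example, loop closures) to a connected base graph. It is a hedged illustration only: the function names, the toy graph, and the plain greedy loop are assumptions and do not reproduce the paper's greedy and convex algorithms.

```python
import math


def logdet(m):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("graph must be connected")
                low[i][i] = math.sqrt(s)
                total += 2.0 * math.log(low[i][i])
            else:
                low[i][j] = s / low[j][j]
    return total


def log_tree_count(n, edges):
    """Log of the weighted number of spanning trees (vertex 0 is the anchor)."""
    lap = [[0.0] * (n - 1) for _ in range(n - 1)]
    for u, v, w in edges:
        for a in (u, v):
            if a > 0:
                lap[a - 1][a - 1] += w
        if u > 0 and v > 0:
            lap[u - 1][v - 1] -= w
            lap[v - 1][u - 1] -= w
    return logdet(lap)


def greedy_edges(n, base_edges, candidates, budget):
    """Add up to `budget` candidate edges, maximizing the log tree count each time."""
    edges = list(base_edges)
    pool = list(candidates)
    chosen = []
    for _ in range(min(budget, len(pool))):
        best = max(pool, key=lambda e: log_tree_count(n, edges + [e]))
        pool.remove(best)
        edges.append(best)
        chosen.append(best)
    return chosen, log_tree_count(n, edges)


# Odometry-like chain 0-1-2-3 plus three candidate loop closures (assumed data).
chain = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
closures = [(0, 2, 1.0), (0, 3, 1.0), (1, 3, 1.0)]
print(greedy_edges(4, chain, closures, budget=1))
```

With unit weights, closing the full loop (edge 0-3) yields four spanning trees, while either shorter closure yields three, so the greedy step picks the long loop closure.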

SLAM based on submap joining that achieves linear cost through a novel choice of the reference frame of each submap, with an interesting review of related work on map joining, i.e., treating submaps as observations

Liang Zhao, Shoudong Huang, Gamini Dissanayake, Linear SLAM: Linearising the SLAM problems using submap joining, Automatica, Volume 100, 2019, Pages 231-246, DOI: 10.1016/j.automatica.2018.10.037.

The main contribution of this paper is a new submap joining based approach for solving large-scale Simultaneous Localization and Mapping (SLAM) problems. Each local submap is independently built using the local information through solving a small-scale SLAM; the joining of submaps mainly involves solving linear least squares and performing nonlinear coordinate transformations. Through approximating the local submap information as the state estimate and its corresponding information matrix, judiciously selecting the submap coordinate frames, and approximating the joining of a large number of submaps by joining only two maps at a time, either sequentially or in a more efficient Divide and Conquer manner, the nonlinear optimization process involved in most of the existing submap joining approaches is avoided. Thus the proposed submap joining algorithm does not require an initial guess or iterations since linear least squares problems have closed-form solutions. The proposed Linear SLAM technique is applicable to feature-based SLAM, pose graph SLAM and D-SLAM, in both two and three dimensions, and does not require any assumption on the character of the covariance matrices. Simulations and experiments are performed to evaluate the proposed Linear SLAM algorithm. Results using publicly available datasets in 2D and 3D show that Linear SLAM produces results that are very close to the best solutions that can be obtained using a full nonlinear optimization algorithm started from an accurate initial guess. The C/C++ and MATLAB source codes of Linear SLAM are available on OpenSLAM.
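The remark that linear least squares has a closed-form solution can be seen in a deliberately simplified sketch: two submaps already expressed in the same coordinate frame, each giving 2D feature estimates with a scalar (isotropic) information weight. Both the isotropic weights and the function `join_submaps` are assumptions made for brevity; the paper works with full information matrices and also handles the coordinate transformations, which are omitted here.

```python
def join_submaps(map_a, map_b):
    """Fuse two submaps given as {name: ((x, y), information_weight)}.

    For a feature seen in both maps, the weighted least-squares solution is the
    information-weighted mean, and the information weights add up.
    """
    joined = dict(map_a)
    for name, (est_b, info_b) in map_b.items():
        if name in joined:
            est_a, info_a = joined[name]
            total = info_a + info_b
            fused = tuple((info_a * a + info_b * b) / total
                          for a, b in zip(est_a, est_b))
            joined[name] = (fused, total)
        else:
            joined[name] = (est_b, info_b)
    return joined


# Two toy submaps sharing feature "f2" (assumed data).
submap_a = {"f1": ((1.0, 2.0), 4.0), "f2": ((3.0, 0.0), 1.0)}
submap_b = {"f2": ((5.0, 2.0), 3.0), "f3": ((7.0, 7.0), 2.0)}
print(join_submaps(submap_a, submap_b))
```

No initial guess or iteration is involved: the fused estimate follows directly from the two inputs, which is the property the abstract highlights.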

A nice review of visual SLAM with deep learning, and its evolution from non-learning visual SLAM

Ruihao Li, Sen Wang, DongBing Gu, Ongoing Evolution of Visual SLAM from Geometry to Deep Learning: Challenges and Opportunities, Cognitive Computation, December 2018, Volume 10, Issue 6, pp 875–889, DOI: 10.1007/s12559-018-9591-8.

Visual simultaneous localization and mapping (SLAM) has been investigated in the robotics community for decades. Significant progress and achievements on visual SLAM have been made, with geometric model-based techniques becoming increasingly mature and accurate. However, they tend to be fragile under challenging environments. Recently, there has been a trend to develop data-driven approaches, e.g., deep learning, for visual SLAM problems with more robust performance. This paper aims to witness the ongoing evolution of visual SLAM techniques from geometric model-based to data-driven approaches by providing a comprehensive technical review. Our contribution is not just a compilation of state-of-the-art end-to-end deep learning SLAM work, but also an insight into the underlying mechanism of deep learning SLAM. For such a purpose, we provide a concise overview of geometric model-based approaches first. Next, we identify visual depth estimation using deep learning as a starting point of the evolution. It is from depth estimation that ego-motion or pose estimation techniques using deep learning flourish rapidly. In addition, we strive to link semantic segmentation using deep learning with emergent semantic SLAM techniques to shed light on simultaneous estimation of ego-motion and high-level understanding. Finally, we visualize some further opportunities in this research direction.

A developmental architecture for sensory-motor skills based on predictors, and a nice state-of-the-art in cognitive architectures for sensory-motor skill learning

E. Wieser and G. Cheng, A Self-Verifying Cognitive Architecture for Robust Bootstrapping of Sensory-Motor Skills via Multipurpose Predictors, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1081-1095, DOI: 10.1109/TCDS.2018.2871857.

The autonomous acquisition of sensory-motor skills along multiple developmental stages is one of the current challenges in robotics. To this end, we propose a new developmental cognitive architecture that combines multipurpose predictors and principles of self-verification for the robust bootstrapping of sensory-motor skills. Our architecture operates with loops formed by both mental simulation of sensory-motor sequences and their subsequent physical trial on a robot. During these loops, verification algorithms monitor the predicted and the physically observed sensory-motor data. Multiple types of predictors are acquired through several developmental stages. As a result, the architecture can select and plan actions, adapt to various robot platforms by adjusting proprioceptive feedback, predict the risk of self-collision, learn from a previous interaction stage by validating and extracting sensory-motor data for training the predictor of a subsequent stage, and finally acquire an internal representation for evaluating the performance of its predictors. These cognitive capabilities in turn realize the bootstrapping of early hand-eye coordination and its improvement. We validate the cognitive capabilities experimentally and, in particular, show an improvement of reaching as an example skill.

A definition of emergence and its application to emergence in robots

R. L. Sturdivant and E. K. P. Chong, The Necessary and Sufficient Conditions for Emergence in Systems Applied to Symbol Emergence in Robots, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1035-1042, DOI: 10.1109/TCDS.2017.2731361.

A conceptual model for emergence with downward causation is developed. In addition, the necessary and sufficient conditions are identified for a phenomenon to be considered emergent in a complex system. It is then applied to symbol emergence in robots. This paper is motivated by the usefulness of emergence to explain a wide variety of phenomena in systems, and cognition in natural and artificial creatures. Downward causation is shown to be a critical requirement for potentially emergent phenomena to be considered actually emergent. Models of emergence with and without downward causation are described, along with how weak emergence can include downward causation. A process flow is developed for distinguishing emergence from nonemergence based upon the application of reductionism and detection of downward causation. Examples are shown for applying the necessary and sufficient conditions to filter out actually emergent phenomena from nonemergent ones. Finally, this approach for detecting emergence is applied to complex projects and symbol emergence in robots.

A cognitive architecture for self-development in robots that interact with humans, with a nice state-of-the-art of robot cognitive architectures

C. Moulin-Frier et al., DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 4, pp. 1005-1022, DOI: 10.1109/TCDS.2017.2754143.

This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both human and robot. The framework, based on a biologically grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.

A ROS module that improves real-time aspects of network communication among distributed ROS machines, and a nice analysis of wireless network characteristics and limitations

Danilo Tardioli, Ramviyas Parasuraman, Petter Ögren, Pound: A multi-master ROS node for reducing delay and jitter in wireless multi-robot networks, Robotics and Autonomous Systems, Volume 111, 2019, Pages 73-87, DOI: 10.1016/j.robot.2018.10.009.

The Robot Operating System (ROS) is a popular and widely used software framework for building robotics systems. With the growth of its popularity, it has started to be used in multi-robot systems as well. However, the TCP connections that the platform relies on for connecting the so-called ROS nodes present several issues regarding limited bandwidth, delays, and jitter when used in wireless multi-hop networks. In this paper, we present a thorough analysis of the problem and propose a new ROS node called Pound to improve the wireless communication performance by reducing delay and jitter in data exchanges, especially in multi-hop networks. Pound allows the use of multiple ROS masters (roscores), features data compression, and importantly, introduces a priority scheme that allows favoring more important flows over less important ones. We compare Pound to the state-of-the-art solutions through extensive experiments and show that it performs equally well, or better in all the test cases, including a control-over-network example.
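The idea of favoring more important flows can be sketched with a priority queue placed in front of the network sender. This is not Pound's code or API: the class `PriorityOutbox`, the topic names, and the priority values are hypothetical, and the sketch only shows the ordering behaviour (highest priority first, first-in first-out within a priority level).

```python
import heapq
import itertools


class PriorityOutbox:
    """Outgoing message queue that always releases the highest-priority message."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker that preserves FIFO order

    def push(self, topic, priority, payload):
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, next(self._order), topic, payload))

    def pop(self):
        _, _, topic, payload = heapq.heappop(self._heap)
        return topic, payload

    def __len__(self):
        return len(self._heap)


outbox = PriorityOutbox()
outbox.push("/map", 1, b"occupancy grid")     # bulky, low-priority flow
outbox.push("/cmd_vel", 10, b"velocity #1")   # control flow, high priority
outbox.push("/cmd_vel", 10, b"velocity #2")
while outbox:
    print(outbox.pop())
```

Here both velocity commands leave before the map update even though the map was queued first, which is the kind of behaviour that keeps control-loop delay low on a congested link.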

A novel paradigm for motion planning based on probabilistic inference

Mukadam, M., Dong, J., Yan, X., Dellaert, F., & Boots, B., Continuous-time Gaussian process motion planning via probabilistic inference, The International Journal of Robotics Research, 37(11), 1319–1340, DOI: 10.1177/0278364918790369.

We introduce a novel formulation of motion planning, for continuous-time trajectories, as probabilistic inference. We first show how smooth continuous-time trajectories can be represented by a small number of states using sparse Gaussian process (GP) models. We next develop an efficient gradient-based optimization algorithm that exploits this sparsity and GP interpolation. We call this algorithm the Gaussian Process Motion Planner (GPMP). We then detail how motion planning problems can be formulated as probabilistic inference on a factor graph. This forms the basis for GPMP2, a very efficient algorithm that combines GP representations of trajectories with fast, structure-exploiting inference via numerical optimization. Finally, we extend GPMP2 to an incremental algorithm, iGPMP2, that can efficiently replan when conditions change. We benchmark our algorithms against several sampling-based and trajectory optimization-based motion planning algorithms on planning problems in multiple environments. Our evaluation reveals that GPMP2 is several times faster than previous algorithms while retaining robustness. We also benchmark iGPMP2 on replanning problems, and show that it can find successful solutions in a fraction of the time required by GPMP2 to replan from scratch.
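Viewing planning as inference means the most probable trajectory minimizes a sum of factor costs: a prior that favours smooth motion and a likelihood that penalizes states near obstacles. The sketch below is a much simplified stand-in for that idea and is not GPMP or GPMP2: the GP prior is replaced by squared finite differences between consecutive 2D states, the obstacle factor is a squared hinge on the distance to a single circular obstacle, and the cost is minimized with plain gradient descent. All names, weights, and the scenario are assumptions.

```python
import math


def plan_trajectory(start, goal, center, radius, margin=0.5, n_states=21,
                    w_smooth=1.0, w_obs=10.0, step=0.02, iters=2000):
    """Minimize smoothness + obstacle costs over a 2D trajectory with fixed ends."""
    # Straight-line initialization between the fixed endpoints.
    traj = [[start[k] + (goal[k] - start[k]) * i / (n_states - 1) for k in range(2)]
            for i in range(n_states)]
    for _ in range(iters):
        grads = [[0.0, 0.0] for _ in traj]
        for i in range(1, n_states - 1):
            for k in range(2):
                # Gradient of the two smoothness factors that touch state i.
                grads[i][k] += 2.0 * w_smooth * (
                    2.0 * traj[i][k] - traj[i - 1][k] - traj[i + 1][k])
            dx = traj[i][0] - center[0]
            dy = traj[i][1] - center[1]
            dist = math.hypot(dx, dy)
            hinge = radius + margin - dist
            if hinge > 0.0 and dist > 1e-9:
                # Gradient of w_obs * hinge**2; descending it moves the state
                # radially away from the obstacle center.
                grads[i][0] -= 2.0 * w_obs * hinge * dx / dist
                grads[i][1] -= 2.0 * w_obs * hinge * dy / dist
        for i in range(1, n_states - 1):
            for k in range(2):
                traj[i][k] -= step * grads[i][k]
    return traj


def clearance(traj, center):
    """Smallest distance from any trajectory state to the obstacle center."""
    return min(math.hypot(p[0] - center[0], p[1] - center[1]) for p in traj)


obstacle_center, obstacle_radius = (5.0, 0.5), 1.5
path = plan_trajectory((0.0, 0.0), (10.0, 0.0), obstacle_center, obstacle_radius)
print(clearance(path, obstacle_center))
```

The obstacle is offset slightly from the straight line so that the gradient has a sideways component; a straight-line initialization passing exactly through the center would only be pushed along the line.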