Category Archives: Mobile Robot Slam

Interesting survey on Visual SLAM without filtering and of its future lines of research

Georges Younes, Daniel Asmar, Elie Shammas, John Zelek, Keyframe-based monocular SLAM: design, survey, and future directions, Robotics and Autonomous Systems, Volume 98, 2017, Pages 67-88, DOI: 10.1016/j.robot.2017.09.010.

Extensive research in the field of monocular SLAM for the past fifteen years has yielded workable systems that found their way into various applications in robotics and augmented reality. Although filter-based monocular SLAM systems were common at some time, the more efficient keyframe-based solutions are becoming the de facto methodology for building a monocular SLAM system. The objective of this paper is threefold: first, the paper serves as a guideline for people seeking to design their own monocular SLAM according to specific environmental constraints. Second, it presents a survey that covers the various keyframe-based monocular SLAM systems in the literature, detailing the components of their implementation, and critically assessing the specific strategies made in each proposed solution. Third, the paper provides insight into the direction of future research in this field, to address the major limitations still facing monocular SLAM; namely, in the issues of illumination changes, initialization, highly dynamic motion, poorly textured scenes, repetitive textures, map maintenance, and failure recovery.

Interesting implementation of visual graph SLAM in C++ for educational purposes

Dominik Schlegel, Mirco Colosi, Giorgio Grisetti, ProSLAM: Graph SLAM from a Programmer’s Perspective, arXiv:1709.04377.

In this paper we present ProSLAM, a lightweight stereo visual SLAM system designed with simplicity in mind. Our work stems from the experience gathered by the authors while teaching SLAM to students and aims at providing a highly modular system that can be easily implemented and understood. Rather than focusing on the well-known mathematical aspects of Stereo Visual SLAM, in this work we highlight the data structures and the algorithmic aspects that one needs to tackle during the design of such a system. We implemented ProSLAM using the C++ programming language in combination with a minimal set of well-known, widely used external libraries. In addition to an open source implementation, we provide several code snippets that address the core aspects of our approach directly in this paper. The results of a thorough validation performed on standard benchmark datasets show that our approach achieves accuracy comparable to state-of-the-art methods, while requiring substantially less computational resources.

Interesting review of approaches to visually detecting loop closures in robotics, and a novel, very efficient method that is independent of the image representation and replaces the typical ℓ2 norm (least squares), which leads to dense optimization problems, with a sparsity-inducing ℓ1 formulation

Yasir Latif, Guoquan Huang, John Leonard, José Neira, Sparse optimization for robust and efficient loop closing, Robotics and Autonomous Systems, Volume 93, July 2017, Pages 13-26, ISSN 0921-8890, DOI: 10.1016/j.robot.2017.03.016.

It is essential for a robot to be able to detect revisits or loop closures for long-term visual navigation. A key insight explored in this work is that the loop-closing event inherently occurs sparsely, i.e., the image currently being taken matches with only a small subset (if any) of previous images. Based on this observation, we formulate the problem of loop-closure detection as a sparse, convex ℓ1-minimization problem. By leveraging fast convex optimization techniques, we are able to efficiently find loop closures, thus enabling real-time robot navigation. This novel formulation requires no offline dictionary learning, as required by most existing approaches, and thus allows online incremental operation. Our approach ensures a unique hypothesis by choosing only a single globally optimal match when making a loop-closure decision. Furthermore, the proposed formulation enjoys a flexible representation with no restriction imposed on how images should be represented, while requiring only that the representations are “close” to each other when the corresponding images are visually similar. The proposed algorithm is validated extensively using real-world datasets.
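
To make the idea of "loop closure as sparse ℓ1-minimization" concrete, here is a minimal sketch in plain Python. It is not the authors' solver: it swaps in ISTA (iterative soft-thresholding) on the relaxed problem 0.5·‖Dx − b‖² + λ‖x‖₁, where the columns of D are normalized descriptors of previous images and b is the current one. The descriptors, the value of `lam` and the function names are all assumptions made for illustration.

```python
import math

def soft_threshold(v, t):
    # Shrink v towards zero by t (proximal operator of the l1 norm).
    return math.copysign(max(abs(v) - t, 0.0), v)

def normalize(vec):
    n = math.sqrt(sum(c * c for c in vec))
    return [c / n for c in vec]

def l1_loop_closure(previous, current, lam=0.05, iters=500):
    """Sparse weights x minimizing 0.5*||D x - b||^2 + lam*||x||_1 via ISTA."""
    cols = [normalize(p) for p in previous]   # dictionary columns (one per past image)
    b = normalize(current)
    n, m = len(cols), len(b)
    step = 1.0 / n   # safe step: ||D||_2^2 <= ||D||_F^2 = n for unit-norm columns
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = D x - b.
        r = [sum(cols[j][i] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient step on the quadratic term, then shrinkage.
        x = [soft_threshold(x[j] - step * sum(cols[j][i] * r[i] for i in range(m)),
                            step * lam)
             for j in range(n)]
    return x

# Hypothetical 4-D image descriptors of three previously seen places.
previous = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
current = [0.03, 0.98, 0.02, 0.1]   # looks like place 1, plus some noise

weights = l1_loop_closure(previous, current)
best = max(range(len(weights)), key=lambda j: abs(weights[j]))
```

Only the weight of the matching image survives the shrinkage; the other entries are driven exactly to zero, which is the sparsity the abstract relies on to pick a single hypothesis.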

Insights into the sparsity of graph-SLAM (i.e., in the smoothing / optimization approach to SLAM) and a good formalization of the problem

K. Khosoussi, S. Huang and G. Dissanayake, “A Sparse Separable SLAM Back-End,” in IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1536-1549, Dec. 2016. DOI: 10.1109/TRO.2016.2609394.

We propose a scalable algorithm to take advantage of the separable structure of simultaneous localization and mapping (SLAM). Separability is an overlooked structure of SLAM that distinguishes it from a generic nonlinear least-squares problem. The standard relative-pose and relative-position measurement models in SLAM are affine with respect to robot and features’ positions. Therefore, given an estimate for robot orientation, the conditionally optimal estimate for the rest of the state variables can be easily computed by solving a sparse linear least-squares problem. We propose an algorithm to exploit this intrinsic property of SLAM by stripping the problem down to its nonlinear core, while maintaining its natural sparsity. Our algorithm can be used in conjunction with any Newton-based solver and is applicable to 2-D/3-D pose-graph and feature-based SLAM. Our results suggest that iteratively solving the nonlinear core of SLAM leads to a fast and reliable convergence as compared to the state-of-the-art sparse back-ends.
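
The separability claim can be illustrated with a small 2-D pose graph: once the orientations are fixed, each relative-position measurement z = R(θ_i)ᵀ(p_j − p_i) becomes linear in the positions, so they follow from one linear least-squares solve. The sketch below is only an illustration under simplifying assumptions (unit information matrices, pose 0 anchored at the origin, a dense solver instead of a sparse one); the function names are hypothetical and this is not the authors' back-end.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def positions_given_orientations(thetas, edges):
    """edges: (i, j, zx, zy) with z = R(theta_i)^T (p_j - p_i); pose 0 is the origin."""
    n = len(thetas) - 1                      # unknown positions p_1 .. p_n
    H = [[0.0] * n for _ in range(n)]        # normal-equation matrix J^T J
    gx, gy = [0.0] * n, [0.0] * n            # right-hand sides J^T d (x and y decouple)
    for i, j, zx, zy in edges:
        c, s = math.cos(thetas[i]), math.sin(thetas[i])
        dx, dy = c * zx - s * zy, s * zx + c * zy   # measurement rotated into the world frame
        for a, sa in ((i, -1.0), (j, 1.0)):         # residual is p_j - p_i - d
            if a == 0:
                continue
            gx[a - 1] += sa * dx
            gy[a - 1] += sa * dy
            for b, sb in ((i, -1.0), (j, 1.0)):
                if b != 0:
                    H[a - 1][b - 1] += sa * sb
    xs, ys = solve(H, gx), solve(H, gy)
    return [(0.0, 0.0)] + list(zip(xs, ys))

# Robot drives a unit square: each step is "1 m forward" in its own frame.
thetas = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
edges = [(0, 1, 1.0, 0.0), (1, 2, 1.0, 0.0), (2, 3, 1.0, 0.0), (3, 0, 1.0, 0.0)]
positions = positions_given_orientations(thetas, edges)
```

With the orientations known, no iteration is needed for the positions; the nonlinear part of the problem is confined to the orientations, which is the "nonlinear core" the abstract refers to.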

An excellent survey of metrical SLAM (and of map representations and other issues related to SLAM) as of 2016

C. Cadena et al., “Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age,” in IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1309-1332, Dec. 2016. DOI: 10.1109/TRO.2016.2624754.

Simultaneous localization and mapping (SLAM) consists in the concurrent construction of a model of the environment (the map), and the estimation of the state of the robot moving within it. The SLAM community has made astonishing progress over the last 30 years, enabling large-scale real-world applications and witnessing a steady transition of this technology to industry. We survey the current state of SLAM and consider future directions. We start by presenting what is now the de-facto standard formulation for SLAM. We then review related work, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers. This paper simultaneously serves as a position paper and tutorial to those who are users of SLAM. By looking at the published research with a critical eye, we delineate open challenges and new research issues, that still deserve careful scientific investigation. The paper also contains the authors’ take on two questions that often animate discussions during robotics conferences: Do robots need SLAM? and Is SLAM solved?

Implementation of particle filter (PF) SLAM on FPGAs, with a good review of the state of the art

B.G. Sileshi, J. Oliver, R. Toledo, J. Gonçalves, P. Costa, On the behaviour of low cost laser scanners in HW/SW particle filter SLAM applications, Robotics and Autonomous Systems, Volume 80, June 2016, Pages 11-23, ISSN 0921-8890, DOI: 10.1016/j.robot.2016.03.002.

Particle filters (PFs) are computationally intensive sequential Monte Carlo estimation methods with applications in the field of mobile robotics for performing tasks such as tracking, simultaneous localization and mapping (SLAM) and navigation, by dealing with the uncertainties and/or noise generated by the sensors as well as with the intrinsic uncertainties of the environment. However, the application of PFs with an important number of particles has traditionally been difficult to implement in real-time applications due to the huge number of operations they require. This work presents a hardware implementation on an FPGA (field-programmable gate array) of a PF applied to SLAM which aims to accelerate the execution time of the PF algorithm with moderate resource usage. The presented system is evaluated for different sensors including a low-cost Neato XV-11 laser scanner sensor. First the system is validated by post-processing data provided by a realistic simulation of a differential robot, equipped with a hacked Neato XV-11 laser scanner, that navigates in the Robot@Factory competition maze. The robot was simulated using SimTwo, which is a realistic simulation software that can support several types of robots. The simulator provides the robot ground truth, odometry and the laser scanner data. Then the proposed solution is further validated on standard laser scanner sensors in complex environments. The results of this study confirm that a low-cost laser scanner can be used in different robotics applications, which benefit from both its low cost and the increased speed provided by the SLAM algorithm running on the FPGA.
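
For readers unfamiliar with where the per-particle cost of a PF comes from, the following sketch shows one common building block, systematic resampling, in plain Python. The abstract does not say which resampling scheme the authors implement in hardware, so this is a generic textbook illustration; the function name and the explicit `offset` argument (the single random draw, passed in here to keep the example deterministic) are assumptions.

```python
def systematic_resample(weights, offset):
    """Return the indices of surviving particles; offset in [0, 1) is the random draw."""
    n = len(weights)
    total = sum(weights)
    indices = []
    i = 0
    cumulative = weights[0] / total
    for k in range(n):
        position = (offset + k) / n          # evenly spaced sampling positions
        while position > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i] / total
        indices.append(i)
    return indices

# Four particles; particle 2 explains the laser scan much better than the others.
weights = [0.1, 0.1, 0.7, 0.1]
survivors = systematic_resample(weights, offset=0.5)
```

The heavily weighted particle is duplicated while low-weight ones are dropped. Every filter update repeats this, together with motion prediction and weighting, for each particle, which is why large particle counts are costly in software.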

Dealing with multiple hypotheses in Graph-SLAM through multigraphs (as in multi-hierarchical graphs)

Max Pfingsthorn and Andreas Birk, Generalized graph SLAM: Solving local and global ambiguities through multimodal and hyperedge constraints, The International Journal of Robotics Research May 2016 35: 601-630, DOI: 10.1177/0278364915585395.

Research in Graph-based Simultaneous Localization and Mapping has experienced a recent trend towards robust methods. These methods take the combinatorial aspect of data association into account by allowing decisions of the graph topology to be made during optimization. The Generalized Graph Simultaneous Localization and Mapping framework presented in this work can represent ambiguous data on both local and global scales, i.e. it can handle multiple mutually exclusive choices in registration results and potentially erroneous loop closures. This is achieved by augmenting previous work on multimodal distributions with an extended graph structure using hyperedges to encode ambiguous loop closures. The novel representation combines both hyperedges and multimodal Mixture of Gaussian constraints to represent all sources of ambiguity in Simultaneous Localization and Mapping. Furthermore, a discrete optimization stage is introduced between the Simultaneous Localization and Mapping frontend and backend to handle these ambiguities in a unified way utilizing the novel representation of Generalized Graph Simultaneous Localization and Mapping, providing a general approach to handle all forms of outliers. The novel Generalized Prefilter method optimizes among all local and global choices and generates a traditional unimodal unambiguous pose graph for subsequent continuous optimization in the backend. Systematic experiments on synthetic datasets show that the novel representation of the Generalized Graph Simultaneous Localization and Mapping framework with the Generalized Prefilter method is significantly more robust and faster than other robust state-of-the-art methods. In addition, two experiments with real data are presented to corroborate the results observed with synthetic data. Different general strategies to construct problems from real data, utilizing the full representational power of the Generalized Graph Simultaneous Localization and Mapping framework, are also illustrated in these experiments.

Very interesting survey on visual place recognition, including historical background, physio-psychological bases and a definition of “place” in robotics

S. Lowry et al., Visual Place Recognition: A Survey, in IEEE Transactions on Robotics, vol. 32, no. 1, pp. 1-19, Feb. 2016. DOI: 10.1109/TRO.2015.2496823.

Visual place recognition is a challenging problem due to the vast range of ways in which the appearance of real-world places can vary. In recent years, improvements in visual sensing capabilities, an ever-increasing focus on long-term mobile robot autonomy, and the ability to draw on state-of-the-art research in other disciplines (particularly recognition in computer vision and animal navigation in neuroscience) have all contributed to significant advances in visual place recognition systems. This paper presents a survey of the visual place recognition research landscape. We start by introducing the concepts behind place recognition: the role of place recognition in the animal kingdom, how a “place” is defined in a robotics context, and the major components of a place recognition system. Long-term robot operations have revealed that changing appearance can be a significant factor in visual place recognition failure; therefore, we discuss how place recognition solutions can implicitly or explicitly account for appearance change within the environment. Finally, we close with a discussion on the future of visual place recognition, in particular with respect to the rapid advances being made in the related fields of deep learning, semantic scene understanding, and video description.

Incorporating spatial info into the symbolic (bag-of-words) info used for loop closure detection

Nishant Kejriwal, Swagat Kumar, Tomohiro Shibata, High performance loop closure detection using bag of word pairs, Robotics and Autonomous Systems, Volume 77, March 2016, Pages 55-65, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.12.003.

In this paper, we look into the problem of loop closure detection in topological mapping. The bag of words (BoW) is a popular approach which is fast and easy to implement, but suffers from perceptual aliasing, primarily due to vector quantization. We propose to overcome this limitation by incorporating the spatial co-occurrence information directly into the dictionary itself. This is done by creating an additional dictionary comprising word pairs, which are formed by using a spatial neighborhood defined based on the scale size of each point feature. Since the word pairs are defined relative to the spatial location of each point feature, they exhibit a directional attribute, which is a new finding made in this paper. The proposed approach, called bag of word pairs (BoWP), uses relative spatial co-occurrence of words to overcome the limitations of the conventional BoW methods. Unlike previous methods that use spatial arrangement only as a verification step, the proposed method incorporates spatial information directly into the detection level and thus influences all stages of decision making. The proposed BoWP method is implemented in an on-line fashion by incorporating some popular concepts such as a K-D tree for storing and searching features, a Bayesian probabilistic framework for making decisions on loop closures, incremental creation of the dictionary and RANSAC for confirming the loop closure for the top candidate. Unlike previous methods, an incremental version of the K-D tree implementation is used which prevents rebuilding of the tree for every incoming image, thereby reducing the per-image computation time considerably. Through experiments on standard datasets it is shown that the proposed methods provide better recall performance than most of the existing methods. This improvement is achieved without making use of any geometric information obtained from range sensors or robot odometry. The computational requirements for the algorithm are comparable to those of BoW methods and are shown to be less than those of the latest state-of-the-art method in this category.
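
A rough sketch of how word pairs might be formed from a scale-dependent neighborhood is given below. It is an illustration of the idea only: the `(x, y, scale, word_id)` feature layout, the `radius_factor` value and the function name are assumptions, not details taken from the paper. Pairs are kept ordered (center word, neighbor word), which also shows the directional attribute mentioned in the abstract.

```python
from collections import Counter
import math

def word_pairs(features, radius_factor=3.0):
    """features: list of (x, y, scale, word_id); returns a histogram of ordered word pairs."""
    pairs = Counter()
    for a, (xa, ya, scale_a, word_a) in enumerate(features):
        radius = radius_factor * scale_a      # neighborhood grows with the feature's scale
        for b, (xb, yb, _, word_b) in enumerate(features):
            if a != b and math.hypot(xb - xa, yb - ya) <= radius:
                pairs[(word_a, word_b)] += 1
    return pairs

# Hypothetical quantized features from one image.
features = [
    (10.0, 10.0, 2.0, 7),   # large-scale feature: neighborhood radius 6
    (14.0, 10.0, 1.0, 3),   # small-scale feature: neighborhood radius 3
    (50.0, 50.0, 2.0, 7),   # isolated feature: no neighbors
]
pairs = word_pairs(features)
```

Here the pair (7, 3) is recorded but (3, 7) is not, because the second feature's smaller neighborhood does not reach the first. Two images that contain the same single words but in different spatial arrangements would therefore produce different pair histograms.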

Implementation of spatial relations in graph-SLAM through dual quaternions instead of homogeneous matrices

Jiantong Cheng, Jonghyuk Kim, Zhenyu Jiang, Wanfang Che, Dual quaternion-based graphical SLAM, Robotics and Autonomous Systems, Volume 77, March 2016, Pages 15-24, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.12.001.

This paper presents a new parameterization approach for the graph-based SLAM problem and reveals the differences of two popular over-parameterized ways in the optimization procedure. In the SLAM problem, constraints or relative transformations between any two poses are generally separated into translations plus 3D rotations, which are then described in a homogeneous transformation matrix (HTM) to simplify computational operations. This however introduces added complexities in frequent conversions between the HTM and state variables, due to their different representations. This new approach, unit dual quaternion (UDQ), describes a spatial transformation as a screw with only 8 elements. We show that state variables can be directly represented by UDQs, and how their relative transformations can be written with the UDQ product, without the trivial computations of HTM. Then, we explore the performances of the unit quaternion and the axis–angle representations in the graph-based SLAM problem, which have been successfully applied to over-parameterize perturbations under the assumption of small errors. Based on public synthetic and real-world datasets in 2D and 3D environments, experimental results show that the proposed approach greatly reduces the computational complexity while obtaining the same optimization accuracies as the HTM-based algorithm, and that the axis–angle representation is superior to the quaternion in the case of poor initial estimations.
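
As a small illustration of composing poses with the UDQ product, the sketch below stores a pose as a pair (real, dual) of quaternions, with dual = 0.5 · t ⊗ real, and composes two poses with (a_r b_r, a_r b_d + a_d b_r). This follows the standard dual-quaternion convention rather than anything specific to the paper; the helper names are hypothetical and rotations are restricted to yaw about the z axis to keep the example short.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def udq_from_pose(yaw, t):
    """Unit dual quaternion (8 numbers) for a rotation by yaw about z and a translation t."""
    real = (math.cos(yaw / 2), 0.0, 0.0, math.sin(yaw / 2))
    dual = tuple(0.5 * c for c in qmul((0.0,) + tuple(t), real))
    return real, dual

def udq_mul(a, b):
    """Compose two transformations: the result applies b first, then a."""
    real = qmul(a[0], b[0])
    dual = tuple(p + q for p, q in zip(qmul(a[0], b[1]), qmul(a[1], b[0])))
    return real, dual

def udq_translation(q):
    # t = 2 * dual * conj(real); drop the (zero) scalar part.
    return tuple(2.0 * c for c in qmul(q[1], qconj(q[0])))[1:]

def udq_yaw(q):
    return 2.0 * math.atan2(q[0][3], q[0][0])

# Pose A in the world, and a relative motion B expressed in A's frame:
# both are "move 1 m along x, then face 90 degrees left".
pose_a = udq_from_pose(math.pi / 2, (1.0, 0.0, 0.0))
motion_b = udq_from_pose(math.pi / 2, (1.0, 0.0, 0.0))
pose_ab = udq_mul(pose_a, motion_b)
translation = udq_translation(pose_ab)
yaw = udq_yaw(pose_ab)
```

The composed pose has translation t_a + R_a t_b = (1, 1, 0) and a yaw of π, the same result a product of two homogeneous matrices would give, but computed from 8 numbers per pose and without converting to a matrix.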