Ruihao Li, Sen Wang, DongBing Gu, Ongoing Evolution of Visual SLAM from Geometry to Deep Learning: Challenges and Opportunities, Cognitive Computation, December 2018, Volume 10, Issue 6, pp 875–889, DOI: 10.1007/s12559-018-9591-8.
Visual simultaneous localization and mapping (SLAM) has been investigated in the robotics community for decades. Significant progress has been made, with geometric model-based techniques becoming increasingly mature and accurate. However, they tend to be fragile in challenging environments. Recently, there has been a trend toward data-driven approaches, e.g., deep learning, that promise more robust performance on visual SLAM problems. This paper traces the ongoing evolution of visual SLAM techniques from geometric model-based to data-driven approaches by providing a comprehensive technical review. Our contribution is not just a compilation of state-of-the-art end-to-end deep learning SLAM work, but also an insight into the underlying mechanisms of deep learning SLAM. To this end, we first provide a concise overview of geometric model-based approaches. Next, we identify visual depth estimation using deep learning as a starting point of the evolution. It is from depth estimation that deep learning techniques for ego-motion or pose estimation have developed rapidly. In addition, we link semantic segmentation using deep learning with emerging semantic SLAM techniques to shed light on the simultaneous estimation of ego-motion and high-level understanding. Finally, we outline some further opportunities in this research direction.