A survey on visual SLAM in robotics

Iman Abaspur Kazerouni, Luke Fitzgerald, Gerard Dooly, Daniel Toal, A survey of state-of-the-art on visual SLAM, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117734.

This paper is an overview to Visual Simultaneous Localization and Mapping (V-SLAM). We discuss the basic definitions in the SLAM and vision system fields and provide a review of the state-of-the-art methods utilized for mobile robot's vision and SLAM. This paper covers topics from the basic SLAM methods, vision sensors, machine vision algorithms for feature extraction and matching, Deep Learning (DL) methods and datasets for Visual Odometry (VO) and Loop Closure (LC) in V-SLAM applications. Several feature extraction and matching algorithms are simulated to show a better vision of feature-based techniques.

See also:

Jun Cheng, Liyan Zhang, Qihong Chen, Xinrong Hu, Jingcao Cai, “A review of visual SLAM methods for autonomous driving vehicles,” Engineering Applications of Artificial Intelligence, Volume 114, 2022, 104992, ISSN 0952-1976, https://doi.org/10.1016/j.engappai.2022.104992.

Tianyao Zhang, Xiaoguang Hu, Jin Xiao, Guofeng Zhang, “A survey of visual navigation: From geometry to embodied AI,” Engineering Applications of Artificial Intelligence, Volume 114, 2022, 105036, ISSN 0952-1976, https://doi.org/10.1016/j.engappai.2022.105036.

First end-to-end implementation of (monocular) Visual Odometry with deep neural networks, including output with the uncertainty of the result

Sen Wang, Ronald Clark, Hongkai Wen, and Niki Trigoni, End-to-end, sequence-to-sequence probabilistic visual odometry through deep neural networks, The International Journal of Robotics Research Vol 37, Issue 4-5, pp. 513 – 542, DOI: 10.1177/0278364917734298.

This paper studies visual odometry (VO) from the perspective of deep learning. After tremendous efforts in the robotics and computer vision communities over the past few decades, state-of-the-art VO algorithms have demonstrated incredible performance. However, since the VO problem is typically formulated as a pure geometric problem, one of the key features still missing from current VO systems is the capability to automatically gain knowledge and improve performance through learning. In this paper, we investigate whether deep neural networks can be effective and beneficial to the VO problem. An end-to-end, sequence-to-sequence probabilistic visual odometry (ESP-VO) framework is proposed for the monocular VO based on deep recurrent convolutional neural networks. It is trained and deployed in an end-to-end manner, that is, directly inferring poses and uncertainties from a sequence of raw images (video) without adopting any modules from the conventional VO pipeline. It can not only automatically learn effective feature representation encapsulating geometric information through convolutional neural networks, but also implicitly model sequential dynamics and relation for VO using deep recurrent neural networks. Uncertainty is also derived along with the VO estimation without introducing much extra computation. Extensive experiments on several datasets representing driving, flying and walking scenarios show competitive performance of the proposed ESP-VO to the state-of-the-art methods, demonstrating a promising potential of the deep learning technique for VO and verifying that it can be a viable complement to current VO systems.
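The abstract says that uncertainty is estimated together with the pose, but it does not give the training objective. One common way to obtain both from a single network is to have it predict a mean and a log-variance per pose component and to minimise a Gaussian negative log-likelihood. The sketch below illustrates that general idea only; it is an assumption, not the loss used in ESP-VO, and the function name `gaussian_nll` and the 6-component pose layout are hypothetical.

```python
import math


def gaussian_nll(pred_mean, pred_log_var, target):
    """Negative log-likelihood of `target` under independent Gaussians.

    pred_mean    : predicted pose components (hypothetical layout:
                   3 translations followed by 3 rotation angles)
    pred_log_var : predicted log-variance for each component
    target       : ground-truth pose components

    Illustrative only: the actual ESP-VO objective is not stated in the
    abstract, so this is an assumed, generic formulation.
    """
    if not (len(pred_mean) == len(pred_log_var) == len(target)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for mu, log_var, y in zip(pred_mean, pred_log_var, target):
        # Squared error scaled by the predicted variance, plus a penalty
        # on the variance itself so it cannot grow without bound.
        squared_error = (y - mu) ** 2
        total += 0.5 * (
            log_var + squared_error / math.exp(log_var) + math.log(2.0 * math.pi)
        )
    return total


if __name__ == "__main__":
    # Hypothetical relative pose between two frames.
    truth = [1.0, 0.0, 0.1, 0.0, 0.02, 0.0]
    guess = [0.9, 0.0, 0.1, 0.0, 0.05, 0.0]
    confident = [math.log(1e-4)] * 6  # small predicted variance
    cautious = [math.log(1e-2)] * 6   # larger predicted variance
    print("confident:", gaussian_nll(guess, confident, truth))
    print("cautious: ", gaussian_nll(guess, cautious, truth))
```

Under this formulation a prediction that is wrong is penalised less when the reported variance is large, while a prediction that is right is penalised less when the reported variance is small, so the variance output is pushed towards matching the actual error.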