An interesting study on how to quantify uncertainty in SLAM and preserve its monotonic growth, which is needed for good decision making in active SLAM

M. L. Rodríguez-Arévalo, J. Neira and J. A. Castellanos, On the Importance of Uncertainty Representation in Active SLAM, IEEE Transactions on Robotics, vol. 34, no. 3, pp. 829-834, 2018. DOI: 10.1109/TRO.2018.2808902.

The purpose of this work is to highlight the paramount importance of representing and quantifying uncertainty to correctly report the confidence associated with the robot’s location estimate at each time step along its trajectory, and therefore decide the correct course of action in an active SLAM mission. We analyze the monotonicity property of different decision-making criteria, both in 2-D and 3-D, with respect to the representation of uncertainty and of the orientation of the robot’s pose. Monotonicity, the property that uncertainty increases as the robot moves, is essential for adequate decision making. We analytically show that, by using differential representations to propagate spatial uncertainties, monotonicity is preserved for all the optimality criteria A-opt, D-opt, and E-opt, and for Shannon’s entropy. We also show that monotonicity does not hold for any of these criteria in absolute representations using Roll-Pitch-Yaw and Euler angles. Finally, using unit quaternions in absolute representations, the only criteria that preserve monotonicity are D-opt and Shannon’s entropy.
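
As a quick, hedged illustration (not taken from the paper itself), the decision-making criteria named above are typically computed from the pose covariance matrix. A minimal Python sketch, assuming a covariance matrix sigma estimated by the SLAM back-end and using the usual optimal-experimental-design definitions:

    import numpy as np

    def decision_criteria(sigma):
        # Standard optimality criteria computed from a pose covariance matrix.
        # (Common textbook definitions; the paper studies their monotonicity,
        # not these formulas.)
        n = sigma.shape[0]
        eigvals = np.linalg.eigvalsh(sigma)          # covariance is symmetric PSD
        a_opt = np.trace(sigma)                      # A-opt: sum of eigenvalues
        d_opt = np.exp(np.sum(np.log(eigvals)) / n)  # D-opt: n-th root of the determinant
        e_opt = np.max(eigvals)                      # E-opt: largest eigenvalue
        # Differential entropy of a Gaussian with covariance sigma
        entropy = 0.5 * (n * np.log(2 * np.pi * np.e) + np.sum(np.log(eigvals)))
        return a_opt, d_opt, e_opt, entropy

    # Hypothetical 3x3 covariance of a 2-D pose (x, y, yaw)
    sigma = np.diag([0.04, 0.04, 0.01])
    print(decision_criteria(sigma))

The paper's question is whether these scalars can only grow while the robot moves without new observations, depending on how orientation uncertainty is represented.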

A new algorithm that provably converges to a global clock consensus in a network

Miloš S. Stanković, Srdjan S. Stanković, Karl Henrik Johansson, Distributed time synchronization for networks with random delays and measurement noise, Automatica, Volume 93, 2018, Pages 126-137 DOI: 10.1016/j.automatica.2018.03.054.

In this paper a new distributed asynchronous algorithm is proposed for time synchronization in networks with random communication delays, measurement noise and communication dropouts. Three different types of the drift correction algorithm are introduced, based on different kinds of local time increments. Under nonrestrictive conditions concerning network properties, it is proved that all the algorithm types provide convergence in the mean square sense and with probability one (w.p.1) of the corrected drifts of all the nodes to the same value (consensus). An estimate of the convergence rate of these algorithms is derived. For offset correction, a new algorithm is proposed containing a compensation parameter coping with the influence of random delays and special terms taking care of the influence of both linearly increasing time and drift correction. It is proved that the corrected offsets of all the nodes converge in the mean square sense and w.p.1. An efficient offset correction algorithm based on consensus on local compensation parameters is also proposed. It is shown that the overall time synchronization algorithm can also be implemented as a flooding algorithm with one reference node. It is proved that it is possible to achieve bounded error between local corrected clocks in the mean square sense and w.p.1. Simulation results provide an additional practical insight into the algorithm properties and show its advantage over the existing methods.
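
As a toy illustration of the consensus mechanism (a sketch under simplified assumptions, not the authors' algorithm), each node can nudge its drift estimate toward noisy values received asynchronously from random neighbours, with a decreasing step size in the spirit of stochastic approximation:

    import numpy as np

    rng = np.random.default_rng(0)
    n_nodes, n_steps = 10, 5000
    drift = rng.normal(1.0, 1e-3, n_nodes)        # hypothetical initial clock drifts
    neighbours = [np.delete(np.arange(n_nodes), i) for i in range(n_nodes)]  # fully connected toy graph

    for k in range(1, n_steps + 1):
        i = rng.integers(n_nodes)                        # node that wakes up (asynchronous updates)
        j = rng.choice(neighbours[i])                    # random neighbour it hears from
        measurement = drift[j] + rng.normal(0.0, 1e-4)   # neighbour's drift seen through noise/delay
        drift[i] += (1.0 / k) * (measurement - drift[i]) # decreasing stochastic-approximation gain

    print("spread of corrected drifts:", drift.max() - drift.min())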

Evaluating the safety of a motion plan for mobile robot navigation

Brian Axelrod, Leslie Pack Kaelbling, and Tomás Lozano-Pérez, Provably safe robot navigation with obstacle uncertainty, The International Journal of Robotics Research, Vol 37, Issue 7, DOI: 10.1177/0278364918778338.

As drones and autonomous cars become more widespread, it is becoming increasingly important that robots can operate safely under realistic conditions. The noisy information fed into real systems means that robots must use estimates of the environment to plan navigation. Efficiently guaranteeing that the resulting motion plans are safe under these circumstances has proved difficult. We examine how to guarantee that a trajectory or policy has at most ϵ collision probability (ϵ-safe) with only imperfect observations of the environment. We examine the implications of various mathematical formalisms of safety and arrive at a mathematical notion of safety of a long-term execution, even when conditioned on observational information. We explore the idea of shadows that generalize the notion of a confidence set to estimated shapes and present a theorem that allows us to understand the relationship between shadows and their classical statistical equivalents such as confidence and credible sets. We present efficient algorithms that use shadows to prove that trajectories or policies are safe with much tighter bounds than in previous work. Notably, the complexity of the environment does not affect our method’s ability to evaluate whether a trajectory or policy is safe. We then use these safety-checking methods to design a safe variant of the rapidly exploring random tree (RRT) planning algorithm.
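
A crude way to picture the role of shadows (a simplified sketch, not the authors' construction, which yields much tighter bounds) is to grow an estimated obstacle by a confidence region of its pose estimate and require the whole trajectory to clear the grown set. Assuming a disc obstacle with a Gaussian position estimate:

    import numpy as np
    from scipy.stats import chi2

    def is_epsilon_safe(trajectory, obstacle_mean, obstacle_cov, radius, eps):
        # Grow a disc obstacle into a (1 - eps) confidence "shadow" of its
        # estimated centre and require every waypoint to stay outside it.
        # Conservative and illustrative only; the paper derives tighter,
        # polytope-based bounds.
        growth = np.sqrt(chi2.ppf(1.0 - eps, df=2) * np.max(np.linalg.eigvalsh(obstacle_cov)))
        distances = np.linalg.norm(trajectory - obstacle_mean, axis=1)
        return bool(np.all(distances > radius + growth))

    # Hypothetical trajectory and obstacle estimate
    trajectory = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.5], [3.0, 3.0]])
    print(is_epsilon_safe(trajectory, np.array([2.0, 0.0]), 0.01 * np.eye(2), radius=0.3, eps=0.01))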

Shared autonomy where the user's goal is predicted with a POMDP to cope with uncertain predictions

Shervin Javdani, Henny Admoni, Stefania Pellegrinelli, Siddhartha S. Srinivasa, and J. Andrew Bagnell, Shared autonomy via hindsight optimization for teleoperation and teaming, The International Journal of Robotics Research, Vol 37, Issue 7, pp. 717-742, DOI: 10.1177/0278364918776060.

In shared autonomy, a user and autonomous system work together to achieve shared goals. To collaborate effectively, the autonomous system must know the user’s goal. As such, most prior works follow a predict-then-act model, first predicting the user’s goal with high confidence, then assisting given that goal. Unfortunately, confidently predicting the user’s goal may not be possible until they have nearly achieved it, causing predict-then-act methods to provide little assistance. However, the system can often provide useful assistance even when confidence for any single goal is low (e.g. move towards multiple goals). In this work, we formalize this insight by modeling shared autonomy as a partially observable Markov decision process (POMDP), providing assistance that minimizes the expected cost-to-go with an unknown goal. As solving this POMDP optimally is intractable, we use hindsight optimization to approximate. We apply our framework to both shared-control teleoperation and human–robot teaming. Compared with predict-then-act methods, our method achieves goals faster, requires less user input, decreases user idling time, and results in fewer user–robot collisions.
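
The action-selection step can be sketched in a QMDP/hindsight-optimization spirit (a simplification of the paper's POMDP formulation): keep a belief over candidate goals and choose the assistance action that minimizes the expected cost-to-go under that belief. The names and cost below are hypothetical:

    import numpy as np

    def assistive_action(belief, cost_to_go, actions):
        # Hindsight-optimization-style selection (illustrative): expected
        # cost-to-go under the goal belief, minimized over candidate actions.
        expected = {a: sum(p * cost_to_go(g, a) for g, p in belief.items())
                    for a in actions}
        return min(expected, key=expected.get)

    # Toy 1-D example with two candidate goals and a hypothetical per-goal cost
    belief = {-1.0: 0.3, +1.0: 0.7}                          # probability of each goal
    robot_x = 0.4
    cost_to_go = lambda goal, a: abs(goal - (robot_x + a))   # distance left after acting
    print(assistive_action(belief, cost_to_go, actions=[-0.1, 0.0, +0.1]))

Note that the selected action moves toward the more likely goal even though no single goal has been confidently identified, which is the key difference from predict-then-act.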

Relation between optimization and reinforcement learning

Megumi Miyashita, Shiro Yano, Toshiyuki Kondo, Mirror descent search and its acceleration, Robotics and Autonomous Systems, Volume 106, 2018, Pages 107-116, DOI: 10.1016/j.robot.2018.04.009.

In recent years, attention has been focused on the relationship between black-box optimization problems and reinforcement learning problems. In this research, we propose the Mirror Descent Search (MDS) algorithm, which is applicable to both black-box optimization problems and reinforcement learning problems. Our method is based on the mirror descent method, which is a general optimization algorithm. The contribution of this research is roughly twofold. We propose two essential algorithms, called MDS and Accelerated Mirror Descent Search (AMDS), and two more approximate algorithms: Gaussian Mirror Descent Search (G-MDS) and Gaussian Accelerated Mirror Descent Search (G-AMDS). This research shows that the advanced methods developed in the context of mirror descent research can be applied to reinforcement learning problems. We also clarify the relationship between an existing reinforcement learning algorithm and our method. With two evaluation experiments, we show that our proposed algorithms converge faster than some state-of-the-art methods.
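
For readers unfamiliar with the underlying optimizer, here is a minimal sketch of mirror descent with a negative-entropy mirror map (exponentiated-gradient updates on the probability simplex); it illustrates the general method only, not the proposed MDS/AMDS algorithms:

    import numpy as np

    def mirror_descent_simplex(grad, x0, eta=0.1, n_iters=200):
        # Mirror descent with the negative-entropy mirror map: iterates stay on
        # the probability simplex and the update is multiplicative
        # (exponentiated gradient).
        x = np.asarray(x0, dtype=float)
        for _ in range(n_iters):
            x = x * np.exp(-eta * grad(x))   # gradient step in the dual space
            x /= x.sum()                     # Bregman projection back onto the simplex
        return x

    # Toy linear objective f(x) = <c, x> on the simplex; the minimizer
    # concentrates all mass on the smallest entry of c
    c = np.array([0.7, 0.2, 0.5])
    print(mirror_descent_simplex(lambda x: c, np.ones(3) / 3))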

A new model of cognition

Howard, N. & Hussain, A. The Fundamental Code Unit of the Brain: Towards a New Model for Cognitive Geometry, Cogn Comput (2018) 10: 426 DOI: 10.1007/s12559-017-9538-5.

This paper discusses the problems arising from the multidisciplinary nature of cognitive research and the need to conceptually unify insights from multiple fields into the phenomena that drive cognition. Specifically, the Fundamental Code Unit (FCU) is proposed as a means to better quantify the intelligent thought process at multiple levels of analysis, from the linguistic and behavioral output it produces to the chemical and physical processes within the brain that drive it. The proposed method efficiently models the most complex decision-making processes performed by the brain.

Adapting inverse reinforcement learning to account for the risk aversion of the agent

Sumeet Singh, Jonathan Lacotte, Anirudha Majumdar, and Marco Pavone, Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods, The International Journal of Robotics Research, First Published May 22, 2018, DOI: 10.1177/0278364918772017.

The literature on inverse reinforcement learning (IRL) typically assumes that humans take actions to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive (RS) IRL to explicitly account for a human’s risk sensitivity. To this end, we propose a flexible class of models based on coherent risk measures, which allow us to capture an entire spectrum of risk preferences from risk neutral to worst case. We propose efficient non-parametric algorithms based on linear programming and semi-parametric algorithms based on maximum likelihood for inferring a human’s underlying risk measure and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with 10 human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk averse to risk neutral in a data-efficient manner. Moreover, comparisons of the RS-IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively, especially in scenarios where catastrophic outcomes such as collisions can occur.
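
As a concrete member of the class of coherent risk measures used in this line of work, the conditional value at risk (CVaR) of a set of sampled costs can be estimated as below; this is only an illustration of the risk measure itself, not the paper's inference procedure:

    import numpy as np

    def cvar(costs, alpha):
        # Conditional value at risk at level alpha: the mean cost over the worst
        # (1 - alpha) fraction of outcomes. alpha = 0 recovers the plain mean
        # (risk neutral); alpha -> 1 approaches the worst case.
        costs = np.asarray(costs, dtype=float)
        var = np.quantile(costs, alpha)          # value at risk (the alpha-quantile)
        return costs[costs >= var].mean()

    # Hypothetical trajectory costs
    samples = np.random.default_rng(1).normal(1.0, 0.5, 10_000)
    print("mean:", samples.mean(), "CVaR_0.9:", cvar(samples, 0.9))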

A new mathematical formulation of manipulator motion that simplifies dynamics and kinematics

Ayusawa, K. & Yoshida, E., Comprehensive theory of differential kinematics and dynamics towards extensive motion optimization framework, The International Journal of Robotics Research, First Published May 20, 2018, DOI: 10.1177/0278364918772893.

This paper presents a novel unified theoretical framework of differential kinematics and dynamics for the optimization of complex robot motion. By introducing an 18×18 comprehensive motion transformation matrix, the forward differential kinematics and dynamics, including velocity and acceleration, can be written as a simple chain product similar to an ordinary rotational matrix. This formulation enables the analytical computation of derivatives of various physical quantities (e.g. link velocities, link accelerations, or joint torques) with respect to joint coordinates, velocities and accelerations for a robot trajectory in an efficient manner (O(NJ), where NJ is the number of the robot’s degrees of freedom), which is useful for motion optimization. Practical implementation of gradient computation is demonstrated together with simulation results of robot motion optimization to validate the effectiveness of the proposed framework.
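
The chain-product structure can be pictured with ordinary planar homogeneous transforms (a deliberately simplified stand-in; the paper's 18×18 comprehensive motion transformation matrices additionally propagate velocity and acceleration blocks along the same kind of chain):

    import numpy as np

    def joint_transform(theta, length):
        # Planar homogeneous transform of a revolute joint followed by a link of
        # the given length (a toy stand-in for the paper's richer 18x18 blocks).
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s, length * c],
                         [s,  c, length * s],
                         [0.0, 0.0, 1.0]])

    def forward_kinematics(thetas, lengths):
        # End-effector pose as a simple chain product of per-joint transforms,
        # the multiplicative structure the paper extends to differential
        # kinematics and dynamics.
        T = np.eye(3)
        for theta, length in zip(thetas, lengths):
            T = T @ joint_transform(theta, length)
        return T

    print(forward_kinematics([0.3, -0.2, 0.5], [1.0, 0.8, 0.5]))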

Using short- and long-term memories in SLAM

Labbé, M. & Michaud, F., Long-term online multi-session graph-based SPLAM with memory management, Auton Robot (2018) 42: 1133. DOI: 10.1007/s10514-017-9682-5.

For long-term simultaneous planning, localization and mapping (SPLAM), a robot should be able to continuously update its map according to the dynamic changes of the environment and the new areas explored. With limited onboard computation capabilities, a robot should also be able to limit the size of the map used for online localization and mapping. This paper addresses these challenges using a memory management mechanism, which distinguishes locations that should remain in a Working Memory (WM) for online processing from locations that should be transferred to a Long-Term Memory (LTM). When revisiting previously mapped areas that are in LTM, the mechanism can retrieve these locations and place them back in WM for online SPLAM. The approach is tested on a robot equipped with a short-range laser rangefinder and an RGB-D camera, autonomously patrolling 10.5 km of an indoor environment over 11 sessions and encountering 139 people.
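
The WM/LTM split can be pictured with a small, hypothetical data-structure sketch; the actual mechanism weighs how often and how recently locations are observed, which is only crudely approximated here by a least-recently-observed rule:

    from collections import OrderedDict

    class MapMemory:
        # Toy WM/LTM split: WM keeps at most `capacity` locations for online
        # SLAM; the least recently observed location is transferred to LTM and
        # is retrieved back into WM when its place is revisited.
        def __init__(self, capacity):
            self.capacity = capacity
            self.wm = OrderedDict()   # location id -> map data, ordered by last observation
            self.ltm = {}

        def observe(self, loc_id, data=None):
            if loc_id in self.ltm:                      # revisit: retrieve from LTM
                self.wm[loc_id] = self.ltm.pop(loc_id)
            elif loc_id not in self.wm:
                self.wm[loc_id] = data
            self.wm.move_to_end(loc_id)                 # mark as most recently observed
            while len(self.wm) > self.capacity:         # transfer the oldest locations to LTM
                old_id, old_data = self.wm.popitem(last=False)
                self.ltm[old_id] = old_data

    memory = MapMemory(capacity=3)
    for loc in [1, 2, 3, 4, 2, 1]:   # location 1 is pushed to LTM, then retrieved on revisit
        memory.observe(loc)
    print("WM:", list(memory.wm), "LTM:", list(memory.ltm))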

A novel approach to avoid the local minima problem in potential field navigation

Fedele, G., D’Alfonso, L., Chiaravalloti, F. et al., Obstacles Avoidance Based on Switching Potential Functions, J Intell Robot Syst (2018) 90: 387. DOI: 10.1007/s10846-017-0687-2.

In this paper, a novel path planning and obstacle avoidance method for a mobile robot is proposed. This method makes use of a switching strategy between the attractive potential of the target and a new helicoidal potential field which allows the robot to bypass an obstacle by driving around it. The new technique aims at overcoming the local minima problems of the well-known artificial potentials method, caused by the summation of two (or more) potential fields. In fact, in the proposed approach, only a single potential is used at a time. The resulting technique uses only local information and ensures high robustness, in terms of achieved performance and computational complexity, with respect to the number of obstacles. Numerical simulations, together with comparisons with existing methods, confirm the very robust behavior of the method, even in scenarios with multiple obstacles.
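
A minimal sketch of the switching idea (simplified with respect to the paper's helicoidal field and switching conditions): follow the attractive field towards the target, and when close to an obstacle switch to a field that circulates around it instead of summing the two potentials, so only one potential is active at a time.

    import numpy as np

    def velocity_command(pos, target, obstacle, switch_radius=1.0, gain=1.0):
        # Switching strategy (illustrative): pure attraction towards the target,
        # or a circulating field around the obstacle when within switch_radius.
        # Only one potential is active at a time, so no summed-field minima appear.
        to_obstacle = pos - obstacle
        if np.linalg.norm(to_obstacle) < switch_radius:
            tangent = np.array([-to_obstacle[1], to_obstacle[0]])   # rotate the radial direction by 90 degrees
            return gain * tangent / np.linalg.norm(tangent)
        to_target = target - pos
        return gain * to_target / np.linalg.norm(to_target)

    pos = np.array([0.0, 0.0])
    target, obstacle = np.array([5.0, 0.0]), np.array([2.5, 0.2])
    for _ in range(300):                      # simple Euler integration of the commands
        if np.linalg.norm(target - pos) < 0.1:
            break
        pos = pos + 0.05 * velocity_command(pos, target, obstacle)
    print("final position:", pos)

Because the robot never follows a sum of attractive and repulsive gradients, it cannot get trapped at a point where the two cancel out, which is exactly the failure mode the switching strategy is designed to avoid.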