Category Archives: Robotics

Improving the simulation-to-real transfer of learning robotic skills by learning smaller skills and how to connect them in reality

Julian RC, Heiden E, He Z, et al., Scaling simulation-to-real transfer by learning a latent space of robot skills, . The International Journal of Robotics Research. 2020;39(10-11):1259-1278 DOI: 10.1177/0278364920944474.

We present a strategy for simulation-to-real transfer, which builds on recent advances in robot skill decomposition. Rather than focusing on minimizing the simulation–reality gap, we propose a method for increasing the sample efficiency and robustness of existing simulation-to-real approaches which exploits hierarchy and online adaptation. Instead of learning a unique policy for each desired robotic task, we learn a diverse set of skills and their variations, and embed those skill variations in a continuously parameterized space. We then interpolate, search, and plan in this space to find a transferable policy which solves more complex, high-level tasks by combining low-level skills and their variations. In this work, we first characterize the behavior of this learned skill space, by experimenting with several techniques for composing pre-learned latent skills. We then discuss an algorithm which allows our method to perform long-horizon tasks never seen in simulation, by intelligently sequencing short-horizon latent skills. Our algorithm adapts to unseen tasks online by repeatedly choosing new skills from the latent space, using live sensor data and simulation to predict which latent skill will perform best next in the real world. Importantly, our method learns to control a real robot in joint-space to achieve these high-level tasks with little or no on-robot time, despite the fact that the low-level policies may not be perfectly transferable from simulation to real, and that the low-level skills were not trained on any examples of high-level tasks. In addition to our results indicating a lower sample complexity for families of tasks, we believe that our method provides a promising template for combining learning-based methods with proven classical robotics algorithms such as model-predictive control.

Combination of RL with human provided models for navigation

Amarildo Likmeta, Alberto Maria Metelli, Andrea Tirinzoni, Riccardo Giol, Marcello Restelli, Danilo Romano, Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving, . Robotics and Autonomous Systems, Volume 131, 2020 DOI: 10.1016/j.robot.2020.103568.

The design of high-level decision-making systems is a topical problem in the field of autonomous driving. In this paper, we combine traditional rule-based strategies and reinforcement learning (RL) with the goal of achieving transparency and robustness. On the one hand, the use of handcrafted rule-based controllers allows for transparency, i.e., it is always possible to determine why a given decision was made, but they struggle to scale to complex driving scenarios, in which several objectives need to be considered. On the other hand, black-box RL approaches enable us to deal with more complex scenarios, but they are usually hardly interpretable. In this paper, we combine the best properties of these two worlds by designing parametric rule-based controllers, in which interpretable rules can be provided by domain experts and their parameters are learned via RL. After illustrating how to apply parameter-based RL methods (PGPE) to this setting, we present extensive numerical simulations in the highway and in two urban scenarios: intersection and roundabout. For each scenario, we show the formalization as an RL problem and we discuss the results of our approach in comparison with handcrafted rule-based controllers and black-box RL techniques.

Fast and more exact triangulation method for robot localization using range measurements

Pınar Oğuz-Ekim, Lambiotte R., Lefebvre E., TDOA based localization and its application to the initialization of LiDAR based autonomous robots, . Robotics and Autonomous Systems, Volume 131, 2020, DOI: 10.1016/j.robot.2020.103590.

This work considers the problem of locating a single robot given a set of squared noisy range difference measurements to a set of points (anchors) whose positions are known. In the sequel, localization problem is solved in the Least-Squares (LS) sense by writing the robot position in polar/spherical coordinates. This representation transforms the original nonconvex/multimodal cost function into the quotient of two quadratic forms, whose constrained maximization is more tractable than the original problem. Simulation results indicate that the proposed method has similar accuracy to state-of-the-art optimization-based localization algorithms in its class, and the simple algorithmic structure and computational efficiency makes it appealing for applications with strong computational constraints. Additionally, location information is used to find the initial orientation of the robot with respect to the previously obtained map in scan matching. Thus, the crucial problem of the autonomous initialization and localization in robotics is solved.

Towards the emergence of obstacle avoidance through collisions

Qian F, Koditschek DE., An obstacle disturbance selection framework: emergent robot steady states under repeated collisions, The International Journal of Robotics Research. 2020;39(13):1549-1566, DOI: 10.1177/0278364920935514.

Natural environments are often filled with obstacles and disturbances. Traditional navigation and planning approaches normally depend on finding a traversable “free space” for robots to avoid unexpected contact or collision. We hypothesize that with a better understanding of the robot–obstacle interactions, these collisions and disturbances can be exploited as opportunities to improve robot locomotion in complex environments. In this article, we propose a novel obstacle disturbance selection (ODS) framework with the aim of allowing robots to actively select disturbances to achieve environment-aided locomotion. Using an empirically characterized relationship between leg–obstacle contact position and robot trajectory deviation, we simplify the representation of the obstacle-filled physical environment to a horizontal-plane disturbance force field. We then treat each robot leg as a “disturbance force selector” for prediction of obstacle-modulated robot dynamics. Combining the two representations provides analytical insights into the effects of gaits on legged traversal in cluttered environments. We illustrate the predictive power of the ODS framework by studying the horizontal-plane dynamics of a quadrupedal robot traversing an array of evenly-spaced cylindrical obstacles with both bounding and trotting gaits. Experiments corroborate numerical simulations that reveal the emergence of a stable equilibrium orientation in the face of repeated obstacle disturbances. The ODS reduction yields closed-form analytical predictions of the equilibrium position for different robot body aspect ratios, gait patterns, and obstacle spacings. We conclude with speculative remarks bearing on the prospects for novel ODS-based gait control schemes for shaping robot navigation in perturbation-rich environments.

A new contribution along the DESPOT line focused on hybrid CPU+GPU platforms

Cai P, Luo Y, Hsu D, Lee WS., HyP-DESPOT: A hybrid parallel algorithm for online planning under uncertainty, The International Journal of Robotics Research. 2021;40(2-3):558-573, DOI: 10.1177/0278364920937074.

Robust planning under uncertainty is critical for robots in uncertain, dynamic environments, but incurs high computational cost. State-of-the-art online search algorithms, such as DESPOT, have vastly improved the computational efficiency of planning under uncertainty and made it a valuable tool for robotics in practice. This work takes one step further by leveraging both CPU and GPU parallelization in order to achieve real-time online planning performance for complex tasks with large state, action, and observation spaces. Specifically, Hybrid Parallel DESPOT (HyP-DESPOT) is a massively parallel online planning algorithm that integrates CPU and GPU parallelism in a multi-level scheme. It performs parallel DESPOT tree search by simultaneously traversing multiple independent paths using multi-core CPUs; it performs parallel Monte Carlo simulations at the leaf nodes of the search tree using GPUs. HyP-DESPOT provably converges in finite time under moderate conditions and guarantees near-optimality of the solution. Experimental results show that HyP-DESPOT speeds up online planning by up to a factor of several hundred in several challenging robotic tasks in simulation, compared with the original DESPOT algorithm. It also exhibits real-time performance on a robot vehicle navigating among many pedestrians.

Hybrid Monte Carlo + Interval-based localization

Weiss, R., Glösekötter, P., Prestes, E. et al., Hybridisation of Sequential Monte Carlo Simulation with Non-linear Bounded-error State Estimation Applied to Global Localisation of Mobile Robots, J Intell Robot Syst 99, 335–357 (2020) DOI: 10.1007/s10846-019-01118-7.

Accurate self-localisation is a fundamental ability of any mobile robot. In Monte Carlo localisation, a probability distribution over a space of possible hypotheses accommodates the inherent uncertainty in the position estimate, whereas bounded-error localisation provides a region that is guaranteed to contain the robot. However, this guarantee is accompanied by a constant probability over the confined region and therefore the information yield may not be sufficient for certain practical applications. Four hybrid localisation algorithms are proposed, combining probabilistic filtering with non-linear bounded-error state estimation based on interval analysis. A forward-backward contractor and the Set Inverter via Interval Analysis are hybridised with a bootstrap filter and an unscented particle filter, respectively. The four algorithms are applied to global localisation of an underwater robot, using simulated distance measurements to distinguishable landmarks. As opposed to previous hybrid methods found in the literature, the bounded-error state estimate is not maintained throughout the whole estimation process. Instead, it is only computed once in the beginning, when solving the wake-up robot problem, and after kidnapping of the robot, which drastically reduces the computational cost when compared to the existing algorithms. It is shown that the novel algorithms can solve the wake-up robot problem as well as the kidnapped robot problem more accurately than the two conventional probabilistic filters.

Simultaneous localization, mapping and semantic labelling in mobile robots

Taniguchi, Akira, Hagiwara, Yoshinobu, Taniguchi, Tadahiro, Inamura, Tetsunari, Improved and scalable online learning of spatial concepts and language models with mapping, Autonomous Robots 44(6), DOI: 10.1007/s10514-020-09905-0.

We propose a novel online learning algorithm, called SpCoSLAM 2.0, for spatial concepts and lexical acquisition with high accuracy and scalability. Previously, we proposed SpCoSLAM as an online learning algorithm based on unsupervised Bayesian probabilistic model that integrates multimodal place categorization, lexical acquisition, and SLAM. However, our original algorithm had limited estimation accuracy owing to the influence of the early stages of learning, and increased computational complexity with added training data. Therefore, we introduce techniques such as fixed-lag rejuvenation to reduce the calculation time while maintaining an accuracy higher than that of the original algorithm. The results show that, in terms of estimation accuracy, the proposed algorithm exceeds the original algorithm and is comparable to batch learning. In addition, the calculation time of the proposed algorithm does not depend on the amount of training data and becomes constant for each step of the scalable algorithm. Our approach will contribute to the realization of long-term spatial language interactions between humans and robots.

Bayesian estimation of the model in model-based RL for robots

Senda, Kei, Hishinuma, Toru, Tani, Yurika, Approximate Bayesian reinforcement learning based on estimation of plant, Autonomous Robots 44(5), DOI: 10.1007/s10514-020-09901-4.

This study proposes an approximate parametric model-based Bayesian reinforcement learning approach for robots, based on online Bayesian estimation and online planning for an estimated model. The proposed approach is designed to learn a robotic task with a few real-world samples and to be robust against model uncertainty, within feasible computational resources. The proposed approach employs two-stage modeling, which is composed of (1) a parametric differential equation model with a few parameters based on prior knowledge such as equations of motion, and (2) a parametric model that interpolates a finite number of transition probability models for online estimation and planning. The proposed approach modifies the online Bayesian estimation to be robust against approximation errors of the parametric model to a real plant. The policy planned for the interpolating model is proven to have a form of theoretical robustness. Numerical simulation and hardware experiments of a planar peg-in-hole task demonstrate the effectiveness of the proposed approach.

Adapting the resolution of depth sensors and the location of the high-resolution area (fovea) as a possible attention mechanism in robots

Tasneem Z, Adhivarahan C, Wang D, Xie H, Dantu K, Koppal SJ., Adaptive fovea for scanning depth sensors, The International Journal of Robotics Research. 2020;39(7):837-855, DOI: 10.1177/0278364920920931.

Depth sensors have been used extensively for perception in robotics. Typically these sensors have a fixed angular resolution and field of view (FOV). This is in contrast to human perception, which involves foveating: scanning with the eyes’ highest angular resolution over regions of interest (ROIs). We build a scanning depth sensor that can control its angular resolution over the FOV. This opens up new directions for robotics research, because many algorithms in localization, mapping, exploration, and manipulation make implicit assumptions about the fixed resolution of a depth sensor, impacting latency, energy efficiency, and accuracy. Our algorithms increase resolution in ROIs either through deconvolutions or intelligent sample distribution across the FOV. The areas of high resolution in the sensor FOV act as artificial fovea and we adaptively vary the fovea locations to maximize a well-known information theoretic measure. We demonstrate novel applications such as adaptive time-of-flight (TOF) sensing, LiDAR zoom, gradient-based LiDAR sensing, and energy-efficient LiDAR scanning. As a proof of concept, we mount the sensor on a ground robot platform, showing how to reduce robot motion to obtain a desired scanning resolution. We also present a ROS wrapper for active simulation for our novel sensor in Gazebo. Finally, we provide extensive empirical analysis of all our algorithms, demonstrating trade-offs between time, resolution and stand-off distance.

Path planning by merging random sampling (RRT) with informed heuristics (A*)

Jonathan D Gammell, Timothy D Barfoot, Siddhartha S Srinivasa, Batch Informed Trees (BIT*): Informed asymptotically optimal anytime search, The International Journal of Robotics Research. 2020;39(5):543-567, DOI: 10.1177/0278364919890396.

Path planning in robotics often requires finding high-quality solutions to continuously valued and/or high-dimensional problems. These problems are challenging and most planning algorithms instead solve simplified approximations. Popular approximations include graphs and random samples, as used by informed graph-based searches and anytime sampling-based planners, respectively.

Informed graph-based searches, such as A*, traditionally use heuristics to search a priori graphs in order of potential solution quality. This makes their search efficient, but leaves their performance dependent on the chosen approximation. If the resolution of the chosen approximation is too low, then they may not find a (suitable) solution, but if it is too high, then they may take a prohibitively long time to do so.

Anytime sampling-based planners, such as RRT*, traditionally use random sampling to approximate the problem domain incrementally. This allows them to increase resolution until a suitable solution is found, but makes their search dependent on the order of approximation. Arbitrary sequences of random samples approximate the problem domain in every direction simultaneously, but may be prohibitively inefficient at containing a solution.

This article unifies and extends these two approaches to develop Batch Informed Trees (BIT*), an informed, anytime sampling-based planner. BIT* solves continuous path planning problems efficiently by using sampling and heuristics to alternately approximate and search the problem domain. Its search is ordered by potential solution quality, as in A*, and its approximation improves indefinitely with additional computational time, as in RRT*. It is shown analytically to be almost-surely asymptotically optimal and experimentally to outperform existing sampling-based planners, especially on high-dimensional planning problems.