Category Archives: Robotics

Survey on POMDPs for robotics

M. Lauri, D. Hsu and J. Pajarinen, Partially Observable Markov Decision Processes in Robotics: A Survey, IEEE Transactions on Robotics, vol. 39, no. 1, pp. 21-40, Feb. 2023 DOI: 10.1109/TRO.2022.3200138.

Noisy sensing, imperfect control, and environment changes are defining characteristics of many real-world robot tasks. The partially observable Markov decision process (POMDP) provides a principled mathematical framework for modeling and solving robot decision and control tasks under uncertainty. Over the last decade, it has seen many successful applications, spanning localization and navigation, search and tracking, autonomous driving, multirobot systems, manipulation, and human–robot interaction. This survey aims to bridge the gap between the development of POMDP models and algorithms at one end and their application to diverse robot decision tasks at the other. It analyzes the characteristics of these tasks and connects them with the mathematical and algorithmic properties of the POMDP framework for effective modeling and solution. For practitioners, the survey identifies the key task characteristics that determine when and how to apply POMDPs to robot tasks successfully. For POMDP algorithm designers, it provides new insights into the unique challenges of applying POMDPs to robot systems and points to promising new directions for further research.
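
For readers new to the framework, the central object the survey builds on is the belief, a probability distribution over states updated after every action and observation. Below is a minimal sketch of the discrete belief update in Python; the toy transition and observation models are illustrative values, not taken from the paper.

```python
# Minimal discrete POMDP belief update (Bayes filter).
# T[a, s, s'] = P(s' | s, a); Z[a, s', o] = P(o | s', a).
import numpy as np

def belief_update(b, a, o, T, Z):
    """b'(s') is proportional to Z[a, s', o] * sum_s T[a, s, s'] * b(s)."""
    predicted = T[a].T @ b          # predict step: propagate belief through T
    b_new = Z[a][:, o] * predicted  # correct step: weight by observation likelihood
    return b_new / b_new.sum()      # normalize

# Toy two-state, one-action, two-observation model (illustrative numbers).
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]]])
print(belief_update(np.array([0.5, 0.5]), a=0, o=1, T=T, Z=Z))
```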

Review of RL applied to robotic manipulation

Íñigo Elguea-Aguinaco, Antonio Serrano-Muñoz, Dimitrios Chrysostomou, Ibai Inziarte-Hidalgo, Simon Bøgh, Nestor Arana-Arexolaleiba, A review on reinforcement learning for contact-rich robotic manipulation tasks, Robotics and Computer-Integrated Manufacturing, Volume 81, 2023 DOI: 10.1016/j.rcim.2022.102517.

Research on and application of reinforcement learning in robotics for contact-rich manipulation tasks have exploded in recent years. Its ability to cope with unstructured environments and accomplish hard-to-engineer behaviors has led reinforcement learning agents to be increasingly applied in real-life scenarios. However, reinforcement learning still has a long way to go before becoming a core element of industrial applications. This paper examines the landscape of reinforcement learning and reviews advances in its application to contact-rich tasks from 2017 to the present. The analysis covers the main research on the tasks most commonly used to test reinforcement learning algorithms, in both rigid and deformable object manipulation. Additionally, the trends around reinforcement learning with serial manipulators are explored, as are the technological challenges this machine learning control technique currently presents. Lastly, based on the state of the art and the commonalities among the studies, a framework relating the main concepts of reinforcement learning in contact-rich manipulation tasks is proposed. The final goal of this review is to support the robotics community in the future development of systems commanded by reinforcement learning, discuss the main challenges of this technology, and suggest future research directions in the domain.

Mapping unseen rooms by deducing them from known environment structure

Matteo Luperto, Federico Amadelli, Moreno Di Berardino, Francesco Amigoni, Mapping beyond what you can see: Predicting the layout of rooms behind closed doors, Robotics and Autonomous Systems, Volume 159, 2023 DOI: 10.1016/j.robot.2022.104282.

The availability of maps of indoor environments is often fundamental for autonomous mobile robots to efficiently operate in industrial, office, and domestic applications. When robots build such maps, some areas of interest could be inaccessible, for instance, due to closed doors. As a consequence, these areas are not represented in the maps, possibly causing limitations in robot localization and navigation. In this paper, we provide a method that completes 2D grid maps by adding the predicted layout of the rooms behind closed doors. The main idea of our approach is to exploit the underlying geometrical structure of indoor environments to estimate the shape of unobserved rooms. Results show that our method is accurate in completing maps even when large portions of environments cannot be accessed by the robot during map building. We experimentally validate the quality of the completed maps by using them to perform path planning tasks.
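
The paper's layout-prediction machinery is richer than anything that fits here, but the core idea, hypothesizing a plausible room behind a closed door from geometry observed elsewhere in the map, can be sketched on an occupancy grid. The rectangular-room heuristic and all names below are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

UNKNOWN, FREE, WALL = 0, 1, 2

def predict_room(grid, door_rc, direction, observed_sizes):
    """Mark a hypothesized rectangular room behind a closed door as FREE,
    sized by the median (depth, width) of rooms observed elsewhere."""
    depth, width = np.median(observed_sizes, axis=0).astype(int)
    (r0, c0), (dr, dc) = door_rc, direction   # direction points into the unknown
    lr, lc = dc, dr                           # lateral axis, perpendicular to it
    for d in range(1, depth + 1):
        for w in range(-(width // 2), width // 2 + 1):
            r, c = r0 + d * dr + w * lr, c0 + d * dc + w * lc
            if 0 <= r < grid.shape[0] and 0 <= c < grid.shape[1] \
                    and grid[r, c] == UNKNOWN:
                grid[r, c] = FREE             # fill only never-observed cells
    return grid

grid = np.zeros((20, 20), dtype=int)          # toy map, all unknown
predict_room(grid, door_rc=(5, 10), direction=(1, 0),
             observed_sizes=[(6, 8), (5, 7)])
```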

Including safety learning in RL for improving the sim-to-lab gap

Kai-Chieh Hsu, Allen Z. Ren, Duy P. Nguyen, Anirudha Majumdar, Jaime F. Fisac, Sim-to-Lab-to-Real: Safe reinforcement learning with shielding and generalization guarantees, Artificial Intelligence, Volume 314, 2023 DOI: 10.1016/j.artint.2022.103811.

Safety is a critical component of autonomous systems and remains a challenge for deploying learning-based policies in the real world. In particular, policies learned using reinforcement learning often fail to generalize to novel environments due to unsafe behavior. In this paper, we propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution. To improve safety, we apply a dual policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the Safety Bellman Equation based on Hamilton-Jacobi (HJ) reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. Additionally, inheriting from the HJ reachability analysis, the bound accounts for the expectation over the worst-case safety in each environment. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments with varying degrees of photorealism. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot. See https://sites.google.com/princeton.edu/sim-to-lab-to-real for supplementary material.
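
The shielding idea in the Sim-to-Lab stage is easy to state in code: the performance policy acts unless the safety critic predicts the proposed action leads toward the failure set, in which case the backup policy takes over. The sketch below is a generic supervisory shield; the sign convention on the critic and all names are assumptions, not the paper's API.

```python
def shielded_action(state, perf_policy, safety_policy, safety_critic,
                    threshold=0.0):
    """Supervisory shielding with a dual policy setup.
    Convention assumed here: safety_critic(s, a) > threshold means the
    learned HJ value flags (s, a) as leading toward the failure set."""
    action = perf_policy(state)
    if safety_critic(state, action) > threshold:
        action = safety_policy(state)   # backup policy steers back to safety
    return action
```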

An RRT-based method for combined task and motion planning

Riccardo Caccavale, Alberto Finzi, A rapidly-exploring random trees approach to combined task and motion planning, Robotics and Autonomous Systems, Volume 157, 2022 DOI: 10.1016/j.robot.2022.104238.

Task and motion planning in robotics are typically addressed by separate, though tightly intertwined, methods. Task planners generate abstract high-level actions to be executed, while motion planners provide the associated discrete movements in the configuration space satisfying kinodynamic constraints. However, these two planning processes are strictly dependent, so the problem of combining task and motion planning in a uniform approach is very relevant. In this work, we tackle this issue by proposing an RRT-based method for combined task and motion planning. Our approach relies on a combined metric space where both the symbolic (task) and sub-symbolic (motion) spaces are represented. The associated notion of distance is then exploited by an RRT-based planner to generate a plan that includes both symbolic actions and feasible movements in the configuration space. The proposed method is assessed in several case studies from a real-world hospital logistics scenario, where an omnidirectional mobile robot is involved in navigation and transportation tasks.
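
One way to picture the combined metric space is as pairs (symbolic state, configuration) with a weighted distance over both parts, which the RRT then extends exactly as it would in a plain configuration space. The symbolic metric below (size of the symmetric difference between predicate sets) and the weights are illustrative guesses, not the paper's definitions.

```python
def combined_distance(node_a, node_b, w_task=1.0, w_motion=1.0):
    """Distance between nodes of the form (predicate set, configuration)."""
    (sym_a, q_a), (sym_b, q_b) = node_a, node_b
    d_task = len(sym_a ^ sym_b)      # symbolic part: count differing predicates
    d_motion = sum((x - y) ** 2 for x, y in zip(q_a, q_b)) ** 0.5
    return w_task * d_task + w_motion * d_motion

# Example: two nodes that differ in one predicate and by 1 m in position.
a = ({"holding(box)"}, (0.0, 0.0, 0.0))
b = (set(),            (1.0, 0.0, 0.0))
print(combined_distance(a, b))   # 1 (symbolic) + 1.0 (Euclidean) = 2.0
```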

Using results from belief-based planning for Bayesian inference in robotics

Farhi, E.I., Indelman, V., Bayesian incremental inference update by re-using calculations from belief space planning: a new paradigm, Auton Robot 46, 783–816 (2022). DOI: 10.1007/s10514-022-10045-w.

Inference and decision making under uncertainty are key processes in every autonomous system and numerous robotic problems. In recent years, the similarities between inference and decision making triggered much work, from developing unified computational frameworks to pondering the duality between the two. In spite of these efforts, inference and control, as well as inference and belief space planning (BSP), are still treated as two separate processes. In this paper we propose a paradigm shift, a novel approach which deviates from conventional Bayesian inference and utilizes the similarities between inference and BSP. We make the key observation that inference can be efficiently updated using predictions made during the decision making stage, even in light of inconsistent data association between the two. We developed a two-stage process that implements our novel approach and updates inference using calculations from the precursory planning phase. Using autonomous navigation in an unknown environment along with iSAM2 efficient methodologies as a test case, we benchmarked our novel approach against standard Bayesian inference, both with synthetic and real-world data (KITTI dataset). Results indicate that our approach not only improves running time by at least a factor of two while providing the same estimation accuracy, but also alleviates the computational burden of state dimensionality and loop closures.
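
The paper works with factor graphs and iSAM2, but the reuse idea survives a drastic simplification: during planning, the propagated belief for each candidate action has already been computed, so once an action is executed, inference can start from the cached prediction and perform only the measurement update. A toy 1-D Gaussian sketch, with hypothetical noise models, is below.

```python
# Toy 1-D Gaussian illustration of reusing belief-space-planning results
# for inference. Motion/measurement noise q, r are illustrative values.
def predict(mu, var, action, q=0.1):
    return mu + action, var + q            # motion model: x' = x + a + noise

def update(mu, var, z, r=0.2):
    k = var / (var + r)                    # Kalman gain
    return mu + k * (z - mu), (1 - k) * var

belief = (0.0, 1.0)
cache = {a: predict(*belief, a) for a in (-1.0, 0.0, 1.0)}  # done during BSP
chosen, z = 1.0, 1.3                       # executed action, real measurement
belief = update(*cache[chosen], z)         # reuse cached prediction, update only
print(belief)
```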

Real-time Bayesian ICP for mobile robot localization and mapping

Maken FA, Ramos F, Ott L., Bayesian iterative closest point for mobile robot localization, The International Journal of Robotics Research. 2022;41(9-10):851-874 DOI: 10.1177/02783649221101417.

Accurate localization of a robot in a known environment is a fundamental capability for successfully performing path planning, manipulation, and grasping tasks. Particle filters, also known as Monte Carlo localization (MCL), are a commonly used method to determine the robot’s pose within its environment. For ground robots, noisy wheel odometry readings are typically used as a motion model to predict the vehicle’s location. Such a motion model requires tuning of various parameters based on terrain and robot type. However, such an ego-motion estimation is not always available for all platforms. Scan matching using the iterative closest point (ICP) algorithm is a popular alternative approach, providing ego-motion estimates for localization. Iterative closest point computes a point estimate of the transformation between two poses given point clouds captured at these locations. Being a point estimate method, ICP does not deal with the uncertainties in the scan alignment process, which may arise due to sensor noise, partial overlap, or the existence of multiple solutions. Another challenge for ICP is the high computational cost required to align two large point clouds, limiting its applicability to less dynamic problems. In this paper, we address these challenges by leveraging recent advances in probabilistic inference. Specifically, we first address the run-time issue and propose SGD-ICP, which employs stochastic gradient descent (SGD) to solve the optimization problem of ICP. Next, we leverage SGD-ICP to obtain a distribution over transformations and propose a Markov Chain Monte Carlo method using stochastic gradient Langevin dynamics (SGLD) updates. Our ICP variant, termed Bayesian-ICP, is a full Bayesian solution to the problem. To demonstrate the benefits of Bayesian-ICP for mobile robotic applications, we propose an adaptive motion model employing Bayesian-ICP to produce proposal distributions for Monte Carlo Localization. Experiments using both Kinect and 3D LiDAR data show that our proposed SGD-ICP method achieves the same solution quality as standard ICP while being significantly more efficient. We then demonstrate empirically that Bayesian-ICP can produce accurate distributions over pose transformations and is fast enough for online applications. Finally, using Bayesian-ICP as a motion model alleviates the need to tune the motion model parameters from odometry, resulting in better-calibrated localization uncertainty.
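
To make the two building blocks concrete, here is a 2-D toy version of SGD-ICP, with the SGLD noise injection that turns the iterates into approximate posterior samples (the essence of Bayesian-ICP). Batch size, step size, and noise scale are illustrative choices; the real method operates on 3-D clouds with more careful preconditioning.

```python
import numpy as np

def transform(theta, P):
    """Apply the rigid 2-D transform theta = (tx, ty, phi) to points P."""
    tx, ty, phi = theta
    R = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    return P @ R.T + np.array([tx, ty])

def sgd_icp(P, Q, lr=0.05, iters=400, batch=32, langevin=False, seed=0):
    """Minimize the ICP cost 0.5 * ||R p + t - q||^2 by SGD on mini-batches;
    with langevin=True, add SGLD noise so the iterates explore a posterior."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(3)
    for _ in range(iters):
        idx = rng.choice(len(P), size=min(batch, len(P)), replace=False)
        Pi = transform(theta, P[idx])
        # brute-force nearest neighbours in the target cloud
        nn = Q[np.argmin(((Pi[:, None] - Q[None]) ** 2).sum(-1), axis=1)]
        r = Pi - nn                                      # residuals
        phi = theta[2]
        dR = np.array([[-np.sin(phi), -np.cos(phi)],
                       [ np.cos(phi), -np.sin(phi)]])    # dR/dphi
        grad = np.array([r[:, 0].mean(), r[:, 1].mean(),
                         (r * (P[idx] @ dR.T)).sum(-1).mean()])
        theta = theta - lr * grad
        if langevin:
            theta = theta + rng.normal(scale=np.sqrt(lr), size=3)  # SGLD noise
    return theta

rng = np.random.default_rng(1)
Q = rng.uniform(-1.0, 1.0, size=(200, 2))        # target cloud
P = transform(np.array([-0.1, 0.2, 0.1]), Q)     # source = transformed target
print(sgd_icp(P, Q))  # should approximately invert the applied transform
```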

Adaptation of industrial robots to variations in tasks through RL

Tian Yu, Qing Chang, User-guided motion planning with reinforcement learning for human-robot collaboration in smart manufacturing, Expert Systems with Applications, Volume 209, 2022 DOI: 10.1016/j.eswa.2022.118291.

In today’s manufacturing systems, robots are expected to perform increasingly complex manipulation tasks in collaboration with humans. However, current industrial robots are still largely preprogrammed, with very little autonomy, and still need to be reprogrammed by robotics experts for even slightly changed tasks. It is therefore highly desirable that robots can adapt to certain task changes with motion planning strategies, so that they can easily work with non-robotics experts in manufacturing environments. In this paper, we propose a user-guided motion planning algorithm combined with a reinforcement learning (RL) method to enable robots to automatically generate motion plans for new tasks by learning from a few kinesthetic human demonstrations. Features of commonly demonstrated tasks in a specific application environment, e.g., desk assembly or warehouse loading/unloading, are abstracted and saved in a library. A definition of semantic similarity between features in the library and features of a new task is proposed and used to construct the reward function in RL. To achieve an adaptive motion plan in the face of task changes or new task requirements, features embedded in the library are mapped to appropriate task segments based on the motion planning policy trained with Q-learning. A new task can either be learned as a combination of a few features in the library or trigger a request for further human demonstration if the current library is insufficient for the new task. We evaluate our approach on a 6-DOF UR5e robot over multiple tasks and scenarios and show the effectiveness of our method across different scenarios.
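
The reward construction is the distinctive step: each assignment of a library feature to a task segment is scored by semantic similarity, and tabular Q-learning then learns which feature to map to which segment. A compact sketch follows, with cosine similarity standing in for the paper's similarity definition and all names being illustrative.

```python
import numpy as np

def cosine_sim(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def assign_features(segments, library, episodes=500,
                    alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    """Q-learning over (segment, library-feature) pairs; the reward is the
    semantic similarity between segment and feature vectors."""
    rng = np.random.default_rng(seed)
    Q = np.zeros((len(segments), len(library)))
    for _ in range(episodes):
        for s in range(len(segments)):           # walk through the task
            a = (rng.integers(len(library)) if rng.random() < eps
                 else int(Q[s].argmax()))
            r = cosine_sim(segments[s], library[a])
            nxt = Q[s + 1].max() if s + 1 < len(segments) else 0.0
            Q[s, a] += alpha * (r + gamma * nxt - Q[s, a])
    return Q.argmax(axis=1)                      # chosen feature per segment

segments = [np.array([1.0, 0.0]), np.array([0.2, 1.0])]   # new task (toy)
library  = [np.array([0.9, 0.1]), np.array([0.0, 1.0])]   # demonstrated features
print(assign_features(segments, library))        # expect [0 1]
```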

On the extensive use of RL for navigation in UAVs

Fadi AlMahamid, Katarina Grolinger, Autonomous Unmanned Aerial Vehicle navigation using Reinforcement Learning: A systematic review, Engineering Applications of Artificial Intelligence, Volume 115, 2022 DOI: 10.1016/j.engappai.2022.105321.

There is an increasing demand for using Unmanned Aerial Vehicles (UAVs), known as drones, in different applications such as package delivery, traffic monitoring, search and rescue operations, and military combat engagements. In all of these applications, the UAV is used to navigate the environment autonomously, without human interaction, perform specific tasks, and avoid obstacles. Autonomous UAV navigation is commonly accomplished using Reinforcement Learning (RL), where agents act as experts in a domain to navigate the environment while avoiding obstacles. Understanding the navigation environment and algorithmic limitations plays an essential role in choosing the appropriate RL algorithm to solve the navigation problem effectively. Consequently, this study first identifies the main UAV navigation tasks and discusses navigation frameworks and simulation software. Next, RL algorithms are classified and discussed based on the environment, algorithm characteristics, abilities, and applications to different UAV navigation problems, which will help practitioners and researchers select the appropriate RL algorithms for their UAV navigation use cases. Moreover, the identified gaps and opportunities will drive future UAV navigation research.

Hierarchical RL with diverse methods integrated in the framework

Ye Zhou, Hann Woei Ho, Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning, Engineering Applications of Artificial Intelligence, Volume 114, 2022 DOI: 10.1016/j.engappai.2022.105152.

Hierarchical Reinforcement Learning (HRL) provides an option for solving complex guidance and navigation problems with high-dimensional spaces, multiple objectives, and a large number of states and actions. Current HRL methods often use the same or similar reinforcement learning methods within one application so that multiple objectives can be easily combined. Since no single learning method can benefit all targets, hybrid Hierarchical Reinforcement Learning (hHRL) was proposed, using various methods to optimize learning with different types of information and objectives in one application. The previous hHRL method, however, requires manual task-specific designs, which involve engineers’ preferences and may impede transfer learning. This paper therefore proposes a systematic online guidance and navigation method under the hHRL framework, which generalizes training samples with a function approximator, decomposes the state space automatically, and thus does not require task-specific designs. The simulation results indicate that the proposed method is superior to the previous hHRL method, which requires manual decomposition, in terms of convergence rate and the learnt policy. It is also shown that the method is generally applicable to non-stationary environments changing over episodes and over time, without loss of efficiency even with noisy state information.
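
The "hybrid" in hHRL means the levels need not share a learning method. A minimal two-level sketch: a tabular Q-learner picks subgoals, while the low level is a different learner (semi-gradient TD(0) with a linear approximator) that drives toward them. The environment, the abstraction, and both learners here are illustrative assumptions, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(0)
GOAL, N = 9, 10                       # 1-D corridor of N cells, goal at cell 9
subgoals = [3, 6, 9]                  # a high-level action selects a subgoal

Q_hi = np.zeros((N, len(subgoals)))   # high level: tabular Q-learning
w_lo = np.zeros(2)                    # low level: linear TD(0) value weights

def low_level(s, g, steps=6):
    """A different learner at the bottom: walk toward subgoal g while fitting
    a linear value function of the features (distance-to-subgoal, bias)."""
    global w_lo
    for _ in range(steps):
        if s == g:
            break
        nxt = s + (1 if g > s else -1)
        f, f_n = np.array([abs(g - s), 1.0]), np.array([abs(g - nxt), 1.0])
        w_lo += 0.01 * (-1.0 + w_lo @ f_n - w_lo @ f) * f   # semi-gradient TD(0)
        s = nxt
    return s

for _ in range(300):                  # high-level episodes
    s = 0
    while s != GOAL:
        a = (rng.integers(len(subgoals)) if rng.random() < 0.1
             else int(Q_hi[s].argmax()))
        s2 = low_level(s, subgoals[a])
        r = 10.0 if s2 == GOAL else -1.0
        Q_hi[s, a] += 0.1 * (r + 0.9 * Q_hi[s2].max() - Q_hi[s, a])
        s = s2

print([subgoals[int(a)] for a in Q_hi[[0, 3, 6]].argmax(axis=1)])  # learnt chain
```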