Real-time approach to POMDPs for robot navigation

P. Cai and D. Hsu, Closing the Planning–Learning Loop With Application to Autonomous Driving, IEEE Transactions on Robotics, vol. 39, no. 2, pp. 998-1011, April 2023 DOI: 10.1109/TRO.2022.3210767.

Real-time planning under uncertainty is critical for robots operating in complex dynamic environments. Consider, for example, an autonomous robot vehicle driving in dense, unregulated urban traffic of cars, motorcycles, buses, etc. The robot vehicle has to plan in both short and long terms, in order to interact with many traffic participants of uncertain intentions and drive effectively. Planning explicitly over a long time horizon, however, incurs prohibitive computational cost and is impractical under real-time constraints. To achieve real-time performance for large-scale planning, this work introduces a new algorithm Learning from Tree Search for Driving (LeTS-Drive), which integrates planning and learning in a closed loop, and applies it to autonomous driving in crowded urban traffic in simulation. Specifically, LeTS-Drive learns a policy and its value function from data provided by an online planner, which searches a sparsely sampled belief tree; the online planner in turn uses the learned policy and value functions as heuristics to scale up its run-time performance for real-time robot control. These two steps are repeated to form a closed loop so that the planner and the learner inform each other and improve in synchrony. The algorithm learns on its own in a self-supervised manner, without human effort on explicit data labeling. Experimental results demonstrate that LeTS-Drive outperforms either planning or learning alone, as well as open-loop integration of planning and learning.

Q-learning with a variation of ε-greedy to learn the optimal management of energy in autonomous vehicle navigation

Mojgan Fayyazi, Monireh Abdoos, Duong Phan, Mohsen Golafrouz, Mahdi Jalili, Reza N. Jazar, Reza Langari, Hamid Khayyam, Real-time self-adaptive Q-learning controller for energy management of conventional autonomous vehicles, Expert Systems with Applications, Volume 222, 2023 DOI: 10.1016/j.eswa.2023.119770.

Reducing emissions and energy consumption of autonomous vehicles is critical in the modern era. This paper presents an intelligent energy management system based on Reinforcement Learning (RL) for conventional autonomous vehicles. Furthermore, in order to improve the efficiency, a new exploration strategy is proposed to replace the traditional decayed ε-greedy strategy in the Q-learning algorithm associated with RL. Unlike traditional Q-learning algorithms, the proposed self-adaptive Q-learning (SAQ-learning) can be applied in real-time. The learning capability of the controllers can help the vehicle deal with unknown situations in real-time. Numerical simulations show that compared to other controllers, Q-learning and SAQ-learning controllers can generate the desired engine torque based on the vehicle road power demand and control the air/fuel ratio by changing the throttle angle efficiently in real-time. Also, the proposed real-time SAQ-learning is shown to improve the operational time by 23% compared to standard Q-learning. Our simulations reveal the effectiveness of the proposed control system compared to other methods, namely dynamic programming and fuzzy logic methods.
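
For reference, the traditional baseline that the abstract says is being replaced (tabular Q-learning with a decayed ε-greedy exploration strategy) can be sketched as below. This is only a minimal illustration of the textbook method, not the paper's SAQ-learning strategy, which the abstract does not describe. The function names, the exponential decay schedule, and all default parameter values are assumptions made for this sketch.

```python
import random


def decayed_epsilon(step, eps_start=1.0, eps_min=0.05, decay=0.99):
    """Exploration rate after `step` steps: exponential decay, floored at eps_min.

    The schedule and its defaults are illustrative assumptions.
    """
    return max(eps_min, eps_start * decay ** step)


def select_action(q_row, epsilon, rng):
    """Epsilon-greedy choice over the Q-values of one state.

    With probability epsilon pick a uniformly random action,
    otherwise pick the action with the highest Q-value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update, applied in place to the table q.

    q is a list of rows, one row of action values per state.
    """
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
```

A typical loop would call `decayed_epsilon(step)` once per step, pass the result to `select_action`, apply the action to the environment, and then call `q_update` with the observed reward and next state.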

There are people working on robotic software engineering these days :-O ! (real-time included)

Arturo Laurenzi, Davide Antonucci, Nikos G. Tsagarakis, Luca Muratore, The XBot2 real-time middleware for robotics, Robotics and Autonomous Systems, Volume 163, 2023 DOI: 10.1016/j.robot.2023.104379.

This paper introduces XBot2, a novel real-time middleware for robotic applications with a strong focus on modularity and reusability of components, and seamless support for multi-threaded, mixed real-time (RT) and non-RT architectures. Compared to previous works, XBot2 focuses on providing a dynamic, ready-to-use hardware abstraction layer that allows users to make run-time queries about the robot topology, and act consequently, by leveraging an easy-to-use API that is fully RT-compatible. We provide an extensive description about implementation challenges and design decisions, and finally validate our architecture with multiple use-cases. These range from the integration of three popular simulation tools (i.e. Gazebo, PyBullet, and MuJoCo), to real-world tests involving complex, hybrid robotic platforms such as IIT’s CENTAURO and MoCA robots.

Emergence of number meaning from sensorimotor experiences

Elena Sixtus, Florian Krause, Oliver Lindemann, Martin H. Fischer, A sensorimotor perspective on numerical cognition, Trends in Cognitive Sciences, Volume 27, Issue 4, 2023, Pages 367-378 DOI: 10.1016/j.tics.2023.01.002.

Numbers are present in every part of modern society and the human capacity to use numbers is unparalleled in other species. Understanding the mental and neural representations supporting this capacity is of central interest to cognitive psychology, neuroscience, and education. Embodied numerical cognition theory suggests that beyond the seemingly abstract symbols used to refer to numbers, their underlying meaning is deeply grounded in sensorimotor experiences, and that our specific understanding of numerical information is shaped by actions related to our fingers, egocentric space, and experiences with magnitudes in everyday life. We propose a sensorimotor perspective on numerical cognition in which number comprehension and numerical proficiency emerge from grounding three distinct numerical core concepts: magnitude, ordinality, and cardinality.

Survey on POMDPs for robotics

M. Lauri, D. Hsu and J. Pajarinen, Partially Observable Markov Decision Processes in Robotics: A Survey, IEEE Transactions on Robotics, vol. 39, no. 1, pp. 21-40, Feb. 2023 DOI: 10.1109/TRO.2022.3200138.

Noisy sensing, imperfect control, and environment changes are defining characteristics of many real-world robot tasks. The partially observable Markov decision process (POMDP) provides a principled mathematical framework for modeling and solving robot decision and control tasks under uncertainty. Over the last decade, it has seen many successful applications, spanning localization and navigation, search and tracking, autonomous driving, multirobot systems, manipulation, and human–robot interaction. This survey aims to bridge the gap between the development of POMDP models and algorithms at one end and application to diverse robot decision tasks at the other. It analyzes the characteristics of these tasks and connects them with the mathematical and algorithmic properties of the POMDP framework for effective modeling and solution. For practitioners, the survey provides some of the key task characteristics in deciding when and how to apply POMDPs to robot tasks successfully. For POMDP algorithm designers, the survey provides new insights into the unique challenges of applying POMDPs to robot systems and points to promising new directions for further research.
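
As a reminder of what "decision under uncertainty" means computationally in a POMDP, the sketch below shows the standard Bayes belief update for a small discrete model: the belief is pushed through the transition model for the chosen action and then reweighted by the likelihood of the received observation. It is a generic textbook illustration, not code from the survey. The table layouts `T[a][s][s2]` and `O[a][s2][o]` and the function name are assumptions of this sketch.

```python
def belief_update(belief, action, observation, T, O):
    """Discrete POMDP belief update (Bayes filter).

    belief[s]       : current probability of state s
    T[a][s][s2]     : P(s2 | s, a), transition model (assumed layout)
    O[a][s2][o]     : P(o | s2, a), observation model (assumed layout)
    Returns the normalized posterior belief over next states.
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Prediction step: probability of reaching s2 under the action.
        predicted = sum(T[action][s][s2] * belief[s] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(O[action][s2][observation] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in unnormalized]
```

For example, with two states, a "stay" action that leaves the state unchanged, and a sensor that reports the true state with probability 0.8, a uniform belief followed by observation 0 yields a posterior close to [0.8, 0.2].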

Review of RL applied to robotic manipulation

Íñigo Elguea-Aguinaco, Antonio Serrano-Muñoz, Dimitrios Chrysostomou, Ibai Inziarte-Hidalgo, Simon Bøgh, Nestor Arana-Arexolaleiba, A review on reinforcement learning for contact-rich robotic manipulation tasks, Robotics and Computer-Integrated Manufacturing, Volume 81, 2023 DOI: 10.1016/j.rcim.2022.102517.

Research and application of reinforcement learning in robotics for contact-rich manipulation tasks have exploded in recent years. Its ability to cope with unstructured environments and accomplish hard-to-engineer behaviors has led reinforcement learning agents to be increasingly applied in real-life scenarios. However, there is still a long way ahead for reinforcement learning to become a core element in industrial applications. This paper examines the landscape of reinforcement learning and reviews advances in its application in contact-rich tasks from 2017 to the present. The analysis investigates the main research for the most commonly selected tasks for testing reinforcement learning algorithms in both rigid and deformable object manipulation. Additionally, the trends around reinforcement learning associated with serial manipulators are explored as well as the various technological challenges that this machine learning control technique currently presents. Lastly, based on the state-of-the-art and the commonalities among the studies, a framework relating the main concepts of reinforcement learning in contact-rich manipulation tasks is proposed. The final goal of this review is to support the robotics community in future development of systems commanded by reinforcement learning, discuss the main challenges of this technology and suggest future research directions in the domain.

Mapping unseen rooms by deducing them from known environment structure

Matteo Luperto, Federico Amadelli, Moreno Di Berardino, Francesco Amigoni, Mapping beyond what you can see: Predicting the layout of rooms behind closed doors, Robotics and Autonomous Systems, Volume 159, 2023 DOI: 10.1016/j.robot.2022.104282.

The availability of maps of indoor environments is often fundamental for autonomous mobile robots to efficiently operate in industrial, office, and domestic applications. When robots build such maps, some areas of interest could be inaccessible, for instance, due to closed doors. As a consequence, these areas are not represented in the maps, possibly causing limitations in robot localization and navigation. In this paper, we provide a method that completes 2D grid maps by adding the predicted layout of the rooms behind closed doors. The main idea of our approach is to exploit the underlying geometrical structure of indoor environments to estimate the shape of unobserved rooms. Results show that our method is accurate in completing maps also when large portions of environments cannot be accessed by the robot during map building. We experimentally validate the quality of the completed maps by using them to perform path planning tasks.

Using stochastic bits instead of binary logic

H. Li and Y. Chen, Hybrid Logic Computing of Binary and Stochastic, IEEE Embedded Systems Letters, vol. 14, no. 4, pp. 171-174, Dec. 2022 DOI: 10.1109/LES.2022.3170457.

Binary logic is applied internally to almost all digital signal processing and computer systems, because binary logic is directly implemented in CMOS circuits. Stochastic logic is achieved through its particular representation of data, which uses the probability of the logic level being ON to represent data. Stochastic logic computing is a type of logic computation based on stochastic bit streams instead of binary numbers. This letter proposes a hybrid computing system of binary logic and stochastic logic, called hybrid logic. The study discusses how to generate hybrid logic circuits, and demonstrates the properties of hybrid logic circuits.
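
To make the stochastic representation concrete: if a value in [0, 1] is encoded as the probability that each bit of a stream is ON, then a single AND gate applied to two independent streams produces a stream whose ON probability is the product of the two values. The sketch below simulates this well-known property in software. It illustrates plain stochastic logic only, not the hybrid logic circuits proposed in the letter, and the function names, stream length, and seed are assumptions of this sketch.

```python
import random


def to_stream(p, length, rng):
    """Encode a value p in [0, 1] as a bit stream with P(bit == 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_stream(bits):
    """Decode a bit stream back to a value: the fraction of ON bits."""
    return sum(bits) / len(bits)


def stochastic_multiply(p, q, length=10_000, seed=0):
    """Approximate p * q by AND-ing two independently generated streams.

    The result is an estimate whose error shrinks as the stream gets longer.
    """
    rng = random.Random(seed)
    a = to_stream(p, length, rng)
    b = to_stream(q, length, rng)
    return from_stream([x & y for x, y in zip(a, b)])
```

The trade-off is visible in the parameters: the gate is trivially cheap, but precision is paid for in stream length, since the estimate is a sample mean rather than an exact product.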

Including safety learning in RL for improving the sim-to-lab gap

Kai-Chieh Hsu, Allen Z. Ren, Duy P. Nguyen, Anirudha Majumdar, Jaime F. Fisac, Sim-to-Lab-to-Real: Safe reinforcement learning with shielding and generalization guarantees, Artificial Intelligence, Volume 314, 2023 DOI: 10.1016/j.artint.2022.103811.

Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world. In particular, policies learned using reinforcement learning often fail to generalize to novel environments due to unsafe behavior. In this paper, we propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution. To improve safety, we apply a dual policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the Safety Bellman Equation based on Hamilton-Jacobi (HJ) reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. Additionally, inheriting from the HJ reachability analysis, the bound accounts for the expectation over the worst-case safety in each environment. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments with varying degrees of photorealism. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot. See https://sites.google.com/princeton.edu/sim-to-lab-to-real for supplementary material.
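
The supervisory (shielding) scheme mentioned in the abstract can be pictured as a small wrapper around the two policies: the performance policy proposes an action, and the backup policy takes over whenever a safety check rejects that proposal. The sketch below shows only this general pattern. The predicate `is_unsafe` is a hypothetical stand-in for whatever safety criterion is used (in the paper it is derived from HJ reachability analysis, whose details are not given in the abstract), and all names are assumptions of this sketch.

```python
def shielded_action(state, performance_policy, backup_policy, is_unsafe):
    """Return the performance policy's action unless it is flagged unsafe.

    performance_policy(state) -> action   (task-reward policy)
    backup_policy(state)      -> action   (safety policy)
    is_unsafe(state, action)  -> bool     (hypothetical safety check)
    """
    candidate = performance_policy(state)
    if is_unsafe(state, candidate):
        # Shield: override the proposed action with the backup policy's action.
        return backup_policy(state)
    return candidate
```

For instance, with a performance policy that always proposes "fast", a backup policy that always proposes "brake", and a check that flags "fast" near an obstacle, the wrapper returns "brake" only in the flagged states.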

Image resizing for achieving real-time in embedded AI

Hu, Y., Liu, S., Abdelzaher, T. et al. Real-time task scheduling with image resizing for criticality-based machine perception, Real-Time Syst 58, 430–455 (2022) DOI: 10.1007/s11241-022-09387-6.

This paper extends a previous conference publication that proposed a real-time task scheduling framework for criticality-based machine perception, leveraging image resizing as the tool to control the accuracy and execution time trade-off. Criticality-based machine perception reduces the computing demand of on-board AI-based machine inference pipelines (that run on embedded hardware) in applications such as autonomous drones and cars. By segmenting inputs, such as individual video frames, into smaller parts and allowing the downstream AI-based perception module to process some segments ahead of (or at a higher quality than) others, limited machine resources are spent more judiciously on more important parts of the input (e.g., on foreground objects in lieu of backgrounds). In recent work, we explored the use of image resizing as a way to offer a middle ground between full-resolution processing and dropping, thus allowing more flexibility in handling less important parts of the input. In this journal extension, we make the following contributions: (i) We relax a limiting assumption of our prior work; namely, the need for a “perfect sensor” to identify which parts of the image are more critical. Instead, we investigate the use of real LiDAR measurements for quick-and-dirty image segmentation ahead of AI-based processing. (ii) We explore another dimension of freedom in the scheduler: namely, merging several nearby objects into a consolidated segment for downstream processing. We formulate the scheduling problem as an optimal resize-merge problem and design a solution for it. Experiments on an AI-powered embedded platform with a real-world driving dataset demonstrate the practicality and effectiveness of our proposed framework.