Category Archives: Applications Of Reinforcement Learning To Robots

Hierarchical RL with continuous options

Zhigang Huang, Quan Liu, Fei Zhu, Hierarchical reinforcement learning with adaptive scheduling for robot control, Engineering Applications of Artificial Intelligence, Volume 126, Part D, 2023 DOI: 10.1016/j.engappai.2023.107130.

Conventional hierarchical reinforcement learning (HRL) relies on discrete options to represent explicitly distinguishable knowledge, which may lead to severe performance bottlenecks. It is possible to represent richer knowledge through continuous options, but reliable scheduling methods are lacking. To design an available scheduling method for continuous options, in this paper, the hierarchical reinforcement learning with adaptive scheduling (HAS) algorithm is proposed. Its low-level controller learns diverse options, while the high-level controller schedules options to learn solutions. It achieves an adaptive balance between exploration and exploitation during the frequent scheduling of continuous options, maximizing the representation potential of continuous options. It builds on multi-step static scheduling and makes switching decisions according to the relative advantages of the previous and the estimated continuous options, enabling the agent to focus on different behaviors at different phases of the task. The expected t-step distance is applied to demonstrate the superiority of adaptive scheduling in terms of exploration. Furthermore, an interruption incentive based on annealing is proposed to alleviate excessive exploration during the early training phase, accelerating the convergence rate. Finally, we apply HAS to robot control with sparse rewards in continuous spaces, and develop a comprehensive experimental analysis scheme. The experimental results not only demonstrate the high performance and robustness of HAS, but also provide evidence that the adaptive scheduling method has a positive effect both on the representation and option policies.
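The switching decision mentioned in the abstract can be pictured with a very small sketch. This is not the HAS algorithm itself: it only illustrates keeping the previous option unless a newly estimated one looks better. The function name `choose_option`, the value estimates, and the `margin` parameter are assumptions made for illustration, and the annealed interruption incentive is deliberately left out because the abstract does not specify its form.

```python
def choose_option(prev_option, prev_value, new_option, new_value, margin=0.0):
    """Keep the previous option unless the new one looks clearly better.

    prev_value and new_value are hypothetical value estimates for each
    option in the current state; margin is an assumed switching threshold.
    """
    # With no option active yet, the candidate is adopted unconditionally.
    if prev_option is None:
        return new_option
    # Switch only when the relative advantage of the candidate exceeds the margin.
    if new_value - prev_value > margin:
        return new_option
    return prev_option
```

A larger `margin` makes the agent commit to an option for longer, and a margin of zero switches whenever the candidate has a higher estimate.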

RL to learn not only manipulator skills but also safety skills

A. C. Ak, E. E. Aksoy and S. Sariel, Learning Failure Prevention Skills for Safe Robot Manipulation, IEEE Robotics and Automation Letters, vol. 8, no. 12, pp. 7994-8001, Dec. 2023 DOI: 10.1109/LRA.2023.3324587.

Robots are more capable of achieving manipulation tasks for everyday activities than before. However, the safety of manipulation skills that robots employ is still an open problem. Considering all possible failures during skill learning increases the complexity of the process and restrains learning an optimal policy. Nonetheless, safety-focused modularity in the acquisition of skills has not been adequately addressed in previous works. For that purpose, we reformulate skills as base and failure prevention skills, where base skills aim at completing tasks and failure prevention skills aim at reducing the risk of failures to occur. Then, we propose a modular and hierarchical method for safe robot manipulation by augmenting base skills by learning failure prevention skills with reinforcement learning and forming a skill library to address different safety risks. Furthermore, a skill selection policy that considers estimated risks is used for the robot to select the best control policy for safe manipulation. Our experiments show that the proposed method achieves the given goal while ensuring safety by preventing failures. We also show that with the proposed method, skill learning is feasible and our safe manipulation tools can be transferred to the real environment.
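As a rough illustration of risk-aware skill selection over a skill library, the sketch below uses a hand-written rule as a stand-in for the paper's selection policy, whose actual form is not given in the abstract. The names `select_skill`, `risks`, `prevention_skills`, and `risk_threshold` are all hypothetical.

```python
from typing import Mapping


def select_skill(
    risks: Mapping[str, float],
    prevention_skills: Mapping[str, str],
    base_skill: str,
    risk_threshold: float = 0.5,
) -> str:
    """Pick a failure prevention skill for the most likely failure, else the base skill.

    risks maps a failure type to an estimated risk in [0, 1] (assumed input);
    prevention_skills maps a failure type to the name of a skill in the library.
    """
    if not risks:
        return base_skill
    # Find the failure type with the highest estimated risk.
    failure, risk = max(risks.items(), key=lambda item: item[1])
    # Hand over to a prevention skill only if the risk is high and a skill exists for it.
    if risk > risk_threshold and failure in prevention_skills:
        return prevention_skills[failure]
    return base_skill
```

For example, with `risks = {"spill": 0.8, "collision": 0.2}` and a library containing a skill for `"spill"`, the prevention skill is chosen; when every risk stays below the threshold, the base skill keeps control.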

Dealing with affordances in robotics through RL

X. Yang, Z. Ji, J. Wu and Y. -K. Lai, Recent Advances of Deep Robotic Affordance Learning: A Reinforcement Learning Perspective, IEEE Transactions on Cognitive and Developmental Systems, vol. 15, no. 3, pp. 1139-1149, Sept. 2023 DOI: 10.1109/TCDS.2023.3277288.

As a popular concept proposed in the field of psychology, affordance has been regarded as one of the important abilities that enable humans to understand and interact with the environment. Briefly, it captures the possibilities and effects of the actions of an agent applied to a specific object or, more generally, a part of the environment. This article provides a short review of the recent developments of deep robotic affordance learning (DRAL), which aims to develop data-driven methods that use the concept of affordance to aid in robotic tasks. We first classify these papers from a reinforcement learning (RL) perspective and draw connections between RL and affordances. The technical details of each category are discussed and their limitations are identified. We further summarize them and identify future challenges from the aspects of observations, actions, affordance representation, data-collection, and real-world deployment. A final remark is given at the end to propose a promising future direction of the RL-based affordance definition to include the predictions of arbitrary action consequences.

Using “empowerment” to better select actions in RL when there are only sparse rewards

Dai, S., Xu, W., Hofmann, A. et al. An empowerment-based solution to robotic manipulation tasks with sparse rewards, Auton Robot 47, 617–633 (2023) DOI: 10.1007/s10514-023-10087-8.

In order to provide adaptive and user-friendly solutions to robotic manipulation, it is important that the agent can learn to accomplish tasks even if they are only provided with very sparse instruction signals. To address the issues reinforcement learning algorithms face when task rewards are sparse, this paper proposes an intrinsic motivation approach that can be easily integrated into any standard reinforcement learning algorithm and can allow robotic manipulators to learn useful manipulation skills with only sparse extrinsic rewards. Through integrating and balancing empowerment and curiosity, this approach shows superior performance compared to other state-of-the-art intrinsic exploration approaches during extensive empirical testing. When combined with other strategies for tackling the exploration challenge, e.g. curriculum learning, our approach is able to further improve the exploration efficiency and task success rate. Qualitative analysis also shows that when combined with diversity-driven intrinsic motivations, this approach can help manipulators learn a set of diverse skills which could potentially be applied to other more complicated manipulation tasks and accelerate their learning process.

A survey of guided RL for improving its application to robotics

J. Eßer, N. Bach, C. Jestel, O. Urbann and S. Kerner, Guided Reinforcement Learning: A Review and Evaluation for Efficient and Effective Real-World Robotics [Survey], IEEE Robotics & Automation Magazine, vol. 30, no. 2, pp. 67-85, June 2023 DOI: 10.1109/MRA.2022.3207664.

Recent successes aside, reinforcement learning (RL) still faces significant challenges in its application to the real-world robotics domain. Guiding the learning process with additional knowledge offers a potential solution, thus leveraging the strengths of data- and knowledge-driven approaches. However, this field of research encompasses several disciplines and hence would benefit from a structured overview.

In this article, we propose a concept of guided RL that provides a systematic approach toward accelerating the training process and improving performance for real-world robotics settings. We introduce a taxonomy that structures guided RL approaches and shows how different sources of knowledge can be integrated into the learning pipeline in a practical way. Based on this, we describe available approaches in this field and quantitatively evaluate their specific impact in terms of efficiency, effectiveness, and sim-to-real transfer within the robotics domain.

Improving safety in deep RL in the case of autonomous driving

Eduardo Candela, Olivier Doustaly, Leandro Parada, Felix Feng, Yiannis Demiris, Panagiotis Angeloudis, Risk-aware controller for autonomous vehicles using model-based collision prediction and reinforcement learning, Artificial Intelligence, Volume 320, 2023 DOI: 10.1016/j.artint.2023.103923.

Autonomous Vehicles (AVs) have the potential to save millions of lives and increase the efficiency of transportation services. However, the successful deployment of AVs requires tackling multiple challenges related to modeling and certifying safety. State-of-the-art decision-making methods usually rely on end-to-end learning or imitation learning approaches, which still pose significant safety risks. Hence the necessity of risk-aware AVs that can better predict and handle dangerous situations. Furthermore, current approaches tend to lack explainability due to their reliance on end-to-end Deep Learning, where significant causal relationships are not guaranteed to be learned from data. This paper introduces a novel risk-aware framework for training AV agents using a bespoke collision prediction model and Reinforcement Learning (RL). The collision prediction model is based on Gaussian Processes and vehicle dynamics, and is used to generate the RL state vector. Using an explicit risk model increases the post-hoc explainability of the AV agent, which is vital for reaching and certifying the high safety levels required for AVs and other safety-sensitive applications. Experimental results obtained with a simulator and state-of-the-art RL algorithms show that the risk-aware RL framework decreases average collision rates by 15%, makes AVs more robust to sudden harsh braking situations, and achieves better performance in both safety and speed when compared to a standard rule-based method (the Intelligent Driver Model). Moreover, the proposed collision prediction model outperforms other models in the literature.

See also: https://doi.org/10.1016/j.artint.2023.103922
And also: https://doi.org/10.1177/02783649231169492

Using proprioceptive (internal) perceptions in robots with RL

Agnese Augello, Salvatore Gaglio, Ignazio Infantino, Umberto Maniscalco, Giovanni Pilato, Filippo Vella, Roboception and adaptation in a cognitive robot, Robotics and Autonomous Systems, Volume 164, 2023 DOI: 10.1016/j.robot.2023.104400.

In robotics, perception is usually oriented at understanding what is happening in the external world, while few works pay attention to what is occurring in the robot’s body. In this work, we propose an artificial somatosensory system, embedded in a cognitive architecture, that enables a robot to perceive the sensations from its embodiment while executing a task. We called these perceptions roboceptions, and they let the robot act according to its own physical needs in addition to the task demands. Physical information is processed by the robot to behave in a balanced way, determining the most appropriate trade-off between the achievement of the task and its well being. The experiments show the integration of information from the somatosensory system and the choices that lead to the accomplishment of the task.

Q-learning with a variation of ε-greedy to learn the optimal management of energy in autonomous vehicle navigation

Mojgan Fayyazi, Monireh Abdoos, Duong Phan, Mohsen Golafrouz, Mahdi Jalili, Reza N. Jazar, Reza Langari, Hamid Khayyam, Real-time self-adaptive Q-learning controller for energy management of conventional autonomous vehicles, Expert Systems with Applications, Volume 222, 2023 DOI: 10.1016/j.eswa.2023.119770.

Reducing emissions and energy consumption of autonomous vehicles is critical in the modern era. This paper presents an intelligent energy management system based on Reinforcement Learning (RL) for conventional autonomous vehicles. Furthermore, in order to improve the efficiency, a new exploration strategy is proposed to replace the traditional decayed ε-greedy strategy in the Q-learning algorithm associated with RL. Unlike traditional Q-learning algorithms, the proposed self-adaptive Q-learning (SAQ-learning) can be applied in real-time. The learning capability of the controllers can help the vehicle deal with unknown situations in real-time. Numerical simulations show that compared to other controllers, Q-learning and SAQ-learning controllers can generate the desired engine torque based on the vehicle road power demand and control the air/fuel ratio by changing the throttle angle efficiently in real-time. Also, the proposed real-time SAQ-learning is shown to improve the operational time by 23% compared to standard Q-learning. Our simulations reveal the effectiveness of the proposed control system compared to other methods, namely dynamic programming and fuzzy logic methods.
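For reference, the traditional baseline that the abstract says is being replaced (tabular Q-learning with a decayed epsilon-greedy strategy) can be sketched as follows. This is the conventional textbook scheme, not the SAQ-learning strategy of the paper; the function names and all default hyperparameters are assumptions chosen for illustration.

```python
import random


def decayed_epsilon(episode, eps_start=1.0, eps_min=0.05, decay=0.99):
    """Exponentially decayed exploration rate, floored at eps_min."""
    return max(eps_min, eps_start * decay ** episode)


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=lambda a: q_row[a])


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step: move Q(s, a) toward the TD target."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


if __name__ == "__main__":
    rng = random.Random(0)
    q = [[0.0, 0.0], [1.0, 2.0]]  # toy table: 2 states, 2 actions
    action = epsilon_greedy(q[0], decayed_epsilon(episode=50), rng)
    q_update(q, s=0, a=action, r=1.0, s_next=1)
    print(q[0])
```

The decay schedule is fixed in advance here, which is the property a self-adaptive exploration strategy would aim to relax.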

Review of RL applied to robotic manipulation

Íñigo Elguea-Aguinaco, Antonio Serrano-Muñoz, Dimitrios Chrysostomou, Ibai Inziarte-Hidalgo, Simon Bøgh, Nestor Arana-Arexolaleiba, A review on reinforcement learning for contact-rich robotic manipulation tasks, Robotics and Computer-Integrated Manufacturing, Volume 81, 2023 DOI: 10.1016/j.rcim.2022.102517.

Research and application of reinforcement learning in robotics for contact-rich manipulation tasks have exploded in recent years. Its ability to cope with unstructured environments and accomplish hard-to-engineer behaviors has led reinforcement learning agents to be increasingly applied in real-life scenarios. However, there is still a long way ahead for reinforcement learning to become a core element in industrial applications. This paper examines the landscape of reinforcement learning and reviews advances in its application in contact-rich tasks from 2017 to the present. The analysis investigates the main research for the most commonly selected tasks for testing reinforcement learning algorithms in both rigid and deformable object manipulation. Additionally, the trends around reinforcement learning associated with serial manipulators are explored as well as the various technological challenges that this machine learning control technique currently presents. Lastly, based on the state-of-the-art and the commonalities among the studies, a framework relating the main concepts of reinforcement learning in contact-rich manipulation tasks is proposed. The final goal of this review is to support the robotics community in future development of systems commanded by reinforcement learning, discuss the main challenges of this technology and suggest future research directions in the domain.

Including safety learning in RL to narrow the sim-to-lab gap

Kai-Chieh Hsu, Allen Z. Ren, Duy P. Nguyen, Anirudha Majumdar, Jaime F. Fisac, Sim-to-Lab-to-Real: Safe reinforcement learning with shielding and generalization guarantees, Artificial Intelligence, Volume 314, 2023 DOI: 10.1016/j.artint.2022.103811.

Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world. In particular, policies learned using reinforcement learning often fail to generalize to novel environments due to unsafe behavior. In this paper, we propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution. To improve safety, we apply a dual policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the Safety Bellman Equation based on Hamilton-Jacobi (HJ) reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. Additionally, inheriting from the HJ reachability analysis, the bound accounts for the expectation over the worst-case safety in each environment. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments with varying degrees of photorealism. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot. See https://sites.google.com/princeton.edu/sim-to-lab-to-real for supplementary material.
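The dual-policy shielding idea can be illustrated with a minimal sketch: a supervisor checks the action proposed by the performance policy and falls back to the backup policy when a safety estimate flags it. Everything here is hypothetical, including the function names, the `threshold`, and the sign convention (a higher safety value is assumed to mean more danger); the paper's actual critic and shielding criterion may differ.

```python
from typing import Callable, Sequence, Tuple

State = Sequence[float]
Action = Sequence[float]


def shielded_action(
    state: State,
    performance_policy: Callable[[State], Action],
    backup_policy: Callable[[State], Action],
    safety_value: Callable[[State, Action], float],
    threshold: float = 0.0,
) -> Tuple[Action, bool]:
    """Return (action, overridden) after a supervisory safety check.

    safety_value is an assumed safety critic: values above threshold
    mean the proposed action is predicted to lead to a failure.
    """
    proposed = performance_policy(state)
    # Override the task-driven action only when it is flagged as unsafe.
    if safety_value(state, proposed) > threshold:
        return backup_policy(state), True
    return proposed, False


if __name__ == "__main__":
    # Toy 1-D example: a wall at x = 1; the performance policy always moves right.
    move_right = lambda s: [1.0]
    move_left = lambda s: [-1.0]
    past_wall = lambda s, a: s[0] + a[0] - 1.0  # > 0 means crossing the wall
    print(shielded_action([0.5], move_right, move_left, past_wall))
    print(shielded_action([-0.5], move_right, move_left, past_wall))
```

Keeping the shield outside both policies means the performance policy can be trained purely on task reward while the backup policy handles recovery.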