Applications of reinforcement learning to robots | kipr

Reducing the need of samples in RL through evolutionary techniques

October 10, 2024 09:22 , Juan-Antonio Fernández-Madrigal

Onori, G., Shahid, A.A., Braghin, F. et al., Adaptive Optimization of Hyper-Parameters for Robotic Manipulation through Evolutionary Reinforcement Learning, J Intell Robot Syst 110, 108 (2024) DOI: 10.1007/s10846-024-02138-8.

Deep Reinforcement Learning applications are growing due to their capability of teaching the agent any task autonomously and generalizing the learning. However, this comes at the cost of a large number of samples and interactions with the environment. Moreover, the robustness of learned policies is usually achieved by a tedious tuning of hyper-parameters and reward functions. In order to address this issue, this paper proposes an evolutionary RL algorithm for the adaptive optimization of hyper-parameters. The policy is trained using an on-policy algorithm, Proximal Policy Optimization (PPO), coupled with an evolutionary algorithm. The achieved results demonstrate an improvement in the sample efficiency of the RL training on a robotic grasping task. In particular, the learning is improved with respect to the baseline case of a non-evolutionary agent. The evolutionary agent needs % fewer samples to completely learn the grasping task, enabled by the adaptive transfer of knowledge between the agents through the evolutionary algorithm. The proposed approach also demonstrates the possibility of updating reward parameters during training, potentially providing a general approach to creating reward functions.
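
The abstract does not detail the evolutionary step, so the following is only a minimal sketch of one common scheme (population-based "exploit and explore"), not the authors' algorithm. It assumes each agent has already been trained with PPO for one round and scored; the names `Agent` and `evolve`, the replacement fraction and the perturbation factors are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    hyperparams: dict  # hypothetical example: {"lr": 3e-4, "clip": 0.2}
    score: float       # mean return from the latest training round


def evolve(population, frac=0.25, factors=(0.8, 1.2), rng=random):
    """Overwrite the worst agents with perturbed copies of the best ones."""
    ranked = sorted(population, key=lambda a: a.score, reverse=True)
    n = max(1, int(len(ranked) * frac))
    elite, losers = ranked[:n], ranked[-n:]
    for loser in losers:
        parent = rng.choice(elite)
        # Exploit: inherit the parent's hyper-parameters (a full system would
        # copy the policy weights as well).
        # Explore: scale every value by a randomly chosen factor.
        loser.hyperparams = {
            name: value * rng.choice(factors)
            for name, value in parent.hyperparams.items()
        }
        loser.score = parent.score
    return population
```

Calling `evolve(population)` between training rounds is how knowledge would pass from strong agents to weak ones in this sketch; reward parameters could be stored in `hyperparams` and adapted in the same way.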

Dealing with combinatorial large action spaces in RL through action masking

September 26, 2024 04:51 , Juan-Antonio Fernández-Madrigal

Z. Wu, Y. Li, W. Zhan, C. Liu, Y.-H. Liu and M. Tomizuka, Efficient Reinforcement Learning of Task Planners for Robotic Palletization Through Iterative Action Masking Learning, IEEE Robotics and Automation Letters, vol. 9, no. 11, pp. 9303-9310, Nov. 2024 DOI: 10.1109/LRA.2024.3440731.

The development of robotic systems for palletization in logistics scenarios is of paramount importance, addressing critical efficiency and precision demands in supply chain management. This paper investigates the application of Reinforcement Learning (RL) in enhancing task planning for such robotic systems. Confronted with the substantial challenge of a vast action space, which is a significant impediment to efficiently applying off-the-shelf RL methods, our study introduces a novel method of utilizing supervised learning to iteratively prune and manage the action space effectively. By reducing the complexity of the action space, our approach not only accelerates the learning phase but also ensures the effectiveness and reliability of the task planning in robotic palletization. The experimental results underscore the efficacy of this method, highlighting its potential in improving the performance of RL applications in complex and high-dimensional environments like logistics palletization.
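
To make the effect of a mask concrete, here is a minimal sketch of how a policy can be restricted to the unmasked actions. In the paper the mask is learned iteratively with supervised learning; here it is simply given as a list of booleans, and the function name `masked_softmax` is hypothetical.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over the allowed actions; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("mask must allow at least one action")
    # Subtract the largest allowed logit for numerical stability.
    top = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - top) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```

Sampling from the returned distribution can never pick a pruned action, so exploration is confined to the reduced action space.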

Posted in: Applications of reinforcement learning to robots

Improving explainability of deep RL in Robotics

September 19, 2024 08:32 , Juan-Antonio Fernández-Madrigal

Mehran Taghian, Shotaro Miwa, Yoshihiro Mitsuka, Johannes Günther, Shadan Golestan, Osmar Zaiane, Explainability of deep reinforcement learning algorithms in robotic domains by using Layer-wise Relevance Propagation, Engineering Applications of Artificial Intelligence, Volume 137, Part A, 2024 DOI: 10.1016/j.engappai.2024.109131.

A key component to the recent success of reinforcement learning is the introduction of neural networks for representation learning. Doing so allows for solving challenging problems in several domains, one of which is robotics. However, a major criticism of deep reinforcement learning (DRL) algorithms is their lack of explainability and interpretability. This problem is even exacerbated in robotics as they oftentimes cohabitate space with humans, making it imperative to be able to reason about their behavior. In this paper, we propose to analyze the learned representation in a robotic setting by utilizing Graph Networks (GNs). Using the GN and Layer-wise Relevance Propagation (LRP), we represent the observations as an entity-relationship to allow us to interpret the learned policy. We evaluate our approach in two environments in MuJoCo. These two environments were delicately designed to effectively measure the value of knowledge gained by our approach to analyzing learned representations. This approach allows us to analyze not only how different parts of the observation space contribute to the decision-making process but also differentiate between policies and their differences in performance. This difference in performance also allows for reasoning about the agent’s recovery from faults. These insights are key contributions to explainable deep reinforcement learning in robotic settings.

Posted in: Applications of reinforcement learning to robots , Tagged: Deep reinforcement learning, Explainability

A good survey and taxonomy for DRL in robotics

September 12, 2024 15:28 , Juan-Antonio Fernández-Madrigal

Chen Tang, Ben Abbatematteo, Jiaheng Hu, Rohan Chandra, Roberto Martín-Martín, Peter Stone, Deep Reinforcement Learning for Robotics: A Survey of Real-World Successes, arXiv:2408.03539 [cs.RO] https://www.arxiv.org/abs/2408.03539.

Reinforcement learning (RL), particularly its combination with deep neural networks referred to as deep RL (DRL), has shown tremendous promise across a wide range of applications, suggesting its potential for enabling the development of sophisticated robotic behaviors. Robotics problems, however, pose fundamental difficulties for the application of RL, stemming from the complexity and cost of interacting with the physical world. This article provides a modern survey of DRL for robotics, with a particular focus on evaluating the real-world successes achieved with DRL in realizing several key robotic competencies. Our analysis aims to identify the key factors underlying those exciting successes, reveal underexplored areas, and provide an overall characterization of the status of DRL in robotics. We highlight several important avenues for future work, emphasizing the need for stable and sample-efficient real-world RL paradigms, holistic approaches for discovering and integrating various competencies to tackle complex long-horizon, open-world tasks, and principled development and evaluation procedures. This survey is designed to offer insights for both RL practitioners and roboticists toward harnessing RL’s power to create generally capable real-world robotic systems.

Posted in: Applications of reinforcement learning to robots , Tagged: Deep reinforcement learning, Survey

Safety in RL through “predictive safety filters”

September 12, 2024 08:31 , Juan-Antonio Fernández-Madrigal

Aksel Vaaler, Svein Jostein Husa, Daniel Menges, Thomas Nakken Larsen, Adil Rasheed, Modular control architecture for safe marine navigation: Reinforcement learning with predictive safety filters, Artificial Intelligence, Volume 336, 2024, DOI: 10.1016/j.artint.2024.104201.

Many autonomous systems are safety-critical, making it essential to have a closed-loop control system that satisfies constraints arising from underlying physical limitations and safety aspects in a robust manner. However, this is often challenging to achieve for real-world systems. For example, autonomous ships at sea have nonlinear and uncertain dynamics and are subject to numerous time-varying environmental disturbances such as waves, currents, and wind. There is increasing interest in using machine learning-based approaches to adapt these systems to more complex scenarios, but there are few standard frameworks that guarantee the safety and stability of such systems. Recently, predictive safety filters (PSF) have emerged as a promising method to ensure constraint satisfaction in learning-based control, bypassing the need for explicit constraint handling in the learning algorithms themselves. The safety filter approach leads to a modular separation of the problem, allowing the use of arbitrary control policies in a task-agnostic way. The filter takes in a potentially unsafe control action from the main controller and solves an optimization problem to compute a minimal perturbation of the proposed action that adheres to both physical and safety constraints. In this work, we combine reinforcement learning (RL) with predictive safety filtering in the context of marine navigation and control. The RL agent is trained on path-following and safety adherence across a wide range of randomly generated environments, while the predictive safety filter continuously monitors the agents’ proposed control actions and modifies them if necessary. The combined PSF/RL scheme is implemented on a simulated model of Cybership II, a miniature replica of a typical supply ship. Safety performance and learning rate are evaluated and compared with those of a standard, non-PSF, RL agent. It is demonstrated that the predictive safety filter is able to keep the vessel safe, while not prohibiting the learning rate and performance of the RL agent.
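
The "minimal perturbation" idea can be illustrated with a deliberately simplified case. A real predictive safety filter solves a constrained optimization over a prediction horizon of the vessel dynamics; the sketch below assumes instead a single linear constraint dot(a, u) <= b on the current action, for which the closest safe action has a closed form. The function name `filter_action` and the constraint are assumptions made for illustration.

```python
def filter_action(u, a, b):
    """Return the action closest to u (Euclidean norm) with dot(a, u) <= b.

    Assumes `a` is not the zero vector.
    """
    violation = sum(ai * ui for ai, ui in zip(a, u)) - b
    if violation <= 0:
        return list(u)  # already safe: pass the proposed action through
    norm_sq = sum(ai * ai for ai in a)
    # Move u along -a just far enough to land on the constraint boundary.
    return [ui - violation * ai / norm_sq for ui, ai in zip(u, a)]
```

Because safe actions pass through untouched, the learning agent is only corrected when it proposes something unsafe, which is the modular separation the abstract describes.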

Posted in: Applications of reinforcement learning to robots , Tagged: Safe RL

Using physical models to guide Deep RL in robotics

September 6, 2024 09:05 , Juan-Antonio Fernández-Madrigal

X. Li, W. Shang and S. Cong, Offline Reinforcement Learning of Robotic Control Using Deep Kinematics and Dynamics, IEEE/ASME Transactions on Mechatronics, vol. 29, no. 4, pp. 2428-2439, Aug. 2024 DOI: 10.1109/TMECH.2023.3336316.

With the rapid development of deep learning, model-free reinforcement learning algorithms have achieved remarkable results in many fields. However, their high sample complexity and the potential for causing damage to environments and robots pose severe challenges for their application in real-world environments. Model-based reinforcement learning algorithms are often used to reduce the sample complexity. One limitation of these algorithms is the inevitable modeling errors. While the black-box model can fit complex state transition models, it ignores the existing knowledge of physics and robotics, especially studies of kinematic and dynamic models of the robotic manipulator. Compared with the black-box model, the physics-inspired deep models do not require specific knowledge of each system to obtain interpretable kinematic and dynamic models. In model-based reinforcement learning, these models can simulate the motion and be combined with classical controllers. This is due to their sharing the same form as traditional models, leading to higher precision tracking results. In this work, we utilize physics-inspired deep models to learn the kinematics and dynamics of a robotic manipulator. We propose a model-based offline reinforcement learning algorithm for controller parameter learning, combined with the traditional computed-torque controller. Experiments on trajectory tracking control of the Baxter manipulator, both in joint and operational space, are conducted in simulation and real environments. Experimental results demonstrate that our algorithm can significantly improve tracking accuracy and exhibits strong generalization and robustness.
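
For readers unfamiliar with the computed-torque controller mentioned above, here is a minimal sketch for a single-link pendulum with dynamics m·l²·q̈ + m·g·l·sin(q) = τ. It uses an analytic model rather than the learned physics-inspired models of the paper, and the function name, default parameters and gains are hypothetical; in the paper's setting the gains `kp` and `kd` would be the kind of controller parameters that get learned.

```python
import math


def computed_torque(q, qd, q_des, qd_des, qdd_des, kp, kd,
                    m=1.0, l=1.0, g=9.81):
    """Computed-torque law for one link (angle measured from hanging down)."""
    inertia = m * l * l                  # M(q) for a point mass on a rod
    gravity = m * g * l * math.sin(q)    # g(q)
    # Desired acceleration plus PD feedback on the tracking error.
    accel_cmd = qdd_des + kd * (qd_des - qd) + kp * (q_des - q)
    # The model terms cancel the nonlinear dynamics, leaving linear error dynamics.
    return inertia * accel_cmd + gravity
```

The more accurate the inertia and gravity terms are, the closer the closed loop comes to the ideal linear error dynamics, which is why better kinematic and dynamic models translate into better tracking.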

Posted in: Applications of reinforcement learning to robots , Tagged: Deep reinforcement learning

Improving reward-sparse situations in RL by adding backward learning

September 6, 2024 08:02 , Juan-Antonio Fernández-Madrigal

X. Qi, D. Chen, Z. Li and X. Tan, Back-Stepping Experience Replay With Application to Model-Free Reinforcement Learning for a Soft Snake Robot, IEEE Robotics and Automation Letters, vol. 9, no. 9, pp. 7517-7524, Sept. 2024 DOI: 10.1109/LRA.2024.3427550.

In this letter, we propose a novel technique, Back-stepping Experience Replay (BER), that is compatible with arbitrary off-policy reinforcement learning (RL) algorithms. BER aims to enhance learning efficiency in systems with approximate reversibility, reducing the need for complex reward shaping. The method constructs reversed trajectories using back-stepping transitions to reach random or fixed targets. Interpretable as a bi-directional approach, BER addresses inaccuracies in back-stepping transitions through a purification of the replay experience during learning. Given the intricate nature of soft robots and their complex interactions with environments, we present an application of BER in a model-free RL approach for the locomotion and navigation of a soft snake robot, which is capable of serpentine motion enabled by anisotropic friction between the body and ground. In addition, a dynamic simulator is developed to assess the effectiveness and efficiency of the BER algorithm, in which the robot demonstrates successful learning (reaching a 100% success rate) and adeptly reaches random targets, achieving an average speed 48% faster than that of the best baseline approach.
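
As a rough illustration of what "reversed trajectories" could look like, the sketch below turns forward transitions into back-stepping ones under the assumption that every action has an approximate inverse. It is not the authors' implementation: the function `reverse_trajectory` and the `invert_action` callback are hypothetical, and the purification of the replay buffer described in the abstract is not shown.

```python
def reverse_trajectory(transitions, invert_action):
    """Turn forward (state, action, next_state) steps into reversed steps.

    The last forward step comes first, so the result reads as a path from the
    final state back to the initial one.
    """
    return [
        (s_next, invert_action(a), s)
        for s, a, s_next in reversed(transitions)
    ]
```

For example, with displacements as actions, `reverse_trajectory(steps, lambda a: -a)` yields a trajectory that ends where the forward one started, which can then be stored in the replay buffer of any off-policy algorithm.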

Posted in: Applications of reinforcement learning to robots , Tagged: Deep reinforcement learning, Sparse rewards

Avoiding the sim-to-real RL transfer problem through learning the parameters of a physical system

September 6, 2024 07:49 , Juan-Antonio Fernández-Madrigal

Viktor Wiberg, Erik Wallin, Arvid Fälldin, Tobias Semberg, Morgan Rossander, Eddie Wadbro, Martin Servin, Sim-to-real transfer of active suspension control using deep reinforcement learning, Robotics and Autonomous Systems, Volume 179, 2024 DOI: 10.1016/j.robot.2024.104731.

We explore sim-to-real transfer of deep reinforcement learning controllers for a heavy vehicle with active suspensions designed for traversing rough terrain. While related research primarily focuses on lightweight robots with electric motors and fast actuation, this study uses a forestry vehicle with a complex hydraulic driveline and slow actuation. We simulate the vehicle using multibody dynamics and apply system identification to find an appropriate set of simulation parameters. We then train policies in simulation using various techniques to mitigate the sim-to-real gap, including domain randomization, action delays, and a reward penalty to encourage smooth control. In reality, the policies trained with action delays and a penalty for erratic actions perform nearly at the same level as in simulation. In experiments on level ground, the motion trajectories closely overlap when turning to either side, as well as in a route tracking scenario. When faced with a ramp that requires active use of the suspensions, the simulated and real motions are in close alignment. This shows that the actuator model together with system identification yields a sufficiently accurate model of the actuators. We observe that policies trained without the additional action penalty exhibit fast switching or bang–bang control. These present smooth motions and high performance in simulation but transfer poorly to reality. We find that policies make marginal use of the local height map for perception, showing no indications of predictive planning. However, the strong transfer capabilities entail that further development concerning perception and performance can be largely confined to simulation.
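
Two of the techniques listed above, action delays and a penalty for erratic actions, are easy to sketch. The code below is an illustration under assumed names (`DelayedActions`, `smoothness_penalty`) and an assumed quadratic penalty; the paper's exact delay model and reward term may differ.

```python
from collections import deque


class DelayedActions:
    """Apply each action `delay` steps after the policy issued it."""

    def __init__(self, delay, initial_action):
        # Until the first real action arrives, the actuator sees the initial one.
        self.queue = deque([initial_action] * delay)

    def step(self, action):
        self.queue.append(action)
        return self.queue.popleft()


def smoothness_penalty(action, previous_action, weight=0.1):
    """Negative reward proportional to the squared change between actions."""
    return -weight * sum(
        (a - p) ** 2 for a, p in zip(action, previous_action)
    )
```

Training with the delay wrapper exposes the policy to slow actuation, while the penalty discourages the fast switching that, according to the abstract, transfers poorly to the real vehicle.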

Posted in: Applications of reinforcement learning to robots , Tagged: Deep reinforcement learning, Simulation-to-real problem

RL to learn the coordination of different goals in autonomous driving

July 29, 2024 08:47 , Juan-Antonio Fernández-Madrigal

J. Liu, J. Yin, Z. Jiang, Q. Liang and H. Li, Attention-Based Distributional Reinforcement Learning for Safe and Efficient Autonomous Driving, IEEE Robotics and Automation Letters, vol. 9, no. 9, pp. 7477-7484, Sept. 2024 DOI: 10.1109/LRA.2024.3427551.

Autonomous driving vehicles play a critical role in intelligent transportation systems and have garnered considerable attention. Currently, the popular approach in autonomous driving systems is to design separate optimal objectives for each independent module. Therefore, a major concern arises from the fact that these diverse optimal objectives may have an impact on the final driving policy. However, reinforcement learning provides a promising solution to tackle the challenge through joint training and its exploration ability. This letter aims to develop a safe and efficient reinforcement learning approach with advanced features for autonomous navigation in urban traffic scenarios. Firstly, we develop a novel distributional reinforcement learning method that integrates an implicit distribution model into an actor-critic framework. Subsequently, we introduce a spatial attention module to capture interaction features between the ego vehicle and other traffic vehicles, and design a temporal attention module to extract the long-term sequential feature. Finally, we utilize bird’s-eye-view as a context-aware representation of traffic scenarios, fused by the above spatio-temporal features. To validate our approach, we conduct experiments on the NoCrash and CoRL benchmarks, especially on our closed-loop openDD scenarios. The experimental results demonstrate the impressive performance of our approach in terms of convergence and stability compared to the baselines.

Posted in: Applications of reinforcement learning to robots , Tagged: Autonomous vehicles, Behaviour-based architectures

Making RL safer by first learning what is a safe situation

July 18, 2024 12:38 , Juan-Antonio Fernández-Madrigal

K. Fan, Z. Chen, G. Ferrigno and E. D. Momi, Learn From Safe Experience: Safe Reinforcement Learning for Task Automation of Surgical Robot, IEEE Transactions on Artificial Intelligence, vol. 5, no. 7, pp. 3374-3383, July 2024 DOI: 10.1109/TAI.2024.3351797.

Surgical task automation in robotics can improve the outcomes, reduce quality-of-care variance among surgeons and relieve surgeons’ fatigue. Reinforcement learning (RL) methods have shown considerable performance in robot autonomous control in complex environments. However, the existing RL algorithms for surgical robots do not consider any safety requirements, which is unacceptable in automating surgical tasks. In this work, we propose an approach called safe experience reshaping (SER) that can be integrated into any offline RL algorithm. First, the method identifies and learns the geometry of constraints. Second, a safe experience is obtained by projecting an unsafe action to the tangent space of the learned geometry, which means that the action is in the safe space. Then, the collected safe experiences are used for safe policy training. We designed three tasks that closely resemble real surgical tasks including 2-D cutting tasks and a contact-rich debridement task in 3-D space to evaluate the safe RL framework. We compare our framework to five state-of-the-art (SOTA) RL methods including reward penalty and primal-dual methods. Results show that our framework gets a lower rate of constraint violations and better performance in task success, especially with a higher convergence speed.
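
The projection step can be sketched with plain vector algebra. The code below assumes the learned constraint geometry provides a normal vector at the current state (how that normal is learned is not shown) and removes the component of the action along it; the function name `project_to_tangent` is hypothetical.

```python
def project_to_tangent(action, normal):
    """Remove the component of `action` along `normal`.

    The result lies in the tangent space of the constraint surface.
    Assumes `normal` is not the zero vector.
    """
    scale = (
        sum(a * n for a, n in zip(action, normal))
        / sum(n * n for n in normal)
    )
    return [a - scale * n for a, n in zip(action, normal)]
```

Storing the projected action instead of the original one is what would turn an unsafe experience into a safe one before it is used for policy training.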

Posted in: Applications of reinforcement learning to robots , Tagged: Safe RL
