Reinforcement learning

On the use of GPUs for parallelization of MPCs through the parallelization of symbolic mathematical expressions

January 9, 2025 16:29, Juan-Antonio Fernández-Madrigal

S. H. Jeon, S. Hong, H. J. Lee, C. Khazoom and S. Kim, CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control, IEEE Robotics and Automation Letters, vol. 10, no. 2, pp. 899-906, Feb. 2025, DOI: 10.1109/LRA.2024.3512254.

The parallelism afforded by GPUs presents significant advantages in training controllers through reinforcement learning (RL). However, integrating model-based optimization into this process remains challenging due to the complexity of formulating and solving optimization problems across thousands of instances. In this work, we present CusADi, an extension of the CasADi symbolic framework to support the parallelization of arbitrary closed-form expressions on GPUs with CUDA. We also formulate a closed-form approximation for solving general optimal control problems, enabling large-scale parallelization and evaluation of MPC controllers. Our results show a ten-fold speedup relative to a similar MPC implementation on the CPU, and we demonstrate the use of CusADi for various applications, including parallel simulation, parameter sweeps, and policy training.
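As a rough illustration of what evaluating one closed-form expression across thousands of instances means, the sketch below uses NumPy on the CPU as a stand-in. It does not use CusADi, CasADi, or CUDA; the toy pendulum update `step`, the batch size `n_envs`, and all constants are assumptions made only for this example.

```python
import numpy as np


def step(x, u, dt=0.01):
    """Closed-form one-step update of a toy pendulum (hypothetical model).

    x has shape (..., 2) holding (angle, angular velocity); u has shape (...).
    Every operation is elementwise, so one call handles a whole batch.
    """
    theta, omega = x[..., 0], x[..., 1]
    omega_next = omega + dt * (-9.81 * np.sin(theta) + u)
    theta_next = theta + dt * omega_next
    return np.stack([theta_next, omega_next], axis=-1)


rng = np.random.default_rng(0)
n_envs = 4096  # number of independent instances (assumed)
x = rng.uniform(-1.0, 1.0, size=(n_envs, 2))
u = rng.uniform(-1.0, 1.0, size=n_envs)

# One vectorized call evaluates the same expression for every instance.
x_next = step(x, u)
```

The batched result matches what a per-instance loop over `step(x[i], u[i])` would produce; the point of a GPU framework is to run that same per-instance arithmetic in parallel threads.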

Posted in: Computer science, Tagged: Model Predictive Control, Parallelization, Reinforcement learning

An inspiring formalization of the latest models of human emotions into RL

September 19, 2024 17:09, Juan-Antonio Fernández-Madrigal

Aviv Emanuel, Eran Eldar, Emotions as Computations, Neuroscience & Biobehavioral Reviews, Volume 144, January 2023, DOI: 10.1016/j.neubiorev.2022.104977.

Emotions ubiquitously impact action, learning, and perception, yet their essence and role remain widely debated. Computational accounts of emotion aspire to answer these questions with greater conceptual precision informed by normative principles and neurobiological data. We examine recent progress in this regard and find that emotions may implement three classes of computations, which serve to evaluate states, actions, and uncertain prospects. For each of these, we use the formalism of reinforcement learning to offer a new formulation that better accounts for existing evidence. We then consider how these distinct computations may map onto distinct emotions and moods. Integrating extensive research on the causes and consequences of different emotions suggests a parsimonious one-to-one mapping, according to which emotions are integral to how we evaluate outcomes (pleasure & pain), learn to predict them (happiness & sadness), use them to inform our (frustration & content) and others’ (anger & gratitude) actions, and plan in order to realize (desire & hope) or avoid (fear & anxiety) uncertain outcomes.

Posted in: Psycho-physiological bases of engineering, Tagged: Emotions, Reinforcement learning

Using RL as a framework to study political issues

March 7, 2024 09:33, Juan-Antonio Fernández-Madrigal

Lion Schulz, Rahul Bhui, Political reinforcement learners, Trends in Cognitive Sciences, Volume 28, Issue 3, 2024, Pages 210-222, DOI: 10.1016/j.tics.2023.12.001.

Politics can seem home to the most calculating and yet least rational elements of humanity. How might we systematically characterize this spectrum of political cognition? Here, we propose reinforcement learning (RL) as a unified framework to dissect the political mind. RL describes how agents algorithmically navigate complex and uncertain domains like politics. Through this computational lens, we outline three routes to political differences, stemming from variability in agents’ conceptions of a problem, the cognitive operations applied to solve the problem, or the backdrop of information available from the environment. A computational vantage on maladies of the political mind offers enhanced precision in assessing their causes, consequences, and cures.

Posted in: Cognitive sciences, Tagged: Politics, Reinforcement learning

On the complexities of RL when it confronts the real (natural) world

February 9, 2024 09:34, Juan-Antonio Fernández-Madrigal

Toby Wise, Kara Emery, Angela Radulescu, Naturalistic reinforcement learning, Trends in Cognitive Sciences, Volume 28, Issue 2, 2024, Pages 144-158, DOI: 10.1016/j.tics.2023.08.016.

Humans possess a remarkable ability to make decisions within real-world environments that are expansive, complex, and multidimensional. Human cognitive computational neuroscience has sought to exploit reinforcement learning (RL) as a framework within which to explain human decision-making, often focusing on constrained, artificial experimental tasks. In this article, we review recent efforts that use naturalistic approaches to determine how humans make decisions in complex environments that better approximate the real world, providing a clearer picture of how humans navigate the challenges posed by real-world decisions. These studies purposely embed elements of naturalistic complexity within experimental paradigms, rather than focusing on simplification, generating insights into the processes that likely underpin humans’ ability to navigate complex, multidimensional real-world environments so successfully.

Posted in: Psycho-physiological bases of engineering, Tagged: Real world RL, Reinforcement learning

Including safety learning in RL for improving the sim-to-lab gap

November 17, 2023 09:59, Juan-Antonio Fernández-Madrigal

Kai-Chieh Hsu, Allen Z. Ren, Duy P. Nguyen, Anirudha Majumdar, Jaime F. Fisac, Sim-to-Lab-to-Real: Safe reinforcement learning with shielding and generalization guarantees, Artificial Intelligence, Volume 314, 2023, DOI: 10.1016/j.artint.2022.103811.

Safety is a critical component of autonomous systems and remains a challenge for learning-based policies to be utilized in the real world. In particular, policies learned using reinforcement learning often fail to generalize to novel environments due to unsafe behavior. In this paper, we propose Sim-to-Lab-to-Real to bridge the reality gap with a probabilistically guaranteed safety-aware policy distribution. To improve safety, we apply a dual policy setup where a performance policy is trained using the cumulative task reward and a backup (safety) policy is trained by solving the Safety Bellman Equation based on Hamilton-Jacobi (HJ) reachability analysis. In Sim-to-Lab transfer, we apply a supervisory control scheme to shield unsafe actions during exploration; in Lab-to-Real transfer, we leverage the Probably Approximately Correct (PAC)-Bayes framework to provide lower bounds on the expected performance and safety of policies in unseen environments. Additionally, inheriting from the HJ reachability analysis, the bound accounts for the expectation over the worst-case safety in each environment. We empirically study the proposed framework for ego-vision navigation in two types of indoor environments with varying degrees of photorealism. We also demonstrate strong generalization performance through hardware experiments in real indoor spaces with a quadrupedal robot. See https://sites.google.com/princeton.edu/sim-to-lab-to-real for supplementary material.
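The supervisory shielding idea mentioned in the abstract can be sketched roughly as follows. This is not the authors' implementation: the function names, the sign convention (a positive safety value means "predicted unsafe"), the threshold, and the 1-D toy world are all assumptions made only for this example.

```python
def shielded_action(state, performance_policy, backup_policy,
                    safety_value, threshold=0.0):
    """Return (action, was_shielded).

    The performance policy proposes an action. If the safety critic scores
    the proposal above the threshold (assumed convention: positive means
    unsafe), the backup policy's action is used instead.
    """
    proposed = performance_policy(state)
    if safety_value(state, proposed) > threshold:
        return backup_policy(state), True
    return proposed, False


# Hypothetical 1-D world: the agent at position x should stay below 0.8.
def performance_policy(x):
    return 0.5            # always pushes toward the goal (and the wall)


def backup_policy(x):
    return -0.5           # always retreats


def safety_value(x, a):
    return (x + a) - 0.8  # positive if the next position crosses the limit


far_from_wall = shielded_action(0.0, performance_policy, backup_policy, safety_value)
near_wall = shielded_action(0.5, performance_policy, backup_policy, safety_value)
```

Here `far_from_wall` keeps the performance action, while `near_wall` is overridden by the backup action. In the paper the safety critic is learned via the Safety Bellman Equation rather than hand-written as above.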

Posted in: Applications of reinforcement learning to robots, Tagged: Reinforcement learning, Simulation-to-real problem

Adaptation of industrial robots to variations in tasks through RL

November 10, 2023 09:27, Juan-Antonio Fernández-Madrigal

Tian Yu, Qing Chang, User-guided motion planning with reinforcement learning for human-robot collaboration in smart manufacturing, Expert Systems with Applications, Volume 209, 2022, DOI: 10.1016/j.eswa.2022.118291.

In today’s manufacturing systems, robots are expected to perform increasingly complex manipulation tasks in collaboration with humans. However, current industrial robots are still largely preprogrammed with very little autonomy and still need to be reprogrammed by robotics experts for even slightly changed tasks. Therefore, it is highly desirable that robots can adapt to certain task changes with motion planning strategies to easily work with non-robotic experts in manufacturing environments. In this paper, we propose a user-guided motion planning algorithm in combination with a reinforcement learning (RL) method to enable robots to automatically generate their motion plans for new tasks by learning from a few kinesthetic human demonstrations. Features of common human demonstrated tasks in a specific application environment, e.g., desk assembly or warehouse loading/unloading, are abstracted and saved in a library. The definition of semantical similarity between features in the library and features of a new task is proposed and further used to construct the reward function in RL. To achieve an adaptive motion plan facing task changes or new task requirements, features embedded in the library are mapped to appropriate task segments based on the trained motion planning policy using Q-learning. A new task can be either learned as a combination of a few features in the library or a requirement for further human demonstration if the current library is insufficient for the new task. We evaluate our approach on a 6 DOF UR5e robot on multiple tasks and scenarios and show the effectiveness of our method with respect to different scenarios.

Posted in: Applications of reinforcement learning to robots, Tagged: Adaptation to task variation, Learning by demonstration, Reinforcement learning

On the extended use of RL for navigation in UAVs

November 3, 2023 10:07, Juan-Antonio Fernández-Madrigal

Fadi AlMahamid, Katarina Grolinger, Autonomous Unmanned Aerial Vehicle navigation using Reinforcement Learning: A systematic review, Engineering Applications of Artificial Intelligence, Volume 115, 2022, DOI: 10.1016/j.engappai.2022.105321.

There is an increasing demand for using Unmanned Aerial Vehicles (UAVs), also known as drones, in different applications such as package delivery, traffic monitoring, search and rescue operations, and military combat engagements. In all of these applications, the UAV is used to navigate the environment autonomously (without human interaction), perform specific tasks, and avoid obstacles. Autonomous UAV navigation is commonly accomplished using Reinforcement Learning (RL), where agents act as experts in a domain to navigate the environment while avoiding obstacles. Understanding the navigation environment and algorithmic limitations plays an essential role in choosing the appropriate RL algorithm to solve the navigation problem effectively. Consequently, this study first identifies the main UAV navigation tasks and discusses navigation frameworks and simulation software. Next, RL algorithms are classified and discussed based on the environment, algorithm characteristics, abilities, and applications in different UAV navigation problems, which will help practitioners and researchers select the appropriate RL algorithms for their UAV navigation use cases. Moreover, identified gaps and opportunities will drive UAV navigation research.

Posted in: Applications of reinforcement learning to robots, Tagged: Reinforcement learning, Robot navigation, Survey, UAVs

Survey on RL applied to cyber-security

October 27, 2023 06:19, Juan-Antonio Fernández-Madrigal

Amrin Maria Khan Adawadkar, Nilima Kulkarni, Cyber-security and reinforcement learning - A brief survey, Engineering Applications of Artificial Intelligence, Volume 114, 2022, DOI: 10.1016/j.engappai.2022.105116.

This paper presents a comprehensive literature review on Reinforcement Learning (RL) techniques used in Intrusion Detection Systems (IDS), Intrusion Prevention Systems (IPS), Internet of Things (IoT) and Identity and Access Management (IAM). This study reviews scientific documents such as journals and articles, from 2010 to 2021, extracted from the Science Direct, ACM, IEEEXplore, and Springer databases. Most of the research articles on cybersecurity and RL published in 2020 and 2021 are for IDS classifiers and resource optimization in IoTs. Some datasets used for training RL-based IDS algorithms are NSL-KDD, CICIDS, and AWID. There are few datasets and publications for IAM. The few that exist focus on physical layer authentication. The current state of the art lacks standard evaluation criteria; however, we have identified parameters like detection rate, precision, and accuracy which can be used to compare the algorithms employing RL. This paper is suitable for new researchers, students, and beginners in the field of RL who want to learn about the field and identify problem areas.

Posted in: Computer vision, Tagged: Cybersecurity, Reinforcement learning

Unexpected consequences of training smarthome systems with reinforcement learning: effects on human behaviours

October 27, 2023 06:02, Juan-Antonio Fernández-Madrigal

S. Suman, A. Etemad and F. Rivest, Potential Impacts of Smart Homes on Human Behavior: A Reinforcement Learning Approach, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 567-580, Aug. 2022, DOI: 10.1109/TAI.2021.3127483.

Smart homes are becoming increasingly popular as a result of advances in machine learning and cloud computing. Devices, such as smart thermostats and speakers, are now capable of learning from user feedback and adaptively adjusting their settings to human preferences. Nonetheless, these devices might in turn impact human behavior. To investigate the potential impacts of smart homes on human behavior, we simulate a series of hierarchical-reinforcement learning-based human models capable of performing various activities, namely setting temperature and humidity for thermal comfort, inside a Q-Learning-based smart home model. We then investigate the possibility of the human models’ behaviors being altered as a result of the smart home and the human model adapting to one another. For our human model, the activities are based on hierarchical-reinforcement learning. This allows the human to learn how long it must continue a given activity and decide when to leave it. We then integrate our human model in the environment along with the smart home model and perform rigorous experiments considering various scenarios involving a model of a single human and models of two different humans with the smart home. Our experiments show that with the smart home, the human model can exhibit unexpected behaviors such as frequent changing of activities and an increase in the time required to modify the thermal preferences. With two human models, we interestingly observe that certain combinations of models result in normal behaviors, while other combinations exhibit the same unexpected behaviors as those observed from the single human experiment.

Posted in: Psycho-physiological bases of engineering, Tagged: Reinf, Reinforcement learning, Smart homes

Human+machine sequential decision making

October 27, 2023 05:59, Juan-Antonio Fernández-Madrigal

Q. Zhang, Y. Kang, Y.-B. Zhao, P. Li and S. You, Traded Control of Human–Machine Systems for Sequential Decision-Making Based on Reinforcement Learning, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 553-566, Aug. 2022, DOI: 10.1109/TAI.2021.3127857.

Sequential decision-making (SDM) is a common type of decision-making problem with sequential and multistage characteristics. The learning and updating of the policy are the main challenges in solving SDM problems. Unlike previous machine autonomy driven by artificial intelligence alone, we improve the control performance of SDM tasks by combining human intelligence and machine intelligence. Specifically, this article presents a paradigm of human–machine traded control systems based on reinforcement learning methods to optimize the solution process of sequential decision problems. By designing the idea of autonomous boundary and credibility assessment, we enable humans and machines at the decision-making level of the systems to collaborate more effectively. The arbitration in the human–machine traded control systems introduces the Bayesian neural network and the dropout mechanism to consider the uncertainty and security constraints. Finally, experiments involving machine traded control and human traded control were implemented. The preliminary experimental results of this article show that our traded control method improves decision-making performance and verify its effectiveness for SDM problems.

Posted in: Human-robot interaction, Tagged: Reinforcement learning, Sequential Decision Making
