Juan-Antonio Fernández-Madrigal | kipr

On the role of the hippocampus in managing the environmental context

July 6, 2023 10:40, Juan-Antonio Fernández-Madrigal

Andrew P. Maurer, Lynn Nadel, The Continuity of Context: A Role for the Hippocampus, Trends in Cognitive Sciences, Volume 25, Issue 3, 2021, Pages 187-199 DOI: 10.1016/j.tics.2020.12.007.

Tracking moment-to-moment change in input and detecting change sufficient to require altering behavior is crucial to survival. Here, we discuss how the brain evaluates change over time, focusing on the hippocampus and its role in tracking context. We leverage the anatomy and physiology of the hippocampal longitudinal axis, re-entrant loops, and amorphous networks to account for stimulus equivalence and the updating of an organism’s sense of its context. Place cells have a central role in tracking contextual continuities and discontinuities across multiple scales, a capacity beyond current models of pattern separation and completion. This perspective highlights the critical role of the hippocampus in both spatial cognition and episodic memory: tracking change and detecting boundaries separating one context, or episode, from another.

Posted in: Psycho-physiological bases of engineering, Tagged: Hippocampus

Summary of the state of the art and current challenges of Deep RL in Robotics

July 3, 2023 08:43, Juan-Antonio Fernández-Madrigal

Ibarz J, Tan J, Finn C, Kalakrishnan M, Pastor P, Levine S., How to train your robot with deep reinforcement learning: lessons we have learned, The International Journal of Robotics Research. 2021;40(4-5):698-721 DOI: 10.1177/0278364920987859.

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which does not connect with the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as an embodied agent in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building off of these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource both for roboticists and machine learning researchers who are interested in furthering the progress of deep RL in the real world.

NOTES:

Interesting summary of the state of the art and of the algorithms used.
Defining the reward beforehand partly defeats the primary goal of having the robot learn by itself.
Experiences gathered while learning one task can be re-used for other tasks, since experiences are mostly task-independent.
The problem of leaving the robot unattended while learning, and of mechanical damage and wear and tear. “Learning physically requires human presence for resetting experiments, monitoring hardware status and ensuring safety”. “The majority of robot learning experiments to date were conducted on a single robot closely monitored by a single human operator. This one-to-one relation between robot and operator has been a tedious but effective way to ensure continuous and safe operation. The human can reset the scene, stop the robot in unsafe situations, and simply restart and reset the robot on failures. However, to scale up data collection efforts and increase the throughput of evaluation runs, robots need to run without human supervision. It is impractical to allocate more operators to a set-up with multiple robots, or whenever a single robot is meant to run 24/7, and especially both.” “Repeated falling, self-collisions, jerky actuation, and collisions with obstacles may damage the robot and its surroundings, which will require costly repairs and manual interventions” “We use the term robot persistence to refer to the capability of the robot to persist in collecting data and training with minimal human intervention.”
The reality gap can be very important, and so can life-long adaptation. “The reality gap is a major obstacle that prevents the application of learning to robotics”. “we found that the actuator dynamics and the lack of latency modeling are the main causes of the model error” in the reality gap. “Hardware degradation, such as change of battery level, wear and tear, and hardware failure, are the major causes of dynamic changes”
Recognizing dangerous situations, and even learning them: section 4.11.3.
Importance of learning bad situations together with good situations: “to add demonstration data to the data buffer for the off-policy algorithm” -> “tends to be problematic in practice, because commonly used approximate dynamic programming methods (i.e., value function estimation) need to see both good and bad experience to learn which actions are desirable. Therefore, when the demonstrations are much better than the agent’s own experience, the value function will typically learn that the demonstrated states are better, but might fail to learn which actions must be taken to reach those states.” -> can be intertwined together, mixing their results into one (“joint training”) -> better to learn the models in model-based.
Simulation is needed to reduce the effort of real learning. “In the last few years, the OpenAI Gym benchmark (Brockman et al., 2016) is the key driving force behind the development of deep RL and its application to robotics”
“Generally speaking, among model-free techniques, off-policy methods are about an order of magnitude more data efficient than on-policy methods. Model-based methods could be another order of magnitude more data efficient than their model-free counterparts.”
The presence of delays in the learning loop compromises Markovianity and thus RL performance (sect. 4.8). These delays are not covered by simulators. Compensating delay techniques are addressed in sect. 4.3.1. “Latency measures the delay from when the observation is measured at the sensor, to when the action is actually executed at the actuator. This delay is usually on the order of milliseconds to seconds, depending on the hardware and the complexity of the policy. The existence of latency means that the next state of the system does not directly depend on the measured state, but instead on the state after a delay of latency after the measurement, which is not observable. Latency violates the most fundamental assumption of MDP (Xiao et al., 2020), and thus can cause failure to some RL algorithms.” “For model-based methods, the planning component is often computationally expensive, and incurs additional latency.”
“pretrain a policy network with demonstrations via learning (also called behavioral cloning)”
Overfitting may cause learning quality to worsen as more experiences are gathered.
“effective exploration is particularly challenging in tasks with sparse reward. In the most extreme version of this problem, the agent must essentially find a (high-reward) needle in a (zero-reward) haystack. Unfortunately, the most natural formulation of many practical robotics tasks has this property. For this reason, a number of prior works have focused on studying exploration for sparse-reward robotic tasks”
A main drawback of Deep RL is the need for massive amounts of data.
Algorithms, particularly deep ones, are highly sensitive to the initial state and to the way their hyperparameters are set, especially off-policy algorithms.
“There is a tradeoff here as more environment diversity may cause the policies to have lower performance. Often this can be alleviated with larger and better neural network architectures”
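
The note on latency above can be illustrated with a small sketch. It assumes a constant, known delay of `delay` control steps and a toy one-dimensional integrator; `DelayedIntegrator` and its methods are hypothetical names, not taken from the paper. Appending the pending actions to the observation is one common way to recover the Markov property under such a delay, and is not necessarily the compensation technique described in sect. 4.3.1.

```python
from collections import deque


class DelayedIntegrator:
    """Toy 1-D system x <- x + a, where each action takes effect `delay` steps late."""

    def __init__(self, delay: int):
        self.delay = delay
        self.x = 0.0
        # Actions already commanded but not yet executed (oldest first).
        self.pending = deque([0.0] * delay)

    def step(self, action: float) -> float:
        self.pending.append(action)
        executed = self.pending.popleft()  # the action issued `delay` steps ago
        self.x += executed
        return self.x

    def augmented_state(self) -> tuple:
        # Measured state plus the queue of pending actions: together they
        # determine the next state, which the measured state alone does not.
        return (self.x, *self.pending)


env = DelayedIntegrator(delay=2)
trace = [env.step(a) for a in (1.0, 1.0, 0.0, 0.0)]
```

In this toy run the measured state keeps changing after the commanded action has already dropped to zero, which is exactly why a policy that only sees `x` cannot predict the next state, while one that sees `augmented_state()` can.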

Posted in: Applications of reinforcement learning to robots, Tagged: Deep reinforcement learning, State and challenges of RL in Robotics

Learning the parameters of a robot navigator through Q-learning

June 30, 2023 12:59, Juan-Antonio Fernández-Madrigal

Chang, L., Shan, L., Jiang, C. et al., Reinforcement based mobile robot path planning with improved dynamic window approach in unknown environment, Auton Robot 45, 51–76 (2021) DOI: 10.1007/s10514-020-09947-4.

Mobile robot path planning in an unknown environment is a fundamental and challenging problem in the field of robotics. Dynamic window approach (DWA) is an effective method of local path planning, however some of its evaluation functions are inadequate and the algorithm for choosing the weights of these functions is lacking, which makes it highly dependent on the global reference and prone to fail in an unknown environment. In this paper, an improved DWA based on Q-learning is proposed. First, the original evaluation functions are modified and extended by adding two new evaluation functions to enhance the performance of global navigation. Then, considering the balance of effectiveness and speed, we define the state space, action space and reward function of the adopted Q-learning algorithm for the robot motion planning. After that, the parameters of the proposed DWA are adaptively learned by Q-learning and a trained agent is obtained to adapt to the unknown environment. At last, by a series of comparative simulations, the proposed method shows higher navigation efficiency and successful rate in the complex unknown environment. The proposed method is also validated in experiments based on XQ-4 Pro robot to verify its navigation capability in both static and dynamic environment.
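
As a rough illustration of how Q-learning could adapt the weights of a DWA cost function, here is a minimal tabular sketch. The weight names (`heading`, `clearance`, `velocity`), the state labels, the action set and the reward value are all assumptions made for the example; the paper defines its own state space, action space, reward function and two additional evaluation functions, none of which are reproduced here.

```python
import random
from collections import defaultdict

# Hypothetical discrete actions: each one nudges a single DWA weight.
ACTIONS = [("heading", +0.1), ("heading", -0.1),
           ("clearance", +0.1), ("clearance", -0.1),
           ("velocity", +0.1), ("velocity", -0.1)]


def apply_action(weights: dict, action: tuple) -> dict:
    """Return a copy of the weights with one entry adjusted, kept non-negative."""
    name, delta = action
    updated = dict(weights)
    updated[name] = max(0.0, round(updated[name] + delta, 3))
    return updated


def choose_action(q, state, epsilon: float, rng: random.Random) -> int:
    """Epsilon-greedy choice over action indices."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update."""
    best_next = max(q[(next_state, a)] for a in range(len(ACTIONS)))
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


q = defaultdict(float)
weights = {"heading": 0.5, "clearance": 0.5, "velocity": 0.5}
# One made-up transition: raising the clearance weight near an obstacle paid off.
q_update(q, state="near_obstacle", action=2, reward=1.0, next_state="free_space")
weights = apply_action(weights, ACTIONS[2])
```

After this single update the greedy choice in the hypothetical `near_obstacle` state is to raise the clearance weight again; in a real planner the loop would alternate between running DWA with the current weights, observing the outcome, and updating the table.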

Posted in: Robot motion planning, Tagged: Q-learning, Robot navigation

Classification with decision trees based on POMDPs

June 30, 2023 12:25, Juan-Antonio Fernández-Madrigal

Shlomi Maliah, Guy Shani, Using POMDPs for learning cost sensitive decision trees, Artificial Intelligence, Volume 292, 2021 DOI: 10.1016/j.artint.2020.103400.

In classification, an algorithm learns to classify a given instance based on a set of observed attribute values. In many real world cases testing the value of an attribute incurs a cost. Furthermore, there can also be a cost associated with the misclassification of an instance. Cost sensitive classification attempts to minimize the expected cost of classification, by deciding after each observed attribute value, which attribute to measure next. In this paper we suggest Partially Observable Markov Decision Processes (POMDPs) as a modeling tool for cost sensitive classification. POMDPs are typically solved through a policy over belief states. We show how a relatively small set of potentially important belief states can be identified, and define an MDP over these belief states. To identify these potentially important belief states, we construct standard decision trees over all attribute subsets, and the leaves of these trees become the state space of our tree-based MDP. At each phase we decide on the next attribute to measure, balancing the cost of the measurement and the classification accuracy. We compare our approach to a set of previous approaches, showing our approach to work better for a range of misclassification costs.
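
The trade-off between measurement cost and misclassification cost can be made concrete with a one-step lookahead over a belief state. This is only a sketch of the underlying arithmetic, not the tree-based MDP of the paper; the function names, the two classes, the likelihoods and all the costs are invented for the example.

```python
def classify_cost(belief, miss_cost):
    """Expected cost of the best immediate classification under `belief`.

    belief: dict class -> probability; miss_cost[(true, predicted)] -> cost.
    """
    return min(
        sum(p * miss_cost.get((true, pred), 0.0) for true, p in belief.items())
        for pred in belief
    )


def measure_then_classify_cost(belief, likelihood, test_cost, miss_cost):
    """Expected cost of paying for one attribute, then classifying.

    likelihood[value][cls] = P(attribute == value | cls).
    """
    total = test_cost
    for value, per_class in likelihood.items():
        joint = {c: belief[c] * per_class[c] for c in belief}
        p_value = sum(joint.values())
        if p_value == 0.0:
            continue  # this attribute value cannot occur under the belief
        posterior = {c: j / p_value for c, j in joint.items()}
        total += p_value * classify_cost(posterior, miss_cost)
    return total


belief = {"sick": 0.5, "healthy": 0.5}
miss_cost = {("sick", "healthy"): 100.0, ("healthy", "sick"): 20.0}
likelihood = {
    "pos": {"sick": 0.9, "healthy": 0.2},
    "neg": {"sick": 0.1, "healthy": 0.8},
}

cost_now = classify_cost(belief, miss_cost)
cost_expensive_test = measure_then_classify_cost(belief, likelihood, 5.0, miss_cost)
cost_cheap_test = measure_then_classify_cost(belief, likelihood, 2.0, miss_cost)
```

With these made-up numbers, classifying immediately has an expected cost of 10, measuring first costs 12 when the test costs 5, and 9 when the test costs 2, so the attribute is only worth measuring in the cheaper case.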

Posted in: Artificial Intelligence, Tagged: Classification as decision making, Decision trees, POMDPs

Localizing robots within pipes through RF signals

June 30, 2023 10:01, Juan-Antonio Fernández-Madrigal

Carlos Rizzo, Teresa Seco, Jesús Espelosín, Francisco Lera, José Luis Villarroel, An alternative approach for robot localization inside pipes using RF spatial fadings, Robotics and Autonomous Systems, Volume 136, 2021 DOI: 10.1016/j.robot.2020.103702.

Accurate robot localization represents a challenge inside pipes due to the particular conditions that characterize this type of environment. Outdoor techniques (GPS in particular) do not work at all inside metal pipes, while traditional indoor localization methods based on camera or laser sensors do not perform well mainly due to a lack of external illumination and distinctive features along pipes. Moreover, humidity and slippery surfaces make wheel odometry unreliable. In this paper, we estimate the localization of a robot along a pipe with an alternative Radio Frequency (RF) approach. We first analyze wireless propagation in metallic pipes and propose a series of setups that allow us to obtain periodic RF spatial fadings (a sort of standing wave periodic pattern), together with the influence of the antenna position and orientation over these fadings. Subsequently, we propose a discrete RF odometry-like method, by means of counting the fadings while traversing them. The transversal fading analysis (number of antennas and cross-section position) makes it possible to increase the resolution of this method. Lastly, the model of the signal is used in a continuous approach serving as an RF map. The proposed localization methods outperform our previous contributions in terms of resolution, accuracy, reliability and robustness. Experimental results demonstrate the effectiveness of the RF-based strategy without the need for a previously known map of the scenario or any substantial modification of the existing infrastructure.
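
The discrete, odometry-like idea of counting fadings can be sketched as follows. The signal here is synthetic, and the 4 m spatial period, the signal levels and the two thresholds are assumptions chosen for the example; `count_fadings` is a hypothetical helper, not the detector used in the paper.

```python
import math


def count_fadings(rssi, low, high):
    """Count signal dips using two thresholds (hysteresis) to reject noise.

    A fading is counted when the signal drops below `low`; the counter is
    re-armed only after the signal has risen above `high` again.
    """
    count = 0
    armed = True
    for value in rssi:
        if armed and value < low:
            count += 1
            armed = False
        elif not armed and value > high:
            armed = True
    return count


# Synthetic signal: a periodic pattern with an assumed 4 m spatial period,
# sampled every 0.1 m along 12 m of pipe.
PERIOD_M = 4.0
positions = [0.1 * i for i in range(121)]
rssi = [-60.0 + 15.0 * math.cos(2 * math.pi * x / PERIOD_M) for x in positions]

fadings = count_fadings(rssi, low=-70.0, high=-50.0)
estimated_distance = fadings * PERIOD_M
```

The resolution of such a counter is one fading period, which is consistent with the abstract's remark that additional antennas and cross-section positions are used to increase resolution.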

Posted in: Mobile Robot Localization, Tagged: Pipe robots

Learning robot simulators

June 30, 2023 09:58, Juan-Antonio Fernández-Madrigal

Grant W. Woodford, Mathys C. du Plessis, Bootstrapped Neuro-Simulation for complex robots, Robotics and Autonomous Systems, Volume 136, 2021 DOI: 10.1016/j.robot.2020.103708.

Robotic simulators are often used to speed up the Evolutionary Robotics (ER) process. Most simulation approaches are based on physics modelling. However, physics-based simulators can become complex to develop and require prior knowledge of the robotic system. Robotics simulators can be constructed using Machine Learning techniques, such as Artificial Neural Networks (ANNs). ANN-based simulator development usually requires a lengthy behavioural data collection period before the simulator can be trained and used to evaluate controllers during the ER process. The Bootstrapped Neuro-Simulation (BNS) approach can be used to simultaneously collect behavioural data, train an ANN-based simulator and evolve controllers for a particular robotic problem. This paper investigates proposed improvements to the BNS approach and demonstrates the viability of the approach by optimising gait controllers for a Hexapod and Snake robot platform.

Posted in: Robot models, Tagged: Evolutionary robotics, Neural networks, Simulation

Mixing Monte-Carlo Tree Search with Q-learning for robot learning

June 30, 2023 09:48, Juan-Antonio Fernández-Madrigal

Francesco Riccio, Roberto Capobianco, Daniele Nardi, LoOP: Iterative learning for optimistic planning on robots, Robotics and Autonomous Systems, Volume 136, 2021 DOI: 10.1016/j.robot.2020.103693.

Efficient robotic behaviors require robustness and adaptation to dynamic changes of the environment, whose characteristics rapidly vary during robot operation. To generate effective robot action policies, planning and learning techniques have shown the most promising results. However, if considered individually, they present different limitations. Planning techniques lack generalization among similar states and require experts to define behavioral routines at different levels of abstraction. Conversely, learning methods usually require a considerable number of training samples and iterations of the algorithm. To overcome these issues, and to efficiently generate robot behaviors, we introduce LoOP, an iterative learning algorithm for optimistic planning that combines state-of-the-art planning and learning techniques to generate action policies. The main contribution of LoOP is the combination of Monte-Carlo Search Planning and Q-learning, which enables focused exploration during policy refinement in different robotic applications. We demonstrate the robustness and flexibility of LoOP in various domains and multiple robotic platforms, by validating the proposed approach with an extensive experimental evaluation.

Posted in: Applications of reinforcement learning to robots, Tagged: Monte Carlo POMDPs, Q-learning, Skill learning

Deep learning RL methods for robot navigation

June 29, 2023 10:47, Juan-Antonio Fernández-Madrigal

Luong, M., Pham, C., Incremental Learning for Autonomous Navigation of Mobile Robots based on Deep Reinforcement Learning, J Intell Robot Syst 101, 1 (2021) DOI: 10.1007/s10846-020-01262-5.

This paper presents an incremental learning method and system for autonomous robot navigation. The range finder laser sensor and online deep reinforcement learning are utilized for generating the navigation policy, which is effective for avoiding obstacles along the robot’s trajectories as well as for robot’s reaching the destination. An empirical experiment is conducted under simulation and real-world settings. Under the simulation environment, the results show that the proposed method can generate a highly effective navigation policy (more than 90% accuracy) after only 150k training iterations. Moreover, our system has slightly outperformed deep-Q, while having considerably surpassed Proximal Policy Optimization, two recent state-of-the art robot navigation systems. Finally, two experiments are performed to demonstrate the feasibility and effectiveness of our robot’s proposed navigation system in real-time under real-world settings.

Posted in: Applications of reinforcement learning to robots, Tagged: Deep reinforcement learning, Robot navigation

Qualitative modelling of quadcopters that is claimed to be better than reinforcement learning

June 29, 2023 10:31, Juan-Antonio Fernández-Madrigal

Šoberl, D., Bratko, I. & Žabkar, J., Learning to Control a Quadcopter Qualitatively, J Intell Robot Syst 100, 1097–1110 (2020) DOI: 10.1007/s10846-020-01228-7.

Qualitative modeling allows autonomous agents to learn comprehensible control models, formulated in a way that is close to human intuition. By abstracting away certain numerical information, qualitative models can provide better insights into operating principles of a dynamic system in comparison to traditional numerical models. We show that qualitative models, learned from numerical traces, contain enough information to allow motion planning and path following. We demonstrate our methods on the task of flying a quadcopter. A qualitative control model is learned through motor babbling. Training is significantly faster than training times reported in papers using reinforcement learning with similar quadcopter experiments. A qualitative collision-free trajectory is computed by means of qualitative simulation, and executed reactively while dynamically adapting to numerical characteristics of the system. Experiments have been conducted and assessed in the V-REP robotic simulator.

Posted in: Robot motion planning, Tagged: Quadcopters, Qualitative modelling, Reinforcement learning

Using abstraction of dimensions in RRT motion planning

June 29, 2023 10:22, Juan-Antonio Fernández-Madrigal

Xanthidis, M., Esposito, J.M., Rekleitis, I. et al., Motion Planning by Sampling in Subspaces of Progressively Increasing Dimension, J Intell Robot Syst 100, 777–789 (2020) DOI: 10.1007/s10846-020-01217-w.

This paper introduces an enhancement to traditional sampling-based planners, resulting in efficiency increases for high-dimensional holonomic systems such as hyper-redundant manipulators, snake-like robots, and humanoids. Despite the performance advantages of modern sampling-based motion planners, solving high dimensional planning problems in near real-time remains a considerable challenge. The proposed enhancement to popular sampling-based planning algorithms is aimed at circumventing the exponential dependence on dimensionality, by progressively exploring lower dimensional volumes of the configuration space. Extensive experiments comparing the enhanced and traditional version of RRT, RRT-Connect, and Bidirectional T-RRT on both a planar hyper-redundant manipulator and the Baxter humanoid robot show significant acceleration, up to two orders of magnitude, on computing a solution. We also explore important implementation issues in the sampling process and discuss the limitations of this method.
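
A very simplified picture of sampling in subspaces of increasing dimension is given below. It assumes one particular way of defining the subspaces, namely letting only the first `free_dims` joints vary while the rest stay pinned to the start configuration, and a fixed number of samples per level; both choices, and the function names, are assumptions for the example and may differ from the construction and schedule used in the paper.

```python
import random


def sample_in_subspace(start, bounds, free_dims, rng):
    """Sample a configuration in which only the first `free_dims` joints vary.

    The remaining joints are pinned to their values in `start`, so the
    sample lies in a `free_dims`-dimensional slice of configuration space.
    """
    sample = list(start)
    for i in range(free_dims):
        low, high = bounds[i]
        sample[i] = rng.uniform(low, high)
    return sample


def progressive_samples(start, bounds, samples_per_level, rng):
    """Yield samples from slices of dimension 1, 2, ..., len(start)."""
    for free_dims in range(1, len(start) + 1):
        for _ in range(samples_per_level):
            yield free_dims, sample_in_subspace(start, bounds, free_dims, rng)


rng = random.Random(0)
start = [0.0, 0.0, 0.0, 0.0]
bounds = [(-3.14, 3.14)] * 4
batch = list(progressive_samples(start, bounds, samples_per_level=3, rng=rng))
```

A sampling-based planner such as RRT would draw its random configurations from a generator like this one, so that early iterations search low-dimensional slices and the full configuration space is only explored if those slices do not contain a solution.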

Posted in: Robot motion planning, Tagged: Abstraction, Dimensionality reduction, RRT
