Tag Archives: Behaviour-based Architectures

RL to learn the coordination of different goals in autonomous driving

J. Liu, J. Yin, Z. Jiang, Q. Liang and H. Li, Attention-Based Distributional Reinforcement Learning for Safe and Efficient Autonomous Driving, IEEE Robotics and Automation Letters, vol. 9, no. 9, pp. 7477-7484, Sept. 2024. DOI: 10.1109/LRA.2024.3427551.

Autonomous driving vehicles play a critical role in intelligent transportation systems and have garnered considerable attention. Currently, the popular approach in autonomous driving systems is to design separate optimal objectives for each independent module. Therefore, a major concern arises from the fact that these diverse optimal objectives may have an impact on the final driving policy. However, reinforcement learning provides a promising solution to tackle the challenge through joint training and its exploration ability. This letter aims to develop a safe and efficient reinforcement learning approach with advanced features for autonomous navigation in urban traffic scenarios. Firstly, we develop a novel distributional reinforcement learning method that integrates an implicit distribution model into an actor-critic framework. Subsequently, we introduce a spatial attention module to capture interaction features between the ego vehicle and other traffic vehicles, and design a temporal attention module to extract the long-term sequential feature. Finally, we utilize bird’s-eye-view as a context-aware representation of traffic scenarios, fused by the above spatio-temporal features. To validate our approach, we conduct experiments on the NoCrash and CoRL benchmarks, especially on our closed-loop openDD scenarios. The experimental results demonstrate the impressive performance of our approach in terms of convergence and stability compared to the baselines.
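The paper does not publish code, but the spatial attention module it describes follows a standard pattern: the ego vehicle's feature vector acts as the query and the surrounding traffic vehicles' features act as keys and values in a scaled dot-product attention. A minimal sketch of that pattern (all names and the 2-D toy features are illustrative, not from the paper):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def spatial_attention(ego, vehicles):
    """Scaled dot-product attention: the ego feature is the query,
    traffic-vehicle features are keys and values. Returns the
    attention-weighted interaction feature."""
    d = len(ego)
    scores = [sum(q * k for q, k in zip(ego, v)) / math.sqrt(d)
              for v in vehicles]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, vehicles))
            for i in range(d)]

ego = [1.0, 0.0]                       # hypothetical ego-vehicle feature
vehicles = [[1.0, 0.0], [0.0, 1.0]]    # hypothetical nearby-vehicle features
feat = spatial_attention(ego, vehicles)
```

Vehicles whose features align with the ego's receive larger attention weights, so the fused feature emphasises the interactions most relevant to the ego vehicle; the paper's temporal attention module applies the same idea over a sequence of past frames.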

A hierarchical robot control architecture that supports learning of skills at different levels through “curriculum learning” and an interesting approach to mixing behaviours

Suro, F., Ferber, J., Stratulat, T. et al., A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents, Auton Robot 45, 245–264 (2021) DOI: 10.1007/s10514-020-09960-7.

One of the challenging aspects of open ended or lifelong agent development is that the final behaviour for which an agent is trained at a given moment can be an element for the future creation of one, or even several, behaviours of greater complexity, whose purpose cannot be anticipated. In this paper, we present modular influence network design (MIND), an artificial agent control architecture suited to open ended and cumulative learning. The MIND architecture encapsulates sub behaviours into modules and combines them into a hierarchy reflecting the modular and hierarchical nature of complex tasks. Compared to similar research, the main original aspect of MIND is the multi layered hierarchy using a generic control signal, the influence, to obtain an efficient global behaviour. This article shows the ability of MIND to learn a curriculum of independent didactic tasks of increasing complexity covering different aspects of a desired behaviour. In so doing we demonstrate the contributions of MIND to open-ended development: encapsulation into modules allows for the preservation and re-usability of all the skills acquired during the curriculum and their focused retraining, the modular structure serves the evolving topology by easing the coordination of new sensors, actuators and heterogeneous learning structures.
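One simplified reading of MIND's generic "influence" signal is that each module proposes an action together with a scalar influence, and inner nodes of the hierarchy merge child proposals weighted by those influences. The sketch below illustrates that reading only; the class names, the scalar-action assumption, and the weighted-average merge rule are assumptions, not the paper's actual design:

```python
class Module:
    """Leaf behaviour: proposes an actuator value with a given influence."""
    def __init__(self, action, influence):
        self.action = action
        self.influence = influence

    def output(self, percept=None):
        return self.action, self.influence

class Combiner:
    """Inner node: merges child proposals, weighting each by its
    influence, and exposes its own influence to its parent, so
    combiners can themselves be nested into deeper hierarchies."""
    def __init__(self, children, influence=1.0):
        self.children = children
        self.influence = influence

    def output(self, percept=None):
        pairs = [c.output(percept) for c in self.children]
        total = sum(inf for _, inf in pairs)
        if total == 0.0:
            return 0.0, self.influence
        action = sum(a * inf for a, inf in pairs) / total
        return action, self.influence

# Hypothetical mix: an "avoid" module (action -1.0) dominates a
# "go-to-goal" module (action +1.0) because its influence is higher.
root = Combiner([Module(+1.0, 0.2), Module(-1.0, 0.8)])
action, _ = root.output()
```

Because each sub-behaviour stays encapsulated in its own module, a skill learned early in the curriculum can later be reused unchanged as a child of a new combiner, which is the reusability property the abstract emphasises.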

How “behaviour trees” generalize the subsumption architecture and some other control architecture frameworks

M. Colledanchise and P. Ögren, “How Behavior Trees Modularize Hybrid Control Systems and Generalize Sequential Behavior Compositions, the Subsumption Architecture, and Decision Trees,” in IEEE Transactions on Robotics, vol. 33, no. 2, pp. 372-389, April 2017. DOI: 10.1109/TRO.2016.2633567.

Behavior trees (BTs) are a way of organizing the switching structure of a hybrid dynamical system (HDS), which was originally introduced in the computer game programming community. In this paper, we analyze how the BT representation increases the modularity of an HDS and how key system properties are preserved over compositions of such systems, in terms of combining two BTs into a larger one. We also show how BTs can be seen as a generalization of sequential behavior compositions, the subsumption architecture, and decision trees. These three tools are powerful but quite different, and the fact that they are unified in a natural way in BTs might be a reason for their popularity in the gaming community. We conclude the paper by giving a set of examples illustrating how the proposed analysis tools can be applied to robot control BTs.
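The generalization of the subsumption architecture is easy to see in code: a Fallback (selector) node ticks its children in priority order, so a higher-priority behaviour that becomes applicable pre-empts the ones to its right, exactly like a subsuming layer. A minimal sketch (the concrete behaviours and state dictionary are illustrative, not from the paper):

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Action:
    """Leaf node wrapping a callable state -> status."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self, state):
        return self.fn(state)

class Sequence:
    """Ticks children left to right; returns on the first
    non-SUCCESS child (logical AND of behaviours)."""
    def __init__(self, *children):
        self.children = children
    def tick(self, state):
        for c in self.children:
            r = c.tick(state)
            if r != SUCCESS:
                return r
        return SUCCESS

class Fallback:
    """Ticks children left to right; returns on the first
    non-FAILURE child. Higher-priority children pre-empt lower
    ones, mirroring a subsumption layer."""
    def __init__(self, *children):
        self.children = children
    def tick(self, state):
        for c in self.children:
            r = c.tick(state)
            if r != FAILURE:
                return r
        return FAILURE

def set_cmd(name):
    def act(state):
        state["cmd"] = name
        return SUCCESS
    return act

# Obstacle avoidance subsumes goal seeking whenever an obstacle is seen.
tree = Fallback(
    Sequence(Action(lambda s: SUCCESS if s["obstacle"] else FAILURE),
             Action(set_cmd("avoid"))),
    Action(set_cmd("go_to_goal")),
)
```

Each subtree is itself a valid BT, which is the modularity property the paper analyses: two trees compose into a larger one simply by making one a child of the other.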

Robots that pre-compute a number of possible behaviours (in simulation) and then learn their actual performance with them (propagating those performance measures to similar behaviours through Gaussian process regression), selecting the best one in each situation (through Bayesian optimization), thus coping with varying environments and damage to the robot

A. Cully, et al. Robots that can adapt like animals, Nature, 521 (2015), pp. 503–507, DOI: 10.1038/nature14422.

Robots have transformed many industries, most notably manufacturing, and have the power to deliver tremendous benefits to society, such as in search and rescue, disaster response, health care and transportation. They are also invaluable tools for scientific exploration in environments inaccessible to humans, from distant planets to deep oceans. A major obstacle to their widespread adoption in more complex environments outside factories is their fragility. Whereas animals can quickly adapt to injuries, current robots cannot think outside the box to find a compensatory behaviour when they are damaged: they are limited to their pre-specified self-sensing abilities, can diagnose only anticipated failure modes, and require a pre-programmed contingency plan for every type of potential damage, an impracticality for complex robots. A promising approach to reducing robot fragility involves having robots learn appropriate behaviours in response to damage, but current techniques are slow even with small, constrained search spaces. Here we introduce an intelligent trial-and-error algorithm that allows robots to adapt to damage in less than two minutes in large search spaces without requiring self-diagnosis or pre-specified contingency plans. Before the robot is deployed, it uses a novel technique to create a detailed map of the space of high-performing behaviours. This map represents the robot’s prior knowledge about what behaviours it can perform and their value. When the robot is damaged, it uses this prior knowledge to guide a trial-and-error learning algorithm that conducts intelligent experiments to rapidly discover a behaviour that compensates for the damage. Experiments reveal successful adaptations for a legged robot injured in five different ways, including damaged, broken, and missing legs, and for a robotic arm with joints broken in 14 different ways. This new algorithm will enable more robust, effective, autonomous robots, and may shed light on the principles that animals use to adapt to injury.
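The adaptation loop can be caricatured in a few lines: start from a precomputed map of behaviour descriptors and their simulated performances, run a real trial on the most promising behaviour, propagate the observed residual to similar behaviours via a kernel, and pick the next trial by prediction plus an exploration bonus. The kernel-weighted update below is a heavily simplified stand-in for the paper's Gaussian process regression, and the 1-D descriptors, values, and class names are all hypothetical:

```python
import math

def rbf(x, y, length=1.0):
    """Similarity between two behaviour descriptors."""
    return math.exp(-((x - y) ** 2) / (2 * length ** 2))

class MapBasedAdaptation:
    """Trial-and-error adaptation over a precomputed behaviour map.
    `prior` maps a 1-D behaviour descriptor to its simulated
    performance; real trials shift predictions at similar descriptors."""
    def __init__(self, prior, beta=0.1):
        self.prior = dict(prior)
        self.trials = []   # (descriptor, observed performance)
        self.beta = beta   # exploration weight (UCB-style)

    def predict(self, x):
        mean = self.prior[x]
        for xt, perf in self.trials:
            # Propagate the trial's residual to similar behaviours.
            mean += rbf(x, xt) * (perf - self.prior[xt])
        return mean

    def uncertainty(self, x):
        if not self.trials:
            return 1.0
        return 1.0 - max(rbf(x, xt) for xt, _ in self.trials)

    def next_trial(self):
        return max(self.prior,
                   key=lambda x: self.predict(x)
                   + self.beta * self.uncertainty(x))

# Hypothetical map: descriptor -> simulated (pre-damage) performance.
prior = {0.0: 0.9, 1.0: 0.8, 2.0: 0.4}
adapt = MapBasedAdaptation(prior)
first = adapt.next_trial()           # best behaviour from simulation
adapt.trials.append((first, 0.1))    # damage: it performs poorly for real
second = adapt.next_trial()          # search moves to another behaviour
```

A poor real-world trial drags down the predictions of similar behaviours as well, so the search quickly jumps to a qualitatively different region of the map instead of retrying near-identical behaviours, which is what makes adaptation fast.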

Model checking to verify the correct functioning, in the presence of sensor failures, of a network of behaviours included in a robotic architecture

Lisa Kiekbusch, Christopher Armbrust, Karsten Berns, Formal verification of behaviour networks including sensor failures, Robotics and Autonomous Systems, Volume 74, Part B, December 2015, Pages 331-339, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.08.002.

The paper deals with the problem of verifying behaviour-based control systems. Although failures in sensor hardware and software can have strong influences on the robot’s operation, they are often neglected in the verification process. Instead, perfect sensing is assumed. Therefore, this paper provides an approach for modelling the sensor chain in a formal way and connecting it to the formal model of the control system. The resulting model can be verified using model checking techniques, which is shown on the examples of the control systems of an autonomous indoor robot and an autonomous off-road robot.
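The core idea of including the sensor chain in the formal model can be shown in miniature: enumerate every combination of sensor states (including failure), evaluate the behaviour network in each, and check that a safety property holds in all reachable states, the way an explicit-state model checker would. The toy network below and the property "never drive while the obstacle sensor has failed" are illustrative assumptions, not the paper's actual models or tool (the authors use full model checking techniques on real control systems):

```python
from itertools import product

def network(sensor_ok, sensor_reports_obstacle):
    """A toy behaviour network: each behaviour's activity is a boolean
    function of the sensor state and the other behaviours."""
    obstacle_detected = sensor_ok and sensor_reports_obstacle
    avoid = obstacle_detected
    # Safe design: driving is inhibited both by active avoidance
    # and by a failed sensor.
    drive = (not avoid) and sensor_ok
    return {"avoid": avoid, "drive": drive}

def check_safety():
    """Exhaustively enumerate all sensor states and verify the
    property: the robot never drives while its sensor has failed.
    Returns True if no counterexample exists."""
    for sensor_ok, reports in product([True, False], repeat=2):
        state = network(sensor_ok, reports)
        if state["drive"] and not sensor_ok:
            return False  # counterexample found
    return True
```

If the modeller had assumed perfect sensing, the `not sensor_ok` states would never be explored and an unsafe design would pass the check silently; making the sensor chain part of the model is exactly what exposes such failures.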