Category Archives: Robotics

Making RL safer by first learning what a safe situation is

K. Fan, Z. Chen, G. Ferrigno and E. D. Momi, Learn From Safe Experience: Safe Reinforcement Learning for Task Automation of Surgical Robot, IEEE Transactions on Artificial Intelligence, vol. 5, no. 7, pp. 3374-3383, July 2024 DOI: 10.1109/TAI.2024.3351797.

Surgical task automation in robotics can improve the outcomes, reduce quality-of-care variance among surgeons and relieve surgeons’ fatigue. Reinforcement learning (RL) methods have shown considerable performance in robot autonomous control in complex environments. However, the existing RL algorithms for surgical robots do not consider any safety requirements, which is unacceptable in automating surgical tasks. In this work, we propose an approach called safe experience reshaping (SER) that can be integrated into any offline RL algorithm. First, the method identifies and learns the geometry of constraints. Second, a safe experience is obtained by projecting an unsafe action to the tangent space of the learned geometry, which means that the action is in the safe space. Then, the collected safe experiences are used for safe policy training. We designed three tasks that closely resemble real surgical tasks including 2-D cutting tasks and a contact-rich debridement task in 3-D space to evaluate the safe RL framework. We compare our framework to five state-of-the-art (SOTA) RL methods including reward penalty and primal-dual methods. Results show that our framework gets a lower rate of constraint violations and better performance in task success, especially with a higher convergence speed.
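As a rough illustration of the projection step mentioned in the abstract, the following minimal sketch removes the component of an action that points along the normal of a constraint surface, leaving only its tangential part. It assumes the constraint is a single smooth surface g(x) = 0 whose gradient is known analytically; the paper learns the geometry from data, which this sketch does not attempt. The function name `project_to_tangent` and the unit-circle constraint are hypothetical choices made only for this example.

```python
import numpy as np


def project_to_tangent(action, constraint_grad, eps=1e-12):
    """Remove the component of `action` along the constraint normal.

    `constraint_grad` is the gradient of a (hypothetical) constraint
    function g at the current state; it is normal to the surface g = 0.
    """
    a = np.asarray(action, dtype=float)
    n = np.asarray(constraint_grad, dtype=float)
    norm_sq = float(n @ n)
    if norm_sq < eps:
        # Degenerate gradient: no usable normal, so return the action unchanged.
        return a.copy()
    return a - (a @ n) / norm_sq * n


# Hypothetical constraint: stay on the unit circle, g(x) = x0^2 + x1^2 - 1 = 0.
state = np.array([1.0, 0.0])
grad_g = 2.0 * state                  # gradient of g evaluated at `state`
unsafe_action = np.array([0.5, 0.3])  # has a component pointing off the circle
safe_action = project_to_tangent(unsafe_action, grad_g)
print(safe_action)                    # tangential part of the action
```

The projected action is orthogonal to the constraint gradient, so to first order it does not move the state off the surface.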

A new software design, verification and implementation method for robotics

Li, W., Ribeiro, P., Miyazawa, A. et al., Formal design, verification and implementation of robotic controller software via RoboChart and RoboTool, Auton Robot 48, 14 (2024) DOI: 10.1007/s10514-024-10163-7.

Current practice in simulation and implementation of robot controllers is usually undertaken with guidance from high-level design diagrams and pseudocode. Thus, no rigorous connection between the design and the development of a robot controller is established. This paper presents a framework for designing robotic controllers with support for automatic generation of executable code and automatic property checking. A state-machine based notation, RoboChart, and a tool (RoboTool) that implements the automatic generation of code and mathematical models from the designed controllers are presented. We demonstrate the application of RoboChart and its related tool through a case study of a robot performing an exploration task. The automatically generated code is platform independent and is used in both simulation and two different physical robotic platforms. Properties are formally checked against the mathematical models generated by RoboTool, and further validated in the actual simulations and physical experiments. The tool not only provides engineers with a way of designing robotic controllers formally but also paves the way for correct implementation of robotic systems.

Profiling the energy consumption of AGVs

J. Leng, J. Peng, J. Liu, Y. Zhang, J. Ji and Y. Zhang, Profiling Power Consumption in Low-Speed Autonomous Guided Vehicles, IEEE Robotics and Automation Letters, vol. 9, no. 7, pp. 6027-6034, July 2024 DOI: 10.1109/LRA.2024.3396051.

The increasing demand for automation has led to a rise in the use of low-speed autonomous guided vehicles (AGVs). However, AGVs rely on batteries for their power source, which limits their operational time and affects their overall performance. To optimize their energy usage and enhance their battery life, it is crucial to understand the power consumption behavior of AGVs. This letter presents a comprehensive study on profiling power consumption in low-speed AGVs. The previous power consumption estimation models for AGVs were mostly based on physical formulas. We introduce a data-driven power consumption estimation model for each of the main components of the AGV, including the chassis, computing platform, sensors and communication devices. By conducting three actual driving tests, we show that the MAPE in estimating instantaneous power is 4.8%, a significant 8.1% improvement compared to using a physical model. Moreover, the MAPE for energy consumption is only 1.5%, which is 6.6% better than the physical model. To demonstrate the utility of our power consumption estimation models, we conduct two case studies – one is energy-efficient path planning and the other is energy-efficient perception task interval adjustment. This study demonstrates that integrating the power consumption estimation model into path planning reduces energy consumption by over 12%. Additionally, adjusting the detection interval lowers computational energy consumption by 10.1%.
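For readers unfamiliar with the metric quoted above, MAPE (mean absolute percentage error) is the average of |actual - predicted| / |actual|, expressed as a percentage. The sketch below computes it for a few made-up power readings; the numbers and the names `measured_w` and `estimated_w` are illustrative assumptions, not data from the letter.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Every entry of `actual` must be nonzero, since each error is
    expressed relative to the measured value.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Made-up instantaneous power samples in watts (not from the paper).
measured_w = [100.0, 200.0, 400.0]
estimated_w = [110.0, 190.0, 400.0]
error_pct = mape(measured_w, estimated_w)
print(f"MAPE: {error_pct:.2f}%")
```

Because each error is divided by the measured value, MAPE is scale-free, which makes it convenient for comparing instantaneous power and accumulated energy on the same footing.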

Interesting (simulated) test bed for quadrotors

Júnio Santos Bulhões, Cristiane Lopes Martins, Cristian Hansen, Márcio Rodrigues da Cunha Reis, Alana da Silva Magalhães, Antonio Paulo Coimbra, Wesley Pacheco Calixto, Platform and simulator with three degrees of freedom for testing quadcopters, Robotics and Autonomous Systems, Volume 176, 2024 DOI: 10.1016/j.robot.2024.104682.

This study aims to design a test platform for quadcopters, which allows the execution of all rotational movements and prevents translational movements without affecting the dynamics of the system. The methodological approach involves both simulation and the construction of the test platform. Two simulators are developed: (i) a linear simulator, used to assist in determining control parameters, and (ii) a nonlinear simulator, used to model the nonlinearity inherent to the rotational behavior of aircraft. In addition, the control system for the quadcopter is implemented, utilizing proportional, integral, and derivative control principles. By conducting seven experiments on the test platform and in the nonlinear simulator, the obtained results are compared in order to validate the proposed methodology. The mean discrepancy observed between the mean absolute difference obtained by the test platform and by the nonlinear simulator for the angle ϕ was 0.85°, for the angle θ was 2.77°, and for the angle ψ was 4.66°. When analyzed separately, the mean absolute errors for the angles, using the nonlinear simulator and the test platform, showed differences below 2% in almost all evaluated experiments. The developed test platform preserves the rotational dynamics of the quadcopter as desired, closely approaching the results obtained by the nonlinear simulator. Consequently, this platform can be used to carry out practical tests in a controlled environment.
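The abstract says the quadcopter controller uses proportional, integral, and derivative principles. A minimal discrete-time PID for a single attitude angle could look like the sketch below. The gains, the time step, and the class name `PID` are assumptions chosen for illustration; the paper's actual controller structure and tuning are not given in the abstract.

```python
class PID:
    """Textbook discrete PID controller acting on an error signal."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first sample

    def update(self, error, dt):
        # Accumulate the integral with a simple rectangular rule.
        self.integral += error * dt
        # Backward-difference derivative; zero until a previous sample exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical gains and a 0.1 s control period for a roll-angle loop.
roll_pid = PID(kp=2.0, ki=0.5, kd=0.1)
u1 = roll_pid.update(error=1.0, dt=0.1)  # first sample: P and I terms only
u2 = roll_pid.update(error=0.5, dt=0.1)  # shrinking error adds a negative D term
print(u1, u2)
```

A practical attitude loop would typically add output saturation and integrator anti-windup, which are omitted here for brevity.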

Interesting improvements in MC localization

Alireza Mohseni, Vincent Duchaine, Tony Wong, Improvement in Monte Carlo localization using information theory and statistical approaches, Engineering Applications of Artificial Intelligence, Volume 131, 2024 DOI: 10.1016/j.engappai.2024.107897.

Monte Carlo localization methods deploy a particle filter to resolve a hidden Markov process based on recursive Bayesian estimation, which approximates the internal states of a dynamic system given observation data. When the observed data are corrupted by outliers, the particle filter’s performance may deteriorate, preventing the algorithm from accurately computing dynamic system states such as a robot’s position, which in turn reduces the accuracy of the localization and navigation. In this paper, the notion of information entropy is used to identify outliers. Then, a probability-based approach is used to remove the discovered outliers. In addition, a new mutation process is added to the localization algorithm to exploit the posterior probability density function in order to actively detect the high-likelihood region. The goal of incorporating the mutation operator into this method is to solve the problem of algorithm impoverishment which is due to insufficient representation of the complete probability density function. Simulation experiments are used to confirm the effectiveness of the proposed techniques. They also are employed to predict the remaining viability of a lithium-ion battery. Furthermore, in an experimental study, the modified Monte Carlo localization algorithm was applied to a mobile robot to demonstrate the local planner’s improved accuracy. The test results indicate that developed techniques are capable of effectively capturing the dynamic behavior of a system and accurately tracking its characteristics.
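The abstract does not spell out how entropy is applied to outlier detection, so the sketch below only shows the underlying quantity: the Shannon entropy of a set of normalized particle weights. A value near log(N) means the weight is spread evenly over N particles, while a value near zero means a few particles carry almost all of it, which is the impoverishment problem the abstract mentions. The function name `weight_entropy` is a hypothetical choice for this example.

```python
import math


def weight_entropy(weights):
    """Shannon entropy (in nats) of particle weights after normalization."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    probs = [w / total for w in weights]
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


uniform = weight_entropy([1.0, 1.0, 1.0, 1.0])    # evenly spread weights
collapsed = weight_entropy([1.0, 0.0, 0.0, 0.0])  # one particle holds all weight
print(uniform, collapsed)
```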

Learning how to reset the episode in RL

S. -H. Lee and S. -W. Seo, Self-Supervised Curriculum Generation for Autonomous Reinforcement Learning Without Task-Specific Knowledge, IEEE Robotics and Automation Letters, vol. 9, no. 5, pp. 4043-4050, May 2024 DOI: 10.1109/LRA.2024.3375714.

A significant bottleneck in applying current reinforcement learning algorithms to real-world scenarios is the need to reset the environment between every episode. This reset process demands substantial human intervention, making it difficult for the agent to learn continuously and autonomously. Several recent works have introduced autonomous reinforcement learning (ARL) algorithms that generate curricula for jointly training reset and forward policies. While their curricula can reduce the number of required manual resets by taking into account the agent’s learning progress, they rely on task-specific knowledge, such as predefined initial states or reset reward functions. In this paper, we propose a novel ARL algorithm that can generate a curriculum adaptive to the agent’s learning progress without task-specific knowledge. Our curriculum empowers the agent to autonomously reset to diverse and informative initial states. To achieve this, we introduce a success discriminator that estimates the success probability from each initial state when the agent follows the forward policy. The success discriminator is trained with relabeled transitions in a self-supervised manner. Our experimental results demonstrate that our ARL algorithm can generate an adaptive curriculum and enable the agent to efficiently bootstrap to solve sparse-reward maze navigation and manipulation tasks, outperforming baselines with significantly fewer manual resets.

Networked differential telerobot remotely controlled in spite of disturbances and delays

Luca Nanu, Luigi Colangelo, Carlo Novara, Carlos Perez Montenegro, Embedded model control of networked control systems: An experimental robotic application, Mechatronics, Volume 99, 2024 DOI: 10.1016/j.mechatronics.2024.103160.

In a Networked Control System (NCS), the absence of physical communication links in the loop leads to relevant issues, such as measurement delays and asynchronous execution of the control commands. In general, these issues may significantly compromise the performance of the NCS, possibly causing unstable behaviours. This paper presents an original approach to the design of a complete digital control unit for a system characterized by a varying sampling time and asynchronous command execution. The approach is based on the Embedded Model Control (EMC) methodology, whose key feature is the estimation of the disturbances, errors and nonlinearities affecting the plant to control and their online cancellation. In this way, measurement delays and execution asynchronicity are treated as errors and rejected up to a given frequency by the EMC unit. The effectiveness of the proposed approach is demonstrated in a real-world case-study, where the NCS consists of a differential-drive mobile robot (the plant) and a control unit, and the two subsystems communicate through the web without physical connection links. After a preliminary verification using a high-fidelity numerical simulator, the designed controller is validated in several experimental tests, carried out on a real-time embedded system incorporated in the robotic platform.

Improving EKF and UKF through adaptive covariances when sensors of diverse precision are used for localization

Giseo Park, Optimal vehicle position estimation using adaptive unscented Kalman filter based on sensor fusion, Mechatronics, Volume 99, 2024 DOI: 10.1016/j.mechatronics.2024.103144.

Precise position recognition systems are actively used in various automotive technology fields such as autonomous vehicles, intelligent transportation systems, and vehicle driving safety systems. In line with this demand, this paper proposes a new vehicle position estimation algorithm based on sensor fusion between low-cost standalone global positioning system (GPS) and inertial measurement unit (IMU) sensors. In order to estimate accurate vehicle position information using two complementary sensor types, adaptive unscented Kalman filter (AUKF), an optimal state estimation algorithm, is applied to the vehicle kinematic model. Since this AUKF includes an adaptive covariance matrix whose value changes under GPS outage conditions, it has high estimation robustness even if the accuracy of the GPS measurement signal is low. Comparison of estimation errors with both the extended Kalman filter (EKF) and the UKF, which are widely used state estimation algorithms, confirms the improved estimation performance of the proposed AUKF algorithm in real-vehicle experiments. The given test course includes roads of various shapes as well as GPS outage sections, so it is suitable for evaluating vehicle position estimation performance.
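To make the idea of an adaptive covariance concrete, the sketch below uses a scalar linear Kalman measurement update (not the unscented filter from the paper) and inflates the measurement noise variance when a GPS outage is flagged. The inflation factor, the outage flag, and the function names are assumptions for illustration only; the paper's actual adaptation rule is not described in the abstract.

```python
def kalman_update(x, p, z, r):
    """Scalar Kalman measurement update.

    x, p: prior state estimate and its variance
    z, r: measurement and its noise variance
    """
    k = p / (p + r)  # Kalman gain
    return x + k * (z - x), (1.0 - k) * p


def adaptive_r(r_nominal, gps_outage, inflation=100.0):
    """Hypothetical rule: trust GPS far less while an outage is flagged."""
    return r_nominal * inflation if gps_outage else r_nominal


x0, p0 = 0.0, 1.0  # prior position estimate and variance
z_gps = 1.0        # GPS position reading

x_ok, p_ok = kalman_update(x0, p0, z_gps, adaptive_r(1.0, gps_outage=False))
x_out, p_out = kalman_update(x0, p0, z_gps, adaptive_r(1.0, gps_outage=True))
print(x_ok, x_out)  # the outage case stays much closer to the prior
```

With the inflated variance the gain drops, so an unreliable GPS reading barely moves the estimate and the filter relies mostly on its prediction.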

Graph NNs in RL for improving sample efficiency

Feng Zhang, Chengbin Xuan, Hak-Keung Lam, An obstacle avoidance-specific reinforcement learning method based on fuzzy attention mechanism and heterogeneous graph neural networks, Engineering Applications of Artificial Intelligence, Volume 130, 2024 DOI: 10.1016/j.engappai.2023.107764.

Deep reinforcement learning (RL) is an advancing learning tool to handle robotics control problems. However, it typically suffers from poor sample efficiency and effectiveness. The emergence of Graph Neural Networks (GNNs) enables the integration of the RL and graph representation learning techniques. It realises outstanding training performance and transfer capability by forming controlling scenarios into the corresponding graph domain. Nevertheless, the existing approaches strongly depend on the artificial graph formation processes with intensive bias and cannot propagate messages discriminatively on explicit physical dependence, which leads to restricted flexibility, size transfer capability and suboptimal performance. This paper proposes a fuzzy attention mechanism-based heterogeneous graph neural network (FAM-HGNN) framework for resolving the control problem under the RL context. FAM emphasises the significant connections and weakens the trivial connections in a fully connected graph, which mitigates the potential negative influence caused by the artificial graph formation process. HGNN obtains a higher level of relational inductive bias by conducting graph propagations on a masked graph. Experimental results show that our FAM-HGNN outperforms the multi-layer perceptron-based and the existing GNN-based RL approaches regarding training performance and size transfer capability. We also conducted an ablation study and sensitivity analysis to validate the efficacy of the proposed method further.

A review of state-of-the-art path planning methods applied to autonomous driving

Mohamed Reda, Ahmed Onsy, Amira Y. Haikal, Ali Ghanbari, Path planning algorithms in the autonomous driving system: A comprehensive review, Robotics and Autonomous Systems, Volume 174, 2024 DOI: 10.1016/j.robot.2024.104630.

This comprehensive review focuses on the Autonomous Driving System (ADS), which aims to reduce human errors that are the reason for about 95% of car accidents. The ADS consists of six stages: sensors, perception, localization, assessment, path planning, and control. We explain the main state-of-the-art techniques used in each stage, analyzing 275 papers, with 162 specifically on path planning due to its complexity, NP-hard optimization nature, and pivotal role in ADS. This paper categorizes path planning techniques into three primary groups: traditional (graph-based, sampling-based, gradient-based, optimization-based, interpolation curve algorithms), machine and deep learning, and meta-heuristic optimization, detailing their advantages and drawbacks. Findings show that meta-heuristic optimization methods, representing 23% of our study, are preferred for being general problem solvers capable of handling complex problems. In addition, they have faster convergence and reduced risk of local minima. Machine and deep learning techniques, accounting for 25%, are favored for their learning capabilities and fast responses to known scenarios. The trend towards hybrid algorithms (27%) combines various methods, merging each algorithm’s benefits and overcoming the other’s drawbacks. Moreover, adaptive parameter tuning is crucial to enhance efficiency, applicability, and balancing the search capability. This review sheds light on the future of path planning in autonomous driving systems, helping to tackle current challenges and unlock the full capabilities of autonomous vehicles.