Category Archives: Robotics

Graph NNs in RL for improving sample efficiency

Feng Zhang, Chengbin Xuan, Hak-Keung Lam, An obstacle avoidance-specific reinforcement learning method based on fuzzy attention mechanism and heterogeneous graph neural networks, Engineering Applications of Artificial Intelligence, Volume 130, 2024 DOI: 10.1016/j.engappai.2023.107764.

Deep reinforcement learning (RL) is an advancing learning tool to handle robotics control problems. However, it typically suffers from sample efficiency and effectiveness. The emergence of Graph Neural Networks (GNNs) enables the integration of the RL and graph representation learning techniques. It realises outstanding training performance and transfer capability by forming controlling scenarios into the corresponding graph domain. Nevertheless, the existing approaches strongly depend on the artificial graph formation processes with intensive bias and cannot propagate messages discriminatively on explicit physical dependence, which leads to restricted flexibility, size transfer capability and suboptimal performance. This paper proposes a fuzzy attention mechanism-based heterogeneous graph neural network (FAM-HGNN) framework for resolving the control problem under the RL context. FAM emphasises the significant connections and weakening of the trivial connections in a fully connected graph, which mitigates the potential negative influence caused by the artificial graph formation process. HGNN obtains a higher level of relational inductive bias by conducting graph propagations on a masked graph. Experimental results show that our FAM-HGNN outperforms the multi-layer perceptron-based and the existing GNN-based RL approaches regarding training performance and size transfer capability. We also conducted an ablation study and sensitivity analysis to validate the efficacy of the proposed method further.

A review of state-of-the-art path planning methods applied to autonomous driving

Mohamed Reda, Ahmed Onsy, Amira Y. Haikal, Ali Ghanbari, Path planning algorithms in the autonomous driving system: A comprehensive review, Robotics and Autonomous Systems, Volume 174, 2024 DOI: 10.1016/j.robot.2024.104630.

This comprehensive review focuses on the Autonomous Driving System (ADS), which aims to reduce human errors that are the reason for about 95% of car accidents. The ADS consists of six stages: sensors, perception, localization, assessment, path planning, and control. We explain the main state-of-the-art techniques used in each stage, analyzing 275 papers, with 162 specifically on path planning due to its complexity, NP-hard optimization nature, and pivotal role in ADS. This paper categorizes path planning techniques into three primary groups: traditional (graph-based, sampling-based, gradient-based, optimization-based, interpolation curve algorithms), machine and deep learning, and meta-heuristic optimization, detailing their advantages and drawbacks. Findings show that meta-heuristic optimization methods, representing 23% of our study, are preferred for being general problem solvers capable of handling complex problems. In addition, they have faster convergence and reduced risk of local minima. Machine and deep learning techniques, accounting for 25%, are favored for their learning capabilities and fast responses to known scenarios. The trend towards hybrid algorithms (27%) combines various methods, merging each algorithm’s benefits and overcoming the other’s drawbacks. Moreover, adaptive parameter tuning is crucial to enhance efficiency, applicability, and balancing the search capability. This review sheds light on the future of path planning in autonomous driving systems, helping to tackle current challenges and unlock the full capabilities of autonomous vehicles.

Integrating symbolic (common sense) reasoning and probabilistic planning (POMDPs) in robots

Shiqi Zhang, Piyush Khandelwal, Peter Stone, iCORPP: Interleaved commonsense reasoning and probabilistic planning on robots, Robotics and Autonomous Systems, Volume 174, 2024 DOI: 10.1016/j.robot.2023.104613.

Robot sequential decision-making in the real world is a challenge because it requires the robots to simultaneously reason about the current world state and dynamics, while planning actions to accomplish complex tasks. On the one hand, declarative languages and reasoning algorithms support representing and reasoning with commonsense knowledge. But these algorithms are not good at planning actions toward maximizing cumulative reward over a long, unspecified horizon. On the other hand, probabilistic planning frameworks, such as Markov decision processes (MDPs) and partially observable MDPs (POMDPs), support planning to achieve long-term goals under uncertainty. But they are ill-equipped to represent or reason about knowledge that is not directly related to actions. In this article, we present an algorithm, called iCORPP, to simultaneously estimate the current world state, reason about world dynamics, and construct task-oriented controllers. In this process, robot decision-making problems are decomposed into two interdependent (smaller) subproblems that focus on reasoning to “understand the world” and planning to “achieve the goal” respectively. The developed algorithm has been implemented and evaluated both in simulation and on real robots using everyday service tasks, such as indoor navigation, and dialog management. Results show significant improvements in scalability, efficiency, and adaptiveness, compared to competitive baselines including handcrafted action policies.
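
To make the "estimate the current world state" part concrete, here is a minimal sketch of the textbook discrete POMDP belief update, b'(s') ∝ P(o | s') Σ_s P(s' | s, a) b(s). It is not the iCORPP algorithm itself; the function name `belief_update`, the dictionary layout, and the door example are illustrative assumptions.

```python
from typing import Dict

State = str


def belief_update(
    belief: Dict[State, float],
    transition: Dict[State, Dict[State, float]],
    obs_likelihood: Dict[State, float],
) -> Dict[State, float]:
    """One step of the standard Bayes-filter belief update for a fixed action.

    transition[s][s2] is P(s2 | s, a) and obs_likelihood[s2] is P(o | s2, a);
    both layouts are assumptions made for this sketch.
    """
    # Prediction: push the current belief through the transition model.
    predicted = {s2: 0.0 for s2 in belief}
    for s, p in belief.items():
        for s2, t in transition[s].items():
            predicted[s2] += t * p

    # Correction: weight each predicted state by the observation likelihood.
    unnormalized = {s2: obs_likelihood[s2] * predicted[s2] for s2 in predicted}
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the current belief")
    return {s2: v / total for s2, v in unnormalized.items()}


# Hypothetical example: is a door open or closed? The action does not change
# the door, and the sensor reports "open" correctly 80% of the time.
belief = {"open": 0.5, "closed": 0.5}
transition = {
    "open": {"open": 1.0, "closed": 0.0},
    "closed": {"open": 0.0, "closed": 1.0},
}
obs_likelihood = {"open": 0.8, "closed": 0.2}  # likelihood of observing "open"
new_belief = belief_update(belief, transition, obs_likelihood)
```

In this toy case the belief in "open" rises from 0.5 to about 0.8 after a single "open" reading. Per the abstract, the commonsense reasoning side is what supplies knowledge that such a planner cannot represent on its own.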

Correcting systematic and non-systematic errors in odometry

Bibiana Fariña, Jonay Toledo, Leopoldo Acosta, Improving odometric sensor performance by real-time error processing and variable covariance, Mechatronics, Volume 98, 2024 DOI: 10.1016/j.mechatronics.2023.103123.

This paper presents a new method to increase odometric sensor accuracy by systematic and non-systematic errors processing. Mobile robot localization is improved combining this technique with a filter that fuses the information from several sensors characterized by their covariance. The process focuses on calculating the odometric speed difference with respect to the filter to implement an error type detection module in real time. The correction of systematic errors consists in an online parameter adjustment using the previous information and conditioned by the filter accuracy. This data is also applied to design a variable odometric covariance which describes the sensor reliability and determines the influence of both errors on the robot localization. The method is implemented in a low-cost autonomous wheelchair with a LIDAR, IMU and encoders fused by an UKF algorithm. The experimental results prove that the estimated poses are closer to the real ones than using other well-known previous methods.
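
The abstract gives no formulas, so the following is only a rough sketch of the general idea: compare the odometric speed with the filter's speed estimate, guess the error type from the residuals, and inflate the odometric variance when they disagree. The decision rule, the thresholds, and the names `classify_residuals` and `variable_variance` are all hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import List


def classify_residuals(
    residuals: List[float],
    bias_threshold: float = 0.02,   # m/s, assumed value
    spike_threshold: float = 0.2,   # m/s, assumed value
) -> str:
    """Hypothetical rule of thumb, not the authors' detection module.

    residuals holds recent (odometric speed - filter speed) samples.
    A large isolated deviation is treated as a non-systematic error
    (e.g. wheel slip); a small but persistent offset is treated as a
    systematic error (e.g. a miscalibrated wheel radius).
    """
    if not residuals:
        return "nominal"
    if max(abs(r) for r in residuals) > spike_threshold:
        return "non-systematic"
    if abs(mean(residuals)) > bias_threshold:
        return "systematic"
    return "nominal"


def variable_variance(residual: float, base_var: float = 1e-4, gain: float = 1.0) -> float:
    """Assumed form: grow the odometric variance with the squared residual,
    so the fusion filter trusts odometry less when it disagrees."""
    return base_var + gain * residual ** 2


# Hypothetical residual windows in m/s.
steady_offset = classify_residuals([0.05, 0.05, 0.06, 0.04])
single_spike = classify_residuals([0.0, 0.5, 0.0, 0.0])
well_behaved = classify_residuals([0.001, -0.001, 0.0])
```

A filter such as the UKF mentioned in the abstract could then use the value returned by `variable_variance` as the odometry measurement noise at each step.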

Survey on methods for learning from demonstration in robotics

M. Tavassoli, S. Katyara, M. Pozzi, N. Deshpande, D. G. Caldwell and D. Prattichizzo, Learning Skills From Demonstrations: A Trend From Motion Primitives to Experience Abstraction, IEEE Transactions on Cognitive and Developmental Systems, vol. 16, no. 1, pp. 57-74, Feb. 2024 DOI: 10.1109/TCDS.2023.3296166.

The uses of robots are changing from static environments in factories to encompass novel concepts such as human–robot collaboration in unstructured settings. Preprogramming all the functionalities for robots becomes impractical, and hence, robots need to learn how to react to new events autonomously, just like humans. However, humans, unlike machines, are naturally skilled in responding to unexpected circumstances based on either experiences or observations. Hence, embedding such anthropoid behaviors into robots entails the development of neuro-cognitive models that emulate motor skills under a robot learning paradigm. Effective encoding of these skills is bound to the proper choice of tools and techniques. This survey paper studies different motion and behavior learning methods ranging from movement primitives (MPs) to experience abstraction (EA), applied to different robotic tasks. These methods are scrutinized and then experimentally benchmarked by reconstructing a standard pick-n-place task. Apart from providing a standard guideline for the selection of strategies and algorithms, this article aims to draw a perspective on their possible extensions and improvements.

Particle grid maps

G. Chen, W. Dong, P. Peng, J. Alonso-Mora and X. Zhu, Continuous Occupancy Mapping in Dynamic Environments Using Particles, IEEE Transactions on Robotics, vol. 40, pp. 64-84, 2024 DOI: 10.1109/TRO.2023.3323841.

Particle-based dynamic occupancy maps were proposed in recent years to model the obstacles in dynamic environments. Current particle-based maps describe the occupancy status in discrete grid form and suffer from the grid size problem, wherein a large grid size is unfavorable for motion planning while a small grid size lowers efficiency and causes gaps and inconsistencies. To tackle this problem, this article generalizes the particle-based map into continuous space and builds an efficient 3-D egocentric local map. A dual-structure subspace division paradigm, composed of a voxel subspace division and a novel pyramid-like subspace division, is proposed to propagate particles and update the map efficiently with the consideration of occlusions. The occupancy status at an arbitrary point in the map space can then be estimated with the weights of the particles. To reduce the noise in modeling static and dynamic obstacles simultaneously, an initial velocity estimation approach and a mixture model are utilized. Experimental results show that our map can effectively and efficiently model both dynamic obstacles and static obstacles. Compared to the state-of-the-art grid-form particle-based map, our map enables continuous occupancy estimation and substantially improves the mapping performance at different resolutions.

Offline RL in robotics

L. Yao, B. Zhao, X. Xu, Z. Wang, P. K. Wong and Y. Hu, Efficient Incremental Offline Reinforcement Learning With Sparse Broad Critic Approximation, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 54, no. 1, pp. 156-169, Jan. 2024 DOI: 10.1109/TSMC.2023.3305498.

Offline reinforcement learning (ORL) has been getting increasing attention in robot learning, benefiting from its ability to avoid hazardous exploration and learn policies directly from precollected samples. Approximate policy iteration (API) is one of the most commonly investigated ORL approaches in robotics, due to its linear representation of policies, which makes it fairly transparent in both theoretical and engineering analysis. One open problem of API is how to design efficient and effective basis functions. The broad learning system (BLS) has been extensively studied in supervised and unsupervised learning in various applications. However, few investigations have been conducted on ORL. In this article, a novel incremental ORL approach with sparse broad critic approximation (BORL) is proposed with the advantages of BLS, which approximates the critic function in a linear manner with randomly projected sparse and compact features and dynamically expands its broad structure. The BORL is the first extension of API with BLS in the field of robotics and ORL. The approximation ability and convergence performance of BORL are also analyzed. Comprehensive simulation studies are then conducted on two benchmarks, and the results demonstrate that the proposed BORL can obtain comparable or better performance than conventional API methods without laborious hyperparameter fine-tuning work. To further demonstrate the effectiveness of BORL in practical robotic applications, a variable force tracking problem in robotic ultrasound scanning (RUSS) is investigated, and a learning-based adaptive impedance control (LAIC) algorithm is proposed based on BORL. The experimental results demonstrate the advantages of LAIC compared with conventional force tracking methods.

See also: X. Wang, D. Hou, L. Huang and Y. Cheng, “Offline–Online Actor–Critic,” in IEEE Transactions on Artificial Intelligence, vol. 5, no. 1, pp. 61-69, Jan. 2024, doi: 10.1109/TAI.2022.3225251

Hierarchical Deep-RL for continuous and large state spaces

A. P. Pope et al., Hierarchical Reinforcement Learning for Air Combat at DARPA’s AlphaDogfight Trials, IEEE Transactions on Artificial Intelligence, vol. 4, no. 6, pp. 1371-1385, Dec. 2023 DOI: 10.1109/TAI.2022.3222143.

Autonomous control in high-dimensional, continuous state spaces is a persistent and important challenge in the fields of robotics and artificial intelligence. Because of high risk and complexity, the adoption of AI for autonomous combat systems has been a long-standing difficulty. In order to address these issues, DARPA’s AlphaDogfight Trials (ADT) program sought to vet the feasibility of and increase trust in AI for autonomously piloting an F-16 in simulated air-to-air combat. Our submission to ADT solves the high-dimensional, continuous control problem using a novel hierarchical deep reinforcement learning approach consisting of a high-level policy selector and a set of separately trained low-level policies specialized for excelling in specific regions of the state space. Both levels of the hierarchy are trained using off-policy, maximum entropy methods with expert knowledge integrated through reward shaping. Our approach outperformed human expert pilots and achieved a second-place rank in the ADT championship event.

Visibility graphs for robot path planning are still in use!

Junlin Ou, Seong Hyeon Hong, Ge Song, Yi Wang, Hybrid path planning based on adaptive visibility graph initialization and edge computing for mobile robots, Engineering Applications of Artificial Intelligence, Volume 126, Part D, 2023 DOI: 10.1016/j.engappai.2023.107110.

This paper presents a new initialization method that combines adaptive visibility graphs and the A* algorithm to improve the exploration, accuracy, and computing efficiency of hybrid path planning for mobile robots. First, segments/links in the full visibility graphs are removed randomly in an iterative and adaptive manner, yielding adaptive visibility graphs. Then the A* algorithm is applied to find the shortest paths in these adaptive visibility graphs. Next, high-quality paths featuring low fitness values are chosen to initialize the subsequent heuristic optimization in hybrid path planning. Specifically, in the present study, the genetic algorithm (GA) is implemented on a CPU/GPU edge computing device (Jetson AGX Xavier) to exploit its massively parallel processing threads, and the strategy for judicious CPU/GPU resource utilization is also developed. Numerical experiments are conducted to determine proper hyperparameters and configure GA with balanced performance. Various optimal paths with differential consideration of practical factors for robot path planning are obtained by the proposed method. Compared to the other benchmark methods, ours significantly improves the diversity of initial path and exploration, optimization accuracy, and computing speed (within 5 s with most less than 2 s). Furthermore, real-time experiments are carried out to demonstrate the effectiveness and application of the proposed algorithm on mobile robots.
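
The initialization idea in this abstract (randomly thin out the visibility graph, then run A* on each thinned graph to collect diverse candidate paths) can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation: the fixed drop probability stands in for their adaptive removal scheme, path length stands in for their fitness function, and the four-node graph is assumed to be an already computed visibility graph.

```python
import heapq
import math
import random
from typing import Dict, List, Optional, Sequence, Set, Tuple

Point = Tuple[float, float]
Edge = Tuple[str, str]


def euclid(a: Point, b: Point) -> float:
    return math.hypot(a[0] - b[0], a[1] - b[1])


def prune_edges(edges: Set[Edge], drop_prob: float, rng: random.Random) -> Set[Edge]:
    """Drop each edge independently with probability drop_prob (assumed scheme)."""
    # Iterate in sorted order so a seeded rng gives reproducible results.
    return {e for e in sorted(edges) if rng.random() >= drop_prob}


def astar(nodes: Dict[str, Point], edges: Set[Edge], start: str, goal: str) -> Optional[List[str]]:
    """A* over an undirected graph with a straight-line-distance heuristic."""
    adj: Dict[str, List[str]] = {n: [] for n in nodes}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    open_heap = [(euclid(nodes[start], nodes[goal]), 0.0, start)]
    best_cost = {start: 0.0}
    parent: Dict[str, str] = {}
    while open_heap:
        _, g, u = heapq.heappop(open_heap)
        if u == goal:
            path = [u]
            while u in parent:
                u = parent[u]
                path.append(u)
            return path[::-1]
        if g > best_cost[u]:
            continue  # stale heap entry, a cheaper route to u was found later
        for v in adj[u]:
            g_new = g + euclid(nodes[u], nodes[v])
            if g_new < best_cost.get(v, math.inf):
                best_cost[v] = g_new
                parent[v] = u
                heapq.heappush(open_heap, (g_new + euclid(nodes[v], nodes[goal]), g_new, v))
    return None  # goal unreachable in this (pruned) graph


def path_length(nodes: Dict[str, Point], path: Sequence[str]) -> float:
    return sum(euclid(nodes[a], nodes[b]) for a, b in zip(path, path[1:]))


# Hypothetical visibility graph: S and G cannot see each other directly.
nodes = {"S": (0.0, 0.0), "A": (1.0, 1.0), "B": (1.0, -2.0), "G": (2.0, 0.0)}
full_edges = {("S", "A"), ("A", "G"), ("S", "B"), ("B", "G")}

rng = random.Random(0)
candidates = set()
for _ in range(20):
    path = astar(nodes, prune_edges(full_edges, drop_prob=0.3, rng=rng), "S", "G")
    if path is not None:
        candidates.add(tuple(path))
# Shortest candidates first; these would seed the later heuristic optimizer.
ranked = sorted(candidates, key=lambda p: path_length(nodes, p))
```

Pruning is what produces diversity: on the full graph A* always returns the route through A, while a pruned graph that has lost the S-A or A-G edge forces the detour through B.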

Review of NNs for solving manipulator inverse kinematics

Daniel Cagigas-Muñiz, Artificial Neural Networks for inverse kinematics problem in articulated robots, Engineering Applications of Artificial Intelligence, Volume 126, Part D, 2023 DOI: 10.1016/j.engappai.2023.107175.

The inverse kinematics problem in articulated robots implies to obtain joint rotation angles using the robot end effector position and orientation tool. Unlike the problem of direct kinematics, in inverse kinematics there are no systematic methods for solving the problem. Moreover, solving the inverse kinematics problem is particularly complicated for certain morphologies of articulated robots. Machine learning techniques and, more specifically, artificial neural networks (ANNs) have been proposed in the scientific literature to solve this problem. However, there are some limitations in the performance of ANNs. In this study, different techniques that involve ANNs are proposed and analyzed. The results show that the proposed original bootstrap sampling and hybrid methods can substantially improve the performance of approaches that use only one ANN. Although all of these improvements do not solve completely the inverse kinematics problem in articulated robots, they do lay the foundations for the design and development of future more effective and efficient controllers. Therefore, the source code and documentation of this research are also publicly available to practitioners interested in adapting and improving these methods to any industrial robot or articulated robot.
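
As a small illustration of why inverse kinematics is harder than direct kinematics, consider a planar two-link arm (a textbook case chosen here for brevity, not one of the robots studied in the paper). The forward map is a single formula, whereas the inverse map can have zero, one, or two solutions even for this simple arm. The function names and unit link lengths below are assumptions for the sketch.

```python
import math
from typing import List, Tuple


def forward_kinematics(theta1: float, theta2: float, l1: float = 1.0, l2: float = 1.0) -> Tuple[float, float]:
    """End-effector position of a planar two-link arm (angles in radians)."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x: float, y: float, l1: float = 1.0, l2: float = 1.0) -> List[Tuple[float, float]]:
    """Closed-form joint angles reaching (x, y); empty list if unreachable."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        return []  # target lies outside the reachable workspace
    solutions = []
    for sign in (1.0, -1.0):  # the two elbow configurations
        theta2 = sign * math.acos(c2)
        theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2), l1 + l2 * math.cos(theta2))
        solutions.append((theta1, theta2))
    return solutions  # the two branches coincide on the workspace boundary


target = (1.0, 1.0)
solutions = inverse_kinematics(*target)
reached = [forward_kinematics(t1, t2) for t1, t2 in solutions]
```

Both joint configurations in `solutions` place the end effector at the same target. Arms with more joints generally have no such closed form, which is the gap the ANN-based methods in the paper aim to address.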