Predicting changes in the environment through time series for better robot navigation

Yanbo Wang, Yaxian Fan, Jingchuan Wang, Weidong Chen, Long-term navigation for autonomous robots based on spatio-temporal map prediction, Robotics and Autonomous Systems, Volume 179, 2024 DOI: 10.1016/j.robot.2024.104724.

The robotics community has witnessed a growing demand for long-term navigation of autonomous robots in diverse environments, including factories, homes, offices, and public places. The core challenge in long-term navigation for autonomous robots lies in effectively adapting to varying degrees of dynamism in the environment. In this paper, we propose a long-term navigation method for autonomous robots based on spatio-temporal map prediction. The time series model is introduced to learn the changing patterns of different environmental structures or objects on multiple time scales based on the historical maps and forecast the future maps for long-term navigation. Then, an improved global path planning algorithm is performed based on the time-variant predicted cost maps. During navigation, the current observations are fused with the predicted map through a modified Bayesian filter to reduce the impact of prediction errors, and the updated map is stored for future predictions. We run simulation and conduct several weeks of experiments in multiple scenarios. The results show that our algorithm is effective and robust for long-term navigation in dynamic environments.
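The fusion step can be illustrated with the textbook binary Bayes update on occupancy probabilities. This is only a sketch: the paper uses a modified Bayesian filter whose details are not given in the abstract, so the function name `fuse_occupancy`, the uniform 0.5 prior, and the per-cell independence assumption below are all illustrative choices, not the authors' method.

```python
import numpy as np

def fuse_occupancy(predicted, observed, eps=1e-6):
    """Fuse a predicted occupancy map with a current observation.

    Both inputs hold per-cell occupancy probabilities in [0, 1].
    Assumes a uniform 0.5 prior, so the log-odds of the two sources add.
    """
    # Clip to avoid infinite log-odds at exactly 0 or 1.
    p = np.clip(np.asarray(predicted, dtype=float), eps, 1 - eps)
    z = np.clip(np.asarray(observed, dtype=float), eps, 1 - eps)
    log_odds = np.log(p / (1 - p)) + np.log(z / (1 - z))
    # Convert the summed log-odds back to a probability.
    return 1.0 / (1.0 + np.exp(-log_odds))

# Hypothetical 2x2 maps: prediction on the left, sensor evidence on the right.
predicted_map = np.array([[0.5, 0.8], [0.9, 0.2]])
observed_map = np.array([[0.8, 0.8], [0.1, 0.2]])
print(fuse_occupancy(predicted_map, observed_map))
```

An uninformative prediction (0.5) leaves the observation unchanged, agreeing sources reinforce each other, and a confident prediction contradicted by an equally confident observation falls back to 0.5.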

RL to learn the coordination of different goals in autonomous driving

J. Liu, J. Yin, Z. Jiang, Q. Liang and H. Li, Attention-Based Distributional Reinforcement Learning for Safe and Efficient Autonomous Driving, IEEE Robotics and Automation Letters, vol. 9, no. 9, pp. 7477-7484, Sept. 2024 DOI: 10.1109/LRA.2024.3427551.

Autonomous driving vehicles play a critical role in intelligent transportation systems and have garnered considerable attention. Currently, the popular approach in autonomous driving systems is to design separate optimal objectives for each independent module. Therefore, a major concern arises from the fact that these diverse optimal objectives may have an impact on the final driving policy. However, reinforcement learning provides a promising solution to tackle the challenge through joint training and its exploration ability. This letter aims to develop a safe and efficient reinforcement learning approach with advanced features for autonomous navigation in urban traffic scenarios. Firstly, we develop a novel distributional reinforcement learning method that integrates an implicit distribution model into an actor-critic framework. Subsequently, we introduce a spatial attention module to capture interaction features between the ego vehicle and other traffic vehicles, and design a temporal attention module to extract the long-term sequential feature. Finally, we utilize bird’s-eye-view as a context-aware representation of traffic scenarios, fused by the above spatio-temporal features. To validate our approach, we conduct experiments on the NoCrash and CoRL benchmarks, especially on our closed-loop openDD scenarios. The experimental results demonstrate the impressive performance of our approach in terms of convergence and stability compared to the baselines.

RL in periodic scenarios

A. Aniket and A. Chattopadhyay, Online Reinforcement Learning in Periodic MDP, IEEE Transactions on Artificial Intelligence, vol. 5, no. 7, pp. 3624-3637, July 2024 DOI: 10.1109/TAI.2024.3375258.

We study learning in periodic Markov decision process (MDP), a special type of nonstationary MDP where both the state transition probabilities and reward functions vary periodically, under the average reward maximization setting. We formulate the problem as a stationary MDP by augmenting the state space with the period index and propose a periodic upper confidence bound reinforcement learning-2 (PUCRL2) algorithm. We show that the regret of PUCRL2 varies linearly with the period $N$ and as $O(\sqrt{T \log T})$ with the horizon length $T$. Utilizing the information about the sparsity of transition matrix of augmented MDP, we propose another algorithm [periodic upper confidence reinforcement learning with Bernstein bounds (PUCRLB)] which enhances upon PUCRL2, both in terms of regret [$O(\sqrt{N})$ dependency on period] and empirical performance. Finally, we propose two other algorithms U-PUCRL2 and U-PUCRLB for extended uncertainty in the environment in which the period is unknown but a set of candidate periods are known. Numerical results demonstrate the efficacy of all the algorithms.
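The state-augmentation idea can be sketched as follows. The array layout (one transition tensor per phase) and the function name `augment_periodic_mdp` are assumptions made for illustration and are not taken from the paper. Each augmented state is a pair (phase, state), and every transition moves to the next phase modulo the period, which is also where the sparsity of the augmented transition matrix comes from.

```python
import numpy as np

def augment_periodic_mdp(P, R):
    """Turn a periodic MDP into a stationary one by adding the phase to the state.

    P: array of shape (N, S, A, S), transition probabilities for each phase n.
    R: array of shape (N, S, A), rewards for each phase n.
    Returns P_aug of shape (N*S, A, N*S) and R_aug of shape (N*S, A),
    where augmented state index = n * S + s.
    """
    N, S, A, _ = P.shape
    P_aug = np.zeros((N * S, A, N * S))
    R_aug = np.zeros((N * S, A))
    for n in range(N):
        nxt = (n + 1) % N                      # the phase always advances by one
        rows = slice(n * S, (n + 1) * S)       # states belonging to phase n
        cols = slice(nxt * S, (nxt + 1) * S)   # states belonging to phase n + 1
        P_aug[rows, :, cols] = P[n]
        R_aug[rows, :] = R[n]
    return P_aug, R_aug

# Hypothetical random periodic MDP with period 3, 2 states, 2 actions.
rng = np.random.default_rng(0)
N, S, A = 3, 2, 2
P = rng.dirichlet(np.ones(S), size=(N, S, A))
R = rng.random((N, S, A))
P_aug, R_aug = augment_periodic_mdp(P, R)
print(P_aug.shape)  # (6, 2, 6)
```

In this layout at most a fraction 1/N of the augmented transition entries can be nonzero, since every block other than the one leading to the next phase is zero.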

Making RL safer by first learning what is a safe situation

K. Fan, Z. Chen, G. Ferrigno and E. D. Momi, Learn From Safe Experience: Safe Reinforcement Learning for Task Automation of Surgical Robot, IEEE Transactions on Artificial Intelligence, vol. 5, no. 7, pp. 3374-3383, July 2024 DOI: 10.1109/TAI.2024.3351797.

Surgical task automation in robotics can improve the outcomes, reduce quality-of-care variance among surgeons and relieve surgeons’ fatigue. Reinforcement learning (RL) methods have shown considerable performance in robot autonomous control in complex environments. However, the existing RL algorithms for surgical robots do not consider any safety requirements, which is unacceptable in automating surgical tasks. In this work, we propose an approach called safe experience reshaping (SER) that can be integrated into any offline RL algorithm. First, the method identifies and learns the geometry of constraints. Second, a safe experience is obtained by projecting an unsafe action to the tangent space of the learned geometry, which means that the action is in the safe space. Then, the collected safe experiences are used for safe policy training. We designed three tasks that closely resemble real surgical tasks including 2-D cutting tasks and a contact-rich debridement task in 3-D space to evaluate the safe RL framework. We compare our framework to five state-of-the-art (SOTA) RL methods including reward penalty and primal-dual methods. Results show that our framework gets a lower rate of constraint violations and better performance in task success, especially with a higher convergence speed.
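The projection step can be illustrated with plain linear algebra. This sketch assumes the constraint is written as c(x) = 0 and that its Jacobian at the current state is already available. In the paper the constraint geometry is learned from data, so the Jacobian here is a hand-written stand-in, and `project_to_tangent` is a hypothetical helper name.

```python
import numpy as np

def project_to_tangent(action, jacobian):
    """Project an action onto the tangent space of the constraints c(x) = 0.

    jacobian: (k, n) matrix of constraint gradients at the current state
              (a single gradient of shape (n,) is also accepted).
    Removes the action component that would change the constraint value
    to first order, i.e. returns (I - J^+ J) a.
    """
    J = np.atleast_2d(np.asarray(jacobian, dtype=float))
    a = np.asarray(action, dtype=float)
    return a - np.linalg.pinv(J) @ (J @ a)

# Illustrative constraint: stay on the unit circle x^2 + y^2 = 1.
# At the point (1, 0) the constraint gradient is (2, 0).
gradient = np.array([2.0, 0.0])
unsafe_action = np.array([0.5, 0.3])   # has a component leaving the circle
safe_action = project_to_tangent(unsafe_action, gradient)
print(safe_action)
```

The projected action keeps only the component along the circle, so its product with the constraint gradient is zero.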

A new software design, verification and implementation method for robotics

Li, W., Ribeiro, P., Miyazawa, A. et al., Formal design, verification and implementation of robotic controller software via RoboChart and RoboTool, Auton Robot 48, 14 (2024) DOI: 10.1007/s10514-024-10163-7.

Current practice in simulation and implementation of robot controllers is usually undertaken with guidance from high-level design diagrams and pseudocode. Thus, no rigorous connection between the design and the development of a robot controller is established. This paper presents a framework for designing robotic controllers with support for automatic generation of executable code and automatic property checking. A state-machine based notation, RoboChart, and a tool (RoboTool) that implements the automatic generation of code and mathematical models from the designed controllers are presented. We demonstrate the application of RoboChart and its related tool through a case study of a robot performing an exploration task. The automatically generated code is platform independent and is used in both simulation and two different physical robotic platforms. Properties are formally checked against the mathematical models generated by RoboTool, and further validated in the actual simulations and physical experiments. The tool not only provides engineers with a way of designing robotic controllers formally but also paves the way for correct implementation of robotic systems.

Setting up goals, even unproductive or useless ones, can help in building cognition

Junyi Chu, Joshua B. Tenenbaum, Laura E. Schulz, In praise of folly: flexible goals and human cognition, Trends in Cognitive Sciences, Volume 28, Issue 7, 2024, Pages 628-642 DOI: 10.1016/j.tics.2024.03.006.

Humans often pursue idiosyncratic goals that appear remote from functional ends, including information gain. We suggest that this is valuable because goals (even prima facie foolish or unachievable ones) contain structured information that scaffolds thinking and planning. By evaluating hypotheses and plans with respect to their goals, humans can discover new ideas that go beyond prior knowledge and observable evidence. These hypotheses and plans can be transmitted independently of their original motivations, adapted across generations, and serve as an engine of cultural evolution. Here, we review recent empirical and computational research underlying goal generation and planning and discuss the ways that the flexibility of our motivational system supports cognitive gains for both individuals and societies.

Review of the current methodologies for achieving continual learning, and its biological bases

Buddhi Wickramasinghe, Gobinda Saha, and Kaushik Roy, Continual Learning: A Review of Techniques, Challenges, and Future Directions, IEEE Transactions on Artificial Intelligence, vol. 5, no. 6, June 2024 DOI: 10.1109/TAI.2023.3339091.

Continual learning (CL), or the ability to acquire, process, and learn from new information without forgetting acquired knowledge, is a fundamental quality of an intelligent agent. The human brain has evolved into gracefully dealing with ever-changing circumstances and learning from experience with the help of complex neurophysiological mechanisms. Even though artificial intelligence takes after human intelligence, traditional neural networks do not possess the ability to adapt to dynamic environments. When presented with new information, an artificial neural network (ANN) often completely forgets its prior knowledge, a phenomenon called catastrophic forgetting or catastrophic interference. Incorporating CL capabilities into ANNs is an active field of research and is integral to achieving artificial general intelligence. In this review, we revisit CL approaches and critically examine their strengths and limitations. We conclude that CL approaches should look beyond mitigating catastrophic forgetting and strive for systems that can learn, store, recall, and transfer knowledge, much like the human brain. To this end, we highlight the importance of adopting alternative brain-inspired data representations and learning algorithms and provide our perspective on promising new directions where CL could play an instrumental role.

See also: doi: 10.1109/TAI.2024.3355879

A clustering algorithm that claims to be simpler and faster than others

Yewang Chen, Yuanyuan Yang, Songwen Pei, Yi Chen, Jixiang Du, A simple rapid sample-based clustering for large-scale data, Engineering Applications of Artificial Intelligence, Volume 133, Part F, 2024 DOI: 10.1016/j.engappai.2024.108551.

Large-scale data clustering is a crucial task in addressing big data challenges. However, existing approaches often struggle to efficiently and effectively identify different types of big data, making it a significant challenge. In this paper, we propose a novel sample-based clustering algorithm, which is very simple but extremely efficient, and runs in about O(n×r) expected time, where n is the size of the dataset and r is the category number. The method is based on two key assumptions: (1) The data of each sufficient sample should have similar data distribution, as well as category distribution, to the entire data set; (2) the representative of each category in all sufficient samples conform to Gaussian distribution. It processes data in two stages, one is to classify data in each local sample independently, and the other is to globally classify data by assigning each point to the category of its nearest representative category center. The experimental results show that the proposed algorithm is effective, which outperforms other current variants of clustering algorithm.
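The second, global stage described in the abstract (assigning each point to its nearest representative center) accounts for the O(n×r) term and can be sketched directly. How the representative centers are obtained from the local samples is not reproduced here; the centers below are hypothetical inputs, and `assign_to_nearest` is an illustrative name.

```python
import numpy as np

def assign_to_nearest(points, centers):
    """Label each point with the index of its nearest center.

    points:  (n, d) array of data points.
    centers: (r, d) array of representative category centers.
    Computes n * r squared distances, so time is O(n * r) for fixed d.
    """
    X = np.asarray(points, dtype=float)
    C = np.asarray(centers, dtype=float)
    # Broadcasting builds an (n, r) table of squared Euclidean distances.
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Hypothetical centers, as if produced by the local sampling stage.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
points = np.array([[0.5, -0.2], [9.0, 11.0], [1.0, 1.0], [8.5, 9.5]])
print(assign_to_nearest(points, centers))  # [0 1 0 1]
```

The broadcast table uses O(n×r×d) memory, so for very large n the points would need to be processed in chunks.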

Profiling the energy consumption of AGVs

J. Leng, J. Peng, J. Liu, Y. Zhang, J. Ji and Y. Zhang, Profiling Power Consumption in Low-Speed Autonomous Guided Vehicles, IEEE Robotics and Automation Letters, vol. 9, no. 7, pp. 6027-6034, July 2024 DOI: 10.1109/LRA.2024.3396051.

The increasing demand for automation has led to a rise in the use of low-speed autonomous guided vehicles (AGVs). However, AGVs rely on batteries for their power source, which limits their operational time and affects their overall performance. To optimize their energy usage and enhance their battery life, it is crucial to understand the power consumption behavior of AGVs. This letter presents a comprehensive study on profiling power consumption in low-speed AGVs. The previous power consumption estimation models for AGVs were mostly based on physical formulas. We introduce a data-driven power consumption estimation model for each of the main components of the AGV, including the chassis, computing platform, sensors and communication devices. By conducting three actual driving tests, we show that the mean absolute percentage error (MAPE) in estimating instantaneous power is 4.8%, a significant 8.1% improvement compared to using a physical model. Moreover, the MAPE for energy consumption is only 1.5%, which is 6.6% better than the physical model. To demonstrate the utility of our power consumption estimation models, we conduct two case studies – one is energy-efficient path planning and the other is energy-efficient perception task interval adjustment. This study demonstrates that integrating the power consumption estimation model into path planning reduces energy consumption by over 12%. Additionally, adjusting the detection interval lowers computational energy consumption by 10.1%.
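For reference, MAPE is the mean absolute percentage error, i.e. the average of |actual - predicted| / |actual| expressed as a percentage. A minimal sketch follows; the power readings are made-up numbers for illustration and are not data from the paper.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    y = np.asarray(actual, dtype=float)
    y_hat = np.asarray(predicted, dtype=float)
    if np.any(y == 0):
        # The ratio is undefined when a true value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * np.mean(np.abs((y - y_hat) / y))

# Hypothetical instantaneous power readings in watts.
measured = [100.0, 200.0, 150.0]
estimated = [110.0, 180.0, 150.0]
print(round(mape(measured, estimated), 2))
```

Because each error is divided by the true value, MAPE weights a 10 W error on a 100 W reading the same as a 20 W error on a 200 W reading.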

Thermodynamics as a way of identifying hierarchies

Morten L. Kringelbach, Yonatan Sanz Perl, Gustavo Deco, The Thermodynamics of Mind, Trends in Cognitive Sciences, Volume 28, Issue 6, 2024, Pages 568-581 DOI: 10.1016/j.tics.2024.03.009.

To not only survive, but also thrive, the brain must efficiently orchestrate distributed computation across space and time. This requires hierarchical organisation facilitating fast information transfer and processing at the lowest possible metabolic cost. Quantifying brain hierarchy is difficult but can be estimated from the asymmetry of information flow. Thermodynamics has successfully characterised hierarchy in many other complex systems. Here, we propose the ‘Thermodynamics of Mind’ framework as a natural way to quantify hierarchical brain orchestration and its underlying mechanisms. This has already provided novel insights into the orchestration of hierarchy in brain states including movie watching, where the hierarchy of the brain is flatter than during rest. Overall, this framework holds great promise for revealing the orchestration of cognition.