Monthly Archives: October 2023

Communication delays modelled as Gamma distributions for space-earth applications

H. Chen, Z. Liu, P. Huang and Z. Kuang, Time-Delay Modeling and Simulation for Relay Communication-Based Space Telerobot System, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 7, pp. 4211-4222, July 2022 DOI: 10.1109/TSMC.2021.3090806.

In a space telerobot system (STS), it is advisable to verify under real conditions how effectively the control method eliminates the influence of the time delay. However, it is difficult and costly for many scholars to obtain confidential information that would allow them to establish an STS. It may be feasible, using some existing results, to model the time delay as close to reality as possible, and to then program a simulation system to generate the simulated time delay, thus verifying validity. In this article, time-delay modeling and simulation problems for relay communication-based STS are first studied. The time delay in relay communication-based STS consists of both processing and communication time delays; the latter is divided into ground and ground-space parts. By extending the available results, processing and ground communication time delays are modeled with the probability distribution function modeling approach. An optimal communication link identification and minimum time-delay realization (OCLIMTDR) method is proposed to model the ground-space communication time delay. In this method, the novel point–vector–sphere (PVS) algorithm serves to judge link connectivity. The PVS algorithm is based on geometric theory, which gives the OCLIMTDR method good extensibility and renders it suitable for any relay communication network in theory. All three parts of the time-delay models are integrated to form the loop time-delay model of the STS. Subsequently, a time-delay simulation system is created by programming the loop time-delay model. Finally, the correctness of the simulation system is further authenticated based on simulations and some prior knowledge.
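As a rough illustration of the probability-distribution modeling idea, the following Python sketch samples a loop delay as the sum of Gamma-distributed processing and ground delays plus a propagation term. The Gamma choice follows this entry's title; the function name, the shape/scale values, and the path length are assumptions made for illustration and do not come from the paper. The fixed path length stands in for the OCLIMTDR link selection, which is not reproduced here.

```python
import random

SPEED_OF_LIGHT_KM_S = 299_792.458  # propagation speed in vacuum


def sample_loop_delay(rng, path_km, proc=(2.0, 0.005), ground=(3.0, 0.010)):
    """Return one simulated loop (round-trip) delay in seconds.

    proc and ground are hypothetical (shape, scale) Gamma parameters.
    """
    total = 0.0
    for _ in range(2):  # forward command path, then feedback path
        total += rng.gammavariate(*proc)         # processing delay
        total += rng.gammavariate(*ground)       # ground network delay
        total += path_km / SPEED_OF_LIGHT_KM_S   # ground-space propagation
    return total


rng = random.Random(0)  # seeded so the run is repeatable
delays = [sample_loop_delay(rng, path_km=80_000.0) for _ in range(1000)]
mean_delay = sum(delays) / len(delays)
```

Every sampled delay is bounded below by the two-way propagation time, and the random parts only add jitter on top of it, which is the behaviour one would expect from a relay link.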

Kalman filter with time delays

H. Zhu, J. Mi, Y. Li, K. -V. Yuen and H. Leung, VB-Kalman Based Localization for Connected Vehicles With Delayed and Lost Measurements: Theory and Experiments, IEEE/ASME Transactions on Mechatronics, vol. 27, no. 3, pp. 1370-1378, June 2022 DOI: 10.1109/TMECH.2021.3095096.

Traditionally, connected vehicles (CVs) share their own sensor data that relies on the satellite with their surrounding vehicles by vehicle-to-vehicle (V2V) communication. However, the satellite-based signal sometimes may be lost due to environmental factors. Time-delays and packet dropouts may occur randomly by V2V communication. To ensure the reliability and accuracy of localization for CVs, a novel variational Bayesian (VB)-Kalman method is developed for unknown and time varying probabilities of delayed and lost measurements. In this VB-Kalman localization method, two random variables are introduced to indicate whether a measurement is delayed and available, respectively. A hierarchical model is then formulated and its parameters and state are simultaneously estimated by the VB technique. Experimental results validate the proposed method for the localization of CVs in practice.
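To make the effect of lost measurements concrete, here is a minimal Python sketch of a scalar Kalman filter that skips the update step whenever a measurement is missing. It is not the VB-Kalman method of the paper: it assumes a random-walk state model, assumes the loss indicator is known (a lost measurement is passed as None), and does not handle delayed measurements. The function name and noise values are hypothetical.

```python
def kalman_step(x, p, z, q, r):
    """One step of a scalar random-walk Kalman filter.

    x, p: previous state estimate and its variance
    z:    new measurement, or None if it was lost
    q, r: process and measurement noise variances (assumed known)
    """
    # Predict: x_k = x_{k-1} + w, with w ~ N(0, q)
    x_pred = x
    p_pred = p + q
    if z is None:
        # No measurement: keep the prediction, uncertainty grows
        return x_pred, p_pred
    # Update with the available measurement
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


x, p = 0.0, 1.0
for z in [1.0, None, 1.2, None, 0.9]:
    x, p = kalman_step(x, p, z, q=0.01, r=0.25)
```

The paper's contribution lies in estimating the unknown, time-varying delay and loss probabilities jointly with the state; the sketch only shows the baseline behaviour that such a method improves on.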

NOTE: See also S. Guo, Y. Liu, Y. Zheng and T. Ersal, “A Delay Compensation Framework for Connected Testbeds,” in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 7, pp. 4163-4176, July 2022, doi: 10.1109/TSMC.2021.3091974.

A tutorial on the integration of ROS with some interesting software, such as Jupyter

T. Fischer, W. Vollprecht, S. Traversaro, S. Yen, C. Herrero and M. Milford, A RoboStack Tutorial: Using the Robot Operating System Alongside the Conda and Jupyter Data Science Ecosystems, IEEE Robotics & Automation Magazine, vol. 29, no. 2, pp. 65-74, June 2022 DOI: 10.1109/MRA.2021.3128367.

The Robot Operating System (ROS) has become the de facto standard middleware in the robotics community [1]. ROS bundles everything, from low-level drivers to tools that transform among coordinate systems, to state-of-the-art perception and control algorithms. One of ROS’s key merits is the rich ecosystem of standardized tools to build and distribute ROS-based software.

Improving the quality of memory replay in RL through an evolutionary algorithm biologically inspired

M. Ramicic and A. Bonarini, Augmented Memory Replay in Reinforcement Learning With Continuous Control, IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 2, pp. 485-496, June 2022 DOI: 10.1109/TCDS.2021.3050723.

Online reinforcement learning agents are currently able to process an increasing amount of data by converting it into higher-order value functions. This expansion of the information collected from the environment increases the agent’s state space, enabling it to scale up to more complex problems, but also increases the risk of forgetting by learning on redundant or conflicting data. To improve the approximation of a large amount of data, a random mini-batch of the past experiences that are stored in the replay memory buffer is often replayed at each learning step. The proposed work takes inspiration from a biological mechanism which acts as a protective layer of higher cognitive functions found in the mammalian brain: active memory consolidation mitigates the effect of forgetting previous memories by dynamically processing the new ones. Similar dynamics are implemented by the proposed augmented memory replay (AMR) algorithm. The architecture of AMR, based on a simple artificial neural network, is able to provide an augmentation policy which modifies each of the agent’s experiences by augmenting their relevance prior to storing them in the replay memory. The function approximator of AMR is evolved using a genetic algorithm in order to obtain the specific augmentation policy function that yields the best performance of a learning agent in a specific environment, as measured by its received cumulative reward. Experimental results show that an evolved AMR augmentation function capable of increasing the significance of specific memories is able to further increase the stability and convergence speed of learning algorithms dealing with the complexity of continuous action domains.
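The replay mechanism described above can be sketched in Python as a fixed-size buffer with uniform mini-batch sampling and a hook that transforms each experience before it is stored. The class, the hook, and the reward-doubling function are hypothetical placeholders: in the paper the augmentation policy is a small neural network evolved with a genetic algorithm, which is not reproduced here.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state")


class ReplayMemory:
    """Fixed-size replay buffer with an optional augmentation hook."""

    def __init__(self, capacity, augment=None, seed=None):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out first
        self.augment = augment                # callable applied before storing
        self.rng = random.Random(seed)

    def push(self, transition):
        if self.augment is not None:
            transition = self.augment(transition)
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform random mini-batch without replacement
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def double_reward(t):
    """Placeholder augmentation (an assumption): scale the stored reward."""
    return t._replace(reward=2.0 * t.reward)


memory = ReplayMemory(capacity=3, augment=double_reward, seed=0)
for i in range(4):
    memory.push(Transition(state=i, action=0, reward=1.0, next_state=i + 1))
batch = memory.sample(2)
```

Keeping the augmentation as a plain callable means the learning agent does not need to know how experiences were modified, which matches the idea of augmenting them prior to storage.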

Modelling the perception of time in the human brain through RL with eligibility traces

I. Lourenço, R. Mattila, R. Ventura and B. Wahlberg, A Biologically Inspired Computational Model of Time Perception, IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 2, pp. 258-268, June 2022 DOI: 10.1109/TCDS.2021.3120301.

Time perception (how humans and animals perceive the passage of time) forms the basis for important cognitive skills, such as decision making, planning, and communication. In this work, we propose a framework for examining the mechanisms responsible for time perception. We first model neural time perception as a combination of two known timing sources: internal neuronal mechanisms and external (environmental) stimuli, and design a decision-making framework to replicate them. We then implement this framework in a simulated robot. We measure the robot’s success on a temporal discrimination task originally performed by mice to evaluate their capacity to exploit temporal knowledge. We conclude that the robot is able to perceive time similarly to animals when it comes to their intrinsic mechanisms of interpreting time and performing time-aware actions. Next, by analyzing the behavior of agents equipped with the framework, we propose an estimator to infer characteristics of the timing mechanisms intrinsic to the agents. In particular, we show that from their empirical action probability distribution, we are able to estimate parameters used for perceiving time. Overall, our work shows promising results when it comes to drawing conclusions regarding some of the characteristics present in biological timing mechanisms.

NOTE: See also H. Basgol, I. Ayhan and E. Ugur, “Time Perception: A Review on Psychological, Computational, and Robotic Models,” in IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 2, pp. 301-315, June 2022, doi: 10.1109/TCDS.2021.3059045.

Dealing with exploration in deep RL, with a nice introduction to the problem

Jiayi Lu, Shuai Han, Shuai Lü, Meng Kang, Junwei Zhang, Sampling diversity driven exploration with state difference guidance, Expert Systems with Applications, Volume 203, 2022 DOI: 10.1016/j.eswa.2022.117418.

Exploration is one of the key issues of deep reinforcement learning, especially in environments with sparse or deceptive rewards. Exploration based on intrinsic rewards can handle these environments. However, these methods cannot take both global interaction dynamics and local environment changes into account simultaneously. In this paper, we propose a novel intrinsic reward for off-policy learning, which not only encourages the agent to take actions not fully learned from a global perspective, but also instructs the agent to trigger remarkable changes in the environment from a local perspective. Meanwhile, we propose the double-actors–double-critics framework to combine intrinsic rewards with extrinsic rewards to avoid the inappropriate combination of intrinsic and extrinsic rewards in previous methods. This framework can be applied to off-policy learning algorithms based on the actor–critic method. We provide a comprehensive evaluation of our approach on the MuJoCo benchmark environments. The results demonstrate that our method can perform effective exploration in environments with dense, deceptive and sparse rewards. Besides, we conduct sufficient ablation and quantitative analyses of the intrinsic rewards. Furthermore, we also verify the superiority and rationality of our double-actors–double-critics framework through comparative experiments.

Reconstructing indoor map layouts from geometrical data

Matteo Luperto, Francesco Amigoni, Reconstruction and prediction of the layout of indoor environments from two-dimensional metric maps, Engineering Applications of Artificial Intelligence, Volume 113, 2022 DOI: 10.1016/j.engappai.2022.104910.

Metric maps, like occupancy grids, are one of the most common ways to represent indoor environments in autonomous mobile robotics. Although they are effective for navigation and localization, metric maps contain little knowledge about the structure of the buildings they represent. In this paper, we propose a method that identifies the structure of indoor environments from 2D metric maps by retrieving their layout, namely an abstract geometrical representation that models walls as line segments and rooms as polygons. The method works by finding regularities within a building, abstracting from the possibly noisy information of the metric map, and uses such knowledge to reconstruct the layout of the observed part and to predict a possible layout of the partially observed portion of the building. Thus, unlike other state-of-the-art methods, our method can be applied both to fully observed environments and, most significantly, to partially observed ones. Experimental results show that our approach performs effectively and robustly on different types of input metric maps and that the predicted layout becomes more accurate as the input metric map becomes more complete. The layout returned by our method can be exploited in several tasks, such as semantic mapping, place categorization, path planning, human–robot communication, and task allocation.

Increasing exploration when the agent performs worse, decreasing when performing better, in the context of DQN for distributing computation among cloud and edge servers, also dealing with hybridization of RL with Fuzzy

Do Bao Son, Ta Huu Binh, Hiep Khac Vo, Binh Minh Nguyen, Huynh Thi Thanh Binh, Shui Yu, Value-based reinforcement learning approaches for task offloading in Delay Constrained Vehicular Edge Computing, Engineering Applications of Artificial Intelligence, Volume 113, 2022 DOI: 10.1016/j.engappai.2022.104898.

In the age of booming information technology, humanity has witnessed the need for new paradigms with both high computational capability and low latency. A potential solution is Vehicular Edge Computing (VEC). Previous work proposed a Fuzzy Deep Q-Network in Offloading scheme (FDQO) that combines Fuzzy rules and Deep Q-Network (DQN) to improve DQN’s early performance by using a Fuzzy Controller (FC). However, we notice that frequent usage of the FC can hinder the future performance growth of the model. One way to overcome this issue is to remove the Fuzzy Controller entirely. We introduced an algorithm called baseline DQN (b-DQN), represented by its two variants Static baseline DQN (Sb-DQN) and Dynamic baseline DQN (Db-DQN), to modify the exploration rate based on the average rewards of the closest observations. Our findings confirm that these baseline DQN algorithms surpass traditional DQN models in terms of average Quality of Experience (QoE) in 100 time slots by about 6%, but still suffer from poor early performance (such as in the first 5 time slots). Here, we introduce baseline FDQO (b-FDQO). This algorithm has a strategy to modify the Fuzzy Logic usage instead of removing it entirely while still observing the rewards to modify the exploration rate. It brings a higher average QoE in the first 5 time slots compared to other non-fuzzy-logic algorithms by at least 55.12%, and prevents the model from producing overly poor results over all time slots, while having late performance as good as that of b-DQN.
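The idea of raising exploration when recent rewards are poor and lowering it when they are good can be sketched in Python as follows. This is only an illustration of the general rule stated in this entry's title: the class name, the fixed baseline, the window size, and the step size are all assumptions, and the paper's actual Sb-DQN and Db-DQN update rules may differ.

```python
from collections import deque


class BaselineEpsilon:
    """Adjust an epsilon-greedy exploration rate from recent rewards."""

    def __init__(self, baseline, epsilon=0.5, step=0.1, window=5,
                 eps_min=0.01, eps_max=1.0):
        self.baseline = baseline            # hypothetical fixed reward baseline
        self.epsilon = epsilon
        self.step = step
        self.eps_min = eps_min
        self.eps_max = eps_max
        self.recent = deque(maxlen=window)  # most recent rewards only

    def update(self, reward):
        self.recent.append(reward)
        average = sum(self.recent) / len(self.recent)
        if average < self.baseline:
            # Performing worse than the baseline: explore more
            self.epsilon = min(self.eps_max, self.epsilon + self.step)
        else:
            # Performing at least as well as the baseline: explore less
            self.epsilon = max(self.eps_min, self.epsilon - self.step)
        return self.epsilon


schedule = BaselineEpsilon(baseline=1.0)
eps_after_bad = schedule.update(0.0)   # average 0.0 is below the baseline
eps_after_good = schedule.update(5.0)  # average 2.5 is above the baseline
```

A dynamic variant could replace the fixed baseline with a running statistic of past rewards; that choice is left open here because the abstract does not specify it.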

Abstraction of continuous control problems considered as MDPs

H. G. Tanner and A. Stager, Data-Driven Abstractions for Robots With Stochastic Dynamics, IEEE Transactions on Robotics, vol. 38, no. 3, pp. 1686-1702, June 2022 DOI: 10.1109/TRO.2021.3119209.

This article describes the construction of stochastic, data-based discrete abstractions for uncertain random processes continuous in time and space. Motivated by the fact that modeling processes often introduce errors which interfere with the implementation of control strategies, here the abstraction process proceeds in reverse: the methodology does not abstract models; rather it models abstractions. Specifically, it first formalizes a template for a family of stochastic abstractions, and then fits the parameters of that template to match the dynamics of the underlying process and ground the abstraction. The article also shows how the parameter-fitting approach can be implemented based on a probabilistic model validation approach which draws from randomized algorithms, and results in a discrete abstract model which is approximately simulated by the actual process physics, at a desired confidence level. In this way, the models afford the implementation of symbolic control plans with probabilistic guarantees at a desired level of fidelity.

Continuous POMDPs through belief state sparsification, applied to active SLAM

Elimelech K, Indelman V. Simplified decision making in the belief space using belief sparsification. The International Journal of Robotics Research. 2022;41(5):470-496 DOI: 10.1177/02783649221076381.

In this work, we introduce a new and efficient solution approach for the problem of decision making under uncertainty, which can be formulated as decision making in a belief space, over a possibly high-dimensional state space. Typically, to solve a decision problem, one should identify the optimal action from a set of candidates, according to some objective. We claim that one can often generate and solve an analogous yet simplified decision problem, which can be solved more efficiently. A wise simplification method can lead to the same action selection, or one for which the maximal loss in optimality can be guaranteed. Furthermore, such simplification is separated from the state inference and does not compromise its accuracy, as the selected action would finally be applied on the original state. First, we present the concept for general decision problems and provide a theoretical framework for a coherent formulation of the approach. We then practically apply these ideas to decision problems in the belief space, which can be simplified by considering a sparse approximation of their initial belief. The scalable belief sparsification algorithm we provide is able to yield solutions which are guaranteed to be consistent with the original problem. We demonstrate the benefits of the approach in the solution of a realistic active-SLAM problem and manage to significantly reduce computation time, with no loss in the quality of solution. This work is both fundamental and practical and holds numerous possible extensions.