kipr | Scientific papers that were of interest for Prof. Juan-Antonio Fernández-Madrigal

Hierarchical RL with diverse methods integrated in the framework

October 27, 2023 06:12 , Juan-Antonio Fernández-Madrigal

Ye Zhou, Hann Woei Ho, Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning, Engineering Applications of Artificial Intelligence, Volume 114, 2022 DOI: 10.1016/j.engappai.2022.105152.

Hierarchical Reinforcement Learning (HRL) provides an option to solve complex guidance and navigation problems with high-dimensional spaces, multiple objectives, and a large number of states and actions. The current HRL methods often use the same or similar reinforcement learning methods within one application so that multiple objectives can be easily combined. Since there is not a single learning method that can benefit all targets, hybrid Hierarchical Reinforcement Learning (hHRL) was proposed to use various methods to optimize the learning with different types of information and objectives in one application. The previous hHRL method, however, requires manual task-specific designs, which involves engineers\u2019 preferences and may impede its transfer learning ability. This paper, therefore, proposes a systematic online guidance and navigation method under the framework of hHRL, which generalizes training samples with a function approximator, decomposes the state space automatically, and thus does not require task-specific designs. The simulation results indicate that the proposed method is superior to the previous hHRL method, which requires manual decomposition, in terms of the convergence rate and the learnt policy. It is also shown that this method is generally applicable to non-stationary environments changing over episodes and over time without the loss of efficiency even with noisy state information.

Posted in: Applications of reinforcement learning to robots , Tagged: Hierarchical reinforcement learning, Robot navigation

Unexpected consequences of training smarthome systems with reinforcement learning: effects on human behaviours

October 27, 2023 06:02 , Juan-Antonio Fernández-Madrigal

S. Suman, A. Etemad and F. Rivest, TPotential Impacts of Smart Homes on Human Behavior: A Reinforcement Learning Approach, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 567-580, Aug. 2022 DOI: 10.1109/TAI.2021.3127483.

Smart homes are becoming increasingly popular as a result of advances in machine learning and cloud computing. Devices, such as smart thermostats and speakers, are now capable of learning from user feedback and adaptively adjust their settings to human preferences. Nonetheless, these devices might in turn impact human behavior. To investigate the potential impacts of smart homes on human behavior, we simulate a series of hierarchical-reinforcement learning-based human models capable of performing various activities\u2014namely, setting temperature and humidity for thermal comfort inside a Q-Learning-based smart home model. We then investigate the possibility of the human models\u2019 behaviors being altered as a result of the smart home and the human model adapting to one another. For our human model, the activities are based on hierarchical-reinforcement learning. This allows the human to learn how long it must continue a given activity and decide when to leave it. We then integrate our human model in the environment along with the smart home model and perform rigorous experiments considering various scenarios involving a model of a single human and models of two different humans with the smart home. Our experiments show that with the smart home, the human model can exhibit unexpected behaviors such as frequent changing of activities and an increase in the time required to modify the thermal preferences. With two human models, we interestingly observe that certain combinations of models result in normal behaviors, while other combinations exhibit the same unexpected behaviors as those observed from the single human experiment.

Posted in: Psycho-physiological bases of engineering , Tagged: Reinf, Reinforcement learning, Smart homes

Human+machine sequential decision making

October 27, 2023 05:59 , Juan-Antonio Fernández-Madrigal

Q. Zhang, Y. Kang, Y. -B. Zhao, P. Li and S. You, Traded Control of Human\u2013Machine Systems for Sequential Decision-Making Based on Reinforcement Learning, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 553-566, Aug. 2022 DOI: 10.1109/TAI.2021.3127857.

Sequential decision-making (SDM) is a common type of decision-making problem with sequential and multistage characteristics. Among them, the learning and updating of policy are the main challenges in solving SDM problems. Unlike previous machine autonomy driven by artificial intelligence alone, we improve the control performance of SDM tasks by combining human intelligence and machine intelligence. Specifically, this article presents a paradigm of a human\u2013machine traded control systems based on reinforcement learning methods to optimize the solution process of sequential decision problems. By designing the idea of autonomous boundary and credibility assessment, we enable humans and machines at the decision-making level of the systems to collaborate more effectively. And the arbitration in the human\u2013machine traded control systems introduces the Bayesian neural network and the dropout mechanism to consider the uncertainty and security constraints. Finally, experiments involving machine traded control, human traded control were implemented. The preliminary experimental results of this article show that our traded control method improves decision-making performance and verifies the effectiveness for SDM problems.

Posted in: Human-robot interaction , Tagged: Reinforcement learning, Sequiential Decision Making

New algorithms for optimal path planning with certain optimality guarantees

October 27, 2023 05:48 , Juan-Antonio Fernández-Madrigal

Strub MP, Gammell JD. **Adaptively Informed Trees (AIT) and Effort Informed Trees (EIT): Asymmetric bidirectional sampling-based path planning,** The International Journal of Robotics Research. 2022;41(4):390-417 DOI: 10.1177/02783649211069572.

Optimal path planning is the problem of finding a valid sequence of states between a start and goal that optimizes an objective. Informed path planning algorithms order their search with problem-specific knowledge expressed as heuristics and can be orders of magnitude more efficient than uninformed algorithms. Heuristics are most effective when they are both accurate and computationally inexpensive to evaluate, but these are often conflicting characteristics. This makes the selection of appropriate heuristics difficult for many problems. This paper presents two almost-surely asymptotically optimal sampling-based path planning algorithms to address this challenge, Adaptively Informed Trees (AIT*) and Effort Informed Trees (EIT*). These algorithms use an asymmetric bidirectional search in which both searches continuously inform each other. This allows AIT* and EIT* to improve planning performance by simultaneously calculating and exploiting increasingly accurate, problem-specific heuristics. The benefits of AIT* and EIT* relative to other sampling-based algorithms are demonstrated on 12 problems in abstract, robotic, and biomedical domains optimizing path length and obstacle clearance. The experiments show that AIT* and EIT* outperform other algorithms on problems optimizing obstacle clearance, where a priori cost heuristics are often ineffective, and still perform well on problems minimizing path length, where such heuristics are often effective.

Posted in: Robot motion planning , Tagged: Path planning

Probabilistic ICP (Iterative Closest Point) with an intro on classical ICP

October 27, 2023 05:45 , Juan-Antonio Fernández-Madrigal

Breux Y, Mas A, Lapierre L. On-manifold probabilistic Iterative Closest Point: Application to underwater karst exploration, The International Journal of Robotics Research. 2022;41(9-10):875-902 DOI: 10.1177/02783649221101418.

This paper proposes MpIC, an on-manifold derivation of the probabilistic Iterative Correspondence (pIC) algorithm, which is a stochastic version of the original Iterative Closest Point. It is developed in the context of autonomous underwater karst exploration based on acoustic sonars. First, a derivation of pIC based on the Lie group structure of SE(3) is developed. The closed-form expression of the covariance modeling the estimated rigid transformation is also provided. In a second part, its application to 3D scan matching between acoustic sonar measurements is proposed. It is a prolongation of previous work on elevation angle estimation from wide-beam acoustic sonar. While the pIC approach proposed is intended to be a key component in a Simultaneous Localization and Mapping framework, this paper focuses on assessing its viability on a unitary basis. As ground truth data in karst aquifer are difficult to obtain, quantitative experiments are carried out on a simulated karst environment and show improvement compared to previous state-of-the-art approach. The algorithm is also evaluated on a real underwater cave dataset demonstrating its practical applicability.

See also: Maken FA, Ramos F, Ott L. Bayesian iterative closest point for mobile robot localization. The International Journal of Robotics Research. 2022;41(9-10):851-874. doi:10.1177/02783649221101417

Posted in: Mobile Robot Localization , Tagged: ICP, Useful for teaching

Survey of machine learning applied to robot navigation, including a brief survey of classic navigation

October 27, 2023 05:42 , Juan-Antonio Fernández-Madrigal

Xiao, X., Liu, B., Warnell, G. et al. Motion planning and control for mobile robot navigation using machine learning: a survey, Auton Robot 46, 569\u2013597 (2022) DOI: 10.1007/s10514-022-10039-8.

Moving in complex environments is an essential capability of intelligent mobile robots. Decades of research and engineering have been dedicated to developing sophisticated navigation systems to move mobile robots from one point to another. Despite their overall success, a recently emerging research thrust is devoted to developing machine learning techniques to address the same problem, based in large part on the success of deep learning. However, to date, there has not been much direct comparison between the classical and emerging paradigms to this problem. In this article, we survey recent works that apply machine learning for motion planning and control in mobile robot navigation, within the context of classical navigation systems. The surveyed works are classified into different categories, which delineate the relationship of the learning approaches to classical methods. Based on this classification, we identify common challenges and promising future directions.

Posted in: Robot motion planning , Tagged: Machine learning, Robot navigation, Survey

POMDP Planner that uses multiple levels of approximation to the system dynamics to reduce the number and complexity of forward simulations

October 27, 2023 05:39 , Juan-Antonio Fernández-Madrigal

Hoerger M, Kurniawati H, Elfes A. , Multilevel Monte Carlo for solving POMDPs on-line, The International Journal of Robotics Research. 2023;42(4-5):196-213 DOI: 10.1177/02783649221093658.

Planning under partial observability is essential for autonomous robots. A principled way to address such planning problems is the Partially Observable Markov Decision Process (POMDP). Although solving POMDPs is computationally intractable, substantial advancements have been achieved in developing approximate POMDP solvers in the past two decades. However, computing robust solutions for systems with complex dynamics remains challenging. Most on-line solvers rely on a large number of forward simulations and standard Monte Carlo methods to compute the expected outcomes of actions the robot can perform. For systems with complex dynamics, for example, those with non-linear dynamics that admit no closed-form solution, even a single forward simulation can be prohibitively expensive. Of course, this issue exacerbates for problems with long planning horizons. This paper aims to alleviate the above difficulty. To this end, we propose a new on-line POMDP solver, called Multilevel POMDP Planner\u2009(MLPP), that combines the commonly known Monte-Carlo-Tree-Search with the concept of Multilevel Monte Carlo to speed up our capability in generating approximately optimal solutions for POMDPs with complex dynamics. Experiments on four different problems involving torque control, navigation and grasping indicate that MLPP\u2009substantially outperforms state-of-the-art POMDP solvers.

Posted in: Robot task planning , Tagged: Hierarchical POMDPs, Monte Carlo POMDPs, POMDPs

Semi-Markov HMMs for modelling time series in milling machines

October 27, 2023 05:32 , Juan-Antonio Fernández-Madrigal

Kai Li, Chaochao Qiu, Xinzhao Zhou, Mingsong Chen, Yongcheng Lin, Xianshi Jia, Bin Li, Modeling and tagging of time sequence signals in the milling process based on an improved hidden semi-Markov model, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117758.

Vibration signals are widely used in the field of tool wear, tool residual life prediction and health monitoring of mechanical equipment. However, the current data-driven research methods mostly rely on high-value and high-density labeled data to establish relevant models and algorithms. Therefore, it is of great significance to solve the problem of automatic tagging of data, realize automatic signal interception, and enhance the value density of manufacturing process data. The Hidden semi-Markov model (HSMM) can describe the real spatial statistical characteristics of random models through observable data. As HSMM does not need the real labels of the signal, it can reduce tagging work to improve the marking efficiency. In this paper, an improved HSMM was proposed to model and tag the spindle vibration signals in the milling process. First, the Mel frequency cepstral coefficients (MFCCs) were extracted as observation sequences from the collected spindle vibration signals, and the dimension of the original features was reduced by linear discriminant analysis (LDA). Subsequently, a signal automatic tagging model based on HSMM was developed, in which the state duration can be explicitly modeled. Finally, the evaluation of the proposed methodology was carried out in the laboratory and real industry machining. The experimental results confirmed the effectiveness and robustness of the proposed model.

Posted in: Probability and statistics , Tagged: Clustering of time series, HMMs, Semi-markov processes

A survey on visual SLAM in robotics

October 27, 2023 05:26 , Juan-Antonio Fernández-Madrigal

Iman Abaspur Kazerouni, Luke Fitzgerald, Gerard Dooly, Daniel Toal, A survey of state-of-the-art on visual SLAM, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117734.

This paper is an overview to Visual Simultaneous Localization and Mapping (V-SLAM). We discuss the basic definitions in the SLAM and vision system fields and provide a review of the state-of-the-art methods utilized for mobile robot\u2019s vision and SLAM. This paper covers topics from the basic SLAM methods, vision sensors, machine vision algorithms for feature extraction and matching, Deep Learning (DL) methods and datasets for Visual Odometry (VO) and Loop Closure (LC) in V-SLAM applications. Several feature extraction and matching algorithms are simulated to show a better vision of feature-based techniques.

Communication delays modelled as Gamma distributions for space-earth applications

October 20, 2023 09:55 , Juan-Antonio Fernández-Madrigal

H. Chen, Z. Liu, P. Huang and Z. Kuang, Time-Delay Modeling and Simulation for Relay Communication-Based Space Telerobot System, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 7, pp. 4211-4222, July 2022 DOI: 10.1109/TSMC.2021.3090806.

In a space telerobot system (STS), effectiveness of the control method in eliminating the time delay\u2019s influences is advisable to be verified under the real circumstance. However, it is difficult and costly for many scholars to obtain confidential information that would allow them to establish an STS. It may be feasible, using some existing results, to model the time delay as close to reality as possible, and to then program a simulation system to generate the simulated time delay, thus verifying validity. In this article, time-delay modeling and simulation problems for relay communication-based STS are first studied. The time delay in relay communication-based STS consists of both processing and communication time delays; the latter is divided into ground and ground-space parts. By extending the available results, processing and ground communication time delays are modeled with the probability distribution function modeling approach. An optimal communication link identification and minimum time-delay realization (OCLIMTDR) method is proposed to model the ground-space communication time delay. In this method, the novel point\u2013vector\u2013sphere (PVS) algorithm serves to judge link connectivity. The PVS algorithm is based on geometric theory, which gives the OCLIMTDR method good extensibility and renders it suitable for any relay communication network in theory. All three parts of the time-delay models are integrated to form the loop time-delay model of the STS. Subsequently, a time-delay simulation system is created by programming the loop time-delay model. Finally, the correctness of the simulation system is further authenticated based on simulations and some prior knowledge.

Posted in: Communication networks , Tagged: Communication delays, Probability distribution estimation

« Previous 1 … 16 17 18 19 20 … 77 Next »

Hierarchical RL with diverse methods integrated in the framework

Ye Zhou, Hann Woei Ho, Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning, Engineering Applications of Artificial Intelligence, Volume 114, 2022 DOI: 10.1016/j.engappai.2022.105152.

Unexpected consequences of training smarthome systems with reinforcement learning: effects on human behaviours

S. Suman, A. Etemad and F. Rivest, TPotential Impacts of Smart Homes on Human Behavior: A Reinforcement Learning Approach, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 567-580, Aug. 2022 DOI: 10.1109/TAI.2021.3127483.

Human+machine sequential decision making

Q. Zhang, Y. Kang, Y. -B. Zhao, P. Li and S. You, Traded Control of Human\u2013Machine Systems for Sequential Decision-Making Based on Reinforcement Learning, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 553-566, Aug. 2022 DOI: 10.1109/TAI.2021.3127857.

New algorithms for optimal path planning with certain optimality guarantees

Strub MP, Gammell JD. **Adaptively Informed Trees (AIT) and Effort Informed Trees (EIT): Asymmetric bidirectional sampling-based path planning,** The International Journal of Robotics Research. 2022;41(4):390-417 DOI: 10.1177/02783649211069572.

Probabilistic ICP (Iterative Closest Point) with an intro on classical ICP

Breux Y, Mas A, Lapierre L. On-manifold probabilistic Iterative Closest Point: Application to underwater karst exploration, The International Journal of Robotics Research. 2022;41(9-10):875-902 DOI: 10.1177/02783649221101418.

Survey of machine learning applied to robot navigation, including a brief survey of classic navigation

Xiao, X., Liu, B., Warnell, G. et al. Motion planning and control for mobile robot navigation using machine learning: a survey, Auton Robot 46, 569\u2013597 (2022) DOI: 10.1007/s10514-022-10039-8.

POMDP Planner that uses multiple levels of approximation to the system dynamics to reduce the number and complexity of forward simulations

Hoerger M, Kurniawati H, Elfes A. , Multilevel Monte Carlo for solving POMDPs on-line, The International Journal of Robotics Research. 2023;42(4-5):196-213 DOI: 10.1177/02783649221093658.

Semi-Markov HMMs for modelling time series in milling machines

Kai Li, Chaochao Qiu, Xinzhao Zhou, Mingsong Chen, Yongcheng Lin, Xianshi Jia, Bin Li, Modeling and tagging of time sequence signals in the milling process based on an improved hidden semi-Markov model, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117758.

A survey on visual SLAM in robotics

Iman Abaspur Kazerouni, Luke Fitzgerald, Gerard Dooly, Daniel Toal, A survey of state-of-the-art on visual SLAM, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117734.

Communication delays modelled as Gamma distributions for space-earth applications

H. Chen, Z. Liu, P. Huang and Z. Kuang, Time-Delay Modeling and Simulation for Relay Communication-Based Space Telerobot System, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 7, pp. 4211-4222, July 2022 DOI: 10.1109/TSMC.2021.3090806.

Post Navigation

Fields, areas and lines of research

Archives

Ye Zhou, Hann Woei Ho, Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning, Engineering Applications of Artificial Intelligence, Volume 114, 2022 DOI: 10.1016/j.engappai.2022.105152.

S. Suman, A. Etemad and F. Rivest, TPotential Impacts of Smart Homes on Human Behavior: A Reinforcement Learning Approach, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 567-580, Aug. 2022 DOI: 10.1109/TAI.2021.3127483.

Q. Zhang, Y. Kang, Y. -B. Zhao, P. Li and S. You, Traded Control of Human\u2013Machine Systems for Sequential Decision-Making Based on Reinforcement Learning, IEEE Transactions on Artificial Intelligence, vol. 3, no. 4, pp. 553-566, Aug. 2022 DOI: 10.1109/TAI.2021.3127857.

Strub MP, Gammell JD. Adaptively Informed Trees (AIT*) and Effort Informed Trees (EIT*): Asymmetric bidirectional sampling-based path planning, The International Journal of Robotics Research. 2022;41(4):390-417 DOI: 10.1177/02783649211069572.

Breux Y, Mas A, Lapierre L. On-manifold probabilistic Iterative Closest Point: Application to underwater karst exploration, The International Journal of Robotics Research. 2022;41(9-10):875-902 DOI: 10.1177/02783649221101418.

Xiao, X., Liu, B., Warnell, G. et al. Motion planning and control for mobile robot navigation using machine learning: a survey, Auton Robot 46, 569\u2013597 (2022) DOI: 10.1007/s10514-022-10039-8.

Hoerger M, Kurniawati H, Elfes A. , Multilevel Monte Carlo for solving POMDPs on-line, The International Journal of Robotics Research. 2023;42(4-5):196-213 DOI: 10.1177/02783649221093658.

Kai Li, Chaochao Qiu, Xinzhao Zhou, Mingsong Chen, Yongcheng Lin, Xianshi Jia, Bin Li, Modeling and tagging of time sequence signals in the milling process based on an improved hidden semi-Markov model, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117758.

Iman Abaspur Kazerouni, Luke Fitzgerald, Gerard Dooly, Daniel Toal, A survey of state-of-the-art on visual SLAM, Expert Systems with Applications, Volume 205, 2022 DOI: 10.1016/j.eswa.2022.117734.

H. Chen, Z. Liu, P. Huang and Z. Kuang, Time-Delay Modeling and Simulation for Relay Communication-Based Space Telerobot System, IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 7, pp. 4211-4222, July 2022 DOI: 10.1109/TSMC.2021.3090806.

Post Navigation

Fields, areas and lines of research

Transversal topics, methods and tools

Archives

Strub MP, Gammell JD. **Adaptively Informed Trees (AIT) and Effort Informed Trees (EIT): Asymmetric bidirectional sampling-based path planning,** The International Journal of Robotics Research. 2022;41(4):390-417 DOI: 10.1177/02783649211069572.