Juan-Antonio Fernández-Madrigal | kipr

Replacing the update step of a Bayesian filter with a maximum-likelihood optimisation in order to use non-linear observation models in a (linear-transition) Kalman framework

July 30, 2015 16:32 , Juan-Antonio Fernández-Madrigal

Damián Marelli, Minyue Fu, and Brett Ninness, Asymptotic Optimality of the Maximum-Likelihood Kalman Filter for Bayesian Tracking With Multiple Nonlinear Sensors, IEEE Transactions on Signal Processing, vol. 63, no. 17, DOI: 10.1109/TSP.2015.2440220.

Bayesian tracking is a general technique for state estimation of nonlinear dynamic systems, but it suffers from the drawback of computational complexity. This paper is concerned with a class of Wiener systems with multiple nonlinear sensors. Such a system consists of a linear dynamic system followed by a set of static nonlinear measurements. We study a maximum-likelihood Kalman filtering (MLKF) technique which involves maximum-likelihood estimation of the nonlinear measurements followed by classical Kalman filtering. This technique permits a distributed implementation of the Bayesian tracker and guarantees the boundedness of the estimation error. The focus of this paper is to study the extent to which the MLKF technique approximates the theoretically optimal Bayesian tracker. We provide conditions to guarantee that this approximation becomes asymptotically exact as the number of sensors becomes large. Two case studies are analyzed in detail.
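As a rough illustration of the two-stage idea in the abstract (maximum-likelihood estimation from the nonlinear sensors, then a classical Kalman update), here is a minimal scalar sketch. It is not the authors' algorithm: the nonlinearity h, its derivative dh, the function names, and the assumptions of identical sensors with equal Gaussian noise variance r are all hypothetical choices made for the example.

```python
def h(x):
    """Hypothetical static sensor nonlinearity (monotonic, so the ML estimate is unique)."""
    return x + 0.1 * x ** 3


def dh(x):
    """Derivative of h."""
    return 1.0 + 0.3 * x ** 2


def ml_estimate(ys, r, x0=0.0, iters=50):
    """Gauss-Newton ML estimate of x from readings y_i = h(x) + N(0, r) noise.

    Returns the estimate and its approximate variance (inverse Fisher information).
    """
    x = x0
    for _ in range(iters):
        grad = sum((y - h(x)) * dh(x) for y in ys)
        curvature = len(ys) * dh(x) ** 2
        x += grad / curvature
    var = r / (len(ys) * dh(x) ** 2)
    return x, var


def mlkf_step(mean, var, ys, a, q, r):
    """One filter step: linear prediction, ML stage, then a linear Kalman update."""
    # Prediction with the linear transition x_k = a * x_{k-1} + N(0, q).
    mean_p = a * mean
    var_p = a * a * var + q
    # ML stage: compress all nonlinear readings into one pseudo-measurement z of x.
    z, rz = ml_estimate(ys, r, x0=mean_p)
    # Classical Kalman update, treating z as a direct (linear) observation of x.
    gain = var_p / (var_p + rz)
    return mean_p + gain * (z - mean_p), (1.0 - gain) * var_p


# Example: five noise-free readings of a true state x = 2.0.
readings = [h(2.0)] * 5
print(mlkf_step(0.0, 1.0, readings, a=1.0, q=0.0, r=0.1))
```

The more sensors contribute to the ML stage, the smaller the pseudo-measurement variance rz becomes, which is the regime in which the abstract says the approximation improves.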

A new algorithm for clock synchronization in wireless sensor networks with bounded delays, which includes interesting references to surveys

July 27, 2015 10:26 , Juan-Antonio Fernández-Madrigal

Emanuele Garone, Andrea Gasparri, Francesco Lamonaca, Clock synchronization protocol for wireless sensor networks with bounded communication delays, Automatica, Volume 59, September 2015, Pages 60-72, ISSN 0005-1098, DOI: 10.1016/j.automatica.2015.06.014.

In this paper, we address the clock synchronization problem for wireless sensor networks. In particular, we consider a wireless sensor network where nodes are equipped with a local clock and communicate in order to achieve a common sense of time. The proposed approach consists of two asynchronous consensus algorithms, the first of which synchronizes the clock frequencies and the second of which synchronizes the clock offsets. This work advances the state of the art by providing robustness against bounded communication delays. A theoretical characterization of the algorithm properties is provided. Simulations and experimental results are presented to corroborate the theoretical findings and show the effectiveness of the proposed algorithm.
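To give a feel for the consensus building block mentioned in the abstract, the sketch below runs a plain synchronous average-consensus iteration twice, once on frequency corrections and once on offset corrections. It is a strong simplification and not the protocol of the paper: it ignores asynchrony and communication delays, and it assumes (hypothetically) that each node can exchange its correction value directly with its neighbours. The function names, the ring topology and the step size eps are illustrative.

```python
def consensus(values, neighbors, eps=0.3, rounds=200):
    """Synchronous average consensus: each node moves towards its neighbours' values.

    eps must be smaller than 1 / (maximum node degree) for the iteration to converge.
    """
    x = list(values)
    for _ in range(rounds):
        x = [xi + eps * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return x


def synchronize(skews, offsets, neighbors):
    """Two consensus passes: first on clock frequencies, then on clock offsets."""
    return consensus(skews, neighbors), consensus(offsets, neighbors)


# Hypothetical four-node ring network.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
skews, offsets = synchronize([1.0001, 0.9998, 1.0003, 0.9999],
                             [0.5, -0.2, 0.1, 0.0], ring)
print(skews, offsets)
```

Because the update is symmetric, the network average is preserved, so every node ends up at the mean of the initial values.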

Posted in: Communication networks, Real-Time Systems , Tagged: Clock synchronization, Survey, Wireless sensor networks

A very interesting review of current approaches to SLAM based on smoothing (i.e., graph optimization) and on clustering the map into submaps

July 22, 2015 09:37 , Juan-Antonio Fernández-Madrigal

Jiantong Cheng, Jonghyuk Kim, Jinliang Shao, Weihua Zhang, Robust linear pose graph-based SLAM, Robotics and Autonomous Systems, Volume 72, October 2015, Pages 71-82, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.04.010.

This paper addresses a robust and efficient solution to eliminate false loop-closures in a pose-graph linear SLAM problem. Linear SLAM was recently demonstrated based on submap joining techniques in which a nonlinear coordinate transformation was performed separately out of the optimization loop, resulting in a convex optimization problem. This however introduces added complexities in dealing with false loop-closures, which mostly stem from two factors: (a) the limited local observations in map-joining stages and (b) the non block-diagonal nature of the information matrix of each submap. To address these problems, we propose a Robust Linear SLAM by (a) developing a delayed optimization for outlier candidates and (b) utilizing a Schur complement to efficiently eliminate corrupted information blocks. Based on this new strategy, we prove that the spread of outlier information does not compromise the optimization performance of inliers and can be fully filtered out from the corrupted information matrix. Experimental results based on public synthetic and real-world datasets in 2D and 3D environments show that this robust approach can cope with incorrect loop-closures robustly and effectively.
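The Schur complement mentioned in the abstract is the standard tool for eliminating a variable from a Gaussian in information form. The sketch below shows only that generic operation on a tiny one-dimensional pose chain; it is not the outlier-removal procedure of the paper, and the function name and example numbers are hypothetical.

```python
def marginalize(info, vec, b):
    """Eliminate scalar variable b from an information matrix and information vector.

    Uses the Schur complement: L' = L_aa - L_ab * L_bb^-1 * L_ba, and likewise
    for the information vector.
    """
    keep = [i for i in range(len(vec)) if i != b]
    lam_bb = info[b][b]
    new_info = [[info[i][j] - info[i][b] * info[b][j] / lam_bb for j in keep]
                for i in keep]
    new_vec = [vec[i] - info[i][b] * vec[b] / lam_bb for i in keep]
    return new_info, new_vec


# 1-D chain x0 - x1 - x2: unit-information prior x0 = 0 and two odometry
# constraints x1 - x0 = 1 and x2 - x1 = 1 (the solution is x = [0, 1, 2]).
info = [[2.0, -1.0, 0.0],
        [-1.0, 2.0, -1.0],
        [0.0, -1.0, 1.0]]
vec = [-1.0, 0.0, 1.0]

# Removing the middle pose leaves a dense 2x2 system over x0 and x2
# whose solution is still x0 = 0, x2 = 2.
print(marginalize(info, vec, 1))
```

Note how the elimination creates a new link between x0 and x2 that was zero before; this fill-in is why information matrices of joined submaps stop being block-diagonal.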

Posted in: Mobile robot SLAM , Tagged: Graph-based SLAM, Long-term SLAM, Map clustering, Survey

Quantum probability theory as an alternative to classical (Kolmogorov) probability theory for modelling human decision-making processes, and a curious description of the effect of a particular ordering of decisions on the complete result

July 21, 2015 11:45 , Juan-Antonio Fernández-Madrigal

Peter D. Bruza, Zheng Wang, Jerome R. Busemeyer, Quantum cognition: a new theoretical approach to psychology, Trends in Cognitive Sciences, Volume 19, Issue 7, July 2015, Pages 383-393, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.05.001.

What type of probability theory best describes the way humans make judgments under uncertainty and decisions under conflict? Although rational models of cognition have become prominent and have achieved much success, they adhere to the laws of classical probability theory despite the fact that human reasoning does not always conform to these laws. For this reason we have seen the recent emergence of models based on an alternative probabilistic framework drawn from quantum theory. These quantum models show promise in addressing cognitive phenomena that have proven recalcitrant to modeling by means of classical probability theory. This review compares and contrasts probabilistic models based on Bayesian or classical versus quantum principles, and highlights the advantages and disadvantages of each approach.
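The order effect can be reproduced with very little machinery. In quantum probability a question is a projector, and the probability of answering "yes" to two questions in sequence is the squared length of the state after both projections; when the projectors do not commute, the order matters. The sketch below uses a real two-dimensional state and hypothetical angles; it illustrates the general formalism only and is not a model taken from the review.

```python
import math


def unit(angle_deg):
    """Unit vector in the plane at the given angle (degrees)."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


def project(state, axis):
    """Project a 2-D real state onto the ray spanned by the unit vector `axis`."""
    amp = state[0] * axis[0] + state[1] * axis[1]
    return (amp * axis[0], amp * axis[1])


def norm_sq(v):
    return v[0] ** 2 + v[1] ** 2


def prob_yes_then_yes(state, first, second):
    """Probability of 'yes' to `first` and then 'yes' to `second`."""
    return norm_sq(project(project(state, first), second))


belief = unit(60)      # hypothetical initial belief state
question_a = unit(0)   # 'yes' ray of question A
question_b = unit(45)  # 'yes' ray of question B (does not commute with A)

print(prob_yes_then_yes(belief, question_a, question_b))  # A asked first
print(prob_yes_then_yes(belief, question_b, question_a))  # B asked first
```

In classical probability both orders would give the same joint probability P(A and B), so any difference between the two printed values is something a classical model cannot produce without extra assumptions.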

Posted in: Probability theories and interpretations, Psycho-physiological bases of engineering , Tagged: Decision making, Directly bioinspired, Quantum probability, Survey

Transfer learning in reinforcement learning through case-based reasoning and the use of heuristics for selecting actions

July 21, 2015 11:24 , Juan-Antonio Fernández-Madrigal

Reinaldo A.C. Bianchi, Luiz A. Celiberto Jr., Paulo E. Santos, Jackson P. Matsuura, Ramon Lopez de Mantaras, Transferring knowledge as heuristics in reinforcement learning: A case-based approach, Artificial Intelligence, Volume 226, September 2015, Pages 102-121, ISSN 0004-3702, DOI: 10.1016/j.artint.2015.05.008.

The goal of this paper is to propose and analyse a transfer learning meta-algorithm that allows the implementation of distinct methods using heuristics to accelerate a Reinforcement Learning procedure in one domain (the target) that are obtained from another (simpler) domain (the source domain). This meta-algorithm works in three stages: first, it uses a Reinforcement Learning step to learn a task on the source domain, storing the knowledge thus obtained in a case base; second, it does an unsupervised mapping of the source-domain actions to the target-domain actions; and, third, the case base obtained in the first stage is used as heuristics to speed up the learning process in the target domain.
A set of empirical evaluations were conducted in two target domains: the 3D mountain car (using a learned case base from a 2D simulation) and stability learning for a humanoid robot in the Robocup 3D Soccer Simulator (that uses knowledge learned from the Acrobot domain). The results attest that our transfer learning algorithm outperforms recent heuristically-accelerated reinforcement learning and transfer learning algorithms.
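The third stage, using transferred knowledge as a heuristic, can be pictured as a bias added to the learned values at action-selection time. The sketch below is only a minimal tabular illustration under that assumption: the heuristic table h stands in for whatever the case base suggests, and the names, the weight xi and the epsilon-greedy rule are hypothetical, not the paper's meta-algorithm.

```python
import random


def choose_action(q, h, state, actions, xi=1.0, epsilon=0.1, rng=random):
    """Epsilon-greedy choice over Q(s, a) + xi * H(s, a).

    q and h are dicts keyed by (state, action); missing entries count as 0.
    """
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions,
               key=lambda a: q.get((state, a), 0.0) + xi * h.get((state, a), 0.0))


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Standard Q-learning update; the heuristic never changes the learned values."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)


actions = ["left", "right"]
heuristic = {("s0", "right"): 1.0}  # hypothetical hint retrieved from a case base
q_values = {}
action = choose_action(q_values, heuristic, "s0", actions, epsilon=0.0)
q_update(q_values, "s0", action, reward=1.0, next_state="s1", actions=actions)
print(action, q_values)
```

Since the heuristic only affects which action is tried, a misleading hint slows learning down but does not corrupt the value estimates, which can still override it once they grow large enough.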

Posted in: Reinforcement learning in AI, Reinforcement learning theory , Tagged: Bootstrapped learning, Case-based learning, Neural networks, Reinforcement learning, Transfer learning

Semantic and syntactic bootstrapped learning for robots, inspired by similar processes in humans, that uses language as a scaffolding mechanism to improve learning in unknown situations

July 21, 2015 11:01 , Juan-Antonio Fernández-Madrigal

Worgotter, F.; Geib, C.; Tamosiunaite, M.; Aksoy, E.E.; Piater, J.; Hanchen Xiong; Ude, A.; Nemec, B.; Kraft, D.; Kruger, N.; Wachter, M.; Asfour, T., Structural Bootstrapping—A Novel, Generative Mechanism for Faster and More Efficient Acquisition of Action-Knowledge, IEEE Transactions on Autonomous Mental Development, vol. 7, no. 2, pp. 140-154, June 2015, DOI: 10.1109/TAMD.2015.2427233.

Humans, but also robots, learn to improve their behavior. Without existing knowledge, learning either needs to be explorative and, thus, slow or, to be more efficient, it needs to rely on supervision, which may not always be available. However, once some knowledge base exists an agent can make use of it to improve learning efficiency and speed. This happens for our children at the age of around three when they very quickly begin to assimilate new information by making guided guesses how this fits to their prior knowledge. This is a very efficient generative learning mechanism in the sense that the existing knowledge is generalized into as-yet unexplored, novel domains. So far generative learning has not been employed for robots and robot learning remains a slow and tedious process. The goal of the current study is to devise for the first time a general framework for a generative process that will improve learning and which can be applied at all different levels of the robot’s cognitive architecture. To this end, we introduce the concept of structural bootstrapping (borrowed and modified from child language acquisition) to define a probabilistic process that uses existing knowledge together with new observations to supplement our robot’s data-base with missing information about planning-, object-, as well as action-relevant entities. In a kitchen scenario, we use the example of making batter by pouring and mixing two components and show that the agent can efficiently acquire new knowledge about planning operators, objects as well as the required motor pattern for stirring by structural bootstrapping. Some benchmarks are shown, too, that demonstrate how structural bootstrapping improves performance.

Posted in: Developmental robotics, Psycho-physiological bases of engineering , Tagged: Bootstrapped learning, Directly bioinspired, Task planning

Developmental approach for a robot manipulator that learns in several bootstrapped stages, strongly inspired by infant development

July 21, 2015 10:32 , Juan-Antonio Fernández-Madrigal

Ugur, E.; Nagai, Y.; Sahin, E.; Oztop, E., Staged Development of Robot Skills: Behavior Formation, Affordance Learning and Imitation with Motionese, IEEE Transactions on Autonomous Mental Development, vol. 7, no. 2, pp. 119-139, June 2015, DOI: 10.1109/TAMD.2015.2426192.

Inspired by infant development, we propose a three-staged developmental framework for an anthropomorphic robot manipulator. In the first stage, the robot is initialized with a basic reach-and-enclose-on-contact movement capability, and discovers a set of behavior primitives by exploring its movement parameter space. In the next stage, the robot exercises the discovered behaviors on different objects, and learns the caused effects, effectively building a library of affordances and associated predictors. Finally, in the third stage, the learned structures and predictors are used to bootstrap complex imitation and action learning with the help of a cooperative tutor. The main contribution of this paper is the realization of an integrated developmental system where the structures emerging from the sensorimotor experience of an interacting real robot are used as the sole building blocks of the subsequent stages that generate increasingly more complex cognitive capabilities. The proposed framework includes a number of common features with infant sensorimotor development. Furthermore, the findings obtained from the self-exploration and motionese guided human-robot interaction experiments allow us to reason about the underlying mechanisms of simple-to-complex sensorimotor skill progression in human infants.

Posted in: Developmental robotics, Psycho-physiological bases of engineering , Tagged: Bootstrapped learning, Directly bioinspired, Manipulation, Skill learning, Survey, Task planning

Finding the common utility of actions in several tasks learnt in the same domain in order to reduce the learning cost of reinforcement learning

July 21, 2015 08:42 , Juan-Antonio Fernández-Madrigal

Rosman, B.; Ramamoorthy, S., Action Priors for Learning Domain Invariances, IEEE Transactions on Autonomous Mental Development, vol. 7, no. 2, pp. 107-118, June 2015, DOI: 10.1109/TAMD.2015.2419715.

An agent tasked with solving a number of different decision making problems in similar environments has an opportunity to learn over a longer timescale than each individual task. Through examining solutions to different tasks, it can uncover behavioral invariances in the domain, by identifying actions to be prioritized in local contexts, invariant to task details. This information has the effect of greatly increasing the speed of solving new problems. We formalise this notion as action priors, defined as distributions over the action space, conditioned on environment state, and show how these can be learnt from a set of value functions. We apply action priors in the setting of reinforcement learning, to bias action selection during exploration. Aggressive use of action priors performs context based pruning of the available actions, thus reducing the complexity of lookahead during search. We additionally define action priors over observation features, rather than states, which provides further flexibility and generalizability, with the additional benefit of enabling feature selection. Action priors are demonstrated in experiments in a simulated factory environment and a large random graph domain, and show significant speed ups in learning new tasks. Furthermore, we argue that this mechanism is cognitively plausible, and is compatible with findings from cognitive psychology.
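One simple way to picture an action prior, assuming tabular value functions, is to count in each state how often every action was the greedy choice across the previously solved tasks and to normalise those counts. The sketch below does exactly that with a uniform pseudo-count; the function names, the pseudo-count and the toy Q-tables are hypothetical, and the paper's actual estimator may differ in its details.

```python
import random


def action_priors(q_tables, states, actions, pseudo_count=1.0):
    """Per-state distribution over actions, learnt from several tasks' Q-tables.

    Each Q-table is a dict keyed by (state, action). An action gets one count
    per task in which it is the greedy action for that state.
    """
    priors = {}
    for s in states:
        counts = {a: pseudo_count for a in actions}
        for q in q_tables:
            best = max(actions, key=lambda a: q[(s, a)])
            counts[best] += 1.0
        total = sum(counts.values())
        priors[s] = {a: c / total for a, c in counts.items()}
    return priors


def explore(priors, state, actions, rng=random):
    """Draw an exploratory action from the prior instead of uniformly."""
    weights = [priors[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


actions = ["left", "right"]
task_1 = {("corridor", "left"): 0.1, ("corridor", "right"): 0.9}
task_2 = {("corridor", "left"): 0.2, ("corridor", "right"): 0.4}
priors = action_priors([task_1, task_2], ["corridor"], actions)
print(priors, explore(priors, "corridor", actions))
```

The pseudo-count keeps every action possible during exploration; dropping actions whose prior falls below a threshold would correspond to the more aggressive pruning the abstract mentions.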

Posted in: Psycho-physiological bases of engineering, Reinforcement learning theory , Tagged: Exploration vs. exploitation, Q-learning, Reinforcement learning, Useful for teaching

A brief general explanation of Rao-Blackwellization and a new way of applying it to reduce the variance of a point estimate in a sequential Bayesian setting

July 20, 2015 11:08 , Juan-Antonio Fernández-Madrigal

Petetin, Y.; Desbouvries, F., Bayesian Conditional Monte Carlo Algorithms for Nonlinear Time-Series State Estimation, IEEE Transactions on Signal Processing, vol. 63, no. 14, pp. 3586-3598, DOI: 10.1109/TSP.2015.2423251.

Bayesian filtering aims at estimating sequentially a hidden process from an observed one. In particular, sequential Monte Carlo (SMC) techniques propagate in time weighted trajectories which represent the posterior probability density function (pdf) of the hidden process given the available observations. On the other hand, conditional Monte Carlo (CMC) is a variance reduction technique which replaces the estimator of a moment of interest by its conditional expectation given another variable. In this paper, we show that up to some adaptations, one can make use of the time recursive nature of SMC algorithms in order to propose natural temporal CMC estimators of some point estimates of the hidden process, which outperform the associated crude Monte Carlo (MC) estimator whatever the number of samples. We next show that our Bayesian CMC estimators can be computed exactly, or approximated efficiently, in some hidden Markov chain (HMC) models; in some jump Markov state-space systems (JMSS); as well as in multitarget filtering. Finally our algorithms are validated via simulations.
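The variance reduction behind conditional Monte Carlo is easy to see in a static toy problem, which is all the sketch below attempts; it does not implement the temporal estimators of the paper. The model is a hypothetical one chosen so that the conditional expectation is known in closed form: Y ~ N(0, 1), X | Y ~ N(Y, 1), and the quantity of interest is E[X^2] = 2, with E[X^2 | Y] = Y^2 + 1.

```python
import random
import statistics


def crude_and_cmc(n, rng):
    """Return (crude MC estimate, conditional MC estimate) of E[X^2] from n draws."""
    crude, cmc = 0.0, 0.0
    for _ in range(n):
        y = rng.gauss(0.0, 1.0)
        x = rng.gauss(y, 1.0)
        crude += x * x        # uses the sampled x directly
        cmc += y * y + 1.0    # replaces x^2 by its conditional expectation given y
    return crude / n, cmc / n


def compare(reps=200, n=100, seed=0):
    """Empirical variance of both estimators over `reps` independent runs."""
    rng = random.Random(seed)
    runs = [crude_and_cmc(n, rng) for _ in range(reps)]
    crude_var = statistics.variance([r[0] for r in runs])
    cmc_var = statistics.variance([r[1] for r in runs])
    return crude_var, cmc_var


print(compare())
```

Both estimators are unbiased, but the conditional one integrates out the noise in X analytically, so its variance is lower for any sample size, which is the property the abstract emphasises.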

Posted in: Bayesian filtering , Tagged: Bayesian estimation, Rao-blackwellization, Recursive bayesian estimation

Efficient sampling of the agent-world interaction in reinforcement learning through the use of simulators with diverse fidelity to the real system

July 17, 2015 15:01 , Juan-Antonio Fernández-Madrigal

Cutler, M.; Walsh, T.J.; How, J.P., Real-World Reinforcement Learning via Multifidelity Simulators, IEEE Transactions on Robotics, vol. 31, no. 3, pp. 655-671, June 2015, DOI: 10.1109/TRO.2015.2419431.

Reinforcement learning (RL) can be a tool for designing policies and controllers for robotic systems. However, the cost of real-world samples remains prohibitive as many RL algorithms require a large number of samples before learning useful policies. Simulators are one way to decrease the number of required real-world samples, but imperfect models make deciding when and how to trust samples from a simulator difficult. We present a framework for efficient RL in a scenario where multiple simulators of a target task are available, each with varying levels of fidelity. The framework is designed to limit the number of samples used in each successively higher-fidelity/cost simulator by allowing a learning agent to choose to run trajectories at the lowest level simulator that will still provide it with useful information. Theoretical proofs of the framework’s sample complexity are given and empirical results are demonstrated on a remote-controlled car with multiple simulators. The approach enables RL algorithms to find near-optimal policies in a physical robot domain with fewer expensive real-world samples than previous transfer approaches or learning without simulators.

Posted in: Applications of reinforcement learning to robots , Tagged: Exploration vs. exploitation, Reinforcement learning, Transfer learning, Useful for teaching

