Author Archives: Juan-Antonio Fernández-Madrigal

Massive parallelization of POMDPs with a very good state-of-the-art review

Taekhee Lee, Young J. Kim (2015), Massively parallel motion planning algorithms under uncertainty using POMDP, The International Journal of Robotics Research, Vol 35, Issue 8, pp. 928 – 942, DOI: 10.1177/0278364915594856.

We present new parallel algorithms that solve continuous-state partially observable Markov decision process (POMDP) problems using the GPU (gPOMDP) and a hybrid of the GPU and CPU (hPOMDP). We choose the Monte Carlo value iteration (MCVI) method as our base algorithm and parallelize this algorithm using the multi-level parallel formulation of MCVI. For each parallel level, we propose efficient algorithms to utilize the massive data parallelism available on modern GPUs. Our GPU-based method uses the two workload distribution techniques, compute/data interleaving and workload balancing, in order to obtain the maximum parallel performance at the highest level. Here we also present a CPU–GPU hybrid method that takes advantage of both CPU and GPU parallelism in order to solve highly complex POMDP planning problems. The CPU is responsible for data preparation, while the GPU performs Monte Carlo simulations; these operations are performed concurrently using the compute/data overlap technique between the CPU and GPU. To the best of the authors’ knowledge, our algorithms are the first parallel algorithms that efficiently execute POMDP in a massively parallel fashion utilizing the GPU or a hybrid of the GPU and CPU. Our algorithms outperform the existing CPU-based algorithm by a factor of 75–99 based on the chosen benchmark.

Combining efficiently symbolic planning with geometric planning

Fabien Lagriffoul, Benjamin Andres (2016), Combining task and motion planning: A culprit detection problem, The International Journal of Robotics Research, Vol 35, Issue 8, pp. 890 – 927, DOI: 10.1177/0278364915619022.

Solving problems combining task and motion planning requires searching across a symbolic search space and a geometric search space. Because of the semantic gap between symbolic and geometric representations, symbolic sequences of actions are not guaranteed to be geometrically feasible. This compels us to search in the combined search space, in which frequent backtracks between symbolic and geometric levels make the search inefficient. We address this problem by guiding symbolic search with rich information extracted from the geometric level through culprit detection mechanisms.

A novel clock synchronization architecture for networked systems based on forcing the synchronization, with a nice summary of uses of clock synchronization and of existing synchronization architectures

S. Bolognani, R. Carli, E. Lovisari and S. Zampieri, “A Randomized Linear Algorithm for Clock Synchronization in Multi-Agent Systems,” in IEEE Transactions on Automatic Control, vol. 61, no. 7, pp. 1711-1726, July 2016. DOI: 10.1109/TAC.2015.2479136.

A broad family of randomized clock synchronization protocols based on a second order consensus algorithm is proposed. Under mild conditions on the graph connectivity, it is proved that the parameters of the algorithm can always be tuned in such a way that the clock synchronization is achieved in the probabilistic mean-square sense. This family of algorithms contains, as particular cases, several known approaches which range from distributed asynchronous to hierarchical synchronous protocols. This is illustrated by specializing the algorithm for the well-known broadcast and gossip scenarios in wireless communications, and for the standard hierarchical protocol used in the context of wired communications in data networks. In these cases, we show how the feasible range for the algorithm parameters can be explicitly computed. Finally, the performance of this strategy is validated by actual implementation in a real testbed and by numerical simulations.
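
The gossip scenario mentioned in the abstract can be illustrated with a much simpler toy than the paper's protocol. The sketch below is plain first-order pairwise averaging of clock offsets on a connected graph; it is not the second-order algorithm analysed by the authors, and it ignores differences in clock rate (skew). The function name `gossip_sync`, the ring topology and the offset values are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def gossip_sync(offsets: Sequence[float],
                edges: Sequence[Tuple[int, int]],
                rounds: int,
                seed: int = 0) -> List[float]:
    """Randomized pairwise gossip: at each round one random link wakes up
    and its two endpoints replace their offsets with the pairwise average.

    Simplification: only offsets are averaged (first-order consensus);
    the protocol in the paper also corrects clock rates (second order).
    """
    rng = random.Random(seed)      # local generator, so runs are repeatable
    x = list(offsets)
    for _ in range(rounds):
        i, j = rng.choice(edges)   # pick one communication link at random
        avg = 0.5 * (x[i] + x[j])
        x[i] = x[j] = avg          # both endpoints agree on the average
    return x


if __name__ == "__main__":
    # Hypothetical 5-node ring with arbitrary initial offsets (seconds).
    ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
    start = [0.0, 3.0, -1.0, 7.5, 0.5]
    print(gossip_sync(start, ring, rounds=2000))
```

Each pairwise average preserves the sum of the offsets, so on a connected graph the nodes drift toward the mean of the initial values. Compensating for skew as well requires a second state variable per node, which is where the second-order analysis in the paper comes in.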

Interesting approach for communicating robots: through the passive recognition of others' patterns of motion

Barnali Das, Micael S. Couceiro, Patricia A. Vargas, MRoCS: A new multi-robot communication system based on passive action recognition, Robotics and Autonomous Systems, Volume 82, August 2016, Pages 46-60, ISSN 0921-8890, DOI: 10.1016/j.robot.2016.04.002.

Multi-robot search-and-rescue missions often face major challenges in adverse environments due to the limitations of traditional implicit and explicit communication. This paper proposes a novel multi-robot communication system (MRoCS), which uses a passive action recognition technique that overcomes the shortcomings of traditional models. The proposed MRoCS relies on individual motion, by mimicking the waggle dance of honey bees and thus forming and recognising different patterns accordingly. The system was successfully designed and implemented in simulation and with real robots. Experimental results show that the pattern recognition process successfully reported high sensitivity with good precision in all cases for three different patterns, thus corroborating our hypothesis.

A formal study of the guarantees that deep neural networks offer for classification

R. Giryes, G. Sapiro and A. M. Bronstein, “Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy?,” in IEEE Transactions on Signal Processing, vol. 64, no. 13, pp. 3444-3457, July 1, 2016. DOI: 10.1109/TSP.2016.2546221.

Three important properties of a classification machinery are i) the system preserves the core information of the input data; ii) the training examples convey information about unseen data; and iii) the system is able to treat differently points from different classes. In this paper, we show that these fundamental properties are satisfied by the architecture of deep neural networks. We formally prove that these networks with random Gaussian weights perform a distance-preserving embedding of the data, with a special treatment for in-class and out-of-class data. Similar points at the input of the network are likely to have a similar output. The theoretical analysis of deep networks here presented exploits tools used in the compressed sensing and dictionary learning literature, thereby making a formal connection between these important topics. The derived results allow drawing conclusions on the metric learning properties of the network and their relation to its structure, as well as providing bounds on the required size of the training set such that the training examples would represent faithfully the unseen data. The results are validated with state-of-the-art trained networks.

A new theoretical framework for modeling concepts that lets them be combined in a way that reflects how humans combine them, with a good review of related work on other concept frameworks in AI

Martha Lewis, Jonathan Lawry, Hierarchical conceptual spaces for concept combination, Artificial Intelligence, Volume 237, August 2016, Pages 204-227, ISSN 0004-3702, DOI: 10.1016/j.artint.2016.04.008.

We introduce a hierarchical framework for conjunctive concept combination based on conceptual spaces and random set theory. The model has the flexibility to account for composition of concepts at various levels of complexity. We show that the conjunctive model includes linear combination as a special case, and that the more general model can account for non-compositional behaviours such as overextension, non-commutativity, preservation of necessity and impossibility of attributes and to some extent, attribute loss or emergence. We investigate two further aspects of human concept use, the conjunction fallacy and the “guppy effect”.

Interesting hypothesis about how cognitive abilities can be modelled with closed control loops that run in parallel (using hierarchies of abstraction and prediction), which have traditionally been used only for low-level behaviours

Giovanni Pezzulo, Paul Cisek, Navigating the Affordance Landscape: Feedback Control as a Process Model of Behavior and Cognition, Trends in Cognitive Sciences, Volume 20, Issue 6, June 2016, Pages 414-424, ISSN 1364-6613, DOI: 10.1016/j.tics.2016.03.013.

We discuss how cybernetic principles of feedback control, used to explain sensorimotor behavior, can be extended to provide a foundation for understanding cognition. In particular, we describe behavior as parallel processes of competition and selection among potential action opportunities (‘affordances’) expressed at multiple levels of abstraction. Adaptive selection among currently available affordances is biased not only by predictions of their immediate outcomes and payoffs but also by predictions of what new affordances they will make available. This allows animals to purposively create new affordances that they can later exploit to achieve high-level goals, resulting in intentional action that links across multiple levels of control. Finally, we discuss how such a ‘hierarchical affordance competition’ process can be mapped to brain structure.

Physiological evidence that visual attention is based on predictions

Martin Rolfs, Martin Szinte, Remapping Attention Pointers: Linking Physiology and Behavior, Trends in Cognitive Sciences, Volume 20, Issue 6, 2016, Pages 399-401, ISSN 1364-6613, DOI: 10.1016/j.tics.2016.04.003.

Our eyes rapidly scan visual scenes, displacing the projection on the retina with every move. Yet these frequent retinal image shifts do not appear to hamper vision. Two recent physiological studies shed new light on the role of attention in visual processing across saccadic eye movements.

A gentle introduction to Box-Particle Filters

A. Gning, B. Ristic, L. Mihaylova and F. Abdallah, An Introduction to Box Particle Filtering [Lecture Notes], in IEEE Signal Processing Magazine, vol. 30, no. 4, pp. 166-171, July 2013. DOI: 10.1109/MSP.2013.225460.

Resulting from the synergy between the sequential Monte Carlo (SMC) method [1] and interval analysis [2], box particle filtering is an approach that has recently emerged [3] and is aimed at solving a general class of nonlinear filtering problems. This approach is particularly appealing in practical situations involving imprecise stochastic measurements that result in very broad posterior densities. It relies on the concept of a box particle that occupies a small and controllable rectangular region having a nonzero volume in the state space. Key advantages of the box particle filter (box-PF) against the standard particle filter (PF) are its reduced computational complexity and its suitability for distributed filtering. Indeed, in some applications where the sampling importance resampling (SIR) PF may require thousands of particles to achieve accurate and reliable performance, the box-PF can reach the same level of accuracy with just a few dozen box particles. Recent developments [4] also show that a box-PF can be interpreted as a Bayes' filter approximation allowing the application of box-PF to challenging target tracking problems [5].
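
To make the idea of a box particle concrete, here is a minimal one-dimensional sketch of a single predict/update cycle. It assumes an additive motion model with interval-bounded displacement and an interval measurement of the state itself; the contraction step is then a plain interval intersection, and each weight is scaled by the fraction of the predicted box that survives. Resampling (subdividing heavy boxes) is left out, and every name (`box_pf_step`, `estimate`, and so on) is hypothetical rather than taken from the lecture notes.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]   # (lower bound, upper bound)


def add(a: Interval, b: Interval) -> Interval:
    """Interval addition: propagates a box through x' = x + u."""
    return (a[0] + b[0], a[1] + b[1])


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Intersection of two intervals, or None if it has zero width."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None


def width(a: Interval) -> float:
    return a[1] - a[0]


def box_pf_step(boxes: List[Interval], weights: List[float],
                motion: Interval, measurement: Interval
                ) -> Tuple[List[Interval], List[float]]:
    """One predict/update cycle for 1-D box particles (assumed model)."""
    new_boxes: List[Interval] = []
    new_weights: List[float] = []
    for box, w in zip(boxes, weights):
        predicted = add(box, motion)                   # prediction
        contracted = intersect(predicted, measurement)  # contraction
        if contracted is None:
            continue           # box inconsistent with the measurement
        # Likelihood taken as the surviving fraction of the predicted box.
        new_boxes.append(contracted)
        new_weights.append(w * width(contracted) / width(predicted))
    total = sum(new_weights)
    if total == 0.0:
        raise ValueError("no box is consistent with the measurement")
    return new_boxes, [w / total for w in new_weights]


def estimate(boxes: List[Interval], weights: List[float]) -> float:
    """Point estimate: weighted mean of the box midpoints."""
    return sum(w * 0.5 * (b[0] + b[1]) for b, w in zip(boxes, weights))


if __name__ == "__main__":
    boxes = [(0.0, 2.0), (2.0, 4.0), (4.0, 6.0)]
    weights = [1.0 / 3.0] * 3
    boxes, weights = box_pf_step(boxes, weights,
                                 motion=(1.0, 1.5), measurement=(3.0, 5.0))
    print(boxes, weights, estimate(boxes, weights))
```

Three boxes already cover the whole prior interval here, which hints at why a box-PF can get by with far fewer particles than a point-based SIR filter when measurements are this coarse.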

A robot architecture composed of reinforcement learners for predicting and developing behaviors

Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White, and Doina Precup (2011), Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction, Proc. of 10th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2011), Tumer, Yolum, Sonenberg and Stone (eds.), May 2–6, 2011, Taipei, Taiwan, pp. 761-768.

Maintaining accurate world knowledge in a complex and changing environment is a perennial problem for robots and other artificial intelligence systems. Our architecture for addressing this problem, called Horde, consists of a large number of independent reinforcement learning sub-agents, or demons. Each demon is responsible for answering a single predictive or goal-oriented question about the world, thereby contributing in a factored, modular way to the system’s overall knowledge. The questions are in the form of a value function, but each demon has its own policy, reward function, termination function, and terminal-reward function unrelated to those of the base problem. Learning proceeds in parallel by all demons simultaneously so as to extract the maximal training information from whatever actions are taken by the system as a whole. Gradient-based temporal-difference learning methods are used to learn efficiently and reliably with function approximation in this off-policy setting. Horde runs in constant time and memory per time step, and is thus suitable for learning online in realtime applications such as robotics. We present results using Horde on a multi-sensored mobile robot to successfully learn goal-oriented behaviors and long-term predictions from off-policy experience. Horde is a significant incremental step towards a real-time architecture for efficient learning of general knowledge from unsupervised sensorimotor interaction.
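
The demon idea described in the abstract can be sketched in a few lines. Each demon below carries its own pseudo-reward (cumulant), continuation function and target policy, and all demons learn in parallel from the same stream of experience. This is a simplified illustration: the paper uses gradient-based TD methods, whereas the sketch uses plain importance-weighted linear TD(0), which is shorter but lacks their off-policy convergence guarantees. The class `Demon`, the function `run_horde` and the one-state toy world are assumptions, not the authors' code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence

Observation = Dict[str, float]


@dataclass
class Demon:
    """One hypothetical sub-agent answering a single predictive question."""
    n_features: int
    cumulant: Callable[[Observation], float]      # demon's own pseudo-reward
    continuation: Callable[[Observation], float]  # 1 - termination probability
    target_prob: Callable[[int], float]           # demon's own policy
    step_size: float = 0.1
    weights: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict(self, x: Sequence[float]) -> float:
        """Linear value estimate for feature vector x."""
        return sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x: Sequence[float], action: int, behavior_prob: float,
               obs_next: Observation, x_next: Sequence[float]) -> None:
        """Importance-weighted TD(0) step (simplification, not gradient-TD)."""
        rho = self.target_prob(action) / behavior_prob
        gamma = self.continuation(obs_next)
        delta = (self.cumulant(obs_next)
                 + gamma * self.predict(x_next) - self.predict(x))
        for i, xi in enumerate(x):
            self.weights[i] += self.step_size * rho * delta * xi


def run_horde(demons: Sequence[Demon], steps: int) -> None:
    """Toy one-state world: the behavior policy always takes action 0
    and the sensor 'light' always reads 1.0."""
    x = [1.0]                                   # single constant feature
    for _ in range(steps):
        action, behavior_prob = 0, 1.0
        obs_next = {"light": 1.0}
        for demon in demons:                    # every demon learns from the same step
            demon.update(x, action, behavior_prob, obs_next, x)


if __name__ == "__main__":
    # Demon 1 asks: how much discounted light will I see if I keep taking action 0?
    light = Demon(1, lambda o: o["light"], lambda o: 0.5,
                  lambda a: 1.0 if a == 0 else 0.0)
    # Demon 2 asks the same about action 1, which the behavior never takes.
    other = Demon(1, lambda o: o["light"], lambda o: 0.5,
                  lambda a: 1.0 if a == 1 else 0.0)
    run_horde([light, other], steps=500)
    print(light.predict([1.0]), other.predict([1.0]))
```

In this toy world the first demon's prediction approaches 1 / (1 - 0.5) = 2, while the second never updates because its importance ratio is always zero. Each update costs time linear in the number of features, in line with the constant per-step cost the abstract claims.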