One of the first thorough studies of Monte Carlo Localization with line-segment maps

Biswajit Sarkar, Surojit Saha, Prabir K. Pal, A novel method for computation of importance weights in Monte Carlo localization on line segment-based maps, Robotics and Autonomous Systems, Volume 74, Part A, December 2015, Pages 51-65, ISSN 0921-8890, DOI: 10.1016/j.robot.2015.07.001.

Monte Carlo localization is a powerful and popular approach in mobile robot localization. Line segment-based maps provide a compact and scalable representation of indoor environments for mobile robot navigation. But Monte Carlo localization has seldom been studied in the context of line segment-based maps. A key step of the approach, and one that can endow it with or rob it of the attributes of accuracy, robustness and efficiency, is the computation of the so-called importance weight associated with each particle. In this paper, we propose a new method for the computation of importance weights on maps represented with line segments, and extensively study its performance in pose tracking. We also compare our method with three other methods reported in the literature and present the results and insights thus gathered. The comparative study, conducted using both simulated and real data, on maps built from real data available in the public domain, clearly establishes that the proposed method is more accurate, robust and efficient than the other methods.
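For orientation, a minimal sketch of what an importance weight is in generic Monte Carlo localization on a line-segment map. This is not the method proposed in the paper: it assumes a plain independent Gaussian beam model, and the helper names (`ray_segment_range`, `expected_range`, `importance_weight`), the noise value `sigma` and the `max_range` cutoff are all hypothetical choices made for illustration.

```python
import math

def ray_segment_range(px, py, angle, seg, max_range=10.0):
    """Distance from (px, py) along `angle` to segment ((x1, y1), (x2, y2)).

    Returns max_range when the ray misses the segment.
    """
    (x1, y1), (x2, y2) = seg
    dx, dy = math.cos(angle), math.sin(angle)
    ex, ey = x2 - x1, y2 - y1
    denom = dx * ey - dy * ex
    if abs(denom) < 1e-12:  # ray parallel to the segment
        return max_range
    # Solve p + t*d = a + u*e for t (along the ray) and u (along the segment).
    t = ((x1 - px) * ey - (y1 - py) * ex) / denom
    u = ((x1 - px) * dy - (y1 - py) * dx) / denom
    if t >= 0.0 and 0.0 <= u <= 1.0:
        return min(t, max_range)
    return max_range

def expected_range(pose, beam_angle, segments, max_range=10.0):
    """Range a beam would measure from `pose` = (x, y, theta) on the map."""
    x, y, theta = pose
    return min(ray_segment_range(x, y, theta + beam_angle, s, max_range)
               for s in segments)

def importance_weight(pose, scan, segments, sigma=0.1):
    """Unnormalized Gaussian likelihood of `scan` = [(beam_angle, range), ...].

    The constant Gaussian factor is dropped; it cancels on normalization.
    """
    log_w = 0.0
    for beam_angle, z in scan:
        z_hat = expected_range(pose, beam_angle, segments)
        log_w += -0.5 * ((z - z_hat) / sigma) ** 2
    return math.exp(log_w)

def normalize(weights):
    """Scale weights so they sum to one (assumes at least one is non-zero)."""
    total = sum(weights)
    return [w / total for w in weights]

# Toy map: one wall at x = 2; the robot measures 2.0 m straight ahead.
segments = [((2.0, -1.0), (2.0, 1.0))]
scan = [(0.0, 2.0)]
particles = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
weights = normalize([importance_weight(p, scan, segments) for p in particles])
```

In this toy case the particle at the origin explains the reading exactly and receives nearly all of the normalized weight.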

Multi-agent Q-learning applied to the defense against DDoS attacks with some provisions for scaling

Kleanthis Malialis, Sam Devlin & Daniel Kudenko, Distributed reinforcement learning for adaptive and robust network intrusion response, Connection Science, Volume 27, Issue 3, 2015, DOI: 10.1080/09540091.2015.1031082.

Distributed denial of service (DDoS) attacks constitute a rapidly evolving threat in the current Internet. Multiagent Router Throttling is a novel approach to defend against DDoS attacks where multiple reinforcement learning agents are installed on a set of routers and learn to rate-limit or throttle traffic towards a victim server. The focus of this paper is on online learning and scalability. We propose an approach that incorporates task decomposition, team rewards and a form of reward shaping called difference rewards. One of the novel characteristics of the proposed system is that it provides a decentralised coordinated response to the DDoS problem, thus being resilient to DDoS attacks themselves. The proposed system learns remarkably fast, thus being suitable for online learning. Furthermore, its scalability is successfully demonstrated in experiments involving 1000 learning agents. We compare our approach against a baseline and a popular state-of-the-art throttling technique from the network security literature and show that the proposed approach is more effective, adaptive to sophisticated attack rate dynamics and robust to agent failures.
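The difference-rewards idea mentioned in the abstract can be sketched as follows: each agent is rewarded with the team reward minus the team reward that would have been obtained had its own contribution been replaced by a default. The team reward `global_reward`, the `capacity` value and the zero default action below are hypothetical stand-ins, not the reward used in the paper.

```python
def global_reward(throttled_rates, capacity=10.0):
    """Toy team reward (an assumption for illustration).

    -1 if the victim is overloaded, otherwise the fraction of capacity used.
    """
    load = sum(throttled_rates)
    if load > capacity:
        return -1.0
    return load / capacity

def difference_reward(i, throttled_rates, default_rate=0.0, capacity=10.0):
    """D_i = G(z) - G(z with agent i's rate replaced by `default_rate`)."""
    counterfactual = list(throttled_rates)
    counterfactual[i] = default_rate
    return (global_reward(throttled_rates, capacity)
            - global_reward(counterfactual, capacity))

# Overloaded victim: agent 0 lets through a lot, agent 2 only a little.
rates = [6.0, 6.0, 2.0]
d0 = difference_reward(0, rates)  # removing agent 0 fixes the overload
d2 = difference_reward(2, rates)  # removing agent 2 changes nothing
```

The point of the shaping is visible in the toy numbers: the agent whose behaviour actually causes the overload gets a strongly negative signal, while the agent whose removal would not change the outcome gets zero.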

Modelling emotions in adaptive agents through the action selection part of reinforcement learning, plus some references on the neurophysiological bases of RL and a good review of literature on emotions

Joost Broekens, Elmer Jacobs, Catholijn M. Jonker, A reinforcement learning model of joy, distress, hope and fear, Connection Science, Vol. 27, Iss. 3, 2015, DOI: 10.1080/09540091.2015.1031081.

In this paper we computationally study the relation between adaptive behaviour and emotion. Using the reinforcement learning framework, we propose that learned state utility, V(s), models fear (negative) and hope (positive) based on the fact that both signals are about anticipation of loss or gain. Further, we propose that joy/distress is a signal similar to the error signal. We present agent-based simulation experiments that show that this model replicates psychological and behavioural dynamics of emotion. This work distinguishes itself by assessing the dynamics of emotion in an adaptive agent framework – coupling it to the literature on habituation, development, extinction and hope theory. Our results support the idea that the function of emotion is to provide a complex feedback signal for an organism to adapt its behaviour. Our work is relevant for understanding the relation between emotion and adaptation in animals, as well as for human–robot interaction, in particular how emotional signals can be used to communicate between adaptive agents and humans.
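A rough sketch of the mapping described in the abstract, assuming a tabular TD(0) learner: the sign of the learned value V(s) is read as hope or fear, and the sign of the temporal-difference error as joy or distress. The thresholds at zero, the label strings and the function names are my own simplifications; the paper's exact definitions are not given in the abstract.

```python
def td_update(V, s, r, s_next, alpha=0.1, gamma=0.9):
    """One TD(0) update of the value table V (a dict); returns the TD error."""
    delta = r + gamma * V.get(s_next, 0.0) - V.get(s, 0.0)
    V[s] = V.get(s, 0.0) + alpha * delta
    return delta

def emotion_labels(V, s, delta):
    """Label anticipation from V(s) and outcome from the TD error (assumed mapping)."""
    v = V.get(s, 0.0)
    anticipation = "hope" if v > 0 else "fear" if v < 0 else "neutral"
    outcome = "joy" if delta > 0 else "distress" if delta < 0 else "neutral"
    return anticipation, outcome

V = {}
delta1 = td_update(V, "a", 1.0, "b")    # unexpected gain
labels1 = emotion_labels(V, "a", delta1)
delta2 = td_update(V, "a", -2.0, "b")   # unexpected loss
labels2 = emotion_labels(V, "a", delta2)
```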

A clarification and systematization of UKF

Menegaz, H.M.T.; Ishihara, J.Y.; Borges, G.A.; Vargas, A.N., A Systematization of the Unscented Kalman Filter Theory, in Automatic Control, IEEE Transactions on, vol. 60, no. 10, pp. 2583-2598, Oct. 2015, DOI: 10.1109/TAC.2015.2404511.

In this paper, we propose a systematization of the (discrete-time) Unscented Kalman Filter (UKF) theory. We gather all available UKF variants in the literature, present corrections to theoretical inconsistencies, and provide a tool for the construction of new UKF’s in a consistent way. This systematization is done, mainly, by revisiting the concepts of Sigma-Representation, Unscented Transformation (UT), Scaled Unscented Transformation (SUT), UKF, and Square-Root Unscented Kalman Filter (SRUKF). Inconsistencies are related to 1) matching the order of the transformed covariance and cross-covariance matrices of both the UT and the SUT; 2) multiple UKF definitions; 3) issue with some reduced sets of sigma points described in the literature; 4) the conservativeness of the SUT; 5) the scaling effect of the SUT on both its transformed covariance and cross-covariance matrices; and 6) possibly ill-conditioned results in SRUKF’s. With the proposed systematization, the symmetric sets of sigma points in the literature are formally justified, and we are able to provide new consistent variations for UKF’s, such as the Scaled SRUKF’s and the UKF’s composed by the minimum number of sigma points. Furthermore, our proposed SRUKF has improved computational properties when compared to state-of-the-art methods.
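As a reminder of the basic building block, here is a minimal scalar sketch of the classical unscented transformation with a symmetric set of three sigma points. It does not reproduce any of the corrected or new variants from the paper; the function name and the default `kappa=2.0` are assumptions for illustration.

```python
import math

def unscented_transform_1d(f, mean, var, kappa=2.0):
    """Propagate a scalar (mean, var) through f using three symmetric sigma points."""
    n = 1  # state dimension
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

# Squaring a standard normal variable: the exact mean is 1 and the exact variance is 2.
sq_mean, sq_var = unscented_transform_1d(lambda x: x * x, 0.0, 1.0)
# A linear map is reproduced exactly: mean 2*1+1 = 3, variance 4*4 = 16.
lin_mean, lin_var = unscented_transform_1d(lambda x: 2.0 * x + 1.0, 1.0, 4.0)
```

With `n + kappa = 3` the sigma points also match the fourth moment of a Gaussian, which is why the squared example recovers the variance as well as the mean.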

Survey on Model-Driven Software Engineering for real-time embedded systems and robotics

Brugali, D., Model-Driven Software Engineering in Robotics: Models Are Designed to Use the Relevant Things, Thereby Reducing the Complexity and Cost in the Field of Robotics, in Robotics & Automation Magazine, IEEE, vol. 22, no. 3, pp. 155-166, Sept. 2015, DOI: 10.1109/MRA.2015.2452201.

A model is an abstract representation of a real system or phenomenon [1]. The idea of a model is to capture important properties of reality and to neglect irrelevant details. The properties that are relevant and that can be neglected depend on the purpose of creating a model. A model can make a particular system or phenomenon easier to understand, quantify, visualize, simulate, or predict.

Detecting objects in images through the timing of the changes in the visual sensor, rather than through the analysis of frames (without time information)

Orchard, G.; Meyer, C.; Etienne-Cummings, R.; Posch, C.; Thakor, N.; Benosman, R., HFirst: A Temporal Approach to Object Recognition, in Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 37, no. 10, pp. 2028-2040, Oct. 1 2015, DOI: 10.1109/TPAMI.2015.2392947.

This paper introduces a spiking hierarchical model for object recognition which utilizes the precise timing information inherently present in the output of biologically inspired asynchronous address event representation (AER) vision sensors. The asynchronous nature of these systems frees computation and communication from the rigid predetermined timing enforced by system clocks in conventional systems. Freedom from rigid timing constraints opens the possibility of using true timing to our advantage in computation. We show not only how timing can be used in object recognition, but also how it can in fact simplify computation. Specifically, we rely on a simple temporal-winner-take-all rather than more computationally intensive synchronous operations typically used in biologically inspired neural networks for object recognition. This approach to visual computation represents a major paradigm shift from conventional clocked systems and can find application in other sensory modalities and computational tasks. We showcase effectiveness of the approach by achieving the highest reported accuracy to date (97.5% ± 3.5%) for a previously published four class card pip recognition task and an accuracy of 84.9% ± 1.9% for a new more difficult 36 class character recognition task.
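The temporal winner-take-all idea can be illustrated with a very reduced sketch: competing neurons integrate timestamped input events, and the first one to reach threshold wins. This assumes non-leaky integrate-and-fire units, a hypothetical weight table and a made-up threshold; it is not the neuron model or architecture of HFirst.

```python
def temporal_wta(events, weights, threshold=1.0):
    """First-to-spike competition between integrate-and-fire neurons.

    events:  list of (timestamp, input_channel), sorted by timestamp.
    weights: weights[k][c] is the contribution of channel c to neuron k.
    Returns (winner_index, spike_time), or (None, None) if no neuron fires.
    """
    potentials = [0.0] * len(weights)
    for t, channel in events:
        for k, w in enumerate(weights):
            potentials[k] += w[channel]
        fired = [k for k, v in enumerate(potentials) if v >= threshold]
        if fired:
            # Simultaneous crossings are resolved by the highest potential.
            winner = max(fired, key=lambda k: potentials[k])
            return winner, t
    return None, None

weights = [[0.6, 0.0],   # neuron 0 listens to channel 0
           [0.0, 0.4]]   # neuron 1 listens to channel 1
events = [(1, 0), (2, 1), (3, 0), (4, 1)]
winner, spike_time = temporal_wta(events, weights)
```

No global maximum over all responses is computed at a clock tick; the decision falls out of which unit spikes first, which is where the saving in computation comes from.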

On how human cognition detects regularities in noisy sensory data (“statistical learning” in psychology terms)

Annabelle Goujon, André Didierjean, Simon Thorpe, Investigating implicit statistical learning mechanisms through contextual cueing, Trends in Cognitive Sciences, Volume 19, Issue 9, September 2015, Pages 524-533, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.07.009.

Since its inception, the contextual cueing (CC) paradigm has generated considerable interest in various fields of cognitive sciences because it constitutes an elegant approach to understanding how statistical learning (SL) mechanisms can detect contextual regularities during a visual search. In this article we review and discuss five aspects of CC: (i) the implicit nature of learning, (ii) the mechanisms involved in CC, (iii) the mediating factors affecting CC, (iv) the generalization of CC phenomena, and (v) the dissociation between implicit and explicit CC phenomena. The findings suggest that implicit SL is an inherent component of ongoing processing which operates through clustering, associative, and reinforcement processes at various levels of sensory-motor processing, and might result from simple spike-timing-dependent plasticity.

Planning tasks in mobile robots with MDPs that maximize the probability of satisfying user’s requirements specified through temporal logics, with estimation of transition probabilities through simulation only when needed

Jing Wang, Xuchu Ding, Morteza Lahijanian, Ioannis Ch. Paschalidis, and Calin A. Belta, Temporal logic motion control using actor–critic methods, The International Journal of Robotics Research September 2015 34: 1329-1344, first published on May 26, 2015. DOI: 10.1177/0278364915581505.

This paper considers the problem of deploying a robot from a specification given as a temporal logic statement about some properties satisfied by the regions of a large, partitioned environment. We assume that the robot has noisy sensors and actuators and model its motion through the regions of the environment as a Markov decision process (MDP). The robot control problem becomes finding the control policy which maximizes the probability of satisfying the temporal logic task on the MDP. For a large environment, obtaining transition probabilities for each state–action pair, as well as solving the necessary optimization problem for the optimal policy, are computationally intensive. To address these issues, we propose an approximate dynamic programming framework based on a least-squares temporal difference learning method of the actor–critic type. This framework operates on sample paths of the robot and optimizes a randomized control policy with respect to a small set of parameters. The transition probabilities are obtained only when needed. Simulations confirm that convergence of the parameters translates to an approximately optimal policy.
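To make the objective concrete, the sketch below computes the maximum probability of eventually reaching a goal set on a tiny, fully known MDP using plain value iteration. This deliberately swaps in exact dynamic programming for the paper's actor-critic method, which exists precisely because the exact computation does not scale; the toy states, actions and probabilities are invented for illustration.

```python
def max_reach_probability(transitions, goal, iterations=200):
    """Maximum probability of eventually reaching `goal` from each state.

    transitions[s][a] is a list of (probability, next_state) pairs.
    States missing from `transitions` are treated as absorbing.
    """
    p = {s: 0.0 for s in transitions}
    for g in goal:
        p[g] = 1.0
    for _ in range(iterations):
        for s, actions in transitions.items():
            if s in goal:
                continue
            # Best action: the one with the highest expected reach probability.
            p[s] = max(sum(pr * p.get(s2, 0.0) for pr, s2 in outcomes)
                       for outcomes in actions.values())
    return p

transitions = {
    "start": {
        "safe":  [(0.8, "goal"), (0.2, "start")],  # may need several tries
        "risky": [(0.9, "goal"), (0.1, "trap")],   # faster but can fail for good
    },
    "trap": {"stay": [(1.0, "trap")]},
}
reach = max_reach_probability(transitions, goal={"goal"})
```

In the toy example the "safe" action is optimal for this objective: retrying costs time but never loses the task, so the reach probability from "start" tends to 1, whereas "risky" caps it at 0.9.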

Building probabilistic models of physical processes from their deterministic models and some experimental data, with guarantees on the degree of coincidence between the generated model and the real system

Konstantinos Karydis, Ioannis Poulakakis, Jianxin Sun, and Herbert G. Tanner, Probabilistically valid stochastic extensions of deterministic models for systems with uncertainty, The International Journal of Robotics Research September 2015 34: 1278-1295, first published on May 28, 2015. DOI: 10.1177/0278364915576336.

Models capable of capturing and reproducing the variability observed in experimental trials can be valuable for planning and control in the presence of uncertainty. This paper reports on a new data-driven methodology that extends deterministic models to a stochastic regime and offers probabilistic guarantees of model fidelity. From an acceptable deterministic model, a stochastic one is generated, capable of capturing and reproducing uncertain system–environment interactions at given levels of fidelity. The reported approach combines methodological elements from probabilistic model validation and randomized algorithms, to simultaneously quantify the fidelity of a model and tune the distribution of random parameters in the corresponding stochastic extension, in order to reproduce the variability observed experimentally in the physical process of interest. The approach can be applied to an array of physical processes, the models of which may come in different forms, including differential equations; we demonstrate this point by considering examples from the areas of miniature legged robots and aerial vehicles.

Modelling ECGs with sums of Gaussians and estimating them through switching Kalman filters and the likelihood of each mode

Oster, J.; Behar, J.; Sayadi, O.; Nemati, S.; Johnson, A.E.W.; Clifford, G.D., Semisupervised ECG Ventricular Beat Classification With Novelty Detection Based on Switching Kalman Filters, in Biomedical Engineering, IEEE Transactions on, vol. 62, no. 9, pp. 2125-2134, Sept. 2015, DOI: 10.1109/TBME.2015.2402236.

Automatic processing and accurate diagnosis of pathological electrocardiogram (ECG) signals remains a challenge. As long-term ECG recordings continue to increase in prevalence, driven partly by the ease of remote monitoring technology usage, the need to automate ECG analysis continues to grow. In previous studies, a model-based ECG filtering approach to ECG data from healthy subjects has been applied to facilitate accurate online filtering and analysis of physiological signals. We propose an extension of this approach, which models not only normal and ventricular heartbeats, but also morphologies not previously encountered. A switching Kalman filter approach is introduced to enable the automatic selection of the most likely mode (beat type), while simultaneously filtering the signal using appropriate prior knowledge. Novelty detection is also made possible by incorporating a third mode for the detection of unknown (not previously observed) morphologies, and denoted as X-factor. This new approach is compared to state-of-the-art techniques for the ventricular heartbeat classification in the MIT-BIH arrhythmia and Incart databases. F1 scores of 98.3% and 99.5% were found on each database, respectively, which are superior to other published algorithms’ results reported on the same databases. Only 3% of all the beats were discarded as X-factor, and the majority of these beats contained high levels of noise. The proposed technique demonstrates accurate beat classification in the presence of previously unseen (and unlearned) morphologies and noise, and provides an automated method for morphological analysis of arbitrary (unknown) ECG leads.
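The mode-selection step of a switching filter can be sketched in a few lines: each mode predicts the next measurement with some innovation variance, and the mode whose prediction best explains the sample (weighted by its prior) is selected. The scalar setting, the numbers, and the way the third mode is given a deliberately broad variance to act as a catch-all are assumptions made for illustration, not the ECG model or the X-factor definition from the paper.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mode_posteriors(y, predictions, priors):
    """Posterior probability of each mode given measurement y.

    predictions: {mode: (predicted_measurement, innovation_variance)}
    priors:      {mode: prior probability of the mode}
    """
    unnorm = {m: priors[m] * gaussian_pdf(y, mu, var)
              for m, (mu, var) in predictions.items()}
    total = sum(unnorm.values())
    return {m: p / total for m, p in unnorm.items()}

def most_likely_mode(y, predictions, priors):
    """Mode with the highest posterior probability."""
    post = mode_posteriors(y, predictions, priors)
    return max(post, key=post.get)

predictions = {
    "normal":      (1.0, 0.01),   # tight prediction for a normal beat
    "ventricular": (-0.5, 0.01),  # tight prediction for a ventricular beat
    "x_factor":    (0.0, 4.0),    # broad catch-all for unseen morphologies
}
priors = {"normal": 1 / 3, "ventricular": 1 / 3, "x_factor": 1 / 3}

mode_a = most_likely_mode(0.95, predictions, priors)  # close to the normal model
mode_b = most_likely_mode(3.0, predictions, priors)   # far from both known models
```

A sample that neither known model explains well ends up in the broad mode, which is the intuition behind using an extra mode for novelty detection.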