Category Archives: Probability And Statistics

Finding the common utility of actions in several tasks learnt in the same domain in order to reduce the learning cost of reinforcement learning

Rosman, B.; Ramamoorthy, S., Action Priors for Learning Domain Invariances, Autonomous Mental Development, IEEE Transactions on, vol.7, no.2, pp.107,118, June 2015, DOI: 10.1109/TAMD.2015.2419715.

An agent tasked with solving a number of different decision making problems in similar environments has an opportunity to learn over a longer timescale than each individual task. Through examining solutions to different tasks, it can uncover behavioral invariances in the domain, by identifying actions to be prioritized in local contexts, invariant to task details. This information has the effect of greatly increasing the speed of solving new problems. We formalise this notion as action priors, defined as distributions over the action space, conditioned on environment state, and show how these can be learnt from a set of value functions. We apply action priors in the setting of reinforcement learning, to bias action selection during exploration. Aggressive use of action priors performs context based pruning of the available actions, thus reducing the complexity of lookahead during search. We additionally define action priors over observation features, rather than states, which provides further flexibility and generalizability, with the additional benefit of enabling feature selection. Action priors are demonstrated in experiments in a simulated factory environment and a large random graph domain, and show significant speed ups in learning new tasks. Furthermore, we argue that this mechanism is cognitively plausible, and is compatible with findings from cognitive psychology.
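A minimal sketch of the idea, assuming tabular Q-functions from previously solved tasks: count, per state, how often each action was the greedy one, smooth the counts with a pseudocount, and draw exploratory actions from the resulting distribution. The names `learn_action_prior`, `select_action` and `pseudocount` are hypothetical, and this is only an illustration of the abstract, not the authors' implementation.

```python
import numpy as np

def learn_action_prior(q_tables, pseudocount=1.0):
    """Estimate a per-state distribution over actions from solved tasks.

    q_tables: list of arrays of shape (n_states, n_actions), one per task.
    Returns an array of shape (n_states, n_actions) whose rows sum to 1.
    """
    n_states, n_actions = q_tables[0].shape
    # Start from a uniform pseudocount so that no action gets probability 0.
    counts = np.full((n_states, n_actions), pseudocount, dtype=float)
    for q in q_tables:
        best = np.argmax(q, axis=1)                 # greedy action in every state
        counts[np.arange(n_states), best] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)

def select_action(q, prior, state, epsilon, rng):
    """Epsilon-greedy selection where exploratory actions follow the prior."""
    if rng.random() < epsilon:
        return int(rng.choice(prior.shape[1], p=prior[state]))
    return int(np.argmax(q[state]))
```

Replacing the usual uniform exploration with draws from the prior is what biases a new task towards actions that were useful before. Setting `pseudocount` close to zero would approximate the aggressive pruning mentioned in the abstract.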

A brief general explanation of Rao-Blackwellization and a new way of applying it to reduce the variance of a point estimate in a sequential Bayesian setting

Petetin, Y.; Desbouvries, F., Bayesian Conditional Monte Carlo Algorithms for Nonlinear Time-Series State Estimation, Signal Processing, IEEE Transactions on, vol.63, no.14, pp.3586,3598, DOI: 10.1109/TSP.2015.2423251.

Bayesian filtering aims at estimating sequentially a hidden process from an observed one. In particular, sequential Monte Carlo (SMC) techniques propagate in time weighted trajectories which represent the posterior probability density function (pdf) of the hidden process given the available observations. On the other hand, conditional Monte Carlo (CMC) is a variance reduction technique which replaces the estimator of a moment of interest by its conditional expectation given another variable. In this paper, we show that up to some adaptations, one can make use of the time recursive nature of SMC algorithms in order to propose natural temporal CMC estimators of some point estimates of the hidden process, which outperform the associated crude Monte Carlo (MC) estimator whatever the number of samples. We next show that our Bayesian CMC estimators can be computed exactly, or approximated efficiently, in some hidden Markov chain (HMC) models; in some jump Markov state-space systems (JMSS); as well as in multitarget filtering. Finally our algorithms are validated via simulations.
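The variance reduction behind conditional Monte Carlo can be seen in a toy example outside the filtering setting. Assume X ~ N(0, 1) and Y | X ~ N(X, 1), and estimate E[Y^2] = 2. The crude estimator averages Y^2, while the conditional one averages E[Y^2 | X] = X^2 + 1, which is available in closed form here. The model and the function names are assumptions chosen for illustration, not taken from the paper.

```python
import numpy as np

def crude_and_conditional(n, rng):
    """Return (crude MC, conditional MC) estimates of E[Y^2] from n samples."""
    x = rng.normal(0.0, 1.0, size=n)
    y = rng.normal(x, 1.0)                  # Y | X ~ N(X, 1)
    crude = np.mean(y ** 2)                 # averages the raw quantity
    conditional = np.mean(x ** 2 + 1.0)     # averages E[Y^2 | X] instead
    return crude, conditional

def estimator_variances(n=100, repeats=2000, seed=0):
    """Empirical variance of both estimators over repeated experiments."""
    rng = np.random.default_rng(seed)
    draws = np.array([crude_and_conditional(n, rng) for _ in range(repeats)])
    return draws.var(axis=0)                # [var of crude, var of conditional]
```

Both estimators are unbiased, but by the law of total variance the conditional one can never have larger variance. In this toy model the per-sample variances are 8 and 2 respectively.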

Accelerating the update stage of a PF by selecting a few representative particles and interpolating their weights to the rest, with interesting methods for selection and interpolation and a good related-work survey of efficiency-improved PFs

Shabat, G.; Shmueli, Y.; Bermanis, A.; Averbuch, A., Accelerating Particle Filter Using Randomized Multiscale and Fast Multipole Type Methods, Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.37, no.7, pp.1396,1407, July 1 2015, DOI: 10.1109/TPAMI.2015.2392754.

Particle filter is a powerful tool for state tracking using non-linear observations. We present a multiscale based method that accelerates the tracking computation by particle filters. Unlike the conventional way, which calculates weights over all particles in each cycle of the algorithm, we sample a small subset from the source particles using matrix decomposition methods. Then, we apply a function extension algorithm that uses a particle subset to recover the density function for all the rest of the particles not included in the chosen subset. The computational effort is substantial especially when multiple objects are tracked concurrently. The proposed algorithm significantly reduces the computational load. By using the Fast Gaussian Transform, the complexity of the particle selection step is reduced to a linear time in n and k, where n is the number of particles and k is the number of particles in the selected subset. We demonstrate our method on both simulated and on real data such as object tracking in video sequences.
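A rough sketch of the "weight a subset, extend to the rest" idea for one-dimensional particles. Note the substitutions: the subset is chosen uniformly at random and the extension is plain linear interpolation, standing in for the matrix-decomposition selection and the multiscale function extension of the paper. The observation model z = arctan(x) + noise and all function names are hypothetical.

```python
import numpy as np

def likelihood(particles, observation, noise_std):
    """Unnormalised Gaussian likelihood under the assumed model z = arctan(x) + noise."""
    residual = observation - np.arctan(particles)
    return np.exp(-0.5 * (residual / noise_std) ** 2)

def exact_weights(particles, observation, noise_std):
    """Conventional update: evaluate the likelihood at every particle.

    Assumes equal prior weights, as after a resampling step.
    """
    w = likelihood(particles, observation, noise_std)
    return w / w.sum()

def approximate_weights(particles, observation, noise_std, k, rng):
    """Evaluate the likelihood at k particles only and interpolate to the others."""
    subset = rng.choice(particles.size, size=k, replace=False)
    subset = subset[np.argsort(particles[subset])]      # np.interp needs increasing x
    subset_weights = likelihood(particles[subset], observation, noise_std)
    # Particles outside the subset range receive the nearest endpoint value.
    w = np.interp(particles, particles[subset], subset_weights)
    return w / w.sum()
```

Here the likelihood is evaluated k times instead of n times. That only pays off when each evaluation is expensive, which is the situation the abstract describes for multi-object tracking.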

Analysis of the deterioration of several Kalman Filters depending on the amount of uncertainty in the observations, when the observation model is non-linear

Mark R. Morelande and Ángel F. García-Fernández, Analysis of Kalman Filter Approximations for Nonlinear Measurements, IEEE Transactions on Signal Processing, vol. 61, no. 22, 2013, DOI: 10.1109/TSP.2013.2279367.

A theoretical analysis is presented of the correction step of the Kalman filter (KF) and its various approximations for the case of a nonlinear measurement equation with additive Gaussian noise. The KF is based on a Gaussian approximation to the joint density of the state and the measurement. The analysis metric is the Kullback-Leibler divergence of this approximation from the true joint density. The purpose of the analysis is to provide a quantitative tool for understanding and assessing the performance of the KF and its variants in nonlinear scenarios. This is illustrated using a numerical example.
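To make the metric concrete, the following sketch evaluates it numerically for a scalar state x ~ N(0, 1) and a measurement z = h(x) + Gaussian noise. It assumes that the Gaussian approximation q is the moment-matched one and that the divergence is taken as KL(p || q), with p the true joint density. Both are assumptions made for this illustration, as are the grid-based integration and the function name.

```python
import numpy as np

def kl_of_gaussian_approximation(h, noise_std, x_grid, z_grid):
    """KL(p || q) on a grid, where p(x, z) = N(x; 0, 1) N(z; h(x), noise_std^2)
    and q is the Gaussian with the same mean and covariance as p."""
    dx = x_grid[1] - x_grid[0]
    dz = z_grid[1] - z_grid[0]
    X, Z = np.meshgrid(x_grid, z_grid, indexing="ij")
    prior = np.exp(-0.5 * X ** 2) / np.sqrt(2 * np.pi)
    lik = np.exp(-0.5 * ((Z - h(X)) / noise_std) ** 2) / (noise_std * np.sqrt(2 * np.pi))
    p = prior * lik
    p /= p.sum() * dx * dz                      # renormalise on the truncated grid

    # Moment matching: mean and covariance of the joint density p.
    points = np.stack([X.ravel(), Z.ravel()], axis=1)
    mass = p.ravel() * dx * dz                  # probability mass per grid cell
    mean = mass @ points
    centred = points - mean
    cov = (centred * mass[:, None]).T @ centred

    # Log density of the two-dimensional Gaussian q at every grid point.
    maha = np.einsum("ni,ij,nj->n", centred, np.linalg.inv(cov), centred)
    log_q = -0.5 * maha - np.log(2 * np.pi) - 0.5 * np.log(np.linalg.det(cov))

    keep = mass > 0                             # skip cells where p underflows to 0
    return float(np.sum(mass[keep] * (np.log(p.ravel()[keep]) - log_q[keep])))
```

With a linear h the joint density is exactly Gaussian and the divergence is numerically zero. With h(x) = x^2 and unit noise variance, the value under these assumptions works out analytically to 0.5 * log(3).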

Study of the explanation of probability and reasoning in the human mind through mental models, probability logic and classical logic

P.N. Johnson-Laird, Sangeet S. Khemlani, Geoffrey P. Goodwin, Logic, probability, and human reasoning, Trends in Cognitive Sciences, Volume 19, Issue 4, April 2015, Pages 201-214, ISSN 1364-6613, DOI: 10.1016/j.tics.2015.02.006.

This review addresses the long-standing puzzle of how logic and probability fit together in human reasoning. Many cognitive scientists argue that conventional logic cannot underlie deductions, because it never requires valid conclusions to be withdrawn – not even if they are false; it treats conditional assertions implausibly; and it yields many vapid, although valid, conclusions. A new paradigm of probability logic allows conclusions to be withdrawn and treats conditionals more plausibly, although it does not address the problem of vapidity. The theory of mental models solves all of these problems. It explains how people reason about probabilities and postulates that the machinery for reasoning is itself probabilistic. Recent investigations accordingly suggest a way to integrate probability and deduction.

On the not-so-domain-generic nature of statistical learning in the human brain

Ram Frost, Blair C. Armstrong, Noam Siegelman, Morten H. Christiansen, 2015, Domain generality versus modality specificity: the paradox of statistical learning, Trends in Cognitive Sciences, Volume 19, Issue 3, March 2015, Pages 117-125, DOI: 10.1016/j.tics.2014.12.010.

Statistical learning (SL) is typically considered to be a domain-general mechanism by which cognitive systems discover the underlying distributional properties of the input. However, recent studies examining whether there are commonalities in the learning of distributional information across different domains or modalities consistently reveal modality and stimulus specificity. Therefore, important questions are how and why a hypothesized domain-general learning mechanism systematically produces such effects. Here, we offer a theoretical framework according to which SL is not a unitary mechanism, but a set of domain-general computational principles that operate in different modalities and, therefore, are subject to the specific constraints characteristic of their respective brain regions. This framework offers testable predictions and we discuss its computational and neurobiological plausibility.

Partially observable reinforcement learning and the problem of representing the history of the learning process efficiently

Doshi-Velez, F.; Pfau, D.; Wood, F.; Roy, N., Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning, Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol.37, no.2, pp.394,407, Feb. 2015, DOI: 10.1109/TPAMI.2013.191

Making intelligent decisions from incomplete information is critical in many applications: for example, robots must choose actions based on imperfect sensors, and speech-based interfaces must infer a user’s needs from noisy microphone inputs. What makes these tasks hard is that often we do not have a natural representation with which to model the domain and use for choosing actions; we must learn about the domain’s properties while simultaneously performing the task. Learning a representation also involves trade-offs between modeling the data that we have seen previously and being able to make predictions about new data. This article explores learning representations of stochastic systems using Bayesian nonparametric statistics. Bayesian nonparametric methods allow the sophistication of a representation to scale gracefully with the complexity in the data. Our main contribution is a careful empirical evaluation of how representations learned using Bayesian nonparametric methods compare to other standard learning approaches, especially in support of planning and control. We show that the Bayesian aspects of the methods result in achieving state-of-the-art performance in decision making with relatively few samples, while the nonparametric aspects often result in fewer computations. These results hold across a variety of different techniques for choosing actions given a representation.

Estimating an empirical distribution from a number of estimates distributed among several agents, minimizing the information exchange between the agents

Sarwate, A.D.; Javidi, T., Distributed Learning of Distributions via Social Sampling, Automatic Control, IEEE Transactions on, vol.60, no.1, pp.34,45, Jan. 2015, DOI: 10.1109/TAC.2014.2329611

A protocol for distributed estimation of discrete distributions is proposed. Each agent begins with a single sample from the distribution, and the goal is to learn the empirical distribution of the samples. The protocol is based on a simple message-passing model motivated by communication in social networks. Agents sample a message randomly from their current estimates of the distribution, resulting in a protocol with quantized messages. Using tools from stochastic approximation, the algorithm is shown to converge almost surely. Examples illustrate three regimes with different consensus phenomena. Simulations demonstrate this convergence and give some insight into the effect of network topology.
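A simplified sketch of the message-passing scheme described in the abstract. Each agent keeps an estimate of the distribution, transmits a single category drawn from that estimate (a quantized message), and moves its estimate towards a weighted average of the received messages with a decreasing step size. The update rule, the step size 1/(t+1), the row-stochastic `weights` matrix and the function name are assumptions for illustration and not the exact protocol of the paper.

```python
import numpy as np

def social_sampling(samples, n_categories, weights, n_rounds, rng):
    """Run a simplified social-sampling averaging scheme.

    samples: integer array, the single initial sample held by each agent.
    weights: (n_agents, n_agents) row-stochastic matrix of listening weights.
    Returns an (n_agents, n_categories) array of per-agent estimates.
    """
    one_hot = np.eye(n_categories)
    estimates = one_hot[samples]                  # each agent starts at its own sample
    for t in range(1, n_rounds + 1):
        step = 1.0 / (t + 1)                      # decreasing step size
        # Each agent sends one category index drawn from its current estimate.
        drawn = np.array([rng.choice(n_categories, p=q) for q in estimates])
        received = weights @ one_hot[drawn]       # weighted average of messages
        estimates = (1.0 - step) * estimates + step * received
        estimates /= estimates.sum(axis=1, keepdims=True)   # guard against rounding drift
    return estimates
```

Every update is a convex combination of probability vectors, so each estimate remains a valid distribution throughout. Whether and where the agents reach consensus depends on the step sizes and on the network, which is what the stochastic approximation analysis in the paper addresses.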