Category Archives: Probability And Statistics

Improving the estimation of the offset parameter of heavy-tailed distributions through the injection of noise

Y. Pan, F. Duan, F. Chapeau-Blondeau and D. Abbott, Noise Enhancement in Robust Estimation of Location, IEEE Transactions on Signal Processing, vol. 66, no. 8, pp. 1953-1966, DOI: 10.1109/TSP.2018.2802463.

In this paper, we investigate the noise benefits to maximum likelihood type estimators (M-estimator) for the robust estimation of a location parameter. Two distinct noise benefits are shown to be accessible under these conditions. With symmetric heavy-tailed noise distributions, the asymptotic efficiency of the estimation can be enhanced by injecting extra noise into the M-estimators. With an asymmetric contaminated noise model having a convex cumulative distribution function, we demonstrate that addition of noise can reduce the maximum bias of the median estimator. These findings extend the analysis of stochastic resonance effects for noise-enhanced signal and information processing.

A novel approach to use POMDP in practical active perception, where rewards are needed to penalize uncertainty and therefore reomve the piecewise-linear and convex property of the value function

Satsangi, Y., Whiteson, S., Oliehoek, F.A. et al., Exploiting submodular value functions for scaling up active perception, Auton Robot (2018) 42: 209, DOI: 10.1007/s10514-017-9666-5.

In active perception tasks, an agent aims to select sensory actions that reduce its uncertainty about one or more hidden variables. For example, a mobile robot takes sensory actions to efficiently navigate in a new environment. While partially observable Markov decision processes (POMDPs) provide a natural model for such problems, reward functions that directly penalize uncertainty in the agent’s belief can remove the piecewise-linear and convex (PWLC) property of the value function required by most POMDP planners. Furthermore, as the number of sensors available to the agent grows, the computational cost of POMDP planning grows exponentially with it, making POMDP planning infeasible with traditional methods. In this article, we address a twofold challenge of modeling and planning for active perception tasks. We analyze ρ POMDP and POMDP-IR, two frameworks for modeling active perception tasks, that restore the PWLC property of the value function. We show the mathematical equivalence of these two frameworks by showing that given a ρ POMDP along with a policy, they can be reduced to a POMDP-IR and an equivalent policy (and vice-versa). We prove that the value function for the given ρ POMDP (and the given policy) and the reduced POMDP-IR (and the reduced policy) is the same. To efficiently plan for active perception tasks, we identify and exploit the independence properties of POMDP-IR to reduce the computational cost of solving POMDP-IR (and ρ POMDP). We propose greedy point-based value iteration (PBVI), a new POMDP planning method that uses greedy maximization to greatly improve scalability in the action space of an active perception POMDP. Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function. We establish the conditions under which the value function of an active perception POMDP is guaranteed to be submodular. Finally, we present a detailed empirical analysis on a dataset collected from a multi-camera tracking system employed in a shopping mall. Our method achieves similar performance to existing methods but at a fraction of the computational cost leading to better scalability for solving active perception tasks.

Detecting anomalies in sequences of data by first modeling the data and then distinguishing non-usual information based on that model

K. Gokcesu and S. S. Kozat, Online Anomaly Detection With Minimax Optimal Density Estimation in Nonstationary Environments, IEEE Transactions on Signal Processing, vol. 66, no. 5, pp. 1213-1227, DOI: 10.1109/TSP.2017.2784390.

We introduce a truly online anomaly detection algorithm that sequentially processes data to detect anomalies in time series. In anomaly detection, while the anomalous data are arbitrary, the normal data have similarities and generally conforms to a particular model. However, the particular model that generates the normal data is generally unknown (even nonstationary) and needs to be learned sequentially. Therefore, a two stage approach is needed, where in the first stage, we construct a probability density function to model the normal data in the time series. Then, in the second stage, we threshold the density estimation of the newly observed data to detect anomalies. We approach this problem from an information theoretic perspective and propose minimax optimal schemes for both stages to create an optimal anomaly detection algorithm in a strong deterministic sense. To this end, for the first stage, we introduce a completely online density estimation algorithm that is minimax optimal with respect to the log-loss and achieves Merhav’s lower bound for general nonstationary exponential-family of distributions without any assumptions on the observation sequence. For the second stage, we propose a threshold selection scheme that is minimax optimal (with logarithmic performance bounds) against the best threshold chosen in hindsight with respect to the surrogate logistic loss. Apart from the regret bounds, through synthetic and real life experiments, we demonstrate substantial performance gains with respect to the state-of-the-art density estimation based anomaly detection algorithms in the literature.

Probabilistic SLAM is still the way to go for dynamic environments (according to this paper)

C. Evers and P. A. Naylor, Optimized Self-Localization for SLAM in Dynamic Scenes Using Probability Hypothesis Density Filters, IEEE Transactions on Signal Processing, vol. 66, no. 4, pp. 863-878, DOI: 10.1109/TSP.2017.2775590.

In many applications, sensors that map the positions of objects in unknown environments are installed on dynamic platforms. As measurements are relative to the observer’s sensors, scene mapping requires accurate knowledge of the observer state. However, in practice, observer reports are subject to positioning errors. Simultaneous localization and mapping addresses the joint estimation problem of observer localization and scene mapping. State-of-the-art approaches typically use visual or optical sensors and therefore rely on static beacons in the environment to anchor the observer estimate. However, many applications involving sensors that are not conventionally used for Simultaneous Localization and Mapping (SLAM) are affected by highly dynamic scenes, such that the static world assumption is invalid. This paper proposes a novel approach for dynamic scenes, called GEneralized Motion (GEM) SLAM. Based on probability hypothesis density filters, the proposed approach probabilistically anchors the observer state by fusing observer information inferred from the scene with reports of the observer motion. This paper derives the general, theoretical framework for GEM-SLAM, and shows that it generalizes existing Probability Hypothesis Density (PHD)-based SLAM algorithms. Simulations for a model-specific realization using range-bearing sensors and multiple moving objects highlight that GEM-SLAM achieves significant improvements over three benchmark algorithms.

Solving MDPs with discounted rewards for minimizing variance instead of expected (discounted) reward

Li Xia, Mean–variance optimization of discrete time discounted Markov decision processes, Automatica, Volume 88, 2018, Pages 76-82, DOI: 10.1016/j.automatica.2017.11.012.

In this paper, we study a mean–variance optimization problem in an infinite horizon discrete time discounted Markov decision process (MDP). The objective is to minimize the variance of system rewards with the constraint of mean performance. Different from most of works in the literature which require the mean performance already achieve optimum, we can let the discounted performance equal any constant. The difficulty of this problem is caused by the quadratic form of the variance function which makes the variance minimization problem not a standard MDP. By proving the decomposable structure of the feasible policy space, we transform this constrained variance minimization problem to an equivalent unconstrained MDP under a new discounted criterion and a new reward function. The difference of the variances of Markov chains under any two feasible policies is quantified by a difference formula. Based on the variance difference formula, a policy iteration algorithm is developed to find the optimal policy. We also prove the optimality of deterministic policy over the randomized policy generated in the mean-constrained policy space. Numerical experiments demonstrate the effectiveness of our approach.

A new approach to SLAM based on KF but without linearization

Feng Tan, Winfried Lohmiller, and Jean-Jacques Slotine, Analytical SLAM without linearization, The International Journal of Robotics Research
Vol 36, Issue 13-14, pp. 1554 – 1578, DOI: 10.1177/0278364917710541.

This paper solves the classical problem of simultaneous localization and mapping (SLAM) in a fashion that avoids linearized approximations altogether. Based on the creation of virtual synthetic measurements, the algorithm uses a linear time-varying Kalman observer, bypassing errors and approximations brought by the linearization process in traditional extended Kalman filtering SLAM. Convergence rates of the algorithm are established using contraction analysis. Different combinations of sensor information can be exploited, such as bearing measurements, range measurements, optical flow, or time-to-contact. SLAM-DUNK, a more advanced version of the algorithm in global coordinates, exploits the conditional independence property of the SLAM problem, decoupling the covariance matrices between different landmarks and reducing computational complexity to O(n). As illustrated in simulations, the proposed algorithm can solve SLAM problems in both 2D and 3D scenarios with guaranteed convergence rates in a full nonlinear context.

A new method for estimating inertial sensor signals

M. Ghobadi, P. Singla and E. T. Esfahani, Robust Attitude Estimation from Uncertain Observations of Inertial Sensors Using Covariance Inflated Multiplicative Extended Kalman Filter, IEEE Transactions on Instrumentation and Measurement, vol. 67, no. 1, pp. 209-217, DOI: 10.1109/TIM.2017.2761230.

This paper presents an attitude estimation method from uncertain observations of inertial sensors, which is highly robust against different uncertainties. The proposed method of covariance inflated multiplicative extended Kalman filter (CI-MEKF) takes the advantage of non-singularity of covariance in MEKF as well as a novel covariance inflation (CI) approach to fuse inconsistent information. The proposed CI approach compensates the undesired effect of magnetic distortion and body acceleration (as inherent biases of magnetometer and accelerometer sensors data, respectively) on the estimated attitude. Moreover, the CI-MEKF can accurately estimate the gyro bias. A number of simulation scenarios are designed to compare the performance of the proposed method with the state of the art in attitude estimation. The results show the proposed method outperforms the state of the art in terms of estimation accuracy and robustness. Moreover, the proposed CI-MEKF method is shown to be significantly robust against different uncertainties, such as large body acceleration, magnetic distortion, and errors, in the initial condition of the attitude.

Extending bayesian fusion from Euclidean spaces to Lie groups

Kevin C. Wolfe, Michael Mashner, Gregory S. Chirikjian, Bayesian Fusion on Lie Groups, JOURNAL OF ALGEBRAIC STATISTICS Vol. 2, No. 1, 2011, 75-97, DOI: 10.18409/jas.v2i1.11.

An increasing number of real-world problems involve the measurement of data, and the computation of estimates, on Lie groups. Moreover, establishing confidence in the resulting estimates is important. This paper therefore seeks to contribute to a larger theoretical framework that generalizes classical multivariate statistical analysis from Euclidean space to the setting of Lie groups. The particular focus here is on extending Bayesian fusion, based on exponential families of probability densities, from the Euclidean setting to Lie groups. The definition and properties of a new kind of Gaussian distribution for connected unimodular Lie groups are articulated, and explicit formulas and algorithms are given for finding the mean and covariance of the fusion model based on the means and covariances of the constituent probability densities. The Lie groups that find the most applications in engineering are rotation groups and groups of rigid-body motions. Orientational (rotation-group) data and associated algorithms for estimation arise in problems including satellite attitude, molecular spectroscopy, and global geological studies. In robotics and manufacturing, quantifying errors in the position and orientation of tools and parts are important for task performance and quality control. Developing a general way to handle problems on Lie groups can be applied to all of these problems. In particular, we study the issue of how to ‘fuse’ two such Gaussians and how to obtain a new Gaussian of the same form that is ‘close to’ the fused density.This is done at two levels of approximation that result from truncating the Baker-Campbell-Hausdorff formula with different numbers of terms. Algorithms are developed and numerical results are presented that are shown to generate the equivalent fused density with good accuracy

Kalman Filter as the extreme case of finite impulse response filters as the horizon increases in length

Shunyi Zhao, Biao Huang, Yuriy S. Shmaliy, Bayesian state estimation on finite horizons: The case of linear state–space model,Automatica, Volume 85, 2017, Pages 91-99, DOI: 10.1016/j.automatica.2017.07.043.

The finite impulse response (FIR) filter and infinite impulse response filter including the Kalman filter (KF) are generally considered as two different types of state estimation methods. In this paper, the sequential Bayesian philosophy is extended to a filter design using a fixed amount of most recent measurements, with the aim of exploiting the FIR structure and unifying some basic FIR filters with the KF. Specifically, the conditional mean and covariance of the posterior probability density functions are first derived to show the FIR counterpart of the KF. To remove the dependence on initial states, the corresponding likelihood is further maximized and realized iteratively. It shows that the maximum likelihood modification unifies the existing unbiased FIR filters by tuning a weighting matrix. Moreover, it converges to the Kalman estimate with the increase of horizon length, and can thus be considered as a link between the FIR filtering and the KF. Several important properties including stability and robustness against errors of noise statistics are illustrated. Finally, a moving target tracking example and an experiment with a three degrees-of-freedom helicopter system are introduced to demonstrate effectiveness.

Example of learning a Bayesian network using expert knowledge

H. Amirkhani, M. Rahmati, P. J. F. Lucas and A. Hommersom, Exploiting Experts’ Knowledge for Structure Learning of Bayesian Networks, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 11, pp. 2154-2170, DOI: 10.1109/TPAMI.2016.2636828.

Learning Bayesian network structures from data is known to be hard, mainly because the number of candidate graphs is super-exponential in the number of variables. Furthermore, using observational data alone, the true causal graph is not discernible from other graphs that model the same set of conditional independencies. In this paper, it is investigated whether Bayesian network structure learning can be improved by exploiting the opinions of multiple domain experts regarding cause-effect relationships. In practice, experts have different individual probabilities of correctly labeling the inclusion or exclusion of edges in the structure. The accuracy of each expert is modeled by three parameters. Two new scoring functions are introduced that score each candidate graph based on the data and experts’ opinions, taking into account their accuracy parameters. In the first scoring function, the experts’ accuracies are estimated using an expectation-maximization-based algorithm and the estimated accuracies are explicitly used in the scoring process. The second function marginalizes out the accuracy parameters to obtain more robust scores when it is not possible to obtain a good estimate of experts’ accuracies. The experimental results on simulated and real world datasets show that exploiting experts’ knowledge can improve the structure learning if we take the experts’ accuracies into account.