Tag Archives: Bayesian Estimation

Bayesian estimation of the model in model-based RL for robots

Senda, Kei, Hishinuma, Toru, Tani, Yurika, Approximate Bayesian reinforcement learning based on estimation of plant, Autonomous Robots 44(5), DOI: 10.1007/s10514-020-09901-4.

This study proposes an approximate parametric model-based Bayesian reinforcement learning approach for robots, based on online Bayesian estimation and online planning for an estimated model. The proposed approach is designed to learn a robotic task with a few real-world samples and to be robust against model uncertainty, within feasible computational resources. The proposed approach employs two-stage modeling, which is composed of (1) a parametric differential equation model with a few parameters based on prior knowledge such as equations of motion, and (2) a parametric model that interpolates a finite number of transition probability models for online estimation and planning. The proposed approach modifies the online Bayesian estimation to be robust against approximation errors of the parametric model to a real plant. The policy planned for the interpolating model is proven to have a form of theoretical robustness. Numerical simulation and hardware experiments of a planar peg-in-hole task demonstrate the effectiveness of the proposed approach.
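
As a rough illustration of the online estimation step, the sketch below runs a discrete Bayes update over a finite set of candidate transition models. It is a minimal sketch under assumptions, not the authors' algorithm: the tabular model format, the function name `update_model_posterior`, and the example numbers are all hypothetical, and it leaves out both the interpolation between models and the robustness modification mentioned in the abstract.

```python
def update_model_posterior(prior, models, transition):
    """One Bayes step over a finite set of candidate transition models.

    prior      -- list of prior probabilities, one per candidate model
    models     -- list of dicts mapping (state, action, next_state) to a
                  probability (hypothetical tabular format for this example)
    transition -- the observed (state, action, next_state) tuple
    """
    # Likelihood of the observed transition under each candidate model.
    likelihoods = [m.get(transition, 0.0) for m in models]
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    evidence = sum(unnormalized)
    if evidence == 0.0:
        raise ValueError("transition has zero probability under every model")
    return [u / evidence for u in unnormalized]


# Two hypothetical models of what happens when pushing right from state 0.
model_a = {(0, "right", 1): 0.9, (0, "right", 0): 0.1}
model_b = {(0, "right", 1): 0.2, (0, "right", 0): 0.8}

posterior = [0.5, 0.5]
for observed in [(0, "right", 1), (0, "right", 1), (0, "right", 0)]:
    posterior = update_model_posterior(posterior, [model_a, model_b], observed)
```

Each real-world sample sharpens the belief over models, which is the sense in which a handful of samples can already be informative when the model family is small.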

A probabilistically rigorous formulation of grid-map estimation in dynamic scenarios, with a nice review of the state of the art in grid maps for both static and dynamic scenarios

Dominik Nuss, Stephan Reuter, Markus Thom, Ting Yuan, Gunther Krehl, Michael Maile, Axel Gern, and Klaus Dietmayer, A random finite set approach for dynamic occupancy grid maps with real-time application, The International Journal of Robotics Research, Vol 37, Issue 8, pp. 841-866, DOI: 10.1177/0278364918775523.

Grid mapping is a well-established approach for environment perception in robotic and automotive applications. Early work suggests estimating the occupancy state of each grid cell in a robot’s environment using a Bayesian filter to recursively combine new measurements with the current posterior state estimate of each grid cell. This filter is often referred to as binary Bayes filter. A basic assumption of classical occupancy grid maps is a stationary environment. Recent publications describe bottom-up approaches using particles to represent the dynamic state of a grid cell and outline prediction-update recursions in a heuristic manner. This paper defines the state of multiple grid cells as a random finite set, which allows to model the environment as a stochastic, dynamic system with multiple obstacles, observed by a stochastic measurement system. It motivates an original filter called the probability hypothesis density / multi-instance Bernoulli (PHD/MIB) filter in a top-down manner. The paper presents a real-time application serving as a fusion layer for laser and radar sensor data and describes in detail a highly efficient parallel particle filter implementation. A quantitative evaluation shows that parameters of the stochastic process model affect the filter results as theoretically expected and that appropriate process and observation models provide consistent state estimation results.
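
The binary Bayes filter mentioned in the abstract can be sketched for a single static cell as below. This is the standard log-odds form of the classical filter, not the PHD/MIB filter the paper proposes; the function names and the example sensor probabilities are assumptions made for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def binary_bayes_filter(inverse_sensor_probs, p_prior=0.5):
    """Fuse a sequence of inverse sensor model outputs p(occupied | z_t)
    for one static grid cell and return the posterior occupancy probability.
    """
    l_prior = logit(p_prior)
    log_odds = l_prior
    for p in inverse_sensor_probs:
        # Recursive update: add the new evidence, subtract the prior so it
        # is not counted again at every step.
        log_odds += logit(p) - l_prior
    return 1.0 / (1.0 + math.exp(-log_odds))


# Two measurements that each suggest "occupied" with probability 0.7.
p_occupied = binary_bayes_filter([0.7, 0.7])
```

Because the cell state is assumed not to change between measurements, the update is purely additive in log-odds, which is exactly the stationarity assumption the paper sets out to relax.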

A brief general explanation of Rao-Blackwellization and a new way of applying it to reduce the variance of a point estimate in a sequential Bayesian setting

Petetin, Y.; Desbouvries, F., Bayesian Conditional Monte Carlo Algorithms for Nonlinear Time-Series State Estimation, IEEE Transactions on Signal Processing, vol. 63, no. 14, pp. 3586-3598, DOI: 10.1109/TSP.2015.2423251.

Bayesian filtering aims at estimating sequentially a hidden process from an observed one. In particular, sequential Monte Carlo (SMC) techniques propagate in time weighted trajectories which represent the posterior probability density function (pdf) of the hidden process given the available observations. On the other hand, conditional Monte Carlo (CMC) is a variance reduction technique which replaces the estimator of a moment of interest by its conditional expectation given another variable. In this paper, we show that up to some adaptations, one can make use of the time recursive nature of SMC algorithms in order to propose natural temporal CMC estimators of some point estimates of the hidden process, which outperform the associated crude Monte Carlo (MC) estimator whatever the number of samples. We next show that our Bayesian CMC estimators can be computed exactly, or approximated efficiently, in some hidden Markov chain (HMC) models; in some jump Markov state-space systems (JMSS); as well as in multitarget filtering. Finally our algorithms are validated via simulations.
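
The variance reduction behind conditional Monte Carlo can be seen in a toy static example, sketched below. It is not one of the paper's temporal estimators: the model (Y ~ N(1, 1), X | Y ~ N(Y, 1)), the function names, and the sample sizes are assumptions chosen so that E[X | Y] = Y is known in closed form.

```python
import random
import statistics


def crude_and_cmc(n, rng):
    """Return (crude MC, conditional MC) estimates of E[X] from n samples,
    where Y ~ N(1, 1) and X | Y ~ N(Y, 1), so that E[X | Y] = Y.
    """
    ys = [rng.gauss(1.0, 1.0) for _ in range(n)]
    xs = [rng.gauss(y, 1.0) for y in ys]
    crude = statistics.fmean(xs)  # averages X directly
    cmc = statistics.fmean(ys)    # averages E[X | Y] instead
    return crude, cmc


def compare_variance(n=50, reps=2000, seed=0):
    """Empirical variance of both estimators over repeated experiments."""
    rng = random.Random(seed)
    pairs = [crude_and_cmc(n, rng) for _ in range(reps)]
    var_crude = statistics.pvariance([c for c, _ in pairs])
    var_cmc = statistics.pvariance([m for _, m in pairs])
    return var_crude, var_cmc


var_crude, var_cmc = compare_variance()
```

In this toy model the crude estimator has variance 2/n and the conditional one 1/n, so the gain holds for any sample size, in line with the abstract's "whatever the number of samples".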

A brief but nice review of related work on structured prediction (MRFs, CRFs, etc.)

Bratieres, S.; Quadrianto, N.; Ghahramani, Z., GPstruct: Bayesian Structured Prediction Using Gaussian Processes, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 7, pp. 1514-1520, July 2015, DOI: 10.1109/TPAMI.2014.2366151.

We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M^3N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference procedure based on Markov Chain Monte Carlo. The framework can be instantiated for a wide range of structured objects such as linear chains, trees, grids, and other general graphs. As a proof of concept, the model is benchmarked on several natural language processing tasks and a video gesture segmentation task involving a linear chain structure. We show prediction accuracies for GPstruct which are comparable to or exceeding those of CRFs and SVMstruct.

Accelerating the update stage of a particle filter (PF) by computing weights for a few representative particles and interpolating them to the rest, with interesting methods for selection and interpolation and a nice review of related work on efficiency-improved PFs

Shabat, G.; Shmueli, Y.; Bermanis, A.; Averbuch, A., Accelerating Particle Filter Using Randomized Multiscale and Fast Multipole Type Methods, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 7, pp. 1396-1407, July 2015, DOI: 10.1109/TPAMI.2015.2392754.

Particle filter is a powerful tool for state tracking using non-linear observations. We present a multiscale based method that accelerates the tracking computation by particle filters. Unlike the conventional way, which calculates weights over all particles in each cycle of the algorithm, we sample a small subset from the source particles using matrix decomposition methods. Then, we apply a function extension algorithm that uses a particle subset to recover the density function for all the rest of the particles not included in the chosen subset. The computational effort is substantial especially when multiple objects are tracked concurrently. The proposed algorithm significantly reduces the computational load. By using the Fast Gaussian Transform, the complexity of the particle selection step is reduced to a linear time in n and k , where n is the number of particles and k is the number of particles in the selected subset. We demonstrate our method on both simulated and on real data such as object tracking in video sequences.
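
The general idea of weighting only a subset and extending to the rest can be sketched in one dimension as below. This is a deliberately simplified stand-in: evenly spaced selection by rank replaces the paper's matrix-decomposition sampling, plain linear interpolation replaces its function extension algorithm, and the Gaussian likelihood, function names, and parameters are all assumptions.

```python
import bisect
import math


def gaussian_likelihood(x, z, sigma):
    """Unnormalized likelihood of observation z for a particle at x."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2)


def interpolated_weights(particles, z, sigma, k):
    """Evaluate the likelihood on k particles (k >= 2, evenly spaced by
    rank, extremes included) and linearly interpolate it for the others.
    Returns normalized weights in the original particle order.
    """
    n = len(particles)
    order = sorted(range(n), key=lambda i: particles[i])
    ranks = sorted({round(j * (n - 1) / (k - 1)) for j in range(k)})
    xs = [particles[order[r]] for r in ranks]
    ws = [gaussian_likelihood(x, z, sigma) for x in xs]  # only k evaluations

    weights = []
    for x in particles:
        hi = bisect.bisect_left(xs, x)
        if xs[hi] == x:
            w = ws[hi]  # particle coincides with a selected one
        else:
            lo = hi - 1
            t = (x - xs[lo]) / (xs[hi] - xs[lo])
            w = (1.0 - t) * ws[lo] + t * ws[hi]
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


weights = interpolated_weights([0.0, 1.0, 2.0, 3.0, 4.0], z=2.0, sigma=1.0, k=3)
```

With k equal to the number of particles the result reduces to the conventional weight update; the saving comes from choosing k much smaller when the likelihood is expensive to evaluate.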

Analysis of how several Kalman filter approximations deteriorate as a function of the amount of uncertainty in the observations, when the observation model is nonlinear

Mark R. Morelande and Ángel F. García-Fernández, Analysis of Kalman Filter Approximations for Nonlinear Measurements, IEEE Transactions on Signal Processing, vol. 61, no. 22, 2013, DOI: 10.1109/TSP.2013.2279367.

A theoretical analysis is presented of the correction step of the Kalman filter (KF) and its various approximations for the case of a nonlinear measurement equation with additive Gaussian noise. The KF is based on a Gaussian approximation to the joint density of the state and the measurement. The analysis metric is the Kullback-Leibler divergence of this approximation from the true joint density. The purpose of the analysis is to provide a quantitative tool for understanding and assessing the performance of the KF and its variants in nonlinear scenarios. This is illustrated using a numerical example.
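
For concreteness, here is a scalar sketch of a correction step with a nonlinear measurement z = h(x) + v, v ~ N(0, R), using first-order linearization (the extended Kalman filter). The EKF is used here only as one familiar approximation of this kind; the function name `ekf_correct`, the scalar setting, and the example measurement function are assumptions, and the sketch does not compute the Kullback-Leibler metric from the paper.

```python
def ekf_correct(m, P, z, h, dh, R):
    """Scalar EKF correction step.

    m, P -- prior mean and variance of the state
    z    -- received measurement
    h    -- measurement function, dh its derivative
    R    -- variance of the additive Gaussian measurement noise
    """
    H = dh(m)                        # linearize h around the prior mean
    S = H * P * H + R                # innovation variance
    K = P * H / S                    # Kalman gain
    m_post = m + K * (z - h(m))      # corrected mean
    P_post = (1.0 - K * H) * P       # corrected variance
    return m_post, P_post


# Hypothetical quadratic measurement of a state believed to be near 1.
m_post, P_post = ekf_correct(
    m=1.0, P=0.5, z=1.3, h=lambda x: x * x, dh=lambda x: 2.0 * x, R=0.1
)
```

The linearization is accurate only near the prior mean, so its quality depends on how the prior spread compares with the measurement noise, which is the kind of dependence the analysis quantifies.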

Estimating the bandwidth of a communication channel for adjusting the bitrate in high-definition video streaming, using Pareto and Gamma distributions (which are conjugate) in a Bayesian estimation framework

Javadtalab, A.; Semsarzadeh, M.; Khanchi, A.; Shirmohammadi, S.; Yassine, A., Continuous One-Way Detection of Available Bandwidth Changes for Video Streaming Over Best-Effort Networks, IEEE Transactions on Instrumentation and Measurement, vol. 64, no. 1, pp. 190-203, Jan. 2015, DOI: 10.1109/TIM.2014.2331423.

Video streaming over best-effort networks, such as the Internet, is now a significant application used by most Internet users. However, best-effort networks are characterized by dynamic and unpredictable changes in the available bandwidth, which adversely affect the quality of video. As such, it is important to have real-time detection mechanisms of bandwidth changes to ensure that video is adapted to the available bandwidth and transmitted at the highest quality. In this paper, we propose a Bayesian instantaneous end-to-end bandwidth change prediction model and method to detect and predict one-way bandwidth changes at the receiver. Unlike existing congestion detection mechanisms, which use network parameters such as packet loss probability, round trip time (RTT), or jitter, our approach uses weighted interarrival time of video packets at the receiver side. Furthermore, our approach is continuous, since it measures available bandwidth changes with each incoming video packet, and therefore detects congestion occurrence in <200 ms, on average, which is significantly faster than existing approaches. In addition, it is a one-way scheme, since it only takes into account the characteristics of the incoming path and not the outgoing path, as opposed to other approaches, which use RTT and are hence less accurate. In this paper, we provide extensive experimental simulations and real-world network implementation. Our results indicate that the proposed detection method is superior to existing solutions.
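
The Pareto/Gamma conjugacy mentioned above can be sketched as follows: for Pareto-distributed data with known scale x_m and unknown shape, a Gamma(a, b) prior on the shape (b being a rate) yields a Gamma(a + n, b + sum of ln(x_i / x_m)) posterior. How the paper maps packet interarrival times onto these distributions is not reproduced here; the function names and the numbers are assumptions for illustration only.

```python
import math


def gamma_pareto_update(a, b, samples, x_m):
    """Update a Gamma(a, b) prior (shape a, rate b) on the Pareto shape
    parameter after observing samples from Pareto(x_m, shape), x_m known.
    """
    if any(x < x_m for x in samples):
        raise ValueError("Pareto samples cannot be smaller than the scale x_m")
    a_post = a + len(samples)
    b_post = b + sum(math.log(x / x_m) for x in samples)
    return a_post, b_post


def posterior_mean_shape(a, b):
    """Mean of a Gamma(a, b) distribution with rate parameterization."""
    return a / b


# Hypothetical observations with scale x_m = 1.0, processed one at a time
# as a per-packet scheme would.
a, b = 1.0, 1.0
for x in [1.5, 2.0, 1.2]:
    a, b = gamma_pareto_update(a, b, [x], x_m=1.0)
shape_estimate = posterior_mean_shape(a, b)
```

Since the update only adds a count and a log-ratio, it costs constant time per observation, which fits the idea of re-estimating with each incoming packet.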