Category Archives: Mathematics

Modelling hierarchical stochastic signals (i.e., signals that decompose hierarchically into sub-signals)

Truyen Tran, Dinh Phung, Hung Bui, Svetha Venkatesh, Hierarchical semi-Markov conditional random fields for deep recursive sequential data, Artificial Intelligence, Volume 246, May 2017, Pages 53-85, ISSN 0004-3702, DOI: 10.1016/j.artint.2017.02.003.

We present the hierarchical semi-Markov conditional random field (HSCRF), a generalisation of linear-chain conditional random fields to model deep nested Markov processes. It is parameterised as a conditional log-linear model and has polynomial time algorithms for learning and inference. We derive algorithms for partially-supervised learning and constrained inference. We develop numerical scaling procedures that handle the overflow problem. We show that when depth is two, the HSCRF can be reduced to the semi-Markov conditional random fields. Finally, we demonstrate the HSCRF on two applications: (i) recognising human activities of daily living (ADLs) from indoor surveillance cameras, and (ii) noun-phrase chunking. The HSCRF is capable of learning rich hierarchical models with reasonable accuracy in both fully and partially observed data cases.
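
As a point of reference only (this is not the HSCRF), here is a minimal sketch of the flat model the paper generalises: a linear-chain CRF for noun-phrase chunking, one of the paper's two applications, trained with the sklearn-crfsuite package. The feature template and the toy sentence are made-up assumptions.

```python
# Linear-chain CRF baseline for NP chunking (the flat model that HSCRF generalises,
# not the hierarchical model itself). Tokens become feature dicts, labels use BIO tags.
import sklearn_crfsuite

def token_features(sent, i):
    word, pos = sent[i]
    return {
        "word.lower": word.lower(),
        "pos": pos,
        "prev_pos": sent[i - 1][1] if i > 0 else "BOS",
        "next_pos": sent[i + 1][1] if i + 1 < len(sent) else "EOS",
    }

# toy (word, POS) sentence with BIO chunk labels; real training would use CoNLL-style data
train_sents = [[("The", "DT"), ("black", "JJ"), ("cat", "NN"), ("sleeps", "VBZ")]]
train_labels = [["B-NP", "I-NP", "I-NP", "O"]]

X = [[token_features(s, i) for i in range(len(s))] for s in train_sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, train_labels)
print(crf.predict(X))
```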

How very simple digital signal processing techniques, such as numerical filtering and linear interpolation, can provide PDF estimates with better statistical properties than the histogram, and close to, or better than, those obtained with kernel-based estimators

P. Carbone, D. Petri and K. Barbé, “Nonparametric Probability Density Estimation via Interpolation Filtering,” in IEEE Transactions on Instrumentation and Measurement, vol. 66, no. 4, pp. 681-690, April 2017. DOI: 10.1109/TIM.2017.2657398.

In this paper, we discuss nonparametric estimation of the probability density function (PDF) of a univariate random variable. This problem has been the subject of a vast amount of scientific literature in many domains, while statisticians are mainly interested in the analysis of the properties of proposed estimators, and engineers treat the histogram as a ready-to-use tool for a data set analysis. By considering histogram data as a numerical sequence, a simple approach for PDF estimation is presented in this paper. It is based on basic notions related to the reconstruction of a continuous-time signal from a sequence of samples. When estimating continuous PDFs, it is shown that the proposed approach is as accurate as kernel-based estimators, widely adopted in the statistical literature. Conversely, it can provide better accuracy when the PDF to be estimated exhibits a discontinuous behavior. The main statistical properties of the proposed estimators are derived and then verified by simulations related to the common cases of normal and uniform density functions. The obtained results are also used to derive optimal, i.e., minimum integral of the mean square error, estimators.
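
A rough sketch of the underlying idea, not the authors' exact filter design: treat the normalised histogram counts as samples of the PDF, interpolate between bin centres to reconstruct a continuous estimate, and compare against a kernel estimate. The bin count, sample size and target density below are arbitrary choices.

```python
# Histogram-plus-interpolation PDF estimate versus a Gaussian KDE on a toy normal sample.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
data = rng.normal(size=2000)                               # samples from a standard normal

counts, edges = np.histogram(data, bins=40, density=True)  # normalised histogram
centres = 0.5 * (edges[:-1] + edges[1:])

x = np.linspace(data.min(), data.max(), 500)
pdf_interp = np.interp(x, centres, counts)                 # interpolation-based estimate
pdf_kde = gaussian_kde(data)(x)                            # kernel-based reference

true_pdf = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
print("interpolation, mean squared error:", np.mean((pdf_interp - true_pdf) ** 2))
print("kernel (KDE),  mean squared error:", np.mean((pdf_kde - true_pdf) ** 2))
```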

How to improve statistical results obtained from limited set-ups through active sampling, and a nice review of possible pitfalls in conducting statistical research (and a mention of “pre-registration” of hypotheses and analysis plans to be peer-reviewed before the results are submitted)

Romy Lorenz, Adam Hampshire, Robert Leech, Neuroadaptive Bayesian Optimization and Hypothesis Testing, Trends in Cognitive Sciences, Volume 21, Issue 3, March 2017, Pages 155-167, ISSN 1364-6613, DOI: 10.1016/j.tics.2017.01.006.

Cognitive neuroscientists are often interested in broad research questions, yet use overly narrow experimental designs by considering only a small subset of possible experimental conditions. This limits the generalizability and reproducibility of many research findings. Here, we propose an alternative approach that resolves these problems by taking advantage of recent developments in real-time data analysis and machine learning. Neuroadaptive Bayesian optimization is a powerful strategy to efficiently explore more experimental conditions than is currently possible with standard methodology. We argue that such an approach could broaden the hypotheses considered in cognitive science, improving the generalizability of findings. In addition, Bayesian optimization can be combined with preregistration to cover exploration, mitigating researcher bias more broadly and improving reproducibility.
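
A generic Bayesian-optimisation loop to illustrate the mechanism (not the neuroadaptive fMRI pipeline itself): a Gaussian process models an unknown objective over the space of experimental conditions, and an expected-improvement rule picks the next condition to run. The stand-in objective, search grid and iteration budget are assumptions.

```python
# Bayesian optimisation over a 1-D space of "experimental conditions" with a GP surrogate.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def run_experiment(c):                                    # stand-in for a real measurement
    return np.exp(-(c - 0.3) ** 2 / 0.02) + 0.05 * np.random.randn()

grid = np.linspace(0, 1, 200).reshape(-1, 1)              # candidate experimental conditions
X, y = [[0.5]], [run_experiment(0.5)]                     # start from one arbitrary condition

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5))
    gp.fit(np.array(X), np.array(y))
    mu, sd = gp.predict(grid, return_std=True)
    best = max(y)
    z = (mu - best) / (sd + 1e-9)
    ei = (mu - best) * norm.cdf(z) + sd * norm.pdf(z)     # expected improvement
    c_next = float(grid[np.argmax(ei), 0])                # next condition to run
    X.append([c_next]); y.append(run_experiment(c_next))

print("best condition found:", X[int(np.argmax(y))])
```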

Very efficient global non-linear optimization for real-time robotic problems through the re-use of pre-computed solutions

K. Hauser, “Learning the Problem-Optimum Map: Analysis and Application to Global Optimization in Robotics,” in IEEE Transactions on Robotics, vol. 33, no. 1, pp. 141-152, Feb. 2017. DOI: 10.1109/TRO.2016.2623345.

This paper describes a data-driven framework for approximate global optimization in which precomputed solutions to a sample of problems are retrieved and adapted during online use to solve novel problems. This approach has promise for real-time applications in robotics, since it can produce near globally optimal solutions orders of magnitude faster than standard methods. This paper establishes theoretical conditions on how many and where samples are needed over the space of problems to achieve a given approximation quality. The framework is applied to solve globally optimal collision-free inverse kinematics problems, wherein large solution databases are used to produce near-optimal solutions in a submillisecond time on a standard PC.
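
A minimal sketch of the retrieve-and-adapt idea: solve a sample of problems offline, store the solutions, and at query time retrieve the nearest precomputed problem and use its solution as a warm start for a fast local optimiser. The toy objective stands in for a robotics problem such as collision-free IK; it is not the paper's formulation.

```python
# Precomputed-solution database with nearest-neighbour retrieval and local refinement.
import numpy as np
from scipy.optimize import minimize
from scipy.spatial import cKDTree

def objective(x, p):                                      # family of problems parameterised by p
    return np.sum((x - p) ** 2) + 0.1 * np.sin(5 * x[0])

rng = np.random.default_rng(1)
params = rng.uniform(-1, 1, size=(500, 2))                # sampled problem parameters (offline)
solutions = np.array([minimize(objective, p, args=(p,)).x for p in params])
tree = cKDTree(params)                                    # retrieval structure over problems

def solve_online(p_new):
    _, idx = tree.query(p_new)                            # nearest precomputed problem
    x0 = solutions[idx]                                   # adapt its stored solution
    return minimize(objective, x0, args=(p_new,)).x       # cheap local refinement

print(solve_online(np.array([0.2, -0.4])))
```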

Varying the number of particles in a PF in order to improve the speed of convergence, with a brief review of related work on adapting the number of particles for other goals

V. Elvira, J. Míguez and P. M. Djurić, “Adapting the Number of Particles in Sequential Monte Carlo Methods Through an Online Scheme for Convergence Assessment,” in IEEE Transactions on Signal Processing, vol. 65, no. 7, pp. 1781-1794, April 1, 2017. DOI: 10.1109/TSP.2016.2637324.

Particle filters are broadly used to approximate posterior distributions of hidden states in state-space models by means of sets of weighted particles. While the convergence of the filter is guaranteed when the number of particles tends to infinity, the quality of the approximation is usually unknown but strongly dependent on the number of particles. In this paper, we propose a novel method for assessing the convergence of particle filters in an online manner, as well as a simple scheme for the online adaptation of the number of particles based on the convergence assessment. The method is based on a sequential comparison between the actual observations and their predictive probability distributions approximated by the filter. We provide a rigorous theoretical analysis of the proposed methodology and, as an example of its practical use, we present simulations of a simple algorithm for the dynamic and online adaptation of the number of particles during the operation of a particle filter on a stochastic version of the Lorenz 63 system.
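
A hedged sketch of the general idea on a toy 1-D model, not the authors' algorithm: run a bootstrap particle filter and, at each step, check where the actual observation falls within the particle-predicted observations; if it repeatedly lands in the tails, grow the particle set, otherwise shrink it. The thresholds, window length and model constants are arbitrary assumptions.

```python
# Bootstrap particle filter with a crude online predictive check that adapts the particle count.
import numpy as np

rng = np.random.default_rng(0)
T, Nmin, Nmax = 200, 50, 2000
x_true = 0.0
particles = rng.normal(0.0, 1.0, 100)                    # initial particle set
extreme = []                                             # did the observation fall in the tails?

for t in range(T):
    x_true = 0.9 * x_true + rng.normal(0.0, 0.5)         # true state transition
    y = x_true + rng.normal(0.0, 0.5)                    # noisy observation

    particles = 0.9 * particles + rng.normal(0.0, 0.5, particles.size)  # propagate
    y_pred = particles + rng.normal(0.0, 0.5, particles.size)           # predictive observations
    rank = np.mean(y_pred < y)                           # where y sits in the predictive distribution
    extreme.append(rank < 0.05 or rank > 0.95)

    w = np.exp(-0.5 * ((y - particles) / 0.5) ** 2) + 1e-12
    w /= w.sum()
    particles = rng.choice(particles, particles.size, p=w)              # resample

    if len(extreme) >= 10:                               # adapt N over a sliding window
        frac = np.mean(extreme[-10:])
        if frac > 0.2:                                   # predictive checks often fail: add particles
            particles = rng.choice(particles, min(Nmax, 2 * particles.size))
        elif frac == 0.0:                                # checks comfortably pass: shed particles
            particles = rng.choice(particles, max(Nmin, particles.size // 2))

print("final number of particles:", particles.size)
```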

A new Kalman filter that is more robust under certain deviations from the Gaussian hypothesis

Badong Chen, Xi Liu, Haiquan Zhao, Jose C. Principe, Maximum correntropy Kalman filter, Automatica, Volume 76, February 2017, Pages 70-77, ISSN 0005-1098, DOI: 10.1016/j.automatica.2016.10.004.

Traditional Kalman filter (KF) is derived under the well-known minimum mean square error (MMSE) criterion, which is optimal under Gaussian assumption. However, when the signals are non-Gaussian, especially when the system is disturbed by some heavy-tailed impulsive noises, the performance of KF will deteriorate seriously. To improve the robustness of KF against impulsive noises, we propose in this work a new Kalman filter, called the maximum correntropy Kalman filter (MCKF), which adopts the robust maximum correntropy criterion (MCC) as the optimality criterion, instead of using the MMSE. Similar to the traditional KF, the state mean vector and covariance matrix propagation equations are used to give prior estimations of the state and covariance matrix in MCKF. A novel fixed-point algorithm is then used to update the posterior estimations. A sufficient condition that guarantees the convergence of the fixed-point algorithm is also given. Illustration examples are presented to demonstrate the effectiveness and robustness of the new algorithm.
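
A compact sketch of a correntropy-weighted update, my simplified reading of the idea rather than the paper's exact algorithm: keep the usual KF prediction, but in the update run a fixed-point iteration where prior and measurement covariances are rescaled by Gaussian-kernel weights of the normalised residuals, which down-weights impulsive outliers. The matrices, kernel bandwidth and toy data are assumptions.

```python
# Simplified maximum-correntropy-style Kalman update on a toy 1-D system with one outlier.
import numpy as np

def gauss_kernel(e2, sigma):
    return np.exp(-e2 / (2.0 * sigma ** 2))

def mckf_step(x, P, y, A, C, Q, R, sigma=2.0, iters=10):
    x_prior = A @ x                                      # usual KF prediction
    P_prior = A @ P @ A.T + Q
    x_post = x_prior.copy()
    for _ in range(iters):                               # fixed-point measurement update
        r_y = y - C @ x_post
        r_x = x_post - x_prior
        Ly = gauss_kernel(r_y @ np.linalg.solve(R, r_y), sigma)
        Lx = gauss_kernel(r_x @ np.linalg.solve(P_prior, r_x), sigma) if np.any(r_x) else 1.0
        P_t = P_prior / max(Lx, 1e-8)                    # kernel-weighted covariances
        R_t = R / max(Ly, 1e-8)
        K = P_t @ C.T @ np.linalg.inv(C @ P_t @ C.T + R_t)
        x_post = x_prior + K @ (y - C @ x_prior)
    I = np.eye(len(x))
    P_post = (I - K @ C) @ P_prior @ (I - K @ C).T + K @ R @ K.T
    return x_post, P_post

A = np.eye(1); C = np.eye(1); Q = 0.01 * np.eye(1); R = 0.5 * np.eye(1)
x, P = np.zeros(1), np.eye(1)
for y in [0.1, -0.2, 8.0, 0.05, 0.0]:                    # 8.0 is an impulsive outlier
    x, P = mckf_step(x, P, np.array([y]), A, C, Q, R)
    print(float(x[0]))
```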

A promising survey on robust estimation methods aimed at robotic applications

Michael Bosse, Gabriel Agamennoni and Igor Gilitschenski (2016), “Robust Estimation and Applications in Robotics”, Foundations and Trends® in Robotics: Vol. 4: No. 4, pp 225-269. DOI: 10.1561/2300000047.

Solving estimation problems is a fundamental component of numerous robotics applications. Prominent examples involve pose estimation, point cloud alignment, or object tracking. Algorithms for solving these estimation problems need to cope with new challenges due to an increased use of potentially poor low-cost sensors, and an ever growing deployment of robotic algorithms in consumer products which operate in potentially unknown environments. These algorithms need to be capable of being robust against strong nonlinearities, high uncertainty levels, and numerous outliers. However, particularly in robotics, the Gaussian assumption is prevalent in solutions to multivariate parameter estimation problems without providing the desired level of robustness. The goal of this tutorial is helping to address the aforementioned challenges by providing an introduction to robust estimation with a particular focus on robotics. First, this is achieved by giving a concise overview of the theory on M-estimation. M-estimators share many of the convenient properties of least-squares estimators, and at the same time are much more robust to deviations from the Gaussian model assumption. Second, we present several example applications where M-Estimation is used to increase robustness against nonlinearities and outliers.
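
A small illustration of M-estimation, the tutorial's central tool: robust line fitting by iteratively reweighted least squares with a Huber weight function, which behaves like least squares for small residuals but down-weights gross outliers. The data, Huber threshold and iteration count are toy assumptions.

```python
# Robust linear regression via IRLS with Huber weights.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.05, 50)
y[::10] += 5.0                                           # inject gross outliers

A = np.column_stack([x, np.ones_like(x)])
theta = np.linalg.lstsq(A, y, rcond=None)[0]             # ordinary least-squares start

k = 0.1                                                  # Huber threshold
for _ in range(20):
    r = y - A @ theta
    w = np.where(np.abs(r) <= k, 1.0, k / np.abs(r))     # Huber weights
    W = np.diag(w)
    theta = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)    # weighted least-squares step

print("robust estimate of [slope, intercept]:", theta)
```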

A new hierarchical maximum-likelihood method for clustering data, with several advantages over k-means and other common approaches

A. Sharma, K. A. Boroevich, D. Shigemizu, Y. Kamatani, M. Kubo and T. Tsunoda, “Hierarchical Maximum Likelihood Clustering Approach,” in IEEE Transactions on Biomedical Engineering, vol. 64, no. 1, pp. 112-122, Jan. 2017. DOI: 10.1109/TBME.2016.2542212.

In this paper, we focused on developing a clustering approach for biological data. In many biological analyses, such as multiomics data analysis and genome-wide association studies analysis, it is crucial to find groups of data belonging to subtypes of diseases or tumors. Methods: Conventionally, the k-means clustering algorithm is overwhelmingly applied in many areas including biological sciences. There are, however, several alternative clustering algorithms that can be applied, including support vector clustering. In this paper, taking into consideration the nature of biological data, we propose a maximum likelihood clustering scheme based on a hierarchical framework. Results: This method can perform clustering even when the data belonging to different groups overlap. It can also perform clustering when the number of samples is lower than the data dimensionality. Conclusion: The proposed scheme is free from selecting initial settings to begin the search process. In addition, it does not require the computation of the first and second derivative of likelihood functions, as is required by many other maximum likelihood-based methods. Significance: This algorithm uses distribution and centroid information to cluster a sample and was applied to biological data. A MATLAB implementation of this method can be downloaded from the web link http://www.riken.jp/en/research/labs/ims/med_sci_math/.
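
For orientation only, a generic hierarchical clustering sketch rather than the authors' HML algorithm: Ward's agglomerative linkage, which greedily merges the pair of clusters giving the smallest increase in within-cluster variance (equivalently, the largest likelihood under an isotropic Gaussian model). The toy data mimic two overlapping groups with few samples per dimension.

```python
# Ward agglomerative clustering on two overlapping Gaussian groups.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 1.0, (20, 5)),
                  rng.normal(1.5, 1.0, (20, 5))])        # two overlapping groups

Z = linkage(data, method="ward")                         # full hierarchical merge tree
labels = fcluster(Z, t=2, criterion="maxclust")          # cut the tree into two clusters
print(labels)
```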

Bayesian estimation when computing the likelihood is hard

Kirthevasan Kandasamy, Jeff Schneider, Barnabás Póczos, Query efficient posterior estimation in scientific experiments via Bayesian active learning, Artificial Intelligence, Volume 243, February 2017, Pages 45-56, ISSN 0004-3702, DOI: 10.1016/j.artint.2016.11.002.

A common problem in disciplines of applied Statistics research such as Astrostatistics is of estimating the posterior distribution of relevant parameters. Typically, the likelihoods for such models are computed via expensive experiments such as cosmological simulations of the universe. An urgent challenge in these research domains is to develop methods that can estimate the posterior with few likelihood evaluations. In this paper, we study active posterior estimation in a Bayesian setting when the likelihood is expensive to evaluate. Existing techniques for posterior estimation are based on generating samples representative of the posterior. Such methods do not consider efficiency in terms of likelihood evaluations. In order to be query efficient we treat posterior estimation in an active regression framework. We propose two myopic query strategies to choose where to evaluate the likelihood and implement them using Gaussian processes. Via experiments on a series of synthetic and real examples we demonstrate that our approach is significantly more query efficient than existing techniques and other heuristics for posterior estimation.
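
A hedged sketch of the active-regression idea, not the paper's exact acquisition rules: model the expensive log-likelihood surface with a Gaussian process and greedily query the parameter where the surrogate is most uncertain, so that few expensive evaluations are spent. The "expensive" likelihood below is a cheap stand-in, and the grid and query budget are assumptions.

```python
# GP surrogate of an expensive log-likelihood with uncertainty-driven queries.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_loglik(theta):                             # stand-in for a costly simulation
    return -0.5 * ((theta - 0.7) / 0.1) ** 2

grid = np.linspace(0, 1, 300).reshape(-1, 1)
X, y = [[0.5]], [expensive_loglik(0.5)]

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1))
    gp.fit(np.array(X), np.array(y))
    _, sd = gp.predict(grid, return_std=True)
    theta_next = float(grid[np.argmax(sd), 0])           # query where the surrogate is least certain
    X.append([theta_next]); y.append(expensive_loglik(theta_next))

gp.fit(np.array(X), np.array(y))                         # final surrogate of the log-likelihood
mu, _ = gp.predict(grid, return_std=True)
post = np.exp(mu - mu.max())
post /= post.sum()                                       # discrete (grid) posterior approximation
print("posterior mode estimate:", float(grid[np.argmax(post), 0]))
```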

Subgraph matching (isomorphism) on GPUs for managing common-sense knowledge, with a short list of other graph problems that have benefited from parallel processing

Ha-Nguyen Tran, Erik Cambria, Amir Hussain, Towards GPU-Based Common-Sense Reasoning: Using Fast Subgraph Matching, Cognitive Computation, December 2016, Volume 8, Issue 6, pp 1074–1086, DOI: 10.1007/s12559-016-9418-4.

Common-sense reasoning is concerned with simulating cognitive human ability to make presumptions about the type and essence of ordinary situations encountered every day. The most popular way to represent common-sense knowledge is in the form of a semantic graph. Such type of knowledge, however, is known to be rather extensive: the more concepts added in the graph, the harder and slower it becomes to apply standard graph mining techniques. In this work, we propose a new fast subgraph matching approach to overcome these issues. Subgraph matching is the task of finding all matches of a query graph in a large data graph, which is known to be a non-deterministic polynomial time-complete problem. Many algorithms have been previously proposed to solve this problem using central processing units. Here, we present a new graphics processing unit-friendly method for common-sense subgraph matching, termed GpSense, which is designed for scalable massively parallel architectures, to enable next-generation Big Data sentiment analysis and natural language processing applications. We show that GpSense outperforms state-of-the-art algorithms and efficiently answers subgraph queries on large common-sense graphs.
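
To make the task concrete, here is a CPU baseline for the subgraph-matching problem itself (not the GpSense GPU method): networkx's VF2-based matcher enumerates all occurrences of a small query graph in a larger data graph. The tiny graphs below are toy stand-ins for a common-sense semantic network.

```python
# Enumerate all matches of a triangle query pattern inside a toy concept graph.
import networkx as nx
from networkx.algorithms import isomorphism

data_graph = nx.Graph()
data_graph.add_edges_from([("cake", "dessert"), ("dessert", "sweet"),
                           ("cake", "birthday"), ("ice_cream", "dessert"),
                           ("ice_cream", "sweet")])

query = nx.Graph()
query.add_edges_from([("a", "b"), ("b", "c"), ("a", "c")])   # triangle pattern

matcher = isomorphism.GraphMatcher(data_graph, query)
for mapping in matcher.subgraph_isomorphisms_iter():          # all subgraph matches
    print(mapping)
```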