Category Archives: Probability And Statistics

A particular application of quick detection of changes in a signal: detecting changes of voltage regimes in the electric distribution network

D. Macii and D. Petri, Rapid Voltage Change Detection: Limits of the IEC Standard Approach and Possible Solutions, IEEE Transactions on Instrumentation and Measurement, vol. 69, no. 2, pp. 382-392, Feb. 2020, DOI: 10.1109/TIM.2019.2903617.

Rapid voltage changes (RVCs) are power quality (PQ) events characterized by small and fast transitions between two steady-state root-mean-square (rms) voltage levels. RVCs occur quite often at the distribution level and are expected to be even more frequent in the future due to the increasing penetration of dynamic loads and renewable-based generators in the smart grid. Unlike other PQ events, RVCs are less critical, but also more difficult to detect than dips/sags and swells, due to their smaller voltage variations. Nevertheless, they can be harmful to generators’ control systems and electronic equipment in general. Moreover, they strongly affect flicker. The IEC Standard 61000-4-3:2015 clearly describes an algorithm for RVC detection. However, this approach is poorly characterized in the scientific literature. In fact, it suffers from some drawbacks. In this paper, some of them (e.g., rate-dependent detection limits and detection delays) are analyzed in depth. In addition, an alternative approach based on the estimation of the rate of change of rms voltage is proposed. Multiple simulation results show that the approach considered is more sensitive to noise, but also faster, especially when not so fast RVCs occur. Moreover, it allows measuring the rate of change of rms voltage, which is currently disregarded in the IEC Standard.

Quantizing a continuous POMDP into a finite MDP to preserve optimality

Naci Saldi; Serdar Yüksel; Tamás Linder, Asymptotic Optimality of Finite Model Approximations for Partially Observed Markov Decision Processes With Discounted Cost, IEEE Transactions on Automatic Control ( Volume: 65, Issue: 1, Jan. 2020), DOI: 10.1109/TAC.2019.2907172.

We consider finite model approximations of discrete-time partially observed Markov decision processes (POMDPs) under the discounted cost criterion. After converting the original partially observed stochastic control problem to a fully observed one on the belief space, the finite models are obtained through the uniform quantization of the state and action spaces of the belief space Markov decision process (MDP). Under mild assumptions on the components of the original model, it is established that the policies obtained from these finite models are nearly optimal for the belief space MDP, and so, for the original partially observed problem. The assumptions essentially require that the belief space MDP satisfies a mild weak continuity condition. We provide an example and introduce explicit approximation procedures for the quantization of the set of probability measures on the state space of POMDP (i.e., belief space).

A universal approximator for the value function in continuous-state VI

William B. Haskell; Rahul Jain; Hiteshi Sharma; Pengqian Yu, TA Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs, IEEE Transactions on Automatic Control ( Volume: 65, Issue: 1, Jan. 2020), DOI: 10.1109/TAC.2019.2907414.

We propose universal randomized function approximation-based empirical value learning (EVL) algorithms for Markov decision processes. The “empirical” nature comes from each iteration being done empirically from samples available from simulations of the next state. This makes the Bellman operator a random operator. A parametric and a nonparametric method for function approximation using a parametric function space and a reproducing kernel Hilbert space respectively are then combined with EVL. Both function spaces have the universal function approximation property. Basis functions are picked randomly. Convergence analysis is performed using a random operator framework with techniques from the theory of stochastic dominance. Finite time sample complexity bounds are derived for both universal approximate dynamic programming algorithms. Numerical experiments support the versatility and computational tractability of this approach.

Nice related work on change-point detection and a novel algorithm for off-line detection of abrupt changes in multivariate signals

Charles Truong; Laurent Oudre; Nicolas Vayatis, Greedy Kernel Change-Point Detection, IEEE Transactions on Signal Processing ( Volume: 67, Issue: 24, Dec.15, 15 2019), DOI: 10.1109/TSP.2019.2953670.

We consider the problem of detecting abrupt changes in the underlying stochastic structure of multivariate signals. A novel non-parametric and model-free off-line change-point detection method based on a kernel mapping is presented. This approach is sequential and alternates between two steps: a greedy detection to estimate a new breakpoint and a projection to remove its contribution to the signal. The resulting algorithm is able to segment time series for which no accurate model is available: it is computationally more efficient than exact kernel change-point detection and more precise than window-based approximations. The proposed method also offers some theoretical consistency properties. For the special case of a linear kernel, an even faster implementation is provided. The proposed strategy is compared to standard parametric and non-parametric procedures on a real-world data set composed of 262 accelerometer recordings.

Estimating aging of integrated circuits with machine learning

Ke Huang; Xinqiao Zhang; Naghmeh Karimi, Real-Time Prediction for IC Aging Based on Machine Learning, IEEE Transactions on Instrumentation and Measurement, Volume: 68, Issue: 12, Dec. 2019, DOI: 10.1109/TIM.2019.2899477.

Estimating the aging-related degradation and failure of nanoscale integrated circuits (ICs), before they actually occur, is crucial for developing aging prevention/mitigation actions and in turn avoiding unexpected in-field circuit failures. Real-time monitoring of IC operating conditions can be efficiently used for predicting aging degradation and in turn timing failures caused by device aging. The existing approaches only take some specific operating conditions (e.g., workload or temperature) into account. In this paper, we propose a novel method for real-time IC aging prediction by extending the prediction schemes to a comprehensive model which takes into account any time-variant dynamic operating conditions relevant to aging prediction. Using a machine learning prediction model and the notion of equivalent aging time, we show that our approach outperforms the existing methods in terms of aging-prediction accuracy under different scenarios of time-variant operating conditions.

On the effects of large variances in the transition function for Q-learning

D. Lee and W. B. Powell, Bias-Corrected Q-Learning With Multistate Extension. IEEE Transactions on Automatic Control, vol. 64, no. 10, pp. 4011-4023, DOI: 10.1109/TAC.2019.2912443.

Q-learning is a sample-based model-free algorithm that solves Markov decision problems asymptotically, but in finite time, it can perform poorly when random rewards and transitions result in large variance of value estimates. We pinpoint its cause to be the estimation bias due to the maximum operator in Q-learning algorithm, and present the evidence of max-operator bias in its Q value estimates. We then present an asymptotically optimal bias-correction strategy and construct an extension to bias-corrected Q-learning algorithm to multistate Markov decision processes, with asymptotic convergence properties as strong as those from Q-learning. We report the empirical performance of the bias-corrected Q-learning algorithm with multistate extension in two model problems: A multiarmed bandit version of Roulette and an electricity storage control simulation. The bias-corrected Q-learning algorithm with multistate extension is shown to control max-operator bias effectively, where the bias-resistance can be tuned predictably by adjusting a correction parameter.

A nice (short) survey of deep RL

Matthew Botvinick, Sam Ritter, Jane X. Wang, Zeb Kurth-Nelson, Charles Blundell, Demis Hassabis, Reinforcement Learning, Fast and Slow, Trends in Cognitive Sciences, Volume 23, Issue 5, 2019, Pages 408-422 DOI: 10.1016/j.tics.2019.02.006.

Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker. This progress has drawn the attention of cognitive scientists interested in understanding human learning. However, the concern has been raised that deep RL may be too sample-inefficient – that is, it may simply be too slow – to provide a plausible model of how humans learn. In the present review, we counter this critique by describing recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods. Although these techniques were developed in an AI context, we propose that they may have rich implications for psychology and neuroscience. A key insight, arising from these AI methods, concerns the fundamental connection between fast RL and slower, more incremental forms of learning.

An interesting review of criticisms of deep learning in cognitive science

Radoslaw M. Cichy, Daniel Kaiser, Deep Neural Networks as Scientific Models, Trends in Cognitive Sciences, Volume 23, Issue 4, 2019, Pages 305-317, DOI: 10.1016/j.tics.2019.01.009.

Artificial deep neural networks (DNNs) initially inspired by the brain enable computers to solve cognitive tasks at which humans excel. In the absence of explanations for such cognitive phenomena, in turn cognitive scientists have started using DNNs as models to investigate biological cognition and its neural basis, creating heated debate. Here, we reflect on the case from the perspective of philosophy of science. After putting DNNs as scientific models into context, we discuss how DNNs can fruitfully contribute to cognitive science. We claim that beyond their power to provide predictions and explanations of cognitive phenomena, DNNs have the potential to contribute to an often overlooked but ubiquitous and fundamental use of scientific models: exploration.

A new framework for fitting jump models

Alberto Bemporad, Valentina Breschi, Dario Piga, Stephen P. Boyd, Fitting jump models, Automatica, Volume 96, 2018, Pages 11-21, DOI: 10.1016/j.automatica.2018.06.022.

We describe a new framework for fitting jump models to a sequence of data. The key idea is to alternate between minimizing a loss function to fit multiple model parameters, and minimizing a discrete loss function to determine which set of model parameters is active at each data point. The framework is quite general and encompasses popular classes of models, such as hidden Markov models and piecewise affine models. The shape of the chosen loss functions to minimize determines the shape of the resulting jump model.

Regression to help in finding the optimal policy in MDPs based on duality theory

H. Zhu, F. Ye and E. Zhou, Solving the Dual Problems of Dynamic Programs via Regression, IEEE Transactions on Automatic Control, vol. 63, no. 5, pp. 1340-1355, DOI: 10.1109/TAC.2017.2747405.

In recent years, information relaxation and duality in dynamic programs have been studied extensively, and the resulted primal-dual approach has become a powerful procedure in solving dynamic programs by providing lower-upper bounds on the optimal value function. Theoretically, with the so-called value-based optimal dual penalty, the optimal value function could be recovered exactly via strong duality. However, in practice, obtaining tight dual bounds usually requires good approximations of the optimal dual penalty, which could be time consuming if analytical computation is not possible and nested simulation has to be used to estimate the conditional expectations inside the dual penalty. In this paper, we will develop a framework of a regression approach to approximating the optimal dual penalty in a nonnested manner, by exploring the structure of the function space consisting of all feasible dual penalties. The resulted approximations maintain to be feasible dual penalties, and thus yielding valid dual bounds on the optimal value function. We show that the proposed framework is computationally efficient, and the resulted dual penalties lead to numerically tractable dual problems. Finally, we apply the framework to a high-dimensional dynamic trading problem to demonstrate its effectiveness in solving the dual problems of complex dynamic programs.