Tag Archives: Outlier Detection

Estimating speed from inertial data by dealing with noise and outliers

W. Xu, X. Peng and L. Kneip, Tight Fusion of Events and Inertial Measurements for Direct Velocity Estimation, IEEE Transactions on Robotics, vol. 40, pp. 240-256, 2024 DOI: 10.1109/TRO.2023.3333108.

Traditional visual-inertial state estimation targets absolute camera poses and spatial landmark locations while first-order kinematics are typically resolved as an implicitly estimated substate. However, this poses a risk in velocity-based control scenarios, as the quality of the estimation of kinematics depends on the stability of absolute camera and landmark coordinates estimation. To address this issue, we propose a novel solution to tight visual–inertial fusion directly at the level of first-order kinematics by employing a dynamic vision sensor instead of a normal camera. More specifically, we leverage trifocal tensor geometry to establish an incidence relation that directly depends on events and camera velocity, and demonstrate how velocity estimates in highly dynamic situations can be obtained over short-time intervals. Noise and outliers are dealt with using a nested two-layer random sample consensus (RANSAC) scheme. In addition, smooth velocity signals are obtained from a tight fusion with preintegrated inertial signals using a sliding window optimizer. Experiments on both simulated and real data demonstrate that the proposed tight event-inertial fusion leads to continuous and reliable velocity estimation in highly dynamic scenarios independently of absolute coordinates. Furthermore, in extreme cases, it achieves more stable and more accurate estimation of kinematics than traditional, point-position-based visual-inertial odometry.
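For readers unfamiliar with the RANSAC idea mentioned in the abstract, here is a minimal sketch of a plain, single-layer RANSAC on a toy line-fitting problem. It is not the nested two-layer scheme of the paper and has nothing to do with events or trifocal tensors; the function names, the threshold, and the synthetic data are all assumptions made for illustration.

```python
import random

def fit_line(p, q):
    """Return (slope, intercept) of the line through two points, or None if vertical."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

def ransac_line(points, threshold=0.5, iterations=200, seed=0):
    """Single-layer RANSAC: keep the 2-point model with the largest consensus set."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = fit_line(*rng.sample(points, 2))  # minimal sample: two points
        if model is None:
            continue
        slope, intercept = model
        # Consensus set: points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Hypothetical data: 20 points on y = 2x + 1 plus three gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(20)]
outliers = [(3.0, 40.0), (10.0, -15.0), (15.0, 90.0)]
model, inliers = ransac_line(points + outliers)
```

In this toy case the three gross outliers never join the consensus set of the true line, so `model` recovers the slope and intercept of the clean data. A nested scheme would run a second sampling loop inside the model-generation step, which is not shown here.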

Reducing outliers in time series with singular spectrum analysis and use of deep learning for change detection

Muktesh Gupta, Rajesh Wadhvani, Akhtar Rasool, Real-time Change-Point Detection: A deep neural network-based adaptive approach for detecting changes in multivariate time series data, Expert Systems with Applications, Volume 209, 2022 DOI: 10.1016/j.eswa.2022.118260.

The behavior of a time series may be affected by various factors. Changes in mean, variance, frequency, and auto-correlation are the most common. Change-Point Detection (CPD) aims to track down abrupt statistical characteristic changes in time series that can benefit many applications in different domains. As demonstrated in recently introduced CPD methodologies, deep learning approaches have the potential to identify more subtle changes. However, due to improper handling of data and insufficient training, these methodologies generate more false alarms and are not efficient enough in detecting change-points. In real-time CPD algorithms, preprocessed data plays a vital role in increasing the algorithm’s efficiency and minimizing false alarm rates. Therefore, preprocessing of data should be a part of the algorithm, but in the existing methods, preprocessing of data is done initially, and then the whole dataset is passed to the CPD algorithm. A new three-phase architecture is proposed to address this issue, in which all phases, from preprocessing to CPD, work in an adaptive manner. The phases are integrated into a pipeline, allowing the algorithm to work in real-time. Our proposed strategy performs optimally and consistently based on performance metrics resulting from experiments on real-world datasets and artifacts. This work effectively addresses the issue of non-stationary data normalization using deep learning approaches. To reduce noise and outliers from the data, a recursive version of singular spectrum analysis is introduced. It is demonstrated that the method’s performance has significantly improved by combining adaptive preprocessing with deep learning CPD techniques.
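As a rough illustration of how singular spectrum analysis damps outliers, the sketch below implements the basic batch form of SSA (embedding, decomposition, diagonal averaging) and keeps only the leading component. It is not the recursive variant introduced in the paper; the function name, the window length, and the use of power iteration in place of a full SVD are assumptions chosen to keep the example dependency-free.

```python
from math import sqrt

def ssa_rank1(series, window, iterations=100):
    """Basic SSA keeping only the leading component of the trajectory matrix."""
    n = len(series)
    k = n - window + 1
    # Step 1, embedding: trajectory matrix X[i][j] = series[i + j].
    X = [[series[i + j] for j in range(k)] for i in range(window)]
    # Step 2, decomposition: power iteration on X X^T for the leading
    # left singular vector u (a stand-in for a full SVD).
    u = [1.0 / sqrt(window)] * window
    for _ in range(iterations):
        v = [sum(X[i][j] * u[i] for i in range(window)) for j in range(k)]  # X^T u
        w = [sum(X[i][j] * v[j] for j in range(k)) for i in range(window)]  # X X^T u
        norm = sqrt(sum(a * a for a in w))
        if norm == 0.0:
            return list(series)  # all-zero input: nothing to decompose
        u = [a / norm for a in w]
    v = [sum(X[i][j] * u[i] for i in range(window)) for j in range(k)]
    # Step 3, reconstruction: average the rank-1 matrix u v^T along anti-diagonals.
    total = [0.0] * n
    count = [0] * n
    for i in range(window):
        for j in range(k):
            total[i + j] += u[i] * v[j]
            count[i + j] += 1
    return [t / c for t, c in zip(total, count)]

# Hypothetical series: a constant level of 5 with one spike of 15 at index 10.
series = [5.0] * 30
series[10] = 15.0
smoothed = ssa_rank1(series, window=5)
```

On this toy series the spike at index 10 is pulled down from 15 to about 7, at the cost of slightly raising its neighbours, while samples far from the spike stay at 5. Keeping more components or changing the window trades smoothing against fidelity.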

NOTE: See also C. Ma, L. Zhang, W. Pedrycz and W. Lu, “The Long-Term Prediction of Time Series: A Granular Computing-Based Design Approach,” in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 52, no. 10, pp. 6326-6338, Oct. 2022, doi: 10.1109/TSMC.2022.3144395.

See also https://babel.isa.uma.es/kipr/?p=1548

New algorithms for outlier detection with applications in robotics

P. Antonante, V. Tzoumas, H. Yang and L. Carlone, Outlier-Robust Estimation: Hardness, Minimally Tuned Algorithms, and Applications, IEEE Transactions on Robotics, vol. 38, no. 1, pp. 281-301, Feb. 2022 DOI: 10.1109/TRO.2021.3094984.

Nonlinear estimation in robotics and vision is typically plagued with outliers due to wrong data association or incorrect detections from signal processing and machine learning methods. This article introduces two unifying formulations for outlier-robust estimation, generalized maximum consensus (G-MC) and generalized truncated least squares (G-TLS), and investigates fundamental limits, practical algorithms, and applications. Our first contribution is a proof that outlier-robust estimation is inapproximable: In the worst case, it is impossible to (even approximately) find the set of outliers, even with slower-than-polynomial-time algorithms (particularly, algorithms running in quasi-polynomial time). As a second contribution, we review and extend two general-purpose algorithms. The first, adaptive trimming (ADAPT), is combinatorial and is suitable for G-MC; the second, graduated nonconvexity (GNC), is based on homotopy methods and is suitable for G-TLS. We extend ADAPT and GNC to the case where the user does not have prior knowledge of the inlier-noise statistics (or the statistics may vary over time) and is unable to guess a reasonable threshold to separate inliers from outliers (as the one commonly used in RANdom SAmple Consensus (RANSAC)). We propose the first minimally tuned algorithms for outlier rejection, which dynamically decide how to separate inliers from outliers. Our third contribution is an evaluation of the proposed algorithms on robot perception problems: mesh registration, image-based object detection (shape alignment), and pose graph optimization. ADAPT and GNC execute in real time, are deterministic, outperform RANSAC, and are robust up to 80–90% outliers. Their minimally tuned versions also compare favorably with the state of the art, even though they do not rely on a noise bound for the inliers.
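To give a feel for graduated nonconvexity with a truncated least squares cost, here is a small sketch on the simplest possible problem: robustly estimating a scalar from measurements with gross outliers. The weight update and the schedule for the control parameter `mu` follow the GNC-TLS form as I recall it from the earlier GNC literature; treat them as an assumption rather than a transcription of this paper. The sketch also needs a user-supplied `noise_bound`, which is exactly what the minimally tuned versions in the paper avoid. All names are hypothetical.

```python
from math import sqrt

def gnc_tls_location(values, noise_bound, mu_growth=1.4, max_iters=100):
    """Robust scalar estimate: alternate weight updates and weighted least squares."""
    c2 = noise_bound ** 2
    weights = [1.0] * len(values)
    estimate = sum(values) / len(values)            # non-robust least-squares start
    r2_max = max((v - estimate) ** 2 for v in values)
    if r2_max <= c2:
        return estimate, weights                    # every residual is already within the bound
    mu = c2 / (2.0 * r2_max - c2)                   # small mu: heavily smoothed surrogate cost
    for _ in range(max_iters):
        lo = mu / (mu + 1.0) * c2                   # below this squared residual: weight 1
        hi = (mu + 1.0) / mu * c2                   # above this squared residual: weight 0
        weights = []
        for v in values:
            r2 = (v - estimate) ** 2
            if r2 >= hi:
                weights.append(0.0)
            elif r2 <= lo:
                weights.append(1.0)
            else:
                weights.append(sqrt(c2 * mu * (mu + 1.0) / r2) - mu)
        total = sum(weights)
        if total == 0.0:
            break                                   # everything rejected; keep last estimate
        # Variable update: weighted least squares, here simply a weighted mean.
        estimate = sum(w * v for w, v in zip(weights, values)) / total
        if all(w in (0.0, 1.0) for w in weights):
            break                                   # weights are binary: inliers and outliers separated
        mu *= mu_growth                             # gradually restore the nonconvex TLS cost
    return estimate, weights

# Hypothetical measurements: five values near 1.0 and three gross outliers.
values = [1.0, 1.1, 0.9, 1.05, 0.95, 10.0, 12.0, -8.0]
estimate, weights = gnc_tls_location(values, noise_bound=0.3)
```

The procedure is deterministic, which is one of the properties the abstract highlights relative to RANSAC: there is no random sampling, only a sequence of weighted least-squares problems with progressively sharper weights.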

A really nice comparison of different outlier detection methods

Hamzeh Alimohammadi, Shengnan Nancy Chen, Performance evaluation of outlier detection techniques in production timeseries: A systematic review and meta-analysis, Expert Systems with Applications, Volume 191, 2022 DOI: 10.1016/j.eswa.2021.116371.

Time-series data have been extensively collected and analyzed in many disciplines, such as stock market, medical diagnosis, meteorology, and oil and gas industry. Numerous data in these disciplines are sequence of observations measured as functions of time, which can be further used for different applications via analytical or data analytics techniques (e.g., to forecast future price, climate change, etc.). However, presence of outliers can cause significant uncertainties to interpretation results; hence, it is essential to remove the outliers accurately and efficiently before conducting any further analysis. A total of 17 techniques that belong to statistical, regression-based, and machine learning (ML) based categories for outlier detection in timeseries are applied to the oil and gas production data analysis. 15 of these methods are utilized for production data analysis for the first time. Two state-of-the-art and high-performance techniques are then selected for data cleaning which require minimum control and time complexity. Moreover, performances of these techniques are evaluated based on several metrics including the accuracy, precision, recall, F1 score, and Cohen’s Kappa to rank the techniques. Results show that eight unsupervised algorithms outperform the rest of the methods based on the synthetic case study with known outliers. For example, accuracies of the eight shortlisted methods are in the range of 0.83–0.99 with a precision between 0.83 and 0.98, compared to 0.65–0.82 and 0.07–0.77 for the others. In addition, ML-based techniques perform better than statistical techniques.
Our experimental results on real field data further indicate that the k-nearest neighbor (KNN) and Fulford-Blasingame methods are superior to other outlier detection frameworks for outlier detection in production data, followed by four others including density-based spatial clustering of applications with noise (DBSCAN), and angle-based outlier detection (ABOD). Even though the techniques are examined with oil and gas production data, the same data cleaning workflow can be used to detect timeseries’ outliers in other disciplines.
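The ranking metrics named in the abstract are all derived from the same confusion matrix. The sketch below computes them for binary outlier labels (1 = outlier, 0 = normal); the function name and the toy labels are made up for illustration and are not data from the paper.

```python
def outlier_detection_metrics(truth, predicted):
    """Accuracy, precision, recall, F1 and Cohen's kappa for binary outlier labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Agreement expected by chance, from the marginal label frequencies.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected) if expected != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}

# Hypothetical labels: 3 true outliers among 10 samples, detector finds 2 of them
# and raises one false alarm.
truth     = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
metrics = outlier_detection_metrics(truth, predicted)
```

Here accuracy is 0.8 while Cohen's kappa is only about 0.52. Outliers are rare, so a detector can score high accuracy largely by chance agreement on the normal class, and kappa corrects for that.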