Using CNNs trained with image data to predict time series data

Aniello De Santo, Antonino Ferraro, Antonio Galli, Vincenzo Moscato, Giancarlo Sperlì, Evaluating time series encoding techniques for Predictive Maintenance, Expert Systems with Applications, Volume 210, 2022 DOI: 10.1016/j.eswa.2022.118435.

Predictive Maintenance has become an important component in modern industrial scenarios, as a way to minimize down-times and fault rate for different equipment. In this sense, while machine learning and deep learning approaches are promising due to their accurate predictive abilities, their data-heavy requirements make them significantly limited in real world applications. Since one of the main issues to overcome is lack of consistent training data, recent work has explored the possibility of adapting well-known deep-learning models for image recognition, by exploiting techniques to encode time series as images. In this paper, we propose a framework for evaluating some of the best known time series encoding techniques, together with Convolutional Neural Network-based image classifiers applied to predictive maintenance tasks. We conduct an extensive empirical evaluation of these approaches for the failure prediction task on two real-world datasets (PAKDD2020 Alibaba AI OPS Competition and NASA bearings), also comparing their performances with respect to the state-of-the-art approaches. We further discuss advantages and limitations of the exploited models when coupled with proper data augmentation techniques.
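
To make "encoding a time series as an image" concrete, here is a minimal sketch of one widely used encoding, the Gramian Angular Summation Field. The abstract does not say which encodings the paper evaluates, so the choice of this one, the function name and the toy series are all assumptions made for illustration.

```python
import math

def gramian_angular_summation_field(series):
    """Encode a 1-D series as a square image (list of rows).

    Illustrative sketch only; not taken from the paper.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    # Rescale every sample to the interval [-1, 1].
    scaled = [((x - hi) + (x - lo)) / (hi - lo) for x in series]
    # Map each sample to an angle; clamp to guard against rounding error.
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    # Pixel (i, j) is the cosine of the summed angles of samples i and j.
    return [[math.cos(a + b) for b in angles] for a in angles]

# Hypothetical toy series; a real pipeline would use windows of sensor data.
image = gramian_angular_summation_field([0.0, 1.0, 2.0, 3.0])
```

The result is a symmetric n x n matrix that can be fed to an image classifier such as a CNN, which is the general idea the abstract describes.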

Transferring RL knowledge to other tasks by transferring components of RL

Rex G. Liu, Michael J. Frank, Hierarchical clustering optimizes the tradeoff between compositionality and expressivity of task structures for flexible reinforcement learning, Artificial Intelligence, Volume 312, 2022 DOI: 10.1016/j.artint.2022.103770.

A hallmark of human intelligence, but challenging for reinforcement learning (RL) agents, is the ability to compositionally generalise, that is, to recompose familiar knowledge components in novel ways to solve new problems. For instance, when navigating in a city, one needs to know the location of the destination and how to operate a vehicle to get there, whether it be pedalling a bike or operating a car. In RL, these correspond to the reward function and transition function, respectively. To compositionally generalize, these two components need to be transferable independently of each other: multiple modes of transport can reach the same goal, and any given mode can be used to reach multiple destinations. Yet there are also instances where it can be helpful to learn and transfer entire structures, jointly representing goals and transitions, particularly whenever these recur in natural tasks (e.g., given a suggestion to get ice cream, one might prefer to bike, even in new towns). Prior theoretical work has explored how, in model-based RL, agents can learn and generalize task components (transition and reward functions). But a satisfactory account for how a single agent can simultaneously satisfy the two competing demands is still lacking. Here, we propose a hierarchical RL agent that learns and transfers individual task components as well as entire structures (particular compositions of components) by inferring both through a non-parametric Bayesian model of the task. It maintains a factorised representation of task components through a hierarchical Dirichlet process, but it also represents different possible covariances between these components through a standard Dirichlet process. We validate our approach on a variety of navigation tasks covering a wide range of statistical correlations between task components and show that it can also improve generalisation and transfer in more complex, hierarchical tasks with goal/subgoal structures. 
Finally, we end with a discussion of our work including how this clustering algorithm could conceivably be implemented by cortico-striatal gating circuits in the brain.
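
As a reminder of what a standard Dirichlet process prior looks like when used for clustering, the sketch below computes the Chinese restaurant process probabilities of assigning a new task to each existing cluster or to a brand new one. It only illustrates the prior; it is not the authors' agent, and the function name and the concentration value are assumptions.

```python
def crp_prior(cluster_counts, alpha):
    """Chinese restaurant process prior over cluster assignments.

    cluster_counts[k] is how many earlier tasks were assigned to cluster k;
    alpha is the concentration parameter. The last entry of the result is
    the probability of opening a new cluster. Illustrative sketch only.
    """
    total = sum(cluster_counts) + alpha
    probs = [n / total for n in cluster_counts]
    probs.append(alpha / total)
    return probs

# Hypothetical example: two known clusters seen 3 times and 1 time.
prior = crp_prior([3, 1], alpha=1.0)
```

Popular clusters are favoured in proportion to their counts, while alpha keeps some probability mass for a structure never seen before.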

A nice survey on knowledge graphs for representing, well, knowledge, focused on explainability of AI, but whatever, they are interesting for many more things

Ilaria Tiddi, Stefan Schlobach, Knowledge graphs as tools for explainable machine learning: A survey, Artificial Intelligence, Volume 302, 2022 DOI: 10.1016/j.artint.2021.103627.

This paper provides an extensive overview of the use of knowledge graphs in the context of Explainable Machine Learning. As of late, explainable AI has become a very active field of research by addressing the limitations of the latest machine learning solutions that often provide highly accurate, but hardly scrutable and interpretable decisions. An increasing interest has also been shown in the integration of Knowledge Representation techniques in Machine Learning applications, mostly motivated by the complementary strengths and weaknesses that could lead to a new generation of hybrid intelligent systems. Following this idea, we hypothesise that knowledge graphs, which naturally provide domain background knowledge in a machine-readable format, could be integrated in Explainable Machine Learning approaches to help them provide more meaningful, insightful and trustworthy explanations. Using a systematic literature review methodology we designed an analytical framework to explore the current landscape of Explainable Machine Learning. We focus particularly on the integration with structured knowledge at large scale, and use our framework to analyse a variety of Machine Learning domains, identifying the main characteristics of such knowledge-based, explainable systems from different perspectives. We then summarise the strengths of such hybrid systems, such as improved understandability, reactivity, and accuracy, as well as their limitations, e.g. in handling noise or extracting knowledge efficiently. We conclude by discussing a list of open challenges left for future research.

SLAM based on submap joining that achieves linear cost through a novel choice of the reference frame of each submap, with an interesting related-work discussion on map joining, i.e., considering submaps as observations

Liang Zhao, Shoudong Huang, Gamini Dissanayake, Linear SLAM: Linearising the SLAM problems using submap joining, Automatica, Volume 100, 2019, Pages 231-246, DOI: 10.1016/j.automatica.2018.10.037.

The main contribution of this paper is a new submap joining based approach for solving large-scale Simultaneous Localization and Mapping (SLAM) problems. Each local submap is independently built using the local information through solving a small-scale SLAM; the joining of submaps mainly involves solving linear least squares and performing nonlinear coordinate transformations. Through approximating the local submap information as the state estimate and its corresponding information matrix, judiciously selecting the submap coordinate frames, and approximating the joining of a large number of submaps by joining only two maps at a time, either sequentially or in a more efficient Divide and Conquer manner, the nonlinear optimization process involved in most of the existing submap joining approaches is avoided. Thus the proposed submap joining algorithm does not require initial guess or iterations since linear least squares problems have closed-form solutions. The proposed Linear SLAM technique is applicable to feature-based SLAM, pose graph SLAM and D-SLAM, in both two and three dimensions, and does not require any assumption on the character of the covariance matrices. Simulations and experiments are performed to evaluate the proposed Linear SLAM algorithm. Results using publicly available datasets in 2D and 3D show that Linear SLAM produces results that are very close to the best solutions that can be obtained using full nonlinear optimization algorithm started from an accurate initial guess. The C/C++ and MATLAB source codes of Linear SLAM are available on OpenSLAM.
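
The remark that linear least squares has a closed-form solution can be illustrated with the simplest possible case: fusing two estimates of the same state, each given as a state vector plus its information. The sketch assumes both estimates are already expressed in the same coordinate frame and that the information matrices are diagonal (stored as lists); both assumptions and the function name are mine, not the paper's.

```python
def fuse_information_form(x1, info1, x2, info2):
    """Closed-form weighted least squares fusion of two estimates.

    x1, x2: state estimates of the same variables (lists of floats).
    info1, info2: diagonal entries of their information matrices.
    Illustrative sketch only; real submaps have dense information matrices.
    """
    # Information simply adds up when independent estimates are combined.
    fused_info = [a + b for a, b in zip(info1, info2)]
    # The fused state is the information-weighted average, no iteration needed.
    fused_x = [
        (a * u + b * v) / s
        for u, a, v, b, s in zip(x1, info1, x2, info2, fused_info)
    ]
    return fused_x, fused_info

# Hypothetical two-dimensional example.
fused_x, fused_info = fuse_information_form(
    [0.0, 2.0], [1.0, 3.0], [4.0, 2.0], [3.0, 1.0]
)
```

No initial guess or iteration is involved, which is the property the abstract highlights for the joining step.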

Optimization algorithms inspired by chemical reactions

Nazmul Siddique, Hojjat Adeli, Nature-Inspired Chemical Reaction Optimisation Algorithms, Cognitive Computation, Volume 9, Issue 4, pp 411–422, DOI: 10.1007/s12559-017-9485-1.

Nature-inspired meta-heuristic algorithms have dominated the scientific literature in the areas of machine learning and cognitive computing paradigm in the last three decades. Chemical reaction optimisation (CRO) is a population-based meta-heuristic algorithm based on the principles of chemical reaction. A chemical reaction is seen as a process of transforming the reactants (or molecules) through a sequence of reactions into products. This process of transformation is implemented in the CRO algorithm to solve optimisation problems. This article starts with an overview of the chemical reactions and how it is applied to the optimisation problem. A review of CRO and its variants is presented in the paper. Guidelines from the literature on the effective choice of CRO parameters for solution of optimisation problems are summarised.
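
The abstract gives no algorithmic detail, so the following is only a simplified sketch of one elementary reaction as it is commonly described in the CRO literature, the on-wall ineffective collision, for a minimisation problem. A molecule carries a solution, a potential energy (the objective value) and a kinetic energy (its tolerance for getting worse). All names, and passing the energy-retention factor q explicitly instead of drawing it at random, are assumptions made to keep the example deterministic.

```python
def on_wall_collision(solution, pe, ke, objective, neighbour, q):
    """One simplified on-wall ineffective collision (minimisation).

    Returns (solution, pe, ke, released_energy). Illustrative sketch only.
    """
    candidate = neighbour(solution)
    candidate_pe = objective(candidate)
    # The move is allowed only if total energy covers the new potential energy.
    if pe + ke < candidate_pe:
        return solution, pe, ke, 0.0
    surplus = pe + ke - candidate_pe
    new_ke = surplus * q          # fraction of the surplus kept by the molecule
    released = surplus - new_ke   # the rest would go to a shared energy buffer
    return candidate, candidate_pe, new_ke, released

# Hypothetical one-dimensional problem: minimise x squared.
result = on_wall_collision(
    3.0, 9.0, 1.0,
    objective=lambda x: x * x,
    neighbour=lambda x: x - 1.0,
    q=0.5,
)
```

Because kinetic energy can pay for a worse objective value, a molecule may occasionally move uphill, which is how this kind of search escapes local minima.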

Application of deep learning and reinforcement learning to an industrial process, with a gentle introduction to both and a clear explanation of the process and decisions made to build the whole control system

Johannes Günther, Patrick M. Pilarski, Gerhard Helfrich, Hao Shen, Klaus Diepold, Intelligent laser welding through representation, prediction, and control learning: An architecture with deep neural networks and reinforcement learning, Mechatronics, Volume 34, March 2016, Pages 1-11, ISSN 0957-4158, DOI: 10.1016/j.mechatronics.2015.09.004.

Laser welding is a widely used but complex industrial process. In this work, we propose the use of an integrated machine intelligence architecture to help address the significant control difficulties that prevent laser welding from seeing its full potential in process engineering and production. This architecture combines three contemporary machine learning techniques to allow a laser welding controller to learn and improve in a self-directed manner. As a first contribution of this work, we show how a deep, auto-encoding neural network is capable of extracting salient, low-dimensional features from real high-dimensional laser welding data. As a second contribution and novel integration step, these features are then used as input to a temporal-difference learning algorithm (in this case a general-value-function learner) to acquire important real-time information about the process of laser welding; temporally extended predictions are used in combination with deep learning to directly map sensor data to the final quality of a welding seam. As a third contribution and final part of our proposed architecture, we suggest that deep learning features and general-value-function predictions can be beneficially combined with actor–critic reinforcement learning to learn context-appropriate control policies to govern welding power in real time. Preliminary control results are demonstrated using multiple runs with a laser-welding simulator. The proposed intelligent laser-welding architecture combines representation, prediction, and control learning: three of the main hallmarks of an intelligent system. As such, we suggest that an integration approach like the one described in this work has the capacity to improve laser welding performance without ongoing and time-intensive human assistance. Our architecture therefore promises to address several key requirements of modern industry. 
To our knowledge, this architecture is the first demonstrated combination of deep learning and general value functions. It also represents the first use of deep learning for laser welding specifically and production engineering in general. We believe that it would be straightforward to adapt our architecture for use in other industrial and production engineering settings.
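
For readers unfamiliar with general value functions, the sketch below shows a single linear TD(0) update that learns to predict the discounted sum of an arbitrary signal (the cumulant) from a feature vector. In the paper's setting the features would come from the auto-encoder, but the specific learner, its parameters and every name here are assumptions; the abstract only states that a temporal-difference, general-value-function learner is used.

```python
def td0_gvf_update(weights, features, next_features, cumulant, gamma, step_size):
    """One linear TD(0) update for a general value function.

    Returns the new weight vector. Illustrative sketch only.
    """
    prediction = sum(w * x for w, x in zip(weights, features))
    next_prediction = sum(w * x for w, x in zip(weights, next_features))
    # TD error: observed signal plus discounted next prediction, minus current one.
    td_error = cumulant + gamma * next_prediction - prediction
    # Move the weights of the active features in the direction of the error.
    return [w + step_size * td_error * x for w, x in zip(weights, features)]

# Hypothetical two-feature example starting from zero weights.
w = td0_gvf_update([0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   cumulant=1.0, gamma=0.9, step_size=0.5)
```

Repeating this update over a stream of sensor readings gives the kind of temporally extended prediction that the abstract says is later combined with actor-critic control.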