Author Archives: Juan-antonio Fernández-madrigal

Example of non-NN approach that produces better results in classification tasks than NNs

Jiang, Zhiying and Yang, Matthew and Tsirlin, Mikhail and Tang, Raphael and Dai, Yiqin and Lin, Jimmy, Low-Resource Text Classification: A Parameter-Free Classification Method with Compressors, . Findings of the Association for Computational Linguistics: ACL 2023 URL.

Deep neural networks (DNNs) are often used for text classification due to their high accuracy. However, DNNs can be computationally intensive, requiring millions of parameters and large amounts of labeled data, which can make them expensive to use, to optimize, and to transfer to out-of-distribution (OOD) cases in practice. In this paper, we propose a non-parametric alternative to DNNs that??s easy, lightweight, and universal in text classification: a combination of a simple compressor like gzip with a k-nearest-neighbor classifier. Without any training parameters, our method achieves results that are competitive with non-pretrained deep learning methods on six in-distribution datasets.It even outperforms BERT on all five OOD datasets, including four low-resource languages. Our method also excels in the few-shot setting, where labeled data are too scarce to train DNNs effectively.

Dropping laser scans for SLAM when they contribute with no relevant information

Kirill Krinkin, Anton Filatov, Correlation filter of 2D laser scans for indoor environment, . Robotics and Autonomous Systems, Volume 142, 2021 DOI: 10.1016/j.robot.2021.103809.

Modern laser SLAM (simultaneous localization and mapping) and structure from motion algorithms face the problem of processing redundant data. Even if a sensor does not move, it still continues to capture scans that should be processed. This paper presents the novel filter that allows dropping 2D scans that bring no new information to the system. Experiments on MIT and TUM datasets show that it is possible to drop more than half of the scans. Moreover the paper describes the formulas that enable filter adaptation to a particular robot with known speed and characteristics of lidar. In addition, the indoor corridor detector is introduced that also can be applied to any specific shape of a corridor and sensor.

The Evolutionary History of Brains for Numbers

Andreas Nieder, The Evolutionary History of Brains for Numbers, . Trends in Cognitive Sciences, Volume 25, Issue 7, 2021, Pages 608-621 DOI: 10.1016/j.tics.2021.03.012.

Humans and other animals share a number sense’, an intuitive understanding of countable quantities. Having evolved independent from one another for hundreds of millions of years, the brains of these diverse species, including monkeys, crows, zebrafishes, bees, and squids, differ radically. However, in all vertebrates investigated, the pallium of the telencephalon has been implicated in number processing. This suggests that properties of the telencephalon make it ideally suited to host number representations that evolved by convergent evolution as a result of common selection pressures. In addition, promising candidate regions in the brains of invertebrates, such as insects, spiders, and cephalopods, can be identified, opening the possibility of even deeper commonalities for number sense.

Building POMDPs under logical constraints

Bo Wu, Xiaobin Zhang, Hai Lin, Supervisor synthesis of POMDP via automata learning, . Automatica, Volume 129, 2021 DOI: 10.1016/j.automatica.2021.109654.

Partially observable Markov decision process (POMDP) is a comprehensive modeling framework that captures uncertainties from sensing noises, actuation errors, and environments. Traditional POMDP planning finds an optimal policy for reward maximization. However, for safety-critical applications, it is often necessary to guarantee system performance described by high-level temporal logic specifications. Hence, we are motivated to develop a supervisor synthesis framework for POMDP with respect to given formal specifications. We propose an iterative learning-based algorithm, which can learn a permissive policy in the form of a deterministic finite automaton. A human–robot collaboration case study validates the proposed algorithm.

State of the art of the convergence of Monte Carlo Exploring Starts RL, policy iteration kind, method

Jun Liu, On the convergence of reinforcement learning with Monte Carlo Exploring Starts, . Automatica, Volume 129, 2021 DOI: 10.1016/j.automatica.2021.109693.

A basic simulation-based reinforcement learning algorithm is the Monte Carlo Exploring Starts (MCES) method, also known as optimistic policy iteration, in which the value function is approximated by simulated returns and a greedy policy is selected at each iteration. The convergence of this algorithm in the general setting has been an open question. In this paper, we investigate the convergence of this algorithm for the case with undiscounted costs, also known as the stochastic shortest path problem. The results complement existing partial results on this topic and thereby help further settle the open problem.

Identifying state-space-models of systems with autoencoders

Daniele Masti, Alberto Bemporad, Learning nonlinear state–space models using autoencoders, . Automatica, Volume 129, 2021 DOI: 10.1016/j.automatica.2021.109666.

We propose a methodology for the identification of nonlinear state–space models from input/output data using machine-learning techniques based on autoencoders and neural networks. Our framework simultaneously identifies the nonlinear output and state-update maps of the model. After formulating the approach and providing guidelines for tuning the related hyper-parameters (including the model order), we show its capability in fitting nonlinear models on different nonlinear system identification benchmarks. Performance is assessed in terms of open-loop prediction on test data and of controlling the system via nonlinear model predictive control (MPC) based on the identified nonlinear state–space model.

Approximating the value function of RL through Max-Plus algebra

Vinicius Mariano Gonçalves, Max-plus approximation for reinforcement learning, . Automatica, Volume 129, 2021 DOI: 10.1016/j.automatica.2021.109623.

Max-Plus Algebra has been applied in several contexts, especially in the control of discrete events systems. In this article, we discuss another application closely related to control: the use of Max-Plus algebra concepts in the context of reinforcement learning. Max-Plus Algebra and reinforcement learning are strongly linked due to the latter’s dependence on the Bellman Equation which, in some cases, is a linear Max-Plus equation. This fact motivates the application of Max-Plus algebra to approximate the value function, central to the Bellman Equation and thus also to reinforcement learning. This article proposes conditions so that this approach can be done in a simple way and following the philosophy of reinforcement learning: explore the environment, receive the rewards and use this information to improve the knowledge of the value function. The proposed conditions are related to two matrices and impose on them a relationship that is analogous to the concept of weak inverses in traditional algebra.

Including a safety procedure in RL to avoid physical agent problems while learning

Kim Peter Wabersich, Melanie N. Zeilinger, A predictive safety filter for learning-based control of constrained nonlinear dynamical systems, . Automatica, Volume 129, 2021 DOI: 10.1016/j.automatica.2021.109597.

The transfer of reinforcement learning (RL) techniques into real-world applications is challenged by safety requirements in the presence of physical limitations. Most RL methods, in particular the most popular algorithms, do not support explicit consideration of state and input constraints. In this paper, we address this problem for nonlinear systems with continuous state and input spaces by introducing a predictive safety filter, which is able to turn a constrained dynamical system into an unconstrained safe system and to which any RL algorithm can be applied ‘out-of-the-box’. The predictive safety filter receives the proposed control input and decides, based on the current system state, if it can be safely applied to the real system, or if it has to be modified otherwise. Safety is thereby established by a continuously updated safety policy, which is based on a model predictive control formulation using a data-driven system model and considering state and input dependent uncertainties.

Including attention mechanisms in long-short term memory

Lin, X., Zhong, G., Chen, K. et al, Attention-Augmented Machine Memory, . Cogn Comput 13, 751–760 (2021) DOI: 10.1007/s12559-021-09854-5.

Attention mechanism plays an important role in the perception and cognition of human beings. Among others, many machine learning models have been developed to memorize the sequential data, such as the Long Short-Term Memory (LSTM) network and its extensions. However, due to lack of the attention mechanism, they cannot pay special attention to the important parts of the sequences. In this paper, we present a novel machine learning method called attention-augmented machine memory (AAMM). It seamlessly integrates the attention mechanism into the memory cell of LSTM. As a result, it facilitates the network to focus on valuable information in the sequences and ignore irrelevant information during its learning. We have conducted experiments on two sequence classification tasks for pattern classification and sentiment analysis, respectively. The experimental results demonstrate the advantages of AAMM over LSTM and some other related approaches. Hence, AAMM can be considered as a substitute of LSTM in the sequence learning applications.

Mixing logical planning with NNs for decision making

Zuo, G., Pan, T., Zhang, T. et al., SOAR Improved Artificial Neural Network for Multistep Decision-making Tasks, . Cogn Comput 13, 612–625 (2021) DOI: 10.1007/s12559-020-09716-6.

Recently, artificial neural networks (ANNs) have been applied to various robot-related research areas due to their powerful spatial feature abstraction and temporal information prediction abilities. Decision-making has also played a fundamental role in the research area of robotics. How to improve ANNs with the characteristics of decision-making is a challenging research issue. ANNs are connectionist models, which means they are naturally weak in long-term planning, logical reasoning, and multistep decision-making. Considering that a small refinement of the inner network structures of ANNs will usually lead to exponentially growing data costs, an additional planning module seems necessary for the further improvement of ANNs, especially for small data learning. In this paper, we propose a state operator and result (SOAR) improved ANN (SANN) model, which takes advantage of both the long-term cognitive planning ability of SOAR and the powerful feature detection ability of ANNs. It mimics the cognitive mechanism of the human brain to improve the traditional ANN with an additional logical planning module. In addition, a data fusion module is constructed to combine the probability vector obtained by SOAR planning and the original data feature array. A data fusion module is constructed to convert the information from the logical sequences in SOAR to the probabilistic vector in ANNs. The proposed architecture is validated in two types of robot multistep decision-making experiments for a grasping task: a multiblock simulated experiment and a multicup experiment in a real scenario. The experimental results show the efficiency and high accuracy of our proposed architecture. The integration of SOAR and ANN is a good compromise between logical planning with small data and probabilistic classification with big data. It also has strong potential for more complicated tasks that require robust classification, long-term planning, and fast learning. Some potential applications include recognition of grasping order in multiobject environment and cooperative grasping of multiagents.