Category Archives: Cognitive Sciences

They had to do it: Certified RL (through online reward shaping/definition)

Hosein Hasanbeig, Daniel Kroening, Alessandro Abate, Certified reinforcement learning with logic guidance, Artificial Intelligence, Volume 322, 2023 DOI: 10.1016/j.artint.2023.103949.

Reinforcement Learning (RL) is a widely employed machine learning architecture that has been applied to a variety of control problems. However, applications in safety-critical domains require a systematic and formal approach to specifying requirements as tasks or goals. We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs). The given LTL property is translated into a Limit-Deterministic Generalised Büchi Automaton (LDGBA), which is then used to shape a synchronous reward function on-the-fly. Under certain assumptions, the algorithm is guaranteed to synthesise a control policy whose traces satisfy the LTL specification with maximal probability.
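To make the idea of shaping a reward from an automaton more concrete, here is a minimal sketch. It is not the authors' algorithm: it uses a small hand-written deterministic automaton over finite traces instead of an LDGBA, and the task ("see label a, then label b"), the transition table, and the function names are all assumptions chosen for illustration.

```python
# Hypothetical toy automaton for the task "reach a, then reach b".
# States: 0 (nothing seen), 1 (a seen), 2 (a then b seen, accepting).
# This is a simplification over finite traces, not an LDGBA.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
}
ACCEPTING = {2}


def automaton_step(q, label):
    """Advance the automaton on one label; stay put if no edge matches."""
    return TRANSITIONS.get((q, label), q)


def shaped_rewards(labels, reward=1.0):
    """Run the automaton in lockstep with a trace of environment labels.

    A reward is emitted on the step that first enters an accepting state;
    every other step gets 0.0.
    """
    q = 0
    rewards = []
    for label in labels:
        nxt = automaton_step(q, label)
        entered = nxt in ACCEPTING and q not in ACCEPTING
        rewards.append(reward if entered else 0.0)
        q = nxt
    return rewards
```

An RL agent would learn over pairs of (environment state, automaton state), so the reward stays Markovian even though the task depends on history.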

Meta-RL: given a distribution of tasks, learn a policy capable of adapting to any new task from the task distribution with as little data as possible

Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson, A Survey of Meta-Reinforcement Learning, arXiv:2301.08028 [cs.LG], 2023 DOI: 10.48550/arXiv.2301.08028.

While deep reinforcement learning (RL) has fueled multiple high-profile successes in machine learning, it is held back from more widespread adoption by its often poor data efficiency and the limited generality of the policies it produces. A promising approach for alleviating these limitations is to cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL. Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible. In this survey, we describe the meta-RL problem setting in detail as well as its major variations. We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task. Using these clusters, we then survey meta-RL algorithms and applications. We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.

Leveraging the unexplainability and opacity of NNs to generate random numbers

Y. Almardeny, A. Benavoli, N. Boujnah and E. Naredo, A Reinforcement Learning System for Generating Instantaneous Quality Random Sequences, IEEE Transactions on Artificial Intelligence, vol. 4, no. 3, pp. 402-415, June 2023 DOI: 10.1109/TAI.2022.3161893.

Random numbers are essential to most computer applications. Still, producing high-quality random sequences is a big challenge. Inspired by the success of artificial neural networks and reinforcement learning, we propose a novel and effective end-to-end learning system to generate pseudorandom sequences that operates under the upside-down reinforcement learning framework. It is based on manipulating the generalized information entropy metric to derive commands that instantaneously guide the agent toward the optimal random behavior. Using a wide range of evaluation tests, the proposed approach is compared against three state-of-the-art accredited pseudorandom number generators (PRNGs). The experimental results agree with our theoretical study and show that the proposed framework is a promising candidate for a wide range of applications.
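The abstract refers to a generalized information entropy metric without defining it here. As a rough stand-in, the sketch below computes plain Shannon entropy of the empirical symbol distribution of a sequence; the function name is an assumption, and this is not the metric used in the paper.

```python
import math
from collections import Counter


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the empirical symbol distribution of seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    counts = Counter(seq)
    # Sum p * log2(1/p) over the observed symbols.
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```

A sequence that uses all symbols equally often scores highest, while a constant sequence scores zero. Note that symbol frequencies alone do not capture ordering, so a high score here is necessary but not sufficient for a sequence to look random.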

Limiting human intervention in the design of RL solutions (now called “Automated RL”)

Marco Mussi, Davide Lombarda, Alberto Maria Metelli, Francesco Trovò, Marcello Restelli, ARLO: A framework for Automated Reinforcement Learning, Expert Systems with Applications, Volume 224, 2023 DOI: 10.1016/j.eswa.2023.119883.

Automated Reinforcement Learning (AutoRL) is a relatively new area of research that is gaining increasing attention. The objective of AutoRL consists in easing the employment of Reinforcement Learning (RL) techniques for the broader public by alleviating some of its main challenges, including data collection, algorithm selection, and hyper-parameter tuning. In this work, we propose a general and flexible framework, namely ARLO: Automated Reinforcement Learning Optimizer, to construct automated pipelines for AutoRL. Based on this, we propose a pipeline for offline and one for online RL, discussing the components, interaction, and highlighting the difference between the two settings. Furthermore, we provide a Python implementation of such pipelines, released as an open-source library. Our implementation is tested on an illustrative LQG domain and on classic MuJoCo environments, showing the ability to reach competitive performances requiring limited human intervention. We also showcase the full pipeline on a realistic dam environment, automatically performing the feature selection and the model generation tasks.

Multi-task RL through common perceptions

Jinling Meng, Fei Zhu, Seek for commonalities: Shared features extraction for multi-task reinforcement learning via adversarial training, Expert Systems with Applications, Volume 224, 2023 DOI: 10.1016/j.eswa.2023.119975.

Multi-task reinforcement learning is promising to alleviate the low sample efficiency and high computation cost of reinforcement learning algorithms. However, current methods mostly focus on unique features that are not conducive to the transfer between tasks. Moreover, they usually lack a balance mechanism among tasks, which often leads to the unnecessary occupation of training resources by tasks that have already been trained. To address the problems, a simple yet effective method referred to as Adaptive Experience buffer with Shared Features Multi-Task Reinforcement Learning (AESF-MTRL) is proposed. In AESF-MTRL, input observation of the environment is divided into shared features and unique features, which are extracted using different feature extractors. Unique features are extracted by simple gradient descent, while shared features are extracted using adversarial training, with an additional discriminator trained to ensure that the extracted features are indeed shared features. AESF-MTRL also maintains a reward stack to adjust the sampling ratio of trajectories from different tasks dynamically during the update period to balance the learning process of different tasks. Experiments on multiple robotics control environments demonstrate the effectiveness of the proposed method.
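The abstract does not spell out how the reward stack adjusts the sampling ratio, so the following is only one plausible reading, sketched under assumptions: each task keeps a list of recent rewards, and tasks with a lower mean reward are sampled more often via a softmax over negated means. The function name, the softmax rule, and the temperature parameter are all hypothetical.

```python
import math


def sampling_ratios(recent_rewards, temperature=1.0):
    """Map each task's recent rewards to a sampling probability.

    Tasks with a lower mean recent reward get a larger share (softmax over
    negated means), so well-trained tasks occupy less of each update.
    """
    means = {task: sum(rs) / len(rs) for task, rs in recent_rewards.items()}
    logits = {task: -m / temperature for task, m in means.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {task: math.exp(v - top) for task, v in logits.items()}
    total = sum(exps.values())
    return {task: e / total for task, e in exps.items()}
```

For example, with recent rewards of about 0.95 for one task and 0.15 for another, the second task receives the larger share of sampled trajectories.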

Further support for a multi-tool approach to consciousness

Biyu J. He, Towards a pluralistic neurobiological understanding of consciousness, Trends in Cognitive Sciences, Volume 27, Issue 5, 2023 DOI: 10.1016/j.tics.2023.02.001.

Theories of consciousness are often based on the assumption that a single, unified neurobiological account will explain different types of conscious awareness. However, recent findings show that, even within a single modality such as conscious visual perception, the anatomical location, timing, and information flow of neural activity related to conscious awareness vary depending on both external and internal factors. This suggests that the search for generic neural correlates of consciousness may not be fruitful. I argue that consciousness science requires a more pluralistic approach and propose a new framework: joint determinant theory (JDT). This theory may be capable of accommodating different brain circuit mechanisms for conscious contents as varied as percepts, wills, memories, emotions, and thoughts, as well as their integrated experience.

Emergence of number meaning from sensorimotor experiences

Elena Sixtus, Florian Krause, Oliver Lindemann, Martin H. Fischer, A sensorimotor perspective on numerical cognition, Trends in Cognitive Sciences, Volume 27, Issue 4, 2023, Pages 367-378 DOI: 10.1016/j.tics.2023.01.002.

Numbers are present in every part of modern society and the human capacity to use numbers is unparalleled in other species. Understanding the mental and neural representations supporting this capacity is of central interest to cognitive psychology, neuroscience, and education. Embodied numerical cognition theory suggests that beyond the seemingly abstract symbols used to refer to numbers, their underlying meaning is deeply grounded in sensorimotor experiences, and that our specific understanding of numerical information is shaped by actions related to our fingers, egocentric space, and experiences with magnitudes in everyday life. We propose a sensorimotor perspective on numerical cognition in which number comprehension and numerical proficiency emerge from grounding three distinct numerical core concepts: magnitude, ordinality, and cardinality.

Review of emotions in AI

G. Assunção, B. Patrão, M. Castelo-Branco and P. Menezes, An Overview of Emotion in Artificial Intelligence, IEEE Transactions on Artificial Intelligence, vol. 3, no. 6, pp. 867-886, Dec. 2022 DOI: 10.1109/TAI.2022.3159614.

The field of artificial intelligence (AI) has gained immense traction over the past decade, producing increasingly successful applications as research strives to understand and exploit neural processing specifics. Nonetheless emotion, despite its demonstrated significance to reinforcement, social integration, and general development, remains a largely stigmatized and consequently disregarded topic by most engineers and computer scientists. In this article, we endorse emotion’s value for the advancement of artificial cognitive processing, as well as explore real-world use cases of emotion-augmented AI. A schematization is provided on the psychological-neurophysiologic basics of emotion in order to bridge the interdisciplinary gap preventing emulation and integration in AI methodology, as well as exploitation by current systems. In addition, we overview three major subdomains of AI greatly benefiting from emotion, and produce a systematic survey of meaningful yet recent contributions to each area. To conclude, we address crucial challenges and promising research paths for the future of emotion in AI with the hope that more researchers will develop an interest for the topic and find it easier to develop their own contributions.

Normal blindness to visible objects seems to be the result of limited-capacity prediction mechanisms in the brain

Jeremy M. Wolfe, Anna Kosovicheva, Benjamin Wolfe, Normal blindness: when we Look But Fail To See, Trends in Cognitive Sciences, Volume 26, Issue 9, 2022, Pages 809-819 DOI: 10.1016/j.tics.2022.06.006.

Humans routinely miss important information that is ‘right in front of our eyes’, from overlooking typos in a paper to failing to see a cyclist in an intersection. Recent studies on these ‘Looked But Failed To See’ (LBFTS) errors point to a common mechanism underlying these failures, whether the missed item was an unexpected gorilla, the clearly defined target of a visual search, or that simple typo. We argue that normal blindness is the by-product of the limited-capacity prediction engine that is our visual system. The processes that evolved to allow us to move through the world with ease are virtually guaranteed to cause us to miss some significant stimuli, especially in important tasks like driving and medical image perception.

On the existence of multiple fundamental “languages” in the brain that use discrete symbols and a few basic structures

Stanislas Dehaene, Fosca Al Roumi, Yair Lakretz, Samuel Planton, Mathias Sablé-Meyer, Symbols and mental programs: a hypothesis about human singularity, Trends in Cognitive Sciences, Volume 26, Issue 9, 2022, Pages 751-766 DOI: 10.1016/j.tics.2022.06.010.

Natural language is often seen as the single factor that explains the cognitive singularity of the human species. Instead, we propose that humans possess multiple internal languages of thought, akin to computer languages, which encode and compress structures in various domains (mathematics, music, shape…). These languages rely on cortical circuits distinct from classical language areas. Each is characterized by: (i) the discretization of a domain using a small set of symbols, and (ii) their recursive composition into mental programs that encode nested repetitions with variations. In various tasks of elementary shape or sequence perception, minimum description length in the proposed languages captures human behavior and brain activity, whereas non-human primate data are captured by simpler nonsymbolic models. Our research argues in favor of discrete symbolic models of human thought.