Category Archives: Cognitive Sciences

Thermodynamics as a way of identifying hierarchies

Morten L. Kringelbach, Yonatan Sanz Perl, Gustavo Deco, The Thermodynamics of Mind, Trends in Cognitive Sciences, Volume 28, Issue 6, 2024, Pages 568-581, DOI: 10.1016/j.tics.2024.03.009.

To not only survive, but also thrive, the brain must efficiently orchestrate distributed computation across space and time. This requires hierarchical organisation facilitating fast information transfer and processing at the lowest possible metabolic cost. Quantifying brain hierarchy is difficult but can be estimated from the asymmetry of information flow. Thermodynamics has successfully characterised hierarchy in many other complex systems. Here, we propose the ‘Thermodynamics of Mind’ framework as a natural way to quantify hierarchical brain orchestration and its underlying mechanisms. This has already provided novel insights into the orchestration of hierarchy in brain states including movie watching, where the hierarchy of the brain is flatter than during rest. Overall, this framework holds great promise for revealing the orchestration of cognition.
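As a rough illustration of the asymmetry idea only (not the authors' thermodynamic framework), the sketch below estimates a directed-influence matrix from lagged correlations of toy regional signals and scores hierarchy by how asymmetric that matrix is; the data, the lag-1 proxy, and the asymmetry score are all assumptions made for the example.

```python
import numpy as np

def lagged_influence(x, lag=1):
    """x: (T, N) array of regional time series. Returns an N x N matrix whose
    (i, j) entry is the correlation between region i at time t and region j
    at time t + lag, a crude proxy for directed flow i -> j."""
    past = x[:-lag]
    future = x[lag:]
    past = (past - past.mean(axis=0)) / past.std(axis=0)
    future = (future - future.mean(axis=0)) / future.std(axis=0)
    return past.T @ future / len(past)

def hierarchy_index(flow):
    """Asymmetry of the influence matrix: near 0 for a flat (symmetric)
    system, larger when information flow is more directional."""
    return np.linalg.norm(flow - flow.T) / np.linalg.norm(flow)

rng = np.random.default_rng(0)
signals = rng.standard_normal((1000, 10))   # toy data: 1000 samples, 10 regions
print(hierarchy_index(lagged_influence(signals)))
```

On such uncorrelated toy data the index stays close to its baseline; the interesting comparisons are between conditions, for example rest versus movie watching as in the review.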

Reducing discovered skills in DRL to the essential ones, modelling skills with SMDP Q-learning

Shuai Qing, Fei Zhu, Refine to the essence: Less-redundant skill learning via diversity clustering, Engineering Applications of Artificial Intelligence, Volume 133, Part A, 2024, DOI: 10.1016/j.engappai.2024.107981.

In reinforcement learning, a skill is a potentially conditional policy that solves tasks in a hierarchically controlled manner. Progress on skill discovery helps agents learn a set of diverse and useful skills without external supervision to tackle complex tasks with sparse rewards. Although most studies have aimed to maximize the diversity of discovered skills, the distinguishability between skills diminishes as the number of skills increases, leading to a subset of similar and redundant skills. To tackle this problem, a method called Refine to the Essence of Skills (RE-Skill) is proposed, which aims at learning skills with less redundancy. RE-Skill integrates the concepts of cluster analysis and policy distillation: it clusters similar skills together based on their features, distills the best-performing skill within each cluster, and filters out similar skills that involve excessive and intricate actions, thereby reducing redundancy among skills. By refining clusters of similar skills into less-redundant independent skills, RE-Skill outperforms other skill discovery algorithms, and the resulting skills effectively address downstream tasks, indicating that RE-Skill extends to engineering applications in robot control and obstacle training tasks within complex environments.
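Since the header above frames the discovered skills as temporally extended behaviours learned with SMDP Q-learning, here is a minimal tabular sketch of that update rule; the skill names and the toy call are illustrative, not taken from the RE-Skill implementation.

```python
SKILLS = ["skill_a", "skill_b", "skill_c"]   # hypothetical discovered skills

def smdp_q_update(Q, state, skill, next_state, cumulative_reward, duration,
                  alpha=0.1, gamma=0.99):
    """One SMDP Q-learning update for a temporally extended skill.
    cumulative_reward: discounted reward collected while the skill ran.
    duration: number of primitive steps the skill took (the k in gamma**k)."""
    best_next = max(Q.get((next_state, s), 0.0) for s in SKILLS)
    target = cumulative_reward + (gamma ** duration) * best_next
    old = Q.get((state, skill), 0.0)
    Q[(state, skill)] = old + alpha * (target - old)

Q = {}
smdp_q_update(Q, state=0, skill="skill_a", next_state=1,
              cumulative_reward=0.5, duration=4)
print(Q)   # the value of running skill_a from state 0 moves toward the target
```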

An alternative conceptual basis for curiosity / motivation

Francesco Poli, Jill X. O’Reilly, Rogier B. Mars, Sabine Hunnius, Curiosity and the dynamics of optimal exploration, Trends in Cognitive Sciences, Volume 28, Issue 5, 2024, DOI: 10.1016/j.tics.2024.02.001.

What drives our curiosity remains an elusive and hotly debated issue, with multiple hypotheses proposed but a cohesive account yet to be established. This review discusses traditional and emergent theories that frame curiosity as a desire to know and a drive to learn, respectively. We adopt a model-based approach that maps the temporal dynamics of various factors underlying curiosity-based exploration, such as uncertainty, information gain, and learning progress. In so doing, we identify the limitations of past theories and posit an integrated account that harnesses their strengths in describing curiosity as a tool for optimal environmental exploration. In our unified account, curiosity serves as a ‘common currency’ for exploration, which must be balanced with other drives such as safety and hunger to achieve efficient action.
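A hedged sketch of the kind of quantities the review maps over time: an assumed curiosity signal that combines information gain (reduction in predictive uncertainty) with learning progress (the rate at which prediction error falls). The Gaussian belief, the window size, and the weights are illustrative choices, not the authors' model.

```python
import numpy as np

def information_gain(prior_var, posterior_var):
    """Reduction in uncertainty (in nats) for a Gaussian belief
    before and after an observation."""
    return 0.5 * np.log(prior_var / posterior_var)

def learning_progress(errors, window=5):
    """Negative slope of recent prediction errors: positive while the
    learner is improving, near zero once learning has plateaued."""
    recent = np.asarray(errors[-window:], dtype=float)
    if len(recent) < 2:
        return 0.0
    slope = np.polyfit(np.arange(len(recent)), recent, 1)[0]
    return -slope

def curiosity(prior_var, posterior_var, errors, w_ig=1.0, w_lp=1.0):
    # Assumed combination: curiosity as a weighted sum of both drives.
    return w_ig * information_gain(prior_var, posterior_var) \
         + w_lp * learning_progress(errors)

print(curiosity(prior_var=1.0, posterior_var=0.5,
                errors=[0.9, 0.7, 0.5, 0.4, 0.35]))
```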

A novel RL setting for non-Markovian systems

Ronen I. Brafman, Giuseppe De Giacomo, Regular decision processes, Artificial Intelligence, Volume 331, 2024, DOI: 10.1016/j.artint.2024.104113.

We introduce and study Regular Decision Processes (RDPs), a new, compact model for domains with non-Markovian dynamics and rewards, in which the dependence on the past is regular in the language-theoretic sense. RDPs are an intermediate model between MDPs and POMDPs. They generalize k-order MDPs and can be viewed as POMDPs in which the hidden state is a regular function of the entire history. In factored RDPs, transition and reward functions are specified using formulas in linear temporal logics over finite traces, or using regular expressions. This allows specifying complex dependence on the past with intuitive and compact formulas, and building models of partially observable domains without specifying an underlying state space.
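A toy sketch of the core RDP idea: the reward depends on a regular property of the whole history, which a finite automaton can track, so augmenting the current observation with the automaton state recovers a Markovian model. The hand-coded automaton below is illustrative; the paper specifies such conditions with temporal logic or regular expressions rather than explicit transition tables.

```python
# Property tracked: "observation 'a' has occurred at some earlier point".
DFA_TRANSITIONS = {            # automaton states: 0 = 'a' not yet seen, 1 = 'a' seen
    (0, "a"): 1, (0, "b"): 0, (0, "goal"): 0,
    (1, "a"): 1, (1, "b"): 1, (1, "goal"): 1,
}

def step_automaton(q, observation):
    return DFA_TRANSITIONS[(q, observation)]

def reward(q, observation):
    # Reward 1 only if 'goal' is reached after 'a' has been seen:
    # a non-Markovian reward over observations, Markovian over (q, observation).
    return 1.0 if (q == 1 and observation == "goal") else 0.0

history = ["b", "a", "b", "goal"]
q, total = 0, 0.0
for obs in history:
    total += reward(q, obs)       # depends on the past only through q
    q = step_automaton(q, obs)
print(total)   # 1.0, because 'a' occurred before 'goal'
```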

The problem of incorporating novelties into the knowledge of an AI agent

Shivam Goel, Panagiotis Lymperopoulos, Ravenna Thielstrom, Evan Krause, Patrick Feeney, Pierrick Lorang, Sarah Schneider, Yichen Wei, Eric Kildebeck, Stephen Goss, Michael C. Hughes, Liping Liu, Jivko Sinapov, Matthias Scheutz, A neurosymbolic cognitive architecture framework for handling novelties in open worlds, Artificial Intelligence, Volume 331, 2024, DOI: 10.1016/j.artint.2024.104111.

“Open world” environments are those in which novel objects, agents, events, and more can appear and contradict previous understandings of the environment. This runs counter to the “closed world” assumption used in most AI research, where the environment is assumed to be fully understood and unchanging. The inability to handle the novelties that occur in open-world environments limits the types of environments in which AI agents can be deployed. This paper presents a novel cognitive architecture framework for handling open-world novelties. The framework combines symbolic planning, counterfactual reasoning, reinforcement learning, and deep computer vision to detect and accommodate novelties. We introduce general algorithms for exploring open worlds using inference and machine learning methodologies to facilitate novelty accommodation. The ability to detect and accommodate novelties allows agents built on this framework to successfully complete tasks despite a variety of novel changes to the world. Both the framework components and the entire system are evaluated in Minecraft-like simulated environments. Our results indicate that agents are able to efficiently complete tasks while accommodating “concealed novelties” not shared with the architecture development team.
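A minimal, self-contained sketch of the detect-and-accommodate loop described above, under the assumption that a novelty is flagged whenever an observed transition contradicts the agent's model; the tiny domain and the repair step are illustrative placeholders, not the paper's architecture.

```python
model = {("door_closed", "open_door"): "door_open"}      # agent's believed dynamics
world = {("door_closed", "open_door"): "door_locked"}    # novelty: the door now locks

def execute(state, action):
    return world[(state, action)]

def detect_and_accommodate(state, action):
    predicted = model.get((state, action))
    observed = execute(state, action)
    if observed != predicted:
        # Novelty detected: the model no longer matches the world.
        print(f"novelty: expected {predicted}, got {observed}")
        # Accommodate (placeholder): update the model so replanning can succeed.
        model[(state, action)] = observed
    return observed

print(detect_and_accommodate("door_closed", "open_door"))
print(model)
```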

Imitating physiological processes for achieving robot-human social interaction

Marcos Maroto-Gómez, Martín Bueno-Adrada, María Malfaz, Álvaro Castro-González, Miguel Ángel Salichs, Human–robot pair-bonding from a neuroendocrine perspective: Modeling the effect of oxytocin, arginine vasopressin, and dopamine on the social behavior of an autonomous robot, Robotics and Autonomous Systems, Volume 176, 2024, DOI: 10.1016/j.robot.2024.104687.

Robots and humans coexist in various social environments. In these contexts, robots predominantly serve as assistants, necessitating communication and understanding capabilities. This paper introduces a biologically inspired model grounded in neuroendocrine substances that facilitate the development of social bonds between robots and individuals. The model simulates the effects of oxytocin, arginine vasopressin, and dopamine on social behavior, acting as modulators for bonding in the interaction between the social robot Mini and its users. Neuroendocrine levels vary in response to circadian rhythms and social stimuli perceived by the robot. If users express care for the robot, a positive bond is established, enhancing human–robot interaction by prompting the robot to engage in cooperative actions such as playing or communicating more frequently. Conversely, mistreating the robot leads to a deterioration of the relationship, causing user rejection. An experimenter-robot interaction scenario illustrates the model’s adaptive mechanisms involving three types of profiles: Friendly, Aversive, and Naive. In addition, a user study with 22 participants was conducted to analyze the differences in Attachment, Social Presence, perceived Anthropomorphism, Likability, and User Experience between a robot randomly selecting its behavior and a robot behaving according to the bioinspired pair-bonding method proposed in this contribution. The results show how pair-bonding with the user regulates the robot’s social behavior in response to user actions. The user study reveals statistical differences favoring the robot using the pair-bonding regulation in Attachment and Social Presence. A qualitative study using an interview-like form suggests the positive effects of creating bonds with bioinspired robots.
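An illustrative sketch of how such neuroendocrine variables could modulate a social bond: each substance drifts toward a circadian baseline, rises or falls with social stimuli, and the resulting bond value would bias the robot toward cooperative behaviour. The dynamics and constants are assumptions, not the authors' implementation.

```python
import math

class NeuroendocrineState:
    def __init__(self):
        self.oxytocin = 0.5
        self.vasopressin = 0.5
        self.dopamine = 0.5

    def circadian_drift(self, hour):
        # Assumed sinusoidal baseline over the 24 h cycle; levels relax toward it.
        baseline = 0.5 + 0.1 * math.sin(2 * math.pi * hour / 24)
        for name in ("oxytocin", "vasopressin", "dopamine"):
            value = getattr(self, name)
            setattr(self, name, value + 0.1 * (baseline - value))

    def social_stimulus(self, caring: bool):
        # Caring interactions raise oxytocin/dopamine; mistreatment raises vasopressin.
        delta = 0.1 if caring else -0.1
        self.oxytocin = min(1.0, max(0.0, self.oxytocin + delta))
        self.dopamine = min(1.0, max(0.0, self.dopamine + delta))
        self.vasopressin = min(1.0, max(0.0, self.vasopressin - delta))

    def bond(self):
        # Positive bond when the affiliative substances dominate.
        return self.oxytocin + self.dopamine - self.vasopressin

state = NeuroendocrineState()
for _ in range(5):
    state.social_stimulus(caring=True)
state.circadian_drift(hour=14)
print(round(state.bond(), 2))   # a higher bond would favour cooperative actions
```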

What attention is (from a cognitive science point of view)

Wayne Wu, We know what attention is!, Trends in Cognitive Sciences, Volume 28, Issue 4, 2024, DOI: 10.1016/j.tics.2023.11.007.

Attention is one of the most thoroughly investigated psychological phenomena, yet skepticism about attention is widespread: we do not know what it is, it is too many things, there is no such thing. The deficiencies highlighted are not about experimental work but about the adequacy of the scientific theory of attention. Combining common scientific claims about attention into a single theory leads to internal inconsistency. This paper demonstrates that a specific functional conception of attention is incorporated into the tasks used in standard experimental paradigms. In accepting these paradigms as valid probes of attention, we commit to this common conception. The conception unifies work at multiple levels of analysis into a coherent scientific explanation of attention. Thus, we all know what attention is.

On how the human brain judges whether imagery is real or not

Rebecca Keogh, Reality check: how do we know what’s real?, Trends in Cognitive Sciences, Volume 28, Issue 4, 2024, DOI: 10.1016/j.tics.2023.06.001.

How do we know what is real and what is merely a figment of our imagination? Dijkstra and Fleming tackle this question in a recent study. In contrast to the classic Perky effect, they found that once imagery crosses a ‘reality threshold’, it becomes difficult to distinguish from reality.

POMDPs focused on obtaining policies that can be understood just by observing the robot's actions

Miguel Faria, Francisco S. Melo, Ana Paiva, “Guess what I’m doing”: Extending legibility to sequential decision tasks, Artificial Intelligence, Volume 330, 2024, DOI: 10.1016/j.artint.2024.104107.

In this paper we investigate the notion of legibility in sequential decision tasks under uncertainty. Previous works that extend legibility to scenarios beyond robot motion either focus on deterministic settings or are computationally too expensive. Our proposed approach, dubbed PoLMDP, is able to handle uncertainty while remaining computationally tractable. We establish the advantages of our approach against state-of-the-art approaches in several scenarios of varying complexity. We also showcase the use of our legible policies as demonstrations in machine teaching scenarios, establishing their superiority over the commonly used demonstrations based on the optimal policy when teaching new behaviours. Finally, we assess the legibility of our computed policies through a user study, where people are asked to infer the goal of a mobile robot following a legible policy by observing its actions.
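As a sketch of the general legibility idea (not necessarily the exact PoLMDP objective), the snippet below models an observer who performs Bayesian goal inference over softmax-rational policies and picks the action that makes the robot's true goal most probable; the Q-values, prior, and rationality parameter are illustrative.

```python
import numpy as np

def observer_posterior(q_values_per_goal, action, prior, beta=5.0):
    """q_values_per_goal: dict goal -> array of Q(s, a) for the current state.
    Returns P(goal | action) under a softmax-rational observer model."""
    likelihoods = {}
    for goal, q in q_values_per_goal.items():
        p = np.exp(beta * q) / np.exp(beta * q).sum()
        likelihoods[goal] = p[action] * prior[goal]
    z = sum(likelihoods.values())
    return {g: v / z for g, v in likelihoods.items()}

def legible_action(q_values_per_goal, true_goal, prior):
    n_actions = len(next(iter(q_values_per_goal.values())))
    scores = [observer_posterior(q_values_per_goal, a, prior)[true_goal]
              for a in range(n_actions)]
    return int(np.argmax(scores))

# Two goals, three actions: action 2 is good for goal "A" but bad for goal "B",
# so it is the most legible choice when the true goal is "A".
q = {"A": np.array([1.0, 1.0, 1.2]), "B": np.array([1.0, 1.2, 0.2])}
prior = {"A": 0.5, "B": 0.5}
print(legible_action(q, true_goal="A", prior=prior))   # -> 2
```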

On the influence of the representations obtained through Deep RL on the learning process

Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White, Investigating the properties of neural network representations in reinforcement learning, Artificial Intelligence, Volume 330, 2024, DOI: 10.1016/j.artint.2024.104100.

In this paper we investigate the properties of representations learned by deep reinforcement learning systems. Much of the early work on representations for reinforcement learning focused on designing fixed-basis architectures to achieve properties thought to be desirable, such as orthogonality and sparsity. In contrast, the idea behind deep reinforcement learning methods is that the agent designer should not encode representational properties, but rather that the data stream should determine the properties of the representation—good representations emerge under appropriate training schemes. In this paper we bring these two perspectives together, empirically investigating the properties of representations that support transfer in reinforcement learning. We introduce and measure six representational properties over more than 25,000 agent-task settings. We consider Deep Q-learning agents with different auxiliary losses in a pixel-based navigation environment, with source and transfer tasks corresponding to different goal locations. We develop a method to better understand why some representations work better for transfer, through a systematic approach varying task similarity and measuring and correlating representation properties with transfer performance. We demonstrate the generality of the methodology by investigating representations learned by a Rainbow agent that successfully transfers across Atari 2600 game modes.
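To make the notion of measured representational properties concrete, here is a hedged sketch that computes two commonly used ones, sparsity and orthogonality, from a batch of feature vectors; the definitions below are standard ones and may differ in detail from the six properties the paper uses.

```python
import numpy as np

def sparsity(features, eps=1e-3):
    """Fraction of activations whose magnitude is below eps."""
    return float(np.mean(np.abs(features) < eps))

def orthogonality(features):
    """1 minus the mean absolute cosine similarity between distinct
    feature vectors: 1.0 means pairwise-orthogonal representations."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-8, None)
    cos = unit @ unit.T
    n = len(features)
    off_diag = cos[~np.eye(n, dtype=bool)]
    return float(1.0 - np.abs(off_diag).mean())

rng = np.random.default_rng(0)
# Toy batch: 32 states, 64-dimensional features, made sparse by random masking.
phi = rng.standard_normal((32, 64)) * (rng.random((32, 64)) > 0.7)
print(sparsity(phi), orthogonality(phi))
```

Properties like these can then be correlated with transfer performance across agent-task settings, which is the kind of systematic analysis the paper carries out.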