Cognitive sciences | kipr

An inspiring formalization of the latest models of human emotions into RL

September 19, 2024 17:09, Juan-Antonio Fernández-Madrigal

Aviv Emanuel, Eran Eldar, Emotions as Computations, Neuroscience & Biobehavioral Reviews, Volume 144, January 2023 DOI: 10.1016/j.neubiorev.2022.104977.

Emotions ubiquitously impact action, learning, and perception, yet their essence and role remain widely debated. Computational accounts of emotion aspire to answer these questions with greater conceptual precision informed by normative principles and neurobiological data. We examine recent progress in this regard and find that emotions may implement three classes of computations, which serve to evaluate states, actions, and uncertain prospects. For each of these, we use the formalism of reinforcement learning to offer a new formulation that better accounts for existing evidence. We then consider how these distinct computations may map onto distinct emotions and moods. Integrating extensive research on the causes and consequences of different emotions suggests a parsimonious one-to-one mapping, according to which emotions are integral to how we evaluate outcomes (pleasure & pain), learn to predict them (happiness & sadness), use them to inform our (frustration & content) and others’ (anger & gratitude) actions, and plan in order to realize (desire & hope) or avoid (fear & anxiety) uncertain outcomes.

Posted in: Psycho-physiological bases of engineering, Tagged: Emotions, Reinforcement learning

The seminal work on the “first cooperate, then repeat the other’s actions” strategy in game theory

September 19, 2024 15:36, Juan-Antonio Fernández-Madrigal

Robert Axelrod; William D. Hamilton, The Evolution of Cooperation, Science, New Series, Vol. 211, No. 4489. (Mar. 27, 1981), pp. 1390-1396 https://ee.stanford.edu/~hellman/Breakthrough/book/pdfs/axelrod.pdf.

Cooperation in organisms, whether bacteria or primates, has been a difficulty for evolutionary theory since Darwin. On the assumption that interactions between pairs of individuals occur on a probabilistic basis, a model is developed based on the concept of an evolutionarily stable strategy in the context of the Prisoner’s Dilemma game. Deductions from the model, and the results of a computer tournament show how cooperation based on reciprocity can get started in an asocial world, can thrive while interacting with a wide range of other strategies, and can resist invasion once fully established. Potential applications include specific aspects of territoriality, mating, and disease.
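
As a rough illustration of the reciprocity strategy named in the title, the following Python sketch plays an iterated Prisoner's Dilemma. The payoff values (5, 3, 1, 0) and the function names are assumptions chosen for the example, not something taken from this post, and the round count is arbitrary.

```python
# Payoffs (player A, player B) for one round; "C" = cooperate, "D" = defect.
# The numeric values are an illustrative assumption.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate on the first move, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Baseline strategy that never cooperates."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the total scores of both players."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy only sees the moves the other player made so far.
        a = strategy_a(moves_b)
        b = strategy_b(moves_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b
```

Under these assumed payoffs, two reciprocating players keep cooperating for all rounds, while a reciprocating player facing a constant defector is exploited only in the first round and defects afterwards.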

Posted in: Psycho-physiological bases of engineering, Tagged: Game theory

An interesting survey, predating the “generative AI” boom, of the integration of sub-symbolic (for learning) and symbolic (for reasoning) systems

September 19, 2024 09:31, Juan-Antonio Fernández-Madrigal

Artur d’Avila Garcez, Luis C. Lamb, Neurosymbolic AI: The 3rd Wave, arXiv:2012.05876 [cs.AI] https://arxiv.org/abs/2012.05876v2.

Current advances in Artificial Intelligence (AI) and Machine Learning (ML) have achieved unprecedented impact across research communities and industry. Nevertheless, concerns about trust, safety, interpretability and accountability of AI were raised by influential thinkers. Many have identified the need for well-founded knowledge representation and reasoning to be integrated with deep learning and for sound explainability. Neural-symbolic computing has been an active area of research for many years seeking to bring together robust learning in neural networks with reasoning and explainability via symbolic representations for network models. In this paper, we relate recent and early research results in neurosymbolic AI with the objective of identifying the key ingredients of the next wave of AI systems. We focus on research that integrates in a principled way neural network-based learning with symbolic knowledge representation and logical reasoning. The insights provided by 20 years of neural-symbolic computing are shown to shed new light onto the increasingly prominent role of trust, safety, interpretability and accountability of AI. We also identify promising directions and challenges for the next decade of AI research from the perspective of neural-symbolic systems.

Posted in: Artificial Intelligence, Tagged: Deep neural networks, Neurosymbolic AI, Reasoning, Symbol emergence

A relatively simple way of reducing the sampling cost of DQN

September 19, 2024 08:12, Juan-Antonio Fernández-Madrigal

Hossein Hassani, Soodeh Nikan, Abdallah Shami, Traffic navigation via reinforcement learning with episodic-guided prioritized experience replay, Engineering Applications of Artificial Intelligence, Volume 137, Part A, 2024, DOI: 10.1016/j.engappai.2024.109147.

Deep Reinforcement Learning (DRL) models play a fundamental role in autonomous driving applications; however, they typically suffer from sample inefficiency because they often require many interactions with the environment to learn effective policies. This makes the training process time-consuming. To address this shortcoming, Prioritized Experience Replay (PER) has proven to be effective by prioritizing samples with high Temporal-Difference (TD) error for learning. In this context, this study contributes to artificial intelligence by proposing a sample-efficient DRL algorithm called Episodic-Guided Prioritized Experience Replay (EPER). The core innovation of EPER lies in the utilization of an episodic memory, dedicated to storing successful training episodes. Within this memory, expected returns for each state–action pair are extracted. These returns, combined with TD error-based prioritization, form a novel objective function for deep Q-network training. To prevent excessive determinism, EPER introduces exploration into the learning process by incorporating a regularization term into the objective function that allows exploration of state-space regions with diverse Q-values. The proposed EPER algorithm is suitable to train a DRL agent for handling episodic tasks, and it can be integrated into off-policy DRL models. EPER is employed for traffic navigation through scenarios such as highway driving, merging, roundabout, and intersection to showcase its application in engineering. The attained results denote that, compared with the PER and an additional state-of-the-art training technique, EPER is superior in expediting the training of the agent and learning a more optimal policy that leads to lower collision rates within the constructed navigation scenarios.
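
For readers unfamiliar with the baseline the abstract builds on, here is a minimal Python sketch of plain proportional prioritized experience replay: transitions are sampled with probability proportional to a power of their absolute TD error. It is not EPER; the episodic memory and the regularization term are not modeled, and the names `alpha` and `eps` and their default values are assumptions made for the example.

```python
import random


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-3):
    """Turn TD errors into sampling probabilities.

    alpha = 0 gives uniform sampling; larger alpha favors high-error samples.
    eps keeps transitions with zero TD error from never being replayed.
    """
    priorities = [(abs(delta) + eps) ** alpha for delta in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_batch(buffer, td_errors, batch_size, rng, alpha=0.6):
    """Draw a batch (with replacement) according to the priorities."""
    probs = sampling_probabilities(td_errors, alpha)
    return rng.choices(buffer, weights=probs, k=batch_size)
```

A call such as `sample_batch(transitions, errors, 32, random.Random(0))` would return 32 transitions, with high-error ones appearing more often.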

Posted in: Reinforcement learning in AI, Tagged: Deep reinforcement learning, Sample efficiency

Equivalence between Transformers and SVMs

September 12, 2024 11:16, Juan-Antonio Fernández-Madrigal

Davoud Ataee Tarzanagh, Yingcong Li, Christos Thrampoulidis, Samet Oymak, Transformers as Support Vector Machines, arXiv:2308.16898 [cs.LG], https://arxiv.org/abs/2308.16898.

Since its inception in “Attention Is All You Need”, transformer architecture has led to revolutionary advancements in NLP. The attention layer within the transformer admits a sequence of input tokens X and makes them interact through pairwise similarities computed as softmax(XQK⊤X⊤), where (K,Q) are the trainable key-query parameters. In this work, we establish a formal equivalence between the optimization geometry of self-attention and a hard-margin SVM problem that separates optimal input tokens from non-optimal tokens using linear constraints on the outer-products of token pairs. This formalism allows us to characterize the implicit bias of 1-layer transformers optimized with gradient descent: (1) Optimizing the attention layer with vanishing regularization, parameterized by (K,Q), converges in direction to an SVM solution minimizing the nuclear norm of the combined parameter W=KQ⊤. Instead, directly parameterizing by W minimizes a Frobenius norm objective. We characterize this convergence, highlighting that it can occur toward locally-optimal directions rather than global ones. (2) Complementing this, we prove the local/global directional convergence of gradient descent under suitable geometric conditions. Importantly, we show that over-parameterization catalyzes global convergence by ensuring the feasibility of the SVM problem and by guaranteeing a benign optimization landscape devoid of stationary points. (3) While our theory applies primarily to linear prediction heads, we propose a more general SVM equivalence that predicts the implicit bias with nonlinear heads. Our findings are applicable to arbitrary datasets and their validity is verified via experiments. We also introduce several open problems and research directions. We believe these findings inspire the interpretation of transformers as a hierarchy of SVMs that separates and selects optimal tokens.
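
The pairwise similarity expression softmax(XQK⊤X⊤) quoted in the abstract can be made concrete with a small dependency-free Python sketch. It assumes X holds one token per row and that the softmax is applied row by row; the helper names are made up for the example, and nothing here touches the SVM analysis itself.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Swap rows and columns."""
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax of one row."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(X, Q, K):
    """Row-wise softmax of X Q K^T X^T: one weight per (query token, key token)."""
    queries = matmul(X, Q)           # shape: tokens x m
    keys = matmul(X, K)              # shape: tokens x m
    scores = matmul(queries, transpose(keys))  # shape: tokens x tokens
    return [softmax(row) for row in scores]
```

Each row of the result sums to one and tells how strongly that token attends to every token in the sequence.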

Posted in: Artificial Intelligence, Tagged: Deep neural networks, LLMs, Support vector machines, Transformers

A novel way of addressing the maximization bias in RL

September 12, 2024 08:34, Juan-Antonio Fernández-Madrigal

Martin Waltz, Ostap Okhrin, Addressing maximization bias in reinforcement learning with two-sample testing, Artificial Intelligence, Volume 336, 2024, DOI: 10.1016/j.artint.2024.104204.

Value-based reinforcement-learning algorithms have shown strong results in games, robotics, and other real-world applications. Overestimation bias is a known threat to those algorithms and can sometimes lead to dramatic performance decreases or even complete algorithmic failure. We frame the bias problem statistically and consider it an instance of estimating the maximum expected value (MEV) of a set of random variables. We propose the T-Estimator (TE) based on two-sample testing for the mean, that flexibly interpolates between over- and underestimation by adjusting the significance level of the underlying hypothesis tests. We also introduce a generalization, termed K-Estimator (KE), that obeys the same bias and variance bounds as the TE and relies on a nearly arbitrary kernel function. We introduce modifications of Q-Learning and the Bootstrapped Deep Q-Network (BDQN) using the TE and the KE, and prove convergence in the tabular setting. Furthermore, we propose an adaptive variant of the TE-based BDQN that dynamically adjusts the significance level to minimize the absolute estimation bias. All proposed estimators and algorithms are thoroughly tested and validated on diverse tasks and environments, illustrating the bias control and performance potential of the TE and KE.
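
To see the bias the abstract refers to, the Python sketch below estimates the maximum expected value of several random variables whose true means are all zero. It compares the plain maximum of sample means with a sample-splitting "double" estimator. Neither is the paper's T-Estimator or K-Estimator; the Gaussian setup, the sample sizes, and the function names are assumptions for illustration only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def max_estimator(samples):
    """Maximum of the per-variable sample means (tends to overestimate)."""
    return max(mean(s) for s in samples)


def double_estimator(samples):
    """Select the best variable on one half of the data, evaluate it on the other."""
    first = [s[: len(s) // 2] for s in samples]
    second = [s[len(s) // 2:] for s in samples]
    best = max(range(len(samples)), key=lambda i: mean(first[i]))
    return mean(second[best])


def average_estimates(n_vars=10, n_samples=20, trials=1000, seed=0):
    """Average both estimators over many trials; the true maximum mean is 0."""
    rng = random.Random(seed)
    max_results, double_results = [], []
    for _ in range(trials):
        samples = [
            [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            for _ in range(n_vars)
        ]
        max_results.append(max_estimator(samples))
        double_results.append(double_estimator(samples))
    return mean(max_results), mean(double_results)
```

In this equal-means setting the first average comes out clearly positive while the second stays near zero; when the means differ, the double estimator tends to underestimate instead, which is the trade-off an interpolating estimator aims to control.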

Posted in: Reinforcement learning in AI, Tagged: RL maximization bias

It seems that vectors can help on the path toward symbols for ANNs

September 6, 2024 09:30, Juan-Antonio Fernández-Madrigal

Steven T. Piantadosi, Dyana C.Y. Muller, Joshua S. Rule, Karthikeya Kaushik, Mark Gorenstein, Elena R. Leib, Emily Sanford, Why concepts are (probably) vectors, Trends in Cognitive Sciences, Volume 28, Issue 9, 2024, Pages 844-856 DOI: 10.1016/j.tics.2024.06.011.

For decades, cognitive scientists have debated what kind of representation might characterize human concepts. Whatever the format of the representation, it must allow for the computation of varied properties, including similarities, features, categories, definitions, and relations. It must also support the development of theories, ad hoc categories, and knowledge of procedures. Here, we discuss why vector-based representations provide a compelling account that can meet all these needs while being plausibly encoded into neural architectures. This view has become especially promising with recent advances in both large language models and vector symbolic architectures. These innovations show how vectors can handle many properties traditionally thought to be out of reach for neural models, including compositionality, definitions, structures, and symbolic computational processes.
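
One way to picture how vectors can carry symbolic structure is the role-filler binding used in vector symbolic architectures. The Python sketch below uses random ±1 vectors, elementwise multiplication for binding, and a majority vote for bundling. The dimension, the role and filler names, and the specific binding scheme are assumptions for illustration and are not taken from the paper.

```python
import random

DIM = 2000  # high dimension makes random vectors nearly orthogonal


def random_vector(rng):
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Elementwise product; for +/-1 vectors, binding is its own inverse."""
    return [x * y for x, y in zip(a, b)]


def bundle(*vectors):
    """Elementwise majority vote (an odd number of vectors avoids ties)."""
    return [1 if sum(column) > 0 else -1 for column in zip(*vectors)]


def similarity(a, b):
    """Normalized dot product: 1.0 for identical vectors, about 0 for unrelated ones."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
roles = {name: random_vector(rng) for name in ("color", "shape", "size")}
fillers = {name: random_vector(rng) for name in ("red", "circle", "small", "blue")}

# One vector encoding the structure {color: red, shape: circle, size: small}.
record = bundle(
    bind(roles["color"], fillers["red"]),
    bind(roles["shape"], fillers["circle"]),
    bind(roles["size"], fillers["small"]),
)


def query(role_name):
    """Unbind a role from the record and return the most similar known filler."""
    noisy = bind(roles[role_name], record)
    return max(fillers, key=lambda name: similarity(noisy, fillers[name]))
```

Unbinding yields a noisy copy of the stored filler, so a nearest-neighbor lookup among known fillers recovers the symbol.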

Posted in: Psycho-physiological bases of engineering, Tagged: Concept formation, Symbolism/Subsymbolism

Cognitive evidence of the need for abstraction (== “modularity”) in achieving AI

September 6, 2024 08:52, Juan-Antonio Fernández-Madrigal

Schilling, M., Hammer, B., Ohl, F.W. et al. Modularity in Nervous Systems—a Key to Efficient Adaptivity for Deep Reinforcement Learning, Cogn Comput 16, 2358–2373 (2024) DOI: 10.1007/s12559-022-10080-w.

Modularity as observed in biological systems has proven valuable for guiding classical motor theories towards good answers about action selection and execution. New challenges arise when we turn to learning: Trying to scale current computational models, such as deep reinforcement learning (DRL), to action spaces, input dimensions, and time horizons seen in biological systems still faces severe obstacles unless vast amounts of training data are available. This leads to the question: does biological modularity also hold an important key for better answers to obtain efficient adaptivity for deep reinforcement learning? We review biological experimental work on modularity in biological motor control and link this with current examples of (deep) RL approaches. Analyzing outcomes of simulation studies, we show that these approaches benefit from forms of modularization as found in biological systems. We identify three different strands of modularity exhibited in biological control systems. Two of them—modularity in state (i) and in action (ii) spaces—appear as a consequence of local interconnectivity (as in reflexes) and are often modulated by higher levels in a control hierarchy. A third strand arises from chunking of action elements along a (iii) temporal dimension. Usually interacting in an overarching spatio-temporal hierarchy of the overall system, the three strands offer major “factors” decomposing the entire modularity structure. We conclude that modularity with its above strands can provide an effective prior for DRL approaches to speed up learning considerably and making learned controllers more robust and adaptive.

Posted in: Psycho-physiological bases of engineering, Tagged: Abstraction

Reducing dimensionality of brain-body state dynamics

September 6, 2024 08:10, Juan-Antonio Fernández-Madrigal

Daniel S. Kluger, Micah G. Allen, Joachim Gross, Brain–body states embody complex temporal dynamics, Trends in Cognitive Sciences, Volume 28, Issue 8, 2024, Pages 695-698 DOI: 10.1016/j.tics.2024.05.003.

We propose a computational framework for high-dimensional brain–body states as transient embodiments of nested internal and external dynamics governed by interoception. Unifying recent theoretical work, we suggest ways to reduce arbitrary state complexity to an observable number of features in order to accurately predict and intervene in pathological trajectories.

Posted in: Psycho-physiological bases of engineering, Tagged: Dimensionality reduction

RL in periodic scenarios

July 18, 2024 12:42, Juan-Antonio Fernández-Madrigal

A. Aniket and A. Chattopadhyay, Online Reinforcement Learning in Periodic MDP, IEEE Transactions on Artificial Intelligence, vol. 5, no. 7, pp. 3624-3637, July 2024 DOI: 10.1109/TAI.2024.3375258.

We study learning in periodic Markov decision process (MDP), a special type of nonstationary MDP where both the state transition probabilities and reward functions vary periodically, under the average reward maximization setting. We formulate the problem as a stationary MDP by augmenting the state space with the period index and propose a periodic upper confidence bound reinforcement learning-2 (PUCRL2) algorithm. We show that the regret of PUCRL2 varies linearly with the period N and as O(√(T log T)) with the horizon length T. Utilizing the information about the sparsity of transition matrix of augmented MDP, we propose another algorithm [periodic upper confidence reinforcement learning with Bernstein bounds (PUCRLB)] which enhances upon PUCRL2, both in terms of regret [O(√N) dependency on period] and empirical performance. Finally, we propose two other algorithms U-PUCRL2 and U-PUCRLB for extended uncertainty in the environment in which the period is unknown but a set of candidate periods are known. Numerical results demonstrate the efficacy of all the algorithms.
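
The state augmentation mentioned in the abstract can be sketched in a few lines of Python: pairing each state with the period index turns a periodic MDP into a stationary one. This covers only the augmentation step, not PUCRL2 or its confidence bounds, and the nested-dictionary layout and the function name are assumptions made for the example.

```python
def augment_periodic_mdp(transitions, rewards):
    """Build a stationary MDP over (state, phase) pairs.

    transitions[i][s][a] maps next states to probabilities at phase i.
    rewards[i][s][a] is the reward at phase i.
    The period N is the number of phases, len(transitions).
    """
    period = len(transitions)
    aug_transitions, aug_rewards = {}, {}
    for phase in range(period):
        next_phase = (phase + 1) % period  # the phase advances deterministically
        for state, by_action in transitions[phase].items():
            for action, distribution in by_action.items():
                key = ((state, phase), action)
                aug_transitions[key] = {
                    (next_state, next_phase): prob
                    for next_state, prob in distribution.items()
                }
                aug_rewards[key] = rewards[phase][state][action]
    return aug_transitions, aug_rewards
```

The augmented model has N times as many states, but from any augmented state only states in the next phase are reachable, which is the kind of transition sparsity the abstract mentions.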

Posted in: Reinforcement learning in AI, Tagged: Periodic RL

