Category Archives: Cognitive Sciences

Dealing with the exploration with a nice introduction to the problem

Jiayi Lu, Shuai Han, Shuai L�, Meng Kang, Junwei Zhang, Sampling diversity driven exploration with state difference guidance, Expert Systems with Applications, Volume 203, 2022 DOI: 10.1016/j.eswa.2022.117418.

Exploration is one of the key issues of deep reinforcement learning, especially in the environments with sparse or deceptive rewards. Exploration based on intrinsic rewards can handle these environments. However, these methods cannot take both global interaction dynamics and local environment changes into account simultaneously. In this paper, we propose a novel intrinsic reward for off-policy learning, which not only encourages the agent to take actions not fully learned from a global perspective, but also instructs the agent to trigger remarkable changes in the environment from a local perspective. Meanwhile, we propose the double-actors\u2013double-critics framework to combine intrinsic rewards with extrinsic rewards to avoid the inappropriate combination of intrinsic and extrinsic rewards in previous methods. This framework can be applied to off-policy learning algorithms based on the actor\u2013critic method. We provide a comprehensive evaluation of our approach on the MuJoCo benchmark environments. The results demonstrate that our method can perform effective exploration in the environments with dense, deceptive and sparse rewards. Besides, we conduct sufficient ablation and quantitative analyses to intrinsic rewards. Furthermore, we also verify the superiority and rationality of our double-actors\u2013double-critics framework through comparative experiments.

Increasing exploration when the agent performs worse, decreasing when performing better, in the context of DQN for distributing computation among cloud and edge servers, also dealing with hybridization of RL with Fuzzy

Do Bao Son, Ta Huu Binh, Hiep Khac Vo, Binh Minh Nguyen, Huynh Thi Thanh Binh, Shui Yu, Value-based reinforcement learning approaches for task offloading in Delay Constrained Vehicular Edge Computing, Engineering Applications of Artificial Intelligence, Volume 113, 2022 DOI: 10.1016/j.engappai.2022.104898.

In the age of booming information technology, human-being has witnessed the need for new paradigms with both high computational capability and low latency. A potential solution is Vehicular Edge Computing (VEC). Previous work proposed a Fuzzy Deep Q-Network in Offloading scheme (FDQO) that combines Fuzzy rules and Deep Q-Network (DQN) to improve DQN\u2019s early performance by using Fuzzy Controller (FC). However, we notice that frequent usage of FC can hinder the future growth performance of model. One way to overcome this issue is to remove Fuzzy Controller entirely. We introduced an algorithm called baseline DQN (b-DQN), represented by its two variants Static baseline DQN (Sb-DQN) and Dynamic baseline DQN (Db-DQN), to modify the exploration rate base on the average rewards of closest observations. Our findings confirm that these baseline DQN algorithms surpass traditional DQN models in terms of average Quality of Experience (QoE) in 100 time slots by about 6%, but still suffer from poor early performance (such as in the first 5 time slots). Here, we introduce baseline FDQO (b-FDQO). This algorithm has a strategy to modify the Fuzzy Logic usage instead of removing it entirely while still observing the rewards to modify the exploration rate. It brings a higher average QoE in the first 5 time slots compared to other non-fuzzy-logic algorithms by at least 55.12%, prevent the model from getting too bad result over all time slots, while having the late performance as good as that of b-DQN.

Live-RL enhancement / reduction of unsafe situations by reducing the transition possibility of unsafe actions

Serhat Duman, Hamdi Tolga Kahraman, Yusuf Sonmez, Ugur Guvenc, Mehmet Kati, Sefa Aras, A powerful meta-heuristic search algorithm for solving global optimization and real-world solar photovoltaic parameter estimation problems, Engineering Applications of Artificial Intelligence, Volume 111, 2022 DOI: 10.1016/j.engappai.2022.104763.

The teaching-learning-based artificial bee colony (TLABC) is a new hybrid swarm-based metaheuristic search algorithm. It combines the exploitation of the teaching learning-based optimization (TLBO) with the exploration of the artificial bee colony (ABC). With the hybridization of these two nature-inspired swarm intelligence algorithms, a robust method has been proposed to solve global optimization problems. However, as with swarm-based algorithms, with the TLABC method, it is a great challenge to effectively simulate the selection process. Fitness-distance balance (FDB) is a powerful recently developed method to effectively imitate the selection process in nature. In this study, the three search phases of the TLABC algorithm were redesigned using the FDB method. In this way, the FDB-TLABC algorithm, which imitates nature more effectively and has a robust search performance, was developed. To investigate the exploitation, exploration, and balanced search capabilities of the proposed algorithm, it was tested on standard and complex benchmark suites (Classic, IEEE CEC 2014, IEEE CEC 2017, and IEEE CEC 2020). In order to verify the performance of the proposed FDB-TLABC for global optimization problems and in the photovoltaic parameter estimation problem (a constrained real-world engineering problem) a very comprehensive and qualified experimental study was carried out according to IEEE CEC standards. Statistical analysis results confirmed that the proposed FDB-TLABC provided the best optimum solution and yielded a superior performance compared to other optimization methods.

Interesting summary of photovoltaic modelling

Serhat Duman, Hamdi Tolga Kahraman, Yusuf Sonmez, Ugur Guvenc, Mehmet Kati, Sefa Aras, A powerful meta-heuristic search algorithm for solving global optimization and real-world solar photovoltaic parameter estimation problems, Engineering Applications of Artificial Intelligence, Volume 111, 2022 DOI: 10.1016/j.engappai.2022.104763.

The teaching-learning-based artificial bee colony (TLABC) is a new hybrid swarm-based metaheuristic search algorithm. It combines the exploitation of the teaching learning-based optimization (TLBO) with the exploration of the artificial bee colony (ABC). With the hybridization of these two nature-inspired swarm intelligence algorithms, a robust method has been proposed to solve global optimization problems. However, as with swarm-based algorithms, with the TLABC method, it is a great challenge to effectively simulate the selection process. Fitness-distance balance (FDB) is a powerful recently developed method to effectively imitate the selection process in nature. In this study, the three search phases of the TLABC algorithm were redesigned using the FDB method. In this way, the FDB-TLABC algorithm, which imitates nature more effectively and has a robust search performance, was developed. To investigate the exploitation, exploration, and balanced search capabilities of the proposed algorithm, it was tested on standard and complex benchmark suites (Classic, IEEE CEC 2014, IEEE CEC 2017, and IEEE CEC 2020). In order to verify the performance of the proposed FDB-TLABC for global optimization problems and in the photovoltaic parameter estimation problem (a constrained real-world engineering problem) a very comprehensive and qualified experimental study was carried out according to IEEE CEC standards. Statistical analysis results confirmed that the proposed FDB-TLABC provided the best optimum solution and yielded a superior performance compared to other optimization methods.

Action selection strategy for model-free RL based on neurophysiology

D. Wang, S. Chen, Y. Hu, L. Liu and H. Wang, Behavior Decision of Mobile Robot With a Neurophysiologically Motivated Reinforcement Learning Model, IEEE Transactions on Cognitive and Developmental Systems, vol. 14, no. 1, pp. 219-233, March 2022 DOI: 10.1109/TCDS.2020.3035778.

Online model-free reinforcement learning (RL) approaches play a crucial role in coping with the real-world applications, such as the behavioral decision making in robotics. How to balance the exploration and exploitation processes is a central problem in RL. A balanced ratio of exploration/exploitation has a great influence on the total learning time and the quality of the learned strategy. Therefore, various action selection policies have been presented to obtain a balance between the exploration and exploitation procedures. However, these approaches are rarely, automatically, and dynamically regulated to the environment variations. One of the most amazing self-adaptation mechanisms in animals is their capacity to dynamically switch between exploration and exploitation strategies. This article proposes a novel neurophysiologically motivated model which simulates the role of medial prefrontal cortex (MPFC) and lateral prefrontal cortex (LPFC) in behavior decision. The sensory input is transmitted to the MPFC, then the ventral tegmental area (VTA) receives a reward and calculates a dopaminergic reinforcement signal, and the feedback categorization neurons in anterior cingulate cortex (ACC) calculate the vigilance according to the dopaminergic reinforcement signal. Then the vigilance is transformed to LPFC to regulate the exploration rate, finally the exploration rate is transmitted to thalamus to calculate the corresponding action probability. This action selection mechanism is introduced to the actor\u2013critic model of the basal ganglia, combining with the cerebellum model based on the developmental network to construct a new hybrid neuromodulatory model to select the action of the agent. Both the simulation comparison with other four traditional action selection policies and the physical experiment results demonstrate the potential of the proposed neuromodulatory model in action selection.

The brain as a communication network

John D. Mollon, Chie Takahashi, Marina V. Danilova, What kind of network is the brain? Trends in Cognitive Sciences, Volume 26, Issue 4, 2022, Pages 312-324 DOI: 10.1016/j.tics.2022.01.007.

The different areas of the cerebral cortex are linked by a network of white matter, comprising the myelinated axons of pyramidal cells. Is this network a neural net, in the sense that representations of the world are embodied in the structure of the net, its pattern of nodes, and connections? Or is it a communications network, where the same physical substrate carries different information from moment to moment? This question is part of the larger question of whether the brain is better modeled by connectionism or by symbolic artificial intelligence (AI), but we review it in the specific context of the psychophysics of stimulus comparison and the format and protocol of information transmission over the long-range tracts of the brain.

An hypothesis that human perception can only be done in real-time if prediction mechanisms go ahead and save the gap caused by the processing of inputs, which actually cannot be done in real-time (plus further post-processing and adjustment of past perceptions)

Hinze Hogendoorn, Perception in real-time: predicting the present, reconstructing the past, Trends in Cognitive Sciences, Volume 26, Issue 2, 2022 DOI: 10.1016/j.tics.2021.11.003.

We feel that we perceive events in the environment as they unfold in real-time. However, this intuitive view of perception is impossible to implement in the nervous system due to biological constraints such as neural transmission delays. I propose a new way of thinking about real-time perception: at any given moment, instead of representing a single timepoint, perceptual mechanisms represent an entire timeline. On this timeline, predictive mechanisms predict ahead to compensate for delays in incoming sensory input, and reconstruction mechanisms retroactively revise perception when those predictions do not come true. This proposal integrates and extends previous work to address a crucial gap in our understanding of a fundamental aspect of our everyday life: the experience of perceiving the present.

Defining and measuring mathematically the level of knowledge, ignorance and uncertainty

Fujun Hou, Evangelos Triantaphyllou, Juri Yanase, Knowledge, ignorance, and uncertainty: An investigation from the perspective of some differential equations, Expert Systems with Applications, Volume 191, 2022 DOI: 10.1016/j.eswa.2021.116325.

People use knowledge on several cognitive tasks such as when they recognize objects, rank entities such as the alternatives in multi-criteria decision making, or for classification tasks of decision making / expert / intelligent systems. When people have sufficient relevant knowledge, they can make well-distinctive assessments among entities. Otherwise, they may exhibit some uncertainty. This paper establishes two differential equations, of which one is for the interaction between the knowledge level and the uncertainty level, and the other is for the interaction between the ignorance level and the uncertainty level. By solving these two differential equations under certain boundary conditions, one can derive that the proposed knowledge level indicator is equivalent to Wierman’s knowledge granularity measure up to a constant (exactly, ln2). Moreover, the knowledge level indicator and the ignorance level indicator are found to be in a complementary relationship with each other. That is, more knowledge implies less ignorance, and vice-versa. The results of this study bridge a critical gap that exists in the understanding of the concepts of knowledge and ignorance.

A nice survey on knowledge graphs for representing, well, knowledge, focused on explainability of AI, but whatever, they are interesting for many more things

Ilaria Tiddi, Stefan Schlobach, Knowledge graphs as tools for explainable machine learning: A survey, Artificial Intelligence, Volume 302, 2022 DOI: 10.1016/j.artint.2021.103627.

This paper provides an extensive overview of the use of knowledge graphs in the context of Explainable Machine Learning. As of late, explainable AI has become a very active field of research by addressing the limitations of the latest machine learning solutions that often provide highly accurate, but hardly scrutable and interpretable decisions. An increasing interest has also been shown in the integration of Knowledge Representation techniques in Machine Learning applications, mostly motivated by the complementary strengths and weaknesses that could lead to a new generation of hybrid intelligent systems. Following this idea, we hypothesise that knowledge graphs, which naturally provide domain background knowledge in a machine-readable format, could be integrated in Explainable Machine Learning approaches to help them provide more meaningful, insightful and trustworthy explanations. Using a systematic literature review methodology we designed an analytical framework to explore the current landscape of Explainable Machine Learning. We focus particularly on the integration with structured knowledge at large scale, and use our framework to analyse a variety of Machine Learning domains, identifying the main characteristics of such knowledge-based, explainable systems from different perspectives. We then summarise the strengths of such hybrid systems, such as improved understandability, reactivity, and accuracy, as well as their limitations, e.g. in handling noise or extracting knowledge efficiently. We conclude by discussing a list of open challenges left for future research.

A general model of abstraction of graphs

Christer Bäckström, Peter Jonsson, A framework for analysing state-abstraction methods, Artificial Intelligence, Volume 302, 2022 DOI: 10.1016/j.artint.2021.103608.

Abstraction has been used in combinatorial search and action planning from the very beginning of AI. Many different methods and formalisms for state abstraction have been proposed in the literature, but they have been designed from various points of view and with varying purposes. Hence, these methods have been notoriously difficult to analyse and compare in a structured way. In order to improve upon this situation, we present a coherent and flexible framework for modelling abstraction (and abstraction-like) methods based on graph transformations. The usefulness of the framework is demonstrated by applying it to problems in both search and planning. We model six different abstraction methods from the planning literature and analyse their intrinsic properties. We show how to capture many search abstraction concepts (such as avoiding backtracking between levels) and how to put them into a broader context. We also use the framework to identify and investigate connections between refinement and heuristics—two concepts that have usually been considered as unrelated in the literature. This provides new insights into various topics, e.g. Valtorta’s theorem and spurious states. We finally extend the framework with composition of transformations to accommodate for abstraction hierarchies, and other multi-level concepts. We demonstrate the latter by modelling and analysing the merge-and-shrink abstraction method.