Thomas L. Griffiths, Understanding Human Intelligence through Human Limitations, . Trends in Cognitive Sciences, Volume 24, Issue 11, 2020, Pages 873-883 DOI: 10.1016/j.tics.2020.09.001.
(no abstract)
(no abstract)
(no abstract)
Our goal is to learn control policies for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between generalization of control policies to novel environments and generalization of hypotheses in the supervised learning setting. In particular, we utilize the probably approximately correct (PAC)-Bayes framework, which allows us to obtain upper bounds that hold with high probability on the expected cost of (stochastic) control policies across novel environments. We propose policy learning algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (relative entropy programming in particular) in the setting where we are optimizing over a finite policy space. In the more general setting of continuously parameterized policies (e.g., neural network policies), we minimize this upper bound using stochastic gradient descent. We present simulated results of our approach applied to learning (1) reactive obstacle avoidance policies and (2) neural network-based grasping policies. We also present hardware results for the Parrot Swing drone navigating through different obstacle environments. Our examples demonstrate the potential of our approach to provide strong generalization guarantees for robotic systems with continuous state and action spaces, complicated (e.g., nonlinear) dynamics, rich sensory inputs (e.g., depth images), and neural network-based policies.
We propose an explainable reinforcement learning (XRL) framework that analyzes an agent’s history of interaction with the environment to extract interestingness elements that help explain its behavior. The framework relies on data readily available from standard RL algorithms, augmented with data that can easily be collected by the agent while learning. We describe how to create visual summaries of an agent’s behavior in the form of short video-clips highlighting key interaction moments, based on the proposed elements. We also report on a user study where we evaluated the ability of humans to correctly perceive the aptitude of agents with different characteristics, including their capabilities and limitations, given visual summaries automatically generated by our framework. The results show that the diversity of aspects captured by the different interestingness elements is crucial to help humans correctly understand an agent’s strengths and limitations in performing a task, and determine when it might need adjustments to improve its performance.
We suggest to express policies for contingent planning by knowledge-based programs (KBPs). KBPs, introduced by Fagin et al. (1995) [32], are high-level protocols describing the actions that the agent should perform as a function of their current knowledge: branching conditions are epistemic formulas that are interpretable by the agent. The main aim of our paper is to show that KBPs can be seen as a succinct language for expressing policies in single-agent contingent planning. KBP are conceptually very close to languages used for expressing policies in the partially observable planning literature: like them, they have conditional and looping structures, with actions as atomic programs and Boolean formulas on beliefs for choosing the execution path. Now, the specificity of KBPs is that branching conditions refer to the belief state and not to the observations. Because of their structural proximity, KBPs and standard languages for representing policies have the same power of expressivity: every standard policy can be expressed as a KBP, and every KBP can be “unfolded” into a standard policy. However, KBPs are more succinct, more readable, and more explainable than standard policies. On the other hand, they require more online computation time, but we show that this is an unavoidable tradeoff. We study knowledge-based programs along four criteria: expressivity, succinctness, complexity of online execution, and complexity of verification.
How does consciousness vary across the animal kingdom? Are some animals ‘more conscious’ than others? This article presents a multidimensional framework for understanding interspecies variation in states of consciousness. The framework distinguishes five key dimensions of variation: perceptual richness, evaluative richness, integration at a time, integration across time, and self-consciousness. For each dimension, existing experiments that bear on it are reviewed and future experiments are suggested. By assessing a given species against each dimension, we can construct a consciousness profile for that species. On this framework, there is no single scale along which species can be ranked as more or less conscious. Rather, each species has its own distinctive consciousness profile.
We highlight a novel brain correlate of prediction, the prediction potential (or PP), a slow negative-going potential shift preceding visual, acoustic, and spoken or written verbal stimuli that can be predicted from their context. The cortical sources underlying the prediction potential reflect perceptual and semantic features of anticipated stimuli before these appear.
Deep neural networks are generally designed as a stack of differentiable layers, in which a prediction is obtained only after running the full stack. Recently, some contributions have proposed techniques to endow the networks with early exits, allowing to obtain predictions at intermediate points of the stack. These multi-output networks have a number of advantages, including (i) significant reductions of the inference time, (ii) reduced tendency to overfitting and vanishing gradients, and (iii) capability of being distributed over multi-tier computation platforms. In addition, they connect to the wider themes of biological plausibility and layered cognitive reasoning. In this paper, we provide a comprehensive introduction to this family of neural networks, by describing in a unified fashion the way these architectures can be designed, trained, and actually deployed in time-constrained scenarios. We also describe in-depth their application scenarios in 5G and Fog computing environments, as long as some of the open research questions connected to them.
(no abstract).
(no abstract).