Abstracting and representing tasks performed under Learning from Demonstration, using Bayesian non-parametric time-series analysis (good review of both LfD and HMMs for time series)

Scott Niekum, Sarah Osentoski, George Konidaris, Sachin Chitta, Bhaskara Marthi, Andrew G. Barto (2015), Learning grounded finite-state representations from unstructured demonstrations, The International Journal of Robotics Research, vol. 34, pp. 131-157. DOI: 10.1177/0278364914554471

Robots exhibit flexible behavior largely in proportion to their degree of knowledge about the world. Such knowledge is often meticulously hand-coded for a narrow class of tasks, limiting the scope of possible robot competencies. Thus, the primary limiting factor of robot capabilities is often not the physical attributes of the robot, but the limited time and skill of expert programmers. One way to deal with the vast number of situations and environments that robots face outside the laboratory is to provide users with simple methods for programming robots that do not require the skill of an expert. For this reason, learning from demonstration (LfD) has become a popular alternative to traditional robot programming methods, aiming to provide a natural mechanism for quickly teaching robots. By simply showing a robot how to perform a task, users can easily demonstrate new tasks as needed, without any special knowledge about the robot. Unfortunately, LfD often yields little knowledge about the world, and thus lacks robust generalization capabilities, especially for complex, multi-step tasks. We present a series of algorithms that draw from recent advances in Bayesian non-parametric statistics and control theory to automatically detect and leverage repeated structure at multiple levels of abstraction in demonstration data. The discovery of repeated structure provides critical insights into task invariants, features of importance, high-level task structure, and appropriate skills for the task. This culminates in the discovery of a finite-state representation of the task, composed of grounded skills that are flexible and reusable, providing robust generalization and transfer in complex, multi-step robotic tasks. These algorithms are tested and evaluated using a PR2 mobile manipulator, showing success on several complex real-world tasks, such as furniture assembly.

Scientific objections to the non-scientific idea that a superintelligence will emerge (and exterminate humans)

Ernest Davis, Ethical guidelines for a superintelligence, Artificial Intelligence, Volume 220, March 2015, Pages 121-124, ISSN 0004-3702, DOI: 10.1016/j.artint.2014.12.003.

Nick Bostrom, in his new book SuperIntelligence, argues that the creation of an artificial intelligence with human-level intelligence will be followed fairly soon by the existence of an almost omnipotent superintelligence, with consequences that may well be disastrous for humanity. He considers that it is therefore a top priority for mankind to figure out how to imbue such a superintelligence with a sense of morality; however, he considers that this task is very difficult. I discuss a number of flaws in his analysis, particularly the viewpoint that implementing ethical behavior is an especially difficult problem in AI research.

Solving the problem of the slow learning rate of reinforcement learning by learning the transition model from data

Deisenroth, M.P.; Fox, D.; Rasmussen, C.E., Gaussian Processes for Data-Efficient Learning in Robotics and Control, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 2, pp. 408-423, Feb. 2015, DOI: 10.1109/TPAMI.2013.218

Autonomous learning has been a promising direction in control and robotics for more than a decade since data-driven learning allows to reduce the amount of engineering knowledge, which is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this paper, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the art RL our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.

Partially observable reinforcement learning and the problem of representing the history of the learning process efficiently

Doshi-Velez, F.; Pfau, D.; Wood, F.; Roy, N., Bayesian Nonparametric Methods for Partially-Observable Reinforcement Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 2, pp. 394-407, Feb. 2015, DOI: 10.1109/TPAMI.2013.191

Making intelligent decisions from incomplete information is critical in many applications: for example, robots must choose actions based on imperfect sensors, and speech-based interfaces must infer a user’s needs from noisy microphone inputs. What makes these tasks hard is that often we do not have a natural representation with which to model the domain and use for choosing actions; we must learn about the domain’s properties while simultaneously performing the task. Learning a representation also involves trade-offs between modeling the data that we have seen previously and being able to make predictions about new data. This article explores learning representations of stochastic systems using Bayesian nonparametric statistics. Bayesian nonparametric methods allow the sophistication of a representation to scale gracefully with the complexity in the data. Our main contribution is a careful empirical evaluation of how representations learned using Bayesian nonparametric methods compare to other standard learning approaches, especially in support of planning and control. We show that the Bayesian aspects of the methods result in achieving state-of-the-art performance in decision making with relatively few samples, while the nonparametric aspects often result in fewer computations. These results hold across a variety of different techniques for choosing actions given a representation.

Mathematical model of quartz crystal clocks and Kalman Filter estimation for clock synchronization

Giorgi, G., An Event-Based Kalman Filter for Clock Synchronization, IEEE Transactions on Instrumentation and Measurement, vol. 64, no. 2, pp. 449-457, Feb. 2015, DOI: 10.1109/TIM.2014.2340631

The distribution of a time reference has long been a significant research topic in measurement and different solutions have been proposed over the years. In this context, the design of servo clocks plays an important role to get better performances by smoothing the influence of noise sources affecting a synchronization system. A servo clock is asked to provide an adaptive and conservative measure of the time distance between the local clock and the time reference by minimizing, if possible, the energy consumption. In this paper, we propose a servo clock based on an efficient implementation of the Kalman filter (KF), called in the following event-based KF that allows to overcome drawbacks of existing KF-based servo clocks with furthermore a significant reduction of the computational cost. An in-depth analysis of the synchronization uncertainty has been reported to completely characterize the proposed solution; and finally, some guidelines on how to correctly initialize the KF are provided.
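
As a point of reference for the abstract above, here is a minimal sketch of a conventional (periodic) Kalman filter servo clock. It is not the event-based variant proposed in the paper. It assumes the common two-state clock model, in which the offset grows by the skew times the synchronization period and only the offset is measured. The class name, the parameter names and all noise values are illustrative assumptions.

```python
class ClockKalmanFilter:
    """Two-state Kalman filter tracking clock offset and skew (assumed model)."""

    def __init__(self, period, q_offset, q_skew, r_meas, p0=1e6):
        self.T = period                   # time between offset measurements
        self.q = (q_offset, q_skew)       # process noise variances (assumed diagonal)
        self.r = r_meas                   # measurement noise variance
        self.offset = 0.0                 # estimated offset to the time reference
        self.skew = 0.0                   # estimated skew (offset drift per unit time)
        self.P = [[p0, 0.0], [0.0, p0]]   # large initial uncertainty

    def step(self, measured_offset):
        T = self.T
        (p00, p01), (p10, p11) = self.P
        # Predict with F = [[1, T], [0, 1]]: x <- F x, P <- F P F^T + Q
        offset_pred = self.offset + T * self.skew
        skew_pred = self.skew
        a00 = p00 + T * (p01 + p10) + T * T * p11 + self.q[0]
        a01 = p01 + T * p11
        a10 = p10 + T * p11
        a11 = p11 + self.q[1]
        # Update with H = [1, 0]: only the offset is observed
        s = a00 + self.r
        k0, k1 = a00 / s, a10 / s
        innovation = measured_offset - offset_pred
        self.offset = offset_pred + k0 * innovation
        self.skew = skew_pred + k1 * innovation
        self.P = [[(1 - k0) * a00, (1 - k0) * a01],
                  [a10 - k1 * a00, a11 - k1 * a01]]
        return self.offset, self.skew


# Noiseless toy data: offset of 100 (e.g. microseconds) drifting by 20 per period.
kf = ClockKalmanFilter(period=1.0, q_offset=1e-4, q_skew=1e-6, r_meas=1.0)
for k in range(1, 51):
    offset, skew = kf.step(100.0 + 20.0 * k)
```

The large initial covariance `p0` is one simple way to express that nothing is known about the clock at start-up; the paper gives its own initialization guidelines, which this sketch does not reproduce.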

Estimating an empirical distribution from a number of estimates distributed among several agents, minimizing the information exchange between the agents

Sarwate, A.D.; Javidi, T., Distributed Learning of Distributions via Social Sampling, IEEE Transactions on Automatic Control, vol. 60, no. 1, pp. 34-45, Jan. 2015, DOI: 10.1109/TAC.2014.2329611

A protocol for distributed estimation of discrete distributions is proposed. Each agent begins with a single sample from the distribution, and the goal is to learn the empirical distribution of the samples. The protocol is based on a simple message-passing model motivated by communication in social networks. Agents sample a message randomly from their current estimates of the distribution, resulting in a protocol with quantized messages. Using tools from stochastic approximation, the algorithm is shown to converge almost surely. Examples illustrate three regimes with different consensus phenomena. Simulations demonstrate this convergence and give some insight into the effect of network topology.
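
The message-passing idea in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the protocol analyzed in the paper: the averaging update, the 1/(t+1) step size and the function name `social_sampling` are assumptions made for the example. Each agent starts from a one-hot estimate of its own sample, broadcasts one symbol drawn from its current estimate, and mixes the symbols it hears into that estimate.

```python
import random


def social_sampling(samples, neighbors, num_symbols, rounds=200, seed=0):
    """Return each agent's estimated distribution after `rounds` exchanges.

    samples[i]   : the single symbol initially held by agent i
    neighbors[i] : non-empty list of the agents whose messages agent i hears
    """
    rng = random.Random(seed)
    n = len(samples)
    # Each agent begins with a one-hot estimate built from its own sample.
    est = [[1.0 if k == s else 0.0 for k in range(num_symbols)] for s in samples]
    for t in range(1, rounds + 1):
        step = 1.0 / (t + 1)  # decaying step size (assumed schedule)
        # Quantized messages: one symbol per agent, drawn from its estimate.
        msgs = [rng.choices(range(num_symbols), weights=est[i])[0] for i in range(n)]
        new_est = []
        for i in range(n):
            heard = [msgs[j] for j in neighbors[i]]
            avg = [heard.count(k) / len(heard) for k in range(num_symbols)]
            # Move the estimate toward the empirical distribution of heard messages.
            new_est.append([(1 - step) * q + step * a for q, a in zip(est[i], avg)])
        est = new_est
    return est


# Six agents on a ring, each hearing its two neighbours.
ring = [[(i - 1) % 6, (i + 1) % 6] for i in range(6)]
estimates = social_sampling([0, 0, 1, 1, 2, 0], ring, num_symbols=4)
```

Because every message is a single symbol, the communication cost per round is one symbol per agent regardless of how many symbols the distribution has.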

How to bypass the NP-hardness of estimating the best explanation of given data (instantiated as MAP, i.e., Maximum A Posteriori, not as maximum likelihood) in discrete Bayesian networks, by distinguishing relevant from irrelevant variables

Johan Kwisthout, Most frugal explanations in Bayesian networks, Artificial Intelligence, Volume 218, January 2015, Pages 56-73, ISSN 0004-3702, DOI: 10.1016/j.artint.2014.10.001

Inferring the most probable explanation to a set of variables, given a partial observation of the remaining variables, is one of the canonical computational problems in Bayesian networks, with widespread applications in AI and beyond. This problem, known as MAP, is computationally intractable (NP-hard) and remains so even when only an approximate solution is sought. We propose a heuristic formulation of the MAP problem, denoted as Inference to the Most Frugal Explanation (MFE), based on the observation that many intermediate variables (that are neither observed nor to be explained) are irrelevant with respect to the outcome of the explanatory process. An explanation based on few samples (often even a singleton sample) from these irrelevant variables is typically almost as good as an explanation based on (the computationally costly) marginalization over these variables. We show that while MFE is computationally intractable in general (as is MAP), it can be tractably approximated under plausible situational constraints, and its inferences are fairly robust with respect to which intermediate variables are considered to be relevant.
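
For readers who want the abstract above in symbols, a rough formalization follows. The MAP line is the standard definition. The MFE line is only a reading of the abstract for the single-sample case: the split of the intermediate variables into relevant ones ($I^{+}$) and irrelevant ones ($I^{-}$) is taken from the abstract, while the notation and the way $i^{-}$ is drawn are assumptions.

```latex
% H: variables to be explained, E: observed variables (evidence e),
% I: intermediate variables, neither observed nor to be explained.
\[
\mathrm{MAP}(e) \;=\; \operatorname*{arg\,max}_{h} \Pr(h \mid e)
            \;=\; \operatorname*{arg\,max}_{h} \sum_{i} \Pr(h, i, e)
\]
% Assumed notation: I = I^{+} \cup I^{-}; marginalize only over the relevant
% part and fix the irrelevant part to one sampled assignment i^{-}.
\[
\mathrm{MFE}(e) \;\approx\; \operatorname*{arg\,max}_{h} \sum_{i^{+}} \Pr(h, i^{+}, i^{-}, e),
\qquad i^{-} \text{ a sampled assignment to } I^{-}
\]
```

The sum in the second expression ranges over far fewer assignments than the sum in the first whenever most intermediate variables are irrelevant, which is where the computational saving comes from.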

On search as a consequence of the exploration-exploitation trade-off, and as a core element in human cognition

Thomas T. Hills, Peter M. Todd, David Lazer, A. David Redish, Iain D. Couzin, the Cognitive Search Research Group, Exploration versus exploitation in space, mind, and society, Trends in Cognitive Sciences, Volume 19, Issue 1, January 2015, Pages 46-54, ISSN 1364-6613, DOI: 10.1016/j.tics.2014.10.004.

Search is a ubiquitous property of life. Although diverse domains have worked on search problems largely in isolation, recent trends across disciplines indicate that the formal properties of these problems share similar structures and, often, similar solutions. Moreover, internal search (e.g., memory search) shows similar characteristics to external search (e.g., spatial foraging), including shared neural mechanisms consistent with a common evolutionary origin across species. Search problems and their solutions also scale from individuals to societies, underlying and constraining problem solving, memory, information search, and scientific and cultural innovation. In summary, search represents a core feature of cognition, with a vast influence on its evolution and processes across contexts and requiring input from multiple domains to understand its implications and scope.

On the way humans reduce perceptual information during decision making, departing from statistically optimal behavior, in order to deal with the overwhelming sensory flow

Christopher Summerfield, Konstantinos Tsetsos, Do humans make good decisions?, Trends in Cognitive Sciences, Volume 19, Issue 1, January 2015, Pages 27-34, ISSN 1364-6613, DOI: 10.1016/j.tics.2014.11.005

Human performance on perceptual classification tasks approaches that of an ideal observer, but economic decisions are often inconsistent and intransitive, with preferences reversing according to the local context. We discuss the view that suboptimal choices may result from the efficient coding of decision-relevant information, a strategy that allows expected inputs to be processed with higher gain than unexpected inputs. Efficient coding leads to ‘robust’ decisions that depart from optimality but maximise the information transmitted by a limited-capacity system in a rapidly-changing world. We review recent work showing that when perceptual environments are variable or volatile, perceptual decisions exhibit the same suboptimal context-dependence as economic choices, and we propose a general computational framework that accounts for findings across the two domains.

A new, simple method for mobile robot path planning based on particles and inspired by bacterial foraging

Md. Arafat Hossain, Israt Ferdous, Autonomous robot path planning in dynamic environment using a new optimization technique inspired by bacterial foraging technique, Robotics and Autonomous Systems, Volume 64, February 2015, Pages 137-141, ISSN 0921-8890, DOI: 10.1016/j.robot.2014.07.002

Path planning is one of the basic and interesting functions for a mobile robot. This paper explores the application of Bacterial Foraging Optimization to the problem of mobile robot navigation to determine the shortest feasible path to move from any current position to the target position in an unknown environment with moving obstacles. It develops a new algorithm based on Bacterial Foraging Optimization (BFO) technique. This algorithm finds a path towards the target and avoiding the obstacles using particles which are randomly distributed on a circle around a robot. The criterion on which it selects the best particle is the distance to the target and the Gaussian cost function of the particle. Then, a high level decision strategy is used for the selection and thus proceeds for the result. It works on local environment by using a simple robot sensor. So, it is free from having generated additional map which adds cost. Furthermore, it can be implemented without requirement to tuning algorithm and complex calculation. To simulate the algorithm, the program is written in C language and the environment is created by OpenGL. To test the efficiency of the proposed technique, results are compared with Basic Bacterial Foraging Optimization (BFO) and another well-known algorithm called Particle Swarm Optimization (PSO) to give the guarantee that the proposed method gives better and optimal path.
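
The particle-selection step described in the abstract above can be illustrated with a short sketch. It is not the authors' algorithm: the cost (distance to the target plus a weighted sum of Gaussian bumps centred on the obstacles), the function names and every parameter value are assumptions for the example, and the high-level decision strategy mentioned in the abstract is left out.

```python
import math
import random


def gaussian_obstacle_cost(point, obstacles, sigma):
    """Sum of Gaussian bumps centred on each obstacle (assumed cost shape)."""
    return sum(
        math.exp(-((point[0] - ox) ** 2 + (point[1] - oy) ** 2) / (2.0 * sigma ** 2))
        for ox, oy in obstacles
    )


def choose_step(robot, target, obstacles, rng, radius=0.5, n_particles=36,
                sigma=0.7, obstacle_weight=5.0):
    """Scatter particles on a circle around the robot and return the cheapest one."""
    best, best_cost = None, math.inf
    for _ in range(n_particles):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        p = (robot[0] + radius * math.cos(angle), robot[1] + radius * math.sin(angle))
        cost = (math.hypot(target[0] - p[0], target[1] - p[1])
                + obstacle_weight * gaussian_obstacle_cost(p, obstacles, sigma))
        if cost < best_cost:
            best, best_cost = p, cost
    return best


def plan_path(start, target, obstacles, radius=0.5, max_steps=500, seed=0):
    """Repeat the local step until the robot is within one radius of the target."""
    rng = random.Random(seed)
    path = [start]
    while (len(path) <= max_steps
           and math.hypot(target[0] - path[-1][0], target[1] - path[-1][1]) > radius):
        path.append(choose_step(path[-1], target, obstacles, rng, radius=radius))
    return path


# Obstacles are re-read at every step, so moving obstacles could be handled by
# passing their current positions; here they are static for simplicity.
path = plan_path(start=(0.0, 0.0), target=(5.0, 0.0), obstacles=[(2.5, 0.3)])
```

Since each step uses only the obstacles currently visible around the robot, no global map is needed, which matches the local, sensor-based character the abstract claims. A purely greedy rule like this one can stall in a local minimum, so `max_steps` bounds the loop.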