Category Archives: Artificial Intelligence

A new variant of A* that is more computationally efficient

Adam Niewola, Leszek Podsedkowski, L* Algorithm—A Linear Computational Complexity Graph Searching Algorithm for Path Planning, Journal of Intelligent & Robotic Systems, September 2018, Volume 91, Issue 3–4, pp 425–444, DOI: 10.1007/s10846-017-0748-6.

The state-of-the-art graph searching algorithm applied to the optimal global path planning problem for mobile robots is the A* algorithm with the heap structured open list. In this paper, we present a novel algorithm, called the L* algorithm, which can be applied to global path planning and is faster than the A* algorithm. The structure of the open list with the use of bidirectional sublists (buckets) ensures the linear computational complexity of the L* algorithm because the nodes in the current bucket can be processed in any sequence and it is not necessary to sort the bucket. Our approach can maintain the optimality and linear computational complexity with the use of the cost expressed by floating-point numbers. The paper presents the requirements of the L* algorithm use and the proof of the admissibility of this algorithm. The experiments confirmed that the L* algorithm is faster than the A* algorithm in various path planning scenarios. We also introduced a method of estimating the execution time of the A* and the L* algorithm. The method was compared with the experimental results.

A survey on decision making for multiagent systems, including multirobot systems

Y. Rizk, M. Awad and E. W. Tunstel, Decision Making in Multiagent Systems: A Survey, IEEE Transactions on Cognitive and Developmental Systems, vol. 10, no. 3, pp. 514-529, DOI: 10.1109/TCDS.2018.2840971.

Intelligent transport systems, efficient electric grids, and sensor networks for data collection and analysis are some examples of the multiagent systems (MAS) that cooperate to achieve common goals. Decision making is an integral part of intelligent agents and MAS that will allow such systems to accomplish increasingly complex tasks. In this survey, we investigate state-of-the-art work within the past five years on cooperative MAS decision making models, including Markov decision processes, game theory, swarm intelligence, and graph theoretic models. We survey algorithms that result in optimal and suboptimal policies such as reinforcement learning, dynamic programming, evolutionary computing, and neural networks. We also discuss the application of these models to robotics, wireless sensor networks, cognitive radio networks, intelligent transport systems, and smart electric grids. In addition, we define key terms in the area and discuss remaining challenges that include incorporating big data advancements to decision making, developing autonomous, scalable and computationally efficient algorithms, tackling more complex tasks, and developing standardized evaluation metrics. While recent surveys have been published on this topic, we present a broader discussion of related models and applications.Note to Practitioners:Future smart cities will rely on cooperative MAS that make decisions about what actions to perform that will lead to the completion of their tasks. Decision making models and algorithms have been developed and reported in the literature to generate such sequences of actions. These models are based on a wide variety of principles including human decision making and social animal behavior. In this paper, we survey existing decision making models and algorithms that generate optimal and suboptimal sequences of actions. We also discuss some of the remaining challenges faced by the research community before more effective MAS deployment can be achieved in this age of Internet of Things, robotics, and mobile devices. These challenges include developing more scalable and efficient algorithms, utilizing the abundant sensory data available, tackling more complex tasks, and developing evaluation standards for decision making.

Interpreting time series patterns through reasoning

T. Teijeiro, P. Félix, On the adoption of abductive reasoning for time series interpretation, Artificial Intelligence, Volume 262, 2018, Pages 163-188, DOI: 10.1016/j.artint.2018.06.005.

Time series interpretation aims to provide an explanation of what is observed in terms of its underlying processes. The present work is based on the assumption that the common classification-based approaches to time series interpretation suffer from a set of inherent weaknesses, whose ultimate cause lies in the monotonic nature of the deductive reasoning paradigm. In this document we propose a new approach to this problem, based on the initial hypothesis that abductive reasoning properly accounts for the human ability to identify and characterize the patterns appearing in a time series. The result of this interpretation is a set of conjectures in the form of observations, organized into an abstraction hierarchy and explaining what has been observed. A knowledge-based framework and a set of algorithms for the interpretation task are provided, implementing a hypothesize-and-test cycle guided by an attentional mechanism. As a representative application domain, interpretation of the electrocardiogram allows us to highlight the strengths of the proposed approach in comparison with traditional classification-based approaches.

Relation between optimization and reinforcement learning

Megumi Miyashita, Shiro Yano, Toshiyuki Kondo Mirror descent search and its acceleration, Robotics and Autonomous Systems, Volume 106, 2018, Pages 107-116 DOI: 10.1016/j.robot.2018.04.009.

In recent years, attention has been focused on the relationship between black-box optimization problem and reinforcement learning problem. In this research, we propose the Mirror Descent Search (MDS) algorithm which is applicable both for black box optimization problems and reinforcement learning problems. Our method is based on the mirror descent method, which is a general optimization algorithm. The contribution of this research is roughly twofold. We propose two essential algorithms, called MDS and Accelerated Mirror Descent Search (AMDS), and two more approximate algorithms: Gaussian Mirror Descent Search (G-MDS) and Gaussian Accelerated Mirror Descent Search (G-AMDS). This research shows that the advanced methods developed in the context of the mirror descent research can be applied to reinforcement learning problem. We also clarify the relationship between an existing reinforcement learning algorithm and our method. With two evaluation experiments, we show our proposed algorithms converge faster than some state-of-the-art methods.

Adapting inverse reinforcement learning for including the risk-aversion of the agent

Sumeet Singh, Jonathan Lacotte, Anirudha Majumdar, and Marco Pavone, Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods , The International Journal of Robotics Research First Published May 22, 2018 DOI: 10.1177/0278364918772017.

The literature on inverse reinforcement learning (IRL) typically assumes that humans take actions to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive (RS) IRL to explicitly account for a human’s risk sensitivity. To this end, we propose a flexible class of models based on coherent risk measures, which allow us to capture an entire spectrum of risk preferences from risk neutral to worst case. We propose efficient non-parametric algorithms based on linear programming and semi-parametric algorithms based on maximum likelihood for inferring a human’s underlying risk measure and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with 10 human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk averse to risk neutral in a data-efficient manner. Moreover, comparisons of the RS-IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively, especially in scenarios where catastrophic outcomes such as collisions can occur.

Multi-agent reinfocerment learning for working with high-dimensional spaces

David L. Leottau, Javier Ruiz-del-Solar, Robert Babuška, Decentralized Reinforcement Learning of Robot Behaviors, Artificial Intelligence, Volume 256, 2018, Pages 130-159, DOI: 10.1016/j.artint.2017.12.001.

A multi-agent methodology is proposed for Decentralized Reinforcement Learning (DRL) of individual behaviors in problems where multi-dimensional action spaces are involved. When using this methodology, sub-tasks are learned in parallel by individual agents working toward a common goal. In addition to proposing this methodology, three specific multi agent DRL approaches are considered: DRL-Independent, DRL Cooperative-Adaptive (CA), and DRL-Lenient. These approaches are validated and analyzed with an extensive empirical study using four different problems: 3D Mountain Car, SCARA Real-Time Trajectory Generation, Ball-Dribbling in humanoid soccer robotics, and Ball-Pushing using differential drive robots. The experimental validation provides evidence that DRL implementations show better performances and faster learning times than their centralized counterparts, while using less computational resources. DRL-Lenient and DRL-CA algorithms achieve the best final performances for the four tested problems, outperforming their DRL-Independent counterparts. Furthermore, the benefits of the DRL-Lenient and DRL-CA are more noticeable when the problem complexity increases and the centralized scheme becomes intractable given the available computational resources and training time.

Survey of the modelling of agents (intentions, goals, etc.)

Stefano V. Albrecht, Peter Stone, Autonomous agents modelling other agents: A comprehensive survey and open problems, Artificial Intelligence,
Volume 258, 2018, Pages 66-95, DOI: 10.1016/j.artint.2018.01.002.

Much research in artificial intelligence is concerned with the development of autonomous agents that can interact effectively with other agents. An important aspect of such agents is the ability to reason about the behaviours of other agents, by constructing models which make predictions about various properties of interest (such as actions, goals, beliefs) of the modelled agents. A variety of modelling approaches now exist which vary widely in their methodology and underlying assumptions, catering to the needs of the different sub-communities within which they were developed and reflecting the different practical uses for which they are intended. The purpose of the present article is to provide a comprehensive survey of the salient modelling methods which can be found in the literature. The article concludes with a discussion of open problems which may form the basis for fruitful future research.

Using interactive reinforcement learning with the advisor being another reinforcement learning agent

Francisco Cruz, Sven Magg, Yukie Nagai & Stefan Wermter, Improving interactive reinforcement learning: What makes a good teacher?, Connection Science, DOI: 10.1080/09540091.2018.1443318.

Interactive reinforcement learning (IRL) has become an important apprenticeship approach to speed up convergence in classic reinforcement learning (RL) problems. In this regard, a variant of IRL is policy shaping which uses a parent-like trainer to propose the next action to be performed and by doing so reduces the search space by advice. On some occasions, the trainer may be another artificial agent which in turn was trained using RL methods to afterward becoming an advisor for other learner-agents. In this work, we analyse internal representations and characteristics of artificial agents to determine which agent may outperform others to become a better trainer-agent. Using a polymath agent, as compared to a specialist agent, an advisor leads to a larger reward and faster convergence of the reward signal and also to a more stable behaviour in terms of the state visit frequency of the learner-agents. Moreover, we analyse system interaction parameters in order to determine how influential they are in the apprenticeship process, where the consistency of feedback is much more relevant when dealing with different learner obedience parameters.

Using memory of past input data to improve the convergence of NN when faced with small samples

Zhang, S., Huang, K., Zhang, R. et al., Learning from Few Samples with Memory Network, Cogn Comput (2018) 10: 15, DOI: 10.1007/s12559-017-9507-z.

Neural networks (NN) have achieved great successes in pattern recognition and machine learning. However, the success of a NN usually relies on the provision of a sufficiently large number of data samples as training data. When fed with a limited data set, a NN’s performance may be degraded significantly. In this paper, a novel NN structure is proposed called a memory network. It is inspired by the cognitive mechanism of human beings, which can learn effectively, even from limited data. Taking advantage of the memory from previous samples, the new model achieves a remarkable improvement in performance when trained using limited data. The memory network is demonstrated here using the multi-layer perceptron (MLP) as a base model. However, it would be straightforward to extend the idea to other neural networks, e.g., convolutional neural networks (CNN). In this paper, the memory network structure is detailed, the training algorithm is presented, and a series of experiments are conducted to validate the proposed framework. Experimental results show that the proposed model outperforms traditional MLP-based models as well as other competitive algorithms in response to two real benchmark data sets.

An interesting soft-partition method based on hierarchical graphs (trees, actually) applied to topic detection in documents

Peixian Chen, Nevin L. Zhang, Tengfei Liu, Leonard K.M. Poon, Zhourong Chen, Farhan Khawar, Latent tree models for hierarchical topic detection, Artificial Intelligence, Volume 250, 2017, Pages 105-124, DOI: 10.1016/j.artint.2017.06.004.

We present a novel method for hierarchical topic detection where topics are obtained by clustering documents in multiple ways. Specifically, we model document collections using a class of graphical models called hierarchical latent tree models (HLTMs). The variables at the bottom level of an HLTM are observed binary variables that represent the presence/absence of words in a document. The variables at other levels are binary latent variables that represent word co-occurrence patterns or co-occurrences of such patterns. Each latent variable gives a soft partition of the documents, and document clusters in the partitions are interpreted as topics. Latent variables at high levels of the hierarchy capture long-range word co-occurrence patterns and hence give thematically more general topics, while those at low levels of the hierarchy capture short-range word co-occurrence patterns and give thematically more specific topics. In comparison with LDA-based methods, a key advantage of the new method is that it represents co-occurrence patterns explicitly using model structures. Extensive empirical results show that the new method significantly outperforms the LDA-based methods in term of model quality and meaningfulness of topics and topic hierarchies.