An orientation sensor for robot navigation that uses the sky

Julien Dupeyroux, Stéphane Viollet, Julien R. Serres, An ant-inspired celestial compass applied to autonomous outdoor robot navigation, Robotics and Autonomous Systems, Volume 117, 2019, Pages 40-56, DOI: 10.1016/j.robot.2019.04.007.

Desert ants use the polarization of skylight and a combination of stride and ventral optic flow integration processes to track the nest and food positions when traveling, achieving outstanding performances. Navigation sensors such as global positioning systems and inertial measurement units still have disadvantages such as their low resolution and drift. Taking our inspiration from ants, we developed a 2-pixel celestial compass which computes the heading angle of a mobile robot in the ultraviolet range. The output signals obtained with this optical compass were investigated under various weather and ultraviolet conditions and compared with those obtained on a magnetometer in the vicinity of our laboratory. After being embedded on-board the robot, the sensor was first used to compensate for random yaw disturbances. We then used the compass to keep the Hexabot robot’s heading angle constant in a straight forward walking task over a flat terrain while its walking movements were imposing yaw disturbances. Experiments performed under various meteorological conditions showed the occurrence of steady state heading angle errors ranging from 0.3∘ (with a clear sky) to 2.9∘ (under changeable sky conditions). The compass was also tested under canopies and showed a strong ability to determine the robot’s heading while most of the sky was hidden by the foliage. Lastly, a waterproof, mono-pixel version of the sensor was designed and successfully tested in a preliminary underwater benchmark test. These results suggest this new optical compass shows great precision and reliability in a wide range of outdoor conditions, which makes it highly suitable for autonomous robotic outdoor navigation tasks. A celestial compass and a minimalistic optic flow sensor called M2APix (based on Michaelis–Menten Auto-adaptive Pixels) were therefore embedded on-board our latest insectoid robot called AntBot, to complete the previously mentioned ant-like homing navigation processes. First the robot was displaced manually and made to return to its starting-point on the basis of its absolute knowledge of the coordinates of this point. Lastly, AntBot was tested in fully autonomous navigation experiments, in which it explored its environment and then returned to base using the same sensory modes as those on which desert ants rely. AntBot produced robust, precise localization performances with a homing error as small as 0.7% of the entire trajectory.

Human interaction with the RL process

Celemin, C., Ruiz-del-Solar, J. & Kober, A fast hybrid reinforcement learning framework with human corrective feedback, Auton Robot (2019) 43: 1173, DOI: 10.1007/s10514-018-9786-6.

Reinforcement Learning agents can be supported by feedback from human teachers in the learning loop that guides the learning process. In this work we propose two hybrid strategies of Policy Search Reinforcement Learning and Interactive Machine Learning that benefit from both sources of information, the cost function and the human corrective feedback, for accelerating the convergence and improving the final performance of the learning process. Experiments with simulated and real systems of balancing tasks and a 3 DoF robot arm validate the advantages of the proposed learning strategies: (i) they speed up the convergence of the learning process between 3 and 30 times, saving considerable time during the agent adaptation, and (ii) they allow including non-expert feedback because they have low sensibility to erroneous human advice.

Aligning maps of different modalities, coverage and scale

Gholami Shahbandi, S. & Magnusson M., 2D map alignment with region decomposition, Auton Robot (2019) 43: 1117, DOI: 10.1007/s10514-018-9785-7.

In many applications of autonomous mobile robots the following problem is encountered. Two maps of the same environment are available, one a prior map and the other a sensor map built by the robot. To benefit from all available information in both maps, the robot must find the correct alignment between the two maps. There exist many approaches to address this challenge, however, most of the previous methods rely on assumptions such as similar modalities of the maps, same scale, or existence of an initial guess for the alignment. In this work we propose a decomposition-based method for 2D spatial map alignment which does not rely on those assumptions. Our proposed method is validated and compared with other approaches, including generic data association approaches and map alignment algorithms. Real world examples of four different environments with thirty six sensor maps and four layout maps are used for this analysis. The maps, along with an implementation of the method, are made publicly available online.

A nice (short) survey of deep RL

Matthew Botvinick, Sam Ritter, Jane X. Wang, Zeb Kurth-Nelson, Charles Blundell, Demis Hassabis, Reinforcement Learning, Fast and Slow, Trends in Cognitive Sciences, Volume 23, Issue 5, 2019, Pages 408-422 DOI: 10.1016/j.tics.2019.02.006.

Deep reinforcement learning (RL) methods have driven impressive advances in artificial intelligence in recent years, exceeding human performance in domains ranging from Atari to Go to no-limit poker. This progress has drawn the attention of cognitive scientists interested in understanding human learning. However, the concern has been raised that deep RL may be too sample-inefficient – that is, it may simply be too slow – to provide a plausible model of how humans learn. In the present review, we counter this critique by describing recently developed techniques that allow deep RL to operate more nimbly, solving problems much more quickly than previous methods. Although these techniques were developed in an AI context, we propose that they may have rich implications for psychology and neuroscience. A key insight, arising from these AI methods, concerns the fundamental connection between fast RL and slower, more incremental forms of learning.

Measuring unconscious behaviour

David Soto, Usman Ayub Sheikh, Clive R. Rosenthal, A Novel Framework for Unconscious Processing, Trends in Cognitive Sciences, Volume 23, Issue 5, 2019, Pages 372-376 DOI: 10.1016/j.tics.2019.03.002.

Understanding the distinction between conscious and unconscious cognition remains a priority in psychology and neuroscience. A comprehensive neurocognitive account of conscious awareness will not be possible without a sound framework to isolate and understand unconscious information processing. Here, we provide a brain-based framework that allows the identification of unconscious processes, even with null effects on behaviour.

A survey on HTN planning

Ilche Georgievski, Marco Aiello, HTN planning: Overview, comparison, and beyond, Artificial Intelligence, Volume 222, 2015, Pages 124-156 DOI: 10.1016/j.artint.2015.02.002.

Hierarchies are one of the most common structures used to understand and conceptualise the world. Within the field of Artificial Intelligence (AI) planning, which deals with the automation of world-relevant problems, Hierarchical Task Network (HTN) planning is the branch that represents and handles hierarchies. In particular, the requirement for rich domain knowledge to characterise the world enables HTN planning to be very useful, and also to perform well. However, the history of almost 40 years obfuscates the current understanding of HTN planning in terms of accomplishments, planning models, similarities and differences among hierarchical planners, and its current and objective image. On top of these issues, the ability of hierarchical planning to truly cope with the requirements of real-world applications has been often questioned. As a remedy, we propose a framework-based approach where we first provide a basis for defining different formal models of hierarchical planning, and define two models that comprise a large portion of HTN planners. Second, we provide a set of concepts that helps in interpreting HTN planners from the aspect of their search space. Then, we analyse and compare the planners based on a variety of properties organised in five segments, namely domain authoring, expressiveness, competence, computation and applicability. Furthermore, we select Web service composition as a real-world and current application, and classify and compare the approaches that employ HTN planning to solve the problem of service composition. Finally, we conclude with our findings and present directions for future work. In summary, we provide a novel and comprehensive viewpoint on a core AI planning technique.

Designing robotic architectures by coordinating different modules in a data-flow graphical paradigm

Sebastian Buck, Andreas Zell, CS::APEX: A Framework for Algorithm Prototyping and Experimentation with Robotic Systems. Modeling Perception and High Level Robot Control with Activity Flow Graphs, Journal of Intelligent & Robotic Systems (2019) 94:371–387, DOI: 10.1007/s10846-018-0831-7.

Robotic systems differ drastically in their sensory capabilities, their computational power and their designated tasks. For
efficient algorithm development, however, we need to have a common modeling framework that enables us to generalize and
re-use existing solutions. A modular approach, which is coherent across different platforms, also allows faster prototyping
of new systems, given that existing functionality can be reused from already implemented modules. In this paper we develop
a modeling framework based on data flow graphs that achieves the following goal: We first merge synchronous data flow
and reactive programming into hybrid flow graphs, where we explicitly model synchronous and asynchronous data flow.
Then we transfer concepts from finite-state machines to achieve a coherent framework which we call Activity Flow Graphs.
The flow of activity enables us to model high level states directly in the data flow graph. The result is a single computation
graph that can express both perception and high level control aspects of any robotic system. This theoretical foundation is
the core of our open-source software framework CS::APEX, which allows the creation, manipulation and evaluation of
Activity Flow Graphs and enables rapid prototyping and experimentation and can be used with any robot supporting the
Robot Operating System (ROS). We then demonstrate the framework with two high level models for a fetch-and-delivery
robot and a person following robot.

Learning multiple-factors metrics for measuring the similarity between objects

H. Ye, D. Zhan, Y. Jiang and Z. Zhou, What Makes Objects Similar: A Unified Multi-Metric Learning Approach, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 5, pp. 1257-1270, DOI: 10.1109/TPAMI.2018.2829192.

Linkages are essentially determined by similarity measures that may be derived from multiple perspectives. For example, spatial linkages are usually generated based on localities of heterogeneous data. Semantic linkages, however, can come from even more properties, such as different physical meanings behind social relations. Many existing metric learning models focus on spatial linkages but leave the rich semantic factors unconsidered. We propose a Unified Multi-Metric Learning (Um$^2$2l) framework to exploit multiple types of metrics with respect to overdetermined similarities between linkages. In Um$^2$2l, types of combination operators are introduced for distance characterization from multiple perspectives, and thus can introduce flexibilities for representing and utilizing both spatial and semantic linkages. Besides, we propose a uniform solver for Um$^2$2l, and the theoretical analysis reflects the generalization ability of Um$^2$2l as well. Extensive experiments on diverse applications exhibit the superior classification performance and comprehensibility of Um$^2$2l. Visualization results also validate its ability to physical meanings discovery.

Taking into account the influence of a recommender in the change of behaviour of the agent using it

Jonathan P. Epperlein, Sergiy Zhuk, Robert Shorten, Recovering Markov models from closed-loop data, Automatica, Volume 103, 2019, Pages 116-125, DOI: 10.1016/j.automatica.2019.01.022.

Situations in which recommender systems are used to augment decision making are becoming prevalent in many application domains. Almost always, these prediction tools (recommenders) are created with a view to affecting behavioural change. Clearly, successful applications actuating behavioural change, affect the original model underpinning the predictor, leading to an inconsistency. This feedback loop is often not considered in standard machine learning techniques which rely upon machine learning/statistical learning machinery. The objective of this paper is to develop tools that recover unbiased user models in the presence of recommenders. More specifically, we assume that we observe a time series which is a trajectory of a Markov chain R modulated by another Markov chain S, i.e. the transition matrix of R is unknown and depends on the current state of S. The transition matrix of the latter is also unknown. In other words, at each time instant, S selects a transition matrix for R within a given set which consists of known and unknown matrices. The state of S, in turn, depends on the current state of R thus introducing a feedback loop. We propose an Expectation–Maximisation (EM) type algorithm, which estimates the transition matrices of S and R. Experimental results are given to demonstrate the efficacy of the approach.

Predicting the structure of indoor environments for mobile robots

Matteo Luperto, Francesco Amigoni, Predicting the global structure of indoor environments: A constructive machine learning approach, Autonomous Robots, April 2019, Volume 43, Issue 4, pp 813–835, DOI: 10.1007/s10514-018-9732-7.

Consider a mobile robot exploring an initially unknown school building and assume that it has already discovered some corridors, classrooms, offices, and bathrooms. What can the robot infer about the presence and the locations of other classrooms and offices and, more generally, about the structure of the rest of the building? This paper presents a system that makes a step towards providing an answer to the above question. The proposed system is based on a generative model that is able to represent the topological structures and the semantic labeling schemas of buildings and to generate plausible hypotheses for unvisited portions of these environments. We represent the buildings as undirected graphs, whose nodes are rooms and edges are physical connections between them. Given an initial knowledge base of graphs, our approach, relying on constructive machine learning techniques, segments each graph for finding significant subgraphs and clusters them according to their similarity, which is measured using graph kernels. A graph representing a new building or an unvisited part of a building is eventually generated by sampling subgraphs from clusters and connecting them.