Category Archives: Robotics

More efficient pose-graph optimization by using the cycles (loop closures) in the graph as a basis, and a nice summary of conventional pose-graph optimization

F. Bai, T. Vidal-Calleja and G. Grisetti, Sparse Pose Graph Optimization in Cycle Space, .IEEE Transactions on Robotics, vol. 37, no. 5, pp. 1381-1400, Oct 2021 DOI: 10.1109/TRO.2021.3050328.

The state-of-the-art modern pose-graph optimization (PGO) systems are vertex based. In this context, the number of variables might be high, albeit the number of cycles in the graph (loop closures) is relatively low. For sparse problems particularly, the cycle space has a significantly smaller dimension than the number of vertices. By exploiting this observation, in this article, we propose an alternative solution to PGO that directly exploits the cycle space. We characterize the topology of the graph as a cycle matrix, and reparameterize the problem using relative poses, which are further constrained by a cycle basis of the graph. We show that by using a minimum cycle basis, the cycle-based approach has superior convergence properties against its vertex-based counterpart, in terms of convergence speed and convergence to the global minimum. For sparse graphs, our cycle-based approach is also more time efficient than the vertex-based. As an additional contribution of this work, we present an effective algorithm to compute the minimum cycle basis. Albeit known in computer science, we believe that this algorithm is not familiar to the robotics community. All the claims are validated by experiments on both standard benchmarks and simulated datasets. To foster the reproduction of the results, we provide a complete open-source C++ implementation 1 of our approach.

Learning rewards from diverse human sources

Bıyık E, Losey DP, Palan M, Landolfi NC, Shevchuk G, Sadigh D., Learning reward functions from diverse sources of human feedback: Optimally integrating demonstrations and preferences, . The International Journal of Robotics Research. 2022;41(1):45-67 DOI: 10.1177/02783649211041652.

Reward functions are a common way to specify the objective of a robot. As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers. Importantly, data from human teachers can be collected either passively or actively in a variety of forms: passive data sources include demonstrations (e.g., kinesthetic guidance), whereas preferences (e.g., comparative rankings) are actively elicited. Prior research has independently applied reward learning to these different data sources. However, there exist many domains where multiple sources are complementary and expressive. Motivated by this general problem, we present a framework to integrate multiple sources of information, which are either passively or actively collected from human users. In particular, we present an algorithm that first utilizes user demonstrations to initialize a belief about the reward function, and then actively probes the user with preference queries to zero-in on their true reward. This algorithm not only enables us combine multiple data sources, but it also informs the robot when it should leverage each type of information. Further, our approach accounts for the human’s ability to provide data: yielding user-friendly preference queries which are also theoretically optimal. Our extensive simulated experiments and user studies on a Fetch mobile manipulator demonstrate the superiority and the usability of our integrated framework..

A MPC-based (non-POMDP) approach to sequential decision planning with partial observability in continuous time and space

Nishimura H, Schwager M., SACBP: Belief space planning for continuous-time dynamical systems via stochastic sequential action control, . The International Journal of Robotics Research. 2021;40(10-11):1167-1195 DOI: 10.1177/02783649211037697.

We propose a novel belief space planning technique for continuous dynamics by viewing the belief system as a hybrid dynamical system with time-driven switching. Our approach is based on the perturbation theory of differential equations and extends sequential action control to stochastic dynamics. The resulting algorithm, which we name SACBP, does not require discretization of spaces or time and synthesizes control signals in near real-time. SACBP is an anytime algorithm that can handle general parametric Bayesian filters under certain assumptions. We demonstrate the effectiveness of our approach in an active sensing scenario and a model-based Bayesian reinforcement learning problem. In these challenging problems, we show that the algorithm significantly outperforms other existing solution techniques including approximate dynamic programming and local trajectory optimization.

Dropping laser scans for SLAM when they contribute with no relevant information

Kirill Krinkin, Anton Filatov, Correlation filter of 2D laser scans for indoor environment, . Robotics and Autonomous Systems, Volume 142, 2021 DOI: 10.1016/j.robot.2021.103809.

Modern laser SLAM (simultaneous localization and mapping) and structure from motion algorithms face the problem of processing redundant data. Even if a sensor does not move, it still continues to capture scans that should be processed. This paper presents the novel filter that allows dropping 2D scans that bring no new information to the system. Experiments on MIT and TUM datasets show that it is possible to drop more than half of the scans. Moreover the paper describes the formulas that enable filter adaptation to a particular robot with known speed and characteristics of lidar. In addition, the indoor corridor detector is introduced that also can be applied to any specific shape of a corridor and sensor.

Including a safety procedure in RL to avoid physical agent problems while learning

Kim Peter Wabersich, Melanie N. Zeilinger, A predictive safety filter for learning-based control of constrained nonlinear dynamical systems, . Automatica, Volume 129, 2021 DOI: 10.1016/j.automatica.2021.109597.

The transfer of reinforcement learning (RL) techniques into real-world applications is challenged by safety requirements in the presence of physical limitations. Most RL methods, in particular the most popular algorithms, do not support explicit consideration of state and input constraints. In this paper, we address this problem for nonlinear systems with continuous state and input spaces by introducing a predictive safety filter, which is able to turn a constrained dynamical system into an unconstrained safe system and to which any RL algorithm can be applied ‘out-of-the-box’. The predictive safety filter receives the proposed control input and decides, based on the current system state, if it can be safely applied to the real system, or if it has to be modified otherwise. Safety is thereby established by a continuously updated safety policy, which is based on a model predictive control formulation using a data-driven system model and considering state and input dependent uncertainties.

Classical task planning at an abstract level for achieving good low level motion planning under uncertainty

Antony Thomas, Fulvio Mastrogiovanni, Marco Baglietto, MPTP: Motion-planning-aware task planning for navigation in belief space, . Robotics and Autonomous Systems, Volume 141, 2021 DOI: 10.1016/j.robot.2021.103786.

We present an integrated Task-Motion Planning (TMP) framework for navigation in large-scale environments. Of late, TMP for manipulation has attracted significant interest resulting in a proliferation of different approaches. In contrast, TMP for navigation has received considerably less attention. Autonomous robots operating in real-world complex scenarios require planning in the discrete (task) space and the continuous (motion) space. In knowledge-intensive domains, on the one hand, a robot has to reason at the highest-level, for example, the objects to procure, the regions to navigate to in order to acquire them; on the other hand, the feasibility of the respective navigation tasks have to be checked at the execution level. This presents a need for motion-planning-aware task planners. In this paper, we discuss a probabilistically complete approach that leverages this task-motion interaction for navigating in large knowledge-intensive domains, returning a plan that is optimal at the task-level. The framework is intended for motion planning under motion and sensing uncertainty, which is formally known as belief space planning. The underlying methodology is validated in simulation, in an office environment and its scalability is tested in the larger Willow Garage world. A reasonable comparison with a work that is closest to our approach is also provided. We also demonstrate the adaptability of our approach by considering a building floor navigation domain. Finally, we also discuss the limitations of our approach and put forward suggestions for improvements and future work.

POMDPs to combine human semantic sensing with robot sensing

Luke Burks, Nisar Ahmed, Ian Loefgren, Luke Barbier, Jeremy Muesing, Jamison McGinley, Sousheel Vunnam, Collaborative human-autonomy semantic sensing through structured POMDP planning, . Robotics and Autonomous Systems, Volume 140, 2021 DOI: 10.1016/j.robot.2021.103753.

Autonomous unmanned systems and robots must be able to actively leverage all available information sources — including imprecise but readily available semantic observations provided by human collaborators. This work develops and validates a novel active collaborative human–machine sensing solution for robotic information gathering and optimal decision making problems, with an example implementation of a dynamic target search scenario. Our approach uses continuous partially observable Markov decision process (CPOMDP) planning to generate vehicle trajectories that optimally exploit imperfect detection data from onboard sensors, as well as semantic natural language observations that can be specifically requested from human sensors. The key innovations are a method for the inclusion of a human querying/sensing model in a CPOMDP based autonomous decision making process, as well as a scalable hierarchical Gaussian mixture model formulation for efficiently solving CPOMDPs with semantic observations in continuous dynamic state spaces. Unlike previous state-of-the-art approaches this allows planning in large, complex, highly segmented environments. Our solution is demonstrated and validated with a real human–robot team engaged in dynamic indoor target search and capture scenarios on a custom testbed..

A hierarchical POMDP system for robot manipulation

Wenrui Zhao, Weidong Chen, Hierarchical POMDP planning for object manipulation in clutter, . Robotics and Autonomous Systems, Volume 139, 2021 DOI: 10.1016/j.robot.2021.103736.

Object manipulation planning in clutter suffers from perception uncertainties due to occlusion, as well as action constraints required by collision avoidance. Partially observable Markov decision process (POMDP) provides a general model for planning under uncertainties. But a manipulation task usually have a large action space, which not only makes task planning intractable but also brings significant motion planning effort to check action feasibility. In this work, a new kind of hierarchical POMDP is presented for object manipulation tasks, in which a brief abstract POMDP is extracted and utilized together with the original POMDP. And a hierarchical belief tree search algorithm is proposed for efficient online planning, which constructs fewer belief nodes by building part of the tree with the abstract POMDP and invokes motion planning fewer times by determining action feasibility with observation function of the abstract POMDP. A learning mechanism is also designed in case there are unknown probabilities in transition and observation functions. This planning framework is demonstrated with an object fetching task and the performance is empirically validated by simulations and experiments.

A hierarchical robot control architecture that supports learning of skills at different levels through “curriculum learning” and an interesting approach to mix behaviours

Suro, F., Ferber, J., Stratulat, T. et al., A hierarchical representation of behaviour supporting open ended development and progressive learning for artificial agents, . Auton Robot 45, 245–264 (2021) DOI: 10.1007/s10514-020-09960-7.

One of the challenging aspects of open ended or lifelong agent development is that the final behaviour for which an agent is trained at a given moment can be an element for the future creation of one, or even several, behaviours of greater complexity, whose purpose cannot be anticipated. In this paper, we present modular influence network design (MIND), an artificial agent control architecture suited to open ended and cumulative learning. The MIND architecture encapsulates sub behaviours into modules and combines them into a hierarchy reflecting the modular and hierarchical nature of complex tasks. Compared to similar research, the main original aspect of MIND is the multi layered hierarchy using a generic control signal, the influence, to obtain an efficient global behaviour. This article shows the ability of MIND to learn a curriculum of independent didactic tasks of increasing complexity covering different aspects of a desired behaviour. In so doing we demonstrate the contributions of MIND to open-ended development: encapsulation into modules allows for the preservation and re-usability of all the skills acquired during the curriculum and their focused retraining, the modular structure serves the evolving topology by easing the coordination of new sensors, actuators and heterogeneous learning structures.

Motion planning with uncertain obstacles is NP-hard

Shimanuki L, Axelrod B., Hardness of Motion Planning with Obstacle Uncertainty in Two Dimensions, . The International Journal of Robotics Research. 2021;40(10-11):1151-1166 DOI: 10.1177/0278364921992787.

We consider the problem of motion planning in the presence of uncertain obstacles, modeled as polytopes with Gaussian-distributed faces (PGDFs). A number of practical algorithms exist for motion planning in the presence of known obstacles by constructing a graph in configuration space, then efficiently searching the graph to find a collision-free path. We show that such an exact algorithm is unlikely to be practical in the domain with uncertain obstacles. In particular, we show that safe 2D motion planning among PGDF obstacles is NP-hard with respect to the number of obstacles, and remains NP-hard after being restricted to a graph. Our reduction is based on a path encoding of MAXQHORNSAT and uses the risk of collision with an obstacle to encode variable assignments and literal satisfactions. This implies that, unlike in the known case, planning under uncertainty is hard, even when given a graph containing the solution. We further show by reduction from 3-SAT that both safe 3D motion planning among PGDF obstacles and the related minimum constraint removal problem remain NP-hard even when restricted to cases where each obstacle overlaps with at most a constant number of other obstacles.