Category Archives: Robot Task Planning

On the need for replanning in POMDPs applied to real systems, due to imperfect sensing and the computational cost of online planning

Ali-akbar Agha-mohammadi et al., SLAP: Simultaneous Localization and Planning Under Uncertainty via Dynamic Replanning in Belief Space, IEEE Transactions on Robotics, vol. 34, no. 5, DOI: 10.1109/TRO.2018.2838556.

Simultaneous localization and planning (SLAP) is a crucial ability for an autonomous robot operating under uncertainty. In its most general form, SLAP induces a continuous partially observable Markov decision process (POMDP), which needs to be repeatedly solved online. This paper addresses this problem and proposes a dynamic replanning scheme in belief space. The underlying POMDP, which is continuous in state, action, and observation space, is approximated offline via sampling-based methods, but operates in a replanning loop online to admit local improvements to the coarse offline policy. This construct enables the proposed method to combat changing environments and large localization errors, even when the change alters the homotopy class of the optimal trajectory. It further outperforms the state-of-the-art Feedback-based Information RoadMap (FIRM) method by eliminating unnecessary stabilization steps. Applying belief space planning to physical systems brings with it a plethora of challenges. A key focus of this paper is to implement the proposed planner on a physical robot and show the SLAP solution performance under uncertainty, in changing environments and in the presence of large disturbances, such as a kidnapped robot situation.
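
To make the "plan coarsely offline, replan locally online" pattern concrete, here is a purely illustrative sketch of such a loop in belief space; every class and method name (GaussianBelief, BeliefRoadmap, robot.filter_update, ...) is an assumption for illustration, not the authors' FIRM-based implementation.

```python
# Illustrative sketch of dynamic replanning in belief space: a coarse offline
# roadmap policy is re-queried online so large localization errors or map
# changes can redirect the robot. All names are hypothetical.
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianBelief:
    mean: np.ndarray   # estimated robot state
    cov: np.ndarray    # localization uncertainty

class BeliefRoadmap:
    """Coarse policy computed offline by sampling nodes/edges in belief space."""
    def best_edge_from(self, belief):
        # Return the roadmap edge (local controller) with lowest cost-to-go
        # from the region of belief space containing `belief`.
        raise NotImplementedError

def slap_loop(belief, roadmap, robot, goal, replan_period=10):
    """Follow the offline policy, but re-query the roadmap every few steps so
    that disturbances (even ones altering the homotopy class of the best path)
    can be absorbed without global replanning from scratch."""
    step = 0
    while not robot.at_goal(belief, goal):
        if step % replan_period == 0:
            edge = roadmap.best_edge_from(belief)   # local online improvement
        u = edge.controller(belief)                 # belief feedback control
        z = robot.execute_and_sense(u)              # act, then observe
        belief = robot.filter_update(belief, u, z)  # e.g. an EKF update
        step += 1
    return belief
```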

Considering the robot and all the intermediate objects that participate in the manipulation of another object as an MDP

Yilun Zhou, Benjamin Burchfiel, George Konidaris, Representing, learning, and controlling complex object interactions, Autonomous Robots, Volume 42, Issue 7, pp 1355–1367, DOI: 10.1007/s1051.

We present a framework for representing scenarios with complex object interactions, where a robot cannot directly interact with the object it wishes to control and must instead influence it via intermediate objects. For instance, a robot learning to drive a car can only change the car’s pose indirectly via the steering wheel, and must represent and reason about the relationship between its own grippers and the steering wheel, and the relationship between the steering wheel and the car. We formalize these interactions as chains and graphs of Markov decision processes (MDPs) and show how such models can be learned from data. We also consider how they can be controlled given known or learned dynamics. We show that our complex model can be collapsed into a single MDP and solved to find an optimal policy for the combined system. Since the resulting MDP may be very large, we also introduce a planning algorithm that efficiently produces a potentially suboptimal policy. We apply these models to two systems in which a robot uses learning from demonstration to achieve indirect control: playing a computer game using a joystick, and using a hot water dispenser to heat a cup of water.
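
A minimal sketch of the "collapse the chain into a single MDP" idea, under the simplifying assumption that for every link after the first, the action set is the state set of the preceding link (gripper drives steering wheel, steering wheel drives car); this is an illustration of the concept, not the authors' code.

```python
# Collapse a chain of MDPs M1 -> M2 -> ... -> Mk into one product MDP.
# Assumption (for illustration): the new state of each link is what drives
# the next link within the same time step.
import itertools

def collapse_chain(chain):
    """chain: list of dicts with keys 'states' (list), 'actions' (list) and
    'T' (dict mapping (state, action) -> next state)."""
    states = list(itertools.product(*[m['states'] for m in chain]))
    actions = chain[0]['actions']        # only the first link is directly actuated
    def T(joint_state, a):
        next_state, drive = [], a        # the robot's action drives link 0
        for m, s in zip(chain, joint_state):
            s_next = m['T'][(s, drive)]
            next_state.append(s_next)
            drive = s_next               # this link's new state drives the next link
        return tuple(next_state)
    return states, actions, T
```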

Distributing a neural network among the robots of a swarm

Michael Otte, An emergent group mind across a swarm of robots: Collective cognition and distributed sensing via a shared wireless neural network, The International Journal of Robotics Research, DOI: 10.1177/0278364918779704.

We pose the “trained-at-runtime heterogeneous swarm response problem,” in which a swarm of robots must do the following three things: (1) Learn to differentiate between multiple classes of environmental feature patterns (where the feature patterns are distributively sensed across all robots in the swarm). (2) Perform the particular collective behavior that is the appropriate response to the feature pattern that the swarm recognizes in the environment at runtime (where a collective behavior is defined by a mapping of robot actions to robots). (3) The data required for both (1) and (2) is uploaded to the swarm after it has been deployed, i.e., also at runtime (the data required for (1) is the specific environmental feature patterns that the swarm should learn to differentiate between, and the data required for (2) is the mapping from feature classes to swarm behaviors). To solve this problem, we propose a new form of emergent distributed neural network that we call an “artificial group mind.” The group mind transforms a robotic swarm into a single meta-computer that can be programmed at runtime. In particular, the swarm-spanning artificial neural network emerges as each robot maintains a slice of neurons and forms wireless neural connections between its neurons and those on nearby robots. The nearby robots are discovered at runtime. Experiments on real swarms containing up to 316 robots demonstrate that the group mind enables collective decision-making based on distributed sensor data, and solves the trained-at-runtime heterogeneous swarm response problem. The group mind is a new tool that can be used to create more complex emergent swarm behaviors. The group mind also enables swarm behaviors to be a function of global patterns observed across the environment—where the patterns are orders of magnitude larger than the robots themselves.
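
The "slice of neurons per robot" idea can be pictured with a toy, single-process simulation like the one below. It is purely illustrative (no real wireless discovery; every robot hears every sensor reading) and is not the paper's implementation.

```python
# Toy simulation of a swarm-spanning network: each robot owns a small slice of
# hidden neurons, and the swarm-wide feature vector is the concatenation of all
# local activations. Names and sizes are arbitrary illustrative choices.
import numpy as np

class RobotSlice:
    """One robot's slice of the group mind: a few neurons whose inputs are the
    sensor readings this robot can hear over wireless."""
    def __init__(self, n_inputs, n_hidden, rng):
        self.W = rng.normal(size=(n_hidden, n_inputs))
    def forward(self, heard_readings):
        return np.tanh(self.W @ heard_readings)

def swarm_forward(robots, sensor_readings):
    """Concatenate every robot's local activations into the swarm-wide feature
    vector on which a collective decision can be based (here every robot hears
    every reading; a real swarm would be limited to nearby robots)."""
    x = np.asarray(sensor_readings, dtype=float)
    return np.concatenate([r.forward(x) for r in robots])

rng = np.random.default_rng(0)
n_robots = 5
robots = [RobotSlice(n_inputs=n_robots, n_hidden=3, rng=rng) for _ in range(n_robots)]
features = swarm_forward(robots, sensor_readings=[0.1, 0.7, 0.3, 0.0, 0.9])
```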

A novel online coverage path planning algorithm with complete-coverage guarantees

J. Song and S. Gupta, ε*: An Online Coverage Path Planning Algorithm, IEEE Transactions on Robotics, vol. 34, no. 2, pp. 526-533, DOI: 10.1109/TRO.2017.2780259.

This paper presents an algorithm, called ε*, for online coverage path planning in unknown environments. The algorithm is built upon the concept of an Exploratory Turing Machine (ETM), which acts as a supervisor to the autonomous vehicle to guide it with adaptive navigation commands. The ETM generates a coverage path online using Multiscale Adaptive Potential Surfaces (MAPS), which are hierarchically structured and dynamically updated based on sensor information. The ε*-algorithm is computationally efficient, guarantees complete coverage, and does not suffer from the local extrema problem. Its performance is validated by 1) high-fidelity simulations on Player/Stage and 2) actual experiments in a laboratory setting on autonomous vehicles.
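
As a hedged, single-scale illustration of the potential-surface idea (the actual algorithm relies on the hierarchical MAPS surfaces and the ETM supervisor, which are not reproduced here): unexplored cells carry positive potential and the vehicle greedily moves to the best neighboring cell.

```python
# Single-scale toy of potential-guided coverage: unexplored cells have positive
# potential, covered cells zero, obstacles negative. Illustrative only.
import numpy as np

UNEXPLORED, COVERED, OBSTACLE = 1.0, 0.0, -1.0

def next_waypoint(potential, pos):
    """Return the 4-connected neighbor with the highest positive potential, or
    None when the neighborhood is exhausted (in the full algorithm this is
    where a coarser level of the multiscale surface takes over)."""
    rows, cols = potential.shape
    best, best_val = None, 0.0
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        r, c = pos[0] + dr, pos[1] + dc
        if 0 <= r < rows and 0 <= c < cols and potential[r, c] > best_val:
            best, best_val = (r, c), potential[r, c]
    return best

grid = np.full((4, 4), UNEXPLORED)
grid[0, 0], grid[1, 1] = COVERED, OBSTACLE
print(next_waypoint(grid, pos=(0, 0)))   # -> (1, 0)
```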

Robot topological navigation

Sergio Miguel-Tomé, Navigation through unknown and dynamic open spaces using topological notions, Connection Science, vol. 30, no. 2, DOI: 10.1080/09540091.2016.1277691.

Until now, most algorithms used for navigation have had the purpose of directing a system towards one point in space. However, humans communicate tasks by specifying spatial relations among elements or places. In addition, the environments in which humans carry out their activities are extremely dynamic. The only option that allows for successful navigation in dynamic and unknown environments is making real-time decisions. Therefore, robots capable of collaborating closely with human beings must be able to make decisions based on the local information registered by the sensors and to interpret and express spatial relations. Furthermore, when a person is asked to perform a task in an environment, the task is communicated in terms of a category of goals, so the person does not need to be supervised. Thus, two problems appear when one wants to create multifunctional robots: how to navigate in dynamic and unknown environments using spatial relations, and how to accomplish this without supervision. In this article, a new architecture to address these two problems is presented, called the topological qualitative navigation architecture. In previous works, a qualitative heuristic called the heuristic of topological qualitative semantics (HTQS) was developed to establish and identify spatial relations. However, that heuristic only allows for establishing one spatial relation with a specific object. In contrast, navigation requires a temporal sequence of goals with different objects. The new architecture attains continuous generation of goals and resolves them using HTQS. Thus, the new architecture achieves autonomous navigation in dynamic or unknown open environments.

POMDPs aware of the data association problem

Shashank Pathak, Antony Thomas, and Vadim Indelman, A unified framework for data association aware robust belief space planning and perception, The International Journal of Robotics Research Vol 37, Issue 2-3, pp. 287 – 315, DOI: 10.1177/0278364918759606.

We develop a belief space planning approach that advances the state of the art by incorporating reasoning about data association within planning, while considering additional sources of uncertainty. Existing belief space planning approaches typically assume that data association is given and perfect, an assumption that can be hard to justify during operation in the presence of localization uncertainty, or in ambiguous and perceptually aliased environments. By contrast, our data association aware belief space planning (DA-BSP) approach explicitly reasons about data association within belief evolution owing to candidate actions, and as such can better accommodate these challenging real-world scenarios. In particular, we show that, owing to perceptual aliasing, a posterior belief can become a mixture of probability distribution functions, and we design cost functions that measure the expected level of ambiguity and posterior uncertainty given a candidate action. Furthermore, we also investigate more challenging situations, such as when the prior belief is multimodal and when data association aware planning is performed over several look-ahead steps. Our framework models the belief as a Gaussian mixture model. Another unique aspect of this approach is that the number of components of this Gaussian mixture model can increase as well as decrease, thereby reflecting reality more accurately. Using these and standard costs (e.g. control penalty, distance to goal) within the objective function yields a general framework that reliably represents action impact and, in particular, is capable of active disambiguation. Our approach is thus applicable both to robust perception in a passive setting with data given a priori, and to an active setting, such as autonomous navigation in perceptually aliased environments. We demonstrate key aspects of DA-BSP in a theoretical example, in a Gazebo-based realistic simulation, and also on a real robotic platform (a Pioneer robot) in an office environment.
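
One way to picture the mixture growth and shrinkage described above is the sketch below (assumed notation, not the authors' code): each Gaussian component is split across the candidate data associations of a new measurement, weighted by the measurement likelihood, and negligible components are pruned.

```python
# Illustrative Gaussian-mixture belief update under ambiguous data association.
# kalman_update and likelihood are assumed callables supplied by the user.
def propagate_mixture(components, measurement, candidate_landmarks,
                      kalman_update, likelihood, prune_below=1e-3):
    """components: list of (weight, mean, cov) triples representing the belief.
    Each component splits across the landmarks the measurement could belong to,
    so the number of components can grow; pruning lets it shrink again."""
    new_components = []
    for w, mu, cov in components:
        for landmark in candidate_landmarks:
            mu_j, cov_j = kalman_update(mu, cov, measurement, landmark)
            w_j = w * likelihood(measurement, mu, cov, landmark)
            new_components.append((w_j, mu_j, cov_j))
    total = sum(w for w, _, _ in new_components) or 1.0
    return [(w / total, mu, cov) for w, mu, cov in new_components
            if w / total > prune_below]
```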

Extending STRIPS-like symbolic planners with geometric and kinematic constraints for the domain of robotic manipulation

Caelan Reed Garrett, Tomás Lozano-Pérez, and Leslie Pack Kaelbling, FFRob: Leveraging symbolic planning for efficient task and motion planning, The International Journal of Robotics Research Vol 37, Issue 1, pp. 104–136, DOI: 10.1177/0278364917739114.

Mobile manipulation problems involving many objects are challenging to solve due to the high dimensionality and multi-modality of their hybrid configuration spaces. Planners that perform a purely geometric search are prohibitively slow for solving these problems because they are unable to factor the configuration space. Symbolic task planners can efficiently construct plans involving many variables but cannot represent the geometric and kinematic constraints required in manipulation. We present the FFRob algorithm for solving task and motion planning problems. First, we introduce extended action specification (EAS) as a general purpose planning representation that supports arbitrary predicates as conditions. We adapt existing heuristic search ideas for solving STRIPS planning problems, particularly delete-relaxations, to solve EAS problem instances. We then apply the EAS representation and planners to manipulation problems resulting in FFRob. FFRob iteratively discretizes task and motion planning problems using batch sampling of manipulation primitives and a multi-query roadmap structure that can be conditionalized to evaluate reachability under different placements of movable objects. This structure enables the EAS planner to efficiently compute heuristics that incorporate geometric and kinematic planning constraints to give a tight estimate of the distance to the goal. Additionally, we show FFRob is probabilistically complete and has a finite expected runtime. Finally, we empirically demonstrate FFRob’s effectiveness on complex and diverse task and motion planning tasks including rearrangement planning and navigation among movable objects.
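
One ingredient mentioned in the abstract, the multi-query roadmap that can be conditionalized on object placements, can be pictured with the sketch below. The precomputed per-edge blocker sets and all names are illustrative assumptions, not FFRob's actual data structures.

```python
# Reachability query on a precomputed roadmap, conditioned on the current
# placements of movable objects: an edge is usable only if none of its
# precomputed blocking placements is currently occupied. Illustrative only.
from collections import deque

def reachable(roadmap_edges, edge_blockers, start, placements):
    """roadmap_edges: dict node -> list of (neighbor, edge_id);
    edge_blockers: dict edge_id -> set of placements that put an object in
    collision with that edge; placements: set of current object placements.
    Returns the set of roadmap nodes reachable under those placements."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr, edge_id in roadmap_edges.get(node, []):
            if nbr not in seen and not (edge_blockers.get(edge_id, set()) & placements):
                seen.add(nbr)
                queue.append(nbr)
    return seen
```

Repeated queries of this kind, under hypothetical placements, are what let a heuristic estimate distance-to-goal while respecting geometric and kinematic constraints.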

Improving the efficiency of decision making with POMDPs in high-dimensional state spaces

Dmitry Kopitkov and Vadim Indelman, No belief propagation required: Belief space planning in high-dimensional state spaces via factor graphs, the matrix determinant lemma, and re-use of calculation, The International Journal of Robotics Research, Vol 36, Issue 10, pp. 1088 – 1130, DOI: 10.1177/0278364917721629.

We develop a computationally efficient approach for evaluating the information-theoretic term within belief space planning (BSP), where during belief propagation the state vector can be constant or augmented. We consider both unfocused and focused problem settings, where uncertainty reduction of the entire system, or only of chosen variables, is of interest, respectively. State-of-the-art approaches typically propagate the belief state, for each candidate action, through calculation of the posterior information (or covariance) matrix and subsequently compute its determinant (required for entropy). In contrast, our approach reduces runtime complexity by avoiding these calculations. We formulate the problem in terms of factor graphs and show that belief propagation is not needed, requiring instead a one-time calculation that depends on the state dimensionality (which increases with time), and per-candidate calculations that are independent of the latter. To that end, we develop an augmented version of the matrix determinant lemma, and show that computations can be re-used when evaluating the impact of different candidate actions. These two key ingredients and the factor graph representation of the problem result in a computationally efficient (augmented) BSP approach that accounts for different sources of uncertainty and can be used with various sensing modalities. We examine the unfocused and focused instances of our approach, and compare it with the state of the art, in simulation and using real-world data, considering problems such as autonomous navigation in unknown environments, measurement selection and sensor deployment. We show that our approach significantly reduces running time without any compromise in performance.
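
The core trick is easiest to see with the plain (non-augmented) matrix determinant lemma, det(Λ + AᵀA) = det(Λ)·det(I + AΛ⁻¹Aᵀ): the change in log-determinant (and hence entropy) caused by a candidate action's new factors, whose stacked Jacobian A has only m rows, can be evaluated from a small m×m matrix instead of the full n×n posterior, and the Λ-dependent quantities can be computed once and re-used across candidates. Below is a quick numerical check of that standard identity (not the paper's augmented variant).

```python
# Numerical check of the matrix determinant lemma used to avoid belief
# propagation: det(Lambda + A^T A) = det(Lambda) * det(I + A Lambda^{-1} A^T).
import numpy as np

rng = np.random.default_rng(1)
n, m = 50, 3                              # full state dim vs. rows added by the candidate
B = rng.normal(size=(n, n))
Lambda = np.eye(n) + 0.01 * B @ B.T       # prior information matrix (positive definite)
A = rng.normal(size=(m, n))               # stacked Jacobian of the candidate's new factors

lhs = np.linalg.slogdet(Lambda + A.T @ A)[1]                              # naive: n x n determinant
rhs = np.linalg.slogdet(Lambda)[1] \
    + np.linalg.slogdet(np.eye(m) + A @ np.linalg.solve(Lambda, A.T))[1]  # lemma: m x m determinant
assert np.isclose(lhs, rhs)
```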

An application of POMDPs to robot surveillance

S. Witwicki et al., Autonomous Surveillance Robots: A Decision-Making Framework for Networked Multiagent Systems, IEEE Robotics & Automation Magazine, vol. 24, no. 3, pp. 52-64, DOI: 10.1109/MRA.2017.2662222.

This article proposes an architecture for an intelligent surveillance system, where the aim is to mitigate the burden on humans in conventional surveillance systems by incorporating intelligent interfaces, computer vision, and autonomous mobile robots. Central to the intelligent surveillance system is the application of research into planning and decision making in this novel context. In this article, we describe the robot surveillance decision problem and explain how the integration of components in our system supports fully automated decision making. Several concrete scenarios deployed in real surveillance environments exemplify both the flexibility of our system to experiment with different representations and algorithms and the portability of our system into a variety of problem contexts. Moreover, these scenarios demonstrate how planning enables robots to effectively balance surveillance objectives, autonomously performing the job of human patrols and responders.

Constrained MDPs with multiple costs to optimize – a hierarchical approach

Seyedshams Feyzabadi, Stefano Carpin, Planning using hierarchical constrained Markov decision processes, Autonomous Robots, Volume 41, Issue 8, pp 1589–1607, DOI: 10.1007/s10514-017-9630-4.

Constrained Markov decision processes offer a principled method to determine policies for sequential stochastic decision problems where multiple costs are concurrently considered. Although they could be very valuable in numerous robotic applications, to date their use has been quite limited. Among the reasons for their limited adoption is their computational complexity, since policy computation requires the solution of constrained linear programs with an extremely large number of variables. To overcome this limitation, we propose a hierarchical method to solve large problem instances. States are clustered into macro states and the parameters defining the dynamic behavior and the costs of the clustered model are determined using a Monte Carlo approach. We show that the algorithm we propose to create clustered states maintains valuable properties of the original model, like the existence of a solution for the problem. Our algorithm is validated in various planning problems in simulation and on a mobile robot platform, and we experimentally show that the clustered approach significantly outperforms the non-hierarchical solution in terms of computation time, while experiencing only moderate losses in the objective functions.
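
For context, policies of a constrained MDP are commonly obtained from an occupancy-measure linear program of the following form (a standard discounted textbook formulation, which may differ in details from the paper's); its |S|·|A| variables are exactly what make large instances expensive and what the clustering into macro states is meant to shrink.

```latex
% Occupancy-measure LP for a discounted constrained MDP. \rho(s,a) are
% discounted state-action visitation frequencies, c_0 is the cost being
% minimized, c_1..c_K are the constrained costs with bounds d_k, and \mu_0
% is the initial state distribution.
\begin{aligned}
\min_{\rho \ge 0}\quad & \sum_{s,a} \rho(s,a)\, c_0(s,a) \\
\text{s.t.}\quad & \sum_{a} \rho(s',a) \;-\; \gamma \sum_{s,a} P(s' \mid s,a)\, \rho(s,a) \;=\; \mu_0(s') \qquad \forall s', \\
& \sum_{s,a} \rho(s,a)\, c_k(s,a) \;\le\; d_k \qquad k = 1,\dots,K .
\end{aligned}
```

The optimal policy is then recovered as π(a|s) ∝ ρ(s,a).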