Tag Archives: Task Planning

Extending STRIPS-like symbolic planners with metrical/physical constraints for the domain of robotic manipulation

Caelan Reed Garrett, Tomás Lozano-Pérez, and Leslie Pack Kaelbling, FFRob: Leveraging symbolic planning for efficient task and motion planning, The International Journal of Robotics Research, Vol. 37, Issue 1, pp. 104–136, DOI: 10.1177/0278364917739114.

Mobile manipulation problems involving many objects are challenging to solve due to the high dimensionality and multi-modality of their hybrid configuration spaces. Planners that perform a purely geometric search are prohibitively slow for solving these problems because they are unable to factor the configuration space. Symbolic task planners can efficiently construct plans involving many variables but cannot represent the geometric and kinematic constraints required in manipulation. We present the FFRob algorithm for solving task and motion planning problems. First, we introduce extended action specification (EAS) as a general purpose planning representation that supports arbitrary predicates as conditions. We adapt existing heuristic search ideas for solving STRIPS planning problems, particularly delete-relaxations, to solve EAS problem instances. We then apply the EAS representation and planners to manipulation problems, resulting in FFRob. FFRob iteratively discretizes task and motion planning problems using batch sampling of manipulation primitives and a multi-query roadmap structure that can be conditionalized to evaluate reachability under different placements of movable objects. This structure enables the EAS planner to efficiently compute heuristics that incorporate geometric and kinematic planning constraints to give a tight estimate of the distance to the goal. Additionally, we show FFRob is probabilistically complete and has a finite expected runtime. Finally, we empirically demonstrate FFRob’s effectiveness on complex and diverse task and motion planning tasks including rearrangement planning and navigation among movable objects.
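To make the delete-relaxation idea mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the FFRob or EAS implementation: it uses plain STRIPS-style actions over string facts, and the `Action` type, the `relaxed_layers` function and the toy pick/move/place domain are all hypothetical names chosen for illustration. The heuristic ignores delete effects and counts how many relaxed layers are needed before every goal fact has appeared.

```python
from typing import FrozenSet, Iterable, NamedTuple, Optional


class Action(NamedTuple):
    """A STRIPS-style action (hypothetical toy representation)."""
    name: str
    pre: FrozenSet[str]      # facts that must hold before the action
    add: FrozenSet[str]      # facts the action makes true
    delete: FrozenSet[str]   # facts the action makes false (ignored below)


def relaxed_layers(state: Iterable[str],
                   goal: FrozenSet[str],
                   actions: Iterable[Action]) -> Optional[int]:
    """Count relaxed layers until all goal facts are reached.

    Delete effects are ignored, so the set of reached facts only grows.
    Returns None when the goal is unreachable even in the relaxation.
    """
    actions = list(actions)
    reached = set(state)
    layers = 0
    while not goal <= reached:
        new_facts = set()
        for action in actions:
            if action.pre <= reached:
                new_facts |= action.add
        if new_facts <= reached:   # fixed point: nothing new can be added
            return None
        reached |= new_facts
        layers += 1
    return layers


# Toy domain (an assumption for illustration): carry a block from table A to B.
ACTIONS = [
    Action("pick", frozenset({"robot_at_A", "block_on_A", "hand_empty"}),
           frozenset({"holding"}), frozenset({"block_on_A", "hand_empty"})),
    Action("move", frozenset({"robot_at_A"}),
           frozenset({"robot_at_B"}), frozenset({"robot_at_A"})),
    Action("place", frozenset({"robot_at_B", "holding"}),
           frozenset({"block_on_B", "hand_empty"}), frozenset({"holding"})),
]
INIT = frozenset({"robot_at_A", "block_on_A", "hand_empty"})
GOAL = frozenset({"block_on_B"})

estimate = relaxed_layers(INIT, GOAL, ACTIONS)
```

In this toy domain the real plan needs three actions, while the relaxed estimate is smaller because `pick` and `move` become applicable in the same layer. A relaxation of this kind is cheap to compute, which is why it is useful as a search heuristic.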

Using MDPs when the transition probability matrix is only partially specified, thus moving closer to a model-free approach

Karina V. Delgado, Leliane N. de Barros, Daniel B. Dias, Scott Sanner, Real-time dynamic programming for Markov decision processes with imprecise probabilities, Artificial Intelligence, Volume 230, January 2016, Pages 192-223, ISSN 0004-3702, DOI: 10.1016/j.artint.2015.09.005.

Markov Decision Processes have become the standard model for probabilistic planning. However, when applied to many practical problems, the estimates of transition probabilities are inaccurate. This may be due to conflicting elicitations from experts or insufficient state transition information. The Markov Decision Process with Imprecise Transition Probabilities (MDP-IPs) was introduced to obtain a robust policy where there is uncertainty in the transition. Although a symbolic dynamic programming algorithm for MDP-IPs (called SPUDD-IP) has been proposed that can solve problems with up to 22 state variables, in practice, solving MDP-IP problems is time-consuming. In this paper we propose efficient algorithms for a more general class of MDP-IPs, called Stochastic Shortest Path MDP-IPs (SSP MDP-IPs), that use initial state information to solve complex problems by focusing on reachable states. The (L)RTDP-IP algorithm, a (Labeled) Real Time Dynamic Programming algorithm for SSP MDP-IPs, is proposed together with three different methods for sampling the next state. It is shown here that the convergence of (L)RTDP-IP can be obtained by using any of these three methods, although the Bellman backups for this class of problems prescribe a minimax optimization. As far as we are aware, this is the first asynchronous algorithm for SSP MDP-IPs given in terms of a general set of probability constraints that requires non-linear optimization over imprecise probabilities in the Bellman backup. Our results show up to three orders of magnitude speedup for (L)RTDP-IP when compared with the SPUDD-IP algorithm.

See also:

  • Karina Valdivia Delgado, Scott Sanner, Leliane Nunes de Barros, Efficient solutions to factored MDPs with imprecise transition probabilities, Artif. Intell. 175 (9–10) (2011) 1498–1527.
  • Satia, J. K., and Lave Jr., R. E. 1973. MDPs with uncertain transition probabilities. Operations Research 21:728–740
  • White III, C. C., and El-Deib, H. K. 1994. MDPs with Imprecise Transition Probabilities. Operations Research 42(4):739–749
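The minimax Bellman backup mentioned in the abstract above can be illustrated with a small Python sketch. It deliberately covers only a simpler special case than the paper: each transition probability is bounded by an interval, so the adversarial ("nature") choice can be found greedily instead of by non-linear optimization over general constraints. The function names, the interval representation and the toy numbers are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

Interval = Tuple[float, float]   # (lower bound, upper bound) on a probability


def worst_case_distribution(bounds: Dict[str, Interval],
                            value: Dict[str, float]) -> Dict[str, float]:
    """Pick the distribution inside the intervals that maximizes expected cost.

    Assumes the intervals are consistent (lower bounds sum to at most 1,
    upper bounds to at least 1). Every successor starts at its lower bound;
    the remaining mass goes to the costliest successors first.
    """
    probs = {s: lo for s, (lo, _hi) in bounds.items()}
    slack = 1.0 - sum(probs.values())
    for s in sorted(bounds, key=lambda succ: value[succ], reverse=True):
        lo, hi = bounds[s]
        extra = min(hi - lo, slack)
        probs[s] += extra
        slack -= extra
    return probs


def robust_backup(cost: Dict[str, float],
                  bounds_by_action: Dict[str, Dict[str, Interval]],
                  value: Dict[str, float]) -> float:
    """One minimax backup: the agent minimizes cost, nature maximizes it."""
    best = math.inf
    for action, bounds in bounds_by_action.items():
        probs = worst_case_distribution(bounds, value)
        q = cost[action] + sum(p * value[s] for s, p in probs.items())
        best = min(best, q)
    return best


# Toy numbers (assumptions): current cost-to-go estimates of the successors.
value = {"goal": 0.0, "s1": 4.0, "s2": 10.0}
cost = {"safe": 1.0, "risky": 1.0}
bounds_by_action = {
    "safe": {"goal": (0.5, 0.7), "s1": (0.3, 0.5)},
    "risky": {"goal": (0.6, 0.9), "s2": (0.1, 0.4)},
}

backed_up = robust_backup(cost, bounds_by_action, value)
```

With these numbers the "risky" action has the better nominal chance of reaching the goal, but its worst case is more expensive, so the robust backup prefers "safe". An RTDP-style algorithm would apply such a backup only to states visited along sampled trajectories from the initial state.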

Planning tasks in mobile robots with MDPs that maximize the probability of satisfying user’s requirements specified through temporal logics, with estimation of transition probabilities through simulation only when needed

Jing Wang, Xuchu Ding, Morteza Lahijanian, Ioannis Ch. Paschalidis, and Calin A. Belta, Temporal logic motion control using actor–critic methods, The International Journal of Robotics Research September 2015 34: 1329-1344, first published on May 26, 2015. DOI: 10.1177/0278364915581505.

This paper considers the problem of deploying a robot from a specification given as a temporal logic statement about some properties satisfied by the regions of a large, partitioned environment. We assume that the robot has noisy sensors and actuators and model its motion through the regions of the environment as a Markov decision process (MDP). The robot control problem becomes finding the control policy which maximizes the probability of satisfying the temporal logic task on the MDP. For a large environment, obtaining transition probabilities for each state–action pair, as well as solving the necessary optimization problem for the optimal policy, are computationally intensive. To address these issues, we propose an approximate dynamic programming framework based on a least-squares temporal difference learning method of the actor–critic type. This framework operates on sample paths of the robot and optimizes a randomized control policy with respect to a small set of parameters. The transition probabilities are obtained only when needed. Simulations confirm that convergence of the parameters translates to an approximately optimal policy.
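For intuition about the objective stated in the abstract, the sketch below computes the maximum probability of reaching a set of accepting states by exact value iteration on a tiny MDP. This is the dynamic-programming baseline that becomes too expensive in large environments, not the actor-critic method the paper proposes. The function name, the state and action names, and the assumption that task satisfaction has already been reduced to reaching `accepting` states are all illustrative.

```python
from typing import Dict, List, Set, Tuple

Transitions = Dict[Tuple[str, str], Dict[str, float]]


def max_reach_probability(states: List[str],
                          actions: Dict[str, List[str]],
                          P: Transitions,
                          accepting: Set[str],
                          tol: float = 1e-12,
                          max_sweeps: int = 10_000) -> Dict[str, float]:
    """Value iteration for the maximal probability of reaching `accepting`.

    Starts from 0 outside the accepting set and updates in place, so the
    values converge from below to the maximal reachability probability.
    """
    v = {s: (1.0 if s in accepting else 0.0) for s in states}
    for _ in range(max_sweeps):
        delta = 0.0
        for s in states:
            if s in accepting or not actions.get(s):
                continue   # accepting states stay at 1, dead ends at 0
            best = max(sum(p * v[t] for t, p in P[(s, a)].items())
                       for a in actions[s])
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < tol:
            break
    return v


# Toy environment (assumption): a short risky route and a longer safer one.
states = ["r0", "r1", "goal", "trap"]
actions = {"r0": ["short", "long"], "r1": ["go"], "trap": ["stay"]}
P = {
    ("r0", "short"): {"goal": 0.5, "trap": 0.5},
    ("r0", "long"): {"r1": 1.0},
    ("r1", "go"): {"goal": 0.9, "trap": 0.1},
    ("trap", "stay"): {"trap": 1.0},
}

v = max_reach_probability(states, actions, P, accepting={"goal"})
```

This exact computation needs every transition probability up front, which is the cost the paper avoids by working on sample paths and obtaining probabilities only when needed.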

Nice related work on efficient POMDPs and two novel approaches to reduce their computational cost

Grady, D.K.; Moll, M.; Kavraki, L.E., Extending the Applicability of POMDP Solutions to Robotic Tasks, IEEE Transactions on Robotics, vol. 31, no. 4, pp. 948-961, Aug. 2015, DOI: 10.1109/TRO.2015.2441511

Partially observable Markov decision processes (POMDPs) are used in many robotic task classes from soccer to household chores. Determining an approximately optimal action policy for POMDPs is PSPACE-complete, and the exponential growth of computation time prohibits solving large tasks. This paper describes two techniques to extend the range of robotic tasks that can be solved using a POMDP. Our first technique reduces the motion constraints of a robot and, then, uses state-of-the-art robotic motion planning techniques to respect the true motion constraints at runtime. We then propose a novel task decomposition that can be applied to some indoor robotic tasks. This decomposition transforms a long time horizon task into a set of shorter tasks. We empirically demonstrate the performance gain provided by these two techniques through simulated execution in a variety of environments. Comparing a direct formulation of a POMDP to solving our proposed reductions, we conclude that the techniques proposed in this paper can provide significant enhancement to current POMDP solution techniques, extending the POMDP instances that can be solved to include large continuous-state robotic tasks.

Semantic and syntactic bootstrapped learning for robots, inspired by similar processes in humans, which use language as a scaffolding mechanism to improve learning in unknown situations

Worgotter, F.; Geib, C.; Tamosiunaite, M.; Aksoy, E.E.; Piater, J.; Hanchen Xiong; Ude, A.; Nemec, B.; Kraft, D.; Kruger, N.; Wachter, M.; Asfour, T., Structural Bootstrapping: A Novel, Generative Mechanism for Faster and More Efficient Acquisition of Action-Knowledge, IEEE Transactions on Autonomous Mental Development, vol. 7, no. 2, pp. 140-154, June 2015, DOI: 10.1109/TAMD.2015.2427233.

Humans, but also robots, learn to improve their behavior. Without existing knowledge, learning either needs to be explorative and, thus, slow or, to be more efficient, it needs to rely on supervision, which may not always be available. However, once some knowledge base exists, an agent can make use of it to improve learning efficiency and speed. This happens for our children at the age of around three, when they very quickly begin to assimilate new information by making guided guesses about how it fits their prior knowledge. This is a very efficient generative learning mechanism in the sense that the existing knowledge is generalized into as-yet unexplored, novel domains. So far generative learning has not been employed for robots, and robot learning remains a slow and tedious process. The goal of the current study is to devise for the first time a general framework for a generative process that will improve learning and which can be applied at all different levels of the robot’s cognitive architecture. To this end, we introduce the concept of structural bootstrapping, borrowed and modified from child language acquisition, to define a probabilistic process that uses existing knowledge together with new observations to supplement our robot’s database with missing information about planning-, object-, and action-relevant entities. In a kitchen scenario, we use the example of making batter by pouring and mixing two components and show that the agent can efficiently acquire new knowledge about planning operators, objects, and the required motor pattern for stirring by structural bootstrapping. Some benchmarks are also shown that demonstrate how structural bootstrapping improves performance.

Developmental approach for a robot manipulator that learns in several bootstrapped stages, strongly inspired by infant development

Ugur, E.; Nagai, Y.; Sahin, E.; Oztop, E., Staged Development of Robot Skills: Behavior Formation, Affordance Learning and Imitation with Motionese, IEEE Transactions on Autonomous Mental Development, vol. 7, no. 2, pp. 119-139, June 2015, DOI: 10.1109/TAMD.2015.2426192.

Inspired by infant development, we propose a three-staged developmental framework for an anthropomorphic robot manipulator. In the first stage, the robot is initialized with a basic reach-and-enclose-on-contact movement capability, and discovers a set of behavior primitives by exploring its movement parameter space. In the next stage, the robot exercises the discovered behaviors on different objects, and learns the caused effects, effectively building a library of affordances and associated predictors. Finally, in the third stage, the learned structures and predictors are used to bootstrap complex imitation and action learning with the help of a cooperative tutor. The main contribution of this paper is the realization of an integrated developmental system where the structures emerging from the sensorimotor experience of an interacting real robot are used as the sole building blocks of the subsequent stages that generate increasingly more complex cognitive capabilities. The proposed framework includes a number of common features with infant sensorimotor development. Furthermore, the findings obtained from the self-exploration and motionese-guided human-robot interaction experiments allow us to reason about the underlying mechanisms of simple-to-complex sensorimotor skill progression in human infants.

A new approach to solve POMDP-like problems through gradient descent and optimal control

Vadim Indelman, Luca Carlone, Frank Dellaert, Planning in the continuous domain: A generalized belief space approach for autonomous navigation in unknown environments, The International Journal of Robotics Research, vol. 34 no. 7, pp. 849-882, DOI: 10.1177/0278364914561102.

We investigate the problem of planning under uncertainty, with application to mobile robotics. We propose a probabilistic framework in which the robot bases its decisions on the generalized belief, which is a probabilistic description of its own state and of external variables of interest. The approach naturally leads to a dual-layer architecture: an inner estimation layer, which performs inference to predict the outcome of possible decisions; and an outer decisional layer which is in charge of deciding the best action to undertake. Decision making is entrusted to a model predictive control (MPC) scheme. The formulation is valid for general cost functions and does not discretize the state or control space, enabling planning in the continuous domain. Moreover, it allows relaxing the assumption of maximum likelihood observations: predicted measurements are treated as random variables, and binary random variables are used to model the event that a measurement is actually taken by the robot. We successfully apply our approach to the problem of uncertainty-constrained exploration, in which the robot has to perform tasks in an unknown environment, while maintaining localization uncertainty within given bounds. We present an extensive numerical analysis of the proposed approach and compare it against related work. In practice, our planning approach produces smooth and natural trajectories and is able to impose soft upper bounds on the uncertainty. Finally, we exploit the results of this analysis to identify current limitations and show that the proposed framework can accommodate several desirable extensions.

Survey of Hierarchical Task Planning

Ilche Georgievski, Marco Aiello, HTN planning: Overview, comparison, and beyond, Artificial Intelligence, Volume 222, May 2015, Pages 124-156, ISSN 0004-3702, DOI: 10.1016/j.artint.2015.02.002.

Hierarchies are one of the most common structures used to understand and conceptualise the world. Within the field of Artificial Intelligence (AI) planning, which deals with the automation of world-relevant problems, Hierarchical Task Network (HTN) planning is the branch that represents and handles hierarchies. In particular, the requirement for rich domain knowledge to characterise the world enables HTN planning to be very useful, and also to perform well. However, the history of almost 40 years obfuscates the current understanding of HTN planning in terms of accomplishments, planning models, similarities and differences among hierarchical planners, and its current and objective image. On top of these issues, the ability of hierarchical planning to truly cope with the requirements of real-world applications has been often questioned. As a remedy, we propose a framework-based approach where we first provide a basis for defining different formal models of hierarchical planning, and define two models that comprise a large portion of HTN planners. Second, we provide a set of concepts that helps in interpreting HTN planners from the aspect of their search space. Then, we analyse and compare the planners based on a variety of properties organised in five segments, namely domain authoring, expressiveness, competence, computation and applicability. Furthermore, we select Web service composition as a real-world and current application, and classify and compare the approaches that employ HTN planning to solve the problem of service composition. Finally, we conclude with our findings and present directions for future work. In summary, we provide a novel and comprehensive viewpoint on a core AI planning technique.
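As a rough illustration of what HTN planning does, the following Python sketch decomposes a compound task into primitive operators by depth-first, total-order decomposition. It is a toy under stated assumptions, not any of the planners or formal models compared in the survey: the `htn_plan` function, the operator and method tables, and the delivery domain are hypothetical.

```python
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

# Primitive operators: name -> (preconditions, add effects, delete effects).
Operator = Tuple[Set[str], Set[str], Set[str]]
# Methods: compound task -> ordered list of (precondition, subtasks).
Method = Tuple[Set[str], List[str]]


def htn_plan(state: FrozenSet[str],
             tasks: List[str],
             operators: Dict[str, Operator],
             methods: Dict[str, List[Method]]) -> Optional[List[str]]:
    """Total-order decomposition with backtracking over applicable methods.

    Returns a list of primitive operator names, or None if no decomposition
    of the task list is executable from `state`.
    """
    if not tasks:
        return []
    head, rest = tasks[0], tasks[1:]
    if head in operators:
        pre, add, delete = operators[head]
        if not pre <= state:
            return None
        tail = htn_plan((state - delete) | add, rest, operators, methods)
        return None if tail is None else [head] + tail
    for pre, subtasks in methods.get(head, []):
        if pre <= state:
            result = htn_plan(state, subtasks + rest, operators, methods)
            if result is not None:
                return result
    return None


# Toy delivery domain (assumption for illustration).
OPERATORS: Dict[str, Operator] = {
    "pick": ({"hand_empty", "at_A"}, {"holding"}, {"hand_empty"}),
    "move": ({"at_A"}, {"at_B"}, {"at_A"}),
    "place": ({"holding", "at_B"}, {"delivered", "hand_empty"}, {"holding"}),
}
METHODS: Dict[str, List[Method]] = {
    "deliver": [
        ({"holding"}, ["move", "place"]),        # already carrying the item
        (set(), ["pick", "move", "place"]),      # otherwise pick it up first
    ],
}

plan_from_scratch = htn_plan(frozenset({"hand_empty", "at_A"}),
                             ["deliver"], OPERATORS, METHODS)
plan_when_holding = htn_plan(frozenset({"holding", "at_A"}),
                             ["deliver"], OPERATORS, METHODS)
```

The domain knowledge lives in `METHODS`: the planner never searches over arbitrary action sequences, only over the decompositions the domain author wrote down. This reflects the abstract's point that rich domain knowledge is both a requirement and a source of performance.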