TamedPUMA: safe and stable imitation learning with geometric fabrics

Saray Bakker¹ Rodrigo Pérez-Dattari¹ Cosimo Della Santina¹ Wendelin Böhmer² Javier Alonso-Mora¹

¹Mechanical Engineering, ²Electrical Engineering, Mathematics & Computer Science, TU Delft

arXiv Code Video Appendix Related Publications

Supplementary theoretical details on TamedPUMA

A PDF of these supplementary theoretical details can be found HERE.

Energization and Finsler energies

In fabrics^[1], the concept of energization is used to transform the system $\qddot = \h(\q, \qdot)$ into an energy-conserving system. To ensure path-consistency after energization, the system $\qddot = \h(\q, \qdot)$ is designed to be Homogenous of Degree 2 (HD2) which holds if $\h(\q, \alpha\qdot) = \alpha^2 \h(\q, \qdot)$ for $\alpha\geq0$. An energized system that is HD2 follows the same path as the original system and differs only by an acceleration along the direction of motion.

It is common to use a Finsler energy to energize the system, although any Lagrangian can be used. A Finsler energy extends the concept of kinetic energy by enabling the modeling of directionally dependence metric tensors, e.g. the Finsler energy is HD2 in the velocity^[2]. Finsler energies have the property that the Hamiltonian $\mathcal{H}_{\mathcal{L}_e}$ associated with the Finsler energy $\mathcal{L}_e$, is the Finsler energy itself. An energized system that conserves a Finsler energy and is path consistent is called a geometric fabric.

To transform the original system $\qddot = \h(\q, \qdot)$ into a geometric fabric, we find $\bar{\alpha}$ for which the system, $\qddot = energize_{\mathcal{H}_{\mathcal{L}_e}}[\h] = \h(\q, \qdot) + \bar{\alpha}\qdot$, conserves the Finsler energy $\mathcal{L}_e$. This energization is performed by setting the time-derivative of the Hamilonian to zero, \begin{align} \dot{\mathcal{H}}_{\mathcal{L}_e} &= \qdot^{\top}[\vec{M}_{\mathcal{L}_e}\qddot + \vec{\xi}_{\mathcal{L}_e}] = 0,\\ &= \qdot^{\top}[\vec{M}_{\mathcal{L}_e}(\h + \bar{\alpha}\qdot)+ \vec{\xi}_{\mathcal{L}_e}] = 0,\\ \bar{\alpha} &= - \frac{\qdot^{\top}\left( \vec{M}_{\mathcal{L}_e} \h + \vec{\xi}_{\mathcal{L}_e}\right)}{\qdot^{\top}\bf{M}_{\mathcal{L}_e}\qdot}, \end{align} where $\left(\vec{M}_{\mathcal{L}_e}\qddot + \vec{\xi}_{\mathcal{L}_e}\right)$ are the Euler-Lagrange equations of $\mathcal{L}_e$ with $\vec{M}_{\mathcal{L}_e} = \partial^2_{\qdot\qdot}\mathcal{L}_e$ and $\vec{\xi}_{\mathcal{L}_e} = \partial_{\qdot \q}\mathcal{L}_e \qdot - \partial_{\q} \mathcal{L}_e$^[1].

Theoretical details on the Compatible Potential Method

In the following, a theoretical analysis is provided of the stability and convergence properties related to the Compatible Potential Method (CPM). Firstly, more details are provided regarding the stability and convergence proof for a navigation policy $\f = \fC$ with a compatible potential $\psi(\q)$ and the required assumptions. This is followed by an illustration of TamedPUMA with its corresponding relationships between latent space, task spaces and configuration space.

The stability and convergence properties of the CPM are based on Theorem III.5 in Ratliff et al. (2023)^[1]. The Theorem reads as follows, where we correct for two typos, replacing $\mathcal{H}$ with $\mathcal{H}_{\mathcal{L}_e}$ and $\gamma \qdot $ with $\gamma$ in the original description, i.e. \begin{equation} \label{eq: system_CPM} \qddot = energize_{\mathcal{H}_{\mathcal{L}_e}}[\h+\f]+\gamma \qdot \ \ \ \ \text{is replaced by:} \ \ \ \ \qddot = energize_{\mathcal{H}}[\h+\f]+\gamma. \end{equation}

Theorem III.5 adapted from Ratliff et al. (2023)^[1]: Let energize$_{\mathcal{L}_e}[\h(\q, \qdot)]$ be a fabric with generator $\h$ and Finsler energy $\mathcal{L}_e$, and let $\f(\q, \qdot)$ be a navigation policy with compatible potential $\psi(\q)$. Denote the total energy by $\mathcal{H}=\mathcal{L}_e+\psi$. The system \begin{align} \label{eq: qddot_theoremIII5} \qddot &= energize_{\mathcal{H}}[\h+\f] - \beta \qdot\\ &= energize_{\mathcal{H}_{\mathcal{L}_e}}[\h+\f]+\gamma %= \h + \f + \bar{\alpha} + \gamma, \end{align} with energy regulator, \begin{equation} \label{eq: gamma} \gamma(\q, \qdot) = - \left(\frac{\qdot \qdot^{\top}}{\qdot^{\top}M_{\mathcal{L}_e}\qdot}\right)\delta \psi - \beta \qdot, \end{equation} converges to the zero set of $\f$ for $\beta>0$.

The proof of Theorem III.5 consists of two parts: (1) Showing that the system in Eq. \eqref{eq: qddot_theoremIII5} is energy-decreasing and therefore it results in $\qdot \rightarrow \vec{0}$ and $\qddot \rightarrow \vec{0}$ as time goes to infinity. (2) Assuring that when the system converges, it converges to the zero set of the navigation policy $\f$.

Step 1: To ensure that the damped system decreases in energy resulting in $\qdot = \vec{0}$ and $\qddot = \vec{0}$, we first analyze the energy-conservative system, $energize_{\mathcal{H}}[\h +\f]$.

Step 1a: Let's start with finding $\alpha$ for which the system $\qddot = energize_{\mathcal{H}}[\h+\f] = \h + \f + \alpha \qdot$ is energy-conservative, i.e. the derivative of the Hamiltonian is zero, $\dot{\mathcal{H}} = 0$. The total energy is a summation of the Hamiltonian associated with the Finsler energy, $\mathcal{H}_{\mathcal{L}_e}$, and the potential energy $\psi$. The total energy and its derivative therefore become, \begin{align} \mathcal{H} = \mathcal{H}_{\mathcal{L}_e} + \psi, && \dot{\mathcal{H}} = \frac{\partial \q}{\partial t}^{\top} \frac{\partial \mathcal{H}}{\partial \q} = \qdot^{\top}[\vec{M}_{\mathcal{L}_e}\qddot + \vec{\xi}_{\mathcal{L}_e} + \delta \psi]. \end{align} To find $\alpha$ for which $\dot{\mathcal{H}} = 0$, we substitute $\qddot = \h + \f + \alpha \qdot$ into the derivative of the Hamiltonian and set this equal to zero, \begin{align} \dot{\mathcal{H}} &= \qdot^{\top}[\vec{M}_{\mathcal{L}_e}\qddot + \vec{\xi}_{\mathcal{L}_e} + \delta \psi] = 0, \\ &= \qdot^{\top}[\vec{M}_{\mathcal{L}_e} \left( \h + \f + \alpha \qdot \right) + \vec{\xi}_{\mathcal{L}_e} + \delta \psi] = 0, \\ &= \alpha (\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot) + \qdot^{\top}\left( \vec{M}_{\mathcal{L}_e} \h + \vec{\xi}_{\mathcal{L}_e}\right) + \qdot^{\top}\left(\vec{M}_{\mathcal{L}_e} \f + \partial \psi \right) = 0.\\ \alpha &= - \left(\frac{\qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}\right) [ \vec{M}_{\mathcal{L}_e}(\h +\f) + \vec{\xi}_{\mathcal{L}_e} ] - \left(\frac{\qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}\right) \partial \psi. \label{eq: alpha_appendix} \end{align} The expression for $\alpha$ is substituted into the system $\qddot = \h + \f + \alpha \qdot$, to obtain, \begin{align} \label{eq: qddot_conservative} \qddot &= energize_{\mathcal{H}}[\h+\f], \\ &= \h + \f + \alpha \qdot, \\ &= \underbrace{\h + \f \underbrace{- \left(\frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}\right) [ \vec{M}_{\mathcal{L}_e}(\h +\f) + \vec{\xi}_{\mathcal{L}_e} ]}_{\bar{\alpha}}}_{energize_{\mathcal{H}_{\mathcal{L}_e}}[\h+\f]} \underbrace{- \left(\frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}\right) \partial \psi}_{\gamma \text{ with } \beta=0}. \label{eq:} \end{align} Analyzing Eq. \eqref{eq: qddot_conservative}, the system can be split into the energized system of $\qddot = \h +\f$ with the Finsler energy $\mathcal{L}_e$ and corresponding energization vector $\bar{\alpha}$, and the term $\gamma$ where $\beta=0$. By adding damping to Eq. \eqref{eq: qddot_conservative}, we obtain the damped system as represented in Eq. \eqref{eq: qddot_theoremIII5}. In Step 1b, it is proven that the damped system decreases energy and converges to $\dot{\mathcal{H}} \rightarrow 0$, since $\mathcal{H}$ is decreasing and lower bounded, leading to $\qdot \rightarrow \vec{0}$ and $\qddot \rightarrow \vec{0}$ as time goes to infinity.

Step 1b: Damping is added to Eq. \eqref{eq: qddot_conservative} via $-\beta \qdot$ with $\beta>0$, \begin{equation} \qddot = \h + \f + \alpha \qdot - \beta \qdot. \end{equation} As the derivative of the Hamiltonian is zero for $\beta=0$, this leads to the following derivative of the Hamiltonian for the damped system, \begin{align} \label{eq: Hdot_damped} \dot{\mathcal{H}} &= \qdot^{\top}[\vec{M}_{\mathcal{L}_e}\qddot + \vec{\xi}_{\mathcal{L}_e} + \delta \psi] - \beta \qdot^{\top} \vec{M}_{\mathcal{L}_e} \qdot, \\ &= - \beta \qdot^{\top} \vec{M}_{\mathcal{L}_e} \qdot. \end{align} As $\vec{M}_{\mathcal{L}_e}$ is strictly positive, Eq. \eqref{eq: Hdot_damped} is less than zero for all $\qdot \neq \vec{0}$ and zero for $\qdot = \vec{0}$. Since the total energy $\mathcal{H}$ is always decreasing and lower bounded by zero, the rate of the decrease must converge to zero, $\dot{\mathcal{H}} \rightarrow 0$, which means that $\dot{\mathcal{H}} = -\beta \qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot \rightarrow 0$ converges to zero, and therefore $\qdot \rightarrow \vec{0}$ leading to $\qddot \rightarrow \vec{0}$.

Step 2: The second step is to ensure convergence of the system in Eq. \eqref{eq: qddot_theoremIII5} to the zero set of the navigation policy $\f$. For the CPM, this ensures that the system Eq. \eqref{eq: system_CPM} converges to the zero set of the pulled dynamical system of PUMA ($\qddot = \f^{\mathcal{C}}_{\theta}(\q, \qdot)$, Eq. 6 in our paper) which contains the desired goal. To explore convergence of Eq. \eqref{eq: qddot_theoremIII5} over infinite time to the zero set of $\f$, we take the limit with $\qdot, \qddot \rightarrow \vec{0}$, \begin{align} \label{eq: limit_qddot} \qddot &= \h + \f + \alpha \qdot - \beta \qdot,\\ &= \h + \f - \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot} [ \vec{M}_{\mathcal{L}_e}(\h +\f) + \vec{\xi}_{\mathcal{L}_e} ] - \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot} \partial \psi - \beta \qdot,\\ &= \h - \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}[ \vec{M}_{\mathcal{L}_e}\h + \vec{\xi}_{\mathcal{L}_e} ]- \beta \qdot + \f - \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot} \left(\vec{M}_{\mathcal{L}_e} \f + \partial \psi \right),\label{eq: limit_qddot_c}\\ \xrightarrow[t \rightarrow \infty]{}\ \vec{0} &= \underbrace{energize_{\mathcal{H}_{\mathcal{L}_e}}[\h]- \beta \qdot }_{\xrightarrow[t \rightarrow \infty]{}\ \vec{0}}+ \f - \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot} \left(\vec{M}_{\mathcal{L}_e} \f + \partial \psi \right),\label{eq: limit_qddot_d}\\ \xrightarrow[t \rightarrow \infty]{}\ \vec{0} &= \f - \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot} \left(\vec{M}_{\mathcal{L}_e} \f + \partial \psi \right). \label{eq: limit_qddot_e} \end{align} In the limit, both $\beta\qdot \rightarrow \vec{0}$ and $energize_{\mathcal{H}_{\mathcal{L}_e}}[\h]$ converge to zero in Eq. \eqref{eq: limit_qddot_d}. In the following, we will elaborate why the equality in Eq. \eqref{eq: limit_qddot_e} requires $\f=\vec{0}$ in the limit.

The fraction $\frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}$ in Eq. \eqref{eq: limit_qddot_e} has two occurances of $\qdot$ both in the numerator and denominator. As $\vec{M}_{\mathcal{L}_e}$ is positive definite and bounded, the fraction in the limit becomes, \begin{equation} \label{eq: A_limit} \lim_{t \rightarrow \infty}\ \frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot} = \vec{A} = \frac{\vec{v}\vec{v}^{\top}}{\vec{v}^{\top}\vec{M}_{\mathcal{L}_e}\vec{v}} \ \ \ \ \text{where} \ \ \ \ \vec{v} = \lim_{t \rightarrow \infty} \frac{\qdot}{\norm{\qdot}}. \end{equation} Using the definition for $\vec{A}$ in Eq. \eqref{eq: A_limit}, Eq. \eqref{eq: limit_qddot_e} can be rewritten as, \begin{equation} \label{eq: limit_lambdas} \xrightarrow[t \rightarrow \infty]{}\ \vec{0} = \underbrace{[\vec{I} - \vec{A} \vec{M}_{\mathcal{L}_e}] \f}_{\lambda_1} + \underbrace{\vec{A}(\delta \psi)}_{\lambda_2}, \end{equation} In the limit, the term $[\vec{I} - \vec{A} \vec{M}_{\mathcal{L}_e}]$ has nullspace $\vec{v}$ as $[\vec{I} - \vec{A} \vec{M}_{\mathcal{L}_e}]\vec{v} = \vec{0}$, \begin{equation} \label{eq: spanv} [\vec{I} - \vec{A} \vec{M}_{\mathcal{L}_e}] \vec{v} = \vec{v} - \frac{\vec{v} \vec{v}^{\top}}{\vec{v}^{\top}\vec{M}_{\mathcal{L}_e}\vec{v}} \vec{M}_{\mathcal{L}_e}\vec{v} = \vec{v} - \vec{v} \frac{\vec{v} \vec{M}_{\mathcal{L}_e}\vec{v}^{\top}}{\vec{v}^{\top}\vec{M}_{\mathcal{L}_e}\vec{v}} = \vec{v} - \vec{v} = \vec{0}, \end{equation} which implies that $\lambda_1$ and $\lambda_2$ in Eq. \eqref{eq: limit_lambdas}, are orthogonal, $\lambda_1 \perp \lambda_2$. Both terms must be zero, $\lambda_1 = \vec{0}, \ \lambda_2 = \vec{0}$ for Eq. \eqref{eq: limit_lambdas} to hold. By contradiction, it is proven that $\f$ is equal to zero as time goes to infinity.

Proof by contradiction: Let us assume that $\f \neq \vec{0}$. First, note that if $\f \neq \vec{0}$, for $\lambda_1$ to be zero, $\f$ must be in the nullspace of $[\vec{I} - \vec{A} \vec{M}_{\mathcal{L}_e}]$, which we have already noted is equal to $\vec{v}$. As a consequence, $\f \in \text{span}(\vec{v})$. In contrast, for $\lambda_2$ to be equal to zero, i.e., $\vec{A}(\partial \psi) = \vec{0}$, two cases exist:

Case 1: The gradient of the potential is equal to zero, $\partial \psi = \vec{0}$. For $\f \neq \vec{0}$, this case cannot hold, as the potential is a compatible potential of $\f$, which indicates that $\partial \psi = \vec{0}$ if only if $\f = 0$, which would lead to a contradiction.
Case 2: The other possibility is that $\partial \psi$ is in the nullspace of $\vec{A}$, which implies that $\partial \psi \perp \vec{v}$. Consequently, since $\f \in \text{span}(\vec{v})$, $\partial \psi \perp \vec{v}$ implies that $\partial \psi \perp \f$; hence, $\partial \psi^{\top}\f=0$. However, a compatible potential also has the property $-\partial \psi^{\top} \f>\vec{0}$ wherever $\f \neq \vec{0}$, which once again leads to a contradiciton.

As a result, since both possible cases lead to a contradiction for $\f \neq 0$, $\f$ must be zero in the limit.

Discussion

Let's take a closer look at the implications of the assumptions on $\vec{M}_{\mathcal{L}_{e}}$. For a fabric describing collision avoidance, two cases exist as the Spec describing the fabric must be boundary conforming^[2]: (1) The metric $\vec{M}_{\mathcal{L}_{e}}$ is finite along the Eigen-directions parallel with the boundary's tangent space but goes to infinite along directions orthogonal to the tangent space. (2) The metric $\vec{M}_{\mathcal{L}_{e}}$ is a finite matrix along all trajectories, implying that $\vec{M}_{\mathcal{L}_{e}}$ is also finite in the limit when $t\rightarrow \infty$. Observing the assumption on $\vec{M}_{\mathcal{L}_{e}}$ in the \ac{cpm} that the metric is bounded in a finite region and strictly positive definite everywhere, only fabrics can be designed following the second case. This implies that in the limit we ensure convergence to the zero set of the forcing policy $\f^{\mathcal{C}}_{\theta}(\q, \qdot)$, but collision avoidance is not guaranteed as barrier-like functions going to infinity on the boundary, cannot be used to construct $\vec{M}_{\mathcal{L}_e}$

The metric $\vec{M}_{\mathcal{L}_e}$ is additionally assumed to not vanish in the limit $\qdot \rightarrow \vec{0}$. In practice, numerical instability of the fraction $\frac{\qdot \qdot^{\top}}{\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot}$ is avoided by replacing the denominator by $\qdot^{\top}\vec{M}_{\mathcal{L}_e}\qdot + \epsilon$ with $\epsilon>0$.

Illustration of TamedPUMA

This illustration of TamedPUMA shows the behavior design given the relationships between the different task and configuration-space variables. The joint angles and velocities get mapped into task space where the desired behavior is specified. Via fabrics, all avoidance behaviors are defined using the joint limits, and varying obstacle positions. The DNN captures the desired behavior of the end-effector position and orientation.

References

Ratliff, Nathan, and Van Wyk, Karl. (2023). "Fabrics: A Foundationally Stable Medium for Encoding Prior Experience." arXiv preprint arXiv:2309.07368.
Ratliff, Nathan D., Van Wyk, Karl, Xie, Mandy, Li, Anqi, and Rana, Muhammad Asif. (2020). "Optimization fabrics." arXiv preprint arXiv:2008.02399.

Pushing Through Clutter With Movability Awareness of Blocking Obstacles
Joris J. Weeda, Saray Bakker, Gang Chen, Javier Alonso-Mora. In IEEE Int. Conf. on Robotics and Automation (ICRA), 2025.

abstract pdf code

Navigation Among Movable Obstacles (NAMO) poses a challenge for traditional path-planning methods when obstacles block the path, requiring push actions to reach the goal. We propose a framework that enables movability-aware planning to overcome this challenge without relying on explicit obstacle placement. Our framework integrates a global Semantic Visibility Graph and a local Model Predictive Path Integral (SVG-MPPI) approach to efficiently sample rollouts, taking into account the continuous range of obstacle movability. A physics engine is adopted to simulate the interaction result of the rollouts with the environment, and generate trajectories that minimize contact force. In qualitative and quantitative experiments, SVG-MPPI outperforms the existing paradigm that uses only binary movability for planning, achieving higher success rates with reduced cumulative contact forces.

Globally-Guided Geometric Fabrics for Reactive Mobile Manipulation in Dynamic Environments
T. Merva, S. Bakker, M. Spahn, D. Zhao, I. Virgala, J. Alonso-Mora. In IEEE Robotics and Automation Letters (RA-L), 2025.

abstract web pdf video

Mobile manipulators operating in dynamic environments shared with humans and robots must adapt in real time to environmental changes to complete their tasks effectively. While global planning methods are effective at considering the full task scope, they lack the computational efficiency required for reactive adaptation. In contrast, local planning approaches can be executed online but are limited by their inability to account for the full task’s duration. To tackle this, we propose Globally-Guided Geometric Fabrics (G3F), a framework for real-time motion generation along the full task horizon, by interleaving an optimization-based planner with a fast reactive geometric motion planner, called Geometric Fabrics (GF). The approach adapts the path and explores a multitude of acceptable target poses, while accounting for collision avoidance and the robot’s physical constraints. This results in a real-time adaptive framework considering whole-body motions, where a robot operates in close proximity to other robots and humans. We validate our approach through various simulations and real-world experiments on mobile manipulators in multi-agent settings, achieving improved success rates compared to vanilla GF, Prioritized Rollout Fabrics and Model Predictive Control.

TamedPUMA: safe and stable imitation learning with geometric fabrics
S. Bakker, R. Pérez-Dattari, C. Della Santina, W. Böhmer, J. Alonso-Mora. In Learning for Dynamics and Control (L4DC), 2025.

abstract web pdf

Using the language of dynamical systems, Imitation learning (IL) provides an intuitive and effective way of teaching stable task-space motions to robots with goal convergence. Yet, IL techniques are affected by serious limitations when it comes to ensuring safety and fulfillment of physical constraints. With this work, we solve this challenge via TamedPUMA, an IL algorithm augmented with a recent development in motion generation called geometric fabrics. As both the IL policy and geometric fabrics describe motions as artificial second-order dynamical systems, we propose two variations where IL provides a navigation policy for geometric fabrics. The result is a stable imitation learning strategy within which we can seamlessly blend geometrical constraints like collision avoidance and joint limits. Beyond providing a theoretical analysis, we demonstrate TamedPUMA with simulated and real-world tasks, including a 7-DoF manipulator.

Safe and stable motion primitives via imitation learning and geometric fabrics
Saray Bakker, Rodrigo Pérez-Dattari, Cosimo Della Santina, Wendelin Böhmer, Javier Alonso-Mora. In Robotics: Science and Systems, Workshop on Structural Priors as Inductive Biases for Learning Robot Dynamics, 2024.

abstract web pdf

Using the language of dynamical systems, Imitation learning (IL) provides an intuitive and effective way of teaching stable task-space motions to robots with goal convergence. Yet, these techniques are affected by serious limitations when it comes to ensuring safety and fulfillment of physical constraints. With this work, we propose to solve this challenge via TamedPUMA, an IL algorithm augmented with a recent development in motion planning called geometric fabrics. We explore two variations of this approach, which we name the forcing policy method and the compatible potential method. Making these combinations possible requires two enabling factors: the possibility of learning second-order dynamical systems by imitation and the availability of a potential function that is compatible with the learned dynamics. In this paper, we show how these conditions can be met when using an IL strategy called PUMA. The result is a stable imitation learning strategy within which we can seamlessly blend geometrical constraints like collision avoidance and joint limits. Beyond providing a theoretical analysis, we demonstrate TamedPUMA with simulated and real-world tasks, including a 7-degree-of-freedom manipulator that is trained to pick a tomato from a crate in the presence of obstacles.

Reactive grasp and motion planning for adaptive mobile manipulation among obstacles
Tomas Merva, Saray Bakker, Max Spahn, Ivan Virgala, Javier Alonso-Mora. In Robotics: Science and Systems, Workshop on Frontiers of Optimization for Robotics, 2024.

abstract pdf

Mobile manipulators are susceptible to situations in which the precomputed grasp pose is not reachable as the result of conflicts between collision avoidance behaviour and the manipulation task. In this work, we address this issue by combining real-time grasp planning with geometric motion planning for decentralized multi-agent systems, referred to as Reactive Grasp Fabrics (RGF). We optimize the precomputed grasp pose candidate to account for obstacles and the robot's kinematics. By leveraging a reactive geometric motion planner, specifically geometric fabrics, the grasp optimization problem can be simplified, resulting in a fast, adaptive framework that can resolve deadlock situations in pick-and-place tasks. We demonstrate the robustness of this approach by controlling a mobile manipulator in both simulation and real-world experiments in dynamic environments.

Multi-Robot Local Motion Planning Using Dynamic Optimization Fabrics
Saray Bakker, Luzia Knoedler, Max Spahn, Wendelin Boehmer, Javier Alonso-Mora. In Proc. IEEE International Symposium on Multi-Robot and Multi-Agent Systems, 2023.

abstract pdf code video

In this paper, we address the problem of real-time motion planning for multiple robotic manipulators that operate in close proximity. We build upon the concept of dynamic fabrics and extend them to multi-robot systems, referred to as Multi-Robot Dynamic Fabrics (MRDF). This geometric method enables a very high planning frequency for high-dimensional systems at the expense of being reactive and prone to deadlocks. To detect and resolve deadlocks, we propose Rollout Fabrics where MRDF are forward simulated in a decentralized manner. We validate the methods in simulated close-proximity pick-and-place scenarios with multiple manipulators, showing high success rates and real-time performance.

Back to Publications