Biased-MPPI: Informing Sampling-Based Model Predictive Control by Fusing Ancillary Controllers

Elia Trevisan, Javier Alonso-Mora

Classic MPPI

MPPI typically only draws samples around the previous plan. Here, the environment changes unexpectedly. As a result, all samples are in collision, so the computed plan also collides.

Biased-MPPI

Biased-MPPI can sample from multiple ancillary controllers. Here, by also sampling a zero-velocity reference, the sampling distribution quickly collapses onto a braking manoeuvre, avoiding the collision.

Abstract

Motion planning for autonomous robots in human-populated environments poses numerous challenges due to uncertainties in the robot's dynamics, environment, and interaction with other agents. Sampling-based MPC approaches, such as Model Predictive Path Integral (MPPI) control, have shown promise in addressing these complex motion planning problems. However, the performance of MPPI relies heavily on the choice of sampling distribution. Existing literature often uses the previously computed input sequence as the mean of a Gaussian distribution for sampling, leading to potential failures and local minima. In this paper, we propose novel derivations of the MPPI method to enhance its efficiency, robustness, and convergence. Our approach includes a mathematical formulation allowing for arbitrary sampling distributions, addressing numerical issues, and alleviating the problem of local minima. We present an efficient importance sampling scheme that combines classical and learning-based ancillary controllers simultaneously, resulting in more informative sampling and control fusion. We demonstrate our proposed scheme's superior efficiency and robustness through experiments by handling model uncertainties and rapid environmental changes and reducing susceptibility to local minima.

Illustrative Example: Rotary Inverted Pendulum

Switching Controller

Uses heuristics to switch between an energy-based controller (EBC) for swing-up, an LQR for stabilization, and an LQI for tracking a reference with the arm once the pendulum is at the upright equilibrium.

Classic MPPI

Takes 100 samples around a time-shifted version of the previous plan.

Biased-MPPI

Samples each ancillary controller used by the switching controller once, and draws the remaining samples around the previous plan.
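The biased sampling scheme above can be sketched in a few lines: reserve one sample per ancillary-controller suggestion, draw the rest as Gaussian perturbations of the previous plan, and fuse all candidates with the usual exponentiated-cost weights. This is a minimal illustration, not the paper's exact formulation; the function names, interface, and cost model are assumptions.

```python
import numpy as np

def biased_mppi_step(prev_plan, ancillary_plans, rollout_cost,
                     n_samples=100, noise_std=0.5, temperature=1.0,
                     rng=None):
    """One planning step of a biased-MPPI-style update (illustrative sketch).

    prev_plan: (horizon, n_inputs) input sequence from the last iteration.
    ancillary_plans: list of (horizon, n_inputs) suggestions, one per
        ancillary controller (e.g. EBC, LQR, LQI).
    rollout_cost: maps an input sequence to its scalar trajectory cost.
    """
    rng = rng or np.random.default_rng(0)
    horizon, n_inputs = prev_plan.shape
    # One sample per ancillary controller; the rest around the previous plan.
    n_gaussian = n_samples - len(ancillary_plans)
    noise = rng.normal(0.0, noise_std, size=(n_gaussian, horizon, n_inputs))
    candidates = np.concatenate(
        [np.asarray(ancillary_plans), prev_plan[None] + noise], axis=0)
    # Evaluate every candidate input sequence.
    costs = np.array([rollout_cost(u) for u in candidates])
    # Exponentiated-cost (softmax) weights, shifted by the minimum cost
    # for numerical stability.
    weights = np.exp(-(costs - costs.min()) / temperature)
    weights /= weights.sum()
    # The new plan is the weighted average of all candidates.
    return np.einsum('k,khu->hu', weights, candidates)
```

If an ancillary controller proposes a low-cost manoeuvre (e.g. braking), its sample dominates the weights and the distribution collapses onto it within a few iterations, even when every Gaussian sample around the previous plan is poor.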

Experiment: Crossing an Intersection

Classic MPPI

Classic MPPI gets stuck in a local minimum in which both agents believe they should pass first, even after it becomes clear the orange agent should yield, resulting in a collision.

Biased-MPPI

Biased-MPPI, sampling from multiple ancillary controllers, lets the orange agent quickly converge to a yielding manoeuvre, avoiding the collision.

Experiment: Multi-Agent Navigation

Classic MPPI

with 2000 samples

With 2000 samples, IA-MPPI based on the classic MPPI sampling strategy can solve the problem correctly.

Classic MPPI

with 200 samples

With 200 samples, the algorithm using the classic MPPI sampling strategy fails to find a good solution.

Biased-MPPI

with 200 samples

Biased-MPPI, taking suggestions from several ancillary controllers, can solve the problem correctly using only 200 samples.