Autonomous Drones for Emergency Responders

People

Dennis Benders - PhD candidate

Max Lodel - PhD candidate

Thijs Niesten - Research Engineer

Prof. Laura Ferranti

Prof. Javier Alonso-Mora

Prof. Robert Babuška

Funding

This project is funded by the National Police (Politie) of the Netherlands.

About the Project

How can autonomous drones support the operations of emergency responders such as the police? This project targets scenarios such as search and rescue or reconnaissance in large, unknown, and potentially hazardous environments, where it can be difficult or even dangerous for police officers to operate and fulfil the task themselves. We aim to enable drones to operate in such remote environments and gather the information required by police operators. We develop methods to control entire teams of drones so that they fly safely between obstacles, are robust to unexpected disturbances, and can navigate unknown environments to provide the required information.

Search missions require motion planning and navigation methods for information gathering that continuously replan based on new observations of the robot’s surroundings. Current methods for information gathering are capable of reasoning over long horizons, but they are computationally expensive. To overcome this limitation, we train an information-aware policy via deep reinforcement learning that guides a trajectory optimization planner. In particular, the policy continuously recommends a reference viewpoint to the local planner, such that the resulting collision-free trajectories lead to observations that maximize the information gain and reduce the uncertainty about the environment. In simulation tests in previously unseen environments, the proposed method consistently outperforms greedy next-best-view policies in terms of information gain and coverage time, with a reduction in execution time by three orders of magnitude.
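
To make the notion of information gain concrete, the following minimal sketch scores candidate viewpoints by the entropy of the occupancy cells they would observe and picks the best one greedily. This corresponds to the greedy next-best-view baseline mentioned above, not to the learned policy. The square sensor footprint without occlusion, the function names, and the toy belief map are all assumptions made for illustration.

```python
import math

def cell_entropy(p):
    """Shannon entropy in bits of a Bernoulli occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

def visible_cells(grid, viewpoint, sensor_range):
    """Cells inside a square footprint around the viewpoint.
    Hypothetical sensor model: no occlusion, no field-of-view limit."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = viewpoint
    return [(r, c)
            for r in range(max(0, r0 - sensor_range), min(rows, r0 + sensor_range + 1))
            for c in range(max(0, c0 - sensor_range), min(cols, c0 + sensor_range + 1))]

def expected_information_gain(grid, viewpoint, sensor_range=1):
    """Entropy removed if every visible cell became fully known."""
    return sum(cell_entropy(grid[r][c])
               for r, c in visible_cells(grid, viewpoint, sensor_range))

def greedy_next_best_view(grid, candidates, sensor_range=1):
    """Pick the candidate viewpoint with the largest one-step gain."""
    return max(candidates,
               key=lambda v: expected_information_gain(grid, v, sensor_range))

# Toy belief map: 0.5 means unknown, 0.0 means known free space.
belief = [
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.5, 0.5],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
candidates = [(0, 0), (0, 3), (3, 3)]
best = greedy_next_best_view(belief, candidates)
best_gain = expected_information_gain(belief, best)
```

A greedy rule of this kind only looks one observation ahead, which is why methods that reason over longer horizons can gather information more efficiently.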

Another contribution from this project is a novel method for inferring the objectives of each agent in a multi-agent setting from observed interactions. These objectives can be recovered even from noisy and partial state observations. This is beneficial in scenarios where an agent needs to interact with other agents in its environment. Consider a traffic scenario where drivers are performing lane changes. Each driver must understand the behavior of the others so that a safe distance can be maintained without compromising on speed. The proposed method identifies unknown parameters of each agent’s cost function from the observations, so that the future trajectories or states of each agent can be predicted.
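
The idea of identifying a cost-function parameter from observed behavior can be illustrated with a deliberately simplified, single-agent sketch. It assumes a hypothetical driver who chooses an acceleration u to minimize 0.5*u^2 + 0.5*theta*(v + u*dt - v_des)^2, where theta is the unknown weight on reaching a desired speed. The estimate is obtained by least squares on the optimality condition u + theta*dt*(v + u*dt - v_des) = 0. The cost form, the names, and the data are assumptions for illustration; the project's method addresses the multi-agent, noisy, partially observed case, which this sketch does not cover.

```python
def optimal_accel(v, theta, v_des, dt):
    """Acceleration minimizing the assumed cost
    0.5*u**2 + 0.5*theta*(v + u*dt - v_des)**2."""
    return -theta * dt * (v - v_des) / (1.0 + theta * dt * dt)

def estimate_theta(speeds, accels, v_des, dt):
    """Least-squares fit of theta to the stationarity residual
    u_k + theta * g_k, with g_k = dt * (v_k + u_k*dt - v_des)."""
    g = [dt * (v + u * dt - v_des) for v, u in zip(speeds, accels)]
    numerator = -sum(u * gk for u, gk in zip(accels, g))
    denominator = sum(gk * gk for gk in g)
    return numerator / denominator

# Synthetic, noise-free observations generated with a known weight.
true_theta, v_des, dt = 2.0, 10.0, 0.5
speeds = [4.0, 6.0, 8.0, 12.0]
accels = [optimal_accel(v, true_theta, v_des, dt) for v in speeds]

theta_hat = estimate_theta(speeds, accels, v_des, dt)

# With the weight identified, the next action can be predicted.
predicted_accel = optimal_accel(9.0, theta_hat, v_des, dt)
```

With noise-free data the fit recovers the weight up to rounding; with noisy observations the same fit would only yield an approximate estimate.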

Project Demonstrations

Embedded Hierarchical MPC for Autonomous Navigation
Dennis Benders, Johannes Köhler, Thijs Niesten, Robert Babuška, Javier Alonso-Mora, Laura Ferranti. In IEEE Transactions on Robotics (T-RO), 2025.

To efficiently deploy robotic systems in society, mobile robots must move autonomously and safely through complex environments. Nonlinear model predictive control (MPC) methods provide a natural way to find a dynamically feasible trajectory through the environment without colliding with nearby obstacles. However, the limited computation power available on typical embedded robotic systems, such as quadrotors, poses a challenge to running MPC in real time, including its most expensive tasks: constraints generation and optimization. To address this problem, we propose a novel hierarchical MPC scheme that consists of a planning and a tracking layer. The planner constructs a trajectory with a long prediction horizon at a slow rate, while the tracker ensures trajectory tracking at a relatively fast rate. We prove that the proposed framework avoids collisions and is recursively feasible. Furthermore, we demonstrate its effectiveness in simulations and lab experiments with a quadrotor that needs to reach a goal position in a complex static environment. The code is efficiently implemented on the quadrotor's embedded computer to ensure real-time feasibility. Compared to a state-of-the-art single-layer MPC formulation, this allows us to increase the planning horizon by a factor of 5, which results in significantly better performance.
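
The two-rate structure described in this abstract can be sketched as a loop in which a slow layer replans every few steps over a long horizon and a fast layer runs at every step. In this sketch both layers are simple placeholders (a clipped constant-speed reference and a saturated proportional law) on a 1D point model, not MPC optimizers, and it carries none of the paper's collision-avoidance or recursive-feasibility guarantees. All names and parameter values are hypothetical.

```python
def plan_reference(position, goal, horizon, dt, max_speed):
    """Slow layer placeholder: positions of a speed-limited straight
    move toward the goal over a long horizon."""
    reference = []
    p = position
    for _ in range(horizon):
        step = max(-max_speed * dt, min(max_speed * dt, goal - p))
        p += step
        reference.append(p)
    return reference

def track(position, target, dt, max_speed):
    """Fast layer placeholder: saturated velocity command toward
    the current reference point."""
    velocity = (target - position) / dt
    return max(-max_speed, min(max_speed, velocity))

def simulate(start, goal, steps, planner_period=5, horizon=20,
             dt=0.1, max_speed=1.0):
    """Run the tracker every step and the planner every
    planner_period steps."""
    if horizon < planner_period:
        raise ValueError("horizon must cover one planner period")
    position = start
    reference = []
    plan_calls = 0
    for k in range(steps):
        if k % planner_period == 0:
            # Slow rate: replan from the current position.
            reference = plan_reference(position, goal, horizon, dt, max_speed)
            plan_calls += 1
        # Fast rate: follow the reference point for this step.
        target = reference[k % planner_period]
        position += track(position, target, dt, max_speed) * dt
    return position, plan_calls

final_position, plan_calls = simulate(start=0.0, goal=2.0, steps=30)
```

Here the planner is invoked 6 times over 30 tracker steps, which mirrors how a hierarchical scheme can afford a longer planning horizon by running the expensive layer less often.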

Where to Look Next: Learning Viewpoint Recommendations for Informative Trajectory Planning
M. Lodel, B. Brito, A. Serra-Gomez, L. Ferranti, R. Babuska, J. Alonso-Mora. In Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), 2022.

Search missions require motion planning and navigation methods for information gathering that continuously replan based on new observations of the robot's surroundings. Current methods for information gathering, such as Monte Carlo Tree Search, are capable of reasoning over long horizons, but they are computationally expensive. An alternative for fast online execution is to train, offline, an information gathering policy, which indirectly reasons about the information value of new observations. However, these policies lack safety guarantees and do not account for the robot dynamics. To overcome these limitations we train an information-aware policy via deep reinforcement learning, that guides a receding-horizon trajectory optimization planner. In particular, the policy continuously recommends a reference viewpoint to the local planner, such that the resulting dynamically feasible and collision-free trajectories lead to observations that maximize the information gain and reduce the uncertainty about the environment. In simulation tests in previously unseen environments, our method consistently outperforms greedy next-best-view policies and achieves competitive performance compared to Monte Carlo Tree Search, in terms of information gains and coverage time, with a reduction in execution time by three orders of magnitude.

Learning Mixed Strategies in Trajectory Games
L. Peters, D. Fridovich-Keil, L. Ferranti, J. Alonso-Mora, F. Laine. In Proc. of Robotics: Science and Systems (RSS), 2022.

In multi-agent settings, game theory is a natural framework for describing the strategic interactions of agents whose objectives depend upon one another’s behavior. Trajectory games capture these complex effects by design. In competitive settings, this makes them a more faithful interaction model than traditional “predict then plan” approaches. However, current game-theoretic planning methods have important limitations. In this work, we propose two main contributions. First, we introduce an offline training phase which reduces the online computational burden of solving trajectory games. Second, we formulate a lifted game which allows players to optimize multiple candidate trajectories in unison and thereby construct more competitive “mixed” strategies. We validate our approach on a number of experiments using the pursuit-evasion game “tag.”
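
Why mixing over several candidate trajectories helps can be seen in a toy version of tag with two candidates per player. The sketch below uses the standard closed-form solution of a 2x2 zero-sum matrix game; it is not the paper's lifted-game solver and involves no learning. The payoff entries (catch probabilities for each pair of pursuer and evader trajectories) are hypothetical numbers chosen for illustration.

```python
def solve_2x2_zero_sum(payoff):
    """Mixed equilibrium of a 2x2 zero-sum game.
    payoff[i][j] is the row player's payoff (row maximizes, column
    minimizes). Returns (p, q, value): p is the probability of row 0,
    q the probability of column 0."""
    (a, b), (c, d) = payoff
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        raise ValueError("pure saddle point exists; no mixing needed")
    denom = a - b - c + d
    p = (d - c) / denom   # makes the column player indifferent
    q = (d - b) / denom   # makes the row player indifferent
    value = (a * d - b * c) / denom
    return p, q, value

# Rows: pursuer goes left / right. Columns: evader goes left / right.
# Entries: hypothetical probability that the pursuer catches the evader.
catch_probability = [
    [1.0, 0.0],
    [0.0, 0.5],
]
p_left, q_left, value = solve_2x2_zero_sum(catch_probability)

# Best guaranteed payoff for a pursuer that commits to one trajectory.
pure_guarantee = max(min(row) for row in catch_probability)
```

In this example a pursuer that commits to a single trajectory can guarantee a catch probability of 0, while the mixed strategy guarantees 1/3, which is the sense in which mixed strategies are more competitive.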