CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces

1Tokyo Institute of Technology2OMRON SINIC X Corporation* work done as an intern at OMRON SINIC X.
roadmap construction
respective roadmaps


Multi-agent path planning (MAPP) in continuous spaces is a challenging problem with significant practical importance. One promising approach could first construct graphs approximating the spaces called roadmaps, then apply multi-agent pathfinding (MAPF) algorithms on those roadmaps to derive a set of conflict-free trajectories. While conventional studies have just utilized roadmap construction methods developed for single-agent planning, it remains largely unexplored what can serve as effective roadmaps for multiple agents and how we can construct them.

To this end, we propose a novel concept of graph representations of the space named cooperative timed roadmaps (CTRMs). CTRMs enable each agent to focus on their important locations around potential solution paths, in a way to consider the behavior of other agents to avoid inter-agent collisions (i.e., "cooperative"), while being augmented in time direction to make it easy to derive a "timed" solution path. To construct CTRMs, we develop a machine learning approach that learns a generative model from a collection of relevant problem instances and plausible solutions, then uses the learned model to sample the vertices of CTRMs for new, previously unseen problem instances. Our empirical evaluation revealed that the use of CTRMs significantly reduced the planning effort with acceptable overheads while maintaining its success rate and solution quality comparable to conventional roadmap construction approaches.



Multi-Agent Path Planning (MAPP) in continuous spaces is a fundamental problem in multi-agent systems with attractive applications. One possible approach to MAPP consists of two steps:

  1. approximating the spaces by constructing graphs called roadmaps, and then
  2. using powerful multi-agent pathfinding (MAPF) algorithms on those roadmaps to derive a solution.

While such approaches based on roadmaps have widely been used for single-agent planning, doing the same for multiple agents is non-trivial. This is primarily due to the necessity of constructing sparse roadmaps, as doing otherwise will make it dramatically hard to find a combination of plausible paths by managing a higher number of inter-agent collisions. Nevertheless, there is generally a trade-off between roadmap density and solution quality; roadmaps should be sufficiently dense to ensure high planning success rate and better solutions.

This begs our key questions:

What are the characteristics to consider in order to effectively construct roadmaps for MAPP?

Concept of CTRMs

We propose a novel concept of graph representations of the space named cooperative timed roadmaps (CTRMs). CTRMs consist of directed acyclic graphs constructed in the following fashion:

  1. Agent-specific: each CTRM is specialized for individual agents to focus on their important locations.
  2. Cooperative: each CTRM is also aware of the behaviors of other agents so as to make it easy for the subsequent MAPF algorithms to find a collision-free solution path.
  3. Timed: each vertex in CTRMs is augmented in time direction to represent not only "where," as commonly doing so in conventional roadmaps, but also "when," because solutions of MAPP are a set of "timed" paths.
By taking these properties together, CTRMs aim to provide a small search space that still contains plausible solutions for MAPF algorithms.


To construct CTRMs, we develop a machine learning (ML)-based approach:

  1. Suppose that we are given a collection of MAPP demonstrations (a pair of problem instances and plausible solution paths) obtained from offline and intensive computation with conventional roadmaps, e.g., those constructed with uniform random sampling.
  2. The proposed ML model learns from how agents behave cooperatively at each unit time (i.e., timestep), which is implicitly included in the demonstrations, to predict how they move in the next timesteps.
  3. For a new, previously unseen problem instance, the learned model can then be used to sample a small set of agent-specific vertices (i.e., space-time pairs) for generating multiple solution path candidates, and eventually, constructing CTRMs by compositing those candidates.

As a result, the CTRMs can serve a smaller but promising search space and enable the MAPF algorithms to derive a solution more efficiently than when using conventional roadmaps. Technically, we extend a class of generative models called conditional variational autoencoder (CVAE), a popular choice for ML-based sampling-based motion planning, to learn a conditional probability distribution of vertices of CTRMs for each agent, given the observations of multiple agents. Our model can work with an arbitrary size of space, an arbitrary number of agents, and even with a heterogeneous setting.


We extensively evaluate our proposed approach on a variety of MAPP problems with several different setups in terms of the number of agents (21-40), the presence of obstacles, and the heterogeneity in agent sizes and motion speeds. Our results demonstrate that, compared to other standard roadmap construction strategies, planning by learning to construct CTRMs is order-of-magnitude efficient in the planning effort (e.g., assessed by search node expansions or runtime), while maintaining comparable planning success rate and solution quality, with acceptable overheads.

In what follows, we introduce part of the results. For details, please check our paper.

Visualization of Constructed Roadmaps for One Agent

As you can see below, CTRMs produce small but effective roadmaps specific to each agent, compared to alternatives.

tested instance & solution with CTRMs
Simplified PRM
as conventional roadmaps for MAPP
[Karaman & Frazzoli, 2011]
commonly used in MAPF studies,
e.g., [Stern et al., 2019]
generate sparse roadmap for single-agent
[Dobson & Bekris, 2014]
as intuitive example of agent-specific roadmaps

Performance on Basic Scenario

Since the objective of MAPP is to plan conflict-free trajectories, the effectiveness of roadmap construction methods should be evaluated via the subsequent planning results. By using standard prioritized planning (e.g., [Silver, 2005; Van Den Berg & Overmars, 2005]) as a reasonable choice of the planner, we computed the following metrics:

success rate v.s. size of search space
solution quality v.s. search effort
runtime (roadmap + planning)

*For each method, plots to the right correspond to the denser roadmaps (e.g., grid: 32x32 => 84x84).

The main observations are summarized below:

  • CTRMs contain solutions in a small search space.
  • While keeping solution quality, CTRMs contribute to reducing the planning effort by orders of magnitude compared to the alternatives.
  • CTRMs achieve efficient MAPP solving from the end-to-end perspective.


# aamas-22
  author = {Okumura, Keisuke and Yonetani, Ryo and Nishimura, Mai and Kanezaki, Asako},
  title = {CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-Agent Path Planning in Continuous Spaces},
  year = {2022},
  isbn = {9781450392136},
  publisher = {International Foundation for Autonomous Agents and Multiagent Systems},
  booktitle = {Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems},
  pages = {972–981},
  numpages = {10},
  series = {AAMAS '22}

# arXiv version
  title={CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces},
  author={Okumura, Keisuke and Yonetani, Ryo and Nishimura, Mai and Kanezaki, Asako},
  journal={arXiv preprint arXiv:2201.09467},