When to Replan? An Adaptive Replanning Strategy for Autonomous Navigation using Deep Reinforcement Learning

ICRA 2024
¹OMRON SINIC X Corporation  ²Nagoya University  (*work done as an intern at OMRON SINIC X)

TL;DR: We propose an adaptive replanning strategy for autonomous navigation using deep reinforcement learning.

Overview

The hierarchy of global and local planners is one of the most commonly used system designs in autonomous robot navigation. While the global planner generates a reference path from the current location to the goal based on the pre-built map, the local planner produces a kinodynamic trajectory that follows the reference path while avoiding perceived obstacles. To account for unforeseen or dynamic obstacles absent from the pre-built map, "when to replan" the reference path is critical for safe and efficient navigation. However, determining the ideal timing to execute replanning in such partially unknown environments remains an open question. In this work, we first conduct an extensive simulation experiment to compare several common replanning strategies and confirm that effective strategies are highly dependent on the environment as well as on the global and local planners. Based on this insight, we then derive a new adaptive replanning strategy based on deep reinforcement learning, which can learn from experience to decide appropriate replanning timings in the given environment and planning setup. Our experimental results show that the proposed replanner can perform on par with or even better than the current best-performing strategies in multiple situations, in terms of navigation robustness and efficiency.
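To make the division of labor concrete, here is a minimal sketch of the planner hierarchy with the replanning decision factored out as its own module. All names (navigation_loop, should_replan, etc.) are hypothetical placeholders for illustration, not the paper's actual implementation.

# Minimal sketch (Python) of the global/local planner hierarchy described
# above; every class and method name here is a hypothetical placeholder.

def navigation_loop(global_planner, local_planner, replanner, robot, goal):
    """Run one navigation episode with a pluggable replanning strategy."""
    # The global planner searches the pre-built map for a reference path.
    reference_path = global_planner.plan(robot.pose(), goal)
    while not robot.at(goal):
        observation = robot.sense()  # e.g., a laser scan of nearby obstacles
        # The replanning strategy only decides *when* to query the global planner.
        if replanner.should_replan(observation, robot.pose(), reference_path):
            reference_path = global_planner.plan(robot.pose(), goal)
        # The local planner produces a kinodynamic command that follows the
        # reference path while avoiding perceived (possibly dynamic) obstacles.
        command = local_planner.track(reference_path, observation)
        robot.execute(command)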

Video

Adaptive Replanning using Deep Reinforcement Learning

We derive a replanning controller that can learn from its previous navigation experiences to determine better replanning timings for navigation efficiency and robustness. As illustrated in the following figure, the replanner’s action is essentially the same as that of existing replanning strategies, i.e., a binary action indicating whether or not to execute replanning to produce a new reference path for the local planner after the current time step. In other words, the replanner can potentially be utilized as a replacement module for the replanning strategy in existing planning frameworks, thus making it compatible with various combinations of planners and other modules.
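To illustrate this drop-in property, the following sketch shows how a learned policy could sit behind the same should_replan() interface as the rule-based strategies; the feature encoding, the policy input, and the 0.5 threshold are illustrative assumptions, not the paper's exact design.

import numpy as np

class DRLReplanner:
    """Wraps a trained policy behind a should_replan() interface (sketch)."""

    def __init__(self, policy):
        self.policy = policy  # trained network: feature vector -> P(replan)

    def should_replan(self, observation, pose, reference_path):
        # Encode whatever the policy was trained on; here, naively, the raw
        # sensor observation concatenated with the robot pose.
        features = np.concatenate([np.ravel(observation).astype(np.float32),
                                   np.asarray(pose, dtype=np.float32)])
        p_replan = float(self.policy(features))  # scalar in [0, 1]
        return p_replan > 0.5                    # binary action: replan or not

# Usage with a dummy policy (a real one would be a trained network):
replanner = DRLReplanner(policy=lambda features: 0.9)
print(replanner.should_replan(np.zeros(8), (0.0, 0.0), reference_path=None))  # True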

Evaluation

vs. Conventional Replanning Strategies

[Demo videos: Distance-based, Stuck-based, and Time-based replanning]

We conduct a comprehensive simulation study to systematically evaluate the existing replanning strategies and the DRL replanner. In this work, we compare four types of rule-based replanning strategies available in the ROS 2 Navigation Stack (a minimal sketch of each rule follows the list below).

  • Distance-based: determines replanning timings on the basis of traveled distance.
  • Stuck-based: decides to execute replanning when the robot stays at the same position for a given Δt_stuck seconds.
  • Time-based: performs replanning at every fixed period of Δt_rep seconds.
  • Time w/ patience: adopts the time-based strategy when the robot is far from the goal and switches to the stuck-based one otherwise, expecting to prevent a large detour near the goal.
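The following sketch implements the four rules above against a simplified should_replan(xy, t) signature; the thresholds and bookkeeping are illustrative, and the actual ROS 2 Navigation Stack implementations differ in detail.

import math

class DistanceBased:
    """Replan once the robot has traveled d_rep meters since the last replan."""
    def __init__(self, d_rep=1.0):
        self.d_rep, self.traveled, self.prev_xy = d_rep, 0.0, None

    def should_replan(self, xy, t):
        if self.prev_xy is not None:
            self.traveled += math.dist(xy, self.prev_xy)
        self.prev_xy = xy
        if self.traveled >= self.d_rep:
            self.traveled = 0.0
            return True
        return False

class StuckBased:
    """Replan when the robot has moved less than eps for dt_stuck seconds."""
    def __init__(self, dt_stuck=3.0, eps=1e-2):
        self.dt_stuck, self.eps = dt_stuck, eps
        self.anchor_xy, self.anchor_t = None, None

    def should_replan(self, xy, t):
        if self.anchor_xy is None or math.dist(xy, self.anchor_xy) > self.eps:
            self.anchor_xy, self.anchor_t = xy, t  # robot moved: reset the timer
            return False
        if t - self.anchor_t >= self.dt_stuck:
            self.anchor_t = t  # replan and restart the stuck timer
            return True
        return False

class TimeBased:
    """Replan at a fixed period of dt_rep seconds."""
    def __init__(self, dt_rep=1.0):
        self.dt_rep, self.last_t = dt_rep, None

    def should_replan(self, xy, t):
        if self.last_t is None or t - self.last_t >= self.dt_rep:
            self.last_t = t
            return True
        return False

class TimeWithPatience:
    """Time-based far from the goal, stuck-based within d_goal of it."""
    def __init__(self, goal_xy, d_goal=2.0):
        self.goal_xy, self.d_goal = goal_xy, d_goal
        self.time_based, self.stuck_based = TimeBased(), StuckBased()

    def should_replan(self, xy, t):
        if math.dist(xy, self.goal_xy) > self.d_goal:
            return self.time_based.should_replan(xy, t)
        return self.stuck_based.should_replan(xy, t)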

Selected Simulation Results

No-entry areas: 16

| Strategy | SR ⬆️ | CR ⬇️ | SGT ⬆️ | SPL ⬆️ | NR ⬇️ |
| No replan | 27 | 10 | 0.439 | 0.270 | -- |
| Distance-based | 62 | 12 | 0.509 | 0.547 | 2186 |
| Stuck-based | 64 | 13 | 0.493 | 0.605 | 739 |
| Time-based | 70 | 10 | 0.538 | 0.615 | 3076 |
| Time w/ patience | 70 | 10 | 0.540 | 0.615 | 2956 |
| Ours (DRL Replanner) | 77 | 4 | 0.563 | 0.668 | 2577 |

No-entry areas: 25

| Strategy | SR ⬆️ | CR ⬇️ | SGT ⬆️ | SPL ⬆️ | NR ⬇️ |
| No replan | 24 | 4 | 0.418 | 0.240 | -- |
| Distance-based | 82 | 12 | 0.561 | 0.705 | 1995 |
| Stuck-based | 71 | 6 | 0.482 | 0.658 | 818 |
| Time-based | 79 | 12 | 0.562 | 0.688 | 2671 |
| Time w/ patience | 79 | 12 | 0.564 | 0.688 | 2558 |
| Ours (DRL Replanner) | 87 | 6 | 0.600 | 0.751 | 2066 |
Metrics

  • SR: Success Rate over 100 trials, where success is defined as the robot reaching the goal without collision.
  • CR: Collision Rate over 100 trials.
  • SGT: Success-weighted normalized Goal Time.
  • SPL: Average Success-weighted normalized Path Length over the 100 trials.
  • NR: Number of Replannings over 100 trials.
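For concreteness, here is a small sketch of how the success-weighted metrics can be computed, assuming the standard SPL definition (Anderson et al., 2018) and an analogous time-based form for SGT; the paper's exact normalization may differ.

import numpy as np

def spl(success, path_len, shortest_len):
    """Success-weighted normalized Path Length, averaged over trials."""
    s, p, opt = map(np.asarray, (success, path_len, shortest_len))
    return float(np.mean(s * opt / np.maximum(p, opt)))

def sgt(success, goal_time, best_time):
    """Success-weighted normalized Goal Time (SPL's analogue with times)."""
    s, t, t_best = map(np.asarray, (success, goal_time, best_time))
    return float(np.mean(s * t_best / np.maximum(t, t_best)))

# Example: 3 trials, 2 successes, shortest path 9 m in every layout.
print(spl([1, 1, 0], [12.0, 9.5, 20.0], [9.0, 9.0, 9.0]))  # ≈ 0.566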

The tables above list the quantitative evaluation results over 100 trials for each map layout with the Dijkstra (global) and DWA (local) planners. Our DRL replanner, which learns from its experiences to seek better replanning timings, works comparably well or sometimes substantially better than the other rule-based strategies in each environment.

vs. Stuck-based Replanning

[Demo videos: Ours (DRL Replanner) vs. Stuck-based]
The stuck-based replanner can aggressively head toward the goal, especially when dynamic obstacles move out of the way. However, replanning only after a stuck situation has been detected is inefficient. In contrast, our method replans before getting stuck and reaches the goal quickly.

vs. Time-based Replanning

[Demo videos: Ours (DRL Replanner) vs. Time-based]
The time-based method is efficient in scenes where obstacles are stationary, since it always tracks the shortest path to the goal. In dynamic scenes, however, its unnecessary replanning can cause path oscillation, and we observed cases where it fails to reach the goal. In contrast, our proposed method reaches the goal efficiently by replanning only at key moments.

Citation

@misc{honda2024replan,
      title={When to Replan? An Adaptive Replanning Strategy for Autonomous Navigation using Deep Reinforcement Learning},
      author={Kohei Honda and Ryo Yonetani and Mai Nishimura and Tadashi Kozuno},
      year={2024},
      eprint={2304.12046},
      archivePrefix={arXiv},
      primaryClass={cs.RO}
}