When to Replan? An Adaptive Replanning Strategy for Autonomous Navigation using Deep Reinforcement Learning
ICRA 2024. TL;DR: we propose an adaptive replanning strategy using deep reinforcement learning.
The hierarchy of global and local planners is one of the most commonly used system designs in autonomous robot navigation. While the global planner generates a reference path from the robot's current location to the goal based on a pre-built map, the local planner produces a kinodynamic trajectory to follow the reference path while avoiding perceived obstacles. To account for unforeseen or dynamic obstacles not present on the pre-built map, "when to replan" the reference path is critical for safe and efficient navigation. However, determining the ideal timing to execute replanning in such partially unknown environments remains an open question. In this work, we first conduct an extensive simulation experiment to compare several common replanning strategies and confirm that the effective strategy is highly dependent on the environment as well as the global and local planners. Based on this insight, we then derive a new adaptive replanning strategy based on deep reinforcement learning, which can learn from experience to decide appropriate replanning timings in the given environment and planning setups. Our experimental results show that the proposed replanner can perform on par with, or even better than, the current best-performing strategies in multiple situations in terms of navigation robustness and efficiency.
We derive a replanning controller that can learn from its previous navigation experiences to determine better replanning timings for navigation efficiency and robustness. As illustrated in the following figure, the replanner's action space is the same as that of existing replanning strategies, i.e., a binary action indicating whether or not to execute replanning and produce a new reference path for the local planner after the current time step. In other words, the replanner can be used as a drop-in replacement for the replanning strategy in existing planning frameworks, making it compatible with various combinations of planners and other modules.
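As a concrete illustration, here is a minimal sketch of what such a drop-in replanning module could look like. The class, method, and planner names are hypothetical (not from the paper's implementation), and the policy is assumed to be a trained binary-action DRL actor.

```python
from typing import Callable, Sequence

class DRLReplanner:
    """Hypothetical drop-in replacement for a rule-based replanning strategy.

    Wraps a learned policy that maps a navigation observation (e.g., local
    costmap features, progress along the reference path) to a binary action:
    1 = request a new reference path, 0 = keep following the current one.
    """

    def __init__(self, policy: Callable[[Sequence[float]], int]):
        self.policy = policy  # trained DRL actor

    def should_replan(self, observation: Sequence[float]) -> bool:
        return bool(self.policy(observation))


# Where the module sits in a hierarchical planning loop (names illustrative):
#
#   if replanner.should_replan(observation):
#       reference_path = global_planner.plan(current_pose, goal_pose)
#   command = local_planner.follow(reference_path, perceived_obstacles)
```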
We conduct a comprehensive simulation study to systematically evaluate existing replanning strategies against the DRL replanner. In this work, we compare four types of rule-based replanning strategies available in the ROS 2 Navigation Stack: distance-based, stuck-based, time-based, and time-based with patience; a rough sketch of each trigger follows below.
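The sketch below summarizes one plausible reading of each rule-based trigger; the exact Nav2 behavior and all threshold values are assumptions for illustration, not taken from the paper.

```python
import math

def distance_based(pose, last_replan_pose, threshold_m=1.0):
    """Replan once the robot has traveled a set distance since the last replan."""
    return math.dist(pose, last_replan_pose) >= threshold_m

def stuck_based(speed_mps, min_speed=0.05):
    """Replan when the robot stops making progress (speed below a threshold)."""
    return speed_mps < min_speed

def time_based(now_s, last_replan_s, period_s=1.0):
    """Replan at a fixed time interval."""
    return now_s - last_replan_s >= period_s

def time_with_patience(now_s, last_replan_s, stalled_since_s,
                       period_s=1.0, patience_s=2.0):
    """Time-based, but only replan after progress has stalled for a patience
    window (one plausible reading of 'time w/ patience')."""
    if now_s - last_replan_s < period_s:
        return False
    return stalled_since_s is not None and now_s - stalled_since_s >= patience_s
```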
No Entry Areas: 16

| Metric | SR ⬆️ | CR ⬇️ | SGT ⬆️ | SPL ⬆️ | NR ⬇️ |
|---|---|---|---|---|---|
| No replan | 27 | 10 | 0.439 | 0.270 | -- |
| Distance-based | 62 | 12 | 0.509 | 0.547 | 2186 |
| Stuck-based | 64 | 13 | 0.493 | 0.605 | 739 |
| Time-based | 70 | 10 | 0.538 | 0.615 | 3076 |
| Time w/ patience | 70 | 10 | 0.540 | 0.615 | 2956 |
| Ours (DRL Replanner) | 77 | 4 | 0.563 | 0.668 | 2577 |

No Entry Areas: 25

| Metric | SR ⬆️ | CR ⬇️ | SGT ⬆️ | SPL ⬆️ | NR ⬇️ |
|---|---|---|---|---|---|
| No replan | 24 | 4 | 0.418 | 0.240 | -- |
| Distance-based | 82 | 12 | 0.561 | 0.705 | 1995 |
| Stuck-based | 71 | 6 | 0.482 | 0.658 | 818 |
| Time-based | 79 | 12 | 0.562 | 0.688 | 2671 |
| Time w/ patience | 79 | 12 | 0.564 | 0.688 | 2558 |
| Ours (DRL Replanner) | 87 | 6 | 0.600 | 0.751 | 2066 |
| Metric | Definition |
|---|---|
| SR | Success Rate over 100 trials, where success is defined as the robot reaching the goal without collision. |
| CR | Collision Rate over 100 trials. |
| SGT | Success-weighted by normalized Goal Time. |
| SPL | Average Success-weighted normalized Path Length over 100 trials. |
| NR | Number of replanning executions over 100 trials. |
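For reference, SPL is a standard embodied-navigation metric (Anderson et al., 2018); assuming the paper follows that definition, it can be written as below, and SGT takes the analogous form with goal times in place of path lengths (our reading; see the paper for the exact definition).

```latex
\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \, \frac{\ell_i}{\max(p_i,\ \ell_i)}
```

Here $N = 100$ is the number of trials, $S_i \in \{0, 1\}$ indicates success in trial $i$, $\ell_i$ is the shortest-path length from start to goal, and $p_i$ is the path length the robot actually traveled.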
The tables above list the quantitative evaluation results of 100 trials for each map layout with Dijkstra (global) and DWA (local) planners. Our DRL replanner, which learns from its experiences to find better replanning timings, performs comparably to, or sometimes substantially better than, the other rule-based strategies in each environment.
@misc{honda2024replan,
title={When to Replan? An Adaptive Replanning Strategy for Autonomous Navigation using Deep Reinforcement Learning},
author={Kohei Honda and Ryo Yonetani and Mai Nishimura and Tadashi Kozuno},
year={2024},
eprint={2304.12046},
archivePrefix={arXiv},
primaryClass={cs.RO}
}