Deep Deterministic Policy Gradient-Based Autonomous Driving for Mobile Robots in Sparse Reward Environments.

Park, Minjae; Lee, Seok Young; Hong, Jin Seok; Kwon, Nam Kyu

Affiliation

Park M; Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea.
Lee SY; Department of Electronic Engineering, Soonchunhyang University, Asan 31538, Republic of Korea.
Hong JS; Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea.
Kwon NK; Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea.

Sensors (Basel); 22(24), 2022 Dec 07.

Article in English | MEDLINE | ID: mdl-36559941

ABSTRACT

In this paper, we propose a deep deterministic policy gradient (DDPG)-based path-planning method for mobile robots that applies the hindsight experience replay (HER) technique to overcome the performance degradation caused by sparse rewards in autonomous driving. The mobile robot was a Robot Operating System (ROS)-based TurtleBot3, and the experimental environment was a virtual simulation built in Gazebo. A fully connected neural network was used as the DDPG network, following the actor-critic architecture, and noise was added to the actor network. The robot perceived an unknown environment by measuring distances with a laser sensor and learned an optimized policy to reach its destination. The HER technique improved learning performance by generating three new episodes with normal experience from a failed episode. The results show that HER can help mitigate the sparse reward problem; this was further corroborated by successful autonomous driving results obtained after applying the proposed method to two reward systems, as well as by actual experimental results.
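
The HER step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the transition layout, the goal tolerance, the 0/1 sparse reward, and the rule for choosing substitute goals (a randomly chosen position that the robot actually reached) are all assumptions made for illustration. Only the idea of producing three relabeled episodes from one failed episode comes from the abstract.

```python
import math
import random
from typing import List, NamedTuple, Optional, Tuple

Point = Tuple[float, float]


class Transition(NamedTuple):
    """One step of an episode (hypothetical layout, not from the paper)."""
    position: Point               # robot position before the action
    action: Tuple[float, float]   # (linear velocity, angular velocity)
    next_position: Point          # robot position after the action
    goal: Point                   # goal the episode was trying to reach
    reward: float                 # sparse reward for this step


GOAL_TOLERANCE = 0.2  # assumed success radius in metres


def sparse_reward(position: Point, goal: Point, tol: float = GOAL_TOLERANCE) -> float:
    """Assumed sparse reward: 1.0 only when the goal is reached, else 0.0."""
    return 1.0 if math.dist(position, goal) <= tol else 0.0


def her_relabel(
    episode: List[Transition],
    n_goals: int = 3,
    rng: Optional[random.Random] = None,
) -> List[List[Transition]]:
    """Build n_goals relabeled episodes from one (typically failed) episode.

    Each new episode is cut at a randomly chosen step, and the position
    reached at that step is substituted as the goal, so the final
    transition of every relabeled episode earns the success reward.
    """
    if not episode:
        return []
    if rng is None:
        rng = random.Random(0)
    relabeled = []
    for _ in range(n_goals):
        end = rng.randrange(len(episode))
        new_goal = episode[end].next_position
        relabeled.append([
            t._replace(goal=new_goal,
                       reward=sparse_reward(t.next_position, new_goal))
            for t in episode[: end + 1]
        ])
    return relabeled


# A failed episode: the robot drives 1 m along the x-axis,
# but the real goal is far away at (5, 5), so every reward is 0.
real_goal = (5.0, 5.0)
failed_episode = []
for i in range(4):
    start, stop = (0.25 * i, 0.0), (0.25 * (i + 1), 0.0)
    failed_episode.append(
        Transition(start, (0.25, 0.0), stop, real_goal,
                   sparse_reward(stop, real_goal))
    )

new_episodes = her_relabel(failed_episode, n_goals=3)
```

In a DDPG training loop, the transitions in new_episodes would be pushed into the replay buffer alongside the original ones, so the critic sees some successful outcomes even when the real goal is never reached.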

Subject(s)

Automobile Driving; Robotics; Computer Simulation; Policy; Reward

Keywords

autonomous driving; deep deterministic policy gradient; hindsight experience replay; mobile robot; reinforcement learning; sparse reward environments


Full text: 1; Collection: 01-international; Database: MEDLINE; Main subject: Automobile Driving / Robotics; Type of study: Prognostic_studies; Language: English; Journal: Sensors (Basel); Year: 2022; Document type: Article; Country of publication: Switzerland
