End-to-End AUV Motion Planning Method Based on Soft Actor-Critic.
Sensors (Basel); 21(17), 2021 Sep 01.
Article | English | MEDLINE | ID: mdl-34502781
This study addresses the poor exploration ability, single-strategy behavior, and high training cost found in autonomous underwater vehicle (AUV) motion planning tasks, along with difficulties such as multiple constraints and sparse rewards. An end-to-end motion planning system based on deep reinforcement learning is proposed to solve the motion planning problem of an underactuated AUV. The system directly maps the state information of the AUV and its environment into control instructions for the AUV. It is built on the soft actor-critic (SAC) algorithm, which improves exploration ability and robustness to the underwater environment. Generative adversarial imitation learning (GAIL) is additionally used to assist training, overcoming the difficulty and long training time of learning a policy from scratch in reinforcement learning. A comprehensive external reward function is then designed to guide the AUV smoothly to the target point while minimizing path length and travel time as far as possible. Finally, the proposed end-to-end motion planning algorithm is tested and compared on the Unity simulation platform. The results show that the algorithm exhibits strong decision-making ability during navigation, producing a shorter route, less time consumption, and a smoother trajectory. Moreover, GAIL accelerates AUV training and reduces training time without degrading the planning performance of the SAC algorithm.
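The abstract does not give the exact form of the external reward function; the Python sketch below is only a hypothetical illustration of the kind of composite reward it describes, combining progress toward the target point, a per-step time penalty, a trajectory-smoothness penalty, and a terminal bonus. The function name auv_reward, all weights, and the goal radius are assumptions, not values from the paper.

```python
import numpy as np

def auv_reward(pos, prev_pos, goal, action, prev_action,
               dt=0.1, goal_radius=1.0):
    """Hypothetical composite reward for an AUV reaching a target point.

    The terms mirror the objectives named in the abstract (shorter route,
    less time consumption, smoother trajectory), but the weights and exact
    form are illustrative assumptions, not the paper's design.
    """
    # Progress term: positive when the AUV moves closer to the goal.
    d_prev = np.linalg.norm(goal - prev_pos)
    d_curr = np.linalg.norm(goal - pos)
    r_progress = 10.0 * (d_prev - d_curr)

    # Time term: small constant penalty per step to encourage fast arrival.
    r_time = -0.1 * dt

    # Smoothness term: penalize abrupt changes in the control command.
    r_smooth = -0.5 * float(np.sum((action - prev_action) ** 2))

    # Terminal bonus once the target point is reached.
    r_goal = 100.0 if d_curr < goal_radius else 0.0

    return r_progress + r_time + r_smooth + r_goal
```

In a setup like this, the shaped per-step terms keep the reward from being fully sparse, while the terminal bonus preserves the original goal-reaching objective; the GAIL discriminator reward described in the abstract would typically be added on top of such an external reward during early training.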
Database: MEDLINE
Main subject: Algorithms / Learning
Study type: Prognostic study
Language: English
Journal: Sensors (Basel)
Year: 2021
Document type: Article
Country of affiliation: China
Country of publication: Switzerland