Research on hybrid policy optimization method based on deep reinforcement learning for ship heading control and path following

2026-02-24

Sijin Yu, Yunbo Li, Jiaye Gong,
Research on hybrid policy optimization method based on deep reinforcement learning for ship heading control and path following,
Ocean Engineering,
Volume 335,
2025,
121597,
ISSN 0029-8018,
https://doi.org/10.1016/j.oceaneng.2025.121597.
(https://www.sciencedirect.com/science/article/pii/S0029801825013034)
Abstract: With the development of autonomous ships, achieving efficient and precise control in complex environments has become a critical challenge. This study proposes a Hybrid Policy Optimization (HPO) algorithm based on deep reinforcement learning to address the problem of ship heading control. By integrating a PID controller with a deep reinforcement learning agent, the HPO method combines the initial policy guidance of PID control with the adaptive optimization of the Proximal Policy Optimization (PPO) algorithm, achieving efficient and stable control performance. The motion of the ship is modeled using a 3-DOF dynamic model, and the accuracy of this model is validated. The reinforcement learning agent is trained through interaction with this numerical model and achieves precise control and stable maintenance of the target heading. Furthermore, the applicability of the HPO algorithm to complex path-following tasks is validated by combining it with the Line-of-Sight (LOS) guidance method, in scenarios with fixed target angle paths, random waypoint distributions, and complex-shaped path trajectories. The experimental results demonstrate that the HPO control algorithm achieves high-precision heading control under both calm water and wave conditions and shows strong dynamic response in complex path-following tasks.
Keywords: Autonomous ships; Hybrid policy optimization; Deep reinforcement learning; Heading control; Path following
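
The abstract names three building blocks: a PID heading controller, a learned PPO agent, and Line-of-Sight guidance. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. The abstract does not say how the PID output and the agent's action are combined, so the additive residual in `hybrid_rudder` is purely an assumption, as are all function names, the lookahead-based form of LOS, and the coordinate convention (heading measured from the first axis toward the second).

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def los_heading(pos, wp_prev, wp_next, lookahead):
    """Lookahead-based LOS guidance for one straight path segment.

    Returns the desired heading that steers toward a point `lookahead`
    metres ahead of the ship's projection onto the segment.
    """
    path_angle = math.atan2(wp_next[1] - wp_prev[1], wp_next[0] - wp_prev[0])
    dx, dy = pos[0] - wp_prev[0], pos[1] - wp_prev[1]
    # Signed cross-track error: positive when the ship is to the left of the path.
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    return wrap_angle(path_angle + math.atan2(-cross_track, lookahead))


class PID:
    """Textbook PID controller acting on the heading error."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        # No derivative term on the first call, when no previous error exists.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def hybrid_rudder(pid_cmd, agent_residual, limit):
    """ASSUMED combination rule: PID command plus a learned correction, saturated."""
    return max(-limit, min(limit, pid_cmd + agent_residual))
```

In a training loop, the heading error would be `wrap_angle(los_heading(...) - current_heading)`, the PID output would give the baseline rudder command, and the agent's action (here a hypothetical `agent_residual`) would adjust it before the command is applied to the ship model.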