Abstract
Ship collisions often result in heavy losses of life, cargo and ships, as well as serious pollution of the water environment. It is estimated that between 75% and 86% of maritime accidents are related to human factors. It is therefore necessary to enhance the intelligence of ships so that they can partially or fully replace the traditional piloting mode and eventually achieve autonomous collision avoidance, reducing the influence of human factors. In this paper, we propose a multi-ship automatic collision avoidance method based on a double deep Q network (DDQN) with prioritized experience replay. First, we vectorize the predicted hazardous areas as the observation states of the agent, so that similar ship encounter scenarios can be clustered and the input dimension of the neural network can be fixed. The reward function is designed based on the International Regulations for Preventing Collisions at Sea (COLREGs) and human experience. Unlike the architecture of previous collision avoidance methods based on deep reinforcement learning (DRL), in our method the interaction between the agent and the environment occurs only in the collision avoidance decision-making phase, which greatly reduces the number of state transitions in the Markov decision process (MDP). Prioritized experience replay is also used to make the model converge more quickly. Finally, 19 single-vessel collision avoidance scenarios were constructed based on the encounter situations classified by the COLREGs, and these were arranged and combined to form the training set for the agent. The effectiveness of the proposed method in close-quarters situations was verified using the Imazu problem. The simulation results show that the method can achieve multi-ship collision avoidance in crowded waters, and that the decisions it generates conform to the COLREGs and are close to the level of human ship handling.
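To make the two learning components named above concrete, the following is a minimal sketch of a double DQN target and of proportional prioritized sampling in their generic textbook forms. It is not the implementation used in this paper: the function names, the single-transition interface, and the values of `gamma`, `alpha` and `eps` are assumptions chosen purely for illustration.

```python
import random


def ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target for one transition (hypothetical helper).

    q_online_next / q_target_next are lists of Q-values for the next state,
    one entry per action, from the online and target networks respectively.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the greedy next action...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i**alpha / sum_k p_k**alpha,
    with p_i = |TD error_i| + eps. alpha and eps are illustrative values."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(probs, batch_size, rng=random):
    """Draw replay-buffer indices (with replacement) according to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=batch_size)
```

Separating action selection (online network) from action evaluation (target network) is what distinguishes the double DQN target from the standard DQN target, and sampling transitions in proportion to their TD error is the mechanism by which prioritized replay is expected to speed up convergence.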