Abstract
Water segmentation is essential to the autonomous driving system of unmanned surface vehicles (USVs), providing reliable navigation information for safety decisions. However, existing methods use only monocular images as input and therefore often suffer from changes in illumination and weather. Compared with monocular images, LiDAR point clouds can be collected independently of ambient light and provide rich 3D information, but they lack the color and texture that images offer. In this paper, we therefore propose a novel camera-LiDAR cross-modality fusion method that, for the first time, combines the complementary characteristics of 2D images and 3D LiDAR point clouds for water segmentation. Specifically, the 3D point clouds are first supplemented with 2D color and texture information from the images and then classified into water-surface and non-water points by an early 3D cross-modality segmentation module. Subsequently, the 3D segmentation results and features are fed into a late 2D cross-modality segmentation module to perform 2D water segmentation. Finally, the 2D and 3D water segmentation results are fused and refined by an uncertainty-aware cross-modality fusion module. We further collect, annotate, and present a novel Cross-modality Water Segmentation (CMWS) dataset to validate the proposed method. To the best of our knowledge, this is the first water segmentation dataset for USVs in inland waterways that consists of images and corresponding point clouds. Extensive experiments on the CMWS dataset demonstrate that our method significantly improves image-only approaches, yielding gains in accuracy and MaxF of approximately 2% across all image-only methods evaluated.