Abstract
To address the limitations of existing short-text entity disambiguation methods, specifically their insufficient feature extraction and reliance on massive training samples, we propose an entity disambiguation model called COLBERT, which fuses LDA-based topic features with BERT-based semantic features and employs contrastive learning to enhance the disambiguation process. Experiments on a publicly available Chinese short-text entity disambiguation dataset show that the proposed model achieves an F1-score of 84.0%, outperforming the benchmark method by 0.6 percentage points. Moreover, with a limited number of training samples, our model achieves an F1-score of 74.5%, 2.8 percentage points higher than the benchmark method. These results demonstrate that our model offers better effectiveness and robustness and can reduce both the data annotation burden and training costs.
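For illustration only, the following minimal PyTorch sketch shows the kind of feature fusion and contrastive objective described above; the module names, dimensions, and the InfoNCE-style loss are assumptions for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopicSemanticFusion(nn.Module):
    """Hypothetical fusion of a BERT [CLS] embedding with an LDA topic distribution."""
    def __init__(self, bert_dim=768, topic_dim=50, out_dim=256):
        super().__init__()
        self.proj = nn.Linear(bert_dim + topic_dim, out_dim)

    def forward(self, cls_emb, topic_dist):
        # Concatenate semantic and topic features, then project into a shared space.
        fused = torch.cat([cls_emb, topic_dist], dim=-1)
        return F.normalize(self.proj(fused), dim=-1)

def contrastive_loss(mention_vec, candidate_vecs, positive_idx, temperature=0.07):
    """InfoNCE-style loss: pull the correct entity close, push other candidates away."""
    logits = mention_vec @ candidate_vecs.t() / temperature  # (batch, num_candidates)
    return F.cross_entropy(logits, positive_idx)

# Toy usage with random tensors standing in for real BERT/LDA outputs.
fusion = TopicSemanticFusion()
mentions = fusion(torch.randn(4, 768), torch.rand(4, 50))      # 4 mention contexts
candidates = fusion(torch.randn(8, 768), torch.rand(8, 50))    # 8 candidate entities
loss = contrastive_loss(mentions, candidates, torch.tensor([0, 1, 2, 3]))
print(loss.item())
```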