Abstract
Efficient document recognition and sharing remain challenges in the healthcare, insurance, and finance sectors. One approach is to use deep learning techniques to automatically extract structured information from paper documents. In particular, structured extraction from medical examination reports (MERs) can improve medical efficiency, data analysis, and scientific research. Current methods focus on reconstructing table bodies but often overlook table headers, which leads to incomplete information extraction. This paper proposes MSIE-Net (multi-stage-structured information extraction network), a novel structured information extraction method that leverages refined attention transformers and associated entity detection to retrieve MER information more completely. MSIE-Net comprises three stages. First, RVI-LayoutXLM (refined visual-feature independent LayoutXLM) performs key information extraction. In this stage, the refined attention accentuates the interaction between different modalities by adjusting the attention score at the current position using previous position information. This design enables RVI-LayoutXLM to learn more specific contextual information and thereby improves extraction performance. Next, the associated entity detection module, RIFD-Net (relevant intra-layer fine-tuned detection network), identifies each test item's location within the MER table body. Notably, the backbone of RIFD-Net incorporates the intra-layer feature adjustment module (IFAM) to extract global features while homing in on local areas, which makes it especially sensitive for inspection tasks with dense and long bins. Finally, structured post-processing based on coordinate aggregation links the outputs of the prior stages. For the evaluation, we constructed the Chinese medical examination report dataset (CMERD), based on real medical scenarios.
MSIE-Net achieved competitive performance on both key information extraction and associated entity detection. Experimental results validate MSIE-Net's ability to detect key entities in MER and table images with various complex layouts, perform entity relation extraction, and generate structured labels, laying the groundwork for intelligent medical documentation.
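
As a rough illustration of the final stage, the sketch below shows one way that coordinate aggregation could link extracted key entities to detected table rows. It is not the authors' implementation: the `Box` and `Entity` types, the label names ("item", "result"), and the center-in-box assignment rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box in image coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Entity:
    """A key entity from the extraction stage (hypothetical type)."""
    text: str
    label: str  # assumed label names, e.g. "item" or "result"
    box: Box


def aggregate_rows(row_boxes, entities):
    """Assign each entity to the first row box that contains its center.

    Returns one dict per row box, mapping entity label to entity text.
    Entities whose center lies in no row box are dropped.
    """
    rows = [{} for _ in row_boxes]
    for ent in entities:
        cx = (ent.box.x0 + ent.box.x1) / 2
        cy = (ent.box.y0 + ent.box.y1) / 2
        for i, row in enumerate(row_boxes):
            if row.x0 <= cx <= row.x1 and row.y0 <= cy <= row.y1:
                rows[i][ent.label] = ent.text
                break
    return rows


# Two detected rows and four extracted entities (made-up coordinates).
row_boxes = [Box(0, 0, 100, 10), Box(0, 10, 100, 20)]
entities = [
    Entity("WBC", "item", Box(2, 2, 20, 8)),
    Entity("5.2", "result", Box(40, 2, 50, 8)),
    Entity("RBC", "item", Box(2, 12, 20, 18)),
    Entity("4.7", "result", Box(40, 12, 50, 18)),
]
structured = aggregate_rows(row_boxes, entities)
```

Here each row dict plays the role of one structured record. A practical system would likely need a more tolerant matching rule, for example an overlap threshold, to cope with skewed scans or imprecise detections.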