Applied Sciences, Vol. 10, Issue 22 (2020)
ARTICLE
TITLE

Learning-Free Text Line Segmentation for Historical Handwritten Documents

Berat Kurar Barakat    
Rafi Cohen    
Ahmad Droby    
Irina Rabaev and Jihad El-Sana    

Abstract

We present a learning-free method for text line segmentation of historical handwritten document images. The method combines automatic scale selection with second derivatives of anisotropic Gaussian filters to detect the blob lines that strike through the text lines. The detected blob lines then guide an energy minimization procedure that extracts the text lines. Historical handwritten documents contain noise, heterogeneous text line heights, skew, and characters that touch across neighboring text lines. Automatic scale selection lets the method adapt to the heterogeneous nature of handwritten text lines, provided that the character height range is estimated correctly. In the extraction phase, the method can accurately split characters that touch across text lines. We report results for various settings and compare the method with recent learning-free and learning-based methods on the cBAD competition dataset.
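The scale selection step mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the elongation factor `eta`, the 3-sigma kernel truncation, and the sigma-squared normalization are all assumptions made for illustration. The sketch filters a synthetic, perfectly horizontal text line with the second derivative (across the line) of an anisotropic Gaussian at several scales and keeps the scale with the strongest normalized response. For an ideal uniform band of height h, this normalized response peaks near sigma = h/2, which is why a candidate sigma range can be derived from an estimated character height range.

```python
import math


def gaussian(t, sigma):
    """Value of a 1D Gaussian with standard deviation sigma at offset t."""
    return math.exp(-t * t / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)


def gaussian_dd(t, sigma):
    """Second derivative of the 1D Gaussian at offset t."""
    return (t * t - sigma * sigma) / sigma ** 4 * gaussian(t, sigma)


def blob_response(image, row, col, sigma_y, eta=3.0):
    """Scale-normalized response at one pixel (hypothetical helper).

    Smooths along x with a Gaussian of width eta * sigma_y and takes the
    second derivative along y with width sigma_y. Ink is assumed to be 1.0
    on a 0.0 background, so the sign is flipped to make line centers positive.
    """
    sigma_x = eta * sigma_y
    ry, rx = int(3 * sigma_y), int(3 * sigma_x)
    kx = [gaussian(dx, sigma_x) for dx in range(-rx, rx + 1)]
    norm = sum(kx)
    kx = [v / norm for v in kx]  # smoothing kernel sums to 1
    height, width = len(image), len(image[0])
    total = 0.0
    for dy in range(-ry, ry + 1):
        r = row + dy
        if not 0 <= r < height:
            continue  # pixels outside the image count as background
        smoothed = 0.0
        for i, dx in enumerate(range(-rx, rx + 1)):
            c = col + dx
            if 0 <= c < width:
                smoothed += kx[i] * image[r][c]
        total += gaussian_dd(dy, sigma_y) * smoothed
    # sigma^2 normalization makes responses comparable across scales
    return -(sigma_y ** 2) * total


def select_scale(image, row, col, sigmas, eta=3.0):
    """Return the candidate sigma with the largest normalized response."""
    return max(sigmas, key=lambda s: blob_response(image, row, col, s, eta))


def synthetic_line(height=81, width=201, line_height=13):
    """Synthetic image: one horizontal band of ink centered vertically."""
    top = (height - line_height) // 2
    return [[1.0 if top <= r < top + line_height else 0.0 for _ in range(width)]
            for r in range(height)]


if __name__ == "__main__":
    img = synthetic_line()
    best = select_scale(img, 40, 100, range(2, 11))
    print("selected sigma:", best)
```

In a full pipeline the same selection would run at every pixel, and the per-pixel maxima would form the blob-line map that guides the extraction step. Real pages also contain skewed lines, so an oriented filter bank would be needed rather than the single horizontal orientation assumed here.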
