ARTICLE
TITLE

Algorithms for Table Structure Recognition

Yosveni Escalona    

Abstract

Tables are widely used to organize and publish data. The Web, for example, holds an enormous number of tables: published in HTML, embedded in PDF documents, or available for direct download from Web pages. However, tables are not always easy to interpret because of the variety of features and formats they use. Indeed, a large number of methods and tools have been developed to interpret tables. This work presents the implementation of an algorithm, based on Conditional Random Fields (CRFs), that classifies the rows of a table as header rows, data rows, or metadata rows. The implementation is complemented by two algorithms for table recognition in spreadsheet documents, one based on rules and the other on region detection. Finally, the work describes the results and benefits obtained by applying the implemented algorithm to HTML tables obtained from the Web and to spreadsheet tables downloaded from the Brazilian National Petroleum Agency.
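To make the row-classification idea concrete, the following is a minimal sketch of how a linear-chain CRF can label a sequence of rows at prediction time, using Viterbi decoding. It is not the implementation described in this work: the two row features, the hand-set weights in EMISSION and TRANSITION, and the function names are all assumptions chosen for illustration. A real CRF would learn its weights from annotated tables.

```python
from typing import Dict, List, Sequence

LABELS = ["metadata", "header", "data"]

# Hand-set, illustrative weights (assumption). A trained CRF learns these.
EMISSION = {
    "metadata": {"bias": 1.0, "frac_filled": -1.5, "frac_numeric": -1.0},
    "header": {"bias": 0.0, "frac_filled": 1.0, "frac_numeric": -3.0},
    "data": {"bias": -1.0, "frac_filled": 1.0, "frac_numeric": 3.0},
}
# Score for moving from the label of one row to the label of the next row.
TRANSITION = {
    ("metadata", "metadata"): 0.5, ("metadata", "header"): 1.0,
    ("metadata", "data"): -0.5, ("header", "header"): 0.0,
    ("header", "data"): 1.0, ("header", "metadata"): -1.0,
    ("data", "data"): 1.0, ("data", "header"): -2.0,
    ("data", "metadata"): -0.5,
}


def row_features(row: Sequence[str]) -> Dict[str, float]:
    """Hypothetical features: share of filled cells and of numeric cells."""
    cells = [c.strip() for c in row]
    filled = [c for c in cells if c]
    if not filled:
        return {"frac_filled": 0.0, "frac_numeric": 0.0}
    numeric = [
        c for c in filled
        if c.replace(".", "", 1).replace(",", "").isdigit()
    ]
    return {
        "frac_filled": len(filled) / len(cells),
        "frac_numeric": len(numeric) / len(filled),
    }


def emission_score(label: str, feats: Dict[str, float]) -> float:
    """Linear score of one label for one row."""
    w = EMISSION[label]
    return w["bias"] + sum(w[name] * value for name, value in feats.items())


def classify_rows(table: List[Sequence[str]]) -> List[str]:
    """Viterbi decoding: the highest-scoring label sequence for all rows."""
    if not table:
        return []
    feats = [row_features(r) for r in table]
    best = [{lab: emission_score(lab, feats[0]) for lab in LABELS}]
    back = []  # back[t][lab] = best previous label when row t+1 gets lab
    for f in feats[1:]:
        scores, pointers = {}, {}
        for lab in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + TRANSITION[(p, lab)])
            scores[lab] = (best[-1][prev] + TRANSITION[(prev, lab)]
                           + emission_score(lab, f))
            pointers[lab] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(LABELS, key=lambda lab: best[-1][lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


example = [
    ["Production report 2020", "", ""],
    ["Field", "Oil", "Gas"],
    ["A", "100", "20"],
    ["B", "250.5", "31"],
]
print(classify_rows(example))  # ['metadata', 'header', 'data', 'data']
```

The transition scores are what distinguish a sequence model from classifying each row in isolation: a header row is rewarded for following metadata and penalized for following data, so the label of one row influences its neighbors.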

 Similar articles

       
 
Wenbin Zheng, Jinjin Li and Shujiao Liao    
Multi-label learning has become a hot topic in recent years, attracting scholars' attention, including applying the rough set model in multi-label learning. Exciting works that apply the rough set model into multi-label learning usually adapt the rough s...
Journal: Information

 
Subramanyam Shashi Kumar and Prakash Ramachandran    
Nowadays, healthcare is becoming very modern, and the support of Internet of Things (IoT) is inevitable in a personal healthcare system. A typical personal healthcare system acquires vital parameters from human users and stores them in a cloud platform f...
Journal: Applied Sciences

 
Chun-Liang Lee, Guan-Yu Lin and Yaw-Chung Chen    
To support advanced network services, Internet routers must perform packet classification based on a set of rules called packet filters. If two or more filters overlap, a filter conflict will occur and lead to ambiguity in packet classification. Further,...
Journal: Algorithms

 
Carlos Serrano, Jesus-Enrique Sierra-Garcia and Matilde Santos    
Floating offshore wind turbines (FOWTs) are systems with complex and highly nonlinear dynamics; they are subjected to heavy loads, making control with classical strategies a challenge. In addition, they experience vibrations due to wind and waves. Furthe...

 
Marco Peterson, Minzhen Du, Bryant Springle and Jonathan Black    
The proliferation of reusable space vehicles has fundamentally changed how assets are injected into the low earth orbit and beyond, increasing both the reliability and frequency of launches. Consequently, it has led to the rapid development and adoption ...
Journal: Aerospace