JOURNAL
Information

ARTICLE

TITLE

A Compression-Based Toolkit for Modelling and Processing Natural Language Text

William John Teahan

Abstract

A novel compression-based toolkit for modelling and processing natural language text is described. The design of the toolkit adopts an encoding perspective: applications are considered to be problems in searching for the best encoding of different transformations of the source text into the target text. This paper describes a two-phase "noiseless channel model" architecture that underpins the toolkit, which models text processing as lossless communication down a noise-free channel. The transformation and encoding performed in the first phase must be both lossless and reversible. The role of the second phase, verification and decoding, is to verify the correctness of the communication of the target text that is produced by the application. This paper argues that this encoding approach has several advantages over the decoding approach of the standard noisy channel model. The concepts abstracted by the toolkit's design are explained together with details of the library calls. The pseudo-code for a number of algorithms is also described for the applications that the toolkit implements, including encoding, decoding, classification, training (model building), parallel sentence alignment, word segmentation and language segmentation. Some experimental results, implementation details, memory usage and execution speeds are also discussed for these applications.
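The encoding perspective in the abstract can be illustrated with a minimal sketch of compression-based classification: train one model per class, then pick the class whose model encodes the text in the fewest bits. This is not the toolkit's API or its modelling method. The `CharBigramModel` class, the `classify` function, the add-one smoothing, the 256-symbol alphabet and the tiny training strings are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class CharBigramModel:
    """Order-1 character model with add-one smoothing.

    A hypothetical stand-in for a compression model; the toolkit's own
    models are not reproduced here.
    """

    def __init__(self, training_text, alphabet_size=256):
        self.alphabet_size = alphabet_size  # assumed symbol count for smoothing
        self.counts = defaultdict(Counter)
        # Training: count how often each character follows each context character.
        for prev, cur in zip(training_text, training_text[1:]):
            self.counts[prev][cur] += 1

    def codelength(self, text):
        """Return the number of bits an ideal coder would need for `text`."""
        bits = 0.0
        for prev, cur in zip(text, text[1:]):
            ctx = self.counts.get(prev, Counter())
            total = sum(ctx.values())
            p = (ctx[cur] + 1) / (total + self.alphabet_size)
            bits -= math.log2(p)
        return bits


def classify(text, models):
    """Pick the label whose model gives the shortest encoding of `text`."""
    return min(models, key=lambda label: models[label].codelength(text))


# Toy training data (illustrative only).
models = {
    "english": CharBigramModel("the cat sat on the mat and the dog ran to the park " * 20),
    "spanish": CharBigramModel("el gato se sentó en la alfombra y el perro corrió al parque " * 20),
}

print(classify("the dog sat on the mat", models))
print(classify("el perro corrió en la alfombra", models))
```

The same "shortest encoding wins" comparison is the idea behind the segmentation and alignment applications the abstract lists, although the search over candidate transformations is more involved there than the single `min` used in this sketch.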

Keywords

text compression - text processing - encoding - decoding

ISSUE

Volume: 9 Issue: 12 (2018)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Applied Sciences
Aerospace
Algorithms

DOI

https://doi.org/10.3390/info9120294

Similar articles

Sep-RefineNet: A Deinterleaving Method for Radar Signals Based on Semantic Segmentation

Yongjiang Mao, Wenjuan Ren, Xipeng Li, Zhanpeng Yang and Wei Cao

With the progress of signal processing technology and the emergence of new system radars, the space electromagnetic environment becomes more and more complex, which puts forward higher requirements for the deinterleaving method of radar signals. Traditio...

Journal: Applied Sciences

The Recommendation Algorithm Based on Improved Conditional Variational Autoencoder and Constrained Probabilistic Matrix Factorization

Yunfei Zhang, Hongzhen Xu and Xiaojun Yu

An improved recommendation algorithm based on Conditional Variational Autoencoder (CVAE) and Constrained Probabilistic Matrix Factorization (CPMF) is proposed to address the issues of poor recommendation performance in traditional user-based collaborativ...

Journal: Applied Sciences

Encoder-Decoder Structure Fusing Depth Information for Outdoor Semantic Segmentation

Songnan Chen, Mengxia Tang, Ruifang Dong and Jiangming Kan

The semantic segmentation of outdoor images is the cornerstone of scene understanding and plays a crucial role in the autonomous navigation of robots. Although RGB-D images can provide additional depth information for improving the performance of semanti...

Journal: Applied Sciences

CI-UNet: Application of Segmentation of Medical Images of the Human Torso

Junkang Qin, Xiao Wang, Dechang Mi, Qinmu Wu, Zhiqin He and Yu Tang

The study of human torso medical image segmentation is significant for computer-aided diagnosis of human examination, disease tracking, and disease prevention and treatment. In this paper, two application tasks are designed for torso medical images: the ...

Journal: Applied Sciences

Nearest Neighbours Graph Variational AutoEncoder

Lorenzo Arsini, Barbara Caccia, Andrea Ciardiello, Stefano Giagu and Carlo Mancini Terracciano

Graphs are versatile structures for the representation of many real-world data. Deep Learning on graphs is currently able to solve a wide range of problems with excellent results. However, both the generation of graphs and the handling of large graphs st...

Journal: Algorithms
