Information, Vol. 10, Issue 2 (2019)
ARTICLE
TITLE

Automatic Acquisition of Annotated Training Corpora for Test-Code Generation

Magdalena Kacmajor and John D. Kelleher    

Abstract

Open software repositories make large amounts of source code publicly available. Potentially, this source code could be used as training data to develop new, machine learning-based programming tools. For many applications, however, raw code scraped from online repositories does not constitute an adequate training dataset. Building on the recent and rapid improvements in machine translation (MT), one promising application is code generation from natural language descriptions. One of the bottlenecks in developing these MT-inspired systems is the acquisition of the parallel text-code corpora required for training code-generative models. This paper addresses the problem of automatically synthesizing parallel text-code corpora in the software testing domain. Our approach is based on the observation that self-documentation through descriptive method names is widely adopted in test automation, in particular for unit testing. Therefore, we propose synthesizing parallel corpora composed of parsed test function names, which serve as code descriptions, aligned with the corresponding function bodies. We present the results of applying a state-of-the-art MT method to such a generated dataset. Our experiments show that a neural MT model trained on our dataset can generate syntactically correct and semantically relevant short Java functions from quasi-natural language descriptions of functionality.
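
The pairing idea described in the abstract can be illustrated with a minimal sketch. It assumes test method names follow camelCase or snake_case conventions; the helpers `split_test_name` and `make_pair`, the regular expression, and the sample Java snippet are hypothetical illustrations, not the authors' actual parser or data.

```python
import re

# Hypothetical tokenizer (an assumption, not the paper's parser):
# matches acronyms, capitalized or lowercase words, and digit runs.
_TOKEN = re.compile(r"[A-Z]+(?=[A-Z][a-z])|[A-Z]?[a-z]+|[A-Z]+|\d+")


def split_test_name(name: str) -> list[str]:
    """Split a camelCase or snake_case method name into lowercase tokens."""
    return [tok.lower() for tok in _TOKEN.findall(name)]


def make_pair(name: str, body: str) -> tuple[str, str]:
    """Align the parsed name (description side) with the function body (code side)."""
    return " ".join(split_test_name(name)), body.strip()


# Illustrative input only; the Java body is an invented example.
pair = make_pair(
    "testAddReturnsSumOfTwoNumbers",
    "assertEquals(5, calculator.add(2, 3));",
)
print(pair)
# ('test add returns sum of two numbers', 'assertEquals(5, calculator.add(2, 3));')
```

Each resulting pair has a quasi-natural language description on one side and code on the other, which is the shape of parallel data an MT-style model would be trained on.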
