Algorithms, Vol. 15, No. 1 (2022)
ARTICLE
TITLE

Accelerating Symmetric Rank-1 Quasi-Newton Method with Nesterov's Gradient for Training Neural Networks

S. Indrapriyadarsini    
Shahrzad Mahboubi    
Hiroshi Ninomiya    
Takeshi Kamio and Hideki Asai    

Abstract

Gradient-based methods are widely used in training neural networks and can be broadly categorized into first-order and second-order methods. Second-order methods have been shown to converge better than first-order methods, especially on highly nonlinear problems. The BFGS quasi-Newton method is the most commonly studied second-order method for neural network training. Recent methods have been shown to speed up the convergence of the BFGS method using Nesterov's accelerated gradient and momentum terms. The Symmetric Rank-1 (SR1) quasi-Newton method, though less commonly used in training neural networks, is known to have interesting properties and to provide good Hessian approximations when used with a trust-region approach. Thus, this paper investigates accelerating the SR1 quasi-Newton method with Nesterov's gradient for training neural networks and briefly discusses its convergence. The performance of the proposed method is evaluated on a function approximation problem and an image classification problem.
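
To illustrate the two ingredients the abstract combines, here is a minimal Python sketch. It is not the paper's algorithm: it omits the trust-region machinery entirely, and the names `sr1_update` and `nesterov_sr1_step`, the momentum `mu`, the step size `lr`, and the skip threshold `r` are all assumptions made for illustration. It pairs the textbook SR1 update of an inverse-Hessian approximation with a gradient evaluated at the Nesterov look-ahead point `w + mu * v`.

```python
import numpy as np

def sr1_update(H, s, y, r=1e-8):
    """Textbook SR1 update of an inverse-Hessian approximation H.

    s is the parameter step and y the matching gradient difference.
    The update is skipped when its denominator is too small to be safe.
    """
    u = s - H @ y
    denom = u @ y
    if abs(denom) <= r * np.linalg.norm(u) * np.linalg.norm(y):
        return H  # skip: the rank-1 correction would be unstable
    return H + np.outer(u, u) / denom

def nesterov_sr1_step(grad, w, v, H, mu=0.9, lr=1.0):
    """One illustrative step (assumed form, no trust region).

    The gradient is taken at the look-ahead point w + mu * v, and the
    curvature pair (s, y) is measured relative to that same point.
    """
    look = w + mu * v              # Nesterov look-ahead point
    g = grad(look)                 # gradient at the look-ahead point
    v_new = mu * v - lr * (H @ g)  # momentum plus quasi-Newton direction
    w_new = w + v_new
    s = w_new - look
    y = grad(w_new) - g
    return w_new, v_new, sr1_update(H, s, y)

# Usage on a small quadratic f(w) = 0.5 * w^T A w - b^T w (hypothetical data).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
w, v, H = nesterov_sr1_step(lambda w: A @ w - b, np.zeros(2), np.zeros(2), np.eye(2))
```

Whenever the update is not skipped, the new approximation satisfies the secant condition `H_new @ y == s`, which is the property that lets SR1 build up curvature information. Unlike BFGS, SR1 does not keep `H` positive definite, which is one reason it is usually paired with a trust-region approach, as the abstract notes.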

 Similar articles

       
 
Puti Yan, Zhen Cao, Jiangbo Peng, Chaobo Yang, Xin Yu, Penghua Qiu, Shanchun Zhang, Minghong Han, Wenbei Liu and Zuo Jiang    
A flame's structural feature is a crucial parameter required to comprehensively understand the interaction between turbulence and flames. The generation and evolution processes of the structure feature have rarely been investigated in lean blowout (LBO) ...
Journal: Aerospace

 
Diya Wang, Yonglin Zhang, Lixin Wu, Yupeng Tai, Haibin Wang, Jun Wang, Fabrice Meriaudeau and Fan Yang    
In recent years, the study of deep learning techniques for underwater acoustic channel estimation has gained widespread attention. However, existing neural network channel estimation methods often overfit to training dataset noise levels, leading to dimi...

 
Károly Héberger    
Background: The development and application of machine learning (ML) methods have become so fast that almost nobody can follow their developments in every detail. It is no wonder that numerous errors and inconsistencies in their usage have also spread wi...
Journal: Algorithms

 
Stanislav Kirpichenko, Lev Utkin, Andrei Konstantinov and Vladimir Muliukha    
A method for estimating the conditional average treatment effect under the condition of censored time-to-event data, called BENK (the Beran Estimator with Neural Kernels), is proposed. The main idea behind the method is to apply the Beran estimator for e...
Journal: Algorithms

 
Darian M. Onchis, Flavia Costi, Codruta Istin, Ciprian Cosmin Secasan and Gabriel V. Cozma    
(1) Background: Lung cancers are the most common cancers worldwide, and prostate cancers are among the second in terms of the frequency of cancers diagnosed in men. Automatic ranking of the risk groups of such diseases is highly in demand, but the clinic...
Journal: Applied Sciences