JOURNAL
Algorithms

ARTICLE

TITLE

Deep Neural Networks Training by Stochastic Quasi-Newton Trust-Region Methods

Mahsa Yousefi and Ángeles Martínez

Abstract

While first-order methods are popular for solving optimization problems arising in deep learning, they come with some acute deficiencies. To overcome these shortcomings, there has been recent interest in introducing second-order information through quasi-Newton methods that are able to construct Hessian approximations using only gradient information. In this work, we study the performance of stochastic quasi-Newton algorithms for training deep neural networks. We consider two well-known quasi-Newton updates, the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) and the symmetric rank one (SR1). This study fills a gap concerning the real performance of both updates in the minibatch setting and analyzes whether more efficient training can be obtained when using the more robust BFGS update or the cheaper SR1 formula, which, allowing for indefinite Hessian approximations, can potentially help to better navigate the pathological saddle points present in the non-convex loss functions found in deep learning. We present and discuss the results of an extensive experimental study that includes many aspects affecting performance, like batch normalization, the network architecture, the limited memory parameter or the batch size. Our results show that stochastic quasi-Newton algorithms are efficient and, in some instances, able to outperform the well-known first-order Adam optimizer, run with the optimal combination of its numerous hyperparameters, and the stochastic second-order trust-region STORM algorithm.
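
As a rough illustration of the two updates named in the abstract, the sketch below applies the textbook dense BFGS and SR1 formulas to a small Hessian approximation B, given a step s and a gradient difference y. It is not the authors' implementation: the paper concerns limited-memory variants inside a stochastic trust-region framework, whereas this uses full matrices on plain Python lists. The function names and the skip tolerance `eps` are assumptions made for the example.

```python
# Illustrative sketch only: dense (not limited-memory) BFGS and SR1 updates.
# Function names and the tolerance eps are assumptions, not taken from the paper.

def matvec(B, v):
    return [sum(b * x for b, x in zip(row, v)) for row in B]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    return [[a * b for b in v] for a in u]

def bfgs_update(B, s, y, eps=1e-8):
    """B+ = B - (Bs)(Bs)^T / (s^T B s) + y y^T / (y^T s).

    The update is skipped unless the curvature condition y^T s > 0 holds,
    which keeps a positive definite B positive definite.
    """
    Bs = matvec(B, s)
    sBs = dot(s, Bs)
    ys = dot(y, s)
    if ys <= eps * dot(s, s) or sBs <= 0.0:
        return B  # curvature condition fails: keep the previous approximation
    t1, t2 = outer(Bs, Bs), outer(y, y)
    n = len(s)
    return [[B[i][j] - t1[i][j] / sBs + t2[i][j] / ys for j in range(n)]
            for i in range(n)]

def sr1_update(B, s, y, eps=1e-8):
    """B+ = B + r r^T / (r^T s) with r = y - B s.

    The update is skipped only when the denominator is tiny; the result
    may be indefinite, so negative curvature can be represented.
    """
    Bs = matvec(B, s)
    r = [yi - bi for yi, bi in zip(y, Bs)]
    rs = dot(r, s)
    if abs(rs) <= eps * (dot(r, r) ** 0.5) * (dot(s, s) ** 0.5):
        return B
    rr = outer(r, r)
    n = len(s)
    return [[B[i][j] + rr[i][j] / rs for j in range(n)] for i in range(n)]

identity = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.0]
y_neg = [-2.0, 0.0]  # y^T s < 0: negative curvature along s

B_bfgs = bfgs_update(identity, s, y_neg)  # skipped, B is unchanged
B_sr1 = sr1_update(identity, s, y_neg)    # picks up the negative curvature
```

With negative curvature along s, the BFGS update is skipped while SR1 produces an indefinite matrix satisfying the secant equation B+ s = y. An indefinite model of this kind is what a trust-region subproblem, unlike a line search, can handle directly.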

Keywords

stochastic optimization - quasi-Newton methods - trust-region methods - BFGS - SR1 - deep neural networks training
MSC: 90C30 - 90C06 - 90C53 - 90C90 - 65K05

ISSUE

Volume: 16 Issue: 10 (2023)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Journal of Marine Science and Engineering
Algorithms
Applied Sciences

DOI

https://doi.org/10.3390/a16100490

Similar articles

ADMM-1DNet: Online Monitoring Method for Outdoor Mechanical Equipment Part Signals Based on Deep Learning and Compressed Sensing

Jingyi Hu, Junfeng Guo, Zhiyuan Rui and Zhiming Wang

To solve the problem that noise seriously affects the online monitoring of parts signals of outdoor machinery, this paper proposes a signal reconstruction method integrating deep neural network and compression sensing, called ADMM-1DNet, and gives a deta...

Journal: Applied Sciences

Deep Learning-Based Wave Overtopping Prediction

Alberto Alvarellos, Andrés Figuero, Santiago Rodríguez-Yáñez, José Sande, Enrique Peña, Paulo Rosa-Santos and Juan Rabuñal

Port managers can use predictions of the wave overtopping predictors created in this work to take preventative measures and optimize operations, ultimately improving safety and helping to minimize the economic impact that overtopping events have on the p...

Journal: Applied Sciences

Improved SE-ResNet Acoustic-Vibration Fusion for Rolling Bearing Composite Fault Diagnosis

Xiaojiao Gu, Yang Tian, Chi Li, Yonghe Wei and Dashuai Li

The fault diagnosis method proposed in this paper can be applied to the diagnosis of bearings in machine tool spindle systems.

Journal: Applied Sciences

DBSTGNN-Att: Dual Branch Spatio-Temporal Graph Neural Network with an Attention Mechanism for Cellular Network Traffic Prediction

Zengyu Cai, Chunchen Tan, Jianwei Zhang, Liang Zhu and Yuan Feng

As network technology continues to develop, the popularity of various intelligent terminals has accelerated, leading to a rapid growth in the scale of wireless network traffic. This growth has resulted in significant pressure on resource consumption and ...

Journal: Applied Sciences

Sequential Brain CT Image Captioning Based on the Pre-Trained Classifiers and a Language Model

Jin-Woo Kong, Byoung-Doo Oh, Chulho Kim and Yu-Seop Kim

Intracerebral hemorrhage (ICH) is a severe cerebrovascular disorder that poses a life-threatening risk, necessitating swift diagnosis and treatment. While CT scans are the most effective diagnostic tool for detecting cerebral hemorrhage, their interpreta...

Journal: Applied Sciences
