Journal: AI, Vol. 4, Issue 4 (2023)
ARTICLE

Deep Learning Performance Characterization on GPUs for Various Quantization Frameworks

Muhammad Ali Shafique, Arslan Munir and Joonho Kong

Abstract

Deep learning is employed in many applications, such as computer vision, natural language processing, robotics, and recommender systems. Large and complex neural networks lead to high accuracy; however, they adversely affect many aspects of deep learning performance, such as training time, latency, throughput, energy consumption, and memory usage in the training and inference stages. To address these challenges, various optimization techniques and frameworks have been developed for the efficient performance of deep learning models in the training and inference stages. Although optimization techniques such as quantization have been studied thoroughly in the past, less work has been done to study the performance of frameworks that provide quantization techniques. In this paper, we have used different performance metrics to study the performance of various quantization frameworks, including TensorFlow automatic mixed precision and TensorRT. These performance metrics include training time and memory utilization in the training stage along with latency and throughput for graphics processing units (GPUs) in the inference stage. We have applied the automatic mixed precision (AMP) technique during the training stage using the TensorFlow framework, while for inference we have utilized the TensorRT framework for the post-training quantization technique using the TensorFlow TensorRT (TF-TRT) application programming interface (API). We performed model profiling for different deep learning models, datasets, image sizes, and batch sizes for both the training and inference stages, the results of which can help developers and researchers to devise and deploy efficient deep learning models for GPUs.
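The training-stage technique the abstract describes, AMP in TensorFlow, can be outlined with the Keras mixed-precision API. The sketch below is illustrative only: the model, input shape, and hyperparameters are hypothetical placeholders, not the configurations profiled in the paper, and the abstract does not specify which AMP mechanism the authors used.

```python
# Minimal sketch: enabling TensorFlow automatic mixed precision (AMP) for
# training. Model, input shape, and hyperparameters are hypothetical
# placeholders, not the paper's profiled configurations.
import tensorflow as tf

# One common way to turn on AMP in TF 2.x (TF >= 2.4): float16 compute with
# float32 master weights. Keras also wraps the optimizer in a
# LossScaleOptimizer automatically to prevent float16 gradient underflow.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    # Keep the final layer in float32 for numerically stable softmax outputs.
    tf.keras.layers.Dense(10, activation="softmax", dtype="float32"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(train_ds, epochs=5)  # train_ds: a hypothetical tf.data pipeline
```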

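For the inference stage, the abstract cites the TF-TRT API for post-training quantization. A minimal FP16 conversion sketch follows; the directory names are hypothetical, and the conversion-parameter interface differs slightly across TensorFlow versions.

```python
# Minimal sketch: post-training FP16 conversion of a SavedModel with the
# TF-TRT API (TrtGraphConverterV2, TF 2.x). Paths are hypothetical.
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="resnet50_saved_model",   # hypothetical input path
    conversion_params=params,
)
converter.convert()
# For INT8 post-training quantization, set precision_mode to
# trt.TrtPrecisionMode.INT8 and call
# converter.convert(calibration_input_fn=...) with a representative-data
# generator so TensorRT can compute quantization ranges.
converter.save("resnet50_saved_model_trt_fp16")     # hypothetical output path
```

Inference latency and throughput, as profiled in the paper, can then be measured by timing repeated calls to the reloaded converted model across the batch sizes of interest.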
Similar articles

Ryota Higashimoto, Soh Yoshida and Mitsuji Muneyasu    
This paper addresses the performance degradation of deep neural networks caused by learning with noisy labels. Recent research on this topic has exploited the memorization effect: networks fit data with clean labels during the early stages of learning an... see more
Journal: Applied Sciences

 
Giorgio Lazzarinetti, Riccardo Dondi, Sara Manzoni and Italo Zoppis    
Solving combinatorial problems on complex networks represents a primary issue which, on a large scale, requires the use of heuristics and approximate algorithms. Recently, neural methods have been proposed in this context to find feasible solutions for r... see more
Journal: Algorithms

 
Luis M. de Campos, Juan M. Fernández-Luna, Juan F. Huete, Francisco J. Ribadas-Pena and Néstor Bolaños    
In the context of academic expert finding, this paper investigates and compares the performance of information retrieval (IR) and machine learning (ML) methods, including deep learning, to approach the problem of identifying academic figures who are expe... see more
Journal: Algorithms

 
Hamed Raoofi, Asa Sabahnia, Daniel Barbeau and Ali Motamedi    
Traditional methods of supervision in the construction industry are time-consuming and costly, requiring significant investments in skilled labor. However, with advancements in artificial intelligence, computer vision, and deep learning, these methods ca... see more

 
Xie Lian, Xiaolong Hu, Liangsheng Shi, Jinhua Shao, Jiang Bian and Yuanlai Cui    
The parameters of the GR4J-CemaNeige coupling model (GR4neige) are typically treated as constants. However, the maximum capacity of the production store (parX1) exhibits time-varying characteristics due to climate variability and vegetation coverage chan... see more
Journal: Water