Denoising Diffusion Models on Model-Based Latent Space

Carmelo Scribano

Danilo Pezzi

Giorgia Franchini and Marco Prato

Abstract

With the recent advancements in the field of diffusion generative models, it has been shown that defining the generative process in the latent space of a powerful pretrained autoencoder can offer substantial advantages. This approach, by abstracting away imperceptible image details and introducing substantial spatial compression, renders the learning of the generative process more manageable while significantly reducing computational and memory demands. In this work, we propose to replace autoencoder coding with a model-based coding scheme based on traditional lossy image compression techniques; this choice not only further diminishes computational expenses but also allows us to probe the boundaries of latent-space image generation. Our objectives culminate in the proposal of a valuable approximation for training continuous diffusion models within a discrete space, accompanied by enhancements to the generative model for categorical values. Beyond the good results obtained for the problem at hand, we believe that the proposed work holds promise for enhancing the adaptability of generative diffusion models across diverse data types beyond the realm of imagery.
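As a rough illustration of what a model-based, discrete latent space could look like, the sketch below encodes an image as a grid of codebook indices using plain patch-based vector quantization. The abstract does not specify the coding scheme, so the choice of vector quantization (suggested only by the keywords), the patch size, the codebook size, and every function name here are assumptions made for illustration, not the authors' method.

```python
import numpy as np

def image_to_patches(img, p):
    """Split an (H, W) image into non-overlapping p x p patches, one flattened patch per row."""
    h, w = img.shape
    assert h % p == 0 and w % p == 0, "image sides must be multiples of the patch size"
    return (img.reshape(h // p, p, w // p, p)
               .transpose(0, 2, 1, 3)
               .reshape(-1, p * p))

def patches_to_image(patches, h, w, p):
    """Inverse of image_to_patches."""
    return (patches.reshape(h // p, w // p, p, p)
                   .transpose(0, 2, 1, 3)
                   .reshape(h, w))

def encode(patches, codebook):
    """Map each patch to the index of its nearest codebook vector (squared L2 distance)."""
    d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

def decode(indices, codebook):
    """Replace each index with its codebook vector."""
    return codebook[indices]

def fit_codebook(patches, k, iters=10, seed=0):
    """A few iterations of Lloyd's algorithm (k-means) to build a k-entry codebook."""
    rng = np.random.default_rng(seed)
    codebook = patches[rng.choice(len(patches), size=k, replace=False)].copy()
    for _ in range(iters):
        idx = encode(patches, codebook)
        for j in range(k):
            members = patches[idx == j]
            if len(members):  # leave empty clusters unchanged
                codebook[j] = members.mean(axis=0)
    return codebook

# Toy data standing in for a grayscale image (hypothetical sizes).
rng = np.random.default_rng(0)
img = rng.random((32, 32))
p, k = 4, 16

patches = image_to_patches(img, p)                             # 64 patches of 16 pixels
codebook = fit_codebook(patches, k)
latent = encode(patches, codebook).reshape(32 // p, 32 // p)   # 8 x 8 grid of categorical codes
recon = patches_to_image(decode(latent.ravel(), codebook), 32, 32, p)
```

In this toy setting the 32 x 32 image becomes an 8 x 8 grid of categorical values, which is the kind of compressed, discrete representation on which a generative model would then be trained; the lossy reconstruction `recon` plays the role of the decoder output.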

Keywords

information theory - generative models - diffusion models - image compression - vector quantization - denoising

ISSUE

Volume: 16 Part: 11 (2023)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY
