WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis

Due to the three-dimensional nature of CT- or MR-scans, generative modelling of medical images is a particularly challenging task. We propose WDM, a wavelet-based medical image synthesis framework that applies a diffusion model on wavelet decomposed images. [1]

Our proposed approach (Fig. 1) follows a concept that is closely related to Latent Diffusion Models but replaces the first-stage autoencoder with the discrete wavelet transform, a dataset-agnostic, training-free approach for spatial dimensionality reduction.

Experiments on unconditional image generation tasks demonstrated state-of-the-art image fidelity (FID) and sample diversity (MS-SSIM) scores compared to recent GANs, Diffusion Models, and Latent Diffusion Models. The proposed method is capable of producing high resolution images (256³) on standard hardware (a single 40GB GPU).

We additionally present a conditional WDM [2] for solving image-to-image translation tasks on high-resolution volumes (Fig. 3). We apply our model to a missing modality synthesis task (Fig. 2) that allows for the application of pre-trained brain tumor segmentation models – even if required image modalities are missing.

More information and relevant links can be found on the project page: https://pfriedri.github.io/wdm-3d-io/

Schematic overview of the proposed wavelet-based image synthesis framework. A diffusion model is trained on the wavelet coefficients of the real input data. During sampling, starting from random wavelet coefficients, T denoising steps are performed iteratively to predict denoised wavelet coefficients. The final output images are produced by applying Inverse Discrete Wavelet Transform (IDWT) to the generated wavelet coefficients.

Schematic overview of the proposed pipeline for missing MR image generation - in this case, for a missing FLAIR image. We aim to generate the missing modality image conditioned on the three available images, ultimately allowing for pre-trained downstream task segmentation models to be applied.

Schematic overview of the proposed conditional Wavelet Diffusion Model. The process of generating the output images wavelet coefficients is conditioned on the wavelet coefficients of the conditioning images by concatenating them with the noisy coefficients in each denoising step.

Prof. Dr. Philippe Cattin philippe.cattin@unibas.ch

[1] Friedrich, Paul, et al. "Wdm: 3d wavelet diffusion models for high-resolution medical image synthesis." MICCAI Workshop on Deep Generative Models. Cham: Springer Nature Switzerland, 2024.

[2] Friedrich, Paul, et al. "cWDM: Conditional Wavelet Diffusion Models for Cross-Modality 3D Medical Image Synthesis." arXiv preprint arXiv:2411.17203. 2024..

Quick Links

Social Media

WDM: 3D Wavelet Diffusion Models for High-Resolution Medical Image Synthesis

Persons

Publications