ZECO: ZeroFusion Guided 3D MRI Conditional Generation

1Illinois Institute of Technology 2University of Michigan 3University of Illinois Chicago
†Corresponding author

Abstract

Medical image segmentation is crucial for enhancing diagnostic accuracy and treatment planning in Magnetic Resonance Imaging (MRI). However, acquiring precise lesion masks for segmentation model training demands specialized expertise and significant time investment, resulting in small datasets in clinical practice. In this paper, we present ZECO, a ZeroFusion guided 3D MRI conditional generation framework that extracts, compresses, and generates high-fidelity MRI images with corresponding 3D segmentation masks to mitigate data scarcity. To effectively capture inter-slice relationships within volumes, we introduce a Spatial Transformation Module that encodes MRI images into a compact latent space for the diffusion process. Moving beyond unconditional generation, our novel ZeroFusion method progressively maps 3D masks to MRI images in latent space, enabling robust training on limited datasets while avoiding overfitting. ZECO outperforms state-of-the-art models in both quantitative and qualitative evaluations on brain MRI datasets across various modalities, showcasing its exceptional capability in synthesizing high-quality MRI images conditioned on segmentation masks.
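For a concrete picture of the training objective, the sketch below shows one conditional latent-diffusion training step under standard DDPM assumptions: the MRI volume is encoded into a compact latent, corrupted with noise at a random timestep, and the network regresses the noise while being conditioned on features derived from the 3D mask. The module names (stm_encoder, unet3d, zero_fusion) are hypothetical placeholders for illustration, not ZECO's actual API.

```python
# Minimal sketch of one conditional latent-diffusion training step.
# stm_encoder, unet3d, and zero_fusion are hypothetical placeholder modules.
import torch
import torch.nn.functional as F

def training_step(stm_encoder, unet3d, zero_fusion, mri, mask, alphas_cumprod):
    # Encode the MRI volume into a compact latent representation.
    z0 = stm_encoder(mri)                             # (B, C, D, H, W) latent

    # Sample a diffusion timestep and corrupt the latent with Gaussian noise.
    B = z0.shape[0]
    t = torch.randint(0, alphas_cumprod.shape[0], (B,), device=z0.device)
    noise = torch.randn_like(z0)
    a_bar = alphas_cumprod[t].view(B, 1, 1, 1, 1)
    zt = a_bar.sqrt() * z0 + (1 - a_bar).sqrt() * noise

    # Map the 3D segmentation mask to conditioning features that are injected
    # into the denoising 3D U-Net.
    cond_feats = zero_fusion(mask)

    # Predict the added noise and regress it with an MSE objective.
    noise_pred = unet3d(zt, t, cond_feats)
    return F.mse_loss(noise_pred, noise)
```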

Pipeline

Overview of ZECO

(a) The Spatial Transformation Module encodes MRI images into latent space. (b) The 3D U-Net reconstructs latent images across timesteps in the reverse diffusion process. (c) ZeroFusion generates the Middle-feature and Down-feature conditioned on the segmentation masks C for controllable generation.
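As a rough illustration of how a mask-conditioning branch of this kind can be wired, the sketch below encodes the 3D segmentation mask and projects it through zero-initialized convolutions to produce the Down-feature and Middle-feature that are added to the corresponding 3D U-Net blocks. The channel sizes and the zero-initialization detail are assumptions suggested by the name ZeroFusion, not the paper's exact design.

```python
# Sketch of a ZeroFusion-style conditioning branch (assumed design, not ZECO's exact layers).
import torch
import torch.nn as nn

def zero_conv3d(in_ch, out_ch):
    """1x1x1 3D convolution initialized to zero, so it contributes nothing at the start of training."""
    conv = nn.Conv3d(in_ch, out_ch, kernel_size=1)
    nn.init.zeros_(conv.weight)
    nn.init.zeros_(conv.bias)
    return conv

class ZeroFusionBranch(nn.Module):
    def __init__(self, mask_ch=1, base_ch=64, down_ch=128, mid_ch=256):
        super().__init__()
        # Encode the 3D segmentation mask into multi-scale features.
        self.stem = nn.Sequential(
            nn.Conv3d(mask_ch, base_ch, 3, padding=1), nn.SiLU(),
            nn.Conv3d(base_ch, down_ch, 3, stride=2, padding=1), nn.SiLU(),
        )
        self.mid = nn.Sequential(
            nn.Conv3d(down_ch, mid_ch, 3, stride=2, padding=1), nn.SiLU(),
        )
        # Zero-initialized projections produce the Down-feature and Middle-feature
        # that are fused into the matching 3D U-Net blocks.
        self.to_down = zero_conv3d(down_ch, down_ch)
        self.to_mid = zero_conv3d(mid_ch, mid_ch)

    def forward(self, mask):
        h_down = self.stem(mask)
        h_mid = self.mid(h_down)
        return self.to_down(h_down), self.to_mid(h_mid)
```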

Qualitative Result

ZECO consistently generates accurate and anatomically coherent slices in both FLAIR and T1 MRI modalities.

Quantitative Result


ZECO achieves substantial improvements on all evaluation metrics, consistently across experimental settings.

Scientific Application


ZECO demonstrates strong generalization in generating 3D neuron images from sparse datasets in our scientific project; see details on the project page here.


Trained on sparsely labeled data, our generative model produces 1,000 synthetic images that capture both the morphology and spatial distribution of neurons, guided by segmentation patterns.
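The sketch below shows one way such synthetic volumes could be drawn from segmentation masks with plain DDPM ancestral sampling; the sampler, the latent shape, and the module names (unet3d, zero_fusion, decoder) are illustrative assumptions rather than ZECO's exact procedure.

```python
# Sketch of conditional sampling from segmentation masks (assumed DDPM ancestral sampler).
import torch

@torch.no_grad()
def sample_from_mask(unet3d, zero_fusion, decoder, mask, betas):
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    cond_feats = zero_fusion(mask)

    # Start from Gaussian noise in latent space; the latent shape here is illustrative.
    z = torch.randn(mask.shape[0], 4, 16, 32, 32, device=mask.device)
    for t in reversed(range(betas.shape[0])):
        t_batch = torch.full((mask.shape[0],), t, device=mask.device, dtype=torch.long)
        eps = unet3d(z, t_batch, cond_feats)
        a, a_bar = alphas[t], alphas_cumprod[t]
        # Posterior mean of the previous latent given the predicted noise.
        z = (z - (1 - a) / (1 - a_bar).sqrt() * eps) / a.sqrt()
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    # Decode the final latent back to an image volume.
    return decoder(z)
```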

BibTeX