LangDAug: Langevin Data Augmentation for Multi-Source Domain Generalization in Medical Image Segmentation

Indian Institute of Science, Bangalore

ICML 2025

Overview of our proposed LangDAug method.

Abstract

LangDAug is a Langevin-based data augmentation method for improving domain generalization in 2D medical image segmentation. It leverages Energy-Based Models (EBMs) trained via contrastive divergence to generate intermediate samples that bridge source domains using Langevin Dynamics. These samples act as natural augmentations, improving generalization to unseen domains.

We show that LangDAug provides a regularizing effect, theoretically bounding model complexity by the intrinsic dimensionality of the data manifold. Empirically, it outperforms state-of-the-art methods on retinal fundus and prostate MRI segmentation tasks, and complements domain-randomization strategies effectively.

Methodology

Problem Setup

Given multiple source domains $\{D_i\}_{i=1}^n$, each associated with a distribution $P_{D_i}(x, y)$ over the input-output space $\mathcal{X} \times \mathcal{Y}$, the standard Empirical Risk Minimization (ERM) objective seeks model parameters $\hat{\theta}$ that minimize the average loss over all training samples:

$$\hat{\theta} = \arg\min_{\theta} \frac{1}{N} \sum_{i=1}^{N} \ell(f_\theta(x_i), y_i)$$

where $N = \sum_{i=1}^n |D_i|$ denotes the total number of training samples aggregated from all source domains. However, ERM often fails to generalize to an unseen target domain $D_{n+1} \notin \{D_i\}_{i=1}^n$, since it only optimizes performance over the observed source domains.
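As a concrete illustration, here is a minimal numpy sketch of pooled-domain ERM: samples from several toy source domains are aggregated into one training set and a linear model is fit under squared loss (the domain means, data, and linear model are hypothetical stand-ins for the segmentation networks in the paper).

```python
import numpy as np

rng = np.random.default_rng(0)

# Pool samples from three hypothetical source domains into one training set.
domains = [rng.normal(loc=m, size=(50, 3)) for m in (0.0, 1.0, 2.0)]
X = np.vstack(domains)                      # N x d pooled inputs
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                              # toy noiseless labels

# ERM with squared loss: theta_hat = argmin_theta (1/N) sum_i (f_theta(x_i) - y_i)^2.
# For a linear f_theta this is ordinary least squares.
theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

avg_loss = np.mean((X @ theta_hat - y) ** 2)
```

ERM fits the pooled data well, but nothing in the objective accounts for a fourth, unseen domain — which is the failure mode LangDAug targets.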

Inter-Domain Traversal with EBMs

To bridge domains, we train an EBM $E_{\theta_{ij}}$ to model the energy between each domain pair $(D_i, D_j)$. The model is trained using Contrastive Divergence:

$$\nabla_{\theta_{ij}} \mathcal{L}_{CD} = \mathbb{E}_{x \sim P_{D_j}}\left[\nabla_{\theta_{ij}} E_{\theta_{ij}}(x)\right] - \mathbb{E}_{x \sim P_{\theta_{ij}}}\left[\nabla_{\theta_{ij}} E_{\theta_{ij}}(x)\right]$$
$$P_{\theta_{ij}}(x) = \frac{\exp(-E_{\theta_{ij}}(x))}{Z_{\theta_{ij}}}, \quad \text{where } Z_{\theta_{ij}} = \int_{\mathcal{X}} \exp(-E_{\theta_{ij}}(x)) \, dx$$
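To make the update concrete, here is a toy numpy sketch of contrastive divergence: the positive term averages the energy gradient over target-domain data, the negative term over model samples (the quadratic energy, Gaussian data, and step size are illustrative assumptions; the paper's EBMs are neural networks trained on images).

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy energy E_theta(x) = ||x - theta||^2 / 2, so grad_theta E = theta - x.
def grad_theta_E(x, theta):
    return theta - x

theta = np.zeros(2)
data = rng.normal(loc=3.0, size=(256, 2))   # hypothetical target-domain samples

for _ in range(200):
    # For this energy the model distribution is exactly N(theta, I), so we can
    # draw negative samples directly instead of running Langevin chains.
    neg = theta + rng.normal(size=(256, 2))
    # CD gradient: E_data[grad_theta E] - E_model[grad_theta E].
    g = grad_theta_E(data, theta).mean(0) - grad_theta_E(neg, theta).mean(0)
    theta -= 0.1 * g
```

Descending this gradient lowers the energy on data and raises it on model samples; here `theta` converges to the data mean, i.e. the model distribution matches the data.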

Sampling from $P_{\theta_{ij}}$ is performed using Langevin Dynamics (LD), with the chain initialized at a point $x_0 \sim P_{D_i}$:

$$x_{t+1} = x_t - \frac{\alpha^2}{2} \nabla_x E_{\theta_{ij}}(x_t) + \alpha \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)$$

These LD iterates form samples that interpolate between domains.
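The traversal can be sketched in a few lines of numpy: a chain started at a point far from the energy minimum (standing in for a source-domain sample) drifts toward the low-energy region (standing in for the target domain), and the intermediate iterates trace the path between them. The quadratic energy and all constants here are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy energy with minimum at mu, so grad_x E(x) = x - mu.
mu = np.array([2.0, -1.0])
def grad_E(x):
    return x - mu

def langevin_chain(x0, alpha=0.1, steps=500):
    """Run Langevin dynamics from x0 and return all iterates."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        eps = rng.normal(size=x.shape)
        x = x - 0.5 * alpha**2 * grad_E(x) + alpha * eps
        xs.append(x)
    return np.array(xs)

# Initialize at a hypothetical "source domain" point, far from the minimum.
chain = langevin_chain(np.array([-5.0, 5.0]))
```

The early iterates stay close to the starting point and the late iterates concentrate near `mu`, which is exactly the interpolation behavior LangDAug exploits.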

Langevin Data Augmentation

We use the intermediate LD samples as augmentation data. For each sample $x_j \in D_i$, LD is run for $K$ steps to generate $\{x_j^t\}_{t=1}^K$:

$$x_j^{t+1} = x_j^t - \frac{\beta^2}{2} \nabla_x E_{\theta_{ij}}(x_j^t) + \beta \epsilon, \quad \epsilon \sim \mathcal{N}(0, I)$$

These samples, paired with the original labels $y_j$, are used in ERM training, effectively expanding the domain support:

$$\mathcal{D}_{\text{aug}} = \bigcup_{i \neq j,\, k} D_{ij}^k, \quad \text{where } D_{ij}^k = \{(x_j^k, y_j)\}$$
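The augmentation step can be sketched as follows: run $K$ Langevin steps from a labeled source sample and keep every iterate, each paired with the unchanged label (the bridging energy here is a hypothetical quadratic pulling toward a toy target-domain mean; in the paper the gradient comes from the trained EBM $E_{\theta_{ij}}$, and labels can be reused because the perturbations preserve segmentation structure).

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical trained energy bridging domains i -> j: a toy quadratic whose
# gradient pulls samples toward the target-domain mean mu_j.
mu_j = np.array([4.0, 4.0])
def grad_E_ij(x):
    return x - mu_j

def langdaug_samples(x, K=8, beta=0.5):
    """Return the K Langevin iterates of x; each keeps x's original label."""
    out = []
    for _ in range(K):
        x = x - 0.5 * beta**2 * grad_E_ij(x) + beta * rng.normal(size=x.shape)
        out.append(x)
    return np.array(out)

# One labeled sample from source domain D_i.
x_j, y_j = np.array([0.0, 0.0]), 1
aug = [(x_k, y_j) for x_k in langdaug_samples(x_j)]  # augmented pairs (x_j^k, y_j)
```

Training then proceeds by ERM over the union of the original data and these augmented pairs.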

Theoretical Insights

LangDAug acts as a regularizer on the ERM objective. Let $\tilde{x}_i$ denote the Langevin-perturbed sample and $\tilde{z}_i = (\tilde{x}_i, y_i)$ the corresponding augmented pair. The augmented empirical risk is:

$$\mathcal{L}_{\text{aug}}(\theta, \mathcal{D}) = \frac{1}{k} \sum_{i=1}^k \mathbb{E}_{\epsilon \sim \mathcal{N}(0, I)} \left[\ell(\theta, \tilde{z}_i)\right]$$

This can be decomposed as:

$$\mathcal{L}_{\text{aug}} = \mathcal{L}_{\text{std}} + R_1 + R_2 + R_3$$

where $R_1, R_2, R_3$ are regularization terms involving the first and second derivatives of the model $f_\theta$, encouraging smoother and flatter solutions.
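The shape of these terms can be seen from the standard noise-injection argument (a sketch under simplifying assumptions; the paper's exact $R$ terms additionally account for the Langevin drift and noise scales separately). Writing $\tilde{x} = x + \delta(\epsilon)$ and Taylor-expanding the loss to second order around $x$:

```latex
\mathbb{E}_{\epsilon}\big[\ell(\theta, \tilde{x})\big]
\;\approx\; \ell(\theta, x)
\;+\; \mathbb{E}_{\epsilon}[\delta]^{\top} \nabla_x \ell(\theta, x)
\;+\; \tfrac{1}{2}\, \mathbb{E}_{\epsilon}\big[\delta \delta^{\top}\big] : \nabla_x^2 \ell(\theta, x)
```

The drift term penalizes first derivatives of $f_\theta$ (via the chain rule $\nabla_x \ell = (\nabla_x f_\theta)^{\top} \nabla_f \ell$) and the noise term penalizes second derivatives, which is the source of the smoothness and flatness interpretation above.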

Cross-Domain Generalization Performance

Retinal Fundus Segmentation

Per-domain columns report mIoU for the held-out domain; all values in %, higher is better.

| Method | Domain A | Domain B | Domain C | Domain D | Avg mIoU | Avg mDSC |
|---|---|---|---|---|---|---|
| Hutchinson | 66.73 | 66.73 | 69.36 | 66.73 | 67.39 | 78.14 |
| MixStyle | 80.76 | 67.69 | 79.79 | 77.09 | 76.33 | 85.58 |
| FedDG | 76.65 | 72.14 | 76.10 | 75.96 | 75.21 | 83.67 |
| RAM | 77.42 | 73.79 | 79.66 | 78.74 | 77.40 | 85.39 |
| TriD | 80.92 | 72.45 | 79.34 | 78.96 | 77.92 | 85.95 |
| LangDAug (Ours) | 78.79 | 75.05 | 81.01 | 80.51 | 78.84 | 87.61 |

Prostate MRI Segmentation

Per-domain columns report ASD for the held-out domain (lower is better); Avg DSC is in %, higher is better.

| Method | Domain A | Domain B | Domain C | Domain D | Domain E | Domain F | Avg ASD | Avg DSC |
|---|---|---|---|---|---|---|---|---|
| Hutchinson | 3.28 | 1.48 | 2.07 | 3.98 | 2.78 | 1.64 | 2.54 | 78.62 |
| MixStyle | 0.72 | 0.88 | 1.62 | 0.65 | 1.59 | 0.51 | 1.00 | 86.27 |
| FedDG | 1.09 | 0.93 | 1.31 | 0.88 | 1.73 | 0.50 | 1.07 | 85.95 |
| RAM | 0.93 | 0.98 | 1.26 | 0.74 | 1.78 | 0.32 | 1.00 | 87.02 |
| TriD | 0.70 | 0.72 | 1.39 | 0.71 | 1.43 | 0.46 | 0.90 | 87.68 |
| LangDAug (Ours) | 0.58 | 0.64 | 1.21 | 0.57 | 1.49 | 0.38 | 0.81 | 89.16 |

Inter-Domain Traversal Examples

Retinal Fundus Dataset

Traversal examples (figures): Domain A → Domain D, Domain B → Domain D, Domain C → Domain B, Domain D → Domain A.

Prostate MRI Dataset

Traversal examples (figures): Domain A → Domain B, Domain B → Domain F, Domain C → Domain A, Domain D → Domain F, Domain E → Domain B, Domain F → Domain A.

BibTeX citation

If you find this work useful, please cite:

@inproceedings{tiwary2025langdaug,
  title={LangDAug: Langevin Data Augmentation for Multi-Source Domain Generalization in Medical Image Segmentation},
  author={Tiwary, Piyush and Bhattacharyya, Kinjawl and Prathosh, A.P.},
  booktitle={Proceedings of the 42nd International Conference on Machine Learning},
  year={2025}
}