
Distance Map Auxiliary Loss for Brain Tumor Segmentation

A Fragment-Centric Analysis and the Saturation Ceiling of Post-hoc Connected-Component Consensus Filtering

Guillaume Cassez · ORCID 0009-0007-0987-3931 · 2026

BraTS 2023 GLI · nnU-Net v2 · MedNeXt-B · 1196 patients · 5-fold CV

📄 Download PDF · 📥 arXiv (pending) · 💻 Code + data (GitHub) · 🇫🇷 HAL · 🔗 Zenodo DOI

Abstract

We study the use of a Signed Distance Transform (SDT) auxiliary loss on top of MedNeXt-B / nnU-Net v2 for 3D brain tumor segmentation on BraTS 2023 GLI. The SDT task brings a modest but statistically robust Dice improvement (+0.63 to +0.96 pp per region, Wilcoxon p < 10⁻⁶, n = 240) while introducing a new failure mode: spurious isolated connected components (fragments) not present in the ground truth, most acute in NCR and ED.

We propose a parameter-free post-hoc connected-component consensus filter (CC-consensus filter): start from the DistMap prediction and, for each class, drop any connected component whose same-class mask has zero voxel overlap with the Baseline prediction. This rule is not a Mixture-of-Experts in the strict gating-network sense; it is a hard-label consensus filter at the connected-component level. It reduces NCR fragments by 81 % (Wilcoxon p = 7 × 10⁻⁴) at no significant Dice cost (−0.04 pp, ns).

A large-scale ceiling analysis on 1196 patients shows that the oracle per-class selection upper-bound is only +0.005 Dice avg above the default CC-consensus rule; 4 classifier families trained on 31 hand-crafted features (tumor morphology, topology, inter-model agreement) fail to robustly beat the default CC-consensus in 5-fold cross-validation. The gap cannot be closed without voxel-level probabilistic voting or architectural diversity — motivating a training-time fragment-aware loss over further post-hoc engineering.

Interactive 3D viewer

The viewer lets you explore every prediction (Baseline / DistMap / CC-Consensus / Ground Truth) on any of the 1196 patients. Six patients (C1–C6) are pinned at the top of the patient dropdown — each one illustrates a distinct ordering between the three outputs. Start there.

Tip: open the Patient dropdown and pick one of the ★ Research report 1 demo entries to see the C1–C6 cases.

Headline results

1. DistMap auxiliary improves Dice on all 3 regions

Variant	Epochs	Avg Dice	WT	TC	ET
MedNeXt-B Baseline	10	0.8638	0.894	0.869	0.829
MedNeXt-B + DistMap	10	0.8713	0.900	0.875	0.838
nnU-Net + DistMap	10	0.8684	0.903	0.876	0.827

Per-region paired Wilcoxon (MedNeXt-B + DistMap vs Baseline, n = 240): WT p = 1.1×10⁻⁶, TC p = 7.9×10⁻⁷, ET p = 9.3×10⁻¹⁶. Largest gain on ET.
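A paired comparison of this kind can be sketched as follows. This is a minimal illustration, not the paper's evaluation script: the function name `paired_wilcoxon` is hypothetical, the Dice values are synthetic, and the two-sided alternative is an assumption (the text above does not say which alternative was used).

```python
import numpy as np
from scipy.stats import wilcoxon

def paired_wilcoxon(dice_distmap, dice_baseline):
    """Paired Wilcoxon signed-rank test on per-patient Dice scores.

    Returns (mean gain in percentage points, p-value).
    Two-sided alternative: an assumption, not taken from the paper.
    """
    dice_distmap = np.asarray(dice_distmap, dtype=float)
    dice_baseline = np.asarray(dice_baseline, dtype=float)
    diff = dice_distmap - dice_baseline
    # Default zero_method="wilcox" discards patients with identical scores.
    res = wilcoxon(dice_distmap, dice_baseline, alternative="two-sided")
    return 100.0 * diff.mean(), res.pvalue

# Synthetic scores for 240 hypothetical patients (NOT the paper's data).
rng = np.random.default_rng(0)
baseline = np.clip(rng.normal(0.83, 0.10, size=240), 0.0, 1.0)
distmap = np.clip(baseline + rng.normal(0.009, 0.02, size=240), 0.0, 1.0)

gain_pp, p = paired_wilcoxon(distmap, baseline)
print(f"mean gain = {gain_pp:+.2f} pp, p = {p:.2e}")
```

The test is run once per region (WT, TC, ET) on the same patients, so each p-value refers to one region only.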

2. DistMap introduces fragments — the CC-consensus filter removes them

Fragments (CC ≤ 50 vx)	Baseline	DistMap	CC-Consensus	Δ CC-Cons. vs DistMap
NCR	116.0	179.4	34.8	−81 %
ED	33.2	49.0	19.2	−61 %
ET	0.8	1.8	0.8	−56 %
Dice avg	0.95	0.949	0.948	−0.04 pp (ns)

Paired Wilcoxon test on fragment counts (alternative: CC-Consensus < DistMap): p = 7 × 10⁻⁴. Dice is insensitive to these small CCs.
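Counting fragments under the "CC ≤ 50 vx" definition from the table header can be sketched as below. The helper `count_fragments` is hypothetical, and so is the choice to count every small component regardless of ground-truth overlap; how the table aggregates these counts across patients is not stated here.

```python
import numpy as np
from scipy import ndimage as ndi

# Full 3x3x3 structuring element: 26-connectivity in 3D.
STRUCT_26 = ndi.generate_binary_structure(3, 3)

def count_fragments(seg, classes=(1, 2, 3), max_size=50):
    """Count, per class, the connected components of at most max_size voxels."""
    counts = {}
    for c in classes:
        lab, n = ndi.label(seg == c, structure=STRUCT_26)
        if n == 0:
            counts[c] = 0
            continue
        sizes = np.bincount(lab.ravel())[1:]  # drop the background bin (label 0)
        counts[c] = int(np.sum(sizes <= max_size))
    return counts

# Toy volume: one large class-1 blob, one isolated class-1 voxel, one small class-2 blob.
seg = np.zeros((20, 20, 20), dtype=np.uint8)
seg[2:10, 2:10, 2:10] = 1      # 512 voxels, not a fragment
seg[15, 15, 15] = 1            # 1 voxel, fragment
seg[12:14, 2:4, 2:4] = 2       # 8 voxels, fragment

fragment_counts = count_fragments(seg)
print(fragment_counts)  # {1: 1, 2: 1, 3: 0}
```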

3. Cross-validated ceiling (1196 patients)

Strategy	Dice avg	Δ vs CC-consensus
Baseline only	0.9078	−0.00115
DistMap only	0.9088	−0.00020
CC-Consensus (default rule)	0.9090	0 (ref)
Oracle patient-level (best-of-3)	0.9131	+0.00412
Oracle per-class (best-per-region)	0.9139	+0.00494
Best 1-feature rule — fit on all data	0.91016	+0.00119 (overfit)
Best 1-feature rule — 5-fold CV	0.90801	−0.00096
Meta-RF 31 features per-region (CV)	0.90807	−0.00090
Meta-LR 31 features per-region (CV)	0.90844	−0.00053
Meta-GBM 31 features per-region (CV)	0.90833	−0.00064
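The two oracle rows can be read as the following computation, sketched here on toy numbers. The array layout `(n_patients, n_strategies, n_regions)` and the function name `oracle_bounds` are assumptions made for illustration; the repository's `scripts/oracle_per_class.py` may organise the data differently.

```python
import numpy as np

def oracle_bounds(dice):
    """dice: array of shape (n_patients, n_strategies, n_regions).

    Returns (mean Dice per strategy, patient-level oracle, per-class oracle).
    """
    dice = np.asarray(dice, dtype=float)
    patient_avg = dice.mean(axis=2)                  # (n_patients, n_strategies)
    per_strategy = patient_avg.mean(axis=0)          # always use one fixed strategy
    # Patient-level oracle: pick the best whole prediction for each patient.
    patient_oracle = patient_avg.max(axis=1).mean()
    # Per-class oracle: pick the best strategy independently for each region.
    per_class_oracle = dice.max(axis=1).mean(axis=1).mean()
    return per_strategy, patient_oracle, per_class_oracle

# Toy scores for 2 patients, strategies (Baseline, DistMap, CC-Consensus), regions (WT, TC, ET).
toy = [
    [[0.90, 0.80, 0.70], [0.80, 0.90, 0.70], [0.85, 0.85, 0.70]],
    [[0.60, 0.60, 0.60], [0.90, 0.90, 0.90], [0.90, 0.90, 0.90]],
]
per_strategy, patient_oracle, per_class_oracle = oracle_bounds(toy)
print(per_strategy, patient_oracle, per_class_oracle)
```

By construction the per-class oracle is at least the patient-level oracle, which is at least the best fixed strategy. The table shows the same ordering, with a total headroom of about +0.005 Dice.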

4. Case distribution — the CC-consensus filter is asymmetric

Case (F = CC-consensus output vs B, D)	n	%
C1 — baseline beats distmap	602	50.3 %
C2 — distmap beats baseline	593	49.6 %
C3 — B < F < D (filter pulled baseline-side)	390	32.6 %
C4 — D < F < B (filter pulled distmap-side)	169	14.1 %
C5 — filter worst (F < min(B, D))	463	38.7 %
C6 — filter best (F > max(B, D))	157	13.1 %

The CC-consensus filter scores below both inputs at the patient level in 38.7 % of cases (C5), against only 13.1 % where it beats both (C6): a ~3:1 asymmetry. Per region, the filter wins strictly in only 2.7 % of cases on WT, 21.7 % on TC, and 6.9 % on ET (38 % of ET comparisons are ties caused by empty or trivial GT). The filter's benefit is concentrated on TC.
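The case labels can be assigned from three patient-level scores, as in this sketch. It assumes that C1/C2 (Baseline vs DistMap) and C3–C6 (position of the filter output F) are two separate groupings, which is what the counts in the table suggest. The function name `classify_case` and the handling of exact ties are hypothetical.

```python
def classify_case(b, d, f):
    """Return (B-vs-D label, position-of-F label) for one patient.

    b, d, f: patient-level Dice of Baseline, DistMap and the CC-consensus output.
    Exact ties get the labels "tie" / "boundary" (an assumption of this sketch).
    """
    if b > d:
        pair = "C1"          # baseline beats distmap
    elif d > b:
        pair = "C2"          # distmap beats baseline
    else:
        pair = "tie"

    if f > max(b, d):
        position = "C6"      # filter best
    elif f < min(b, d):
        position = "C5"      # filter worst
    elif b < f < d:
        position = "C3"
    elif d < f < b:
        position = "C4"
    else:
        position = "boundary"  # F equals B or D
    return pair, position

# Rounded scores of two demonstration patients from the table of ★ cases.
print(classify_case(0.785, 0.795, 0.869))  # ('C2', 'C6')
print(classify_case(0.241, 0.541, 0.169))  # ('C2', 'C5')
```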

The six demonstration patients (★ in the viewer)

Case	Patient ID	Fold	B	D	F	Take-away
C1 — B > D	00048-001	1	0.983	0.308	0.973	DistMap hallucinates TC/ET on an oedema-only case
C2 — D > B	01437-000	2	0.589	0.923	0.923	DistMap rescues an under-segmenting Baseline
C3 — B < F < D	01428-000	1	0.618	0.656	0.645	Filter output sits between, pulled baseline-side
C4 — D < F < B	00017-001	0	0.991	0.657	0.890	Filter output rescues DistMap via consensus
C5 — F < min(B, D)	01530-000	1	0.241	0.541	0.169	Filter deletes a legitimate large DistMap CC
C6 — F > max(B, D)	00540-000	1	0.785	0.795	0.869	Clean synergy

→ Open the viewer and pick any of these in the ★ Research report 1 demo group of the Patient dropdown.

Try the CC-consensus filter on your own predictions

The rule is parameter-free (the only fixed choice is 26-connectivity). Plug in your own two segmentation arrays (DistMap + Baseline, both label maps in {0, 1, 2, 3}):

import numpy as np
from scipy import ndimage as ndi

# Full 3x3x3 structuring element: 26-connectivity in 3D.
STRUCT_26 = ndi.generate_binary_structure(3, 3)

def cc_consensus_filter(distmap_seg, baseline_seg, classes=(1, 2, 3)):
    """Connected-component consensus filter.
    Start from DistMap; for each class, drop any CC whose same-class mask has
    zero voxel overlap with Baseline. Parameter-free (26-connectivity)."""
    filtered = distmap_seg.copy()
    for c in classes:
        d_mask = distmap_seg == c
        b_mask = baseline_seg == c
        if not d_mask.any():
            continue
        # Label the connected components of this class in the DistMap prediction.
        lab, n = ndi.label(d_mask, structure=STRUCT_26)
        for cc_id in range(1, n + 1):
            cc = lab == cc_id
            if not np.any(cc & b_mask):
                filtered[cc] = 0  # unconfirmed fragment
    return filtered
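To check what the filter does to region-level scores on your own data, a per-region Dice helper along these lines may be useful. It is a sketch under stated assumptions: the label convention (1 = NCR, 2 = ED, 3 = ET) follows BraTS 2023, the name `region_dice` is hypothetical, and scoring an empty prediction against an empty ground truth as 1.0 is a convention chosen here, not necessarily the one used in the paper.

```python
import numpy as np

# Assumed BraTS 2023 label convention: 1 = NCR, 2 = ED, 3 = ET.
REGIONS = {"WT": (1, 2, 3), "TC": (1, 3), "ET": (3,)}

def region_dice(pred, gt, empty_value=1.0):
    """Dice per region (WT, TC, ET) between two label maps in {0, 1, 2, 3}."""
    scores = {}
    for name, labels in REGIONS.items():
        p = np.isin(pred, labels)
        g = np.isin(gt, labels)
        denom = int(p.sum()) + int(g.sum())
        if denom == 0:
            scores[name] = empty_value  # both empty: convention, see lead-in
        else:
            scores[name] = 2.0 * int(np.logical_and(p, g).sum()) / denom
    return scores

# Toy ground truth: an ED block with an NCR core, no ET.
gt = np.zeros((8, 8, 8), dtype=np.uint8)
gt[1:5, 1:5, 1:5] = 2
gt[2:4, 2:4, 2:4] = 1

# Toy prediction: identical, plus one spurious isolated NCR voxel.
pred = gt.copy()
pred[6, 6, 6] = 1

scores = region_dice(pred, gt)
print(scores)
```

On this toy volume the single spurious voxel lowers WT only slightly (128/129) but TC more visibly (16/17), because TC is the smaller region.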

Reproducibility

All numbers in this paper are reproducible from the GitHub repo in under 10 min on a modern CPU.

scripts/extract_patient_features.py — 20 GT-morphology features per patient (~5 min, 10 E-cores)
scripts/extract_agreement_features.py — 11 inter-model agreement features (~6 min)
scripts/select_demo_patients.py — 6 champions (C1–C6)
scripts/oracle_per_class.py — patient and per-class oracle bounds
scripts/sweep_adaptive_fusion.py — threshold sweep (~2 min, 10 workers)
scripts/meta_selector_perregion.py — 3 classifier families × 5-fold CV
scripts/simple_rule_cv.py — 1-feature rule in 5-fold CV (overfit check)

Cite

@article{cassez2026ccconsensus,
  title   = {Distance Map Auxiliary Loss for Brain Tumor Segmentation:
             A Fragment-Centric Analysis and the Saturation Ceiling of
             Post-hoc Connected-Component Consensus Filtering},
  author  = {Cassez, Guillaume},
  journal = {arXiv preprint},
  year    = {2026},
  url     = {https://guillaume-cassez.fr/brats/paper1/},
  doi     = {10.5281/zenodo.19695264}
}

Links

Feedback is welcome at cassez.guillaume@gmail.com, via GitHub issues, or on Bluesky @guillaume-cassez.bsky.social.