cs.CV

Can single microscopy images match multi-channel quality using teacher models?

Sakib Mohammad, Jarin Ritu, Md Sakhawat Hossain

May 30, 2026

Medical microscopy typically needs multiple fluorescent channels (for example, nuclear and membrane) to segment tissue accurately, but acquiring them is expensive and limits deployment. This work uses a frozen foundation model (SAM ViT-H) trained on full multi-channel images as a teacher for a small student network (1.5–27M parameters) that sees only the nuclear channel. The distillation objective combines probability matching, boundary-aware losses, and uncertainty weighting. On TissueNet, the SAM-distilled Swin-Tiny student improves from 65.31 to 78.36 Dice, recovering 88% of the full-channel oracle's performance. The gain is consistent: all four tested architectures improve by about 12 points, without retraining across datasets.
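To make the three ingredients concrete, here is a minimal pure-Python sketch of how probability matching, boundary awareness, and uncertainty weighting could be combined into one per-pixel loss. It is an illustration under stated assumptions, not the paper's formulation: the binary KL term, the 4-neighbour boundary test, the entropy-based confidence weight, and the names `distillation_loss` and `boundary_weight` are all hypothetical choices made for this example.

```python
import math

EPS = 1e-7  # keeps probabilities away from 0 and 1 so the logs stay finite


def _clamp(p):
    return min(max(p, EPS), 1.0 - EPS)


def binary_kl(p_teacher, p_student):
    """KL(teacher || student) for one pixel's foreground probability."""
    t, s = _clamp(p_teacher), _clamp(p_student)
    return t * math.log(t / s) + (1 - t) * math.log((1 - t) / (1 - s))


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli foreground probability."""
    p = _clamp(p)
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def is_boundary(mask, r, c):
    """True if pixel (r, c) has a 4-neighbour with a different hard label."""
    h, w = len(mask), len(mask[0])
    label = mask[r][c] >= 0.5
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < h and 0 <= cc < w and (mask[rr][cc] >= 0.5) != label:
            return True
    return False


def distillation_loss(teacher, student, boundary_weight=2.0):
    """Weighted mean KL between teacher and student probability maps.

    Assumed weighting (hypothetical, for illustration only):
      * pixels on the teacher's mask boundary count `boundary_weight` times
      * pixels where the teacher is uncertain (high entropy) count less
    """
    total, weight_sum = 0.0, 0.0
    for r, row in enumerate(teacher):
        for c, t in enumerate(row):
            # Confidence is 1 for a certain teacher and 0 at p = 0.5.
            confidence = max(0.0, 1.0 - binary_entropy(t) / math.log(2))
            weight = confidence * (boundary_weight if is_boundary(teacher, r, c) else 1.0)
            total += weight * binary_kl(t, student[r][c])
            weight_sum += weight
    return total / max(weight_sum, EPS)


# Toy 2x3 probability maps: the teacher saw all channels, the students only one.
teacher = [[0.9, 0.9, 0.1], [0.9, 0.9, 0.1]]
student_close = [[0.8, 0.8, 0.2], [0.8, 0.8, 0.2]]
student_far = [[0.1, 0.1, 0.9], [0.1, 0.1, 0.9]]

loss_close = distillation_loss(teacher, student_close)
loss_far = distillation_loss(teacher, student_far)
```

In this toy setup a student that agrees with the teacher gets a lower loss than one that disagrees, and disagreement on boundary pixels is penalised more heavily. In practice such a loss would be computed on tensors in a deep learning framework rather than on nested lists.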
Published as "Single-Channel Tissue Segmentation via Cross-Modal Distillation from Foundation Models", arXiv:2606.00928.