cs.CV

Reconstructing mental images from brain scans with vision models

Reese Kneeland, Cesar Kadir Torrico Villanueva, Jordyn Ojeda, Shuhb Khanna, Jonathan Xu, Paul S. Scotti, Thomas Naselaris

May 16, 2026

Brain-to-image decoding models trained to reconstruct viewed images often fail when applied to mental imagery, that is, internally generated visual representations. MIRAGE addresses this gap by combining a linear backbone with multi-modal text and image features that condition a diffusion model. It is explicitly designed for cross-decoding: the model is trained on responses to viewed images and then applied to mental images. On the NSD-Imagery dataset, MIRAGE outperforms existing vision decoders at mental image reconstruction, including decoders that perform strongly on seen images. Ablations indicate that lower-dimensional image features, combined with guidance from both text and multi-level visual features, work best. The finding that large-scale datasets of responses to external stimuli can effectively train mental image decoders suggests practical utility for brain imaging applications.
Published as "MIRAGE: Robust multi-modal architectures translate fMRI-to-image models from vision to mental imagery" (arXiv:2605.17198)
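
The cross-decoding idea in the summary above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the linear backbone behaves like a ridge regression from voxel responses to feature vectors, and it uses synthetic data throughout. The names `fit_ridge` and `decode`, the dimensions, and the noise levels are all hypothetical, and the diffusion model that would consume the predicted features is omitted.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_voxels = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_voxels)
    return np.linalg.solve(gram, X.T @ Y)

def decode(X, W):
    """Apply a fitted linear map to voxel responses."""
    return X @ W

rng = np.random.default_rng(0)
n_train, n_test, n_voxels = 200, 20, 50
img_dim, txt_dim = 8, 4  # hypothetical low-dimensional feature sizes

# Synthetic "true" voxel-to-feature maps (illustration only, not real data).
true_img = rng.normal(size=(n_voxels, img_dim))
true_txt = rng.normal(size=(n_voxels, txt_dim))

# Training trials stand in for responses to viewed images.
X_seen = rng.normal(size=(n_train, n_voxels))
Y_img = X_seen @ true_img + 0.1 * rng.normal(size=(n_train, img_dim))
Y_txt = X_seen @ true_txt + 0.1 * rng.normal(size=(n_train, txt_dim))

# One linear map per feature modality, both fitted on vision trials only.
W_img = fit_ridge(X_seen, Y_img)
W_txt = fit_ridge(X_seen, Y_txt)

# Test trials stand in for mental imagery responses; the maps are reused as-is.
X_imagery = rng.normal(size=(n_test, n_voxels))
pred_img = decode(X_imagery, W_img)
pred_txt = decode(X_imagery, W_txt)

# Concatenated predictions would serve as conditioning for a generative model.
conditioning = np.concatenate([pred_img, pred_txt], axis=1)
print(conditioning.shape)
```

In this toy setup the imagery trials share the same voxel-to-feature mapping as the vision trials, which is the assumption that makes cross-decoding possible at all. Real imagery responses are typically noisier, which is one plausible reason a simple, well-regularized linear map might transfer better than a more flexible one.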