Extracting hidden details to fix image reconstruction in autoencoders
Tianhang Wang, Yitong Chen, Wei Song, Zuxuan Wu, Min Li, Jiaqi Wang
May 21, 2026
Representation autoencoders built on frozen vision models generate sharp images but reconstruct poorly, because the frozen encoder preserves limited spatial detail. DecQ addresses this with lightweight queries that extract fine-grained information from intermediate layers and feed it into the decoder. As a result, reconstruction quality rises from 19.13 to 22.76 dB PSNR, generative convergence is 3.3× faster, and the model reaches an FID of 1.41, all with just 3.9% extra computation and no fine-tuning.
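To put the PSNR gain in perspective, the short sketch below converts the reported improvement into an equivalent reduction in mean squared error, using the standard definition PSNR = 10·log10(MAX² / MSE). The function names `psnr` and `mse_ratio` are illustrative only and do not come from the paper; the conversion assumes both PSNR figures were measured against the same peak value.

```python
import math


def psnr(mse: float, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_val ** 2 / mse)


def mse_ratio(psnr_before: float, psnr_after: float) -> float:
    """Factor by which MSE shrinks when PSNR rises from psnr_before to psnr_after.

    Assumes both values use the same peak value, so it cancels out.
    """
    return 10.0 ** ((psnr_after - psnr_before) / 10.0)


# PSNR figures quoted in the summary above.
ratio = mse_ratio(19.13, 22.76)
print(f"MSE reduced by a factor of about {ratio:.2f}")
```

Under that assumption, a gain of 3.63 dB corresponds to a reconstruction error that is roughly 2.3 times smaller in the mean-squared sense.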