cs.CL

Do neurons specialize more as models grow larger?

Amil Dravid, Yasaman Bahri, Alexei A. Efros, Yossi Gandelsman

June 2, 2026

Scaling laws describe how loss improves with model size, but what happens inside the network? This work tracks Rosetta Neurons (neurons that activate similarly across independently trained models) in models ranging from 30B-parameter language models to 5B-parameter vision models. The authors find that these interpretable neurons follow a sublinear scaling law: their absolute count grows with model size, but they make up a shrinking percentage of all neurons. Critically, Rosetta Neurons become increasingly selective and monosemantic (each responding to a single concept) as models scale, while the responses of non-Rosetta neurons remain scattered. An analytical model explains this polarization as the result of competition for limited neuron capacity. The findings indicate that interpretability and specialization improve with scale rather than degrade.
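To make the sublinear claim concrete, here is a minimal sketch of how a count can keep growing while its share of the total shrinks. It assumes a power-law form, count = c * N**alpha with 0 < alpha < 1. The summary only says the law is sublinear, so the power-law form, the constant `c`, the exponent `alpha`, the model sizes, and the function names are all hypothetical choices for illustration, not values from the paper.

```python
# Illustrative only: the power-law form, c, alpha, and the sizes below are
# assumptions made for this sketch, not numbers reported in the paper.

def rosetta_count(total_neurons: float, c: float = 10.0, alpha: float = 0.5) -> float:
    """Hypothetical sublinear law: count = c * N**alpha, with 0 < alpha < 1."""
    return c * total_neurons ** alpha


def rosetta_fraction(total_neurons: float, c: float = 10.0, alpha: float = 0.5) -> float:
    """Share of all neurons: count / N = c * N**(alpha - 1), which falls as N grows."""
    return rosetta_count(total_neurons, c, alpha) / total_neurons


# Hypothetical total neuron counts for three model sizes.
sizes = [1e4, 1e6, 1e8]
counts = [rosetta_count(n) for n in sizes]
fractions = [rosetta_fraction(n) for n in sizes]

for n, k, f in zip(sizes, counts, fractions):
    print(f"N={n:.0e}  count={k:.0f}  fraction={f:.4%}")
```

Under any exponent strictly between 0 and 1 the same pattern holds: the count rises with N while the fraction scales as N**(alpha - 1) and therefore falls.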
Published as "Neuron Populations Exhibit Divergent Selectivity with Scale", arXiv:2606.03990