Do neurons specialize more as models grow larger?
Amil Dravid, Yasaman Bahri, Alexei A. Efros, Yossi Gandelsman
June 2, 2026
Scaling laws describe how loss improves with model size, but what happens inside the model? This work tracks Rosetta Neurons (neurons that activate similarly across independently trained models) in models ranging from 30B-parameter language models to 5B-parameter vision models. The authors find that these interpretable neurons follow a sublinear scaling law: their absolute count grows with model size, but they make up a shrinking percentage of all neurons. Critically, Rosetta Neurons become increasingly selective and monosemantic (responding to a single concept) as models scale, while non-Rosetta neurons remain scattered. An analytical model explains this polarization as competition for limited neuron capacity. The findings indicate that interpretability and specialization improve with scale rather than degrade.
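To make the sublinear claim concrete, here is a minimal sketch that assumes the count follows a simple power law, count = prefactor * total_neurons ** exponent, with an exponent between 0 and 1. The functional form, the function names, and every number below are illustrative assumptions, not values or code from the paper. Under that assumption the count rises with model size while the fraction, which scales as total_neurons ** (exponent - 1), falls.

```python
# Illustrative sketch only: the power-law form, prefactor, and exponent are
# hypothetical choices, not fitted values reported by the paper.

def rosetta_count(total_neurons: float, prefactor: float = 1.0,
                  exponent: float = 0.5) -> float:
    """Hypothetical sublinear law: count = prefactor * total_neurons ** exponent."""
    return prefactor * total_neurons ** exponent


def rosetta_fraction(total_neurons: float, prefactor: float = 1.0,
                     exponent: float = 0.5) -> float:
    """Share of all neurons that are Rosetta Neurons under the same assumed law."""
    return rosetta_count(total_neurons, prefactor, exponent) / total_neurons


# Made-up model sizes, expressed as total neuron counts.
sizes = [1e4, 1e5, 1e6]
counts = [rosetta_count(n) for n in sizes]
fractions = [rosetta_fraction(n) for n in sizes]

for n, c, f in zip(sizes, counts, fractions):
    print(f"total={n:.0e}  count={c:.1f}  fraction={f:.4%}")
```

Any exponent strictly between 0 and 1 gives the same qualitative picture; an exponent of 1 would keep the fraction constant, and an exponent above 1 would make it grow.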