cs.CL

Expanding language models to new tongues without costly retraining

Hao Zhou, Tianhao Li, Zhijun Wang, Shuaijie She, Linjuan Wu, Hao-Ran Wei, Baosong Yang, Jiajun Chen, Shujian Huang

May 18, 2026

Extending LLMs to new languages typically requires expensive continued pre-training (CPT) and alignment phases. This work resolves a core tension in parameter merging, where reducing conflicts with the original model weakens learning of the new language, by converting a dense model into a Mixture-of-Experts (MoE) architecture with language-specific experts. The method then transfers alignment ability by merging a post-training parameter delta into the CPT-enhanced base model, skipping a full alignment phase. Experiments show performance gains on the new languages while the original capabilities are maintained, and the approach generalizes across different models and post-training deltas.
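The delta-merging step can be illustrated with a toy sketch. It assumes the post-training delta is the element-wise difference between a post-trained checkpoint and the base it was tuned from, and that merging means adding that delta to the CPT-enhanced weights. The summary does not spell out the exact formula, or how the delta maps onto the language-specific experts, so the function names, the parameter name `ffn.w`, and the values below are all hypothetical.

```python
# Toy sketch only: "checkpoints" are dicts mapping a parameter name to a list of floats.
# The delta definition and the additive merge are assumptions, not the paper's confirmed recipe.

def param_delta(post_trained, base):
    """Assumed delta: post-trained weights minus the base weights they were tuned from."""
    return {
        name: [p - b for p, b in zip(post_trained[name], base[name])]
        for name in base
    }

def apply_delta(cpt_base, delta):
    """Add the delta element-wise to the CPT-enhanced base weights."""
    return {
        name: [w + d for w, d in zip(cpt_base[name], delta[name])]
        for name in cpt_base
    }

# Hypothetical three-element "layer" in each checkpoint.
base = {"ffn.w": [0.10, -0.20, 0.30]}   # original pre-trained base
post = {"ffn.w": [0.15, -0.25, 0.30]}   # same base after post-training (alignment)
cpt = {"ffn.w": [0.12, -0.10, 0.35]}    # base after continued pre-training on the new language

delta = param_delta(post, base)
merged = apply_delta(cpt, delta)
print(merged)
```

In this toy setting the merged weights carry both the CPT shift and the post-training shift, which is the intuition behind skipping a separate alignment phase. How well that carries over to a real model is what the paper's experiments evaluate.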
Published as "A Data-Efficient Path to Multilingual LLMs: Language Expansion via Post-training PARAM$Δ$ Integration into Upcycled MoE" (arXiv:2605.18083)