A simulator for predicting large-scale LLM training and inference performance
Mengtian Yang, Zhekun Zhang, Mingheng Wu, Jianwen Yan, Hanshi Sun, Li-wen Chang
May 16, 2026
Deploying large-scale LLM training and inference requires navigating a complex space of parallelism strategies, system optimizations, and hardware choices. Charon is a modular simulator that predicts performance across these configurations. Its reported prediction error is under 5.35% overall, and 3.74% for training on large GPU clusters. In a practical inference case, the simulator identified a configuration that outperformed an engineering-tuned baseline, which suggests it is useful to practitioners optimizing real systems.