cs.LG

A simulator for predicting large-scale LLM training and inference performance

Mengtian Yang, Zhekun Zhang, Mingheng Wu, Jianwen Yan, Hanshi Sun, Li-wen Chang

May 16, 2026

Deploying LLM training and inference at scale requires navigating a complex space of parallelism strategies, system optimizations, and hardware choices. Charon is a modular simulator that predicts performance across these configurations: its reported prediction error is under 5.35% overall and 3.74% for training on large GPU clusters. In a practical inference case, the simulator identified a configuration that outperformed an engineering-tuned baseline, which suggests it is useful to practitioners optimizing real systems.
Published as "Charon: A Unified and Fine-Grained Simulator for Large-Scale LLM Training and Inference" (arXiv:2605.17164)