Can synthetic data from AI improve statistical inference without full models?
Jiguang Li, Sid Kankanala, Veronika Rockova
May 29, 2026
When you know the relationships your data should satisfy (moment conditions) but not the full probability model, standard likelihood-based inference breaks down because there is no likelihood to write down. This work builds a Bayesian framework around empirical likelihood, which assigns weights to the observed data so that the weighted sample moments satisfy the moment conditions exactly, and extends it to incorporate synthetic data from generative AI as regularization. The method projects posterior draws onto the moment constraints, stays computationally tractable, and comes with theoretical convergence guarantees. In stock prediction from news headlines, AI-generated auxiliary data improved performance when domain-specific parameter priors were unavailable.
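To make the weighting idea concrete, here is a minimal sketch of classical empirical likelihood weights for the simplest moment condition, E[x - theta] = 0. It illustrates only the reweighting step, not the paper's Bayesian framework, posterior projection, or synthetic-data regularization. The function name `el_weights` and the bisection solver are assumptions chosen for illustration.

```python
import numpy as np

def el_weights(x, theta, tol=1e-12, max_iter=200):
    """Empirical likelihood weights for the scalar condition E[x - theta] = 0.

    Hypothetical helper: returns weights w_i = 1 / (n * (1 + lam * g_i)),
    where g_i = x_i - theta and lam solves sum_i g_i / (1 + lam * g_i) = 0.
    """
    g = np.asarray(x, dtype=float) - theta
    n = g.size
    # A solution with positive weights exists only if theta is strictly
    # inside the range of the data.
    if g.min() >= 0 or g.max() <= 0:
        raise ValueError("theta must lie strictly inside the range of x")

    # Positivity of all weights requires lam in (-1/max(g), -1/min(g)).
    lo, hi = -1.0 / g.max(), -1.0 / g.min()
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    # The score sum_i g_i / (1 + lam * g_i) is strictly decreasing in lam,
    # so plain bisection finds its unique root.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if np.sum(g / (1.0 + mid * g)) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break

    lam = 0.5 * (lo + hi)
    return 1.0 / (n * (1.0 + lam * g))

x = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
w = el_weights(x, theta=3.0)
# The weights are positive, sum to one, and the weighted mean equals theta
# up to numerical tolerance.
```

With `theta` equal to the sample mean the weights reduce to the uniform 1/n; moving `theta` away from the sample mean tilts weight toward the observations that pull the weighted mean in that direction.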