cs.AI

Training AI to stop favoring one political side

Long Phan, Devin Kim, Alexander Pan, Alice Blair, Adam Khoja, Dan Hendrycks

May 21, 2026

Large language models show covert political bias: they treat conservative and liberal topics asymmetrically in rhetoric, depth, and engagement. The authors identify seven categories of this behavior and define two consistency metrics to measure it. Political Consistency Training, a reinforcement learning (RL) method, trains models to treat paired opposing viewpoints with equal fairness and depth. The authors report that the approach significantly reduces covert bias, preserves helpfulness, and transfers to unseen benchmarks.
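To make the idea of a pairwise consistency metric concrete, here is a minimal sketch. It is not the paper's metric, which this summary does not specify. It assumes a hypothetical grader has already scored each response on a 0-to-1 scale (for example, for depth), once for the liberal-framed prompt and once for the mirrored conservative-framed prompt. The function name `pair_consistency` and the scoring scale are illustrative assumptions.

```python
from statistics import mean


def pair_consistency(scores_a, scores_b):
    """Return 1 minus the mean absolute gap between paired scores.

    ASSUMPTION: scores_a[i] and scores_b[i] are hypothetical grader scores
    in [0, 1] for the two sides of the i-th mirrored prompt pair. This is
    an illustrative stand-in, not the metric defined in the paper.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    # A gap of 0 on every pair gives 1.0 (perfectly symmetric treatment).
    return 1.0 - mean(abs(a - b) for a, b in zip(scores_a, scores_b))


# Two prompt pairs: the first is treated identically, the second is not.
print(pair_consistency([0.5, 1.0], [0.5, 0.5]))
```

Under these assumptions, a score near 1.0 means both sides of each pair received similar treatment. An RL setup could in principle use such a score as part of a reward, though the actual training objective is described only in the paper itself.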
Published as "Reducing Political Manipulation with Consistency Training", arXiv:2605.22771