Can AI personality tests avoid the biases that plague human ones?

Ming Wang, Shuang Wu, Bixuan Wang, Lu Lin, Yuxin Chen, Xiaocui Yang, Daling Wang, Shi Feng, Yifei Zhang, Yufan Sun

Self-report questionnaires for measuring the psychology of AI agents suffer from training-data contamination and social desirability bias: agents answer with what they think they should say rather than what they actually do. GenPT adapts the projective tests of classical psychology (the Rorschach test and the Thematic Apperception Test, TAT) to AI-generated stimuli, measuring how agents respond in open-ended narrative rather than by checking boxes. On depression and suicidal ideation, projective responses showed large shifts in real scenarios while questionnaire scores stayed artificially flat, suggesting that GenPT captures genuine behavioral patterns that questionnaires miss. The code is released.