Can AI moderate communities with their own rulebooks?
Zoher Kachwala, Bao Tran Truong, Rasika Muralidharan, Haewoon Kwak, Jisun An, Filippo Menczer
May 16, 2026
Social media platforms are moving toward community-governed moderation, where each group sets its own rules and norms. This paper introduces PluRule, a multilingual benchmark spanning 1,989 Reddit communities, 2,885 unique rules, and 9 languages. The task is framed as multiple choice: given a comment and its context, identify which rule is violated. Testing state-of-the-art vision-language models reveals a fundamental gap: GPT-5.2 only marginally outperforms trivial baselines, and larger models or additional context yield only slight improvements. Universal rules, such as those on civility and self-promotion, are easier to detect than community-specific norms. The benchmark and code are publicly available.
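To make the multiple-choice framing concrete, here is a minimal sketch of how such a task might be scored against a trivial baseline. Everything in it is an assumption for illustration: the `Item` fields, the toy examples, the `majority_baseline` helper, and the `keyword_overlap` stand-in for a model are hypothetical and do not reflect PluRule's actual data format, baselines, or evaluation code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Item:
    """One multiple-choice instance (hypothetical schema, not PluRule's format)."""
    comment: str
    context: str
    rules: List[str]  # candidate rules of the community
    answer: int       # index of the violated rule in `rules`


def accuracy(items: List[Item], predict: Callable[[Item], int]) -> float:
    """Fraction of items whose predicted rule index matches the label."""
    if not items:
        return 0.0
    return sum(1 for it in items if predict(it) == it.answer) / len(items)


def majority_baseline(train: List[Item]) -> Callable[[Item], int]:
    """Trivial baseline: always choose the most frequent answer index in `train`."""
    most_common = Counter(it.answer for it in train).most_common(1)[0][0]
    # Clamp in case an item has fewer candidate rules than the majority index.
    return lambda it: min(most_common, len(it.rules) - 1)


def keyword_overlap(it: Item) -> int:
    """Toy stand-in for a model: pick the rule sharing the most words with the comment."""
    words = set(it.comment.lower().split())
    scores = [len(words & set(rule.lower().split())) for rule in it.rules]
    return scores.index(max(scores))  # ties resolve to the first rule


# Invented examples: two "universal" rules and one community-specific norm.
ITEMS = [
    Item("you are an idiot", "reply in a debate thread",
         ["Be civil: no insults like idiot", "No self-promotion"], 0),
    Item("check out my channel", "reply in a tutorial thread",
         ["Be civil", "No promoting your own channel"], 1),
    Item("posting this on a tuesday", "meme submission",
         ["Be civil", "Memes allowed weekends only"], 1),
]

if __name__ == "__main__":
    print("keyword overlap:", accuracy(ITEMS, keyword_overlap))
    print("majority baseline:", accuracy(ITEMS, majority_baseline(ITEMS)))
```

Comparing any predictor against a baseline of this kind is what gives an accuracy number meaning: a score is only informative relative to what a rule-agnostic guess achieves on the same items.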