Can AI catch research flaws that human reviewers miss?

E3 is an automated review assistant that flags decision-relevant technical problems in research papers (unsupported claims, missing ablations, weak baselines, validity threats) and explains what evidence would resolve each issue. Evaluated on 100 ICLR 2026 papers with a clean backtesting protocol that avoids data contamination, E3 catches 90.2% of issues (partial-inclusive) versus 60.8% for human reviewers, and surfaces 406 additional concerns that the ICLR panel missed entirely. Code, corpus, and evaluation templates are openly available.
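
As a rough illustration of what a "partial-inclusive" catch rate could mean, the sketch below computes the rate two ways: counting partial matches as catches, and counting only full matches. This is an assumption about the metric, not E3's evaluation code. The label names (`full`, `partial`, `missed`), the `catch_rate` function, and the toy data are all hypothetical and are not taken from the E3 corpus.

```python
from collections import Counter
from typing import Iterable

# Hypothetical match labels; the source does not specify a labeling scheme.
FULL, PARTIAL, MISSED = "full", "partial", "missed"
VALID_LABELS = {FULL, PARTIAL, MISSED}


def catch_rate(labels: Iterable[str], include_partial: bool = True) -> float:
    """Return the fraction of reference issues counted as caught.

    With include_partial=True, partial matches count as catches
    (our assumed reading of "partial-inclusive").
    """
    counts = Counter(labels)
    unknown = set(counts) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reference issues to score")
    caught = counts[FULL] + (counts[PARTIAL] if include_partial else 0)
    return caught / total


# Toy labels invented for illustration only; not the paper's data.
toy_labels = [FULL, FULL, PARTIAL, MISSED, FULL]

if __name__ == "__main__":
    print(catch_rate(toy_labels, include_partial=True))   # 4 of 5 caught
    print(catch_rate(toy_labels, include_partial=False))  # 3 of 5 caught
```

Reporting both variants side by side shows how much of a headline catch rate depends on partial matches.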