Making prompt optimization transparent through explanations

Prompt optimization typically treats instruction design as a black-box search problem, which obscures why a given change succeeds or fails. iPOE instead grounds optimization in explanations of annotation decisions, written either by the LLM itself or by humans. It extracts guidelines from these explanations, then iteratively refines the guideline set through removal, addition, shuffling, and merging. The resulting prompts are interpretable: they embed explicit annotation instructions that make both the model's reasoning and the optimization process transparent. Tested on four datasets, iPOE achieves up to 35% improvement over random guidelines and up to 31% over prompts without guidelines. Notably, automatically generated LLM explanations perform comparably to human explanations, which lowers the barrier for non-experts to optimize prompts in specialized domains.
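
To make the refinement loop concrete, the following is a minimal sketch of a greedy search over a guideline list using the four edit types named above (removal, addition, shuffling, merging). It is not the authors' implementation: every function name, the greedy acceptance rule, the single-guideline starting point, and the string-concatenation merge are assumptions made for illustration. In practice the `score` callable would evaluate the resulting prompt with an LLM on a development set, and merging would presumably rewrite two guidelines into one rather than concatenate them.

```python
import random
from typing import Callable, List, Sequence, Tuple

Guidelines = List[str]


def remove_op(active: Guidelines, pool: Sequence[str], rng: random.Random) -> Guidelines:
    """Drop one randomly chosen guideline (never empties the list)."""
    if len(active) <= 1:
        return list(active)
    i = rng.randrange(len(active))
    return active[:i] + active[i + 1:]


def add_op(active: Guidelines, pool: Sequence[str], rng: random.Random) -> Guidelines:
    """Append one guideline from the pool that is not currently in use."""
    unused = [g for g in pool if g not in active]
    if not unused:
        return list(active)
    return list(active) + [rng.choice(unused)]


def shuffle_op(active: Guidelines, pool: Sequence[str], rng: random.Random) -> Guidelines:
    """Reorder the guidelines; their order in the prompt may matter."""
    out = list(active)
    rng.shuffle(out)
    return out


def merge_op(active: Guidelines, pool: Sequence[str], rng: random.Random) -> Guidelines:
    """Combine two guidelines into one (placeholder: plain concatenation)."""
    if len(active) < 2:
        return list(active)
    i, j = sorted(rng.sample(range(len(active)), 2))
    merged = active[i] + " " + active[j]
    rest = [g for k, g in enumerate(active) if k not in (i, j)]
    return rest + [merged]


def optimize(
    pool: Sequence[str],
    score: Callable[[Guidelines], float],
    steps: int = 50,
    seed: int = 0,
) -> Tuple[Guidelines, float]:
    """Greedy search: apply a random edit, keep it only if the score improves."""
    rng = random.Random(seed)
    best = list(pool[:1])  # assumption: start from the first extracted guideline
    best_score = score(best)
    ops = [remove_op, add_op, shuffle_op, merge_op]
    for _ in range(steps):
        candidate = rng.choice(ops)(best, pool, rng)
        candidate_score = score(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best, best_score


def build_prompt(task: str, guidelines: Guidelines) -> str:
    """Render the task description followed by a numbered guideline list."""
    lines = [task, "", "Guidelines:"]
    lines += [f"{n}. {g}" for n, g in enumerate(guidelines, 1)]
    return "\n".join(lines)
```

Because every accepted edit is a visible change to a human-readable guideline list, the sequence of accepted edits doubles as a record of why the final prompt looks the way it does. Calling `build_prompt` on the guidelines returned by `optimize` yields a prompt with the instructions spelled out explicitly.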