Teaching AI to remember its mistakes and improve faster

Rubric-based reinforcement learning fine-tunes language models using structured reward signals, but existing methods discard evaluation insights after each step. AMARIS adds a persistent memory system that retrieves relevant past feedback—both recent and semantically similar—to refine rubrics over time. The result: consistent improvements across coding, math, and creative writing tasks with only ~5% computational overhead, showing that accumulated training history beats stateless per-step heuristics.