How can robots trust their own learning when safety is at stake?

Robots working around humans must balance safety with efficiency, but current safety filters tend to be overly conservative because they do not trust what the robot learns online. This work uses conformal prediction to formally verify that belief-space safety filters, which reason about the robot's uncertainty over the human, remain safe even when their neural inference makes mistakes. The method restricts verification to the regions where the robot's reasoning is reliable, which removes needless restrictions while preserving mathematical guarantees. It is demonstrated on simulated human-vehicle interaction.
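
For readers unfamiliar with conformal prediction, the sketch below shows the generic split-conformal calibration step that gives such methods their finite-sample guarantee. It is an illustration only, not the verification procedure of this work: the function names `conformal_quantile` and `empirical_coverage` are hypothetical, and the nonconformity scores are synthetic stand-ins for the errors of a learned predictor.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Split-conformal threshold for a target miscoverage level alpha.

    `scores` are nonconformity scores from a held-out calibration set.
    Under exchangeability, a fresh score falls at or below the returned
    threshold with probability at least 1 - alpha.
    """
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to certify this alpha.
        return math.inf
    return sorted(scores)[k - 1]


def empirical_coverage(seed=0, n_cal=500, n_test=2000, alpha=0.1):
    """Calibrate on synthetic scores and measure coverage on fresh ones."""
    rng = random.Random(seed)
    # Assumption: scores are absolute errors of some learned predictor,
    # simulated here as |N(0, 1)| draws.
    cal = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_cal)]
    test = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_test)]
    q = conformal_quantile(cal, alpha)
    covered = sum(s <= q for s in test)
    return q, covered / n_test


if __name__ == "__main__":
    q, cov = empirical_coverage()
    print(f"threshold={q:.3f}, empirical coverage={cov:.3f}")
```

The threshold is calibrated from data alone, with no assumption that the predictor is accurate. That property is what allows a guarantee to hold even when the underlying learned model makes mistakes.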