Why deep networks generalize: a geometry-based theory

This work establishes a pointwise generalization theory for fully connected networks by introducing the pointwise Riemannian Dimension, a complexity measure derived from the eigenvalue spectra of learned representations across layers. The approach resolves previous barriers to characterizing feature learning in deep networks and yields representation-aware generalization bounds that improve significantly on existing bounds based on model size, norm products, and infinite-width limits. Empirically, the Riemannian Dimension exhibits feature compression, decreases with over-parameterization, and captures the implicit bias of the optimizer. The theory demonstrates that deep networks are mathematically tractable in practical settings and that their generalization behavior is explained by a pointwise, feature-spectrum-aware notion of complexity.
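
To make the idea of a complexity measure built from layerwise feature spectra more concrete, here is a minimal Python sketch. The abstract does not state the definition of the Riemannian Dimension, so the quantity below is only an assumed stand-in: a generic spectral effective dimension, sum_i lam_i / (lam_i + eps), computed from the eigenvalues of each layer's feature second-moment matrix. The function names, the threshold eps, and the random two-layer ReLU network are all hypothetical choices for illustration and are not taken from the work itself.

```python
import numpy as np


def spectral_effective_dimension(features, eps=1e-3):
    """Generic effective dimension sum_i lam_i / (lam_i + eps).

    features: array of shape (n_samples, width).
    Illustrative stand-in only, NOT the paper's Riemannian Dimension.
    """
    n = features.shape[0]
    second_moment = features.T @ features / n   # (width, width), symmetric PSD
    lam = np.linalg.eigvalsh(second_moment)     # real eigenvalues, ascending
    lam = np.clip(lam, 0.0, None)               # remove tiny negative round-off
    return float(np.sum(lam / (lam + eps)))


def layerwise_dimensions(x, weights, eps=1e-3):
    """Effective dimension of the ReLU features produced after each layer."""
    dims, h = [], x
    for w in weights:
        h = np.maximum(h @ w, 0.0)              # fully connected layer + ReLU
        dims.append(spectral_effective_dimension(h, eps))
    return dims


# Hypothetical toy setup: random inputs and random (untrained) weights.
rng = np.random.default_rng(0)
x = rng.standard_normal((256, 32))
weights = [
    rng.standard_normal((32, 64)) / np.sqrt(32),
    rng.standard_normal((64, 64)) / np.sqrt(64),
]
dims = layerwise_dimensions(x, weights)
print(dims)
```

Each eigenvalue contributes a value between 0 and 1, so the result counts roughly how many spectral directions rise above eps. It is therefore bounded by the layer width and shrinks when the features concentrate on a few directions, which is the kind of spectrum-dependent behavior the abstract describes as feature compression.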