← Back to Machine Learning
cs.LG

Why how you tokenize graphs matters more than you think

Maya Bechler-Speicher, Gilad Yehudai, Gil Harari, Clayton Sanford, Amir Globerson, Joan Bruna

May 21, 2026

Transformers need graphs converted into tokens before processing, but the paper proves this choice isn't neutral: different tokenization schemes (spectral, random-walk, adjacency) create distinct computational regimes where the same task may be trivial for one approach yet impossible for another at shallow depths. Random-walk tokenization provably loses information irretrievably; spectral tokenization preserves everything but struggles with local patterns. A shallow transformer cannot efficiently convert between tokenization families, meaning a poor initial choice can't be fixed by the model itself. Experiments on synthetic and real tasks confirm the theory and show that mixing complementary tokenizations helps.
Published as Lost in Tokenization: Fundamental Trade-offs in Graph Tokenization for Transformers arXiv:2605.22471
Read the original paper →