cs.CV

Teaching computers to read old handwritten sheet music

Pau Torras, Jiří Mayer, Carles Badal, Martina Dvořáková, Markéta Herzanová Vlková, Gerard Asbert, Vojtěch Dvořák, Samuel Šomorjai, Jan Hajič, Alicia Fornés

May 18, 2026

Libraries and archives hold thousands of digitized pages of historical sheet music that remain unsearchable and unreadable to machines. This dataset provides 1,309 pages of primarily handwritten scores with complete MusicXML transcriptions and symbol annotations, which makes it the largest handwritten music collection released to date. The pages come from real memory institutions rather than from synthetic generation, and the dataset is suitable for training both end-to-end and object-detection systems for optical music recognition.
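To give a feel for what a MusicXML transcription contains, here is a minimal sketch that reads a small score with Python's standard library. The sample fragment is hand-written for illustration and is not taken from the dataset; the dataset's file layout is not described here, and the names `SAMPLE_MUSICXML` and `summarize_musicxml` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written MusicXML fragment for illustration only (not from the dataset).
SAMPLE_MUSICXML = """<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Voice</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration><type>quarter</type></note>
      <note><pitch><step>E</step><octave>4</octave></pitch><duration>1</duration><type>quarter</type></note>
      <note><rest/><duration>2</duration><type>half</type></note>
    </measure>
    <measure number="2">
      <note><pitch><step>G</step><alter>1</alter><octave>4</octave></pitch><duration>4</duration><type>whole</type></note>
    </measure>
  </part>
</score-partwise>"""

# Map the MusicXML <alter> value (in semitones) to a simple accidental sign.
ACCIDENTALS = {1: "#", -1: "b"}


def summarize_musicxml(xml_text):
    """Count measures, rests and note types, and list pitches in document order."""
    root = ET.fromstring(xml_text)
    pitches = []
    rests = 0
    note_types = Counter()

    for note in root.iter("note"):
        note_type = note.findtext("type")
        if note_type:
            note_types[note_type] += 1

        if note.find("rest") is not None:
            rests += 1
            continue

        pitch = note.find("pitch")
        if pitch is None:
            continue  # e.g. unpitched percussion; ignored in this sketch

        alter_text = pitch.findtext("alter")
        alter = int(float(alter_text)) if alter_text else 0
        name = pitch.findtext("step") + ACCIDENTALS.get(alter, "") + pitch.findtext("octave")
        pitches.append(name)

    return {
        "measures": len(root.findall("./part/measure")),
        "pitches": pitches,
        "rests": rests,
        "note_types": note_types,
    }


summary = summarize_musicxml(SAMPLE_MUSICXML)
print(summary)
```

A structured summary like this is the kind of target an end-to-end recognizer would be trained to reproduce from a page image, while an object-detection system would instead rely on the symbol annotations.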
Published as "A Dataset for the Recognition of Historical and Handwritten Music Scores in Western Notation", arXiv:2605.18436