Problem: Collecting text from scanned images of historical maps requires a data model that allows researchers to:
- use synthetic AND/OR human annotations for training text spotting or HTR models
- move between different file formats (to save as IIIF annotations in particular)
- associate text with other map features (for example, a symbol that the text labels)
- establish versions of data for the same image processed by different models
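To make the requirements above concrete, here is a minimal sketch of what a single spotted-text record could look like when saved as a IIIF-compatible annotation, using the W3C Web Annotation model (a `TextualBody` for the transcribed text, a `FragmentSelector` for the pixel bounding box, and a `generator` field to record which model produced it). The function name, example URIs, and field choices are illustrative assumptions, not the data model sketched at the meeting:

```python
import json

def text_annotation(canvas_uri, text, xywh, model=None):
    """Sketch of a W3C Web Annotation for one piece of text spotted on a map.

    canvas_uri -- a IIIF Canvas (or image) URI (example value, not a real endpoint)
    xywh       -- pixel bounding box of the text, as "x,y,w,h"
    model      -- optional name of the spotting/HTR model, recorded as
                  provenance so outputs of different models can coexist
    """
    anno = {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": "tagging",
        "body": {"type": "TextualBody", "value": text},
        "target": {
            "source": canvas_uri,
            "selector": {
                "type": "FragmentSelector",
                "conformsTo": "http://www.w3.org/TR/media-frags/",
                "value": f"xywh={xywh}",
            },
        },
    }
    if model is not None:
        # Provenance: lets two runs over the same image be kept as
        # distinct versions rather than overwriting each other.
        anno["generator"] = {"type": "Software", "name": model}
    return anno

print(json.dumps(
    text_annotation(
        "https://example.org/iiif/map/canvas/1",
        "Lake Erie",
        "100,200,64,18",
        model="text-spotter-v1",
    ),
    indent=2,
))
```

Linking the text to another map feature (for example, the symbol it labels) could then be expressed by adding a second target or a relation field pointing at that feature's annotation ID; the sketch above deliberately leaves that open.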
This problem first arose on the Machines Reading Maps project, but it is now a shared concern across many projects creating data from text on maps.
At the Open Maps Meeting in November 2024, a number of us developed a sketch of a data model for these purposes:
If you would like to get involved in formalizing this data model, let us know here by opening a ticket, or get in touch on the MapReader Slack workspace.