A model registry is a tool to catalog ML models and their versions. Models from your data science projects can be discovered, tested, shared, deployed, and audited from there. DVC and GTO enable these capabilities on top of Git, so you can stick to an existing software engineering stack.
This repo is an example of a Model Registry built with these tools. The model dashboard:
    $ gto show
    ╒══════════╤══════════╤═════════╤═════════╤════════════╕
    │ name     │ latest   │ #dev    │ #prod   │ #staging   │
    ╞══════════╪══════════╪═════════╪═════════╪════════════╡
    │ churn    │ v3.1.1   │ v3.1.0  │ v3.0.0  │ v3.1.0     │
    │ segment  │ v0.4.1   │ v0.4.1  │ -       │ -          │
    │ cv-class │ v0.1.13  │ -       │ -       │ -          │
    ╘══════════╧══════════╧═════════╧═════════╧════════════╛
- The `latest` column shows the latest version of each model,
- The `#dev` column shows the model version promoted to the Stage `dev` (same for `#prod` and `#staging`),
- Versions are registered and promoted to Stages by Git tags - you can click the links to see which specific Git tag did it,
- Artifact metadata like `path` and `description` is stored in `artifacts.yaml`,
- The GitHub Actions page of this repo has examples of workflows where we act upon these Git tags.
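
To make the link between Git tags and the dashboard concrete, here is a minimal Python sketch of how such a table could be derived from tags. It is not GTO's actual implementation. It assumes the tag naming conventions `<name>@v<semver>` for registering a version and `<name>#<stage>#<n>` for promoting to a Stage (check the GTO docs for the version you use), and it assumes that the highest counter `<n>` marks the most recent promotion. The function `build_dashboard`, the commit IDs, and the example tags are all hypothetical.

```python
import re
from collections import defaultdict

# Assumed tag conventions (verify against the GTO docs for your version):
#   "<name>@v<semver>"    registers a version at the tagged commit
#   "<name>#<stage>#<n>"  promotes the version at the tagged commit to a Stage
REGISTER = re.compile(r"^(?P<name>[^@#]+)@v(?P<version>\d+\.\d+\.\d+)$")
ASSIGN = re.compile(r"^(?P<name>[^@#]+)#(?P<stage>[^@#]+)#(?P<n>\d+)$")


def semver_key(version):
    """Turn 'v3.1.0' into (3, 1, 0) so versions compare numerically."""
    return tuple(int(part) for part in version[1:].split("."))


def build_dashboard(tags):
    """Build {name: {"latest": version, stage: version, ...}}.

    `tags` is an iterable of (tag_name, commit_id) pairs.
    """
    versions = defaultdict(dict)     # name -> {commit_id: version}
    assignments = defaultdict(dict)  # name -> {stage: (counter, commit_id)}

    for tag, commit in tags:
        if m := REGISTER.match(tag):
            versions[m["name"]][commit] = "v" + m["version"]
        elif m := ASSIGN.match(tag):
            counter = int(m["n"])
            current = assignments[m["name"]].get(m["stage"])
            # Assumption: the highest counter is the most recent promotion.
            if current is None or counter > current[0]:
                assignments[m["name"]][m["stage"]] = (counter, commit)

    dashboard = {}
    for name, by_commit in versions.items():
        row = {"latest": max(by_commit.values(), key=semver_key)}
        for stage, (_, commit) in assignments[name].items():
            # A Stage tag refers to the version registered at the same commit.
            if commit in by_commit:
                row[stage] = by_commit[commit]
        dashboard[name] = row
    return dashboard


# Hypothetical tags and commit IDs that would reproduce the table above.
example_tags = [
    ("churn@v3.0.0", "a1"),
    ("churn@v3.1.0", "b2"),
    ("churn@v3.1.1", "c3"),
    ("churn#prod#1", "a1"),
    ("churn#staging#2", "b2"),
    ("churn#dev#3", "b2"),
    ("segment@v0.4.1", "d4"),
    ("segment#dev#1", "d4"),
    ("cv-class@v0.1.13", "e5"),
]

dashboard = build_dashboard(example_tags)
for name, row in dashboard.items():
    print(name, row)
```

In a real repo the `(tag_name, commit_id)` pairs would come from Git itself rather than a hard-coded list, and `gto show` remains the source of truth for the dashboard.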
Check out the public Model Registry in Studio, built on top of DVC and GTO, which provides more insight into your ML model development, including training params, metrics, and plots.
🧑‍💻 To continue learning, head to Get Started with GTO.