docs: update mlflow ingestion docs to include new concept mappings (#12791)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
yoonhyejin and hsheth2 authored Mar 5, 2025
1 parent cf0dc3a commit 1d1ed78
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions metadata-ingestion/docs/sources/mlflow/mlflow_pre.md
@@ -2,8 +2,10 @@

This ingestion source maps the following MLflow Concepts to DataHub Concepts:

-| Source Concept | DataHub Concept | Notes |
-|:---------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------:|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| [`Registered Model`](https://mlflow.org/docs/latest/model-registry.html#concepts) | [`MlModelGroup`](https://datahubproject.io/docs/generated/metamodel/entities/mlmodelgroup/) | The name of a Model Group is the same as a Registered Model's name (e.g. my_mlflow_model) |
-| [`Model Version`](https://mlflow.org/docs/latest/model-registry.html#concepts) | [`MlModel`](https://datahubproject.io/docs/generated/metamodel/entities/mlmodel/) | The name of a Model is `{registered_model_name}{model_name_separator}{model_version}` (e.g. my_mlflow_model_1 for Registered Model named my_mlflow_model and Version 1, my_mlflow_model_2, etc.) |
-| [`Model Stage`](https://mlflow.org/docs/latest/model-registry.html#concepts) | [`Tag`](https://datahubproject.io/docs/generated/metamodel/entities/tag/) | The mapping between Model Stages and generated Tags is the following:<br/>- Production: mlflow_production<br/>- Staging: mlflow_staging<br/>- Archived: mlflow_archived<br/>- None: mlflow_none |
+| Source Concept | DataHub Concept | Notes |
+|:-----------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------:|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [`Registered Model`](https://mlflow.org/docs/latest/model-registry/#registered-model) | [`MlModelGroup`](https://datahubproject.io/docs/generated/metamodel/entities/mlmodelgroup/) | The name of a Model Group is the same as a Registered Model's name (e.g. my_mlflow_model). Registered Models serve as containers for multiple versions of the same model in MLflow. |
+| [`Model Version`](https://mlflow.org/docs/latest/model-registry/#model-version) | [`MlModel`](https://datahubproject.io/docs/generated/metamodel/entities/mlmodel/) | The name of a Model is `{registered_model_name}{model_name_separator}{model_version}` (e.g. my_mlflow_model_1 for Registered Model named my_mlflow_model and Version 1, my_mlflow_model_2, etc.). Each Model Version represents a specific iteration of a model with its own artifacts and metadata. |
+| [`Experiment`](https://mlflow.org/docs/latest/tracking/#experiments) | [`Container`](https://datahubproject.io/docs/generated/metamodel/entities/container/) | Each Experiment in MLflow is mapped to a Container in DataHub. Experiments organize related runs and serve as logical groupings for model development iterations, allowing tracking of parameters, metrics, and artifacts. |
+| [`Run`](https://mlflow.org/docs/latest/tracking/#runs) | [`DataProcessInstance`](https://datahubproject.io/docs/generated/metamodel/entities/dataprocessinstance/) | Captures the run's execution details, parameters, metrics, and lineage to a model. |
+| [`Model Stage`](https://mlflow.org/docs/latest/model-registry/#deprecated-using-model-stages) | [`Tag`](https://datahubproject.io/docs/generated/metamodel/entities/tag/) | The mapping between Model Stages and generated Tags is the following:<br/>- Production: mlflow_production<br/>- Staging: mlflow_staging<br/>- Archived: mlflow_archived<br/>- None: mlflow_none. Model Stages indicate the deployment status of each version. |
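
The naming and tagging rules in the Notes column can be illustrated with a minimal Python sketch. This is not the DataHub source code: the function names `datahub_model_name` and `stage_to_tag` are hypothetical, and the default separator `"_"` is an assumption inferred from the `my_mlflow_model_1` example in the table.

```python
# Illustrative sketch only; these helpers are hypothetical and are not
# part of the DataHub or MLflow APIs.

# Stage-to-tag mapping exactly as listed in the Model Stage row.
STAGE_TO_TAG = {
    "Production": "mlflow_production",
    "Staging": "mlflow_staging",
    "Archived": "mlflow_archived",
    "None": "mlflow_none",
}


def datahub_model_name(
    registered_model_name: str,
    model_version: int,
    model_name_separator: str = "_",  # assumed default, inferred from the example
) -> str:
    """Build a name as {registered_model_name}{model_name_separator}{model_version}."""
    return f"{registered_model_name}{model_name_separator}{model_version}"


def stage_to_tag(stage: str) -> str:
    """Look up the tag name that corresponds to an MLflow Model Stage."""
    return STAGE_TO_TAG[stage]


if __name__ == "__main__":
    print(datahub_model_name("my_mlflow_model", 1))
    print(stage_to_tag("Production"))
```

With these assumptions, version 1 of `my_mlflow_model` is named `my_mlflow_model_1`, and a version in the Production stage gets the tag `mlflow_production`.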
