A demo template that showcases how to manage an MLOps project, covering code collaboration, repository architecture, continuous integration, continuous development, and deployment with Google Cloud services and products.

ferneutron/mlops-gcp

MLOps on GCP

The purpose of this repository is to demonstrate the operation, collaboration, management, and functionality of components and services in a Machine Learning Operations (MLOps) ecosystem with Google Cloud Platform services.

Furthermore, this repository is intended as a foundational framework for any organization that wants to implement MLOps best practices in the Google Cloud ecosystem, or that wants to move to the next level of MLOps maturity [1].

We believe this base template can be replicated as a starting point for integrating the components and services needed to scale MLOps in different organizations.

  1. How to understand this repository?

  2. Repository organization

  3. Contributions

  4. References

1. How to understand this repository?

This repository currently implements three operational flows for continuous integration, development, and deployment. The following image shows each of these flows and the components involved.

[Figure: overview of the three operational flows and the components involved]

The GitHub Actions components carry out continuous integration as part of the deployment. The Cloud Build components, in turn, validate, register, and deploy Google Cloud components.

1.1 Vertex Pipelines

The first flow describes the process of updating pipelines and components within vertex-pipelines [2]. When a collaborator submits a change, one of the GitHub Actions flows detects that the change occurred within the vertex-pipelines directory and triggers the flow that updates the component and registers the pipeline. The following figure represents the update flow of vertex-pipelines.

[Figure: update flow of vertex-pipelines]

1.2 Cloud Functions

The second flow describes the process of updating an element within the cloud-functions [3] directory. When a collaborator submits a change, GitHub Actions detects whether the change occurred in cloud-functions and, if so, triggers the flow that registers and deploys a Cloud Function. The following figure represents this flow.

[Figure: registration and deployment flow of cloud-functions]

1.3 Cloud Build & GitHub Actions

The third flow covers updates to the CI/CD definitions themselves: the GitHub Actions [4] workflows and the verification, registration, and deployment flows defined in Cloud Build [5]. The following image represents the flow for updating the YAML files.

[Figure: update flow for the GitHub Actions and Cloud Build YAML files]
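
The path-based routing behind the three flows can be sketched in Python. This is a minimal illustration under stated assumptions, not the repository's actual GitHub Actions configuration: the flow labels and the function name `flows_for_changes` are hypothetical, and only the directory names come from this README.

```python
from pathlib import PurePosixPath

# Directory prefix -> flow label.
# The labels are illustrative; the repository may name its flows differently.
FLOW_BY_PREFIX = {
    ("vertex-pipelines",): "vertex-pipelines",
    ("cloud-functions",): "cloud-functions",
    ("cloud-build",): "cicd-definitions",
    (".github", "workflows"): "cicd-definitions",
}


def flows_for_changes(changed_files):
    """Return the sorted labels of the flows touched by the changed paths."""
    flows = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        for prefix, flow in FLOW_BY_PREFIX.items():
            # A file belongs to a flow when its path starts with the prefix.
            if parts[: len(prefix)] == prefix:
                flows.add(flow)
    return sorted(flows)
```

Under these assumptions, a change set that touches only a file below vertex-pipelines/ and the README selects the vertex-pipelines flow alone, while a change to cicd-dev.yaml selects the CI/CD definitions flow.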

2. Repository organization

This repository follows a service-based structure: the elements that contribute to a service are grouped and managed together.

2.1 vertex-pipelines

The purpose of the vertex-pipelines directory is to organize Vertex Pipelines elements by:

  • components: organized by evaluators, models and utils (if required, more categories can be added, for example: explainability).
  • pipelines: organized by project. This repository contains two example projects: beans and houses.

The interaction between components and pipelines should be understood as:

"A pipeline is defined under a specific project and is built from components defined in components."
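
One way to picture this relationship is a small plain-Python model. It is only a sketch: the classes, the helper `missing_components`, and the component names are hypothetical and are not part of the repository or of the Vertex Pipelines SDK. Only the categories (evaluators, models, utils) and the project name beans come from this README.

```python
from dataclasses import dataclass, field

# Categories listed in this README; more can be added if required.
COMPONENT_CATEGORIES = ("evaluators", "models", "utils")


@dataclass(frozen=True)
class Component:
    category: str  # one of COMPONENT_CATEGORIES
    name: str      # hypothetical component name


@dataclass
class Pipeline:
    project: str                               # e.g. "beans" or "houses"
    steps: list = field(default_factory=list)  # ordered Component references


def missing_components(pipeline, library):
    """Return the pipeline steps that are not defined in the shared library."""
    return [step for step in pipeline.steps if step not in library]


# Hypothetical shared library and a pipeline built from it.
library = {
    Component("models", "train_model"),
    Component("evaluators", "evaluate_model"),
}
beans = Pipeline("beans", [Component("models", "train_model"),
                           Component("evaluators", "evaluate_model")])
```

Here `missing_components(beans, library)` returns an empty list, which expresses the rule that a project pipeline only uses components defined in the shared components directory.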

2.2 cloud-functions

The purpose of the cloud-functions directory is to manage functions that will serve as triggers or callers of Vertex Pipelines.

  • Cloud Functions are organized by project.
  • Each project might be linked to a pipeline defined in vertex-pipelines. However, the Cloud Functions defined under cloud-functions are not required to be tied to a specific pipeline in vertex-pipelines.

In this repository, we have two Cloud Functions, beans and houses, that serve as callers of the pipelines beans_pipeline.py and houses_pipeline.py, respectively.
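
The caller relationship can be sketched as a lookup followed by a submission step. This is a hedged illustration, not the repository's Cloud Function code: the function `call_pipeline` and the injected `submit` callable are hypothetical, and the real functions would submit the pipeline to Vertex AI through the Google Cloud client libraries, which is deliberately left out here.

```python
# Cloud Function name -> pipeline file, as stated in this README.
PIPELINE_BY_FUNCTION = {
    "beans": "beans_pipeline.py",
    "houses": "houses_pipeline.py",
}


def call_pipeline(function_name, submit):
    """Hand the linked pipeline to `submit`, or return None if none is linked.

    `submit` is a hypothetical callable standing in for the real
    submission step, which this sketch does not implement.
    """
    pipeline_file = PIPELINE_BY_FUNCTION.get(function_name)
    if pipeline_file is None:
        # A Cloud Function is not required to be tied to a pipeline.
        return None
    return submit(pipeline_file)
```

Passing the submission step in as a parameter keeps the mapping testable without any cloud credentials.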

2.3 cloud-build

This directory contains the Cloud Build definitions for vertex-pipelines and cloud-functions. For Vertex Pipelines, the cloud-build/vertex-pipeline.yaml file contains the steps to compile and register a pipeline. For Cloud Functions, the cloud-build/cloud-functions.yaml file contains the steps to deploy a Cloud Function.

In this repository, everything under the vertex-pipelines/ directory is handled by cloud-build/vertex-pipeline.yaml. Likewise, everything under cloud-functions/ is handled by cloud-build/cloud-functions.yaml.

2.4 .github/workflows

This directory contains the definitions of the GitHub Actions flows. Two flows are currently defined: cicd-dev.yaml and cicd-prod.yaml.

3. Contributions

Let's grow this MLOps framework together.

The steps to create your first contribution are:

  • 1️⃣ Create an issue where you explain:

    • What is the objective of the feature? (In case of adding a new feature)
    • What is the bug to be solved? (In case of bugs)
  • 2️⃣ Open a Pull Request against develop

    • Set @ferneutron or @ulises-code as the reviewers
  • 3️⃣ Let's work together to get your PR merged!

4. References

Happy coding!
