This repository demonstrates how components and services operate, collaborate, and are managed in a Machine Learning Operations (MLOps) ecosystem built on Google Cloud Platform services.
It is also intended as a foundational framework for any organization that wants to implement MLOps best practices in the Google Cloud ecosystem or move to the next level of MLOps maturity [ 1 ].
We believe this base template is a replicable starting point for integrating the components and services needed to scale MLOps in different organizations.
- 2.1 vertex-pipelines
- 2.2 cloud-functions
- 2.3 cloud-build
- 2.4 .github/workflows
This repository currently implements three operational flows for integration, development, and continuous deployment. The following image represents each of these flows and the components involved.
The GitHub Actions components carry out continuous integration as part of the deployment, while the Cloud Build components validate, register, and deploy Google Cloud components.
The first flow describes the process of updating `pipelines` and `components` within `vertex-pipelines` [ 2 ]. When a collaborator submits a change, one of the GitHub Actions flows detects that the change is within the `vertex-pipelines` directory and triggers the flow that updates the component and registers the pipeline. The following figure represents the update flow of `vertex-pipelines`.
The second flow describes the process of updating an element within the `cloud-functions` [ 3 ] directory. When a collaborator submits a change, GitHub Actions detects whether the change occurred in `cloud-functions` and, if so, triggers the flow that registers and deploys a Cloud Function. The following figure represents this flow.
The third flow describes how to update the GitHub Actions [ 4 ] CI/CD flows, such as the verification, registration, and deployment flows defined in Cloud Build [ 5 ]. The following image represents the flow for updating the YAML files.
This repository follows a service-based structure, which keeps the elements that contribute to each service organized.
The purpose of the `vertex-pipelines` directory is to organize Vertex Pipelines elements by:
- `components`: organized by `evaluators`, `models` and `utils` (if required, more categories can be added, for example: `explainability`).
- `pipelines`: organized by projects. This repository contains two example projects: `beans` and `houses`.
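The layout described above can be sketched as a pair of path helpers. This is a minimal illustration, not code from the repository: the helper names `component_dir` and `pipeline_file` are hypothetical, and the exact location of each `<project>_pipeline.py` file under `pipelines/<project>/` is an assumption.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("vertex-pipelines")

# Component categories named in this README; more can be added
# (for example "explainability").
COMPONENT_CATEGORIES = ("evaluators", "models", "utils")

# Example projects named in this README.
PROJECTS = ("beans", "houses")


def component_dir(category: str) -> PurePosixPath:
    """Return the directory that holds components of one category."""
    if category not in COMPONENT_CATEGORIES:
        raise ValueError(f"unknown component category: {category!r}")
    return ROOT / "components" / category


def pipeline_file(project: str) -> PurePosixPath:
    """Return the assumed location of a project's pipeline definition."""
    if project not in PROJECTS:
        raise ValueError(f"unknown project: {project!r}")
    # Assumption: the pipeline file sits directly under pipelines/<project>/.
    return ROOT / "pipelines" / project / f"{project}_pipeline.py"
```

Under these assumptions, `component_dir("models")` points at `vertex-pipelines/components/models`, and adding a new category only requires extending `COMPONENT_CATEGORIES`.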
The interaction between `components` and `pipelines` should be understood as:
"A pipeline is defined under a specific project and is built from components defined in `components`."
The purpose of the `cloud-functions` directory is to manage functions that will serve as triggers or callers of Vertex Pipelines.
- Cloud Functions are organized by project.
- Each project might be linked to a `pipeline` defined in `vertex-pipelines`. However, the Cloud Functions defined under `cloud-functions` are not required to be tied to a specific `pipeline` in `vertex-pipelines`.
In this repository, we have two Cloud Functions, `beans` and `houses`, that serve as callers of the pipelines `beans_pipeline.py` and `houses_pipeline.py`, respectively.
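The relationship between a Cloud Function and its pipeline can be sketched as a lookup followed by a submission. This is only an illustration under stated assumptions: `handle_trigger` is a hypothetical name, and `submit` is a stand-in for whatever call actually launches a Vertex Pipelines run, injected here so the sketch does not depend on any Google Cloud client library.

```python
from typing import Callable, Optional

# Mapping taken from this README: each Cloud Function calls one pipeline file.
FUNCTION_TO_PIPELINE = {
    "beans": "beans_pipeline.py",
    "houses": "houses_pipeline.py",
}


def handle_trigger(
    function_name: str,
    submit: Callable[[str], str],
) -> Optional[str]:
    """Submit the pipeline linked to a Cloud Function, if there is one.

    `submit` is a hypothetical callable that receives the pipeline file name
    and returns an identifier for the run it started.
    """
    pipeline = FUNCTION_TO_PIPELINE.get(function_name)
    if pipeline is None:
        # A Cloud Function is not required to be tied to a pipeline.
        return None
    return submit(pipeline)
```

Calling `handle_trigger("beans", submit)` passes `beans_pipeline.py` to `submit`, while a function name with no linked pipeline returns `None` without submitting anything.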
The `cloud-build` directory contains the Cloud Build definitions for `vertex-pipelines` and `cloud-functions`. For Vertex Pipelines, the `cloud-build/vertex-pipeline.yaml` file contains the steps to compile and register a pipeline. For Cloud Functions, the `cloud-build/cloud-functions.yaml` file contains the steps to deploy a Cloud Function.
In this repository, everything under the `vertex-pipelines/` directory is handled by `cloud-build/vertex-pipeline.yaml`. Likewise, everything under `cloud-functions/` is handled by `cloud-build/cloud-functions.yaml`.
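The directory-to-configuration mapping above can be sketched as a small routing function. This is an illustration of the idea rather than the repository's actual CI logic: `build_configs_for` is a hypothetical helper, and the changed file names in the usage note are made up for the example.

```python
from typing import Iterable, Set

# Mapping stated in this README: directory prefix -> Cloud Build definition.
BUILD_CONFIG_BY_PREFIX = {
    "vertex-pipelines/": "cloud-build/vertex-pipeline.yaml",
    "cloud-functions/": "cloud-build/cloud-functions.yaml",
}


def build_configs_for(changed_files: Iterable[str]) -> Set[str]:
    """Return the Cloud Build definitions that cover the changed files."""
    configs: Set[str] = set()
    for path in changed_files:
        for prefix, config in BUILD_CONFIG_BY_PREFIX.items():
            if path.startswith(prefix):
                configs.add(config)
    return configs
```

For a change set such as `["vertex-pipelines/components/models/train.py", "README.md"]` (hypothetical paths), only `cloud-build/vertex-pipeline.yaml` is selected; files outside both directories select nothing.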
The `.github/workflows` directory contains the GitHub Actions flow definitions. Two flows have been defined: `cicd-dev.yaml` and `cicd-prod.yaml`.
Let's grow this MLOps framework together.
The steps to create your first contribution are:
- 1️⃣ Create an issue where you explain:
  - What is the objective of the feature? (when adding a new feature)
  - What is the bug to be solved? (when fixing a bug)
- 2️⃣ Make a Pull Request to `develop`
  - Set `@ferneutron` or `@ulises-code` as the reviewers
- 3️⃣ Let's work together to get your PR merged!
- [ 1 ] MLOps: Continuous delivery and automation pipelines in machine learning
- [ 2 ] Introduction to Vertex AI Pipelines
- [ 3 ] Cloud Functions Documentation
- [ 4 ] GitHub Actions Documentation
- [ 5 ] Overview of Cloud Build
Happy coding!