Grouping feature #133

Merged — 28 commits, Nov 22, 2023

Commits (all changes):
21d8d95 wip: implementing node grouping classes and config changes draft (Lasica, Nov 16, 2023)
c6a481d feature: grouping moved to other file, adjusted config and added test… (Lasica, Nov 17, 2023)
796501b wip: making progress to finish grouping reorganization and testing (Lasica, Nov 17, 2023)
582d77f fix: fixed errors introduced in generator class adding grouping (Lasica, Nov 17, 2023)
7af75a5 tests: fixed tests for new version (Lasica, Nov 17, 2023)
7499e8f fixup! feature: grouping moved to other file, adjusted config and add… (Lasica, Nov 17, 2023)
796bed5 tests: implementing grouping tests and fixing minor issues in other t… (Lasica, Nov 17, 2023)
43d56a2 docs: added comment on dependencies blockers (Lasica, Nov 17, 2023)
b4fba77 updated changelog (Lasica, Nov 17, 2023)
8ae0c49 docs: wrote docs about grouping feature, reenabled pyspelling check (Lasica, Nov 17, 2023)
aa60e1f gitignore - added dictionary (Lasica, Nov 17, 2023)
d2bb258 deps: fresh dependency update (Lasica, Nov 17, 2023)
40c8a4b fix: restored python 3.8 compatibility (Lasica, Nov 17, 2023)
46f815e docs: clarified docs and some comments about grouping and some typing… (Lasica, Nov 20, 2023)
cad68fb deps: bump spellcheck action version, fixed config (Lasica, Nov 20, 2023)
d8fb7f6 test: debugging spellcheck GH actions issue (Lasica, Nov 20, 2023)
45e43c2 fix: dictionary is necessary for tox spellcheck hook to work (Lasica, Nov 20, 2023)
126e4a4 cicd: disabled spellcheck config for cicd, as it misbehaves (Lasica, Nov 20, 2023)
857a0f9 Revert "fix: dictionary is necessary for tox spellcheck hook to work" (Lasica, Nov 20, 2023)
650514c refactor: added run_config and context as optional arguments of nodeg… (Lasica, Nov 20, 2023)
ec50f51 refactor: removed run_config from Grouper class, improved grp config … (Lasica, Nov 21, 2023)
59c8696 docs: added disclaimer about grouping feature (Lasica, Nov 21, 2023)
6bbb464 docs: added makefile and fixed disclaimer (Lasica, Nov 21, 2023)
ded9fc7 tests: cicd: reworked e2e tests to support multiple cases and added c… (Lasica, Nov 21, 2023)
bacc20f fix: tests: fixed double error calling (Lasica, Nov 21, 2023)
223fc37 cicd: reverted python version 3.10-3.8, as it failed. debugging can b… (Lasica, Nov 21, 2023)
9973cd8 tests: cicd: extended timeout for standard pipeline e2e test (Lasica, Nov 21, 2023)
71ff26d docs: added pictures to grouping docs and changed note to warning (Lasica, Nov 22, 2023)
15 changes: 15 additions & 0 deletions .github/pre-commit-config.yaml
@@ -0,0 +1,15 @@
repos:
- repo: https://github.com/pycqa/isort
rev: 5.12.0
hooks:
- id: isort
args: ["--profile", "black", "--line-length=79"]
- repo: https://github.com/psf/black
rev: 22.3.0
hooks:
- id: black
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.3.0
hooks:
- id: flake8
args: ['--ignore=E203,W503', '--max-line-length=120'] # see https://github.com/psf/black/issues/315 https://github.com/psf/black/issues/52
2 changes: 1 addition & 1 deletion .github/workflows/spellcheck.yml
Expand Up @@ -16,5 +16,5 @@ jobs:
steps:
# Spellcheck
- uses: actions/checkout@v4
- uses: rojopolis/spellcheck-github-actions@0.25.0
- uses: rojopolis/spellcheck-github-actions@0.35.0
name: Spellcheck
25 changes: 18 additions & 7 deletions .github/workflows/test_and_publish.yml
Expand Up @@ -36,7 +36,7 @@ jobs:
- name: Check pre-commit status
run: |
poetry install -v
poetry run pre-commit run --all-files
poetry run pre-commit run --all-files -c .github/pre-commit-config.yaml

- name: Test with tox
run: |
Expand Down Expand Up @@ -98,7 +98,11 @@ jobs:

e2e_tests:
runs-on: ubuntu-latest
timeout-minutes: 60
needs: [unit_tests, sonarcloud]
strategy:
matrix:
e2e_case: ["standard", "grouping"]
steps:
- uses: actions/checkout@v4

Expand All @@ -120,7 +124,7 @@ jobs:
# kedro 0.18.1 is on purpose here, due to https://github.com/kedro-org/kedro-starters/issues/99
run: |
pip install $(find "./dist" -name "*.tar.gz")
kedro new --starter spaceflights --config tests/e2e/starter-config.yml --verbose
kedro new --starter spaceflights --config tests/e2e/${{ matrix.e2e_case }}/starter-config.yml --verbose

- name: Install project dependencies
working-directory: ./spaceflights
Expand All @@ -139,8 +143,13 @@ jobs:
sed -i 's/\(COPY src\/requirements.txt.*\)$/\1\nCOPY kedro-vertexai.tar.gz ./g' Dockerfile
echo "!data/01_raw" >> .dockerignore
kedro vertexai init gid-ml-ops-sandbox europe-west4
mv ../tests/e2e/catalog.yml conf/base/catalog.yml
mv ../tests/e2e/vertexai.yml conf/base/vertexai.yml
cp ../tests/e2e/${{ matrix.e2e_case }}/catalog.yml conf/base/catalog.yml
cp ../tests/e2e/${{ matrix.e2e_case }}/vertexai.yml conf/base/vertexai.yml
# Introducing tagging to pipelines
if [[ "${{ matrix.e2e_case }}" == "grouping" ]]; then
mv ../tests/e2e/${{ matrix.e2e_case }}/pipeline_data_processing.py src/spaceflights/pipelines/data_processing/pipeline.py
mv ../tests/e2e/${{ matrix.e2e_case }}/pipeline_data_science.py src/spaceflights/pipelines/data_science/pipeline.py
fi

- name: Prepare docker env
uses: docker/setup-buildx-action@v3
Expand All @@ -151,14 +160,15 @@ jobs:
- name: Build pipeline docker image
run: |
cd ./spaceflights
docker build --build-arg BASE_IMAGE=python:3.8-buster --tag kedro-vertexai-e2e:latest --load .
docker pull gcr.io/gid-ml-ops-sandbox/kedro-vertexai-e2e:${{ matrix.e2e_case }} || true
docker build --build-arg BASE_IMAGE=python:3.8-buster --tag kedro-vertexai-e2e:${{ matrix.e2e_case }} --load --cache-from=gcr.io/gid-ml-ops-sandbox/kedro-vertexai-e2e:${{ matrix.e2e_case }} .

- name: Publish docker image to GCR
uses: mattes/gce-docker-push-action@v1
with:
creds: ${{ secrets.GOOGLE_APPLICATION_CREDENTIALS }}
src: kedro-vertexai-e2e:latest
dst: gcr.io/gid-ml-ops-sandbox/kedro-vertexai-e2e:latest
src: kedro-vertexai-e2e:${{ matrix.e2e_case }}
dst: gcr.io/gid-ml-ops-sandbox/kedro-vertexai-e2e:${{ matrix.e2e_case }}

- name: Set up GCP Credentials
uses: google-github-actions/auth@v1.1.1
Expand All @@ -172,6 +182,7 @@ jobs:
cd ./spaceflights
export KEDRO_CONFIG_COMMIT_ID=$GITHUB_SHA
kedro vertexai run-once --wait-for-completion

publish:
if: github.event.pull_request == null && github.ref == 'refs/heads/master'
needs: [ e2e_tests, codeql ]
Expand Down
5 changes: 5 additions & 0 deletions .gitignore
Expand Up @@ -174,3 +174,8 @@ terraform/terraform.tfstate
.idea
conf/azure/credentials.yml

# pyspelling
dictionary.dic

# vs code
.vscode
6 changes: 6 additions & 0 deletions .pre-commit-config.yaml
@@ -1,3 +1,5 @@
# THIS IS LOCAL PRE-COMMIT CONFIG
# CICD uses separate config in .github until spellcheck hook gets improved
repos:
- repo: https://github.com/pycqa/isort
rev: 5.12.0
Expand All @@ -13,3 +15,7 @@ repos:
hooks:
- id: flake8
args: ['--ignore=E203,W503', '--max-line-length=120'] # see https://github.com/psf/black/issues/315 https://github.com/psf/black/issues/52
- repo: https://github.com/getindata/py-pre-commit-hooks
rev: v0.2.1
hooks:
- id: pyspelling-docker
42 changes: 21 additions & 21 deletions .spellcheck.yml
Expand Up @@ -10,26 +10,26 @@ matrix:
- "README.md"
default_encoding: utf-8
pipeline:
- pyspelling.filters.context:
context_visible_first: true
escapes: \\[\\`~]
delimiters:
# Ignore multiline content between fences (fences can have 3 or more back ticks)
# ```
# content
# ```
- open: '^(?s)(?P<open>`{1,3})[^`]'
close: '(?P=open)'
# Ignore text between inline back ticks
- open: '(?P<open>`)[^`]'
close: '(?P=open)'
# Ignore text in brackets [] and ()
- open: '\['
close: '\]'
- open: '\('
close: '\)'
- open: '\{'
close: '\}'
- pyspelling.filters.context:
context_visible_first: true
escapes: \\[\\`~]
delimiters:
# Ignore multiline content between fences (fences can have 3 or more back ticks)
# ```
# content
# ```
- open: '(?s)^(?P<open>`{1,3})[^`]'
close: '(?P=open)'
# Ignore text between inline back ticks
- open: '(?P<open>`)[^`]'
close: '(?P=open)'
# Ignore text in brackets [] and ()
- open: '\['
close: '\]'
- open: '\('
close: '\)'
- open: '\{'
close: '\}'
dictionary:
wordlists:
- docs/spellcheck_exceptions.txt
- docs/spellcheck_exceptions.txt
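The regex change in this file (moving `(?s)` before `^`) is more than cosmetic: newer Python versions reject global inline flags that are not at the very start of the pattern. A small sketch of the difference (the fence-matching pattern is taken from the config above; behavior notes assume CPython's `re` module):

```python
import re

# Corrected pattern from .spellcheck.yml: the global flag (?s) comes first.
fence = re.compile(r'(?s)^(?P<open>`{1,3})[^`]')

# It still matches an opening code fence as intended.
assert fence.match("```python") is not None
assert fence.match("plain text") is None

# The old form '^(?s)...' places the global flag mid-pattern.
# On Python 3.11+ this raises re.error ("global flags not at the start
# of the expression"); on older versions it only warns.
try:
    re.compile(r'^(?s)(?P<open>`{1,3})[^`]')
except re.error as exc:
    print("old pattern rejected:", exc)
```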
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -4,6 +4,7 @@

- Added explicit pyarrow dependency to avoid a critical vulnerability
- Updated dependencies and tested for kedro `0.18.14`
- [Feature 🚀] Node grouping: added an option to group multiple Kedro nodes for execution in a single Vertex AI process, allowing better optimization: fewer steps, shorter delays when running Vertex AI nodes, and less time wasted on data serialization thanks to the possibility of using `MemoryDataset`

## [0.9.1] - 2023-08-16

Expand Down
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
58 changes: 56 additions & 2 deletions docs/source/02_installation/02_configuration.md
Expand Up @@ -26,6 +26,15 @@ run_config:
# Optional pipeline description
#description: "Very Important Pipeline"

# Optional config for node execution grouping. Two classes are provided:
# - IdentityNodeGrouper: the default, no-grouping option
# - TagNodeGrouper: tag-based grouping
grouping:
cls: kedro_vertexai.grouping.IdentityNodeGrouper
# cls: kedro_vertexai.grouping.TagNodeGrouper
# params:
# tag_prefix: "group:"

# How long to keep underlying Argo workflow (together with pods and data
# volume after pipeline finishes) [in seconds]. Default: 1 week
ttl: 604800
Expand All @@ -34,8 +43,10 @@ run_config:
# pipeline status. Used to send notifications or raise the alerts
# on_exit_pipeline: notify_via_slack

# Optional section allowing adjustment of the resources
# reservations and limits for the nodes
# Optional section allowing adjustment of the resources, reservations and limits
# for the nodes. You can specify node names or tags to select which nodes the
# requirements apply to (also in node selectors). When not provided, they default
# to 500m CPU and 1024Mi memory.
# If you don't want to specify pipeline resources, set both to None in __default__.
resources:

# For nodes that require more RAM you can increase the "memory"
Expand Down Expand Up @@ -170,6 +181,49 @@ def generate_config(self) -> dict:
The first one, `target_config_file`, should return the name of the configuration file to be generated (e.g. `credentials.yml`), and `generate_config` should return a dictionary, which will then be serialized into the target file as YAML. If the target file already exists during the invocation, it will be merged with the existing one (see the method `kedro_vertexai.dynamic_config.DynamicConfigProvider.merge_with_existing`) and then saved again.
Note that `generate_config` has access to the initialized plugin config via the `self.config` property, so any values from the `vertexai.yml` configuration are accessible.
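A minimal sketch of what such a provider might look like. A stand-in base class keeps the example self-contained and runnable; in a real project you would subclass `kedro_vertexai.dynamic_config.DynamicConfigProvider` as described above, and all config keys used here are hypothetical:

```python
class DynamicConfigProvider:
    """Stand-in for kedro_vertexai.dynamic_config.DynamicConfigProvider,
    used only so this sketch runs without the plugin installed."""

    def __init__(self, config: dict):
        self.config = config  # initialized plugin config (from vertexai.yml)


class EnvCredentialsProvider(DynamicConfigProvider):
    @property
    def target_config_file(self) -> str:
        # Name of the configuration file to be generated.
        return "credentials.yml"

    def generate_config(self) -> dict:
        # Values from vertexai.yml are reachable via self.config.
        project = self.config.get("project_id", "unknown-project")
        return {"gcp": {"project_id": project, "token": "fetched-at-runtime"}}


provider = EnvCredentialsProvider({"project_id": "demo"})
print(provider.generate_config())
```

The returned dictionary would then be serialized to `credentials.yml` (merged with any existing file) by the plugin.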


## Grouping feature

The optional `grouping` section enables the grouping feature, which aggregates the execution of many Kedro nodes into a single VertexAI node (or several). It allows you to subdivide Kedro pipelines into as many steps as logically make sense while keeping the advantages of in-memory data transmission. It also saves a lot of time by avoiding Docker container start-up delays on Vertex nodes, which can amount to about 2 minutes per VertexAI node.

The API allows you to implement your own aggregation method: you provide the aggregating class and its additional init params as a `kwargs` dictionary. The default class is `IdentityNodeGrouper`, which does not actually group the nodes (the plugin behaves as in versions before `0.9.1`). The class that implements grouping based on a configured tag prefix is `TagNodeGrouper`; the default prefix is `"group:"`. Whatever follows the tag prefix is used as the name of the group of nodes. Only one tag with the grouping prefix is allowed per node; more than one results in a `GroupingException`. Example configuration:
```yaml
grouping:
cls: kedro_vertexai.grouping.TagNodeGrouper
params:
tag_prefix: "group:"
```

The above configuration applied to this sample pipeline:
```python
Pipeline([
node(some_operation, "A", "B", name="node1", tags=["foo", "group:nodegroup"]),
node(some_operation, "B", "C", name="node2", tags=["bar", "group:nodegroup"]),
node(some_operation, "C", "D", name="node3", tags=["baz"]),
])
```
results in 2 VertexAI nodes: the first, named `nodegroup`, runs the `node1` and `node2` Kedro nodes inside and provides output `C`; the second is `node3`. An additional MLflow node may be present if `kedro-mlflow` is used; right now it is not possible to group it. If you need that functionality, search for or create an issue on the [GitHub page of the plugin](https://github.com/getindata/kedro-vertexai/issues).
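The tag-to-group mapping described above can be sketched in plain Python. This standalone snippet mimics the documented `TagNodeGrouper` semantics without importing kedro; the function name and error type are illustrative, not the plugin's actual API:

```python
TAG_PREFIX = "group:"

# Node name -> tags, mirroring the sample pipeline above.
nodes = {
    "node1": ["foo", "group:nodegroup"],
    "node2": ["bar", "group:nodegroup"],
    "node3": ["baz"],
}


def group_by_tag(nodes: dict, prefix: str = TAG_PREFIX) -> dict:
    """Map each group name to the set of Kedro nodes it contains."""
    mapping: dict = {}
    for name, tags in nodes.items():
        group_tags = [t for t in tags if t.startswith(prefix)]
        if len(group_tags) > 1:
            # The plugin raises GroupingException in this case.
            raise ValueError(f"only one {prefix!r} tag allowed per node: {name}")
        # Untagged nodes form single-node groups named after the node itself.
        group = group_tags[0][len(prefix):] if group_tags else name
        mapping.setdefault(group, set()).add(name)
    return mapping


assert group_by_tag(nodes) == {
    "nodegroup": {"node1", "node2"},
    "node3": {"node3"},
}
```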
Review comment (Contributor):
> I think that it would be awesome to show the following in the documentation, below the description:
> 1. Screenshot of Kedro-Viz of the regular spaceflights pipeline
> 2. Screenshot of code from e2e tests with tags (groups)
> 3. Screenshot from Vertex AI
>
> All side by side, with arrows from one to another. A visualization always speaks better than a wall of text, and people will know at a glance what to expect.

Reply (Contributor, Author):
> Ok, I'll add the pictures.
The grouping class is used during pipeline translation by the plugin's pipeline generator. It implements the interface of the `NodeGrouper` class with a `group` function that accepts `pipeline.node_dependencies` and returns a `Grouping`. `Grouping` is a `dataclass` with two dictionaries:
- `node_mapping` - defines the group names and which set of nodes belongs to each group
- `dependencies` - defines the child-parent relations of all groups in `node_mapping`

The `Grouping` class also validates the dependencies upon creation to check that the grouping is valid, i.e. that it does not introduce a cycle into the dependency graph.
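A minimal sketch of the `Grouping` contract just described. The field names (`node_mapping`, `dependencies`) come from the text above, but this cycle-check implementation is an assumption, not the plugin's actual code:

```python
from dataclasses import dataclass


@dataclass
class Grouping:
    node_mapping: dict  # group name -> set of Kedro node names
    dependencies: dict  # group name -> set of parent group names

    def __post_init__(self) -> None:
        self._check_acyclic()

    def _check_acyclic(self) -> None:
        # Depth-first search over the group graph; reaching a "grey"
        # (in-progress) group again means a cycle, i.e. an invalid grouping.
        WHITE, GREY, BLACK = 0, 1, 2
        colour = {g: WHITE for g in self.node_mapping}

        def visit(group: str) -> None:
            colour[group] = GREY
            for parent in self.dependencies.get(group, ()):
                if colour.get(parent) == GREY:
                    raise ValueError(f"cycle detected at group {parent!r}")
                if colour.get(parent) == WHITE:
                    visit(parent)
            colour[group] = BLACK

        for group in list(colour):
            if colour[group] == WHITE:
                visit(group)


# Valid grouping for the sample pipeline: node3 depends on nodegroup.
ok = Grouping(
    node_mapping={"nodegroup": {"node1", "node2"}, "node3": {"node3"}},
    dependencies={"node3": {"nodegroup"}},
)
```

Constructing a `Grouping` whose `dependencies` contain a cycle (e.g. two groups depending on each other) would raise during `__post_init__`.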

````{warning}
When using this feature, make sure that all nodes in the pipeline have names and that those names are unique within the pipeline, as the grouping class and VertexAI node naming depend on them.
````

### Example

Here you can see how the standard spaceflights pipeline changes after enabling the grouping feature configured with `TagNodeGrouper`, using the following tagging (view from Kedro-Viz):

![Vertex AI Pipeline](grouped_kedro_viz.png)

We get the following result:

![Vertex AI Pipeline](grouping_visualisation.png)


## Resources configuration

Optional `resources` and `node_selectors` sections enable adjustment of the resources reservations and limits for the
Expand Down