Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache polish work - including the introduction of cache policies to the new top level Cache object #3129

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

eapolinario
Copy link
Collaborator

@eapolinario eapolinario commented Feb 12, 2025

Tracking issue

flyteorg/flyte#6143

Why are the changes needed?

Over the years we received some feedback complaining about the discoverability of the cache-related functionality, this is finally being addressed in this PR. This PR also opens the door for a more complex caching mechanism to be applied to each @task.

What changes were proposed in this pull request?

We create a new object called Cache which is groups all of the caching functionality, including the ability to set the version, cache serialization, the ability to ignore certain inputs in the cache key calculation (via setting cache_ignore_input_vars), and the set of cache policies applied. Here's the full definition of Cache:

class Cache:
    version: Optional[str] = None
    serialize: bool = False
    ignored_inputs: Union[Tuple[str, ...], str] = ()
    salt: str = ""
    policies: Optional[Union[List[CachePolicy], CachePolicy]] = None

The first three fields are well known and map directly to existing cache-related concepts, now grouped in a single object. As for policies and salt, those are new concepts, the idea is to provide more complicated caching behavior based on the function definition. A CachePolicy is a protocol that specifies how versions are calculated. It exposes a single method called get_version that's called to produce a value, that we call version.

The Cache object applies the policies in the order in which they are defined to generate the final cache version. The process of generating the final value involves concatenating each version produced by a policy and then using SHA-256 to produce the final hexdigest (i.e. 64 hex characters) value.

Also worth mentioning that we require the definition of policies in case a Cache object is used.

All cache-related parameters are now officially deprecated (although they will keep being supported, potentially for a really long time). In order to make the migration seamless, the cache parameter in the @task decorator is now Union[bool, Cache] and mapping the old names to the new fields in Cache should be fairly simple.

As a final design decision, we also added a hook to help define a list of default policies. This might suit the more advanced users in case they need to maintain a central list of cache policies somewhere.

How was this patch tested?

Unit tests and local sandbox.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This PR implements a comprehensive refactoring of Flyte tasks' caching functionality, transitioning from explicit cache parameters to a kwargs-based approach with a new Cache class. The changes include enhanced pod template support, improved cache policy validation, and authentication system refinements. The implementation features updated error messaging for cache policy requirements and modified task decorator for pod template parameter handling, while maintaining backward compatibility through policy-based version generation.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 5

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
…ignore_input_vars`

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 12, 2025

Code Review Agent Run #e01a83

Actionable Suggestions - 6
  • flytekit/configuration/plugin.py - 1
    • Consider default implementation for cache policies · Line 115-115
  • tests/flytekit/unit/core/test_cache.py - 3
  • flytekit/core/task.py - 2
Review Details
  • Files reviewed - 5 · Commit Range: 067929b..4db4544
    • flytekit/__init__.py
    • flytekit/configuration/plugin.py
    • flytekit/core/cache.py
    • flytekit/core/task.py
    • tests/flytekit/unit/core/test_cache.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 12, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change Files Impacted
Feature Improvement - Enhanced Task Caching System

__init__.py - Added imports for new Cache, CachePolicy, and VersionParameters classes

plugin.py - Added support for default cache policies in FlytekitPlugin

cache.py - Introduced new Cache system with policy-based version generation

task.py - Updated task decorator to support new Cache object and deprecated old cache parameters

Testing - Cache System Unit Tests

test_cache.py - Added comprehensive test suite for new Cache functionality and policies

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Copy link

codecov bot commented Feb 12, 2025

Codecov Report

Attention: Patch coverage is 94.02985% with 4 lines in your changes missing coverage. Please review.

Project coverage is 76.10%. Comparing base (4cee7cd) to head (87432ff).
Report is 19 commits behind head on master.

Files with missing lines Patch % Lines
flytekit/core/cache.py 90.47% 1 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3129      +/-   ##
==========================================
- Coverage   77.90%   76.10%   -1.80%     
==========================================
  Files         217      205      -12     
  Lines       22178    21758     -420     
  Branches     2768     2820      +52     
==========================================
- Hits        17277    16559     -718     
- Misses       4115     4377     +262     
- Partials      786      822      +36     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 12, 2025

Code Review Agent Run #6b0cb9

Actionable Suggestions - 0
Additional Suggestions - 10
  • flytekit/image_spec/image_spec.py - 2
    • Consider adding validation for pip_secret_mounts · Line 89-89
    • Consider extracting pip_secret_mounts validation logic · Line 134-141
  • flytekit/models/domain.py - 1
    • Consider adding parameter validation in Domain · Line 14-16
  • flytekit/loggers.py - 1
    • Ensure proper boolean conversion from env var · Line 190-190
  • flytekit/image_spec/default_builder.py - 1
    • Consider extracting secret mount string logic · Line 224-229
  • flytekit/models/task.py - 1
  • flytekit/models/execution.py - 2
    • Consider helper method for bool wrapping · Line 332-332
    • Consider adding interruptible parameter validation · Line 183-183
  • tests/flytekit/unit/test_translator.py - 1
  • flytekit/tools/fast_registration.py - 1
    • Potential duplicate progress bar initialization · Line 166-169
Review Details
  • Files reviewed - 53 · Commit Range: 4db4544..d9395b4
    • dev-requirements.txt
    • flytekit/bin/entrypoint.py
    • flytekit/clients/auth_helper.py
    • flytekit/clients/friendly.py
    • flytekit/clients/grpc_utils/auth_interceptor.py
    • flytekit/clients/raw.py
    • flytekit/core/base_task.py
    • flytekit/core/cache.py
    • flytekit/core/context_manager.py
    • flytekit/core/node.py
    • flytekit/core/promise.py
    • flytekit/core/python_function_task.py
    • flytekit/core/type_engine.py
    • flytekit/core/type_match_checking.py
    • flytekit/core/worker_queue.py
    • flytekit/core/workflow.py
    • flytekit/deck/deck.py
    • flytekit/image_spec/default_builder.py
    • flytekit/image_spec/image_spec.py
    • flytekit/loggers.py
    • flytekit/models/core/workflow.py
    • flytekit/models/domain.py
    • flytekit/models/execution.py
    • flytekit/models/security.py
    • flytekit/models/task.py
    • flytekit/remote/remote.py
    • flytekit/tools/fast_registration.py
    • flytekit/tools/translator.py
    • flytekit/types/directory/types.py
    • plugins/flytekit-onnx-pytorch/dev-requirements.txt
    • pydoclint-errors-baseline.txt
    • pyproject.toml
    • tests/flytekit/integration/remote/test_remote.py
    • tests/flytekit/unit/bin/test_python_entrypoint.py
    • tests/flytekit/unit/clients/auth/test_keyring_store.py
    • tests/flytekit/unit/clients/test_auth_helper.py
    • tests/flytekit/unit/clients/test_friendly.py
    • tests/flytekit/unit/clients/test_raw.py
    • tests/flytekit/unit/core/image_spec/test_default_builder.py
    • tests/flytekit/unit/core/image_spec/test_image_spec.py
    • tests/flytekit/unit/core/test_annotated_bindings.py
    • tests/flytekit/unit/core/test_array_node_map_task.py
    • tests/flytekit/unit/core/test_generice_idl_type_engine.py
    • tests/flytekit/unit/core/test_map_task.py
    • tests/flytekit/unit/core/test_node_creation.py
    • tests/flytekit/unit/core/test_type_engine.py
    • tests/flytekit/unit/core/test_type_match_checking.py
    • tests/flytekit/unit/core/test_worker_queue.py
    • tests/flytekit/unit/deck/test_deck.py
    • tests/flytekit/unit/models/core/test_security.py
    • tests/flytekit/unit/models/core/test_workflow.py
    • tests/flytekit/unit/models/test_execution.py
    • tests/flytekit/unit/test_translator.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

wild-endeavor
wild-endeavor previously approved these changes Feb 12, 2025
flytekit/core/cache.py Outdated Show resolved Hide resolved
@@ -343,7 +346,35 @@ def launch_dynamically():
:param pickle_untyped: Boolean that indicates if the task allows unspecified data types.
"""

def wrapper(fn: Callable[P, Any]) -> PythonFunctionTask[T]:
def wrapper(fn: Callable[P, FuncOut]) -> PythonFunctionTask[T]:
nonlocal cache, cache_serialize, cache_version, cache_ignore_input_vars
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid nonlocal here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you expand a bit on why?

We need to modify these variables inside the wrapper function and in python variable assignment implies the creation of a new variable unless such variable is declared as nonlocal.

cache_ignore_input_vars = cache.get_ignored_inputs()
cache = True

if cache_serialize is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird to have cache_version being set above, but then immediately overwritten. One needs to look a few lines above to see that these parameters can not be set.

Can we at least have a check here to make it obvious that this is only used for is for the bool case?

is_bool_cache = isinstance(cache, bool) and cache is True and cache_version is None
if is_bool_cache:
    cache = Cache(...)

if isinstance(cache, Cache):
    ....

if is_bool_cache:
    if cache_serialize is None:
    ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TaskMetadata dataclass does not take Optional for the cache-related fields, which means that we either (1) modify those types in the dataclass declaration or (2) set default values prior to invoking the constructor. I decided to go with (2) because having those fields as Optional means that I'd have to check for the nil case everywhere (which is pretty annoying).

Also worth noting that this initialization covers the common case where the cache is not used at all (so not only the positive bool case).

flytekit/core/cache.py Show resolved Hide resolved
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 13, 2025

Code Review Agent Run #27f658

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: d9395b4..346cabe
    • flytekit/core/task.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
Signed-off-by: Eduardo Apolinario <eapolinario@users.noreply.github.com>
@flyte-bot
Copy link
Contributor

flyte-bot commented Feb 13, 2025

Code Review Agent Run #1f41e5

Actionable Suggestions - 1
  • tests/flytekit/unit/core/test_cache.py - 1
    • Consider adding test for version-only caching · Line 145-169
Review Details
  • Files reviewed - 3 · Commit Range: 346cabe..056b8a5
    • flytekit/core/cache.py
    • flytekit/core/task.py
    • tests/flytekit/unit/core/test_cache.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

AI Code Review powered by Bito Logo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants