Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Tolerations to Build and BuildRun objects #1711

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dorzel
Copy link
Contributor

@dorzel dorzel commented Oct 30, 2024

Changes

Fixes #1636

Submitter Checklist

  • Includes tests if functionality changed/was added
  • Includes docs if changes are user-facing
  • Set a kind label on this PR
  • Release notes block has been filled in, or marked NONE

See the contributor guide
for details on coding conventions, github and prow interactions, and the code review process.

Release Notes

Add Tolerations to Build and BuildRun objects

@openshift-ci openshift-ci bot added the release-note Label for when a PR has specified a release note label Oct 30, 2024
@pull-request-size pull-request-size bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 30, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 30, 2024
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 8 times, most recently from 872db31 to 462d9bb Compare November 6, 2024 20:22
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 3 times, most recently from 4ecfc21 to dfe25d5 Compare November 13, 2024 18:59
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 3 times, most recently from e449fbd to 03b3b21 Compare November 19, 2024 18:13
@pull-request-size pull-request-size bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 19, 2024
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 8 times, most recently from 3e66b55 to 43382a6 Compare November 20, 2024 22:06
@dorzel dorzel marked this pull request as ready for review November 21, 2024 17:28
@dorzel
Copy link
Contributor Author

dorzel commented Jan 8, 2025

An update - still looking into why TaintEffect isn't getting set in the integration tests. Integration tests are at least all failing in the same way.
e2e tests are all over the place, with unrelated tests failing and 2/4 of them nearly passing. I'm wondering if some of these failures are due to the new three-node configuration and disk space issues.

@dorzel dorzel force-pushed the MULTIARCH-5036 branch 5 times, most recently from f17465b to 5181b3f Compare January 20, 2025 18:52
@dorzel
Copy link
Contributor Author

dorzel commented Jan 20, 2025

@SaschaSchwarze0 @adambkaplan

Would I be able to get another look at this? I'm having an issue with the validation logic for the taint effect:
https://github.com/shipwright-io/build/pull/1711/files#diff-991ca77e26cb9e18ce83c2dc437ab9dd148e6c9263b933267733b1b3c0b45c7dR49-R58

In the case where the effect isn't specified in the yaml, as in this test:
https://github.com/shipwright-io/build/pull/1711/files#diff-1cc6608c14aa73c7ad391f5f3d05b858cef1e22b084e778ce651e5b5e663b46aR594

It should get explicitly set on the TaskRun object via the validation logic, but that doesn't happen. Confused, as everything else for the validation seems to work.
Wondering if I'm missing something obvious here or if I need to revert to setting it in taskrun.go

@SaschaSchwarze0
Copy link
Member

Would I be able to get another look at this? I'm having an issue with the validation logic for the taint effect:
https://github.com/shipwright-io/build/pull/1711/files#diff-991ca77e26cb9e18ce83c2dc437ab9dd148e6c9263b933267733b1b3c0b45c7dR49-R58

Sry for the late reply @dorzel.

The validation logic should imo NOT mutate anything on the object. It should only validate. If empty is a good value for the effect, then that's all the validation should do: not complain anything.

The logic that creates the toleration on the TaskRun should then again be aware that empty is a valid value for the effect and translate that into TaintEffectNoSchedule.

@dorzel
Copy link
Contributor Author

dorzel commented Jan 27, 2025

Ok, reverted to setting the TaintEffect outside of validation and that has fixed the integration tests (one unrelated flaky test). Looks like we're just waiting on #1786 to get an actual signal from e2e tests.

@dorzel dorzel force-pushed the MULTIARCH-5036 branch 12 times, most recently from 090570c to b8e487a Compare January 29, 2025 23:05
Copy link
Member

@SaschaSchwarze0 SaschaSchwarze0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice that tests now succeed. I still have one more homework for you. Sorry that I missed that earlier.

Comment on lines 43 to 46
// In this case, fields set only on the BuildRun object do not get validated as they are not copied to the transient Build resource.
// explicitly setting them here is required for validation to happen.
build.Spec.NodeSelector = buildRun.Spec.NodeSelector
build.Spec.Tolerations = buildRun.Spec.Tolerations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct and reminds me of something that I missed earlier (also for nodeSelector). Users can define these things both in the inline build spec as well as the overrides on the buildrun spec.

Two things:

  1. We should validate the buildrun spec fields somewhere near here (https://github.com/shipwright-io/build/blob/v0.14.0/pkg/reconciler/buildrun/buildrun.go#L276-L292) and not copy it over to a build spec to perform a validation.
  2. When an inline build spec is used, we validate that certain fields are not set on the buildrun spec because it does not make sense to have them specified in two places. That would be suitable for both nodeSelector and tolerations as well. The code that validates this is here: https://github.com/shipwright-io/build/blob/v0.14.0/pkg/validate/validate.go#L106-L136. Basically, when somebody creates a standalone BuildRun (= with an inline Build spec), we expect that BuildRun to not also set those fields (params, volumes, timeout etc) on the BuildRun spec itself.

Can you adopt this please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so if I understand it seems that specifically in the inline build spec case we expect that certain fields are only specified on that build spec instead of having, for example, the "BuildRun Tolerations take preference over Build Tolerations" behavior (overriding). I'll implement these.

Signed-off-by: Dylan Orzel <dorzel@redhat.com>
@dorzel dorzel force-pushed the MULTIARCH-5036 branch 2 times, most recently from fa80d86 to b0fee7d Compare February 3, 2025 22:06
Signed-off-by: Dylan Orzel <dorzel@redhat.com>
Signed-off-by: Dylan Orzel <dorzel@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. release-note Label for when a PR has specified a release note size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

SHIP-0039: Allow tolerations on Build and BuildRun to be set
2 participants