-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support version in watcher cleanup rollback #4176
Support version in watcher cleanup rollback #4176
Conversation
Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane) |
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
This pull request is now in conflicts. Could you fix it? 🙏
|
0ec9173
to
b17c713
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new rollback and cleanup tests are excellent thanks, just one question on handling the "old" marker file format as well.
err := markUpgrade(log, paths.DataFrom(topDir), newAgentVersion, newAgentHash, newAgentVersionedHome, oldAgentVersion, oldAgentHash, oldAgentVersionedHome, nil, nil) | ||
require.NoError(t, err, "error writing fake update marker") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This always creates the marker with the new fields you added, in the case where we come from an old agent, shouldn't it omit them to make the test 100% accurate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a boolean parameter for that in createUpdateMarker
that blanks out the new fields to simulate an old upgrade marker just a few lines above this comment
https://github.com/elastic/elastic-agent/pull/4176/files/bf2e849b16f8852a59593e45b8ada9095e190ad1..b17c7132e2b6cd2ff157835379c833435f325fc4#diff-cc4058ffab0db853a72dc782e00a1a7c3c7b7314175bec4538b1f32e69332c66R497-R503
This pull request is now in conflicts. Could you fix it? 🙏
|
b17c713
to
d04001c
Compare
Tested this PR locally (on MacOS). Before upgrade
During upgrade
After upgrade
Directory cleanup (of old Agent files) is happening correctly after upgrade. The only thing that looks a bit odd is that the |
This issue exists when upgrading from |
@@ -81,7 +81,7 @@ func unzip(log *logger.Logger, archivePath, dataDir string) (UnpackResult, error | |||
return UnpackResult{}, fmt.Errorf("parsing package manifest: %w", err) | |||
} | |||
pm.mappings = manifest.Package.PathMappings | |||
versionedHome = path.Clean(pm.Map(manifest.Package.VersionedHome)) | |||
versionedHome = filepath.Clean(pm.Map(manifest.Package.VersionedHome)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one general nitpick about table-based testing. PR changes LGTM — love that we are now passing in topDir/dataDir and the unit testing that enabled.
|
a831764
into
elastic:feature/add-agent-version-to-installation-directory-name
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup
* Use package manifest for install (#4173) * use version in path for rpm and deb * install elastic-agent remapping paths * Use package manifests for upgrade (#4174) * map paths when upgrading using .tar.gz packages * use structured output from unpack to identify old and new agents directories * copy state directories and rotate symlink correctly * Support version in watcher cleanup rollback (#4176) * extract directory paths to allow unit testing (instead of nested paths.* calls) * introduce tests for watcher cleanup and rollback * close data directory after cleanup * Reintroduce upgrade version check (#4177) * Fix shutdownCallback() directories * reintroduce version check in upgrade * Use watcher that supports paths in update marker (#4178) * Invoke agent watcher capable of handling new paths * Adapt TestShutdownCallback to run multiple testcases * Fix shutdownCallback to work with version in path * Fix MacOS relink step * Add commit hash to manifest * document manifest-aware upgrade --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> * small fixes based on PR feedback * Add wait for watcher (#4229) * Fix rollbackInstall order of execution and handle errors * Add wait for watcher up and running --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
What does this PR do?
This is a subset of a bigger PR #3950, so CI is not expected to be green, especially regarding the integration tests.
This PR adds new fields in the update marker so that the upgrade watcher Cleanup and Rollback processes can operate correctly with the new
elastic-agent-<version>-<hash>
.There is some refactoring to make Cleanup() and Rollback() testable that involves injecting paths rather than obtaining through
paths.*
functions: this is needed to be able to write unit tests without callingpaths.SetTop()
which already created flakiness and introduced race conditions in our tests in the pastWhy is it important?
Now the agent installation is correctly cleaned up leaving only the new or old version of agent installed, since the watcher is reading the correct paths from the new update marker
Checklist
[ ] I have made corresponding changes to the documentation[ ] I have made corresponding change to the default configuration files[ ] I have added an entry in./changelog/fragments
using the changelog tool[ ] I have added an integration test or an E2E testAuthor's Checklist
How to test this PR locally
Testing this PR is mostly the same as the test procedure for #4174, but now the watcher cleans the installation correctly
After the end of grace period the new agent directory should be the only left inside
/opt/Elastic/Agent/data
.If we want to trigger a rollback, it's enough to kill the new agent a few times and then only the old agent installation will be left inside
/opt/Elastic/Agent/data
.Known issues:
Related issues
Use cases
Screenshots
Logs
Questions to ask yourself