-
Notifications
You must be signed in to change notification settings - Fork 155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix filemarker creation process, update error message on uninstall #4172
Conversation
This pull request does not have a backport label. Could you fix it @fearful-symmetry? 🙏
NOTE: |
Pinging @elastic/elastic-agent (Team:Elastic-Agent) |
Coming up with an automated test for this is a bit tricky. To reproduce the original problem, we'd need to SIGINT the install command at the correct moment which would be hard to do deterministically. @pchila added a good test of the "Copy files" portion of the installer in https://github.com/elastic/elastic-agent/pull/4173/files, see the changes in internal/pkg/agent/install/install_test.go. Could we split the install up into individual function steps and write a unit test that ensures we can call uninstall if you only complete the first one? You could also add a hidden test option to the install command to get the install to exit right after the files were copied or something. I like breaking the install code up into explicit steps a bit better since a test that says "ensure we can uninstall after the first step" is better than "ensure we can uninstall after we hit the part of the code I arbitrarily exited from". |
@cmacknz when you say "ensure we can uninstall after the first step" are you thinking of a specific part of the process? After we create the |
Yes, should have been clearer, I am thinking of the install process as a sequence of steps like:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There a small doubt on the permissions we end up for topPath which should be checked
err = os.MkdirAll(filepath.Dir(topPath), 0755) | ||
err = os.MkdirAll(topPath, 0755) | ||
if err != nil { | ||
return utils.FileOwner{}, errors.New( | ||
err, | ||
fmt.Sprintf("failed to create installation parent directory (%s)", filepath.Dir(topPath)), | ||
errors.M("directory", filepath.Dir(topPath))) | ||
fmt.Sprintf("failed to create installation parent directory (%s)", topPath), | ||
errors.M("directory", topPath)) | ||
} | ||
|
||
// create the install marker | ||
if err := CreateInstallMarker(topPath, ownership); err != nil { | ||
return utils.FileOwner{}, fmt.Errorf("failed to create install marker: %w", err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a subtle change compared to the previous logic:
- before the change we were creating the parent directory of topPath with 0755, while topPath would be created by the copy of files
- after the change we are creating topPath as well with 0755 permissions instead of copying the permissions from the package: this could be a little too permissive and it cannot be fixed by setting different permissions in the package directory.
From a quick check without this change the topPath is created with 0750
root@elastic-agent-dev:/opt/Elastic# ls -la
total 12
drwxr-xr-x 3 root root 4096 Feb 5 12:55 .
drwxr-xr-x 3 root root 4096 Dec 18 14:59 ..
drwxr-x--- 4 root root 4096 Feb 5 12:55 Agent
we should check what kind of permissions we get now that we create also topDir with 0755
: does the Copy() call fixes those ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yah, in the previous version, the Copy
call would create the Agent
directory. I tested it like this, but wasn't considering permissions.Lemme check...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, we need to preserve the previous permissions. This is something else we should consider adding tests for here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I ended up moving the CreateInstallMarker
to after the initial copy operation, which avoids the permission issue here, and also seems more useful? If we create the install marker and then the copy operation fails than we might not be able to run or uninstall the agent at all, as the executable or base config, etc, might be missing.
// do this after the file copy but before installing service files, etc | ||
// That way if the service install process fails, a user can still run `elastic-agent uninstall` | ||
// to clean up. | ||
if err := CreateInstallMarker(topPath, ownership); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still happening after the copyFiles
call, it should be immediately after the err = os.MkdirAll(filepath.Dir(topPath), 0755)
earlier in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I chose to not put it there, since putting the .installed
file before anything has been installed seemed counter-intuitive, and if the copy fails we might not be able to run elastic-agent
at all. Unless there's something I'm not factoring in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't run elastic-agent uninstall
unless the .installed
marker exists, so if you put it after the copy you can't remove the copied files from disk with the elastic-agent uninstall
command.
Needs a changelog entry. |
changelog/fragments/1707951532-change-install-marker-creation.yaml
Outdated
Show resolved
Hide resolved
…yaml Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
|
I am going to force merge this since Blake has requested changes but is on PTO. The change is now small enough that I don't expect he'd object. |
…4172) * move install file marker create, rename methods, change error message * change error message * change error message * remove calls to InstallMarkerExists * fix windows helper * fix windows, again * Revert "remove calls to InstallMarkerExists" This reverts commit 400bb0d. * change install file locaton, revert removal of filepath.Dir * fix wording * move IsUpgradeable, remove duplication * add tests * add mostly useless test * linter * tinkering with CI * skip supervisor test on windows * tinkering with the linter * remove experiment * refactor install to make adding tests easier * linter * add more tests... * still fixing tests * sonarQube * fix windows tests, fight sonarqube * tinker with file marker create * move install path * fix bad merge * still cleaning up merge * add tests * add tests, docs * fix tests * don't check mode bits * move command, fix permissions * SonarQube... * refactor dir setup to make Sonarqube happy * add changelog * Update changelog/fragments/1707951532-change-install-marker-creation.yaml Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit 65e2c30)
…4172) * move install file marker create, rename methods, change error message * change error message * change error message * remove calls to InstallMarkerExists * fix windows helper * fix windows, again * Revert "remove calls to InstallMarkerExists" This reverts commit 400bb0d. * change install file locaton, revert removal of filepath.Dir * fix wording * move IsUpgradeable, remove duplication * add tests * add mostly useless test * linter * tinkering with CI * skip supervisor test on windows * tinkering with the linter * remove experiment * refactor install to make adding tests easier * linter * add more tests... * still fixing tests * sonarQube * fix windows tests, fight sonarqube * tinker with file marker create * move install path * fix bad merge * still cleaning up merge * add tests * add tests, docs * fix tests * don't check mode bits * move command, fix permissions * SonarQube... * refactor dir setup to make Sonarqube happy * add changelog * Update changelog/fragments/1707951532-change-install-marker-creation.yaml Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit 65e2c30) # Conflicts: # internal/pkg/agent/install/install.go # internal/pkg/agent/install/install_test.go
…4172) (#4307) * move install file marker create, rename methods, change error message * change error message * change error message * remove calls to InstallMarkerExists * fix windows helper * fix windows, again * Revert "remove calls to InstallMarkerExists" This reverts commit 400bb0d. * change install file locaton, revert removal of filepath.Dir * fix wording * move IsUpgradeable, remove duplication * add tests * add mostly useless test * linter * tinkering with CI * skip supervisor test on windows * tinkering with the linter * remove experiment * refactor install to make adding tests easier * linter * add more tests... * still fixing tests * sonarQube * fix windows tests, fight sonarqube * tinker with file marker create * move install path * fix bad merge * still cleaning up merge * add tests * add tests, docs * fix tests * don't check mode bits * move command, fix permissions * SonarQube... * refactor dir setup to make Sonarqube happy * add changelog * Update changelog/fragments/1707951532-change-install-marker-creation.yaml Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit 65e2c30) Co-authored-by: Alex K <8418476+fearful-symmetry@users.noreply.github.com>
What does this PR do?
Closes #4051
Why is it important?
This does a few things:
.installed
file to earlier on in the install process, so a partial install won't break theuninstall
command.RunningInstalled()
to make it more clear what the function does.Checklist
- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration files./changelog/fragments
using the changelog tool