Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix filemarker creation process, update error message on uninstall #4172

Merged
merged 40 commits into from
Feb 21, 2024

Conversation

fearful-symmetry
Copy link
Contributor

@fearful-symmetry fearful-symmetry commented Jan 31, 2024

What does this PR do?

Closes #4051

Why is it important?

This does a few things:

  • moves the command to create the .installed file to earlier on in the install process, so a partial install won't break the uninstall command.
  • Rename RunningInstalled() to make it more clear what the function does.
  • Change the error message that's printed when the uninstall command detects an issue with the install path

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

@fearful-symmetry fearful-symmetry self-assigned this Jan 31, 2024
@fearful-symmetry fearful-symmetry requested a review from a team as a code owner January 31, 2024 21:50
@fearful-symmetry fearful-symmetry changed the title Uninstall filemarker Fix filemarker creation process, update error message on uninstall Jan 31, 2024
Copy link
Contributor

mergify bot commented Jan 31, 2024

This pull request does not have a backport label. Could you fix it @fearful-symmetry? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

@cmacknz
Copy link
Member

cmacknz commented Feb 1, 2024

Coming up with an automated test for this is a bit tricky. To reproduce the original problem, we'd need to SIGINT the install command at the correct moment which would be hard to do deterministically.

@pchila added a good test of the "Copy files" portion of the installer in https://github.com/elastic/elastic-agent/pull/4173/files, see the changes in internal/pkg/agent/install/install_test.go.

Could we split the install up into individual function steps and write a unit test that ensures we can call uninstall if you only complete the first one?

You could also add a hidden test option to the install command to get the install to exit right after the files were copied or something. I like breaking the install code up into explicit steps a bit better since a test that says "ensure we can uninstall after the first step" is better than "ensure we can uninstall after we hit the part of the code I arbitrarily exited from".

@fearful-symmetry
Copy link
Contributor Author

@cmacknz when you say "ensure we can uninstall after the first step" are you thinking of a specific part of the process? After we create the .installed file? After the files are copied?

@cmacknz
Copy link
Member

cmacknz commented Feb 2, 2024

Yes, should have been clearer, I am thinking of the install process as a sequence of steps like:

  1. Create directory and install marker
  2. Copy the agent package.
  3. Setup the shell wrapper.
  4. Fix permissions.
  5. Setup the OS service.
  6. Etc

Copy link
Member

@pchila pchila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There a small doubt on the permissions we end up for topPath which should be checked

Comment on lines 104 to 115
err = os.MkdirAll(filepath.Dir(topPath), 0755)
err = os.MkdirAll(topPath, 0755)
if err != nil {
return utils.FileOwner{}, errors.New(
err,
fmt.Sprintf("failed to create installation parent directory (%s)", filepath.Dir(topPath)),
errors.M("directory", filepath.Dir(topPath)))
fmt.Sprintf("failed to create installation parent directory (%s)", topPath),
errors.M("directory", topPath))
}

// create the install marker
if err := CreateInstallMarker(topPath, ownership); err != nil {
return utils.FileOwner{}, fmt.Errorf("failed to create install marker: %w", err)
}
Copy link
Member

@pchila pchila Feb 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a subtle change compared to the previous logic:

  • before the change we were creating the parent directory of topPath with 0755, while topPath would be created by the copy of files
  • after the change we are creating topPath as well with 0755 permissions instead of copying the permissions from the package: this could be a little too permissive and it cannot be fixed by setting different permissions in the package directory.

From a quick check without this change the topPath is created with 0750

root@elastic-agent-dev:/opt/Elastic# ls -la
total 12
drwxr-xr-x 3 root root 4096 Feb  5 12:55 .
drwxr-xr-x 3 root root 4096 Dec 18 14:59 ..
drwxr-x--- 4 root root 4096 Feb  5 12:55 Agent

we should check what kind of permissions we get now that we create also topDir with 0755: does the Copy() call fixes those ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, in the previous version, the Copy call would create the Agent directory. I tested it like this, but wasn't considering permissions.Lemme check...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, we need to preserve the previous permissions. This is something else we should consider adding tests for here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I ended up moving the CreateInstallMarker to after the initial copy operation, which avoids the permission issue here, and also seems more useful? If we create the install marker and then the copy operation fails than we might not be able to run or uninstall the agent at all, as the executable or base config, etc, might be missing.

// do this after the file copy but before installing service files, etc
// That way if the service install process fails, a user can still run `elastic-agent uninstall`
// to clean up.
if err := CreateInstallMarker(topPath, ownership); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still happening after the copyFiles call, it should be immediately after the err = os.MkdirAll(filepath.Dir(topPath), 0755) earlier in this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I chose to not put it there, since putting the .installed file before anything has been installed seemed counter-intuitive, and if the copy fails we might not be able to run elastic-agent at all. Unless there's something I'm not factoring in?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't run elastic-agent uninstall unless the .installed marker exists, so if you put it after the copy you can't remove the copied files from disk with the elastic-agent uninstall command.

@cmacknz cmacknz added backport-v8.13.0 Automated backport with mergify backport-v8.12.0 Automated backport with mergify and removed skip-changelog labels Feb 14, 2024
@mergify mergify bot removed the backport-skip label Feb 14, 2024
@cmacknz
Copy link
Member

cmacknz commented Feb 14, 2024

Needs a changelog entry.

…yaml

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
Copy link

@cmacknz
Copy link
Member

cmacknz commented Feb 21, 2024

I am going to force merge this since Blake has requested changes but is on PTO. The change is now small enough that I don't expect he'd object.

@cmacknz cmacknz merged commit 65e2c30 into elastic:main Feb 21, 2024
9 checks passed
mergify bot pushed a commit that referenced this pull request Feb 21, 2024
…4172)

* move install file marker create, rename methods, change error message

* change error message

* change error message

* remove calls to InstallMarkerExists

* fix windows helper

* fix windows, again

* Revert "remove calls to InstallMarkerExists"

This reverts commit 400bb0d.

* change install file locaton, revert removal of filepath.Dir

* fix wording

* move IsUpgradeable, remove duplication

* add tests

* add mostly useless test

* linter

* tinkering with CI

* skip supervisor test on windows

* tinkering with the linter

* remove experiment

* refactor install to make adding tests easier

* linter

* add more tests...

* still fixing tests

* sonarQube

* fix windows tests, fight sonarqube

* tinker with file marker create

* move install path

* fix bad merge

* still cleaning up merge

* add tests

* add tests, docs

* fix tests

* don't check mode bits

* move command, fix permissions

* SonarQube...

* refactor dir setup to make Sonarqube happy

* add changelog

* Update changelog/fragments/1707951532-change-install-marker-creation.yaml

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>

---------

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
(cherry picked from commit 65e2c30)
mergify bot pushed a commit that referenced this pull request Feb 21, 2024
…4172)

* move install file marker create, rename methods, change error message

* change error message

* change error message

* remove calls to InstallMarkerExists

* fix windows helper

* fix windows, again

* Revert "remove calls to InstallMarkerExists"

This reverts commit 400bb0d.

* change install file locaton, revert removal of filepath.Dir

* fix wording

* move IsUpgradeable, remove duplication

* add tests

* add mostly useless test

* linter

* tinkering with CI

* skip supervisor test on windows

* tinkering with the linter

* remove experiment

* refactor install to make adding tests easier

* linter

* add more tests...

* still fixing tests

* sonarQube

* fix windows tests, fight sonarqube

* tinker with file marker create

* move install path

* fix bad merge

* still cleaning up merge

* add tests

* add tests, docs

* fix tests

* don't check mode bits

* move command, fix permissions

* SonarQube...

* refactor dir setup to make Sonarqube happy

* add changelog

* Update changelog/fragments/1707951532-change-install-marker-creation.yaml

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>

---------

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
(cherry picked from commit 65e2c30)

# Conflicts:
#	internal/pkg/agent/install/install.go
#	internal/pkg/agent/install/install_test.go
fearful-symmetry added a commit that referenced this pull request Feb 22, 2024
…4172) (#4307)

* move install file marker create, rename methods, change error message

* change error message

* change error message

* remove calls to InstallMarkerExists

* fix windows helper

* fix windows, again

* Revert "remove calls to InstallMarkerExists"

This reverts commit 400bb0d.

* change install file locaton, revert removal of filepath.Dir

* fix wording

* move IsUpgradeable, remove duplication

* add tests

* add mostly useless test

* linter

* tinkering with CI

* skip supervisor test on windows

* tinkering with the linter

* remove experiment

* refactor install to make adding tests easier

* linter

* add more tests...

* still fixing tests

* sonarQube

* fix windows tests, fight sonarqube

* tinker with file marker create

* move install path

* fix bad merge

* still cleaning up merge

* add tests

* add tests, docs

* fix tests

* don't check mode bits

* move command, fix permissions

* SonarQube...

* refactor dir setup to make Sonarqube happy

* add changelog

* Update changelog/fragments/1707951532-change-install-marker-creation.yaml

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>

---------

Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
(cherry picked from commit 65e2c30)

Co-authored-by: Alex K <8418476+fearful-symmetry@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-v8.12.0 Automated backport with mergify backport-v8.13.0 Automated backport with mergify Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing .installed file prevents uninstall or upgrade
7 participants