Clear agent.upgrade_attempts on upgrade complete #4528

Open
wants to merge 5 commits into
base: main

jillguyonnet
Contributor

@jillguyonnet jillguyonnet commented Feb 28, 2025

What is the problem this PR solves?

elastic/kibana#212744 adds retry logic to the task that automatically upgrades agents. Agents upgraded through this task have their new upgrade_attempts property populated, but there is currently no way to clear this property when the upgrade completes.

How does this PR solve the problem?

The change in this PR clears upgrade_attempts when the upgrade is complete.

How to test this PR locally

This should be tested alongside elastic/kibana#212744 (or after it is merged - this is fine, since automatic upgrades are currently behind the enableAutomaticAgentUpgrades feature flag). With this change, agents upgraded through the automatic upgrade task should have their upgrade_attempts property set to null.

Design Checklist

  • I have ensured my design is stateless and will work when multiple fleet-server instances are behind a load balancer.
  • I have or intend to scale test my changes, ensuring it will work reliably with 100K+ agents connected.
  • I have included fail safe mechanisms to limit the load on fleet-server: rate limiting, circuit breakers, caching, load shedding, etc.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool

Related issues

Relates https://github.com/elastic/ingest-dev/issues/4720

@jillguyonnet jillguyonnet added the enhancement New feature or request label Feb 28, 2025
@jillguyonnet jillguyonnet self-assigned this Feb 28, 2025
Contributor

mergify bot commented Feb 28, 2025

This pull request does not have a backport label. Could you fix it @jillguyonnet? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@jillguyonnet jillguyonnet added the backport-skip Skip notification from the automated backport with mergify label Feb 28, 2025
@@ -219,7 +219,7 @@ func (ct *CheckinT) validateRequest(zlog zerolog.Logger, w http.ResponseWriter,
 	}

 	wTime := pollDuration + time.Minute
-	rc := http.NewResponseController(w) //nolint:bodyclose // we are working with a ResponseWriter not a Respons
+	rc := http.NewResponseController(w)
Contributor Author

This caused the nolintlint linter to error:

Error: internal/pkg/api/handleCheckin.go:222:39: directive `//nolint:bodyclose // we are working with a ResponseWriter not a Respons` is unused for linter "bodyclose" (nolintlint)

I installed golangci-lint (version 1.64.5) locally and removing this nolint directive works, so that seemed like a better option than also silencing nolintlint.

By the way, golangci-lint run yields a lot of errors across the codebase, most of them errcheck. It seems the CI only lints modified files.

@cmacknz cmacknz added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Feb 28, 2025
Member

@pchila pchila left a comment

Code looks sensible. Holding off on approval until we have a green CI run: it seems we are getting some errors from ECH when creating a stack, and 503s when trying to clean it up (not sure if that's related to the failed creation).

Contributor

@michel-laterman michel-laterman left a comment

lgtm, merge when CI is green

@cmacknz
Member

cmacknz commented Mar 3, 2025

There is a test you should update to check that this field is properly reset:

	name:    "agent has details checkin details are nil",
	agent:   &model.Agent{ESDocument: esd, Agent: &model.AgentMetadata{ID: "test-agent"}, UpgradeDetails: &model.UpgradeDetails{}},
	details: nil,
	bulk: func() *ftesting.MockBulk {
		mBulk := ftesting.NewMockBulk()
		mBulk.On("Update", mock.Anything, dl.FleetAgents, "doc-ID", mock.MatchedBy(func(p []byte) bool {
			doc := struct {
				Doc map[string]interface{} `json:"doc"`
			}{}
			if err := json.Unmarshal(p, &doc); err != nil {
				t.Logf("bulk match unmarshal error: %v", err)
				return false
			}
			return doc.Doc[dl.FieldUpgradeDetails] == nil &&
				doc.Doc[dl.FieldUpgradeStartedAt] == nil &&
				doc.Doc[dl.FieldUpgradeStatus] == nil &&
				doc.Doc[dl.FieldUpgradedAt] != ""
		}), mock.Anything, mock.Anything).Return(nil)
		return mBulk
	},
	cache: func() *testcache.MockCache {
		return testcache.NewMockCache()
	},
	err: nil,
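
One possible way to extend that matcher for the new field is sketched below as a standalone predicate. The name clearsUpgradeAttempts and the constant upgradeAttemptsKey are hypothetical, not fleet-server identifiers; the real test would presumably use a dl field constant. Note that a bare `doc.Doc[key] == nil` comparison is also true when the key is missing from the payload, so this sketch uses the two-value map lookup to require that the key was actually sent as null.

```go
package main

import "encoding/json"

// upgradeAttemptsKey is an assumed field name for this sketch.
const upgradeAttemptsKey = "upgrade_attempts"

// clearsUpgradeAttempts (hypothetical helper) reports whether the update
// payload p explicitly sets upgrade_attempts to null.
func clearsUpgradeAttempts(p []byte) bool {
	doc := struct {
		Doc map[string]interface{} `json:"doc"`
	}{}
	if err := json.Unmarshal(p, &doc); err != nil {
		return false
	}
	// ok distinguishes "key present with null value" from "key absent";
	// both cases would compare equal to nil with a plain index expression.
	v, ok := doc.Doc[upgradeAttemptsKey]
	return ok && v == nil
}
```

A predicate of this shape could be combined with the existing conditions inside the mock.MatchedBy callback.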

Labels
backport-skip Skip notification from the automated backport with mergify enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
4 participants