Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fleet config change logger #4050

Merged
merged 20 commits into from
Jan 25, 2024

Conversation

fearful-symmetry
Copy link
Contributor

@fearful-symmetry fearful-symmetry commented Jan 9, 2024

closes #3406

What does this PR do?

This is a basic implementation of #3406. When triggered, it prints a change detailing added and removed units:

{"log.level":"info","@timestamp":"2024-01-08T23:52:56.107Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":1297},"message":"The following components have been added: [udp-default journald-default]","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}

Right now this PR is fairly basic and does what we want it to do, so I figured I'd ask if there's anything else we think we should do here. We could drill down even farther, and diff individual streams in an input, etc. The downside here is that diffing an array is kind of complex, so we might be incurring a large cost for complex configs.

Example log lines:

{"log.level":"info","@timestamp":"2024-01-17T22:08:16.612Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":1382},"message":"component model updated","log":{"source":"elastic-agent"},"changes":{"components":{"added":["udp-default","journald-default"],"updated":["http/metrics-monitoring: [(http/metrics-monitoring-metrics-monitoring-agent: updated)]","log-default: [(log-default-logfile-system-9bf446fc-58d4-4767-b42d-3450815d5d3d: updated)]","system/metrics-default: [(system/metrics-default-system/metrics-system-9bf446fc-58d4-4767-b42d-3450815d5d3d: updated)]","beat/metrics-monitoring: [(beat/metrics-monitoring-metrics-monitoring-beats: updated)]"],"count":7},"outputs":{}},"ecs.version":"1.6.0"}

{"log.level":"info","@timestamp":"2024-01-17T22:08:18.558Z","log.origin":{"file.name":"coordinator/coordinator.go","file.line":1382},"message":"component model updated","log":{"source":"elastic-agent"},"changes":{"components":{"updated":["beat/metrics-monitoring: [(beat/metrics-monitoring-metrics-monitoring-beats: updated)]","http/metrics-monitoring: [(http/metrics-monitoring-metrics-monitoring-agent: updated)]"],"count":7},"outputs":{}},"ecs.version":"1.6.0"}

Why is it important?

We need debug data showing us what config has changed before a config is applied.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

@fearful-symmetry fearful-symmetry added the Team:Elastic-Agent Label for the Agent team label Jan 9, 2024
@fearful-symmetry fearful-symmetry self-assigned this Jan 9, 2024
@fearful-symmetry fearful-symmetry requested a review from a team as a code owner January 9, 2024 18:45
@elasticmachine
Copy link
Contributor

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

Copy link
Contributor

mergify bot commented Jan 9, 2024

This pull request does not have a backport label. Could you fix it @fearful-symmetry? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip label Jan 9, 2024
Comment on lines 1226 to 1228
if c.logger.IsDebug() {
c.checkAndLogUpdate()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this is valuable even when not in debug mode, policy changes aren't that frequent so the actual amortized cost of this should be low.

I'd vote we turn this on by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, seems useful, but I was kinda worried about overhead, particularly if we start adding info on component updates. Should probably run some benchmarks and see what it looks like.

@fearful-symmetry
Copy link
Contributor Author

Alright, completely refactored this to properly print updated component/units, and also clean it up.

There's still a few more tests to add, but other than that, should be good.

Copy link
Member

@AndersonQ AndersonQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just a small blocker

Comment on lines 1247 to 1252
type compCheck struct {
inNew bool
inLast bool

diffUnits map[string]diffCheck
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Nit]
Whys is compCheck defined only here, but diffCheck in the package level? I think it'd be more clear if they both were defined together at the package level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yah, that was done entirely because compCheck isn't used outside of one method, and diffCheck is. I didn't want to clutter up coordinator.go.

Comment on lines +1261 to +1266
if outDiff, ok := outDiffMap[outName]; ok {
outDiff.inLast = true
outDiffMap[outName] = outDiff
} else {
outDiffMap[outName] = diffCheck{inLast: true}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you checking if it exists to "append" the inLast: true without changing the other values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yah, the logic is to update if it exists, add otherwise.

@fearful-symmetry
Copy link
Contributor Author

test failed on windows, but not for the reasons expected?

 testing.go:1225: TempDir RemoveAll cleanup: remove C:\Users\buildkite\AppData\Local\Temp\TestComponentUpdateDiff2228549831\001\testlog-20240116.ndjson: The process cannot access the file because it is being used by another process.

Copy link
Member

@cmacknz cmacknz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you post a few sample logs that this outputs?

@AndersonQ AndersonQ dismissed their stale review January 24, 2024 17:29

I'll be on PTO for tne next 3 work days

@AndersonQ
Copy link
Member

@cmacknz @fearful-symmetry I'll be on PTO until next Tue, so feel free to proceed without my review. I've already removed my "request changes"

Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, has nice test coverage!

Copy link

Quality Gate passed Quality Gate passed

The SonarQube Quality Gate passed, but some issues were introduced.

1 New issue
0 Security Hotspots
98.2% 98.2% Coverage on New Code
0.0% 0.0% Duplication on New Code

See analysis details on SonarQube

@fearful-symmetry fearful-symmetry merged commit 65943b4 into elastic:main Jan 25, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip Team:Elastic-Agent Label for the Agent team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Log a summary of each policy change received from Fleet
5 participants