Node: Don't hang on post to Loki #4267

@bruce-riley bruce-riley commented Feb 18, 2025

A Grafana outage apparently caused the guardian to hang while posting to Loki: the channel filled up, and the blocked send stalled the caller. This PR checks whether the channel is full before posting, and adds metrics that are incremented when a channel post succeeds or fails.
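
For illustration, the general pattern looks roughly like the sketch below. It is not the code from this PR: `logEntry`, `lokiQueue`, `newLokiQueue`, and `tryPost` are hypothetical names, and plain atomic counters stand in for whatever metrics the guardian actually exports.

```go
package main

import "sync/atomic"

// logEntry is a hypothetical stand-in for whatever gets queued for Loki.
type logEntry struct {
	Line string
}

// lokiQueue is a hypothetical wrapper around the buffered channel that
// feeds the goroutine posting to Loki. The counters are assumptions
// standing in for real metrics.
type lokiQueue struct {
	entries chan logEntry
	sent    atomic.Uint64 // entries accepted by the channel
	dropped atomic.Uint64 // entries discarded because the channel was full
}

// newLokiQueue creates a queue whose channel buffers up to capacity entries.
func newLokiQueue(capacity int) *lokiQueue {
	return &lokiQueue{entries: make(chan logEntry, capacity)}
}

// tryPost attempts a non-blocking send. If the channel is full, the
// default case runs immediately, so the caller never blocks; the entry
// is dropped and counted instead.
func (q *lokiQueue) tryPost(e logEntry) bool {
	select {
	case q.entries <- e:
		q.sent.Add(1)
		return true
	default:
		q.dropped.Add(1)
		return false
	}
}
```

The trade-off in this sketch is that entries are lost while the consumer is stalled, which matches the gap in Grafana noted in the testing below, but the code that emits logs keeps running.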

To test this PR, I configured an nginx proxy for Grafana logging as documented here and pointed my local guardian at it.

I then added test code that starts a goroutine to generate ten log messages per second. After verifying that those showed up in Grafana, I stopped the nginx proxy.

Before this fix, logging (even to the terminal) stopped. After this fix, logging to the terminal continued. (Obviously, there is a gap in Grafana.)

I also verified that logging to Grafana resumed a few minutes after restarting the proxy service, without needing to restart the guardian.

@bruce-riley bruce-riley marked this pull request as ready for review February 18, 2025 13:09
lrogana commented Feb 18, 2025

Looks good. We have tested this in scenarios where Loki is down.

@evan-gray evan-gray force-pushed the node_dont_hang_on_loki_post branch from 7b1ddcb to 0c0a4c3 Compare February 18, 2025 16:14
@bruce-riley bruce-riley merged commit 2ea519c into wormhole-foundation:main Feb 18, 2025
31 checks passed
@bruce-riley bruce-riley deleted the node_dont_hang_on_loki_post branch February 18, 2025 16:33