Problem
The Grafana alerting system is generating an excessive number of failed-request-too-high alerts. To keep the alerts reliable and actionable, we need to investigate the root cause of the elevated failure rates and assess whether the alerting thresholds or mechanisms need refinement.
Upon analysis, the eth_getBlockByHash and eth_getBlockByNumber endpoints have been identified as the primary drivers of the issue, contributing significantly to the recurring errors observed in the system.
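One way to confirm which methods drive the alert is to compute each method's failure rate alongside its share of all failures. The sketch below is only an illustration in Python: the counts are made-up placeholders rather than measured data, eth_chainId is included purely as a contrast case, and the `MethodStats` type and `failure_breakdown` helper are hypothetical names that do not come from the relay or from Grafana.

```python
from dataclasses import dataclass


@dataclass
class MethodStats:
    method: str
    total: int   # all requests seen in the window
    failed: int  # requests counted as failed in the window


def failure_breakdown(stats):
    """Return (method, failure_rate, share_of_all_failures), largest share first."""
    all_failed = sum(s.failed for s in stats)
    rows = []
    for s in stats:
        rate = s.failed / s.total if s.total else 0.0
        share = s.failed / all_failed if all_failed else 0.0
        rows.append((s.method, rate, share))
    return sorted(rows, key=lambda r: r[2], reverse=True)


# Placeholder counts for illustration only, not measured data.
sample = [
    MethodStats("eth_getBlockByHash", total=1000, failed=50),
    MethodStats("eth_getBlockByNumber", total=2000, failed=40),
    MethodStats("eth_chainId", total=5000, failed=10),
]

for method, rate, share in failure_breakdown(sample):
    print(f"{method}: failure rate {rate:.1%}, share of failures {share:.1%}")
```

Looking at both numbers helps separate a method that fails often from one that simply receives a lot of traffic, which matters when deciding whether the threshold or the code needs to change.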
Solution
The proposed solution is to analyze the logs to determine the root cause of the failed requests and identify the specific response codes being returned, then locate and fix the bug causing the failures. This should reduce the volume of white-noise alerts and keep the alerting system focused on critical issues.
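The log analysis could start with a tally of failed requests by method and response code. The following is a minimal sketch in Python, assuming a hypothetical log format in which each line carries `method=<name>` followed by `status=<code>`; the actual log layout is not described in this issue, so the regular expression and the sample lines are assumptions that would need adjusting.

```python
import re
from collections import Counter

# Assumed (hypothetical) log line shape: "... method=eth_getBlockByHash status=500 ..."
LINE_RE = re.compile(r"method=(?P<method>\w+)\s+status=(?P<status>\d{3})")


def tally_failures(lines, min_status=400):
    """Count (method, status) pairs for responses with status >= min_status."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not match the assumed format
        status = int(match.group("status"))
        if status >= min_status:
            counts[(match.group("method"), status)] += 1
    return counts


# Made-up sample lines for illustration only.
sample_lines = [
    "2024-12-20T10:00:00Z method=eth_getBlockByHash status=500",
    "2024-12-20T10:00:01Z method=eth_getBlockByNumber status=200",
    "2024-12-20T10:00:02Z method=eth_getBlockByNumber status=500",
    "2024-12-20T10:00:03Z method=eth_getBlockByHash status=500",
]

for (method, status), count in tally_failures(sample_lines).most_common():
    print(f"{method} {status}: {count}")
```

Grouping by both method and status code keeps client errors (4xx) distinguishable from server errors (5xx), so it is easier to see whether the alert is reacting to a genuine bug or to noise.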
Alternatives
No response
quiet-node changed the title from "Investigate and Mitigate Excessive failed-request-too-high Alerts in the Alerting System" to "[Failed-Request-Alert-Tuning] Investigate and Mitigate Excessive failed-request-too-high Alerts in the Alerting System" on Dec 20, 2024
Tasks
eth_getBlockByHash & eth_getBlockByNumber method returns too many 500 errors #3345