[Failed-Request-Alert-Tuning] Investigate and Mitigate Excessive failed-request-too-high Alerts in the Alerting System #3344

quiet-node · 2024-12-20T02:06:59Z

Problem

The current Alerting System on Grafana is generating an excessive number of failed-request-too-high alerts. To keep the alerts reliable and actionable, we need to investigate the root cause of the elevated failure rates and assess whether the alerting thresholds or mechanisms need refinement.

Initial analysis identifies the eth_getBlockByHash and eth_getBlockByNumber endpoints as the primary drivers of the issue; they contribute significantly to the recurring errors observed in the system.

Solution

The proposed solution is to analyze the logs to determine the root cause of the failed requests and identify the specific response codes being returned, then locate and fix the bugs causing the failures. This should reduce the volume of noise alerts so that the alerting system stays focused on critical issues.
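As a rough illustration of the log analysis step, the sketch below tallies failed requests per method and response code. The log line shape (`method=... status=...`), the `tally_failures` helper, and the sample lines are all assumptions made up for this example; they are not the relay's actual log format or real data.

```python
import re
from collections import Counter

# Hypothetical log line shape (an assumption, not the real log format):
#   "<timestamp> method=<rpc method> status=<http status>"
LINE_RE = re.compile(r"method=(?P<method>\w+)\s+status=(?P<status>\d{3})")


def tally_failures(lines):
    """Count failed requests (HTTP status >= 400) per (method, status) pair."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that do not describe a request
        status = int(match.group("status"))
        if status >= 400:
            counts[(match.group("method"), status)] += 1
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOGS = [
    "2024-12-20T02:00:01Z method=eth_getBlockByHash status=500",
    "2024-12-20T02:00:02Z method=eth_getBlockByNumber status=500",
    "2024-12-20T02:00:03Z method=eth_getBlockByHash status=500",
    "2024-12-20T02:00:04Z method=eth_chainId status=200",
    "server started",
]

if __name__ == "__main__":
    # Print the noisiest (method, status) pairs first.
    for (method, status), count in tally_failures(SAMPLE_LOGS).most_common():
        print(f"{method} {status}: {count}")
```

Grouping by both method and status code keeps the two questions from the plan separate: which endpoints fail most often, and which response codes they return.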

Alternatives

No response

Tasks

[Failed-Request-Alert-Tuning] Traverse through logs to identify and document all issues contributing to the "failed-request-too-high" alerts.
[Failed-Request-Alert-Tuning] eth_getBlockByHash && eth_getBlockByNumber method returns too many 500 errors #3345 (bug)
[Failed-Request-Alert-Tuning] eth_getTransactionReceipt method returns too many 500 errors #3351 (bug)
[Failed-Request-Alert-Tuning] Enhanced retry mechanism for MN contract results to poll until records are fully mature #3366 (enhancement)
[Failed-Request-Alert-Tuning] Assign Appropriate HTTP Status Code for Immature Records error #3392 (internal)

quiet-node added the bug and Epic labels Dec 20, 2024

quiet-node added this to the 0.63.0 milestone Dec 20, 2024

quiet-node self-assigned this Dec 20, 2024

quiet-node added this to Smart Contract Sprint Board Dec 20, 2024

github-project-automation bot moved this to Backlog in Smart Contract Sprint Board Dec 20, 2024

quiet-node moved this from Backlog to Epics In Progress in Smart Contract Sprint Board Dec 20, 2024

quiet-node changed the title ~~Investigate and Mitigate Excessive failed-request-too-high Alerts in the Alerting System~~ [Failed-Request-Alert-Tuning] Investigate and Mitigate Excessive failed-request-too-high Alerts in the Alerting System Dec 20, 2024

quiet-node modified the milestones: 0.63.0, 0.63.1 Dec 26, 2024

quiet-node moved this from Epics In Progress to In Review in Smart Contract Sprint Board Jan 6, 2025

quiet-node moved this from In Review to Epics In Progress in Smart Contract Sprint Board Jan 6, 2025

quiet-node closed this as completed Jan 9, 2025

github-project-automation bot moved this from Epics In Progress to Done in Smart Contract Sprint Board Jan 9, 2025

quiet-node reopened this Jan 9, 2025

github-project-automation bot moved this from Done to Sprint Backlog in Smart Contract Sprint Board Jan 9, 2025

quiet-node moved this from Sprint Backlog to Epics In Progress in Smart Contract Sprint Board Jan 10, 2025

quiet-node closed this as completed Jan 10, 2025

github-project-automation bot moved this from Epics In Progress to Done in Smart Contract Sprint Board Jan 10, 2025

quiet-node reopened this Jan 15, 2025

github-project-automation bot moved this from Done to Sprint Backlog in Smart Contract Sprint Board Jan 15, 2025

quiet-node moved this from Sprint Backlog to Epics In Progress in Smart Contract Sprint Board Jan 15, 2025

quiet-node closed this as completed Jan 24, 2025

github-project-automation bot moved this from Epics In Progress to Done in Smart Contract Sprint Board Jan 24, 2025
