This repository has been archived by the owner on Nov 12, 2024. It is now read-only.

INCIDENT 025 | Error in median selection precision #28

Open
Christian-MK opened this issue Apr 10, 2024 · 0 comments
Labels
ada-usd: Incident affected the ADA-USD feed
under-review: Incident has been under review since being logged and continues to be monitored

Trigger

  • ⬛ suspected malware infections
  • ⬛ access violations
  • ✔️ anomalous system behaviors
  • ⬛ human errors
  • ⬛ unauthorized access attempts

Date

2024-04-06

Summary

The Orcfax heartbeat at 0200 UTC on 6 April 2024 was missed because the collectors and the validator calculated the median to different levels of precision.

Status

Under Review

Assessment

Orcfax collectors are written in Golang, and the validator in Python. The team has discovered that, with an even number of collector values, the collector and the validator occasionally calculate the median to different precisions. With an even count, the median is the mean of the two middle values, so it can need more decimal places than any single input. When the two results do not match, median selection fails.
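
To illustrate the kind of mismatch described above, here is a minimal Python sketch. It is not the Orcfax implementation: the price values, the `median_fixed` helper, and the choice of four versus six decimal places are all assumptions made for illustration, and the report does not say which precisions the collector and validator actually used.

```python
from decimal import Decimal, ROUND_HALF_EVEN
from statistics import median

# Hypothetical ADA-USD prices from six sources (illustrative values only).
prices = ["0.5801", "0.5803", "0.5804", "0.5807", "0.5809", "0.5812"]


def median_fixed(values, places=6):
    """Return the median of decimal strings, rounded to a fixed number of places."""
    exact = median(Decimal(v) for v in values)
    step = Decimal(1).scaleb(-places)  # e.g. places=6 gives 0.000001
    return exact.quantize(step, rounding=ROUND_HALF_EVEN)


# Even count: the median is the mean of the two middle values (0.5804 and
# 0.5807), so it needs one more decimal place than any input.
exact_even = median(Decimal(v) for v in prices)

# Two components that keep different numbers of places disagree on the result.
four_places = median_fixed(prices, places=4)
six_places = median_fixed(prices, places=6)

# Odd count: the median is always one of the inputs, so no extra digits appear.
exact_odd = median(Decimal(v) for v in prices[:5])

print(exact_even, four_places, six_places, exact_odd)
```

If both sides parsed values as decimals and rounded to one agreed number of places with one agreed rounding mode, they would produce the same value. With an odd number of sources the median is one of the inputs, so the extra digit never appears.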

Additional Notes

For consumers, the 0200 UTC heartbeat was missed and the problem corrected itself at the next regular heartbeat.

At the time the incident was first noticed, the Orcfax team did not have enough information to determine the cause and to satisfactorily complete the incident report.

However, further observations since then have given the team a better understanding of the issue.

Technical improvements

We are investigating:

  1. More efficient methods for determining the level of precision in the validator node.
  2. Temporarily reducing the number of sources in collectors from six to five.
  3. Identifying a retry mechanism between the validator and coop that increases publishing reliability.
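
As one possible shape for the retry mechanism in item 3, the following Python sketch retries a failed publish with exponential backoff. Everything here is hypothetical: `publish_with_retry`, its parameters, and the `publish` callable are illustrative names, and the report does not describe how the validator and coop actually communicate.

```python
import time


def publish_with_retry(publish, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call publish() until it succeeds or the attempts run out.

    publish is a hypothetical zero-argument callable that raises on failure.
    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return publish()
        except Exception as err:  # broad on purpose in this sketch
            last_error = err
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed: surface the last error to the caller.
    raise last_error
```

The `sleep` parameter is injectable so the backoff schedule can be tested without waiting. A real deployment would likely catch only transient error types and cap the total wait so a retry cannot overlap the next heartbeat.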

Documentation improvements

  1. Updating the documentation for DevOps teams diagnosing validation issues.

Christian-MK added the under-review and ada-usd labels on Apr 10, 2024.