[DCJ-284] Increase tools pool datarepo_v1 from 1500->2000 #293
https://broadworkbench.atlassian.net/browse/DCJ-284
Background
TDR uses the datarepo_v1 pool in RBS' tools namespace in our integration tests. We periodically run into pool exhaustion when many test runs execute concurrently. The configuration change we made to cancel earlier active test runs on a PR helps manage this load, but isn't enough given the high volume of developer activity on TDR these days.

RBS logs showed a spike in related errors yesterday, aligning with many test runs that ultimately failed (we didn't yet have pool availability metrics exposed in Grafana): https://cloudlogging.app.goo.gl/Lvwz3ytmAb1Vwx3T8
Changes
Increase the datarepo_v1 pool in RBS' tools namespace from 1500 -> 2000.

Here's a view of the metrics now being exposed. From a few concurrent PR test runs this afternoon, the pool was depleted to a low point of 58%.
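
For a rough sense of what that low point means in absolute terms, here is a minimal sketch. It assumes the availability metric is simply ready resources divided by configured pool size, which this PR does not confirm, and it evaluates both pool sizes because the description does not say which size was in effect when 58% was observed. The function name `ready_resources` is hypothetical.

```python
def ready_resources(pool_size: int, available_fraction: float) -> int:
    """Approximate number of ready resources at a given availability fraction.

    Assumption: availability = ready resources / configured pool size.
    """
    return round(pool_size * available_fraction)


low_point = 0.58  # lowest availability reported in the PR description

# Compare the old and new pool sizes at the same availability fraction.
for size in (1500, 2000):
    ready = ready_resources(size, low_point)
    print(f"size={size}: ~{ready} ready, ~{size - ready} in use or being replenished")
```

Under these assumptions, the same 58% reading corresponds to roughly 870 ready resources at a size of 1500 and roughly 1160 at a size of 2000.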
