Add retry on S3 SlowDown exceptions #958

Merged 2 commits on Feb 4, 2025
Conversation

BenGalewsky
Contributor

Problem

When the Ceph object store is overwhelmed, it responds with SlowDown errors, which the MinIO client raises as S3Error exceptions:

Traceback (most recent call last):
  File "/servicex/transformer_sidecar/object_store_manager.py", line 62, in upload_file
    result = self.minio_client.fput_object(bucket_name=bucket,
  File "/usr/local/lib/python3.10/site-packages/minio/api.py", line 1051, in fput_object
    return self.put_object(
  File "/usr/local/lib/python3.10/site-packages/minio/api.py", line 1996, in put_object
    raise exc
  File "/usr/local/lib/python3.10/site-packages/minio/api.py", line 1947, in put_object
    upload_id = self._create_multipart_upload(
  File "/usr/local/lib/python3.10/site-packages/minio/api.py", line 1762, in _create_multipart_upload
    response = self._execute(
  File "/usr/local/lib/python3.10/site-packages/minio/api.py", line 441, in _execute
    return self._url_open(
  File "/usr/local/lib/python3.10/site-packages/minio/api.py", line 424, in _url_open
    raise response_error
minio.error.S3Error: S3 operation failed; code: SlowDown, message: None, resource: None, request_id: tx0000032ac6ddfe2121a06-0067943977-aeca260-af-object-store, host_id: aeca260-af-object-store-af-object-store

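We should retry when this happens. The overload often affects several transformers at once, so it is essential that we use exponential backoff with jitter; without jitter, the transformers would all retry at the same moments and overwhelm the object store again.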
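
For illustration, here is a minimal sketch of the retry strategy described above. It is not necessarily how this PR implements it. The helper names (`is_slow_down`, `retry_with_backoff`) and the defaults (`max_attempts`, `base_delay`, `max_delay`) are assumptions made for this example. The sketch relies only on the error exposing its S3 error code as a `code` attribute, as `minio.error.S3Error` does, so it can run without minio installed.

```python
import random
import time


def is_slow_down(exc: Exception) -> bool:
    # minio's S3Error exposes the S3 error code as `.code`; checking the
    # attribute by name keeps this sketch importable without minio.
    return getattr(exc, "code", None) == "SlowDown"


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep, rng=random.uniform):
    """Call operation(), retrying SlowDown errors with exponential
    backoff and full jitter. All parameter names are hypothetical."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception as exc:
            # Re-raise anything that is not a SlowDown, and give up
            # once the final attempt has failed.
            if not is_slow_down(exc) or attempt == max_attempts - 1:
                raise
            # Exponential ceiling: base_delay * 2^attempt, capped.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # Full jitter: wait a random time in [0, ceiling] so that
            # transformers hit by the same overload do not retry in
            # lockstep.
            sleep(rng(0, ceiling))
```

An upload could then be wrapped as `retry_with_backoff(lambda: client.fput_object(bucket, object_name, file_path))`, where `client`, `bucket`, `object_name`, and `file_path` are placeholders. The `sleep` and `rng` parameters exist only so the delays can be controlled in tests.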

BenGalewsky requested a review from ponyisi January 29, 2025 22:41
BenGalewsky merged commit 0df7123 into develop Feb 4, 2025
69 checks passed
BenGalewsky deleted the s3_retry branch February 4, 2025 14:28