Fickle API connection to S2 catalog that errors with RuntimeError: not recognized as a supported file format. #192
Comments
I'm trying to find a reproducible example, but I can't fully track it down yet. This is another kind of traceback I got with a similar snippet as above:
2023-02-23 11:35:15,800 - distributed.worker - WARNING - Compute Failed
Key: ('asset_table_to_reader_and_window-fetch_raster_window-e1a79444dab8d9e2e7bc05b6293d1778', 0, 3, 0, 0)
Function: execute_task
args: ((subgraph_callable-35fd6af9-538f-4445-bc46-422dd5de4ead, (subgraph_callable-c44a256d-f930-43a0-86a9-5c78b0353ac1, array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B08_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D', [ 300000., 6190240., 409800., 6300040.])]],
dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), RasterSpec(epsg=32756, bounds=(341928.23765310494, 6264827.991079295, 344662.6210193617, 6270757.102609981), resolutions_xy=(10.0, 10.0)), <Resampling.bilinear: 1>, dtype('float64'), nan, False, None, (<class 'tuple'>, [RasterioIOError('H
kwargs: {}
Exception: 'RuntimeError("Error opening \'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B08_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\': RasterioIOError(\'503: Recv failure: Connection reset by peer\')")'
2023-02-23 11:35:15,801 - distributed.worker - WARNING - Compute Failed
Key: ('asset_table_to_reader_and_window-fetch_raster_window-e1a79444dab8d9e2e7bc05b6293d1778', 0, 0, 0, 0)
Function: execute_task
args: ((subgraph_callable-35fd6af9-538f-4445-bc46-422dd5de4ead, (subgraph_callable-c44a256d-f930-43a0-86a9-5c78b0353ac1, array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B02_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D', [ 300000., 6190240., 409800., 6300040.])]],
dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), RasterSpec(epsg=32756, bounds=(341928.23765310494, 6264827.991079295, 344662.6210193617, 6270757.102609981), resolutions_xy=(10.0, 10.0)), <Resampling.bilinear: 1>, dtype('float64'), nan, False, None, (<class 'tuple'>, [RasterioIOError('H
kwargs: {}
Exception: 'RuntimeError(\'Error opening \\\'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B02_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\\\': RasterioIOError("\\\'/vsicurl/https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B02_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\\\' not recognized as a supported file format.")\')'
2023-02-23 11:35:15,811 - distributed.worker - WARNING - Compute Failed
Key: ('asset_table_to_reader_and_window-fetch_raster_window-e1a79444dab8d9e2e7bc05b6293d1778', 0, 1, 0, 0)
Function: execute_task
args: ((subgraph_callable-35fd6af9-538f-4445-bc46-422dd5de4ead, (subgraph_callable-c44a256d-f930-43a0-86a9-5c78b0353ac1, array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B03_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D', [ 300000., 6190240., 409800., 6300040.])]],
dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), RasterSpec(epsg=32756, bounds=(341928.23765310494, 6264827.991079295, 344662.6210193617, 6270757.102609981), resolutions_xy=(10.0, 10.0)), <Resampling.bilinear: 1>, dtype('float64'), nan, False, None, (<class 'tuple'>, [RasterioIOError('H
kwargs: {}
Exception: 'RuntimeError(\'Error opening \\\'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B03_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\\\': RasterioIOError("\\\'/vsicurl/https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B03_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\\\' not recognized as a supported file format.")\')'
2023-02-23 11:35:15,858 - distributed.worker - WARNING - Compute Failed
Key: ('asset_table_to_reader_and_window-fetch_raster_window-ce1b7bca382e9de462603fa6630a18a9', 0, 5, 0, 0)
Function: execute_task
args: ((subgraph_callable-724a8816-772a-4cd1-9dc5-2c737b6b9592, (subgraph_callable-5613dc44-6692-46dd-b3a3-711328c5dc8c, array([[('https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/11/20/S2B_MSIL2A_20221120T000219_N0400_R030_T56HLH_20221120T090830.SAFE/GRANULE/L2A_T56HLH_A029800_20221120T000221/IMG_DATA/R20m/T56HLH_20221120T000219_SCL_20m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D', [ 300000., 6190240., 409800., 6300040.])]],
dtype=[('url', 'O'), ('bounds', '<f8', (4,))]), RasterSpec(epsg=32756, bounds=(341928.23765310494, 6264827.991079295, 344662.6210193617, 6270757.102609981), resolutions_xy=(10.0, 10.0)), <Resampling.bilinear: 1>, dtype('float64'), nan, False, None, (<class 'tuple'>, [RasterioIOError('H
kwargs: {}
Exception: 'RuntimeError(\'Error opening \\\'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/11/20/S2B_MSIL2A_20221120T000219_N0400_R030_T56HLH_20221120T090830.SAFE/GRANULE/L2A_T56HLH_A029800_20221120T000221/IMG_DATA/R20m/T56HLH_20221120T000219_SCL_20m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\\\': RasterioIOError("\\\'/vsicurl/https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/11/20/S2B_MSIL2A_20221120T000219_N0400_R030_T56HLH_20221120T090830.SAFE/GRANULE/L2A_T56HLH_A029800_20221120T000221/IMG_DATA/R20m/T56HLH_20221120T000219_SCL_20m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D\\\' not recognized as a supported file format.")\')'
---------------------------------------------------------------------------
RasterioIOError Traceback (most recent call last)
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/stackstac/rio_reader.py:326, in _open()
325 try:
--> 326 ds = SelfCleaningDatasetReader(
327 self.url, sharing=False
328 )
329 except Exception as e:
File rasterio/_base.pyx:309, in rasterio._base.DatasetBase.__init__()
RasterioIOError: 503: Recv failure: Connection reset by peer
The above exception was the direct cause of the following exception:
RuntimeError Traceback (most recent call last)
Cell In[67], line 1
----> 1 da = dss.compute()
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/xarray/core/dataset.py:912, in Dataset.compute(self, **kwargs)
893 """Manually trigger loading and/or computation of this dataset's data
894 from disk or a remote source into memory and return a new dataset.
895 Unlike load, the original dataset is left unaltered.
(...)
909 dask.compute
910 """
911 new = self.copy(deep=False)
--> 912 return new.load(**kwargs)
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/xarray/core/dataset.py:746, in Dataset.load(self, **kwargs)
743 import dask.array as da
745 # evaluate all the dask arrays simultaneously
--> 746 evaluated_data = da.compute(*lazy_data.values(), **kwargs)
748 for k, data in zip(lazy_data, evaluated_data):
749 self.variables[k].data = data
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/dask/base.py:599, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
596 keys.append(x.__dask_keys__())
597 postcomputes.append(x.__dask_postcompute__())
--> 599 results = schedule(dsk, keys, **kwargs)
600 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/distributed/client.py:3137, in Client.get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
3135 should_rejoin = False
3136 try:
-> 3137 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
3138 finally:
3139 for f in futures.values():
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/distributed/client.py:2306, in Client.gather(self, futures, errors, direct, asynchronous)
2304 else:
2305 local_worker = None
-> 2306 return self.sync(
2307 self._gather,
2308 futures,
2309 errors=errors,
2310 direct=direct,
2311 local_worker=local_worker,
2312 asynchronous=asynchronous,
2313 )
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/distributed/utils.py:338, in SyncMethodMixin.sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
336 return future
337 else:
--> 338 return sync(
339 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
340 )
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/distributed/utils.py:405, in sync(loop, func, callback_timeout, *args, **kwargs)
403 if error:
404 typ, exc, tb = error
--> 405 raise exc.with_traceback(tb)
406 else:
407 return result
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/distributed/utils.py:378, in sync.<locals>.f()
376 future = asyncio.wait_for(future, callback_timeout)
377 future = asyncio.ensure_future(future)
--> 378 result = yield future
379 except Exception:
380 error = sys.exc_info()
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/tornado/gen.py:769, in Runner.run(self)
766 exc_info = None
768 try:
--> 769 value = future.result()
770 except Exception:
771 exc_info = sys.exc_info()
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/distributed/client.py:2169, in Client._gather(self, futures, errors, direct, local_worker)
2167 exc = CancelledError(key)
2168 else:
-> 2169 raise exception.with_traceback(traceback)
2170 raise exc
2171 if errors == "skip":
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/dask/optimization.py:990, in __call__()
988 if not len(args) == len(self.inkeys):
989 raise ValueError("Expected %d args, got %d" % (len(self.inkeys), len(args)))
--> 990 return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/dask/core.py:149, in get()
147 for key in toposort(dsk):
148 task = dsk[key]
--> 149 result = _execute_task(task, cache)
150 cache[key] = result
151 result = _execute_task(out, cache)
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/dask/core.py:119, in _execute_task()
115 func, args = arg[0], arg[1:]
116 # Note: Don't assign the subtask results to a variable. numpy detects
117 # temporaries by their reference count and can execute certain
118 # operations in-place.
--> 119 return func(*(_execute_task(a, cache) for a in args))
120 elif not ishashable(arg):
121 return arg
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/stackstac/to_dask.py:185, in fetch_raster_window()
178 # Only read if the window we're fetching actually overlaps with the asset
179 if windows.intersect(current_window, asset_window):
180 # NOTE: when there are multiple assets, we _could_ parallelize these reads with our own threadpool.
181 # However, that would probably increase memory usage, since the internal, thread-local GDAL datasets
182 # would end up copied to even more threads.
183
184 # TODO when the Reader won't be rescaling, support passing `output` to avoid the copy?
--> 185 data = reader.read(current_window)
187 if all_empty:
188 # Turn `output` from a broadcast-trick array to a real array, so it's writeable
189 if (
190 np.isnan(data)
191 if np.isnan(fill_value)
192 else np.equal(data, fill_value)
193 ).all():
194 # Unless the data we just read is all empty anyway
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/stackstac/rio_reader.py:385, in read()
384 def read(self, window: Window, **kwargs) -> np.ndarray:
--> 385 reader = self.dataset
386 try:
387 result = reader.read(
388 window=window,
389 masked=True,
(...)
392 **kwargs,
393 )
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/stackstac/rio_reader.py:381, in dataset()
379 with self._dataset_lock:
380 if self._dataset is None:
--> 381 self._dataset = self._open()
382 return self._dataset
File ~/mambaforge/envs/coastal/lib/python3.10/site-packages/stackstac/rio_reader.py:337, in _open()
332 warnings.warn(msg)
333 return NodataReader(
334 dtype=self.dtype, fill_value=self.fill_value
335 )
--> 337 raise RuntimeError(msg) from e
338 if ds.count != 1:
339 ds.close()
RuntimeError: Error opening 'https://sentinel2l2a01.blob.core.windows.net/sentinel2-l2/56/H/LH/2022/08/22/S2B_MSIL2A_20220822T000219_N0400_R030_T56HLH_20220822T162835.SAFE/GRANULE/L2A_T56HLH_A028513_20220822T000222/IMG_DATA/R10m/T56HLH_20220822T000219_B08_10m.tif?st=2023-02-22T10%3A13%3A55Z&se=2023-02-24T10%3A13%3A55Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-02-23T07%3A47%3A01Z&ske=2023-03-02T07%3A47%3A01Z&sks=b&skv=2021-06-08&sig=SH8dBq%2BwzHlm1LDwMrh4B0s7GRVysIyGVk%2B19y/hRRk%3D': RasterioIOError('503: Recv failure: Connection reset by peer')
Just following up... running the same
xref gjoseph92/stackstac#18 and #11 (comment), where this came up before. I think the Storage Account is responding with a 503 error. GDAL / stackstac should retry on these. Ideally you wouldn't need to worry about it, but for now you might need to do a bit of manual work to make sure things are configured properly (maybe setting https://stackstac.readthedocs.io/en/latest/api/main/stackstac.stack.html#stackstac.stack.params.gdal_env, and maybe some environment variables). When I have a chance, I'll look into setting those environment variables on the Hub by default. In the meantime, if you aren't using the Hub, you'll need to set them yourself.
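For anyone landing here, a minimal sketch of what "setting some environment variables" could look like. This is not necessarily the configuration meant above: `GDAL_HTTP_MAX_RETRY` and `GDAL_HTTP_RETRY_DELAY` are GDAL's HTTP retry options, but the values and the helper name `set_gdal_retry_env` are assumptions chosen for illustration.

```python
import os

# Illustrative values only (assumptions); tune them for your workload.
GDAL_RETRY_OPTIONS = {
    "GDAL_HTTP_MAX_RETRY": "5",    # retry attempts on retryable HTTP errors such as 503
    "GDAL_HTTP_RETRY_DELAY": "1",  # delay in seconds between retry attempts
}


def set_gdal_retry_env(options=GDAL_RETRY_OPTIONS):
    """Export GDAL HTTP retry options as environment variables in this process.

    Hypothetical helper: it must run before GDAL opens any dataset, and it
    only affects the process it is called in.
    """
    os.environ.update(options)
    # Return what was applied so callers can log or check it.
    return {key: os.environ[key] for key in options}


applied = set_gdal_retry_env()
print(applied)
```

With a distributed cluster, each worker is its own process, so the variables have to be set there too. Assuming `client` is a `dask.distributed.Client`, something like `client.run(set_gdal_retry_env)` should run the helper on every worker.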
@TomAugspurger, thank you for the suggestion. Just for completeness, the errors were raised both on the Hub and in my local env. Do you know what may have caused this change in behaviour between my previous use of MPC and my current use? Could it be a daily/weekly/monthly quota issue? The requests I make are not necessarily large(r), but they are maybe slightly more frequent, so the total volume increases. On the Hub the following GDAL variables are set:
I now set
The errors you're seeing come from the storage account itself, which has a global limit shared between all users. It just so happened that you requested data when the storage account was near its limit (serving requests to you and other users), and you got some 500 errors. Most of the time the storage account isn't anywhere near its limit, so you don't need to worry about the retries. But it's safer to have them in place just in case.
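Since these failures are transient, a client-side retry around the whole computation is another option. The sketch below is a generic retry-with-exponential-backoff helper, not part of stackstac or Dask; the name `retry_with_backoff` and its defaults are assumptions. It retries on `RuntimeError` because that is the exception type shown in the tracebacks in this thread.

```python
import time


def retry_with_backoff(func, attempts=4, base_delay=1.0,
                       retry_on=(RuntimeError,), sleep=time.sleep):
    """Call func(); on a listed exception, wait and try again.

    The wait doubles after each failure: base_delay, 2 * base_delay, ...
    The last failure is re-raised once `attempts` calls have been made.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Small demonstration with a function that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("503: Recv failure: Connection reset by peer")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

In the setting of this issue, the call would look like `da = retry_with_backoff(dss.compute)`. Note that this reruns the whole computation on failure, so GDAL-level retries are likely the cheaper first line of defence.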
Since yesterday (2023-02-22) 21:00 CET I have a very unstable connection when loading data from the S2 SR catalog.
A query like the one below often errors with
RuntimeError: not recognized as a supported file format.
(see details). Usually I run these on a Dask client.