STAC API Data/Rate Limits #246
Are there any data or rate limits when using or accessing data from the STAC API? I want to access S2/Landsat data for several countries going back a couple of years, but given the amount of imagery required, I wanted to make sure I understand what limits I might hit so I can plan how to approach the problem appropriately.
Replies: 1 comment
Hi @andreweryan, thanks for checking in about this. There are a few considerations at each step of the process, and I'll try to enumerate them here:
1. STAC API / Metadata searches
We don't have rate limiting per se on the STAC search endpoints, but it is a shared resource, and an individual's throughput and response times are sensitive to the overall load on the system at that time. The general advice here is to make requests from within a system that can manage overall concurrency, so you can tune your search volume accordingly. Additionally, it's best to use a retry-with-backoff strategy for 503/504 response codes, which the API will return under periods of high load.
Some searches are faster than others. The Planetary Computer uses the pgstac project, which is optimized heavily for spatial and temporal access patterns. Defining as narrow a datetime boundary as possible per query will likely improve response times, as will using simpler geometries for a spatial filter. The API supports multi-collection searches, but single-collection searches will generally perform better.
The returned STAC Items can be quite verbose. If only a few attributes are required on the results, consider using the Fields Extension to reduce the response size by only returning what you need.
If your workload is going to make a high volume of API searches, you may consider using our alternate mechanism for high-volume STAC metadata access. All of our STAC Items are also available as GeoParquet files, and some considerations and examples for using that method are described in this notebook.
2. Asset signing / SAS tokens
You'll need to sign the asset URLs that are returned from your searches to access them. This API does have rate limiting enabled, and the limits are described in the documentation. The short version is: a) register for an API key and include it with requests to the signing URL, and b) make requests from within the Azure West Europe region for the highest rates.
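Since the signing API is rate limited, one way to stay under the limits is to avoid asking for a new token while an earlier one is still valid. Here is a minimal sketch of that idea, not Planetary Computer code: `TokenCache`, `fetch_token`, the `scope` string, and the five-minute safety margin are all hypothetical names and values chosen for illustration. The actual call to the signing API would live inside whatever function you pass in as `fetch_token`.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, Optional, Tuple

# Hypothetical signature: given a scope string (whatever unit you request
# tokens for), return (token, expiry). The real request to the signing API
# would be made inside this function.
FetchToken = Callable[[str], Tuple[str, datetime]]


class TokenCache:
    """Reuse a token until it is close to its expiry time."""

    def __init__(self, fetch_token: FetchToken,
                 margin: timedelta = timedelta(minutes=5)) -> None:
        self._fetch_token = fetch_token
        self._margin = margin  # refresh this long before the stated expiry
        self._tokens: Dict[str, Tuple[str, datetime]] = {}

    def get(self, scope: str, now: Optional[datetime] = None) -> str:
        now = now or datetime.now(timezone.utc)
        cached = self._tokens.get(scope)
        # Serve the cached token only while it is safely inside its lifetime.
        if cached is not None and now < cached[1] - self._margin:
            return cached[0]
        # Otherwise make exactly one request and remember the result.
        token, expiry = self._fetch_token(scope)
        self._tokens[scope] = (token, expiry)
        return token
```

With a cache like this shared across workers in one process, the number of signing requests depends on how many distinct scopes you touch and how long tokens live, not on how many files you read.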
You can also request tokens for a storage container and not just a single file. This means that you can reuse a single token (up to its expiry time) for many file requests and can cut out a significant number of requests to that API. This can be done using either the …
3. Data access
When it's time to access the files directly, you'll be at the scalability limits of Azure Storage (which are quite high but have been known to be hit by users running very large jobs). The same general theme applies here as well: access data from within Azure West Europe and use retries with backoff to get the best results. The Sentinel-2 and Landsat data files are in Cloud Optimized GeoTIFF format to enable efficient byte-range requests of specific areas of the file, so downloading the entire file in a single request is not necessary (or recommended).
Hopefully that helps you plan an optimized workload using the Planetary Computer! If you have additional observations or questions, please post them here, as I believe the community could benefit from sharing best practices.
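The retry-with-backoff advice applies to both searches and file access, so here is a rough sketch of what it could look like using only the Python standard library. The function name `fetch_with_retry`, the default of five attempts, the one-second base delay, the 60-second cap, and the use of random jitter are my assumptions for illustration, not values recommended by the Planetary Computer. Only 503 and 504 are retried here because those are the codes mentioned above for the search API; other errors are raised immediately.

```python
import random
import time
import urllib.error
import urllib.request

RETRYABLE = {503, 504}  # status codes returned under high load


def fetch_with_retry(request, attempts=5, base=1.0, cap=60.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Return the response body, retrying 503/504 with exponential backoff.

    `request` is a URL string or a urllib.request.Request; `attempts`
    should be at least 1. `opener` and `sleep` are injectable for testing.
    """
    for attempt in range(attempts):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on non-retryable errors or when attempts run out.
            if err.code not in RETRYABLE or attempt == attempts - 1:
                raise
        # The delay ceiling doubles each attempt; a random wait below it
        # keeps many clients from retrying in lockstep.
        delay = min(cap, base * 2 ** attempt)
        sleep(random.uniform(0, delay))
```

The same helper works for a byte-range read if you pass a request that carries a `Range` header, for example `urllib.request.Request(signed_url, headers={"Range": "bytes=0-16383"})`, where `signed_url` is a placeholder for an asset URL you have already signed. In practice a COG-aware reader will usually issue these range requests for you.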