Add option to search for collections without granules #940

Merged 6 commits on Feb 10, 2025
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,13 @@ and this project uses [Semantic Versioning](https://semver.org/spec/v2.0.0.html)

## [Unreleased]

### Changed
- `search_datasets` now accepts a `has_granules` keyword argument. Use
  `has_granules=False` to search for metadata about collections with no
  associated granules. The default value set in `DataCollections` remains `True`.
  ([#939](https://github.com/nsidc/earthaccess/issues/939))
  ([**@juliacollins**](https://github.com/juliacollins))

## [v0.13.0] - 2025-01-28

### Changed
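As a usage sketch for the `has_granules` changelog entry in the hunk above: the helper below passes `has_granules=False` through `search_datasets`. The helper name `find_collections_without_granules` and its `keyword` argument are illustrative assumptions; `search_datasets`, `count`, and `has_granules` are the names that appear in this PR. The import is deferred so that defining the helper needs neither earthaccess nor network access.

```python
from typing import Any, List


def find_collections_without_granules(keyword: str, count: int = 10) -> List[Any]:
    """Return up to `count` collections matching `keyword` that have no granules.

    Hypothetical helper for illustration only; it is not part of earthaccess.
    """
    # Deferred import: needs an earthaccess release that includes this change.
    import earthaccess

    # has_granules=False asks for collections with no associated granules.
    return earthaccess.search_datasets(
        keyword=keyword,
        count=count,
        has_granules=False,
    )
```

Calling the helper, for example `find_collections_without_granules("sea ice")`, performs a live search, so it requires network access. Passing `has_granules=None` instead should return collections regardless of whether they have granules, per the docstring added in `earthaccess/search.py`.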
1 change: 1 addition & 0 deletions earthaccess/api.py
@@ -49,6 +49,7 @@ def search_datasets(count: int = -1, **kwargs: Any) -> List[DataCollection]:
        * **doi**: DOI for a dataset
        * **daac**: e.g. NSIDC or PODAAC
        * **provider**: particular to each DAAC, e.g. POCLOUD, LPDAAC etc.
        * **has_granules**: if true, only return collections with granules
        * **temporal**: a tuple representing temporal bounds in the form
          `(date_from, date_to)`
        * **bounding_box**: a tuple representing spatial bounds in the form
22 changes: 22 additions & 0 deletions earthaccess/search.py
@@ -291,6 +291,28 @@ def debug(self, debug: bool = True) -> Self:
        self._debug = debug
        return self

    def has_granules(self, has_granules: bool | None = True) -> Self:
        """Match only collections with granules, without granules, or either.
        Parameters:
            has_granules:
                If `True`, only return collections with granules. If
                `False`, only return collections without granules.
                If `None`, return both types of collections.
        Returns:
            self
        """
        if has_granules is not None and not isinstance(has_granules, bool):
            raise TypeError("has_granules must be of type bool or None")

        if has_granules is None and "has_granules" in self.params:
            del self.params["has_granules"]
        else:
            self.params["has_granules"] = has_granules

        return self

    def cloud_hosted(self, cloud_hosted: bool = True) -> Self:
        """Only match granules that are hosted in the cloud. This is valid for public
        collections.
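To make the three states of `has_granules` concrete, here is a minimal sketch using a stand-in class. `CollectionQuerySketch` is hypothetical and is not the earthaccess `DataCollections` class; the initial `True` value is an assumption taken from the changelog entry. One deliberate variation: in the diff, `None` deletes the key only when it is present, and otherwise the `else` branch stores `None`. The sketch uses `dict.pop` so that `None` is never stored.

```python
from typing import Any, Dict, Optional


class CollectionQuerySketch:
    """Hypothetical stand-in for a query builder; not the earthaccess class."""

    def __init__(self) -> None:
        # Assumption taken from the changelog: the default is True.
        self.params: Dict[str, Any] = {"has_granules": True}

    def has_granules(
        self, has_granules: Optional[bool] = True
    ) -> "CollectionQuerySketch":
        # bool is checked explicitly, so values such as 1 or "true" are rejected.
        if has_granules is not None and not isinstance(has_granules, bool):
            raise TypeError("has_granules must be of type bool or None")

        if has_granules is None:
            # Sketch-only variation: always drop the key, even if already absent.
            self.params.pop("has_granules", None)
        else:
            self.params["has_granules"] = has_granules

        return self  # returning self allows method chaining


query = CollectionQuerySketch()
print(query.params)                      # default filter
print(query.has_granules(False).params)  # only collections without granules
print(query.has_granules(None).params)   # filter removed
print(query.has_granules(None).params)   # still removed on a repeated call
```

Whether the repeated-`None` case matters in practice depends on how the stored parameters are later serialized, which this diff does not show.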
9 changes: 9 additions & 0 deletions tests/unit/test_collection_queries.py
@@ -54,6 +54,15 @@ def test_querybuilder_can_handle_doi():
    assert query.params["doi"] == doi


def test_querybuilder_can_handle_has_granules():
    query = DataCollections().has_granules(False)
    assert not query.params["has_granules"]
    query = DataCollections().has_granules(True)
    assert query.params["has_granules"]
    query = DataCollections().has_granules(None)
    assert "has_granules" not in query.params


@pytest.mark.parametrize("start,end,expected", valid_single_dates)
def test_query_can_parse_single_dates(start, end, expected):
    query = DataCollections().temporal(start, end)
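The new test covers the three accepted values. A sketch of complementary checks (rejected types and chaining) could look like the following. It runs against `QueryStandIn`, a hypothetical class that copies the `has_granules` logic from the diff so the example works without earthaccess installed; the initial `True` value is assumed from the changelog entry.

```python
import unittest
from typing import Any, Dict, Optional


class QueryStandIn:
    """Hypothetical stand-in mirroring the has_granules logic shown in the diff."""

    def __init__(self) -> None:
        # Assumed default, per the changelog entry.
        self.params: Dict[str, Any] = {"has_granules": True}

    def has_granules(self, has_granules: Optional[bool] = True) -> "QueryStandIn":
        if has_granules is not None and not isinstance(has_granules, bool):
            raise TypeError("has_granules must be of type bool or None")
        if has_granules is None and "has_granules" in self.params:
            del self.params["has_granules"]
        else:
            self.params["has_granules"] = has_granules
        return self


class HasGranulesTypeTest(unittest.TestCase):
    def test_rejects_non_bool_values(self) -> None:
        # Strings, ints, and lists are neither bool nor None.
        for bad_value in ("true", 1, 0, [True]):
            with self.subTest(bad_value=bad_value):
                with self.assertRaises(TypeError):
                    QueryStandIn().has_granules(bad_value)

    def test_calls_can_be_chained(self) -> None:
        # Each call returns the query itself, so the last call wins.
        query = QueryStandIn().has_granules(False).has_granules(True)
        self.assertIs(query.params["has_granules"], True)
```

In the project's own pytest suite, the equivalent check would presumably wrap `DataCollections().has_granules("true")` in `pytest.raises(TypeError)`.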