Icon support #11

Merged
merged 36 commits into from
Oct 21, 2024
Changes from 3 commits
Commits
2fadef5
init branch icon_suppoert
Nov 29, 2023
6840518
ICON support of update_phenology_realtime
Dec 12, 2023
b95e4d9
Bug correction
Dec 12, 2023
32e3804
ICON support of update_strength_realtime
Dec 29, 2023
9cdaa9c
Support of advancing calibrated fields by 1 hour
Feb 13, 2024
f255df7
Reorganisation of input files
Feb 14, 2024
9afd616
update to latest blueprint
twicki Feb 15, 2024
72dd884
pop some stashes
twicki Feb 15, 2024
d38fa5e
test
twicki Feb 15, 2024
1d0eaa7
pre-commit cleanup
twicki Feb 15, 2024
a46d1d4
minor cleanup
twicki Feb 15, 2024
d213dd8
Merge commit 'a46d1d4443793a8045c84f91b90a1e19b019f225' into icon_sup…
twicki Feb 15, 2024
4834117
add run file
twicki Feb 27, 2024
23dd1b3
Command line interface support
Feb 29, 2024
f8d4f0f
reorganisation of code, setup and config, cleanup
May 14, 2024
a5e37aa
Small bug fixes
May 16, 2024
bbb4869
Deleted files: obsolete files after ICON support
May 16, 2024
5d0c9c9
improved code (solve pre-commit errors)
May 16, 2024
5b38dfe
Solve pre-commit errors not present locally but on github
May 16, 2024
96df752
editorial update
May 22, 2024
0342316
integraton of reviewer comments
May 30, 2024
f7e4ffe
editorial change needed for codespell hook on github
May 30, 2024
8381959
update for pre-commit
May 30, 2024
848fd8f
update for pre-commit on github
May 30, 2024
51d97a5
updates for hooks on Github
May 30, 2024
efab2b0
Editorial update
Jun 24, 2024
e54cb60
add extraction scripts in tools/
Jun 24, 2024
2a1d4d6
editorial changes to tools/extract_pollen_measured.sh
Jun 24, 2024
7b694ef
Revise tests (#12)
DanielLeuenberger Jun 26, 2024
dfd57ba
fix variable in get_data.sh
Jun 26, 2024
81ebcb7
rename test script,scp testdata and update README (#13)
DanielLeuenberger Aug 15, 2024
1a63380
change T_2M test data to 12UTC and document this in README.md
Aug 15, 2024
31ef83f
remove whitespaces in README.md
Aug 15, 2024
1222b27
save log information and tell the user (#16)
andreaspauling Aug 19, 2024
4d52ffe
Satisfy pre-commit reqs
Aug 19, 2024
7fd1801
improved error handling for missing data and log information (#18)
andreaspauling Aug 28, 2024
Collaborator:

This file is quite large to upload to git. Maybe consider using git lfs (large file storage)?

Collaborator Author:

The test data is not part of the repo any more. It will be available from a location defined by osm.

Binary file not shown.
Binary file added data/grib2_files_ICON-CH1/POV_out
Collaborator:

Same here; it is uncommon to upload large data files directly to git repos.

Collaborator Author:

Ditto.

Binary file not shown.
1 change: 1 addition & 0 deletions data/grib2_files_ICON-CH1/iaf2023042500
Collaborator:

Here, on the other hand, you are directly linking your scratch; this might not work for other users (is this a problem?). Having all required data in one common place might be beneficial for future users to understand the data I/O better. Updated external fields would always be available in that place (e.g. git lfs).

Collaborator Author:

Done. There are no links any more.

1 change: 1 addition & 0 deletions data/grib2_files_ICON-CH1/lfff00000000c
19 changes: 10 additions & 9 deletions src/realtime_pollen_calibration/main.py
@@ -16,19 +16,20 @@

cwd = Path(os.getcwd())
data_path = str(cwd)
# data_path = str(cwd.parent.parent.absolute())

file_data = data_path + "/data/atabs/alnu_pollen_measured_values_2022021314.atab"
file_grib = data_path + "/data/grib2_files_cosmo1e/laf2022021314_ALNUtthrs_tthre"
file_out = data_path + "/data/2022021315_ALNUtthrs_tthre"
file_obs = data_path + "/data/atabs/alnu_pollen_measured_values_2022021314.atab"
file_POV = data_path + "/data/grib2_files_ICON-CH1/ART_POV_iconR19B08-grid_0001_all_specs_values"
file_T_2M = data_path + "/data/grib2_files_ICON-CH1/iaf2023042500"
file_Const = data_path + "/data/grib2_files_ICON-CH1/lfff00000000c"
file_out = data_path + "/data/grib2_files_ICON-CH1/POV_out"
update_phenology_realtime.update_phenology_realtime(
file_data, file_grib, file_out, False
file_obs, file_POV, file_T_2M, file_Const, file_out, True
)

file_data = data_path + "/data/atabs/alnu_pollen_measured_values_2022022207.atab"
file_data_mod = data_path + "/data/atabs/alnu_pollen_modelled_values_2022022207.atab"
file_grib = data_path + "/data/grib2_files_cosmo1e/laf2022022207_ALNUtune"
file_obs = data_path + "/data/atabs/alnu_pollen_measured_values_2022022207.atab"
file_obs_mod = data_path + "/data/atabs/alnu_pollen_modelled_values_2022022207.atab"
file_POV = data_path + "/data/grib2_files_cosmo1e/laf2022022207_ALNUtune"
file_out = data_path + "/data/2022022208_ALNUtune"
update_strength_realtime.update_strength_realtime(
file_data, file_data_mod, file_grib, file_out, False
file_obs, file_obs_mod, file_POV, file_out, False
)
123 changes: 117 additions & 6 deletions src/realtime_pollen_calibration/update_phenology_realtime.py
@@ -7,28 +7,139 @@
"""A module for the update of the pollen start and end of season in real time."""

# Third-party
import cfgrib # type: ignore
import numpy as np
import xarray as xr
from datetime import datetime
from eccodes import (
codes_get,
codes_get_array,
codes_grib_new_from_file,
codes_release,
codes_set_array,
codes_write,
)

# First-party
from realtime_pollen_calibration import utils


def update_phenology_realtime(file_obs, file_in, file_out, verbose=False):
def update_phenology_realtime(file_obs, file_POV, file_T_2M, file_Const, file_out, verbose=False):
"""Advance the temperature threshold fields by one hour.

Args:
file_obs: Location of ATAB file containing the pollen
concentration information at the stations.
file_in: Location of GRIB2 file containing the following fields:
'T_2M', 'tthrs', 'tthre' (for POAC, 'saisl' instead),
file_POV: Location of ICON GRIB2 file containing the pollen fields:
'tthrs', 'tthre' (for POAC, 'saisl' instead),
'saisn' and 'ctsum'. Lat-lon information of the grid must be
present in the file.
file_T_2M: Location of GRIB2 file containing T_2M.
file_Const: Location of GRIB2 file containing Longitudes and Latitudes of the
unstructured ICON grid.
file_out: Location of the desired output file.
verbose: Optional additional debug prints.

"""
ds = cfgrib.open_dataset(file_in, encode_cf=("time", "geography", "vertical"))

#file_POV = "data/grib2_files_ICON-CH1/ART_POV_iconR19B08-grid_0001_all_specs_values"
Collaborator:

Remove comments if not required.

Collaborator Author:

Done.

#file_obs= "data/atabs/alnu_pollen_measured_values_2022021314.atab"

fh_POV = open(file_POV, "rb")
fh_Const = open(file_Const, "rb")
fh_T_2M = open(file_T_2M, "rb")

# read CLON, CLAT
while True:
# Get the next message
recConst = codes_grib_new_from_file(fh_Const)
if recConst is None:
break

# Get the short name of the current field
short_name = codes_get(recConst, "shortName")

# Extract longitude and latitude of the ICON grid
if short_name == "CLON":
CLON = codes_get_array(recConst, "values")
if short_name == "CLAT":
CLAT = codes_get_array(recConst, "values")

# Delete the message
codes_release(recConst)

# Close the GRIB file
fh_Const.close()

Collaborator:

These can be considered user input parameters, and you might want to write them in a separate config.yaml file where the user can define the required species and fields themselves. Unless this is supposed to remain hard-coded and static for years.

Collaborator Author:

Done. Now there is a config.yaml.
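For readers following along, a minimal sketch of what such a config-driven setup could look like — the file name config.yaml matches the reply above, but the keys, values, and the read_config helper are illustrative assumptions, not the actual contents of this PR:

# Hypothetical config.yaml (assumed content, for illustration only):
#
#   species: [ALNU, BETU, POAC, CORY]
#   fields: [tthrs, tthre, saisn, ctsum]
#   pov_file: data/grib2_files_ICON-CH1/ART_POV_iconR19B08-grid_0001_all_specs_values
#   t2m_file: data/grib2_files_ICON-CH1/iaf2023042500
#   const_file: data/grib2_files_ICON-CH1/lfff00000000c

import yaml  # PyYAML


def read_config(path: str) -> dict:
    """Load the calibration settings from a YAML file."""
    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f)


config = read_config("config.yaml")
specs = config["species"]
fields = config["fields"]
pol_fields = [spec + field for spec in specs for field in fields]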

# read POV and extract available fields
specs = ["ALNU", "BETU", "POAC", "CORY"]
fields = ["tthrs", "tthre", "saisn", "ctsum"]
pol_fields = [x + y for x in specs for y in fields]
# POAC has no 'tthre' field; it uses 'saisl' (season length) instead,
# so replace 'POACtthre' (index 9) with 'POACsaisl'.
pol_fields[9] = "POACsaisl"

calFields = {}
while True:
# Get the next message
recPOV = codes_grib_new_from_file(fh_POV)
if recPOV is None:
break

# Get the short name of the current field
short_name = codes_get(recPOV, "shortName")

# Extract and alter fields if they are present
for pol_field in pol_fields:
if short_name == pol_field:
calFields[pol_field] = codes_get_array(recPOV, "values")

# Delete the message
codes_release(recPOV)

# Close the GRIB files
fh_POV.close()


# Read T_2M
while True:
# Get the next message
recX = codes_grib_new_from_file(fh_T_2M)
if recX is None:
break

# Get the short name of the current field
short_name = codes_get(recX, "shortName")

if short_name == "T_2M":
calFields["T_2M"] = codes_get_array(recX, "values")

# timestamp is needed. Take it from the T_2M field
dataDate = str(codes_get(recX, "dataDate"))
dataTime = str(str(codes_get(recX, "dataTime")).zfill(2))
dataDateTime = dataDate + dataTime

# Convert the string to a datetime object
date_obj = datetime.strptime(dataDateTime, '%Y%m%d%H')
date_obj_fmt = date_obj.strftime('%Y-%m-%dT%H:00:00.000000000')
time_values = np.datetime64(date_obj_fmt)

codes_release(recX)

# Close the GRIB files
fh_T_2M.close()

# Dictionary to hold DataArrays for each variable
calFields_arrays = {}

# Loop through variables to create DataArrays
Collaborator:

Here you might consider using earthkit's .to_xarray() method for conversion to a hypercube. @clairemerker, what is the latest guideline on using eccodes/cfgrib/earthkit? Also consider a short discussion with Petra on this matter. I am not sure; earthkit makes the code easier but also adds one dependency.
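As a rough sketch of the earthkit route mentioned here (not part of this PR; it assumes the earthkit-data package is installed):

import earthkit.data  # third-party; would add a new dependency

# Path assumed for illustration; in the PR this corresponds to file_T_2M.
path = "data/grib2_files_ICON-CH1/iaf2023042500"

# Read all GRIB messages and convert them to an xarray hypercube in one step,
# replacing the manual eccodes message loop and the hand-built DataArrays below.
fields = earthkit.data.from_source("file", path)
ds_t2m = fields.to_xarray()

Whether the simpler code is worth the extra dependency is exactly the trade-off raised in the comment above.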

for var_name, data in calFields.items():
data_array = xr.DataArray(data, coords={'index': np.arange(len(data))}, dims=['index'])
data_array.coords['latitude'] = (('index'), CLAT)
data_array.coords['longitude'] = (('index'), CLON)
data_array.coords['time'] = time_values
calFields_arrays[var_name] = data_array

# Create an xarray Dataset with the DataArrays
ds = xr.Dataset(calFields_arrays)

ptype_present = utils.get_pollen_type(ds)
if verbose:
print(f"Detected pollen types in the DataSet provided: {ptype_present}")
@@ -55,4 +55,4 @@ def update_phenology_realtime(file_obs, file_in, file_out, verbose=False):
obs_mod_data.coord_stns,
method="sum",
)
utils.to_grib(file_in, file_out, dict_fields)
utils.to_grib(file_POV, file_out, dict_fields)
21 changes: 13 additions & 8 deletions src/realtime_pollen_calibration/utils.py
@@ -30,7 +30,7 @@
thr_con_24 = {"ALNU": 240, "BETU": 240, "POAC": 72, "CORY": 240}
thr_con_120 = {"ALNU": 720, "BETU": 720, "POAC": 216, "CORY": 720}
failsafe = {"ALNU": 1000, "BETU": 2500, "POAC": 6000, "CORY": 2500}
jul_days_excl = {"ALNU": 14, "BETU": 40, "POAC": 3, "CORY": 46}
Collaborator:

Consider moving these user-defined values into a config.yaml file.

Collaborator Author:

These values should not be changed by users. Hence I would leave them here.

Collaborator:

Fine to leave it here! Could you provide a comment documenting the meaning of those values?

Collaborator Author:

Done.

jul_days_excl = {"ALNU": 14, "BETU": 40, "POAC": 46, "CORY": 3}


def count_to_log_level(count: int) -> int:
@@ -414,15 +414,15 @@ def get_change_phenol( # pylint: disable=R0912,R0914,R0915
f"and last 120H {sum_obs}",
)
print(
f"Cumulative temperature sum {ctsum_stns.values[0][0]} ",
f"and threshold (start): {tthrs_stns.values[0][0]}",
f" and saisn: {saisn_stns.values[0][0]}",
f"Cumulative temperature sum {ctsum_stns.values[0]} ",
f"and threshold (start): {tthrs_stns.values[0]}",
f" and saisn: {saisn_stns.values[0]}",
)
if pollen_type != "POAC":
print(f"Cumsum temp threshold end: {tthre_stns.values[0][0]}")
print(f"Cumsum temp threshold end: {tthre_stns.values[0]}")
else:
print(f"Saisl: {saisl_stns.values[0][0]}")
print(f"Temperature at station {t_2m_stns.values[0][0]}, " f"date: {date}")
print(f"Saisl: {saisl_stns.values[0]}")
print(f"Temperature at station {t_2m_stns.values[0]}, " f"date: {date}")
print("-----------------------------------------")
# ADJUSTMENT OF SEASON START AND END AT THE BEGINNING OF THE SEASON
if (
@@ -545,15 +545,20 @@ def to_grib(inp: str, outp: str, dict_fields: dict) -> None:
break
# clone record
clone_id = eccodes.codes_clone(gid)
# get short_name

# get short_name
short_name = eccodes.codes_get_string(clone_id, "shortName")

# read values
values = eccodes.codes_get_values(clone_id)
eccodes.codes_set(
clone_id, "dataTime", eccodes.codes_get(clone_id, "dataTime") + 100
)
if short_name in dict_fields:

# Set values in dict_fields[short_name] to zero where values is zero (edge values).
# This is because COSMO-1E was slightly smaller than ICON-CH1.
dict_fields[short_name][values == 0] = 0
eccodes.codes_set_values(clone_id, dict_fields[short_name].flatten())
else:
eccodes.codes_set_values(clone_id, values)