-
Notifications
You must be signed in to change notification settings - Fork 33
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
CDAT Migration Phase: Refactor
cosp_histogram
set (#748)
- Refactor `cosp_histogram_driver.py` and `cosp_histogram_plot.py` - `formulas_cosp.py` (new file) - Includes refactored, Xarray-based `cosp_histogram_standard()` and `cosp_bin_sum()` functions - I wrote a lot of new code in `formulas_cosp.py` to clean up `derivations.py` and the old equivalent functions in `utils.py` - `derivations.py` - Cleaned up portions of `DERIVED_VARIABLES` dictionary - Removed unnecessary `OrderedDict` usage for `cosp_histogram` related variables (we should do this for the rest of the variables in in #716) - Remove unnecessary `convert_units()` function calls - Move cloud levels passed to derived variable formulas to `formulas_cosp.CLOUD_BIN_SUM_MAP` - `utils.py` - Delete deprecated, CDAT-based `cosp_histogram` functions - `dataset_xr.py` - Add `dataset_xr.Dataset._open_climo_dataset()` method with a catch for dataset quality issues where "time" is a scalar variable that does not match the "time" dimension array length, drops this variable and replaces it with the correct coordinate - Update `_get_dataset_with_derivation_func()` to handle derivation functions that require the `xr.Dataset` and `target_var_key` args (e.g., `cosp_histogram_standardize()` and `cosp_bin_sum()`) - `io.py` - Update `_write_vars_to_netcdf()` to write test, ref, and diff variables to individual netCDF (required for easy comparison to CDAT-based code that does the same thing) - Add `cdat_migration_regression_test_netcdf.ipynb` validation notebook template for comparing `.nc` files
- Loading branch information
1 parent
985ff14
commit 517c44f
Showing
19 changed files
with
2,376 additions
and
748 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
[report] | ||
exclude_also = | ||
if TYPE_CHECKING: |
270 changes: 270 additions & 0 deletions
270
...dat_regression_testing/660-cosp-histogram/660_cosp_histogram_regression_test_netcdf.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,270 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"# CDAT Migration Regression Testing Notebook (`.nc` files)\n", | ||
"\n", | ||
"This notebook is used to perform regression testing between the development and\n", | ||
"production versions of a diagnostic set.\n", | ||
"\n", | ||
"## How it works\n", | ||
"\n", | ||
"It compares the relative differences (%) between ref and test variables between\n", | ||
"the dev and `main` branches.\n", | ||
"\n", | ||
"## How to use\n", | ||
"\n", | ||
"PREREQUISITE: The diagnostic set's netCDF stored in `.json` files in two directories\n", | ||
"(dev and `main` branches).\n", | ||
"\n", | ||
"1. Make a copy of this notebook under `auxiliary_tools/cdat_regression_testing/<DIR_NAME>`.\n", | ||
"2. Run `mamba create -n cdat_regression_test -y -c conda-forge \"python<3.12\" xarray dask pandas matplotlib-base ipykernel`\n", | ||
"3. Run `mamba activate cdat_regression_test`\n", | ||
"4. Update `SET_DIR` and `SET_NAME` in the copy of your notebook.\n", | ||
"5. Run all cells IN ORDER.\n", | ||
"6. Review results for any outstanding differences (>=1e-5 relative tolerance).\n", | ||
" - Debug these differences (e.g., bug in metrics functions, incorrect variable references, etc.)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## Setup Code\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 7, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from collections import defaultdict\n", | ||
"import glob\n", | ||
"\n", | ||
"import numpy as np\n", | ||
"import xarray as xr\n", | ||
"\n", | ||
"# TODO: Update SET_NAME and SET_DIR\n", | ||
"SET_NAME = \"cosp_histogram\"\n", | ||
"SET_DIR = \"660-cosp-histogram\"\n", | ||
"\n", | ||
"DEV_PATH = f\"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/{SET_DIR}/{SET_NAME}/**\"\n", | ||
"MAIN_PATH = f\"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/{SET_NAME}/**\"\n", | ||
"\n", | ||
"DEV_GLOB = sorted(glob.glob(DEV_PATH + \"/*.nc\"))\n", | ||
"MAIN_GLOB = sorted(glob.glob(MAIN_PATH + \"/*.nc\"))\n", | ||
"\n", | ||
"if len(DEV_GLOB) == 0 or len(MAIN_GLOB) == 0:\n", | ||
" raise IOError(\"No files found at DEV_PATH and/or MAIN_PATH.\")\n", | ||
"\n", | ||
"if len(DEV_GLOB) != len(MAIN_GLOB):\n", | ||
" raise IOError(\"Number of files do not match at DEV_PATH and MAIN_PATH.\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 34, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"def _get_var_to_filepath_map():\n", | ||
" var_to_file = defaultdict(lambda: defaultdict(dict))\n", | ||
"\n", | ||
" for dev_file, main_file in zip(DEV_GLOB, MAIN_GLOB):\n", | ||
" # Example:\n", | ||
" # \"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_test.nc\"\n", | ||
" file_arr = dev_file.split(\"/\")\n", | ||
"\n", | ||
" # Example: \"test\"\n", | ||
" data_type = dev_file.split(\"_\")[-1].split(\".nc\")[0]\n", | ||
"\n", | ||
" # Skip comparing `.nc` \"diff\" files because comparing relative diffs of\n", | ||
" # does not make sense.\n", | ||
" if data_type == \"test\" or data_type == \"ref\":\n", | ||
" # Example: \"ISCCP\"\n", | ||
" model = file_arr[-2].split(\"-\")[0]\n", | ||
" season = \"JJA\" if \"JJA\" in dev_file else \"ANN\"\n", | ||
"\n", | ||
" var_to_file[model][data_type][season] = (dev_file, main_file)\n", | ||
"\n", | ||
" return var_to_file\n", | ||
"\n", | ||
"\n", | ||
"def _get_relative_diffs(var_to_filepath):\n", | ||
" # Absolute tolerance of 0 and relative tolerance of 1e-5.\n", | ||
" # We are mainly focusing on relative tolerance here (in percentage terms).\n", | ||
" atol = 0\n", | ||
" rtol = 1e-5\n", | ||
"\n", | ||
" for model, data_types in var_to_filepath.items():\n", | ||
" for _, seasons in data_types.items():\n", | ||
" for _, filepaths in seasons.items():\n", | ||
" print(\"Comparing:\")\n", | ||
" print(filepaths[0], \"\\n\", filepaths[1])\n", | ||
" ds1 = xr.open_dataset(filepaths[0])\n", | ||
" ds2 = xr.open_dataset(filepaths[1])\n", | ||
"\n", | ||
" try:\n", | ||
" var_key = f\"COSP_HISTOGRAM_{model}\"\n", | ||
" np.testing.assert_allclose(\n", | ||
" ds1[var_key].values,\n", | ||
" ds2[var_key].values,\n", | ||
" atol=atol,\n", | ||
" rtol=rtol,\n", | ||
" )\n", | ||
" except AssertionError as e:\n", | ||
" print(e)\n", | ||
" else:\n", | ||
" print(f\" * All close and within relative tolerance ({rtol})\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"## 1. Compare the netCDF files between branches\n", | ||
"\n", | ||
"- Compare \"ref\" and \"test\" files\n", | ||
"- \"diff\" files are ignored because getting relative diffs for these does not make sense (relative diff will be above tolerance)\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 36, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"var_to_filepaths = _get_var_to_filepath_map()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 37, | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_ref.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_ref.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_ref.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_ref.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_test.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_test.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_test.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_test.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_ref.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_ref.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_ref.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_ref.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_test.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_test.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_test.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_test.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_ref.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_ref.nc\n", | ||
"\n", | ||
"Not equal to tolerance rtol=1e-05, atol=0\n", | ||
"\n", | ||
"Mismatched elements: 42 / 42 (100%)\n", | ||
"Max absolute difference: 4.23048367e-05\n", | ||
"Max relative difference: 1.16682146e-05\n", | ||
" x: array([[0.703907, 2.669376, 3.065526, 1.579834, 0.363847, 0.128541],\n", | ||
" [0.147366, 1.152637, 3.67049 , 3.791006, 1.398453, 0.392103],\n", | ||
" [0.07496 , 0.474791, 1.37002 , 1.705649, 0.786423, 0.346744],...\n", | ||
" y: array([[0.703899, 2.669347, 3.065492, 1.579816, 0.363843, 0.12854 ],\n", | ||
" [0.147364, 1.152624, 3.670448, 3.790965, 1.398438, 0.392099],\n", | ||
" [0.074959, 0.474786, 1.370004, 1.705629, 0.786415, 0.34674 ],...\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_ref.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_ref.nc\n", | ||
"\n", | ||
"Not equal to tolerance rtol=1e-05, atol=0\n", | ||
"\n", | ||
"Mismatched elements: 42 / 42 (100%)\n", | ||
"Max absolute difference: 4.91806181e-05\n", | ||
"Max relative difference: 1.3272405e-05\n", | ||
" x: array([[0.62896 , 2.657657, 3.206268, 1.704946, 0.398659, 0.169424],\n", | ||
" [0.147569, 1.228835, 3.697387, 3.727142, 1.223123, 0.436504],\n", | ||
" [0.072129, 0.508413, 1.167637, 1.412202, 0.638085, 0.362268],...\n", | ||
" y: array([[0.628952, 2.657625, 3.206227, 1.704924, 0.398654, 0.169422],\n", | ||
" [0.147567, 1.228819, 3.697338, 3.727093, 1.223107, 0.436498],\n", | ||
" [0.072128, 0.508407, 1.167622, 1.412183, 0.638076, 0.362263],...\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_test.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_test.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n", | ||
"Comparing:\n", | ||
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_test.nc \n", | ||
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_test.nc\n", | ||
" * All close and within relative tolerance (1e-05)\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"_get_relative_diffs(var_to_filepaths)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"### Results\n", | ||
"\n", | ||
"- The relative tolerance of all files are 1e-05, which means things should be good to go.\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "cdat_regression_test", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
6 changes: 6 additions & 0 deletions
6
auxiliary_tools/cdat_regression_testing/660-cosp-histogram/660_cosp_histogram_run_script.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
from auxiliary_tools.cdat_regression_testing.base_run_script import run_set | ||
|
||
SET_NAME = "cosp_histogram" | ||
SET_DIR = "660-cosp-histogram" | ||
|
||
run_set(SET_NAME, SET_DIR) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.