Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CDAT Migration Phase: Refactor cosp_histogram set #748

Merged
merged 34 commits into from
Jan 30, 2024
Merged
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
3f22786
Add initial `cosp_histogram` refactor updates
tomvothecoder Oct 10, 2023
d21010c
Update Dataset derivation methods to support dataset-based funcs
tomvothecoder Dec 6, 2023
79f053e
Update e3sm_diags/derivations/formulas_cosp.py
tomvothecoder Dec 7, 2023
3964170
Fix cosp histogram funcs to add target var key to dataset
tomvothecoder Dec 7, 2023
b8c117d
Initial code for refactoring `cosp_bin_sum()`
tomvothecoder Dec 7, 2023
54a6ce4
Refactor `cosp_bin_sum()` function and variable names
tomvothecoder Dec 8, 2023
9e1eefe
Refactor common logic into `_get_dataset_with_derivation_func()`
tomvothecoder Dec 8, 2023
049537e
Update `cosp_bin_sum()` function args and type annotations
tomvothecoder Dec 9, 2023
eccbace
Fix incorrect arg passed to sim and cloud level funcs
tomvothecoder Dec 14, 2023
f18ed06
Update order of functions
tomvothecoder Dec 14, 2023
fff4c2e
Start refactoring `cosp_histogram_plot.py`
tomvothecoder Dec 14, 2023
a0212a8
Fix type annotation for `fig` arg
tomvothecoder Dec 14, 2023
5d7611f
Add FIXME comment to plot
tomvothecoder Dec 18, 2023
ae64002
Get plot function working
tomvothecoder Dec 21, 2023
3e23116
Fix unit tests failures
tomvothecoder Jan 3, 2024
4dfec98
Fix mypy errors
tomvothecoder Jan 3, 2024
3697374
Fix incorrect param object reference by copying
tomvothecoder Jan 3, 2024
bc1bc94
Fix type annotation
tomvothecoder Jan 3, 2024
196ae63
Remove unused functions in `utils.py`
tomvothecoder Jan 3, 2024
1b8139b
Remove #type ignore
tomvothecoder Jan 3, 2024
73edf2a
Update test scripts
tomvothecoder Jan 4, 2024
2b5d387
Update notebook and script
tomvothecoder Jan 5, 2024
127c5a9
Add "time" scalar workaround
tomvothecoder Jan 23, 2024
83790bd
Add keys for matching prs axis
tomvothecoder Jan 24, 2024
a89b68f
Set `decode_times=False` when opening climo dataset
tomvothecoder Jan 24, 2024
befd28a
Update regressiong testing notebook and script
tomvothecoder Jan 24, 2024
0b7a18d
Update `_write_vars_to_netcdf` to write individual files
tomvothecoder Jan 25, 2024
0ae2a46
Fix unit tests
tomvothecoder Jan 25, 2024
68f62e0
Update e3sm_diags/driver/cosp_histogram_driver.py
tomvothecoder Jan 29, 2024
091292c
Update template and base run scripts to always generate netCDF files
tomvothecoder Jan 29, 2024
22f7737
Fix incorrect glob in json notebook
tomvothecoder Jan 29, 2024
a314f9a
Update `cosp_histogram_standardize()` args
tomvothecoder Jan 30, 2024
67b1d4e
Apply suggestions from code review
tomvothecoder Jan 30, 2024
883dda4
Apply suggestions from code review
tomvothecoder Jan 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .coveragerc
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
[report]
exclude_also =
if TYPE_CHECKING:
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
[#]
tomvothecoder marked this conversation as resolved.
Show resolved Hide resolved
sets = ["cosp_histogram"]
case_id = "model_vs_model"
variables = ["COSP_HISTOGRAM_MISR"]
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]

[#]
sets = ["cosp_histogram"]
case_id = "model_vs_model"
variables = ["COSP_HISTOGRAM_MODIS"]
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]

[#]
sets = ["cosp_histogram"]
case_id = "model_vs_model"
variables = ["COSP_HISTOGRAM_ISCCP"]
contour_levels = [0,0.5,1.0,1.5,2.0,2.5,3.0,3.5,4.0,4.5,5.0,5.5]
diff_levels = [-3.0,-2.5,-2.0,-1.5,-1.0,-0.5,0,0.5,1.0,1.5,2.0,2.5,3.0]
seasons = ["ANN", "DJF", "MAM", "JJA", "SON"]
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
# %%
tomvothecoder marked this conversation as resolved.
Show resolved Hide resolved
"""This script is based on the `cosp_histogram` section of
`tests/integration/all_sets.cfg`.
"""
# %%
import os
import sys

from e3sm_diags.parameter.core_parameter import CoreParameter
from e3sm_diags.run import runner

param = CoreParameter()

# %%
param.sets = ["cosp_histogram"]

param.test_name = "system tests"
param.short_test_name = "short_system tests"
param.ref_name = "MISRCOSP"
param.reference_name = "MISR COSP (2000-2009)"
param.reference_data_path = "tests/integration/integration_test_data"
param.ref_file = "CLDMISR_ERA-Interim_ANN_198001_201401_climo.nc"
param.test_data_path = "tests/integration/integration_test_data"
param.test_file = "CLD_MISR_20161118.beta0.FC5COSP.ne30_ne30.edison_ANN_climo.nc"

# param.backend = "mpl"
prefix = "/global/cfs/cdirs/e3sm/www/vo13/cdat-migration-test"
param.results_dir = os.path.join(prefix, "660-cosp-histogram", param.sets[0])
param.multiprocessing = False

# Make sure to save the NetCDF for metrics comparison.
param.save_netcdf = True

# %%
CFG_PATH = (
"/global/u2/v/vo13/E3SM-Project/e3sm_diags/auxiliary_tools/cdat_regression_testing/"
"660-cosp-histogram/660-cosp-histogram.cfg"
)
sys.argv.extend(["-d", CFG_PATH])
runner.run_diags([param])

# %%
Original file line number Diff line number Diff line change
@@ -0,0 +1,270 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# CDAT Migration Regression Testing Notebook (`.nc` files)\n",
"\n",
"This notebook is used to perform regression testing between the development and\n",
"production versions of a diagnostic set.\n",
"\n",
"## How it works\n",
"\n",
"It compares the relative differences (%) between ref and test variables between\n",
"the dev and `main` branches.\n",
"\n",
"## How to use\n",
"\n",
"PREREQUISITE: The diagnostic set's netCDF stored in `.json` files in two directories\n",
"(dev and `main` branches).\n",
"\n",
"1. Make a copy of this notebook under `auxiliary_tools/cdat_regression_testing/<DIR_NAME>`.\n",
"2. Run `mamba create -n cdat_regression_test -y -c conda-forge \"python<3.12\" xarray dask pandas matplotlib-base ipykernel`\n",
"3. Run `mamba activate cdat_regression_test`\n",
"4. Update `SET_DIR` and `SET_NAME` in the copy of your notebook.\n",
"5. Run all cells IN ORDER.\n",
"6. Review results for any outstanding differences (>=1e-5 relative tolerance).\n",
" - Debug these differences (e.g., bug in metrics functions, incorrect variable references, etc.)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup Code\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"from collections import defaultdict\n",
"import glob\n",
"\n",
"import numpy as np\n",
"import xarray as xr\n",
"\n",
"# TODO: Update SET_NAME and SET_DIR\n",
"SET_NAME = \"cosp_histogram\"\n",
"SET_DIR = \"660-cosp-histogram\"\n",
"\n",
"DEV_PATH = f\"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/{SET_DIR}/{SET_NAME}/**\"\n",
"MAIN_PATH = f\"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/{SET_NAME}/**\"\n",
"\n",
"DEV_GLOB = sorted(glob.glob(DEV_PATH + \"/*.nc\"))\n",
"MAIN_GLOB = sorted(glob.glob(MAIN_PATH + \"/*.nc\"))\n",
"\n",
"if len(DEV_GLOB) == 0 or len(MAIN_GLOB) == 0:\n",
" raise IOError(\"No files found at DEV_PATH and/or MAIN_PATH.\")\n",
"\n",
"if len(DEV_GLOB) != len(MAIN_GLOB):\n",
" raise IOError(\"Number of files do not match at DEV_PATH and MAIN_PATH.\")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"def _get_var_to_filepath_map():\n",
" var_to_file = defaultdict(lambda: defaultdict(dict))\n",
"\n",
" for dev_file, main_file in zip(DEV_GLOB, MAIN_GLOB):\n",
" # Example:\n",
" # \"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_test.nc\"\n",
" file_arr = dev_file.split(\"/\")\n",
"\n",
" # Example: \"test\"\n",
" data_type = dev_file.split(\"_\")[-1].split(\".nc\")[0]\n",
"\n",
" # Skip comparing `.nc` \"diff\" files because comparing relative diffs of\n",
" # does not make sense.\n",
" if data_type == \"test\" or data_type == \"ref\":\n",
" # Example: \"ISCCP\"\n",
" model = file_arr[-2].split(\"-\")[0]\n",
" season = \"JJA\" if \"JJA\" in dev_file else \"ANN\"\n",
"\n",
" var_to_file[model][data_type][season] = (dev_file, main_file)\n",
"\n",
" return var_to_file\n",
"\n",
"\n",
"def _get_relative_diffs(var_to_filepath):\n",
" # Absolute tolerance of 0 and relative tolerance of 1e-5.\n",
" # We are mainly focusing on relative tolerance here (in percentage terms).\n",
" atol = 0\n",
" rtol = 1e-5\n",
"\n",
" for model, data_types in var_to_filepath.items():\n",
" for _, seasons in data_types.items():\n",
" for _, filepaths in seasons.items():\n",
" print(\"Comparing:\")\n",
" print(filepaths[0], \"\\n\", filepaths[1])\n",
" ds1 = xr.open_dataset(filepaths[0])\n",
" ds2 = xr.open_dataset(filepaths[1])\n",
"\n",
" try:\n",
" var_key = f\"COSP_HISTOGRAM_{model}\"\n",
" np.testing.assert_allclose(\n",
" ds1[var_key].values,\n",
" ds2[var_key].values,\n",
" atol=atol,\n",
" rtol=rtol,\n",
" )\n",
" except AssertionError as e:\n",
" print(e)\n",
" else:\n",
" print(f\" * All close and within relative tolerance ({rtol})\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Compare the netCDF files between branches\n",
"\n",
"- Compare \"ref\" and \"test\" files\n",
"- \"diff\" files are ignored because getting relative diffs for these does not make sense (relative diff will be above tolerance)\n"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"var_to_filepaths = _get_var_to_filepath_map()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_ref.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_ref.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_ref.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_ref.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_test.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-ANN-global_test.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_test.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/ISCCP-COSP/ISCCPCOSP-COSP_HISTOGRAM_ISCCP-JJA-global_test.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_ref.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_ref.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_ref.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_ref.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_test.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-ANN-global_test.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_test.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MISR-COSP/MISRCOSP-COSP_HISTOGRAM_MISR-JJA-global_test.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_ref.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_ref.nc\n",
"\n",
"Not equal to tolerance rtol=1e-05, atol=0\n",
"\n",
"Mismatched elements: 42 / 42 (100%)\n",
"Max absolute difference: 4.23048367e-05\n",
"Max relative difference: 1.16682146e-05\n",
" x: array([[0.703907, 2.669376, 3.065526, 1.579834, 0.363847, 0.128541],\n",
" [0.147366, 1.152637, 3.67049 , 3.791006, 1.398453, 0.392103],\n",
" [0.07496 , 0.474791, 1.37002 , 1.705649, 0.786423, 0.346744],...\n",
" y: array([[0.703899, 2.669347, 3.065492, 1.579816, 0.363843, 0.12854 ],\n",
" [0.147364, 1.152624, 3.670448, 3.790965, 1.398438, 0.392099],\n",
" [0.074959, 0.474786, 1.370004, 1.705629, 0.786415, 0.34674 ],...\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_ref.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_ref.nc\n",
"\n",
"Not equal to tolerance rtol=1e-05, atol=0\n",
"\n",
"Mismatched elements: 42 / 42 (100%)\n",
"Max absolute difference: 4.91806181e-05\n",
"Max relative difference: 1.3272405e-05\n",
" x: array([[0.62896 , 2.657657, 3.206268, 1.704946, 0.398659, 0.169424],\n",
" [0.147569, 1.228835, 3.697387, 3.727142, 1.223123, 0.436504],\n",
" [0.072129, 0.508413, 1.167637, 1.412202, 0.638085, 0.362268],...\n",
" y: array([[0.628952, 2.657625, 3.206227, 1.704924, 0.398654, 0.169422],\n",
" [0.147567, 1.228819, 3.697338, 3.727093, 1.223107, 0.436498],\n",
" [0.072128, 0.508407, 1.167622, 1.412183, 0.638076, 0.362263],...\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_test.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-ANN-global_test.nc\n",
" * All close and within relative tolerance (1e-05)\n",
"Comparing:\n",
"/global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/660-cosp-histogram/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_test.nc \n",
" /global/cfs/projectdirs/e3sm/e3sm_diags_cdat_test/main/cosp_histogram/MODIS-COSP/MODISCOSP-COSP_HISTOGRAM_MODIS-JJA-global_test.nc\n",
" * All close and within relative tolerance (1e-05)\n"
]
}
],
"source": [
"_get_relative_diffs(var_to_filepaths)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Results\n",
"\n",
"- The relative tolerance of all files are 1e-05, which means things should be good to go.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "cdat_regression_test",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
from auxiliary_tools.cdat_regression_testing.base_run_script import run_set

SET_NAME = "cosp_histogram"
SET_DIR = "660-cosp-histogram"

run_set(SET_NAME, SET_DIR)
4 changes: 2 additions & 2 deletions auxiliary_tools/cdat_regression_testing/base_run_script.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,7 +44,7 @@ class MachinePaths(TypedDict):
tc_test: str


def run_set(set_name: str, set_dir: str, save_netcdf: bool):
def run_set(set_name: str, set_dir: str):
machine_paths: MachinePaths = _get_machine_paths()

param = CoreParameter()
Expand All @@ -62,7 +62,7 @@ def run_set(set_name: str, set_dir: str, save_netcdf: bool):
param.num_workers = 5

# Make sure to save the netCDF files to compare outputs.
param.save_netcdf = save_netcdf
param.save_netcdf = True

# Set specific parameters for new sets
enso_param = EnsoDiagsParameter()
Expand Down
Loading
Loading