-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add scripts to run containerized model outside of FRE #136
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,158 @@ | ||
# Author: Tom Robinson | ||
FROM intel/hpckit:2024.2.1-0-devel-ubuntu22.04 as builder | ||
############ Set up build environment ############ | ||
## Clone spack | ||
RUN mkdir -p /opt && cd /opt && git clone -b v0.22.2 https://github.com/spack/spack.git | ||
# What we want to install and how we want to install it | ||
# is specified in a manifest file (spack.yaml) | ||
RUN mkdir -p /opt/spack-environment && \ | ||
set -o noclobber \ | ||
&& (echo spack: \ | ||
&& echo ' mirrors:'\ | ||
&& echo ' E4S: https://cache.e4s.io/noaa'\ | ||
&& echo ' definitions:' \ | ||
&& echo ' - packages_builtin:' \ | ||
&& echo ' - bacio%oneapi@2024.2.1' \ | ||
&& echo ' - hdf5@1.14.3%oneapi@2024.2.1' \ | ||
&& echo ' - ip%oneapi@2024.2.1' \ | ||
&& echo ' - libyaml@0.2.5%oneapi@2024.2.1' \ | ||
&& echo ' - nccmp@1.9.1.0%oneapi@2024.2.1' \ | ||
&& echo ' - netcdf-c@4.9.2%oneapi@2024.2.1' \ | ||
&& echo ' - netcdf-fortran@4.6.1%oneapi@2024.2.1' \ | ||
&& echo ' - sp@2.3.3%oneapi@2024.2.1' \ | ||
&& echo ' - w3emc@2.11.0' \ | ||
&& echo ' - w3nco@2.4.1' \ | ||
&& echo ' - zlib%oneapi@2024.2.1' \ | ||
&& echo ' - zlib-ng@2.1.4%oneapi@2024.2.1' \ | ||
&& echo ' packages:' \ | ||
&& echo ' intel-oneapi-mpi:' \ | ||
&& echo ' buildable: false' \ | ||
&& echo ' externals:' \ | ||
&& echo ' - spec: intel-oneapi-mpi@2021.10.0' \ | ||
&& echo ' path: /opt/intel/oneapi/mpi/2021.10.0' \ | ||
&& echo ' mpi:' \ | ||
&& echo ' require: intel-oneapi-mpi' \ | ||
&& echo ' hdf5:' \ | ||
&& echo ' variants: +fortran+hl+szip' \ | ||
&& echo ' netcdf-c:' \ | ||
&& echo ' variants: +dap' \ | ||
&& echo ' pango:' \ | ||
&& echo ' variants: +X' \ | ||
&& echo ' all:' \ | ||
&& echo ' target: [x86_64]' \ | ||
&& echo ' providers:' \ | ||
&& echo ' zlib-api: [zlib-ng+compat, zlib]' \ | ||
&& echo ' compiler: [oneapi]' \ | ||
&& echo ' specs:' \ | ||
&& echo ' - matrix:' \ | ||
&& echo ' - [$packages_builtin]' \ | ||
&& echo ' concretizer:' \ | ||
&& echo ' unify: True' \ | ||
&& echo ' config:' \ | ||
&& echo ' install_tree: /opt/software' \ | ||
&& echo ' view: /opt/views/view') > /opt/spack-environment/spack.yaml | ||
# Install the software, remove unnecessary deps | ||
RUN . /opt/spack/share/spack/setup-env.sh && cd /opt/spack-environment && spack compiler add | ||
RUN . /opt/spack/share/spack/setup-env.sh && cd /opt/spack-environment && spack compiler add && spack --verbose env activate . && spack --verbose install --fail-fast && spack gc -y | ||
## Set environment variables | ||
ENV LD_LIBRARY_PATH /opt/views/view/lib:$LD_LIBRARY_PATH | ||
ENV LIBRARY_PATH /opt/views/view/lib:$LIBRARY_PATH | ||
ENV PATH $PATH:/opt/views/view/bin | ||
############ Set up build ############ | ||
## Set up code checkout | ||
RUN export GIT_TERMINAL_PROMPT=0 && \ | ||
mkdir -p /apps/mom6_sis2_generic_4p_compile_symm_yaml/src && \ | ||
cd /apps/mom6_sis2_generic_4p_compile_symm_yaml/src && \ | ||
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/FMS.git -b 2024.01.02 FMS && \ | ||
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/CEFI-regional-MOM6.git -b main mom6 && \ | ||
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/ice_param.git sis2 && \ | ||
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/land_null.git land_null && \ | ||
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/atmos_null.git atmos_null && \ | ||
git clone --recursive --jobs=4 https://github.com/NOAA-GFDL/FMScoupler.git coupler | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since the |
||
## Clone mkmf | ||
RUN cd /apps \ | ||
&& git clone --recursive https://github.com/NOAA-GFDL/mkmf \ | ||
&& cp mkmf/bin/* /usr/local/bin | ||
## Create the build directory | ||
RUN mkdir -p /apps/mom6_sis2_generic_4p_compile_symm_yaml/exec | ||
## Use mkmf to create the Makefiles for the componenets | ||
RUN bld_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \ | ||
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \ | ||
&& mkdir -p $bld_dir/FMS \ | ||
&& list_paths -l -o $bld_dir/FMS/pathnames_FMS $src_dir/FMS \ | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Switch to |
||
&& cd $bld_dir/FMS \ | ||
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libFMS.a -t $mkmf_template -c " -DINTERNAL_FILE_NML -g -Duse_libMPI -Duse_netCDF -Duse_yaml -DMAXFIELDMETHODS_=600" -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include $bld_dir/FMS/pathnames_FMS | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Again, please consider changing the FMS-related include folder to |
||
RUN bld_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \ | ||
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \ | ||
&& mkdir -p $bld_dir/mom6 \ | ||
&& list_paths -l -o $bld_dir/mom6/pathnames_mom6 $src_dir/mom6/src/MOM6/config_src/memory/dynamic_symmetric $src_dir/mom6/src/MOM6/config_src/drivers/FMS_cap $src_dir/mom6/src/MOM6/src/*/ $src_dir/mom6/src/MOM6/src/*/*/ $src_dir/mom6/src/MOM6/config_src/external/ODA_hooks $src_dir/mom6/src/MOM6/config_src/external/stochastic_physics $src_dir/mom6/src/MOM6/config_src/external/drifters $src_dir/mom6/src/MOM6/config_src/external/database_comms $src_dir/mom6/src/ocean_BGC/generic_tracers $src_dir/mom6/src/ocean_BGC/mocsy/src $src_dir/mom6/src/MOM6/pkg/GSW-Fortran/modules $src_dir/mom6/src/MOM6/pkg/GSW-Fortran/toolbox $src_dir/mom6/src/MOM6/config_src/infra/FMS2 \ | ||
&& cd $bld_dir/mom6 \ | ||
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libmom6.a -t $mkmf_template -c "-DINTERNAL_FILE_NML -g -DINTERNAL_FILE_NML -DMAX_FIELDS_=100 -DUSE_FMS2_IO -DNOT_SET_AFFINITY -D_USE_MOM6_DIAG -D_USE_GENERIC_TRACER -DUSE_PRECISION=2 " -o "-I$bld_dir/FMS " -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include -Imom6/src/MOM6/pkg/CVMix-src/include $bld_dir/mom6/pathnames_mom6 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Consider increasing |
||
RUN bld_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \ | ||
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \ | ||
&& mkdir -p $bld_dir/sis2 \ | ||
&& list_paths -l -o $bld_dir/sis2/pathnames_sis2 $src_dir/mom6/src/SIS2/config_src/dynamic_symmetric $src_dir/mom6/src/SIS2/config_src/external/Icepack_interfaces $src_dir/mom6/src/SIS2/src $src_dir/mom6/src/icebergs/src $src_dir/mom6/src/ice_param \ | ||
&& cd $bld_dir/sis2 \ | ||
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libsis2.a -t $mkmf_template -c "-DINTERNAL_FILE_NML -g -DUSE_FMS2_IO" -o "-I$bld_dir/FMS -I$bld_dir/mom6 " -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include -Imom6/src/MOM6/pkg/CVMix-src/include -Imom6/src/MOM6/src/framework $bld_dir/sis2/pathnames_sis2 | ||
RUN bld_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \ | ||
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \ | ||
&& mkdir -p $bld_dir/land_null \ | ||
&& list_paths -l -o $bld_dir/land_null/pathnames_land_null $src_dir/mom6/src/land_null \ | ||
&& cd $bld_dir/land_null \ | ||
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libland_null.a -t $mkmf_template -c "" -o "-I$bld_dir/FMS " $bld_dir/land_null/pathnames_land_null | ||
RUN bld_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \ | ||
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \ | ||
&& mkdir -p $bld_dir/atmos_null \ | ||
&& list_paths -l -o $bld_dir/atmos_null/pathnames_atmos_null $src_dir/mom6/src/atmos_null \ | ||
&& cd $bld_dir/atmos_null \ | ||
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libatmos_null.a -t $mkmf_template -c "-DINTERNAL_FILE_NML -g" -o "-I$bld_dir/FMS " -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include -Imom6/src/MOM6/pkg/CVMix-src/include $bld_dir/atmos_null/pathnames_atmos_null | ||
RUN bld_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& src_dir=/apps/mom6_sis2_generic_4p_compile_symm_yaml/src \ | ||
&& mkmf_template=/apps/mkmf/templates/hpcme-intel24.mk \ | ||
&& mkdir -p $bld_dir/coupler \ | ||
&& list_paths -l -o $bld_dir/coupler/pathnames_coupler $src_dir/mom6/src/coupler/shared $src_dir/mom6/src/coupler/full \ | ||
&& cd $bld_dir/coupler \ | ||
&& mkmf -m Makefile -a $src_dir -b $bld_dir -p libcoupler.a -t $mkmf_template -c "-DINTERNAL_FILE_NML -g -DUSE_FMS2_IO -D_USE_LEGACY_LAND_ -Duse_AM3_physics" -o "-I$bld_dir/FMS -I$bld_dir/mom6 -I$bld_dir/sis2 -I$bld_dir/land_null -I$bld_dir/atmos_null " -IFMS/fms2_io/include -IFMS/include -IFMS/mpp/include -Imom6/src/MOM6/pkg/CVMix-src/include $bld_dir/coupler/pathnames_coupler | ||
## Create the main Makefile | ||
RUN mkdir -p /apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& cd /apps/mom6_sis2_generic_4p_compile_symm_yaml/exec \ | ||
&& set -o noclobber \ | ||
&& (echo SRCROOT = /apps/mom6_sis2_generic_4p_compile_symm_yaml/src/ \ | ||
&& echo 'BUILDROOT = /apps/mom6_sis2_generic_4p_compile_symm_yaml/exec/' \ | ||
&& echo 'MK_TEMPLATE = /apps/mkmf/templates/hpcme-intel24.mk' \ | ||
&& echo 'include $(MK_TEMPLATE)' \ | ||
&& echo 'mom6_sis2_generic_4p_compile_symm_yaml.x: coupler/libcoupler.a sis2/libsis2.a mom6/libmom6.a land_null/libland_null.a atmos_null/libatmos_null.a FMS/libFMS.a ' \ | ||
&& echo '\t$(LD) $^ $(LDFLAGS) -o $@ $(STATIC_LIBS)' \ | ||
&& echo 'coupler/libcoupler.a: FMS/libFMS.a mom6/libmom6.a sis2/libsis2.a land_null/libland_null.a atmos_null/libatmos_null.a FORCE' \ | ||
&& echo '\t$(MAKE) SRCROOT=$(SRCROOT) BUILDROOT=$(BUILDROOT) MK_TEMPLATE=$(MK_TEMPLATE) --directory=coupler $(@F)' \ | ||
&& echo 'sis2/libsis2.a: FMS/libFMS.a mom6/libmom6.a FORCE' \ | ||
&& echo '\t$(MAKE) SRCROOT=$(SRCROOT) BUILDROOT=$(BUILDROOT) MK_TEMPLATE=$(MK_TEMPLATE) --directory=sis2 $(@F)' \ | ||
&& echo 'mom6/libmom6.a: FMS/libFMS.a FORCE' \ | ||
&& echo '\t$(MAKE) SRCROOT=$(SRCROOT) BUILDROOT=$(BUILDROOT) MK_TEMPLATE=$(MK_TEMPLATE) --directory=mom6 $(@F)' \ | ||
&& echo 'land_null/libland_null.a: FMS/libFMS.a FORCE' \ | ||
&& echo '\t$(MAKE) SRCROOT=$(SRCROOT) BUILDROOT=$(BUILDROOT) MK_TEMPLATE=$(MK_TEMPLATE) --directory=land_null $(@F)' \ | ||
&& echo 'atmos_null/libatmos_null.a: FMS/libFMS.a FORCE' \ | ||
&& echo '\t$(MAKE) SRCROOT=$(SRCROOT) BUILDROOT=$(BUILDROOT) MK_TEMPLATE=$(MK_TEMPLATE) --directory=atmos_null $(@F)' \ | ||
&& echo 'FMS/libFMS.a: FORCE' \ | ||
&& echo '\t$(MAKE) SRCROOT=$(SRCROOT) BUILDROOT=$(BUILDROOT) MK_TEMPLATE=$(MK_TEMPLATE) --directory=FMS $(@F)' \ | ||
&& echo 'FORCE:' \ | ||
&& echo '') > /apps/mom6_sis2_generic_4p_compile_symm_yaml/exec/Makefile | ||
## Use make to build the model | ||
RUN cd /apps/mom6_sis2_generic_4p_compile_symm_yaml/exec && make -j 4 PROD=on | ||
############ Create the final stage of the container build ############ | ||
FROM intel/oneapi-runtime:2024.2.1-0-devel-ubuntu22.04 as final | ||
## copy libs and executable from builder | ||
COPY --from=builder /opt/software /opt/software | ||
COPY --from=builder /opt/views /opt/views | ||
# This will include the code and all of the build from the builder stage | ||
COPY --from=builder /apps/mom6_sis2_generic_4p_compile_symm_yaml /apps/mom6_sis2_generic_4p_compile_symm_yaml | ||
## Set up the run time environment | ||
ENV LD_LIBRARY_PATH /opt/views/view/lib:$LD_LIBRARY_PATH | ||
ENV LIBRARY_PATH /opt/views/view/lib:$LIBRARY_PATH | ||
ENV PATH /opt/views/view/bin:/apps/mom6_sis2_generic_4p_compile_symm_yaml/exec:$PATH | ||
ENTRYPOINT ["/bin/bash"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
# Running Containerized Model Outside of FRE | ||
|
||
## Introduction | ||
|
||
This directory contains scripts to run the `CEFI_NWA12_COBALT_V1` experiment in a container outside of the `FRE` workflow. Since these scripts do not benefit from the years of development that have gone into `FRE`, it lacks several features and makes several assumptions: | ||
|
||
1.) You will have to stage all the necessary input files to the `INPUT/` directory yourself, using the same naming scheme as `CEFI_NWA12_cobalt.xml`. All input files are available on gaea, and the `run_model.sh` script will stage annual `ERA5` and `GloFAS` runoff forcings for you if you provide a path to a directory where these files are located. You will have to manually move the other files your self. If on gaea, or a system with access to gaea, you can stage the necesarray in puts with the following commands: | ||
``` | ||
cp /gpfs/f5/cefi/scratch/Utheri.Wagura/DockerfileTest/INPUT ./INPUT | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I also make a copy of those inputs under global-shared folder: |
||
ln -s INPUT/ocean_topog.nc INPUT/topog.nc | ||
``` | ||
|
||
2.) Model output will not be staged to another system at the end of each model year. When a year of simulation is complete, `run_model.sh` will simply tar together all output files and move them to a folder in `OUTPUT` named after the start date of that particular run. If the `mppnccombine` tool is available in your `path`, `run_model.sh` will try to combine outputs from different ranks before tarring the files togeter. | ||
|
||
## Requirements | ||
This workflow uses a containerized environment to compile and run the model. As such, to run the workflow as-is, you will need access to a system that has both `podman` and `apptainer/singularity` available. Since `podman` is a drop in replacement for `docker`, the `compile_model.sh` script should still work if you replace `podman` with `docker`, though this has not been tested. Similarly, if you do not have access to `apptainer/singularity`, you may be able to skip the `apptainer build` step and replace `apptainer exec` with `docker exec` in `run_model.sh`, although this has not been tested either. | ||
|
||
Running the model requires either `Intel MPI` or `MPICH`. The environment file in `/envs/container-gaea.env` uses `lmod` to load `MPICH` for you and sets environment variables that `apptainer` needs. If running on a system other than gaea, you will need to create an environment file that sets the relevant variables and loads the relevant `MPI` implementation for you. | ||
|
||
**IMPORTANT**: If running on a system other than gaea, be sure to edit the variables `era5_dir`, `glofas_dir`, and `env_file` to point to direcotories containing annual `ERA5` and `GloFAS` runoff data, as well as your environment file, before running the workflow. | ||
**IMPORTANT**: We have encountered some issues building the container on some gaea nodes. So far, the model had compiled successfully on the following nodes: | ||
``` | ||
gaea56 | ||
``` | ||
Please use one of these nodes to create the container until all compilation issues are resolved | ||
|
||
## Running the workflow. | ||
|
||
After staging all files to the `INPUT` directory, compile the model by calling `compile_model.sh`: | ||
``` | ||
./compile_model.sh | ||
``` | ||
This will create a `CEFI_NWA12_COBALT_V1.sif` file containing the model executable. Note that this process can take about an hour to complete. Once you have the `.sif` file, and have set all of the relevant variables at the top of the `run_model.sh` script, run the following command to run the model for `n` years: | ||
``` | ||
sbatch ./run_model.sh n | ||
``` | ||
The model output from each siumulation year will be available in `OUTPUT/YYYY0101` for year `YYYY`, while the `stdout` and `stderr` from the run will be available in `OUTPUT/stdout`. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
#!/bin/bash | ||
podman build -f Dockerfile -t mom6_sis2_generic_4p_compile_symm_yaml:prod | ||
rm -f mom6_sis2_generic_4p_compile_symm_yaml.tar mom6_sis2_generic_4p_compile_symm_yaml.sif | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do we really need this line? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was copied over from the |
||
podman save -o mom6_sis2_generic_4p_compile_symm_yaml-prod.tar localhost/mom6_sis2_generic_4p_compile_symm_yaml:prod | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I really like Podman—it's a great alternative to Docker, especially for Linux and HPC users. My concern is that regular users may simply want a Dockerfile or Singularity definition file to easily build an image for running the model. I don't have any issues with the run script, but perhaps we could consider adding a Singularity definition file as well, so users can just run There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. I can work on creating a singularity definition file as well to provide this capability |
||
apptainer build --disable-cache CEFI_NWA12_COBALT_V1.sif docker-archive://mom6_sis2_generic_4p_compile_symm_yaml-prod.tar |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated line