Heads up: future versions of spack-stack (1.7.0+) will have shared+static ESMF and MAPL libraries #2094

Open
climbfuji opened this issue Jan 16, 2024 · 9 comments
Labels
enhancement New feature or request

Comments

@climbfuji
Collaborator

Description

This is a heads up that future versions of spack-stack (1.7.0+; 1.7.0 to be released in March 2024) will have both shared and static esmf and mapl libraries. Until now, and including spack-stack-1.6.0, we turned off the shared builds of esmf and mapl because of the following reason:

If esmf is static, then mapl needs to be static. If esmf is shared, then mapl needs to be shared. Violate these requirements and you'll be hit with double-free corruption segmentation faults when an executable unwinds its stack upon exit (see JCSDA/spack#372 and esmf-org/esmf#209 for more information). Historically, the UFS wants everything to be static, so this was the configuration in spack-stack until now.

spack-stack, however, needs to support more than just the UFS, and other modeling systems such as GEOS rely on dynamic rather than static libraries. There is no reason why we should restrict the esmf build to static only because of the UFS. If the UFS requires a static esmf, but its build system uses the shared version when both are available, then the UFS needs to be fixed (rather than spack-stack being trimmed down to a static esmf). Similar story for mapl.

Solution

The UFS build system needs to be updated so that it uses/prefers the static versions of esmf and mapl if found.
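
One way such a preference could be expressed in a find module is sketched below. This is only an illustration, not the actual UFS or ESMF code: the option `PREFER_STATIC_ESMF` and the target name `sketch::esmf` are hypothetical, `ESMF_LIBSDIR` is assumed to already hold the directory containing the ESMF libraries, and Unix-style library names (`libesmf.a`) are assumed.

```cmake
# Fragment for a CMakeLists.txt or find module; not the actual UFS or ESMF code.
# PREFER_STATIC_ESMF and the target name sketch::esmf are hypothetical.
option(PREFER_STATIC_ESMF "Look for libesmf.a before any shared ESMF library" ON)

# ESMF_LIBSDIR is assumed to hold the directory that contains the ESMF libraries.
if(PREFER_STATIC_ESMF)
  # find_library() tries NAMES in order, searching every directory for the first
  # name before moving to the next, so libesmf.a wins when both variants exist.
  find_library(ESMF_LIBRARY NAMES libesmf.a esmf HINTS "${ESMF_LIBSDIR}")
else()
  find_library(ESMF_LIBRARY NAMES esmf HINTS "${ESMF_LIBSDIR}")
endif()

if(NOT ESMF_LIBRARY)
  message(FATAL_ERROR "ESMF library not found under '${ESMF_LIBSDIR}'")
endif()

# Record the variant that was found so dependents can query the target's TYPE.
if(ESMF_LIBRARY MATCHES "\\.a$")
  add_library(sketch::esmf STATIC IMPORTED)
else()
  add_library(sketch::esmf SHARED IMPORTED)
endif()
set_target_properties(sketch::esmf PROPERTIES IMPORTED_LOCATION "${ESMF_LIBRARY}")
```

Creating the imported target with an explicit STATIC or SHARED type, instead of leaving it untyped, lets dependent projects check which variant was picked up.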

Alternatives

Not fixing the UFS build system means being unable to update to spack-stack-1.7.0 and later versions.

Related to

@mathomp4

Note: I'm going to look at fixing up MAPL to be "safe". That is, if ESMF is only static, or if a user chooses to link to the ESMF static library, then MAPL is forced to be built as static (and vice versa).

Of course, with spack this is fairly trivial with requirements and I'll look to make a PR to the MAPL package to effect this. But I sort of want to belt and suspender it so CMake has no choice.
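
A CMake-side guard of this kind might look roughly like the following. It is a sketch under stated assumptions, not MAPL's actual implementation: it assumes an imported target named `ESMF::ESMF` already exists and was created with an explicit STATIC or SHARED type, and the `_esmf_type` and `_match_esmf_shared` variable names are made up for illustration.

```cmake
# Fragment for a dependent project such as MAPL; a sketch, not MAPL's actual code.
# ESMF::ESMF is assumed to be an existing imported target created with an
# explicit STATIC or SHARED type.
if(NOT TARGET ESMF::ESMF)
  message(FATAL_ERROR "Expected an imported target ESMF::ESMF")
endif()

# TYPE is a read-only target property, e.g. STATIC_LIBRARY or SHARED_LIBRARY.
get_target_property(_esmf_type ESMF::ESMF TYPE)

if(_esmf_type STREQUAL "STATIC_LIBRARY")
  set(_match_esmf_shared OFF)
elseif(_esmf_type STREQUAL "SHARED_LIBRARY")
  set(_match_esmf_shared ON)
else()
  message(FATAL_ERROR "Cannot tell whether ESMF is static or shared (TYPE=${_esmf_type})")
endif()

# Warn when the user's request conflicts with the ESMF variant that was found.
if(DEFINED BUILD_SHARED_LIBS)
  if((BUILD_SHARED_LIBS AND NOT _match_esmf_shared) OR
     (NOT BUILD_SHARED_LIBS AND _match_esmf_shared))
    message(WARNING "Overriding BUILD_SHARED_LIBS to match ESMF (${_esmf_type})")
  endif()
endif()

# add_library() calls without an explicit type follow BUILD_SHARED_LIBS.
set(BUILD_SHARED_LIBS ${_match_esmf_shared})
```

Whether to override the user's setting with a warning, as here, or to stop with a fatal error is a design choice; the sketch overrides because the comment above speaks of forcing the build type.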

@jkbk2004
Collaborator

jkbk2004 commented Feb 8, 2024

@climbfuji I am wondering about the scope of the CMake build system update that is needed. I mean something like a good starting point to add options like add_library(objlib OBJECT ${libsrc}) as STATIC or SHARED? @DusanJovic-NOAA @junwang-noaa @BrianCurtis-NOAA I think there should be a long enough period set aside for testing.

@DusanJovic-NOAA
Collaborator

@jkbk2004 I do not understand what you mean by "good starting point to add options like add_library(objlib OBJECT ${libsrc}) as STATIC or SHARED". Where should we add this? Can you please be a little more specific? Thanks.

@jkbk2004
Collaborator

jkbk2004 commented Feb 8, 2024

@DusanJovic-NOAA If I search for STATIC build options, I get:

./CMEPS-interface/CMakeLists.txt:add_library(cmeps STATIC ${_ufs_util_files} ${_mediator_files} ${SCHEMES} ${CAPS} ${API})
./WW3/cmake/FindSCOTCH.cmake:add_library(PTSCOTCHparmetis::PTSCOTCHparmetis STATIC IMPORTED)
./WW3/cmake/FindSCOTCH.cmake:add_library(SCOTCH::SCOTCH STATIC IMPORTED)
./WW3/cmake/FindSCOTCH.cmake:add_library(PTSCOTCH::PTSCOTCH STATIC IMPORTED)
./WW3/cmake/FindSCOTCH.cmake:add_library(SCOTCHerr::SCOTCHerr STATIC IMPORTED)
./WW3/cmake/FindSCOTCH.cmake:add_library(PTSCOTCHerr::PTSCOTCHerr STATIC IMPORTED)
./WW3/cmake/FindParMETIS.cmake:add_library(ParMETIS::ParMETIS STATIC IMPORTED)
./WW3/cmake/FindOASIS.cmake:add_library(PSMILE::PSMILE STATIC IMPORTED)
./WW3/cmake/FindOASIS.cmake:add_library(MCT::MCT STATIC IMPORTED)
./WW3/cmake/FindOASIS.cmake:add_library(MPEU::MPEU STATIC IMPORTED)
./WW3/cmake/FindOASIS.cmake:add_library(SCRIP::SCRIP STATIC IMPORTED)
./WW3/cmake/FindMETIS.cmake:add_library(METIS::METIS STATIC IMPORTED)
./WW3/model/src/CMakeLists.txt:add_library(ww3_lib STATIC ${ftn_src} ${switch_files})
./AQM/CMakeLists.txt:add_library(aqm STATIC ${aqm_files} $<TARGET_OBJECTS:shr>
./NOAHMP-interface/noahmp/CMakeLists.txt:add_library(noahmp STATIC ${_noahmp_cap_files} ${_noahmp_ccpp_files} ${_noahmp_files})
./NOAHMP-interface/noahmp/CMakeLists.txt:file(APPEND ${CMAKE_INSTALL_PREFIX}/lib/cmake/noahmp-esmx.cmake "add_library(noahmp STATIC IMPORTED)\n")
./NOAHMP-interface/noahmp/cmake/FindESMF.cmake:    add_library(ESMF STATIC IMPORTED)
./NOAHMP-interface/CMakeLists.txt:add_library(noahmp STATIC ${_noahmp_cap_files} ${_noahmp_ccpp_files} ${_noahmp_files})
./CMakeModules/Modules/FindPIO.cmake:    add_library(PIO_C_STATIC INTERFACE IMPORTED)
./CMakeModules/Modules/FindPIO.cmake:    add_library(PIO_Fortran_STATIC INTERFACE IMPORTED)
./CMakeModules/Modules/FindPIO.cmake:      add_library(PIO::PIO_C ALIAS PIO_C_STATIC)
./CMakeModules/Modules/FindPIO.cmake:      add_library(PIO::PIO_Fortran ALIAS PIO_Fortran_STATIC)
./CMakeModules/Modules/FindPIO.cmake:        add_library(PIO::PIO_C ALIAS PIO_C_STATIC)
./CMakeModules/Modules/FindPIO.cmake:      add_library(PIO::PIO_Fortran ALIAS PIO_Fortran_STATIC)
./CDEPS-interface/CMakeLists.txt:add_library(cdeps STATIC $<TARGET_OBJECTS:share>
./HYCOM-interface/CMakeLists.txt:add_library(hycom STATIC $<TARGET_OBJECTS:hycom_obj>
./MOM6-interface/CMakeLists.txt:add_library(mom6 STATIC $<TARGET_OBJECTS:mom6_obj>
./MOM6-interface/MOM6/pkg/CVMix-src/CMakeLists.txt:        add_library( cvmix_static STATIC ${CMAKE_SOURCE_DIR}/src/dummy.F90 $<TARGET_OBJECTS:cvmix_objects> )
./MOM6-interface/MOM6/pkg/CVMix-src/CMakeLists.txt:      add_library( cvmix_static STATIC $<TARGET_OBJECTS:cvmix_objects> )
./MOM6-interface/MOM6/pkg/CVMix-src/src/CMakeLists.txt:   add_library(cvmix_io STATIC .)
./MOM6-interface/MOM6/pkg/CVMix-src/src/CMakeLists.txt:   add_library(cvmix_drivers STATIC .)
./CICE-interface/CMakeLists.txt:add_library(cice STATIC ${lib_src_files})
./FV3/ccpp/framework/stub/CMakeLists.txt:add_library(ccppstub STATIC ${SCHEMES} ${CAPS} ${API})
./FV3/ccpp/framework/src/CMakeLists.txt:add_library(ccpp_framework STATIC ${SOURCES_F90})
./FV3/ccpp/physics/CMakeLists.txt:add_library(ccpp_physics STATIC ${SCHEMES} ${SCHEMES_OPENMP_OFF} ${SCHEMES_DYNAMICS} ${CAPS})
./FV3/upp/sorc/ncep_post.fd/CMakeLists.txt:add_library(${LIBNAME} STATIC ${LIB_SRC})

@DusanJovic-NOAA
Collaborator

Those are all either definitions of imported targets in various find modules or local libraries. Are you suggesting that we change all local libraries (like ccpp_framework, cice, or hycom) to shared libraries? Why would we do that, and what problem would that solve? If I understand correctly, this issue is about adding shared esmf and mapl libraries, in addition to the static ones, to the spack-stack installation. The ufs-weather-model already links with many shared libraries from spack-stack, for example:

$ ldd ufs_model | grep spack-stack
        libpng16.so.16 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/libpng-1.6.37-ggghinw/lib64/libpng16.so.16 (0x00007fd9bae33000)
        libz.so.1 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/zlib-1.2.13-t5sfdu6/lib/libz.so.1 (0x00007fd9bac13000)
        libjasper.so.4 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/jasper-2.0.32-cscsjum/lib64/libjasper.so.4 (0x00007fd9ba99b000)
        libjpeg.so.62 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/libjpeg-turbo-2.1.0-ig7rqhk/lib64/libjpeg.so.62 (0x00007fd9ba6da000)
        libnetcdf.so.19 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/netcdf-c-4.9.2-pc2lepn/lib/libnetcdf.so.19 (0x00007fd9ba047000)
        libnetcdff.so.7 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/netcdf-fortran-4.6.0-wplqtff/lib/libnetcdff.so.7 (0x00007fd9b9b9f000)
        libpioc.so => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/parallelio-2.5.10-zy4qvzy/lib/libpioc.so (0x00007fd9b963b000)
        libpiof.so => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/parallelio-2.5.10-zy4qvzy/lib/libpiof.so (0x00007fd9b8faf000)
        libpnetcdf.so.4 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/parallel-netcdf-1.12.2-dcxai5a/lib/libpnetcdf.so.4 (0x00007fd9b846b000)
        libhdf5_fortran.so.310 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/hdf5-1.14.0-wd7qdru/lib/libhdf5_fortran.so.310 (0x00007fd9b8227000)
        libhdf5_hl.so.310 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/hdf5-1.14.0-wd7qdru/lib/libhdf5_hl.so.310 (0x00007fd9b299f000)
        libhdf5.so.310 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/hdf5-1.14.0-wd7qdru/lib/libhdf5.so.310 (0x00007fd9b2207000)
        libsz.so.2 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/libaec-1.0.6-2arommd/lib64/libsz.so.2 (0x00007fd9b1ff6000)
        libzstd.so.1 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/zstd-1.5.2-ljcxre6/lib/libzstd.so.1 (0x00007fd9b19f0000)
        libblosc.so.1 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/c-blosc-1.21.4-kxgrd3c/lib64/libblosc.so.1 (0x00007fd9b17d7000)
        libxml2.so.2 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/libxml2-2.10.3-w7rqgkh/lib/libxml2.so.2 (0x00007fd9b13a0000)
        libhdf5_f90cstub.so.310 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/hdf5-1.14.0-wd7qdru/lib/../lib/libhdf5_f90cstub.so.310 (0x00007fd9b06b6000)
        liblz4.so.1 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/lz4-1.9.4-murstsd/lib/liblz4.so.1 (0x00007fd9b047c000)
        libiconv.so.2 => /scratch1/NCEPDEV/nems/role.epic/spack-stack/spack-stack-1.5.1/envs/unified-env/install/intel/2021.5.0/libiconv-1.17-gegnywp/lib/libiconv.so.2 (0x00007fd9afef4000)

Once a spack-stack installation with both static and shared versions of esmf and mapl becomes available, we can try to use it. If there are issues in finding consistent versions of esmf and mapl, we will just need to fix the corresponding find modules, or the config-file packages if either of these two libraries provides one. I do not see how this situation differs from, for example, hdf5 and netcdf, both of which provide static and shared libraries.

@climbfuji
Collaborator Author

I agree with Dusan, this is the best way to proceed. We can make a test stack available on Hercules/Orion pretty soon with both shared and static ESMF and MAPL and a few other features that have been requested for the next release.

@jkbk2004
Collaborator

jkbk2004 commented Feb 9, 2024

@DusanJovic-NOAA @climbfuji thanks for all the comments!

@BrianCurtis-NOAA
Collaborator

@Hang-Lei-NOAA was this a potential cause of the problems you're seeing with these two libraries on WCOSS2?

@Hang-Lei-NOAA

Hang-Lei-NOAA commented Sep 23, 2024

No, I don't think so. The issue on WCOSS2 still involves the old versions of the libraries, and it only changed results; it is not a failure. It could be caused by using a modified netcdf.
