Commit

Merge pull request #5339 from ye-luo/revise-cmake-default-for-amd
Revise cmake option default values for AMD GPUs
prckent authored Feb 24, 2025
2 parents 6c0b0f2 + 78b35c7 commit a56e5c8
Showing 2 changed files with 29 additions and 13 deletions.
31 changes: 18 additions & 13 deletions CMakeLists.txt
@@ -800,7 +800,10 @@ if(ENABLE_ROCM)
target_link_libraries(ROCM::libraries INTERFACE roc::rocsolver roc::rocrand)
endif()
message("Project HIP_FLAGS: ${CMAKE_HIP_FLAGS}")
option(QMC_DISABLE_HIP_HOST_REGISTER "Disable hipHostRegister for pinning host memory" ON)
option(QMC_DISABLE_HIP_HOST_REGISTER "Disable hipHostRegister for pinning host memory" OFF)
if(QMC_DISABLE_HIP_HOST_REGISTER)
message(STATUS "Use of hipHostRegister disabled")
endif()
endif(ENABLE_ROCM)

if(USE_NVTX_API AND QMC_CUDA2HIP)
@@ -869,22 +872,24 @@ if(ENABLE_SYCL)
endif()
endif(ENABLE_SYCL)

#--------------------------------------------------------------------
# Resolve Vendor(CUDA/HIP/SYCL) and OpenMP runtime incompatibilities
#--------------------------------------------------------------------
#-----------------------------------------------------
# Resolve Vendor and OpenMP runtime incompatibilities
#-----------------------------------------------------
# Some OpenMP offload runtime libraries have composibility issue with vendor native ones.
# A workaround is making the vendor native runtime responsible for memory allocations and OpenMP associate/disassocate them.
set(QMC_OFFLOAD_MEM_ASSOCIATED_DEFAULT OFF)
if(ENABLE_OFFLOAD)
# CUDA/HIP supported, not SYCL
if(${COMPILER} MATCHES "Clang" AND QMC_CUDA2HIP AND ENABLE_OFFLOAD)
# Known issue HIP<5.5 https://github.com/ROCm/aomp/issues/253
message("check ${COMPILER} ${QMC_CUDA2HIP} ${hip_VERSION}")
if(${COMPILER} MATCHES "Clang" AND QMC_CUDA2HIP AND hip_VERSION VERSION_LESS "5.5")
set(QMC_OFFLOAD_MEM_ASSOCIATED_DEFAULT ON)
endif()
# Known performance issue remains in 6.3.3
set(QMC_OFFLOAD_MEM_ASSOCIATED_DEFAULT ON)
else()
set(QMC_OFFLOAD_MEM_ASSOCIATED_DEFAULT OFF)
endif()
cmake_dependent_option(QMC_OFFLOAD_MEM_ASSOCIATED "Use omp_target_associate_ptr instead of direct OpenMP offload maps in dual-space allocators"
${QMC_OFFLOAD_MEM_ASSOCIATED_DEFAULT} "ENABLE_OFFLOAD;ENABLE_CUDA;NOT ENABLE_SYCL" OFF)
if(QMC_OFFLOAD_MEM_ASSOCIATED)
message(STATUS "Use omp_target_associate_ptr instead of direct OpenMP offload maps in dual-space allocators")
endif()
cmake_dependent_option(QMC_OFFLOAD_MEM_ASSOCIATED "Manage OpenMP memory allocations via the vendor runtime"
${QMC_OFFLOAD_MEM_ASSOCIATED_DEFAULT} "ENABLE_OFFLOAD;ENABLE_CUDA" OFF)


#-------------------------------------------------------------------
# set up VTune ittnotify library
11 changes: 11 additions & 0 deletions docs/installation.rst
@@ -350,6 +350,17 @@ the path to the source directory.
USE_OBJECT_TARGET ON/OFF(default). Use CMake object library targets to workaround linker not being able to handle hybrid
binary archives which contain both host and device codes.

- Expert performance fine tuning options

::

QMC_OFFLOAD_MEM_ASSOCIATED ON/OFF. ON by default only when using both OpenMP offload and HIP
programming models and the host compiler is Clang based.
Use omp_target_associate_ptr instead of direct OpenMP offload maps in dual-space allocators.
Allocate device memory using vendor runtimes instead of the OpenMP runtime.
QMC_DISABLE_HIP_HOST_REGISTER ON/OFF(default). If ON, make all the use of hipHostRegister/Unregister
as no-op, namely disabling all the use of pinned memory.

- BLAS/LAPACK related

::
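For reference, the two options touched by this commit can be set explicitly instead of relying on the revised defaults. The following is a minimal sketch of a CMake initial-cache script, assuming a HIP build with OpenMP offload in which ENABLE_OFFLOAD, ENABLE_CUDA and QMC_CUDA2HIP are already enabled. The file name amd_gpu_cache.cmake is hypothetical, and the values are illustrative rather than recommendations taken from the commit.

```cmake
# amd_gpu_cache.cmake (hypothetical file name)
# Load it at configure time with: cmake -C amd_gpu_cache.cmake <path to source directory>
#
# Assumption: ENABLE_OFFLOAD and ENABLE_CUDA are ON for this build. The
# cmake_dependent_option() call in the diff lists both as conditions, so
# QMC_OFFLOAD_MEM_ASSOCIATED falls back to OFF when either one is missing.

# Pin the allocator behavior explicitly rather than relying on the
# compiler-dependent default computed in CMakeLists.txt.
set(QMC_OFFLOAD_MEM_ASSOCIATED ON CACHE BOOL
    "Use omp_target_associate_ptr instead of direct OpenMP offload maps")

# OFF is the documented default after this commit, which keeps hipHostRegister
# active. Set this to ON to make hipHostRegister/Unregister a no-op.
set(QMC_DISABLE_HIP_HOST_REGISTER OFF CACHE BOOL
    "Disable hipHostRegister for pinning host memory")
```

Because option() and cmake_dependent_option() respect values already present in the cache, entries preloaded this way should take precedence over the defaults shown in the diff, provided the dependency conditions hold.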
