Add UCC support #175

Merged
66 commits
acc3c89
Added pmix (internal) and libevent/hwloc/ucc (external) dependencies;…
j34ni Sep 19, 2024
83a67f4
Fixed POST_LINK tests
j34ni Sep 19, 2024
18799e0
Update run_test.sh
j34ni Sep 20, 2024
82a76b1
Removed configure option --enable-mca-dso to ensure compatibility wit…
j34ni Sep 21, 2024
bde2bf2
Update meta.yaml
j34ni Sep 21, 2024
968cde7
After conda smithy recipe-lint --conda-forge
j34ni Sep 21, 2024
9806f8a
Build with
j34ni Sep 21, 2024
4bd2922
Update meta.yaml
j34ni Sep 22, 2024
296d267
Update meta.yaml
j34ni Sep 22, 2024
f747a34
Update meta.yaml
j34ni Sep 22, 2024
46878f6
Update meta.yaml
j34ni Sep 22, 2024
0512def
Update meta.yaml
j34ni Sep 22, 2024
840530b
Update meta.yaml
j34ni Sep 22, 2024
acb11de
Update meta.yaml
j34ni Sep 22, 2024
554f11b
Update meta.yaml
j34ni Sep 22, 2024
398f332
Update meta.yaml
j34ni Sep 22, 2024
4917ead
Update run_test.sh
j34ni Sep 22, 2024
5ef282b
Fixed typos with linux-64 and osx-64
j34ni Sep 22, 2024
01691cc
Update meta.yaml
j34ni Sep 22, 2024
c0168c5
Update meta.yaml
j34ni Sep 22, 2024
fbc426c
Still issues with ppc64le
j34ni Sep 22, 2024
9a4e097
Update meta.yaml
j34ni Sep 22, 2024
0e17698
Update build-mpi.sh
j34ni Sep 22, 2024
9d7ace8
Update build-mpi.sh
j34ni Sep 22, 2024
e888233
Update build-mpi.sh
j34ni Sep 22, 2024
bdf7b05
Update build-mpi.sh
j34ni Sep 22, 2024
cff4502
Update run_test.sh
j34ni Sep 22, 2024
3f27611
Update run_test.sh
j34ni Sep 22, 2024
533399b
Update run_test.sh
j34ni Sep 22, 2024
130649f
Update run_test.sh
j34ni Sep 22, 2024
3475244
--enable-debug
j34ni Sep 23, 2024
c7c9572
fixed typos
j34ni Sep 23, 2024
4c6a12d
back to 9a4e097
j34ni Sep 23, 2024
9539371
More consistent requirements for build/host/run/test
j34ni Sep 23, 2024
808b63f
Introduced
j34ni Sep 23, 2024
2e57c79
With DSOs
j34ni Sep 23, 2024
d5393e3
Force rebuild after Connection broken: IncompleteRead
j34ni Sep 23, 2024
fdcec31
Adjusted test on ompi_info to platform
j34ni Sep 23, 2024
e7d473e
Platform name is linux-ppc64le (not ppc64le)
j34ni Sep 23, 2024
afcbd68
Reinstated the original tests
j34ni Sep 23, 2024
6eab69d
Deactivated tests
j34ni Sep 24, 2024
6f0983e
Testing on ompi_info rather than conda list
j34ni Sep 24, 2024
0525930
Build for ppc64le as it was originally
j34ni Sep 24, 2024
19148dd
Replaced simple quote by double quote in configure
j34ni Sep 24, 2024
d10a926
Dependencies [linux and not linux-ppc64]
j34ni Sep 24, 2024
0cd8996
Dependencies [linux and not linux-ppc64]
j34ni Sep 24, 2024
fcc69e0
Fixed tab
j34ni Sep 24, 2024
35a3de3
Restored [linux and not ppc64]
j34ni Sep 24, 2024
8210e3d
Restored simple quote in configure
j34ni Sep 24, 2024
b4f4ef6
Removed debugging stuff
j34ni Sep 24, 2024
9969ff6
Update recipe/build-mpi.sh
j34ni Sep 25, 2024
cc5ae51
Update recipe/build-mpi.sh
j34ni Sep 25, 2024
ebc95d7
Update recipe/meta.yaml
j34ni Sep 25, 2024
959ba00
Update recipe/meta.yaml
j34ni Sep 25, 2024
5f86ed3
Removed POST_LINK test for UCX and simplified deps
j34ni Sep 26, 2024
672ed2d
MNT: Re-rendered with conda-build 24.7.1, conda-smithy 3.40.1, and co…
j34ni Sep 26, 2024
cc7111d
Update build-mpi.sh
j34ni Sep 26, 2024
5764139
Create post-link-ucc.sh
j34ni Sep 26, 2024
92254dc
Update build-mpi.sh
j34ni Sep 26, 2024
27df17a
avoid extra space
leofang Sep 26, 2024
4e2d265
Merge branch 'main' into update-meta-build-files-for-infiniband-hpc-s…
leofang Sep 26, 2024
75e645a
MNT: Re-rendered with conda-build 24.9.0, conda-smithy 3.40.1, and co…
Sep 26, 2024
ee21f06
restore old tests
leofang Sep 26, 2024
6569809
remove redundant pmix build option
leofang Sep 26, 2024
d43fb96
fix
leofang Sep 27, 2024
72299b8
remove unused FCFLAGS
minrk Sep 27, 2024
2 changes: 1 addition & 1 deletion .github/CODEOWNERS

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions README.md

Some generated files are not rendered by default.

15 changes: 13 additions & 2 deletions recipe/build-mpi.sh
100755 → 100644
@@ -33,15 +33,20 @@ if [[ "$target_platform" == osx-* ]]; then
 wrapper_ldflags='-Wl,-rpath,${libdir}'
 fi

-# UCX support
+# UCX and UCC support
 build_with_ucx=""
-if [[ "$target_platform" == linux-* ]]; then
+build_with_ucc=""
+if [[ "$target_platform" == linux-* && "$target_platform" != linux-ppc64le ]]; then
+echo "Build with UCX/UCC support"
 build_with_ucx="--with-ucx=$PREFIX"
+build_with_ucc="--with-ucc=$PREFIX"
 fi

+
 # CUDA support
 build_with_cuda=""
 if [[ -n "$CUDA_HOME" ]]; then
+echo "Build with CUDA support"
 build_with_cuda="--with-cuda=$CUDA_HOME --with-cuda-libdir=$CUDA_HOME/lib64/stubs"
 fi

@@ -81,6 +86,7 @@ fi
 --enable-mca-dso \
 --enable-ipv6 \
 $build_with_ucx \
+$build_with_ucc \
 $build_with_cuda \
 || (cat config.log; false)

@@ -95,6 +101,11 @@ if [ -n "$build_with_ucx" ]; then
 echo "osc = ^ucx" >> $PREFIX/etc/openmpi-mca-params.conf
 cat $RECIPE_DIR/post-link-ucx.sh >> $POST_LINK
 fi
+if [ -n "$build_with_ucc" ]; then
+echo "setting MCA coll_ucc_enable to 0..."
+echo "coll_ucc_enable = 0" >> $PREFIX/etc/openmpi-mca-params.conf
+cat $RECIPE_DIR/post-link-ucc.sh >> $POST_LINK
+fi
 if [ -n "$build_with_cuda" ]; then
 echo "setting MCA mca_base_component_show_load_errors to 0..."
 echo "mca_base_component_show_load_errors = 0" >> $PREFIX/etc/openmpi-mca-params.conf
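Review note: the platform gate in recipe/build-mpi.sh can be checked without running a build. The sketch below wraps the same condition in a helper; the function name ucx_ucc_configure_flags and the /opt/env prefix are made up for illustration and are not part of the recipe. It uses a case statement instead of the [[ ... && ... ]] test, which is meant to be equivalent for these inputs.

```shell
# Hypothetical helper mirroring the platform gate in build-mpi.sh.
# $1: target platform (e.g. linux-64), $2: install prefix.
ucx_ucc_configure_flags() {
    case "$1" in
        linux-ppc64le) ;;  # excluded, as in the recipe
        linux-*) echo "--with-ucx=$2 --with-ucc=$2" ;;
    esac
}

# Print the flags each platform would receive (empty means no UCX/UCC).
for platform in linux-64 linux-aarch64 linux-ppc64le osx-arm64; do
    echo "$platform: $(ucx_ucc_configure_flags "$platform" /opt/env)"
done
```

Only linux-64 and linux-aarch64 should print flags; linux-ppc64le and osx-arm64 should print an empty value after the colon.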
14 changes: 9 additions & 5 deletions recipe/meta.yaml
@@ -1,7 +1,7 @@
 {% set version = "5.0.5" %}
 {% set major = version.rpartition('.')[0] %}
 {% set cuda_major = (cuda_compiler_version|default("11.8")).rpartition('.')[0] %}
-{% set build = 0 %}
+{% set build = 1 %}

 # give conda package a higher build number
 {% if mpi_type == 'conda' %}
@@ -41,6 +41,7 @@ outputs:
 {% endif %}
 ignore_run_exports:
 - ucx # [linux]
+- ucc # [linux]
 ignore_run_exports_from:
 - {{ compiler('cuda') }} # [cuda_compiler != "None"]
 requirements:
@@ -54,23 +55,25 @@ outputs:
 - perl # [osx]
 - autoconf # [osx]
 - automake # [osx]
-- libtool # [unix]
+- libtool # [unix]
 - make # [unix]
 host:
 #- openpmix
 #- prrte
-- libhwloc
 - libevent
+- libhwloc
 - libnl # [linux]
 - zlib
-- ucx # [linux]
+- ucc # [linux and not ppc64le]
+- ucx # [linux and not ppc64le]
 - cuda-version {{ cuda_compiler_version }} # [cuda_compiler != "None"]
 run:
 - mpi 1.0 openmpi
 #- openpmix
 #- prrte
 run_constrained:
-- {{ pin_compatible("ucx", max_pin="x.x") }} # [linux]
+- {{ pin_compatible("ucx", max_pin="x.x") }} # [linux and not ppc64le]
+- {{ pin_compatible("ucc", max_pin="x.x") }} # [linux and not ppc64le]
 # Open MPI only uses CUDA Driver APIs, set the minimal driver version
 - __cuda >= {{ cuda_major ~ ".0" }} # [cuda_compiler != "None"]
 # Ensure a consistent CUDA environment
@@ -158,3 +161,4 @@ extra:
 - msarahan
 - ocefpaf
 - beckermr
+- j34ni
13 changes: 13 additions & 0 deletions recipe/post-link-ucc.sh
@@ -0,0 +1,13 @@
+#!/bin/bash
+
+cat << EOF >> $PREFIX/.messages.txt
+
+On Linux, Open MPI is built with UCC support but it is disabled by default.
+To enable it, first install UCC (conda install -c conda-forge ucc).
+Afterwards, set the environment variables
+OMPI_MCA_coll_ucc_enable=1
+before launching your MPI processes.
+Equivalently, you can set the MCA parameters in the command line:
+mpiexec --mca coll_ucc_enable 1 ...
+
+EOF
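Review note: the build script writes "coll_ucc_enable = 0" into openmpi-mca-params.conf, and the post-link message tells users to override it per session. The sketch below shows both halves using a temporary file in place of $PREFIX/etc/openmpi-mca-params.conf. The helper mca_file_value is hypothetical and only reads "name = value" lines; it is not how Open MPI itself parses the file. Taking the last matching line is an assumption made for this sketch.

```shell
# Hypothetical helper: print the value assigned to an MCA parameter in a
# "name = value" params file. $1: parameter name, $2: path to the file.
mca_file_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

# Temporary stand-in for $PREFIX/etc/openmpi-mca-params.conf.
conf=$(mktemp)
echo "coll_ucc_enable = 0" >> "$conf"   # what the recipe appends at build time

echo "packaged default: coll_ucc_enable = $(mca_file_value coll_ucc_enable "$conf")"

# Per-session opt-in, as described in the post-link message.
export OMPI_MCA_coll_ucc_enable=1
echo "session override: OMPI_MCA_coll_ucc_enable=$OMPI_MCA_coll_ucc_enable"

rm -f "$conf"
```

Open MPI documents that a value set through the environment or on the mpiexec command line takes precedence over one from a parameter file, which is why the packaged default of 0 does not prevent opting in.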
14 changes: 14 additions & 0 deletions recipe/run_test.sh
@@ -12,11 +12,25 @@ if [[ $PKG_NAME == "openmpi" ]]; then
 exit 1
 fi

+if [[ "$target_platform" == linux-64 || "$target_platform" == linux-aarch64 ]]; then
+if [[ -z "$(ompi_info | grep ucx)" ]]; then
+echo "OpenMPI configured without UCX support!"
+exit 1
+fi
+fi
+
 if [[ -n "$(conda list | grep cuda-version)" ]]; then
 echo "Improper CUDA dependency!"
 exit 1
 fi

+if [[ "$target_platform" == linux-64 || "$target_platform" == linux-aarch64 ]]; then
+if [[ -z "$(ompi_info | grep cuda)" ]]; then
+echo "OpenMPI configured without CUDA support!"
+exit 1
+fi
+fi
+
 command -v ompi_info
 ompi_info

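Review note: the added tests check ompi_info for ucx and cuda, but nothing in the visible hunk checks for ucc. A possible way to factor the check so it covers several components is sketched below. The helper has_component is hypothetical, and the sample text is a simplified stand-in rather than a real capture of ompi_info output, so the exact line format is an assumption.

```shell
# Hypothetical helper: succeed if the text on stdin mentions the component
# as a whole word. $1: component name.
has_component() {
    grep -qw -- "$1"
}

# Simplified stand-in for `ompi_info` output (not a real capture).
sample='MCA pml: ucx
MCA coll: ucc'

for comp in ucx ucc; do
    if printf '%s\n' "$sample" | has_component "$comp"; then
        echo "found $comp"
    else
        echo "missing $comp"
    fi
done
```

In run_test.sh this could be used as "ompi_info | has_component ucc" inside the existing platform guard. Compared with testing -z "$(ompi_info | grep ...)", grep -q avoids capturing the matched text and reports the result through the exit status alone.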