[RELEASE] rmm v23.12 #1386

raydouglass · 2023-11-21T21:51:30Z

❄️ Code freeze for `branch-23.12` and v23.12 release

What does this mean?

Only critical/hotfix level issues should be merged into branch-23.12 until release (merging of this PR).

What is the purpose of this PR?

- Update documentation
- Allow testing for the new release
- Enable a means to merge branch-23.12 into main for the release

doxygen catches more doc issues (of the types fixed in #1317) when more build outputs are turned on, which is indicative of some bugs/limitations in doxygen. XML builds will be necessary to leverage Breathe (see #1324) so this PR enables XML builds and fixes the associated issues. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Rong Ou (https://github.com/rongou) URL: #1348

Forward-merge branch-23.10 to branch-23.12

This PR builds `librmm` and `rmm` conda packages using CUDA 12 on ARM. Authors: - Bradley Dice (https://github.com/bdice) - Mark Harris (https://github.com/harrism) Approvers: - Ray Douglass (https://github.com/raydouglass) - Mark Harris (https://github.com/harrism) URL: #1330

Forward-merge branch-23.10 to branch-23.12

Since #1328 was merged after `branch-23.12` was created, the files need to be updated manually in `branch-23.12`. Authors: - Ray Douglass (https://github.com/raydouglass) Approvers: - AJ Schmidt (https://github.com/ajschmidt8) URL: #1355

Update to use non-deprecated signatures for `rapids_export` functions. Authors: - Robert Maynard (https://github.com/robertmaynard) Approvers: - Mark Harris (https://github.com/harrism) URL: #1357

Forward-merge branch-23.10 to branch-23.12

This PR switches back to using `branch-23.12` for CI workflows because the CUDA 12 ARM conda migration is complete. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Jake Awe (https://github.com/AyodeAwe) URL: #1360

This PR adds doxygen groups for the various parts of the C++ API to help provide more context. This will also help with improving the docs experience in #1324. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) URL: #1358

Dropping the system CTK libraries from our CUDA 12 CI images revealed that we were missing the cuda-nvcc package required to provide nvvm for numba in the Python tests. It also revealed that the list of library names we try to dlopen is incomplete: for CUDA 11, the SONAME of the library incorrectly includes an extra `.0` version segment, and rmm was designed to search for that, but CUDA 12 correctly has just `libcudart.so.12`, and that name needs to be added to the search list. We were previously getting by on finding `libcudart.so`, but the linker name is only present in conda environments if `cuda-cudart-dev` is installed, and that package should not be a runtime requirement for rmm. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Rong Ou (https://github.com/rongou) - Ray Douglass (https://github.com/raydouglass) URL: #1366
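A minimal sketch of the "try several library names in order" idea described above. The helper names (`cudart_candidates`, `first_loadable`), the `try_open` hook, and the search order are assumptions made for illustration, not rmm's actual loader; `libcudart.so.11.0` is the CUDA 11 SONAME with the extra `.0` segment mentioned in the description.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Candidate runtime library names. The order here is an assumption.
//  - libcudart.so.12   : CUDA 12 SONAME
//  - libcudart.so.11.0 : CUDA 11 SONAME (with the extra ".0" segment)
//  - libcudart.so      : linker name, only present when dev packages are installed
inline std::vector<std::string> cudart_candidates()
{
  return {"libcudart.so.12", "libcudart.so.11.0", "libcudart.so"};
}

// Returns the first name for which `try_open` succeeds, or an empty string.
// `try_open` is a hypothetical hook; real code could wrap dlopen() in it.
inline std::string first_loadable(std::vector<std::string> const& names,
                                  std::function<bool(std::string const&)> const& try_open)
{
  for (auto const& name : names) {
    if (try_open(name)) { return name; }
  }
  return {};
}
```

On Linux, `try_open` could be a lambda that calls `dlopen(name.c_str(), RTLD_LAZY)` and checks the result for null. Keeping the opener injectable lets the search logic be exercised on machines without CUDA installed.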

This PR:
- Adds the errors group to the doxygen so that errors are also contained in a group
- Removes invalid `@throws` sections that throw nothing
- Removes unnecessary backticks around exception types in contexts where they are already assumed to be types and therefore will link/use the appropriate font automatically

Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) URL: #1367

…es (#1347) This PR changes conda C++/Python packages and wheels to all generate a consistent version for nightlies. The nightly version is of the form YY.MM.DDaN, where N is the number of commits from the last tag. The version is embedded in both the package metadata and in the `rmm.__version__` attribute. In addition the commit hash itself is embedded into the package as `rmm.__git_commit__`. These changes ensure that 1) the conda Python package for a given nightly will reliably choose the correct C++ package (previously we relied on build strings and build times, which is more fragile w.r.t. the conda solver); 2) wheels are properly considered nightlies and are treated accordingly by pip (e.g. requiring `--pre` for installation, not conflicting with normal releases, etc); and 3) wheels and conda packages are aligned on versions so that they can be easily compared if necessary. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Lawrence Mitchell (https://github.com/wence-) - AJ Schmidt (https://github.com/ajschmidt8) URL: #1347
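To make the `YY.MM.DDaN` form concrete, here is a small sketch that assembles such a string. The function name `nightly_version` and the two-digit zero padding of each field are assumptions for illustration; the actual build tooling is not shown in this PR description.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>

// Builds a nightly version string of the form YY.MM.DDaN.
// `dd` is the third field (written DD above) and `commits_since_tag` is N.
// Zero-padding every field to two digits is an assumption.
inline std::string nightly_version(int yy, int mm, int dd, int commits_since_tag)
{
  std::ostringstream out;
  out << std::setfill('0') << std::setw(2) << yy << '.' << std::setw(2) << mm << '.'
      << std::setw(2) << dd << 'a' << commits_since_tag;
  return out.str();
}
```

The `a` segment marks the version as an alpha pre-release under Python's version rules, which is why pip requires `--pre` to install such wheels.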

This PR leverages [Breathe](https://breathe.readthedocs.io/en/latest/) to pull the rmm C++ API documentation into the python Sphinx docs build, generating a single unified build of the documentation that supports cross-linking between language libraries and also simplifies cross-linking from other libraries that wish to link here (such as higher-level RAPIDS libraries that use both rmm's Python and C++ APIs). Using Breathe requires changing the doxygen build to generate XML in addition to the usual HTML. It turns out that doxygen catches more doc issues (of the types fixed in #1317) when more build outputs are turned on, which is indicative of some bugs/limitations in doxygen, but nonetheless I've fixed the additional issues in this PR as well. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) - Bradley Dice (https://github.com/bdice) - AJ Schmidt (https://github.com/ajschmidt8) URL: #1324

There are a couple of default parameters that are being set, one to a local constexpr and another by a method, both of which were previously private. That made the defaults unintentionally opaque. This change makes both of them publicly visible values. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Mark Harris (https://github.com/harrism) URL: #1373

An object with the `thread_local` modifier has thread storage duration, so its destructor (if it exists) will run after the thread exits, which, on the main thread, is below `main` (https://eel.is/c++draft/basic.start.term). The CUDA runtime sets up (when the first call into the runtime is made) a teardown of the driver that runs `atexit`. Although [basic.start.term#5](https://eel.is/c++draft/basic.start.term#5) provides guarantees on the order in which these destructors are called (thread storage duration objects are destructed _before_ any `atexit` handlers run), it appears that gnu libstdc++ does not always implement this correctly (if not compiled with `_GLIBCXX_HAVE___CXA_THREAD_ATEXIT`). Moreover (possibly consequently) it is considered undefined behaviour to call into the CUDA runtime below `main`. Hence, we cannot call `cudaEventDestroy` to deallocate our `thread_local` events. Since there are a finite number of these events (`ndevices * nparticipating_threads`), rather than attempting to destroy them we choose to leak them, thus avoiding any sequencing problems. - Closes #1371 Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Mark Harris (https://github.com/harrism) - Jake Hemstad (https://github.com/jrhemstad) URL: #1375
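The "leak instead of destroy" choice can be sketched without CUDA. In the example below, `stub_event` is a hypothetical stand-in for a wrapper whose destructor would call `cudaEventDestroy`, and `per_thread_event` is an illustrative name, not rmm's code. The `thread_local` variable is a raw pointer, which has no destructor, so nothing runs at thread exit.

```cpp
#include <cassert>

// Hypothetical stand-in for an event wrapper. In real code the destructor
// would call into the CUDA runtime, which is unsafe below `main`.
struct stub_event {
  static int& destroyed()
  {
    static int count = 0;
    return count;
  }
  ~stub_event() { ++destroyed(); }
};

// One event per thread, created on first use and intentionally never deleted.
// Because only the pointer is thread_local, no destructor is registered for
// thread exit and the event object is simply leaked.
inline stub_event& per_thread_event()
{
  thread_local stub_event* event = new stub_event{};
  return *event;
}
```

The cost is one leaked object per thread (per device in the real case), which is bounded as noted above.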

This changes `device_buffer` to store the active CUDA device ID on creation, and (possibly temporarily) set the active device to that ID before allocating or freeing memory. It also adds tests for containers built on `device_buffer` (`device_buffer`, `device_uvector` and `device_scalar`) that ensure correct operation when the device is changed before doing things that alloc/dealloc memory for those containers. This fixes #1342. HOWEVER, there is an important question yet to answer: `rmm::device_vector` is just an alias for `thrust::device_vector`, which does not use `rmm::device_buffer` for storage. Users may therefore be surprised after this PR, because the multidevice semantics of RMM containers will be different from `thrust::device_vector` (and therefore `rmm::device_vector`). Update: opinion is that it's probably OK to diverge from `device_vector`, and some think we should remove `rmm::device_vector`. ~While we discuss this I have set the DO NOT MERGE label.~ Authors: - Mark Harris (https://github.com/harrism) Approvers: - Lawrence Mitchell (https://github.com/wence-) - Jake Hemstad (https://github.com/jrhemstad) URL: #1370
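The "remember the device, then temporarily switch back to it" behaviour can be sketched with an RAII guard. Everything below is illustrative: the `stub` functions stand in for `cudaGetDevice` / `cudaSetDevice` so the sketch runs without a GPU, and `scoped_device` and `toy_buffer` are hypothetical names, not rmm's classes.

```cpp
#include <cassert>

// Stand-ins for cudaGetDevice / cudaSetDevice (assumption: no GPU needed here).
namespace stub {
inline int& current_device()
{
  static int id = 0;
  return id;
}
inline int get_device() { return current_device(); }
inline void set_device(int id) { current_device() = id; }
}  // namespace stub

// Switches to `target` for the lifetime of the guard, then restores the
// previously active device. Does nothing if `target` is already active.
class scoped_device {
 public:
  explicit scoped_device(int target)
    : previous_{stub::get_device()}, switched_{previous_ != target}
  {
    if (switched_) { stub::set_device(target); }
  }
  ~scoped_device()
  {
    if (switched_) { stub::set_device(previous_); }
  }
  scoped_device(scoped_device const&)            = delete;
  scoped_device& operator=(scoped_device const&) = delete;

 private:
  int previous_;
  bool switched_;
};

// Toy container that records the active device at construction.
class toy_buffer {
 public:
  toy_buffer() : device_{stub::get_device()} {}
  int device() const { return device_; }

  // Pretends to free memory; returns the device that was active while doing so.
  int release()
  {
    scoped_device guard{device_};
    return stub::get_device();
  }

 private:
  int device_;
};
```

With this shape, a buffer created while device 1 is active still releases its memory on device 1 even if the caller has since switched to device 0, and the caller's active device is unchanged afterwards.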

…e` (#1095) This introduces `cuda::mr::{async_}resource_ref` as a type-erased, safe resource wrapper that is meant to replace uses of `{host, device}_memory_resource`. We provide both async and classic allocate functions that delegate back to the original resource used to construct the `cuda::mr::{async_}resource_ref`.

In comparison to `{host, device}_memory_resource`, the new feature provides additional compile-time checks that will help users avoid common pitfalls with heterogeneous memory allocations. As a first step we provide the properties `cuda::mr::host_accessible` and `cuda::mr::device_accessible`. These properties can be added to an internal or even external type through a free function `get_property`:

```cpp
// For a user defined resource
struct my_resource {
  friend void get_property(my_resource const&, cuda::mr::device_accessible) noexcept {}
};

// For an external resource
void get_property(some_external_resource const&, cuda::mr::device_accessible) noexcept {}
```

The advantage is that we can constrain interfaces based on these properties:

```cpp
void do_some_computation_on_device(cuda::mr::async_resource_ref<cuda::mr::device_accessible> mr, ...) { ... }
```

This function will fail to compile if it is passed any resource that does not support async allocations or is not tagged as providing device accessible memory. In the same way, the following function will only compile if the provided resource provides the classic allocate / deallocate interface and is tagged to provide host accessible memory:

```cpp
void do_some_computation_on_host(cuda::mr::resource_ref<cuda::mr::host_accessible> mr, ...) { ... }
```

The property system is highly flexible, and users can easily add their own properties as needed. That gives it both the flexibility of an inheritance-based implementation and the security of a strictly type-checked interface.

Authors: - Michael Schellenberger Costa (https://github.com/miscco) - Bradley Dice (https://github.com/bdice) - Mark Harris (https://github.com/harrism) Approvers: - Jake Hemstad (https://github.com/jrhemstad) - Mark Harris (https://github.com/harrism) - Bradley Dice (https://github.com/bdice) URL: #1095
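To illustrate how a `get_property` free function can drive a compile-time check, here is a self-contained sketch that imitates the idea without libcudacxx. The tag types, the `has_property` trait, and the example resources are all hypothetical stand-ins; this is not how `cuda::mr` is implemented, only a small model of the mechanism.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical property tags, standing in for cuda::mr::device_accessible etc.
struct device_accessible {};
struct host_accessible {};

// Detects whether `get_property(resource, Property{})` is a valid call.
// The unqualified call is resolved by argument-dependent lookup.
template <typename Resource, typename Property, typename = void>
struct has_property : std::false_type {};

template <typename Resource, typename Property>
struct has_property<Resource,
                    Property,
                    decltype(get_property(std::declval<Resource const&>(), Property{}), void())>
  : std::true_type {};

// A resource tagged as both host and device accessible.
struct pinned_like_resource {
  friend void get_property(pinned_like_resource const&, device_accessible) noexcept {}
  friend void get_property(pinned_like_resource const&, host_accessible) noexcept {}
};

// A resource tagged as host accessible only.
struct host_only_resource {
  friend void get_property(host_only_resource const&, host_accessible) noexcept {}
};

// An interface constrained on a property: passing a resource that lacks the
// device_accessible tag is rejected at compile time.
template <typename Resource>
bool run_on_device(Resource const&)
{
  static_assert(has_property<Resource, device_accessible>::value,
                "resource must be tagged device_accessible");
  return true;
}
```

Calling `run_on_device(host_only_resource{})` would fail to compile with the `static_assert` message, while `run_on_device(pinned_like_resource{})` is accepted.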

gcc has a warning about potential compatibility issues with pre-ISO C++ code. There is no danger in us compiling in that mode, so this PR silences the warning. Authors: - Michael Schellenberger Costa (https://github.com/miscco) Approvers: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) URL: #1381

With the addition of libcudacxx 2.1.0, the minimum CUDA version required to build RMM is now 11.4. This PR updates the readme to reflect this. Authors: - Mark Harris (https://github.com/harrism) Approvers: - Bradley Dice (https://github.com/bdice)

RAPIDS repos are using the `main` branch of https://github.com/actions/labeler which recently introduced [breaking changes](https://github.com/actions/labeler/releases/tag/v5.0.0). This PR pins to the latest v4 release of the labeler action until we can evaluate the changes required for v5. Authors: - Ray Douglass (https://github.com/raydouglass) Approvers: - AJ Schmidt (https://github.com/ajschmidt8)

@sameerz

…#1395) (#1396) This PR backports #1395 from 24.02 to 23.12. It contains an arena MR fix for simultaneous access by PTDS and other streams. Backport requested by @sameerz @GregoryKimball. Authors: - Thomas Graves (https://github.com/tgravescs) Approvers: - Lawrence Mitchell (https://github.com/wence-) - Mark Harris (https://github.com/harrism)

raydouglass and others added 22 commits September 22, 2023 09:46

v23.12 Updates [skip ci]

e13a8a8

Merge pull request #1351 from rapidsai/branch-23.10

19a8f8a

Forward-merge branch-23.10 to branch-23.12

Merge pull request #1352 from rapidsai/branch-23.10

ec75b12

Forward-merge branch-23.10 to branch-23.12

Merge pull request #1353 from rapidsai/branch-23.10

5f07014

Forward-merge branch-23.10 to branch-23.12

Update devcontainers to 23.12 (#1355)

da3ed7b

Update rapids-cmake functions to non-deprecated signatures (#1357)

897313d

Merge pull request #1359 from rapidsai/branch-23.10

6362cd4

Forward-merge branch-23.10 to branch-23.12

Use branch-23.12 workflows. (#1360)

f6ab7b7

update workflow links (#1363)

596ccf9

Enable build concurrency for nightly and merge triggers. (#1380)

c17730e

raydouglass requested review from a team as code owners November 21, 2023 21:51

raydouglass requested review from wence- and cwharris November 21, 2023 21:51

github-actions bot added CMake Python Related to RMM Python API labels Nov 21, 2023

github-actions bot added conda cpp Pertains to C++ code ci labels Nov 21, 2023

harrism approved these changes Nov 21, 2023

raydouglass and others added 3 commits December 4, 2023 14:11

Update Changelog [skip ci]

3d8faff

raydouglass merged commit e901e5d into main Dec 6, 2023
1 of 2 checks passed
