EuroHack 2019
- After the recent changes and improvements to the physics, Octo-Tiger's runtime is dominated by the reconstruct + hydro part. These methods were important before, but they are now critical and should be our focus!
- Create kernels for Reconstruct + Hydro (Kokkos?) (Gregor, Dominic)
- Refactor reconstruct
- Create a reconstruct template kernel (to be instantiated with either Vc or double for the GPU)
- Refactor flux
- Create a flux template kernel (to be instantiated with either Vc or double for the GPU)
- Port new hydro kernels to the GPU (Gregor, Theresa)
- Create infrastructure for the data movement between CPU/GPU (integrate with the current CUDA scheduler)
- Port a simple version of those kernels to the GPU
- Take a look at possible optimizations
- Take a look at whether we should use unified memory instead of our current asynchronous data copies
- Improve and expand the tests (Sagiv)
- Create tests that compare the results of the HPC kernels with the current kernels during the execution
- Create dummy HPC kernels for the parts where we do not have an HPC implementation yet
- Tests for gravity solver
- Tests for hydro kernels
- Tests for radiation kernels
- Automated performance testing framework on Rostam or the Stuttgart machines?
- Check out the possibility of integrating Kokkos kernels (John, Theresa, Gregor)
- Discuss possibilities here
- Minimal example?
- Other, general improvements to Octo-Tiger (Dominic)
- Improve I/O (Silo or Adios?)
- Improve compile times (check whether we can move code from the unitiger header files to .cpp files)
- Other points to discuss
- Multi-GPU support (we already have it in Octo-Tiger, but the interface isn't the best)?
- In-Situ visualization?
- cuda::futures?
- Optimize Pre-Recon / Post-Recon
- Create a test that compares the multipole-multipole kernel results of the old kernels, the HPC CPU kernels, and the GPU kernels (ask Gregor for details)
- Evaluate the state of the distributed I/O
- Evaluate whether we should remove the NDIM and INX template parameters from unitiger and replace them with macros to improve compile times
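To make the "template kernel" items above more concrete, here is a minimal sketch of a kernel written once and instantiated with either a scalar double or a vector type. It is only an illustration under several assumptions: `Pack4` is a hypothetical four-lane stand-in for a SIMD type such as Vc's double vector, `lane_min`/`lane_max` are made-up helper names, and the minmod-limited linear reconstruction is a toy scheme, not Octo-Tiger's actual reconstruct. The last function also hints at the kind of test discussed above, where the result of a new kernel is compared against a reference kernel.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a SIMD type (e.g. a Vc vector): four lanes with
// element-wise arithmetic. Not part of Vc or Octo-Tiger.
struct Pack4 {
    std::array<double, 4> lane{};
    Pack4() = default;
    explicit Pack4(double v) { lane.fill(v); }
};

inline Pack4 operator+(Pack4 a, const Pack4& b) {
    for (std::size_t i = 0; i < 4; ++i) a.lane[i] += b.lane[i];
    return a;
}
inline Pack4 operator-(Pack4 a, const Pack4& b) {
    for (std::size_t i = 0; i < 4; ++i) a.lane[i] -= b.lane[i];
    return a;
}
inline Pack4 operator*(Pack4 a, const Pack4& b) {
    for (std::size_t i = 0; i < 4; ++i) a.lane[i] *= b.lane[i];
    return a;
}

// Element-wise min/max with the same names for both value types, so the
// kernel below does not need to know which type it is instantiated with.
inline double lane_min(double a, double b) { return std::min(a, b); }
inline double lane_max(double a, double b) { return std::max(a, b); }
inline Pack4 lane_min(Pack4 a, const Pack4& b) {
    for (std::size_t i = 0; i < 4; ++i) a.lane[i] = std::min(a.lane[i], b.lane[i]);
    return a;
}
inline Pack4 lane_max(Pack4 a, const Pack4& b) {
    for (std::size_t i = 0; i < 4; ++i) a.lane[i] = std::max(a.lane[i], b.lane[i]);
    return a;
}

// Branch-free minmod limiter: returns the argument of smaller magnitude if
// both have the same sign, and zero otherwise.
template <typename T>
T minmod(const T& a, const T& b) {
    const T zero(0.0);
    return lane_max(zero, lane_min(a, b)) + lane_min(zero, lane_max(a, b));
}

// Toy reconstruct kernel: value at the right face of the centre cell from a
// slope-limited linear reconstruction. Written once, templated on T.
template <typename T>
T reconstruct_right_face(const T& left, const T& centre, const T& right) {
    const T slope = minmod(centre - left, right - centre);
    return centre + T(0.5) * slope;
}

// Runs the kernel on four interior cells, once per cell with double and once
// with Pack4, and returns the largest absolute difference between the two.
double max_scalar_vs_packed_difference(const std::vector<double>& u) {
    assert(u.size() >= 6);  // four interior cells plus one neighbour each side
    Pack4 left, centre, right;
    for (std::size_t i = 0; i < 4; ++i) {
        left.lane[i] = u[i];
        centre.lane[i] = u[i + 1];
        right.lane[i] = u[i + 2];
    }
    const Pack4 packed = reconstruct_right_face(left, centre, right);
    double max_diff = 0.0;
    for (std::size_t i = 0; i < 4; ++i) {
        const double scalar = reconstruct_right_face(u[i], u[i + 1], u[i + 2]);
        max_diff = std::max(max_diff, std::abs(packed.lane[i] - scalar));
    }
    return max_diff;
}
```

Calling `max_scalar_vs_packed_difference` on a small array of cell values and checking that the result stays below a tolerance gives a simple consistency check between the two instantiations. A real test would compare against the existing kernels during execution instead.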
Single node rotating star:
./build/octotiger/build/octotiger -t4 --problem=rotating_star --theta=0.5 --xscale=1.5 --odt=0.1 --stop_time=0.5 --stop_step=10 --max_level=4 --input_file=rotating_star.bin --n_species=5 --multipole_kernel_type=SOA_CPU --p2p_kernel_type=SOA_CPU --p2m_kernel_type=SOA_CPU --cuda_streams_per_locality=8
This requires an init file, which can be generated with:
./build/octotiger/build/tools/gen_rotating_star_init
Building the simple variant of the toolchain:
git clone https://github.com/diehlpk/PowerTiger buildscripts && cd buildscripts && ./build-all.sh RelWithDebInfo without-cuda without-mpi
Afterward, Octo-Tiger can be built separately with:
./build-all.sh RelWithDebInfo without-cuda without-mpi octotiger
Adapt the parameters to fit the required build (MPI, CUDA).
The "Octotiger on Daint" page on @biddisco's GitHub wiki contains the full build settings for the latest installation of HPX/Octotiger on Daint and is maintained regularly (its contents have been migrated from a previous Daint settings page).
All dependencies are installed under $INSTALL_ROOT, and the CMake settings should be correct, so you can copy the environment details and then run the Octotiger CMake build command followed by make.