Node Level Scaling on KNL (on Cori)

DavidPfander-UniStuttgart edited this page Jan 26, 2017 · 21 revisions

On this page, we track the state of node-level scaling on KNL (on Cori).

Current state

  • Drop in parallel efficiency even for only 2 threads
  • Scaling significantly better than with GRAV_PAR enabled
  • Graphs show some improvement compared to mid-January
  • The setup uses only a single timestep; the experiments are currently being rerun with more timesteps

The following graph shows the runtime for different numbers of HPX threads ("--hpx:threads").

total time

  • Level 9 doesn't fit into the MCDRAM

This corresponds to the parallel efficiency displayed in the next graph.

parallel efficiency
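
For reference, the relation between the two graphs can be sketched as follows, assuming the usual definition of parallel efficiency, E(p) = T(1) / (p * T(p)), where T(p) is the total runtime with p threads. The function name `parallel_efficiency` and the runtimes in the example call are hypothetical and are not taken from the measurements on this page.

```shell
# parallel_efficiency T1 P TP
# Prints T1 / (P * TP) with four decimal places, where T1 is the
# single-thread runtime and TP is the runtime with P threads.
# Assumption: the graphs use this standard definition.
parallel_efficiency() {
  awk -v t1="$1" -v p="$2" -v tp="$3" \
    'BEGIN { printf "%.4f\n", t1 / (p * tp) }'
}

# Hypothetical runtimes in seconds (not measured values):
# 100 s with 1 thread, 60 s with 2 threads.
parallel_efficiency 100 2 60   # 100 / (2 * 60) = 0.8333
```

A value of 1.0 means perfect scaling. Under this definition, the drop already visible at 2 threads means that doubling the thread count yields less than half the single-thread runtime reduction one would ideally expect.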

Console output of the individual experiments can be found here.

Reproduce

srun numactl -m 1 ./knl-build/octotiger-Release/octotiger -Disableoutput -Problem=dwd \
-Max_level=${level} -Xscale=4.0 -Eos=wd -Angcon=1 -Stopstep=0.01 \
--hpx:threads=${threads} -Restart=restart${level}.chk \
--hpx:ini=hpx.stacks.small_size=0xC0000 -Ihpx.stacks.use_guard_pages=0 \
--hpx:print-bind --hpx:print-counter /threads{locality#*/total}/idle-rate \
  > results/${name}_N${SLURM_NNODES}_t${threads}_l${level}_m1 2>&1
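
The command above depends on the shell variables `level`, `threads`, and `name`. A minimal sketch of a driver loop that sets them is shown below as a dry run. The helper `result_file`, the run name, and the level and thread values are placeholders chosen for illustration; they are not the values used for the graphs on this page.

```shell
#!/bin/bash
# Dry-run sketch of a scaling sweep. All concrete values below are
# assumptions for illustration only.
name=dwd

# result_file NAME NODES THREADS LEVEL
# Builds the output path in the same pattern as the command above.
result_file() {
  echo "results/${1}_N${2}_t${3}_l${4}_m1"
}

for level in 7 8; do            # placeholder refinement levels
  for threads in 1 2 4; do      # placeholder thread counts
    # SLURM_NNODES is set by Slurm inside a job; default to 1 otherwise.
    out=$(result_file "$name" "${SLURM_NNODES:-1}" "$threads" "$level")
    echo "level=${level} threads=${threads} -> ${out}"
    # To run for real, replace the echo with the srun command from
    # this section and redirect its output to "${out}".
  done
done
```

The `results` directory has to exist before the real command is run, because the shell redirection does not create it.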

Parameters for older graphs (moving star problem):

srun numactl -m 1 ./knl-build/octotiger-Release/octotiger \
-Disableoutput -Problem=moving_star -Max_level=${level} \
 -Stopstep=0 --hpx:threads=${threads} \
--hpx:ini=hpx.stacks.small_size=0xC00000 -Ihpx.stacks.use_guard_pages=0 \
--hpx:print-bind --hpx:print-counter /threads{locality#*/total}/idle-rate \
> results/${name}_N${SLURM_NNODES}_t${threads}_l${level}_m1 2>&1

  • These parameters will result in an out-of-memory error for level 5 with MCDRAM and level 6 with DRAM