To run a benchmark on TPUs, follow these steps:
- Make sure your cloud VM environment is set up correctly; see the TPU section of the README.
- From a first VM shell, start the benchmarking script:
$ cd flowpm/scripts
$ python3 --model_dir=gs://flowpm_eu/tpu_test
- From a second VM shell, start TensorBoard as follows:
$ export PATH="$PATH:`python3 -m site --user-base`/bin"
$ export STORAGE_BUCKET=gs://flowpm_eu/tpu_profiling/run0
$ export MODEL_DIR=gs://flowpm_eu/tpu_test
$ tensorboard --logdir=${MODEL_DIR} &
- Finally, click the web preview button in the top right corner to launch TensorBoard.
- To capture the TPU trace, go to the profile tool and use these settings:
  - TPU name: flowpm
  - Address Type: TPU Name
  - Profiling duration: 10000
- When you have acquired the profile, you can shut down the running benchmarking script with
Alternatively, you can save the trace first and then visualize it in TensorBoard. To do so:
- Follow step 1) above and start the benchmarking script.
- From a second VM shell that has also been set up for the TPU as described in the main README, do the following:
$ export PATH="$PATH:`python3 -m site --user-base`/bin"
$ export TPU_NAME=flowpm
$ export MODEL_DIR=gs://flowpm_eu/tpu_test
$ capture_tpu_profile --tpu=$TPU_NAME --logdir=${MODEL_DIR} --duration_ms=10000 --num_tracing_attempts=10
Ideally, this ends with output such as: Profile session succeed for host(s):,,,
- From a third VM shell, follow step 2) above to launch TensorBoard and visualize the trace.
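The capture step can be wrapped in a small script so the flags stay in one place. This is only a sketch: `profile_cmd` and the `RUN` switch are hypothetical helpers introduced here for illustration, not part of flowpm or the TPU tooling. The flags and default values are the ones used in the command above. By default the script only prints the command, so it can be checked before it runs.

```shell
#!/bin/sh
# Sketch: assemble the capture_tpu_profile command used in the steps above.
# TPU_NAME and MODEL_DIR default to the values shown in this README.
TPU_NAME="${TPU_NAME:-flowpm}"
MODEL_DIR="${MODEL_DIR:-gs://flowpm_eu/tpu_test}"

# profile_cmd is a hypothetical helper (an assumption of this sketch).
# Arguments: TPU name, log directory, duration in ms, number of attempts.
profile_cmd() {
  printf 'capture_tpu_profile --tpu=%s --logdir=%s --duration_ms=%s --num_tracing_attempts=%s\n' \
    "$1" "$2" "${3:-10000}" "${4:-10}"
}

cmd=$(profile_cmd "$TPU_NAME" "$MODEL_DIR")

# Dry run by default: print the command. Set RUN=1 to execute it.
if [ "${RUN:-0}" = "1" ]; then
  $cmd
else
  echo "$cmd"
fi
```

Running the script with `RUN=1` set in the environment executes the capture instead of printing it.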
More info on using TensorBoard and Profiling for TPU here:
To access these traces from your local computer, follow these 2 simple steps:
- Use the gcloud cli to authenticate yourself:
$ gcloud auth application-default login
- Start TensorBoard with the path to your bucket:
$ tensorboard --logdir=gs://flowpm_eu/tpu_test
- Step 3: Profit!
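When launching TensorBoard locally, a typo in the log directory is easy to miss. The sketch below checks that the path looks like a bucket path before printing the command to run. `is_gcs_path` and the `LOGDIR` variable are hypothetical names introduced for this example. The check only inspects the `gs://` prefix; it does not verify that the bucket exists or that you are authenticated.

```shell
#!/bin/sh
# is_gcs_path is a hypothetical helper (an assumption of this sketch).
# It succeeds only if the argument starts with gs:// followed by a name.
is_gcs_path() {
  case "$1" in
    gs://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# LOGDIR defaults to the bucket path used in this README.
LOGDIR="${LOGDIR:-gs://flowpm_eu/tpu_test}"

if is_gcs_path "$LOGDIR"; then
  # Print the command to run after `gcloud auth application-default login`.
  echo "tensorboard --logdir=$LOGDIR"
else
  echo "LOGDIR should be a gs:// bucket path, got: $LOGDIR" >&2
fi
```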