Profile rocFFT kernels
In bash, run "export HIP_TRACE_API=1" to enable HIP API tracing (set it back to 0 to disable it).
Then launch your application. Every HIP API call is traced, including rocFFT kernel launches, memory copies, and memory allocation/deallocation.
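If you only want tracing for a single run rather than for the whole shell session, a small wrapper can scope the variable to one command. This is a minimal sketch, not part of rocFFT or HIP: the function name run_with_hip_trace and the log file name are hypothetical, and ./your_executable is a placeholder.

```shell
# Hypothetical helper (bash): run one command with HIP API tracing enabled,
# without leaving HIP_TRACE_API set in the calling shell.
run_with_hip_trace() {
    # The VAR=value prefix applies the variable to this single command only.
    HIP_TRACE_API=1 "$@"
}

# Usage: capture stdout and stderr together so the trace ends up in the log
# regardless of which stream it is written to.
# run_with_hip_trace ./your_executable > hip_trace.log 2>&1
```

Because the variable is never exported, there is nothing to reset afterwards.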
For more profiling tools, see Profiling and Debugging HIP Code
The LLVM IR and the ISA can be dumped by setting the following environment variables before building and running the application.
export KMDUMPISA=1
export KMDUMPLLVM=1
export KMDUMPDIR=/path/to/dump
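The three exports above can be bundled into one helper that also makes sure the dump directory exists. This is only a sketch: the function name setup_kernel_dump is hypothetical, and creating the directory up front is a precaution assumed here, not a documented requirement.

```shell
# Hypothetical helper (bash): export the dump variables for a given directory.
setup_kernel_dump() {
    local dir="$1"
    # Create the target directory first (assumed precaution, see above).
    mkdir -p "$dir" || return 1
    export KMDUMPISA=1        # dump the ISA
    export KMDUMPLLVM=1       # dump the LLVM IR
    export KMDUMPDIR="$dir"   # where the dumps are written
}

# Usage (the path is a placeholder):
# setup_kernel_dump /path/to/dump
```

Call it in the same shell session before building and running the application, since exported variables only affect processes started afterwards.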
rcprof is a command-line tool, very similar to nvprof, for profiling HIP kernels. It is located in /opt/rocm/profiler/bin.
Example usage:
/opt/rocm/profiler/bin/rcprof -T -a profile.atp ./your_executable
This dumps a number of profile.HSA*.html files, which you can view in any web browser.
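To see which reports were produced, you can list the files matching that pattern. This is a rough sketch: the function name list_rcprof_reports is hypothetical, and it assumes the HTML files are written next to profile.atp (the current directory in the example command).

```shell
# Hypothetical helper (bash): list rcprof HTML reports in a directory.
list_rcprof_reports() {
    local dir="${1:-.}"   # default to the current directory
    # profile.HSA*.html is the naming pattern of the generated reports.
    find "$dir" -maxdepth 1 -type f -name 'profile.HSA*.html' | sort
}

# Usage:
# list_rcprof_reports .
```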
Run "/opt/rocm/profiler/bin/rcprof --help" for more options.
Use "nvprof ./your_executable" to profile every CUDA runtime invocation, including kernel launches and memory copies.