Profile rocFFT kernels
In bash, run "export HIP_TRACE_API=1" to enable HIP API tracing (set it back to 0 to disable).
Then launch your application. Every HIP API call is traced, including rocFFT kernel launches, memory copies, and memory allocation/deallocation.
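As a minimal sketch of the step above, the variable can be set for a single run only, so there is no need to reset it afterwards. The helper name run_with_hip_trace and the log file name are hypothetical, not part of HIP or rocFFT, and ./your_executable is a placeholder.

```shell
# Hypothetical helper (not part of HIP or rocFFT): run one command with
# HIP_TRACE_API=1 and save everything it prints to a log file.
# Usage: run_with_hip_trace <log_file> <command> [args...]
run_with_hip_trace() {
    hip_trace_log="$1"
    shift
    # env sets the variable for this one command only, so the calling
    # shell does not need a later "export HIP_TRACE_API=0".
    # Both stdout and stderr are captured, so the trace lands in the log
    # regardless of which stream it is written to.
    env HIP_TRACE_API=1 "$@" > "$hip_trace_log" 2>&1
}

# Example (placeholder executable name):
#   run_with_hip_trace hip_trace.log ./your_executable
```

Scoping the variable to one command keeps later runs in the same shell untraced, which avoids accidentally large logs.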
For more profiling tools, see Profiling and Debugging HIP Code
The LLVM IR and ISA can be dumped by setting the following environment variables before building and running the application.
export KMDUMPISA=1
export KMDUMPLLVM=1
export KMDUMPDIR=/path/to/dump
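The three exports above can be wrapped in a small function, sketched below. The function name enable_kernel_dump is hypothetical, and creating the dump directory beforehand is an assumption made for safety; the source does not say whether the directory must already exist.

```shell
# Hypothetical helper: export the three dump variables listed above.
# Usage: enable_kernel_dump <dump_dir>
enable_kernel_dump() {
    kmdump_dir="$1"
    # Assumption: make sure the dump directory exists before it is used.
    mkdir -p "$kmdump_dir" || return 1
    export KMDUMPISA=1          # dump the ISA
    export KMDUMPLLVM=1         # dump the LLVM IR
    export KMDUMPDIR="$kmdump_dir"
}

# Example: call it, then build and run the application in the same shell.
#   enable_kernel_dump "$HOME/rocfft_dump"
```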
rcprof is a command-line tool for profiling HIP kernels, very similar to nvprof. It is located in /opt/rocm/profiler/bin.
Example usage:
/opt/rocm/profiler/bin/rcprof -A ./your_executable
The trace output, apitrace.atp, is then written to your home directory.
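The invocation above can be wrapped so that a missing profiler or a missing trace file is reported instead of passing silently. This is only a sketch: the function name profile_api_trace and the overridable RCPROF variable are hypothetical, and only the -A option shown above is used.

```shell
# Hypothetical wrapper around the rcprof invocation shown above.
# RCPROF may be overridden if the profiler is installed elsewhere.
RCPROF="${RCPROF:-/opt/rocm/profiler/bin/rcprof}"

# Usage: profile_api_trace <executable> [args...]
profile_api_trace() {
    if [ ! -x "$RCPROF" ]; then
        echo "rcprof not found at $RCPROF" >&2
        return 1
    fi
    # -A is the option used in the example above.
    "$RCPROF" -A "$@" || return 1
    # The trace is expected in the home directory, as described above.
    if [ -f "$HOME/apitrace.atp" ]; then
        echo "trace written to $HOME/apitrace.atp"
    else
        echo "apitrace.atp not found in $HOME" >&2
        return 1
    fi
}

# Example (placeholder executable name):
#   profile_api_trace ./your_executable
```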
Download CodeXL and open the *.atp file with it. Note: switch CodeXL to profile mode before opening the *.atp file.
It also dumps several profile.HSA*.html files, which you can view in any web browser.
Run "/opt/rocm/profiler/bin/rcprof --help" for more options.
Use "nvprof ./your_executable" to profile every CUDA runtime invocation, including kernels and memory copies.