I am still getting familiar with the HSA runtime programming environment, so this may be a simple question. I am developing a small networking application (an IPv4 router) on a Kaveri machine that uses a GPU module for IPv4 route lookups. My GPU module is written in OpenCL, and I use cloc.sh to compile the kernel code to HSA code object (hsaco) format. I am using DPDK as my networking I/O driver for receiving and sending traffic.
I first tried to pass an array of pointers to rte_mbuf structures (rte_mbuf *) to the GPU kernel, so that the GPU retrieves the Ethernet frame (and the IPv4 header) directly and the CPU does not waste cycles parsing packet header fields (and avoids unnecessary cache misses). Unfortunately, my program crashes as soon as the GPU tries to access a packet's payload, and I get the following messages in my dmesg log:
On more careful analysis I discovered that I am correctly passing the pointers but the kernel crashes once it tries to dereference the pointers.
I then tried to pass an array of pointers to the Ethernet frames to the GPU (the CPU retrieves each packet pointer from its rte_mbuf structure), but this setup triggered exactly the same crash as above.
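To make the second setup concrete, here is a minimal sketch of the CPU-side step that turns a burst of mbufs into an array of frame pointers. It is not my actual code: `struct mock_mbuf` and `gather_frame_ptrs()` are hypothetical, simplified stand-ins so the snippet compiles without DPDK, and only the `buf_addr` and `data_off` fields are modeled. The pointer arithmetic follows what DPDK's `rte_pktmbuf_mtod()` computes (buffer start plus data offset).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for DPDK's struct rte_mbuf.
 * Only the two fields needed to locate the frame are modeled. */
struct mock_mbuf {
    void    *buf_addr;  /* start of the packet buffer */
    uint16_t data_off;  /* offset of the frame data within the buffer */
};

/* CPU-side step: resolve each mbuf to a pointer to its Ethernet frame.
 * Returns the number of pointers written to frames[]. */
static size_t gather_frame_ptrs(struct mock_mbuf *const *mbufs, size_t n,
                                uint8_t **frames)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Skip empty slots rather than hand the GPU a bad pointer. */
        if (mbufs[i] == NULL || mbufs[i]->buf_addr == NULL)
            continue;
        frames[count++] = (uint8_t *)mbufs[i]->buf_addr + mbufs[i]->data_off;
    }
    return count;
}
```

The resulting `frames` array is what gets passed as the kernel argument in this setup. The pointers in it are ordinary host virtual addresses into the DPDK packet buffers.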
I tried calling the hsa_memory_assign_agent() and hsa_memory_register() functions on the array of packet structures (both rte_mbuf * and uint8_t *), but this did not fix the problem. Any idea what I am doing wrong here?
H/W Specs:
model name : AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G
cpu MHz : 4000.000
cache size : 2048 KB
With Adrian's help I was able to find a bug in my GPU kernel code: I was accessing an out-of-bounds memory region. After fixing that bug, my program no longer crashes. However, whenever my kernel code dereferences a packet pointer, it reads only zero-valued fields, whether it is an Ethernet MAC address (00:00:00:00:00:00), an IP source address (0x00), and so on. I am sure that the CPU part of my code is not zeroing the packet buffers: when the kernel execution finishes and I read the packet contents from the CPU side, I see the correct values. Any ideas?
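For comparison, this is roughly how the expected field values can be computed on the CPU side and checked against what the kernel reports. It is only a sketch under stated assumptions: `struct hdr_fields` and `read_hdr_fields()` are hypothetical names, the frame is assumed to be untagged Ethernet II (no VLAN header) carrying IPv4, and the length check mirrors the kind of bounds guard that was missing in my kernel.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_HDR_LEN    14u      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define IPV4_MIN_LEN   20u      /* IPv4 header without options */
#define ETHERTYPE_IPV4 0x0800u

/* Hypothetical container for the fields being compared. */
struct hdr_fields {
    uint8_t  src_mac[6];
    uint32_t ip_src;   /* host byte order */
    uint32_t ip_dst;   /* host byte order */
};

/* Assemble a big-endian 32-bit value from four bytes. */
static uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Returns 0 on success, -1 if the frame is too short or not IPv4. */
static int read_hdr_fields(const uint8_t *frame, size_t len,
                           struct hdr_fields *out)
{
    /* Bounds guard: never read past the end of the frame. */
    if (frame == NULL || out == NULL || len < ETH_HDR_LEN + IPV4_MIN_LEN)
        return -1;

    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return -1;

    memcpy(out->src_mac, frame + 6, 6);        /* source MAC follows dst MAC */
    const uint8_t *ip = frame + ETH_HDR_LEN;   /* start of the IPv4 header */
    out->ip_src = load_be32(ip + 12);          /* source address field */
    out->ip_dst = load_be32(ip + 16);          /* destination address field */
    return 0;
}
```

On the CPU this returns the real addresses for every packet in the burst, while the kernel reading through the same pointers sees all zeros.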
S/W Specs:
Linux kernel version: 4.4.0-kfd-compute-rocm-rel-1.1.1-10
Intel dpdk-16.04
CLOC 1.0.11 (April 2016 update)
HSA Runtime v1.6
amdkfd v1.6.1
Thanks!