
ffwd will block when there are multiple client threads #2

Open
a9QrX3Lu opened this issue Sep 11, 2022 · 10 comments

a9QrX3Lu commented Sep 11, 2022

I'm trying to run ffwd on several of my machines, but I found that two or more client threads cause blocking in FFWD_EXEC. After some debugging, I found that when there are concurrent FFWD_EXEC calls, all client threads block waiting for the server's response, while the server never receives any client's request.

$ ./ffwd_sample -t 1 -s 2 -d 100 # this will run to completion
1 0.100 0.013
$ ./ffwd_sample -t 2 -s 2 -d 100 # this will block
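For context, each client thread issues delegated calls through the FFWD_EXEC macro and then spins until the server answers. A rough sketch of the client side (GET_CONTEXT and FFWD_EXEC are the macros from ffwd.h; the delegated function counter_inc and its argument are placeholders of mine, not from the sample):

/* Sketch of one delegated call as issued by a client thread. */
GET_CONTEXT();                       /* fetch this thread's ffwd client context      */
uint64_t ret;
/* Ask delegation server 0 to run counter_inc(1) on our behalf; FFWD_EXEC
 * publishes the request and spins until the server's response arrives.   */
FFWD_EXEC(0, counter_inc, ret, 1);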

jeriksson commented Sep 12, 2022 via email


a9QrX3Lu commented Sep 14, 2022

Hi Jakob,

This is the environment of my machine

  • Intel(R) Xeon(R) Gold 6238R CPU
  • 56 cores
  • 377G DRAM

I don't think I'm oversubscribing the cores: in the following test I only assign 2 servers and 2 clients, and it still blocks forever. The blocking persists when I increase the number of servers or clients.

./ffwd_sample -t 2 -s 2 -d 100 # Here, `-t` sets the number of client threads and `-s` the number of polling servers.

What I've tried so far:

  • I've tried several different Xeon machines, and all of them block.
  • ffwd-memcached and ffwd-hashtable show the same behavior as the ffwd_sample test above.
  • Tests only run to completion when I limit the number of client threads (-t) to one.


jeriksson commented Sep 14, 2022 via email


a9QrX3Lu commented Sep 14, 2022

Have a look at htop when the program is running. The program should be using 4 cores 100%, all green (user space). Is that what you see?

Yes. The following is the output of htop. Every CPU below is at 100%, all green. However, CPU 1 holds two ffwd_sample threads, each getting 50% of the CPU.

CPU     PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
105 3995483 wangzl     20   0  435M  2408  1956 R 100.  0.0  0:53.43 ./ffwd_sample -t 2 -s 2 -d 100
 78 3995484 wangzl     20   0  435M  2408  1956 R 100.  0.0  0:53.44 ./ffwd_sample -t 2 -s 2 -d 100
 57 3995482 wangzl     20   0  435M  2408  1956 R 100.  0.0  0:53.44 ./ffwd_sample -t 2 -s 2 -d 100
  1 3995481 wangzl     20   0  435M  2408  1956 R 50.0  0.0  0:26.75 ./ffwd_sample -t 2 -s 2 -d 100
  1 3995486 wangzl     20   0  435M  2408  1956 R 50.0  0.0  0:26.68 ./ffwd_sample -t 2 -s 2 -d 100
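Side note: to rule out a pinning problem, the placement of each thread can be dumped with a small diagnostic like the one below (my own sketch, not part of ffwd; print_affinity is a hypothetical helper):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>

/* Hypothetical diagnostic: print the CPU the calling thread is currently
 * running on, plus the set of CPUs it is allowed to run on. */
static void print_affinity(const char *tag)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0) {
        perror("sched_getaffinity");
        return;
    }
    printf("%s: on CPU %d, allowed:", tag, sched_getcpu());
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            printf(" %d", cpu);
    printf("\n");
}

Calling this at the top of each client and server thread would show whether two threads really ended up pinned to the same core.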

a9QrX3Lu commented Sep 14, 2022

I've added two printf calls to FFWD_EXEC to gather logs on concurrent execution. I hope this provides some hints.

The two added printf calls (marked with +):

#define FFWD_EXEC(server_no, function, ret, ...) \
+   printf("context=%p server_no=%d\n", context, server_no); \
    context->request[server_no]->fptr = function; \
    prepare_request(context->request[server_no], __VA_ARGS__); \
    context->local_client_flag[server_no] ^= context->mask; \
    context->request[server_no]->flag = context->local_client_flag[server_no]; \
    while(((context->server_response[server_no]->flags ^ context->local_client_flag[server_no]) & context->mask)){ \
      __asm__ __volatile__("rep;nop": : :"memory"); \
    } \
+   printf("get_value\n"); \
    ret = context->server_response[server_no]->return_values[((context->id_in_chip)) % NCLIENTS]; \

#define GET_CONTEXT() \
  struct ffwd_context *context = ffwd_get_context();

Runtime log:

context=0x565401c0f860 server_no=1
get_value
...
context=0x565401c0f860 server_no=1
get_value
context=0x565401c0f860 server_no=0
get_value
context=0x565401c0f860 server_no=1
get_value
context=0x565401c0f7d0 server_no=0
context=0x565401c0f860 server_no=0
get_value
# Blocking starts here

It seems that when the second context (i.e., the second client thread) starts to send messages, both client threads block.
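To make it easier to see what the spin is waiting for: as I read the macro, each client toggles its own bit in the request flag and then spins until the same bit flips back in the server's response word. A minimal standalone model of that handshake (my own simplification to illustrate the protocol; the struct and names are mine, not ffwd's actual layout):

#include <stdint.h>

/* One client/server flag pair: the client toggles its bit in req_flag,
 * and the server echoes the toggle back in resp_flags once the call is
 * done. */
struct slot {
    volatile uint64_t req_flag;
    volatile uint64_t resp_flags;
};

static void client_call(struct slot *s, uint64_t mask, uint64_t *local_flag)
{
    *local_flag ^= mask;          /* toggle our bit                      */
    s->req_flag = *local_flag;    /* publish the request                 */
    /* Spin until the server's echoed bit matches what we last wrote.    */
    while ((s->resp_flags ^ *local_flag) & mask)
        __asm__ __volatile__("rep;nop" : : : "memory");   /* PAUSE       */
}

If several clients share one response word, the server has to echo each client's bit in the right position, so a mixed-up client id could leave a client spinning on a bit the server never flips.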


jeriksson commented Sep 14, 2022 via email

a9QrX3Lu commented Sep 14, 2022

I've added "mfence" to the spin loop:

#define FFWD_EXEC(server_no, function, ret, ...) \
+   printf("context=%p server_no=%d\n", context, server_no); \
    context->request[server_no]->fptr = function; \
    prepare_request(context->request[server_no], __VA_ARGS__); \
    context->local_client_flag[server_no] ^= context->mask; \
    context->request[server_no]->flag = context->local_client_flag[server_no]; \
    while(((context->server_response[server_no]->flags ^ context->local_client_flag[server_no]) & context->mask)){ \
~     __asm__ __volatile__("rep;nop;mfence": : :"memory"); \
    } \
+   printf("get_value\n"); \
    ret = context->server_response[server_no]->return_values[((context->id_in_chip)) % NCLIENTS]; \

After recompiling and re-running, the behavior seems to be the same:

context=0x5588004c7860 server_no=1
get_value
context=0x5588004c7860 server_no=0
get_value
context=0x5588004c7860 server_no=1
get_value
context=0x5588004c7860 server_no=0
get_value
context=0x5588004c7860 server_no=1
get_value
context=0x5588004c7860 server_no=0
context=0x5588004c77d0 server_no=0
get_value
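In case it matters, an explicit acquire load would be another way to make the intended ordering explicit without a full fence (my own variant, not from the ffwd sources; the macro's line continuations are omitted here):

/* Hypothetical variant of the spin: read the response word with a GCC
 * __atomic acquire load instead of adding mfence inside the loop body. */
while ((__atomic_load_n(&context->server_response[server_no]->flags,
                        __ATOMIC_ACQUIRE)
        ^ context->local_client_flag[server_no]) & context->mask) {
    __asm__ __volatile__("rep;nop" : : : "memory");  /* PAUSE hint */
}

Given that the mfence version behaves identically, though, this looks less like a memory-ordering problem and more like the request or response landing in the wrong slot.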


jeriksson commented Sep 14, 2022 via email


a9QrX3Lu commented Sep 14, 2022

context=0x56229556b860 context->id=2 context->id_in_chip=2 server_no=0
get_value
context=0x56229556b860 context->id=2 context->id_in_chip=2 server_no=1
get_value
context=0x56229556b860 context->id=2 context->id_in_chip=2 server_no=0
get_value
context=0x56229556b860 context->id=2 context->id_in_chip=2 server_no=1
get_value
context=0x56229556b860 context->id=2 context->id_in_chip=2 server_no=0
context=0x56229556b7d0 context->id=0 context->id_in_chip=-2 server_no=0
get_value

The second context's id_in_chip is a negative number. Maybe this is the cause? I'll look into the CPU-core configuration part of the ffwd code.

There are two NUMA nodes on my machine and hyperthreading is enabled, so there are 28 physical cores (56 logical cores) on each socket.
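To illustrate what I suspect (the arithmetic below is purely my guess at the kind of mapping involved, not the actual ffwd code): if the per-chip client id is derived as an offset from an assumed socket base core, an interleaved NUMA numbering breaks that assumption:

/* Hypothetical illustration: suppose ids were computed as
 *     id_in_chip = core_id - socket_base
 * under the assumption that each socket owns a contiguous core range.
 * With interleaved numbering (node0: 0,2,4,...; node1: 1,3,5,...),
 * a thread on core 0 with an assumed base of 2 gets id_in_chip = -2,
 * matching the value in the log above. */
int socket_base = 2;   /* assumed first core of the socket (wrong here) */
int core_id    = 0;    /* core the client thread actually landed on     */
int id_in_chip = core_id - socket_base;   /* = -2                       */

A negative id_in_chip would then index return_values out of range via ((context->id_in_chip)) % NCLIENTS, since C's % keeps the sign of the dividend.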


jeriksson commented Sep 14, 2022 via email
