-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(shim): change the current dir back to bundle path #577
Conversation
This commit changes the current dir for the shim process back to the bundle path (e.g. `/run/containerd/io.containerd.runtime.v2.task/<ns>/<id>`) after `shim::wait()` finishes. The motivation for this change is that the `create` request handled by youki's libcontainer will change the current dir to container path (e.g. `/run/containerd/wasmtime/<ns>/<id>`), this path does not contain the `address` file for getting the shim socket address that needs to be deleted by the ttrpc server closes. So in order to properly delete the socket path, the current directory used by the shim crate has to be the bundle path. Signed-off-by: jiaxiao zhou <jiazho@microsoft.com>
This does not seem like the correct fix. |
Looks like that code is happening in
|
The problem is that the shim process is calling youki's APIs directly (e.g. this) and youki calls the |
@utam0k Its not clear to me from the code why youki is changing directories here. Can you chime in? |
There is another place where youki changes the directory to container_path: this, |
I believe this |
At least, it is unnecessary if you don't use pseudo terminals. |
Closing as youki-dev/youki#2780 will fix the issue |
The upstream fix is in: youki-dev/youki#2780 |
Merged in upstream fix by this PR #601 |
This commit changes the current dir for the shim process back to the bundle path (e.g.
/run/containerd/io.containerd.runtime.v2.task/<ns>/<id>
) aftershim::wait()
finishes.The motivation for this change is that the
create
request handled by youki's libcontainer will change the current dir to container path (e.g./run/containerd/wasmtime/<ns>/<id>
), this path does not contain theaddress
file for getting the shim socket address that needs to be deleted by the ttrpc server closes.So in order to properly delete the socket path, the current directory used by the shim crate has to be the bundle path.
Perf test regarding the startup of the process BEFORE the change
As reported by @rperezvaz at https://cloud-native.slack.com/archives/C04LTPB6Z0V/p1705059273058979
Perf test regarding the startup of the process AFTER the change
Explanation
The primary reason for the huge start-up time reduction is due to that the shim socket has not been properly cleaned up after the shim get closes and so every time the shim starts, it hangs for 1 whole second to try to reconnect with an existing socket address. See more containerd/rust-extensions#263 and spinframework/containerd-shim-spin#41 (comment))
bpftrace (unchanged)
Notice how
libcontainer
handler changes the process's current directory to/run/containerd/wasmtime/default/testwasm
, and at the end ofdelete
request,libcontainer
cleans up this path, so after theshim::wait()
, the address path is not longer able to be found by using the current directory because it no longer exists!