-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime support for CPU hoist ops #2152
base: main
Are you sure you want to change the base?
Conversation
### Problem description This PR adds support for CPU hoisting changes to our TTNNToFlatbuffer translation. ### What's changed This PR introduces new flatbuffer fields for CPU hoisted ops. It hooks up the TTNNToFlatbuffer pass to properly utilize passes introduced in other PRs s.t. a CPU dylib can be embedded in our flatbuffer. Note: this PR is pretty much impossible to test standalone, unfortunately. 2 following PRs will be strictly dependent on this PR merging: 1. #2148 this PR adds earlier support for hoisting in the TTNNPipeline for TTIRToTTNN. However, this generates IR that TTNNToFlatbuffer pass cannot parse until this PR lands. (So this TTNNToFlatbuffer PR is useless without TTIRToTTNN PR, but TTIRToTTNN breaks ttmlir-translate without this PR 😄 ). 2. #2152 Runtime PR, which will add support for actually executing new flatbuffers. This is obviously dependent on this PR + its flatbuffer changes as well.
### Problem description This PR adds support for CPU hoisting changes to our TTNNToFlatbuffer translation. ### What's changed This PR introduces new flatbuffer fields for CPU hoisted ops. It hooks up the TTNNToFlatbuffer pass to properly utilize passes introduced in other PRs s.t. a CPU dylib can be embedded in our flatbuffer. Note: this PR is pretty much impossible to test standalone, unfortunately. 2 following PRs will be strictly dependent on this PR merging: 1. #2148 this PR adds earlier support for hoisting in the TTNNPipeline for TTIRToTTNN. However, this generates IR that TTNNToFlatbuffer pass cannot parse until this PR lands. (So this TTNNToFlatbuffer PR is useless without TTIRToTTNN PR, but TTIRToTTNN breaks ttmlir-translate without this PR 😄 ). 2. #2152 Runtime PR, which will add support for actually executing new flatbuffers. This is obviously dependent on this PR + its flatbuffer changes as well.
1ea72a8
to
2fbe384
Compare
2fbe384
to
d947859
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clang-Tidy
found issue(s) with the introduced code (1/1)
@@ -171,6 +175,11 @@ class ProgramContext { | |||
|
|||
DeviceVariant getTargetDevice(uint32_t meshId); | |||
|
|||
void *tryGetDylibHandle(const uint32_t dylibId) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think we can drop the const here and use uint32_t
directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we definitely can but I'm not sure it's preferable. Personally, I prefer primitives to be const wherever possible, including params. But I don't super strongly I guess, our code is very const-unfriendly in general thanks to MLIR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dialect changes look good to me!
Problem description
Add runtime support for executing hoisted CPU ops.
What's changed
Add mechanism for reading + opening dylibs from flatbuffer, executing CPU ops (by executing proper function within a given dylib).
Checklist
Verify emptyOp logic works correctlyseparate follow-on PR to support empty tensors on host: Create specific TTNN op for empty tensor on host #2006