Changing determining parallel config logic #18601
base: main
Conversation
@@ -123,6 +137,8 @@ ParallelConfig determine_parallel_config(
            ? find_closest_largest_divisor_with_num_padding(
                  out_channels_ntiles, input_channles_ntiles, start_divisor_c)
            : find_closest_largest_divisor(out_channels_ntiles, input_channles_ntiles, start_divisor_c);
    set_shard_width_to_half_tile_if_possible(
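The helpers in this hunk choose how many cores the channel dimension is split across. As a rough, hedged illustration of the core idea (the real ttnn helpers take more arguments, including a padding-aware variant, so this simplified two-argument sketch is not the actual implementation):

```python
def find_closest_largest_divisor(num: int, start_divisor: int) -> int:
    """Simplified sketch of the divisor search implied by the diff:
    walk down from start_divisor until an exact divisor of num is
    found. The real ttnn helper has a different signature and a
    variant that allows padding; treat this as illustrative only."""
    divisor = start_divisor
    while num % divisor != 0:
        divisor -= 1
    return divisor

# e.g. splitting 20 output-channel tiles across at most 8 cores
print(find_closest_largest_divisor(20, 8))  # largest divisor of 20 that is <= 8
```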
Why half tile?
I feel that this will eliminate the non-tile-multiple shard width scenario in conv in some cases. You can refer to this. Please make sure that we do not bypass that.
I think that feature was removed in #17937, so we expect tile multiples for conv.
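For context, a hedged sketch of what a helper like `set_shard_width_to_half_tile_if_possible` (the name comes from the diff above) might do. The signature and behaviour here are assumptions for illustration, not the actual ttnn implementation:

```python
TILE_WIDTH = 32  # the usual tile width on Tenstorrent hardware

def set_shard_width_to_half_tile_if_possible(total_width: int,
                                             shard_width: int) -> int:
    """Hypothetical sketch: fall back to a half-tile shard width when
    the full width divides evenly into half tiles but not into the
    current (full-tile) shard width. Assumptions only, not real code."""
    half_tile = TILE_WIDTH // 2
    if total_width % shard_width != 0 and total_width % half_tile == 0:
        return half_tile
    return shard_width
```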
ohh.. I did not know that :|
@@ -306,7 +308,8 @@ void py_bind_conv2d(py::module& module) {
        py::arg("compute_grid_size"),
        py::arg("block_shard_orientation"),
        py::arg("enable_channels_padding"),
        py::arg("is_out_tiled") = true);
        py::arg("is_shard_height_tile_multiple") = true,
Is there a way to eliminate these args? My only concern is that you are relying on the user to do the right thing. Not that it is too bad, but if it can be avoided, please do so.
That determine_parallel_config is the same function we use in our ops to choose which cores an op is going to be executed on, so I guess these cannot be avoided. But those args are optional, so the user can skip them.
Not the function, the args. And yes, the user can avoid them in the tile-multiple case, but in the case of a non-tile-multiple shard height and width, the user will have to calculate them explicitly, right? (Or am I missing something?)
I think there is no intuitive way to avoid them: if we just call this function, we don't know whether it is being used for conv or for pool, and it is going to have different behaviour based on the type of op. A more intuitive parameter could possibly be the type of op, but we don't know whether, for example, some pool operations will work on whole tiles and some won't.
@@ -703,7 +722,6 @@ Conv2dConfig determine_conv_config_for_auto_shard(
    ShardOrientation shard_orientation =
        conv_config.transpose_shards ? ShardOrientation::COL_MAJOR : ShardOrientation::ROW_MAJOR;
    const bool is_out_tiled = conv_config.output_layout == Layout::TILE;
This condition also seems more intuitive to the user (we usually have output layouts defined). With the latest change, the user will have to find the shard shape and then calculate whether it is a tile multiple or not. Well, I know I am not solving anything, just putting my thoughts here so you can consider all aspects before mainlining.
For conv, the shard shape is always a tile multiple in both dimensions.
@@ -63,7 +63,8 @@ def get_conv_input_memory_config(
        compute_grid_size=compute_grid,
        block_shard_orientation=ttnn.ShardOrientation.ROW_MAJOR,
        enable_channels_padding=True,
        is_out_tiled=True,
        is_shard_height_tile_multiple=True,
stick to defaults on convs
Approved assuming pipelines are green
Ticket
#18198
Problem description
Some maxpool width-sharded tests fail on Blackhole because they had a non-tile-multiple width per shard.
What's changed
Change the logic for determining the parallel config: introduce two parameters (is_shard_height_tile_multiple and is_shard_width_tile_multiple) instead of is_out_tiled, and based on them decide which cores are going to be used. The logic is implemented with the consideration that conv ops need tile-multiple shards, while pool ops can have anything for the shard height, and a tile multiple or half tile for the shard width. This approach will need to be modified when other shard dimensions are supported.
Checklist
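The effect of the two flags described above can be sketched roughly as follows. This is illustrative only: `round_up` and `shard_dims` are hypothetical helpers, and the tile size of 32 is the usual Tenstorrent value, not something taken from this PR.

```python
TILE = 32  # the usual tile height/width on Tenstorrent hardware

def round_up(value: int, multiple: int) -> int:
    # ceiling division, then scale back up to the nearest multiple
    return -(-value // multiple) * multiple

def shard_dims(height: int, width: int,
               is_shard_height_tile_multiple: bool,
               is_shard_width_tile_multiple: bool) -> tuple:
    """Hypothetical sketch of the new parameterization: conv passes
    True for both flags (shards padded up to full tiles); pool may
    pass False for the height (any value allowed) and, per the PR
    description, may use half-tile granularity for the width."""
    h = round_up(height, TILE) if is_shard_height_tile_multiple else height
    w = (round_up(width, TILE) if is_shard_width_tile_multiple
         else round_up(width, TILE // 2))
    return h, w

# conv-style call: both dims padded to full tiles
print(shard_dims(50, 40, True, True))
# pool-style call: exact height, half-tile-multiple width
print(shard_dims(50, 40, False, False))
```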