#0: temp workaround on TG resnet trace+2cq hang #18750

Merged
merged 1 commit into from
Mar 8, 2025

Conversation

yugaoTT
Contributor

@yugaoTT yugaoTT commented Mar 6, 2025

P0 temp fix for resnet50, which is hanging ND on TG for trace+2cq because an extra rt arg is being sent.
It removes the extra args for the sharded case.
#18724 (comment)
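
A rough illustration of the idea, not the actual diff: the extra runtime arg is appended only when the op is not sharded, so the sharded path sends the base args and nothing more. Every name here (`build_runtime_args`, `base_args`, `extra_arg`, `is_sharded`) is hypothetical and does not come from the tt-metal code.

```python
def build_runtime_args(base_args, extra_arg, is_sharded):
    """Return the runtime-arg list for one kernel (hypothetical helper)."""
    args = list(base_args)  # copy so the caller's list is not mutated
    if not is_sharded:
        # Only the non-sharded path still receives the extra arg.
        args.append(extra_arg)
    return args


if __name__ == "__main__":
    non_sharded = build_runtime_args([4096, 32], 7, is_sharded=False)
    sharded = build_runtime_args([4096, 32], 7, is_sharded=True)
    print(len(non_sharded), len(sharded))
```

In this sketch the sharded call returns one fewer arg than the non-sharded one, which is the behavior the description above asks for.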

@tt-rkim
Collaborator

tt-rkim commented Mar 7, 2025

Running single card full pipes to be sure: https://github.com/tenstorrent/tt-metal/actions/runs/13719964134

This will probably go in though

@tt-rkim
Collaborator

tt-rkim commented Mar 7, 2025

Looks like slight regression: https://github.com/tenstorrent/tt-metal/actions/runs/13719964134/job/38373196371#step:10:65

We can bump this for now?

@yugaoTT
Contributor Author

yugaoTT commented Mar 7, 2025

@tt-rkim is that pipeline stable? My change shouldn't make it worse (if anything, it should be slightly better).

@yugaoTT
Contributor Author

yugaoTT commented Mar 7, 2025

@tt-rkim tt-rkim requested review from esmalTT, uaydonat and a team as code owners March 7, 2025 19:35
@tt-rkim
Collaborator

tt-rkim commented Mar 7, 2025

Yes that test is definitely stable...

@tt-rkim
Collaborator

tt-rkim commented Mar 7, 2025

Oh wait - that wasn't a mistake
I will bump it down. We haven't seen that before

@tt-rkim
Collaborator

tt-rkim commented Mar 7, 2025

cc: @esmalTT - bumping down experimental unet perf test threshold in e2e perf

@tt-rkim
Collaborator

tt-rkim commented Mar 7, 2025

#0: fixes

#0: Bump down unet fps to accommodate

Revert "#0: Bump down unet fps to accommodate"

This reverts commit 744a4ad.

Revert "Revert "#0: Bump down unet fps to accommodate""

This reverts commit e045504.
@tt-rkim tt-rkim merged commit 15db9cc into main Mar 8, 2025
14 checks passed
@tt-rkim tt-rkim deleted the yugao/resnet branch March 8, 2025 00:01
4 participants