-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the execution phase and stage recording issue and replace compare_with_golden with verify function #1358
Merged
chandrasekaranpradeep
merged 1 commit into
main
from
pchandrasekaran/fix_execution_depth_recording
Mar 7, 2025
Merged
Fix the execution phase and stage recording issue and replace compare_with_golden with verify function #1358
chandrasekaranpradeep
merged 1 commit into
main
from
pchandrasekaran/fix_execution_depth_recording
Mar 7, 2025
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
|
|
1 similar comment
|
|
vkovinicTT
reviewed
Mar 4, 2025
9a885b6
to
86c3e5b
Compare
|
1 similar comment
|
|
1 similar comment
|
86c3e5b
to
fda39e1
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1358 +/- ##
=======================================
Coverage 43.40% 43.40%
=======================================
Files 48 48
Lines 7860 7860
=======================================
Hits 3412 3412
Misses 4448 4448 ☔ View full report in Codecov by Sentry. |
|
|
|
|
vkovinicTT
requested changes
Mar 6, 2025
nvukobratTT
approved these changes
Mar 6, 2025
fda39e1
to
6288878
Compare
6288878
to
e771e4a
Compare
ashokkumarkannan1
requested changes
Mar 6, 2025
|
|
|
|
e771e4a
to
1a23cc9
Compare
ashokkumarkannan1
approved these changes
Mar 7, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!!!
1a23cc9
to
898dd24
Compare
vkovinicTT
requested changes
Mar 7, 2025
898dd24
to
6ae6015
Compare
vkovinicTT
approved these changes
Mar 7, 2025
…_with_golden with verify function
6ae6015
to
d2b1f0d
Compare
|
|
|
|
|
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Problem Description
The execution phase and stage are recorded incorrectly whenever test cases contain post-processing code that reruns the compiled model.
Example:
In the table above, both test cases passed verification, but the execution phase and stage for the forge/test/models/pytorch/vision/resnet/test_resnet.py::test_resnet_hf[microsoft/resnet-50] test case are incorrect. The execution phase should be PASSED, and the execution stage should be VERIFICATION.
This occurs because the test case contains a post-processing function (
run_and_print_results
) that reruns the compiled model, which updates the execution phase from PASSED to EXECUTED_TTNN and the execution stage from VERIFICATION to EXECUTED_TTNN.On the other hand, another ResNet test case, forge/test/models/pytorch/vision/resnet/test_resnet.py::test_resnet_timm, does not contain post-processing, so the execution phase and stage remain unchanged.
Issue Origin
Execution depth (EXECUTED_TTNN) is recorded inside the CompiledModel class in
forge/forge/compiled_graph_state.py
, specifically in the call method, at this line.Whenever the compiled model is rerun, it updates the execution depth again.
Reason for Current Implementation
Some test cases still use the
compare_with_golden
function for verification instead ofverify
. Test cases usingcompare_with_golden
may run the compiled model anywhere inside the test function. To ensure execution depth tracking, EXECUTED_TTNN was recorded inside the call method of the CompiledModel class.Proposed Fix
To prevent incorrect updates of execution phase and stage due to post-processing:
EXECUTED_TTNN
) inside theverify
function rather than inside the call method in the CompiledModel class.compare_with_golden
withverify
in test cases.Benefits of This Change
Prevents Depth Changes During Post-Processing
verify
, execution depth remains correct.Ensures Proper Depth Handling Across Loops (e.g., Epochs)
compare_with_golden
is replaced withverify
inforge/test/mlir/mnist/training/test_training.py
, consider this scenario:verify
, execution depth is tracked correctly per epoch, allowing proper rollback if mismatches occur later.Note:
@vkovinicTT replaced the compare_with_golden with verify function for the
forge/test/mlir/mnist/training/test_training.py
in this PR. So I haven't replaced it forforge/test/mlir/mnist/training/test_training.py
fileLlama Decode with cache:
The to_pt_tensors function was modified to accept the List/tuple of tensor in this commit but in the llama decode with cache test will pass the
List[input_ids, attention_mask, postion_ids, past_key_values]
as input, here thepast_key_values
islist of tuple of tensor
, so modified the llama decode with cache test to accept as list of tensor and update the llama decode implementation to process past key values as list of tensor, not as list of tuple of tensors.