Add verify_backward to enable testing backward ops #1383

ndrakulicTT · 2025-03-06T14:05:57Z

Ticket

Fixes #1356

Problem description

There was no single place to verify the gradients produced by the backward pass.

What's changed

Added verify_backward:

Checks input parameters
Runs backward on the compiled model
Saves gradients from the compiled model
Runs backward on the framework model
Saves gradients from the framework model
Checks the gradients
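
The steps above can be sketched roughly as follows. This is an illustration only, not the code added in this PR: the function name `compare_backward_sketch` and its signature are hypothetical, and a second `torch.nn.Module` with identical weights stands in for the compiled model, because the `CompiledModel` API is not shown in this conversation.

```python
import copy

import torch


def compare_backward_sketch(inputs, output_grad, framework_model, stand_in_compiled,
                            rtol=1e-5, atol=1e-6):
    """Hypothetical sketch: run backward on two models and compare gradients."""
    # Step 1: check input parameters.
    if not isinstance(framework_model, torch.nn.Module):
        raise TypeError(
            f"Framework model must be of type {torch.nn.Module}, but got {type(framework_model)}"
        )
    if not isinstance(output_grad, torch.Tensor):
        raise TypeError(f"Output gradient must be a torch.Tensor, but got {type(output_grad)}")

    def run_backward(model):
        # Fresh leaf copies of the inputs so each run accumulates its own gradients.
        model.zero_grad()
        leaf_inputs = [x.detach().clone().requires_grad_(True) for x in inputs]
        out = model(*leaf_inputs)
        out.backward(gradient=output_grad)
        grads = [x.grad.clone() for x in leaf_inputs]
        grads += [p.grad.clone() for p in model.parameters() if p.grad is not None]
        return grads

    # Steps 2-3: backward on the (stand-in) compiled model, save its gradients.
    co_grads = run_backward(stand_in_compiled)
    # Steps 4-5: backward on the framework model, save its gradients.
    fw_grads = run_backward(framework_model)

    # Step 6: check the gradients pairwise.
    if len(fw_grads) != len(co_grads):
        raise AssertionError("Different number of gradients")
    for i, (fw, co) in enumerate(zip(fw_grads, co_grads)):
        if not torch.allclose(fw, co, rtol=rtol, atol=atol):
            raise AssertionError(f"Gradient {i} does not match")
    return True


torch.manual_seed(0)
framework_model = torch.nn.Linear(4, 3)
stand_in = copy.deepcopy(framework_model)  # assumption: same weights as the framework model
x = torch.randn(2, 4)
grad = torch.ones(2, 3)  # must have the same shape as the model output
result = compare_backward_sketch([x], grad, framework_model, stand_in)
```

In this sketch the comparison uses `torch.allclose` with fixed tolerances. The PR itself imports `VerifyConfig`, so the real tolerance handling may differ.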

Checklist

New/Existing tests provide coverage for changes

codecov-commenter · 2025-03-06T15:21:47Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (main@184b44a). Learn more about missing BASE report.

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #1383   +/-   ##
=======================================
  Coverage        ?   43.40%           
=======================================
  Files           ?       48           
  Lines           ?     7860           
  Branches        ?        0           
=======================================
  Hits            ?     3412           
  Misses          ?     4448           
  Partials        ?        0


github-actions · 2025-03-06T16:12:28Z

	Tests	Passed ✅	Skipped ⚠️	Failed
TT-Forge-FE Tests	642 ran	499 passed	143 skipped	0 failed

Test	Result
No test annotations available

github-actions · 2025-03-06T16:39:12Z

	Tests	Passed ✅	Skipped ⚠️	Failed
TT-Forge-FE Tests	701 ran	563 passed	138 skipped	0 failed

Test	Result
No test annotations available

github-actions · 2025-03-06T16:41:52Z

	Tests	Passed ✅	Skipped ⚠️	Failed
TT-Forge-FE Tests	642 ran	499 passed	143 skipped	0 failed

Test	Result
No test annotations available

github-actions · 2025-03-06T16:49:47Z

	Tests	Passed ✅	Skipped ⚠️	Failed
TT-Forge-FE Tests	701 ran	563 passed	138 skipped	0 failed

Test	Result
No test annotations available

vkovinicTT · 2025-03-07T10:23:00Z

forge/forge/verify/verify.py

+    if not isinstance(framework_model, torch.nn.Module):
+        raise TypeError(f"Framework model must be of type {torch.nn.Module}, but got {type(framework_model)}")


So for now we just support backward verification for torch?

vkovinicTT · 2025-03-07T11:45:06Z

forge/forge/verify/verify.py

+    compiled_output: torch.Tensor,
+    framework_model: torch.nn.Module,
+    compiled_model: CompiledModel,
+    original_model: torch.nn.Module = None,


Do we really need original_model?

vkovinicTT · 2025-03-07T13:13:16Z

forge/forge/verify/verify.py

+        fw = _squeeze_tensor(fw)
+        co = _squeeze_tensor(co)


Can we just use regular squeeze instead?
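
For context on this question, a small sketch of what the built-in squeeze does. What `_squeeze_tensor` does beyond this is not shown in this excerpt, so the example only illustrates `torch.squeeze`; the tensor shapes are made up for illustration.

```python
import torch

# Two tensors that differ only in singleton dimensions.
fw = torch.zeros(1, 3, 1, 4)
co = torch.zeros(3, 4)

# torch.squeeze with no dim argument removes every dimension of size 1.
fw_squeezed = torch.squeeze(fw)
co_squeezed = co.squeeze()
shapes_match = fw_squeezed.shape == co_squeezed.shape

# Caveat: a tensor whose dimensions are all size 1 collapses to a 0-d tensor.
scalar_like = torch.zeros(1, 1).squeeze()
```

One thing to keep in mind is that the built-in also drops a batch dimension of size 1, which may or may not be what a gradient comparison wants.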

vkovinicTT · 2025-03-07T13:17:19Z

forge/forge/verify/verify.py

+    VerifyConfig,
+    VerifyTensorMetadata,
+    should_waive_gradient,
+    AutomaticValueChecker,


Please remove the AutomaticValueChecker import if we don't use it here.

vkovinicTT · 2025-03-07T13:24:27Z

forge/test/mlir/llama/tests/test_specific_ops_llama32.py

+    # NOTE: We probably need two framework models with the same state_dict to compare the outputs
+    #       But for now it works without that for some reason?
+    # model_for_compile = Matmul()
+    # model_for_compile.eval() if not training else model_for_compile.train()
+    # model_for_compile.load_state_dict(framework_model.state_dict())


Discussed offline, we probably don't need 2 framework models.
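
For reference, the two-model setup described in the commented-out lines could look roughly like the sketch below. It uses `torch.nn.Linear` as a stand-in because the `Matmul` test module is not shown here, and the variable names only mirror the ones in the diff.

```python
import torch

torch.manual_seed(0)
framework_model = torch.nn.Linear(4, 2)
model_for_compile = torch.nn.Linear(4, 2)  # separately initialized, so weights differ at first

# Copy the parameters so that both models start from the same state.
model_for_compile.load_state_dict(framework_model.state_dict())
model_for_compile.eval()
framework_model.eval()

x = torch.randn(3, 4)
same_output = torch.allclose(framework_model(x), model_for_compile(x))
```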

ndrakulicTT added 4 commits March 5, 2025 09:26

WIP

334461e

Add verify_backward

d43e6bb

Added testing backward for matmul

1b93fbe

Mark tests properly

09a14bb

ndrakulicTT requested review from vladimirjovanovicTT and vkovinicTT March 6, 2025 14:05

ndrakulicTT requested review from nvukobratTT, pilkicTT and dgolubovicTT as code owners March 6, 2025 14:05

vkovinicTT reviewed Mar 7, 2025
