Commit
[PyTorch] Stop storing fused weight tensor in linear modules (NVIDIA#719)

* Support noop concat without providing full tensor

Stop storing fused buffers in linear modules.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug noop cat func

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Construct TE modules in tests with correct dtypes

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add tolerances to numerical tests

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Use plain PyTorch concat when exporting to ONNX

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
Signed-off-by: Pawel Gadzinski <pgadzinski@nvidia.com>