### Issue
#370

### Description
Adds IOMMU support for Blackhole in a way that should be transparent to the application.

### List of the changes
* Allow Blackhole to have multiple hugepages / host memory channels
* Add an API on TTDevice for iATU programming
* Rehome Blackhole iATU programming code to blackhole_tt_device.cpp
* Remove unnecessary logic to determine hugepage quantity (just use what the application passes to Cluster constructor)
* Add sysmem tests for Blackhole.

### Testing
Manual testing was performed for both IOMMU on and IOMMU off cases using the newly-added sysmem tests for Blackhole.

With IOMMU on:
```
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from SiliconDriverBH
[ RUN ] SiliconDriverBH.SysmemTestWithPcie
Detecting chips (found 1)
2024-12-10 20:40:07.019 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.020 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.083 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:40:07.083 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:07.083 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:07.170 | INFO | SiliconDriver - Allocating sysmem without hugepages (size: 0x40000000).
2024-12-10 20:40:07.417 | INFO | SiliconDriver - Mapped sysmem without hugepages to IOVA 0x3ffffff80000000.
2024-12-10 20:40:07.418 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0x3ffffff80000000
[ OK ] SiliconDriverBH.SysmemTestWithPcie (658 ms)
[ RUN ] SiliconDriverBH.RandomSysmemTestWithPcie
2024-12-10 20:40:07.672 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.672 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:07.731 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:40:07.731 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:07.731 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:07.818 | INFO | SiliconDriver - Allocating sysmem without hugepages (size: 0x40000000).
2024-12-10 20:40:08.081 | INFO | SiliconDriver - Mapped sysmem without hugepages to IOVA 0x3ffffff80000000.
2024-12-10 20:40:08.327 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:08.327 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:40:08.387 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:40:08.387 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:40:08.387 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: enabled
2024-12-10 20:40:08.474 | INFO | SiliconDriver - Allocating sysmem without hugepages (size: 0x100000000).
2024-12-10 20:40:09.453 | INFO | SiliconDriver - Mapped sysmem without hugepages to IOVA 0x3fffffe00000000.
2024-12-10 20:40:09.453 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0x3fffffe00000000
2024-12-10 20:40:09.454 | INFO | SiliconDriver - Device: 0 Mapping iATU region 1 from 0x40000000 to 0x7fffffff to 0x3fffffe40000000
2024-12-10 20:40:09.454 | INFO | SiliconDriver - Device: 0 Mapping iATU region 2 from 0x80000000 to 0xbfffffff to 0x3fffffe80000000
2024-12-10 20:40:09.454 | INFO | SiliconDriver - Device: 0 Mapping iATU region 3 from 0xc0000000 to 0xffffffff to 0x3fffffec0000000
[ OK ] SiliconDriverBH.RandomSysmemTestWithPcie (7754 ms)
[----------] 2 tests from SiliconDriverBH (8413 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (8413 ms total)
[ PASSED ] 2 tests.
```

With IOMMU in passthrough:
```
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from SiliconDriverBH
[ RUN ] SiliconDriverBH.SysmemTestWithPcie
Detecting chips (found 1)
2024-12-10 20:59:03.744 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:03.745 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:03.812 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:59:03.812 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:03.813 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:03.928 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0xe00000000
[ OK ] SiliconDriverBH.SysmemTestWithPcie (383 ms)
[ RUN ] SiliconDriverBH.RandomSysmemTestWithPcie
2024-12-10 20:59:04.121 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.121 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.177 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:59:04.177 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:04.177 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:04.380 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.380 | WARNING | SiliconDriver - Unknown board type for chip 0. This might happen because chip is running old firmware. Defaulting to UNKNOWN
2024-12-10 20:59:04.435 | INFO | SiliconDriver - Detected PCI devices: [0]
2024-12-10 20:59:04.435 | INFO | SiliconDriver - Using local chip ids: {0} and remote chip ids {}
2024-12-10 20:59:04.436 | INFO | SiliconDriver - Opened PCI device 0; KMD version: 1.30.0, IOMMU: disabled
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 0 from 0x0 to 0x3fffffff to 0xe00000000
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 1 from 0x40000000 to 0x7fffffff to 0xe40000000
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 2 from 0x80000000 to 0xbfffffff to 0xe80000000
2024-12-10 20:59:04.513 | INFO | SiliconDriver - Device: 0 Mapping iATU region 3 from 0xc0000000 to 0xffffffff to 0xec0000000
[ OK ] SiliconDriverBH.RandomSysmemTestWithPcie (11055 ms)
[----------] 2 tests from SiliconDriverBH (11438 ms total)
[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (11438 ms total)
[ PASSED ] 2 tests.
```

### API Changes
There are no API changes in this PR.
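
For readers checking the iATU lines in the test logs, the mappings follow a regular pattern: each region covers 0x40000000 bytes of device-side address space, and its host-side target is the base address plus the same offset. The sketch below reproduces only that arithmetic. It is not the code in blackhole_tt_device.cpp; `IatuRegion`, `kRegionSize`, and `plan_iatu_regions` are hypothetical names introduced for illustration. It also assumes one contiguous host-side range, which holds for the single IOVA reported in the IOMMU-on log; in the passthrough run the targets happen to be contiguous as well, but separately allocated hugepages may not always be.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one iATU region (illustration only, not a driver type).
struct IatuRegion {
    uint32_t index;         // iATU region number
    uint64_t device_start;  // first device-side address covered
    uint64_t device_end;    // last device-side address covered (inclusive)
    uint64_t host_target;   // host-side address (IOVA or physical) the region maps to
};

// Region size seen in the logs: 0x40000000 bytes per region.
constexpr uint64_t kRegionSize = 1ULL << 30;

// Computes the region layout for a contiguous host range starting at host_base.
// Assumption: sysmem_size is a whole multiple of kRegionSize.
std::vector<IatuRegion> plan_iatu_regions(uint64_t host_base, uint64_t sysmem_size) {
    assert(sysmem_size % kRegionSize == 0);
    std::vector<IatuRegion> regions;
    const uint64_t count = sysmem_size / kRegionSize;
    for (uint64_t i = 0; i < count; ++i) {
        const uint64_t start = i * kRegionSize;
        regions.push_back({static_cast<uint32_t>(i), start,
                           start + kRegionSize - 1, host_base + start});
    }
    return regions;
}
```

Calling `plan_iatu_regions(0x3fffffe00000000, 0x100000000)` yields four regions whose bounds and targets match the four "Mapping iATU region" lines in the IOMMU-on run of RandomSysmemTestWithPcie.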