Release v0.10.0
The new stable version offers significant performance improvements for generated kernel programs and contains critical resource deallocation fixes (get the NuGet package).
It is strongly recommended to upgrade to this version as soon as possible to avoid resource- and GC-related deallocation issues.
Breaking changes
- The inheritance hierarchy of the ExchangeBuffer class has been changed to avoid exposing internal memory buffers. If you previously relied on the immediate inheritance from ExchangeBufferBase on MemoryBuffer, you have to adapt your program to use the intermediate base class MemoryBuffer<T, TIndex> instead (see diff).
- Properties exposing internal memory buffers of the high-level MemoryBufferXD classes have been removed to avoid ownership related GC-free issues (see diff).
Why are there breaking changes?
We have decided to remove dangerous properties from several memory buffer classes. The use of these properties can lead to program crashes, since buffers could be disposed asynchronously in the background by the GC without further notice.
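The ownership hazard described above can be illustrated outside of ILGPU. The following Python sketch is an analogy only: `NativeBuffer` and `HighLevelBuffer` are hypothetical stand-ins, not ILGPU types, and `weakref.finalize` plays the role of a .NET finalizer. It shows how a caller that keeps only the exposed internal buffer can end up holding a released resource once the owning wrapper is collected.

```python
import gc
import weakref


class NativeBuffer:
    """Hypothetical stand-in for an internal buffer backed by a native resource."""

    def __init__(self, length):
        self.length = length
        self.disposed = False

    def dispose(self):
        # Release the (simulated) native resource.
        self.disposed = True

    def read(self, index):
        if self.disposed:
            raise RuntimeError("buffer was already disposed")
        return 0  # dummy payload


class HighLevelBuffer:
    """Hypothetical wrapper that owns a NativeBuffer and frees it when collected."""

    def __init__(self, length):
        # This attribute mimics the kind of property that exposes an internal buffer.
        self.internal = NativeBuffer(length)
        # Runs when the wrapper is garbage collected, similar to a finalizer.
        weakref.finalize(self, self.internal.dispose)


def leak_internal_buffer():
    wrapper = HighLevelBuffer(16)
    # Only the internal buffer escapes; the wrapper becomes unreachable on return.
    return wrapper.internal


leaked = leak_internal_buffer()
gc.collect()  # make collection explicit for runtimes without reference counting
```

After the call, `leaked.disposed` is true and `leaked.read(0)` raises, even though the caller never disposed anything explicitly. Not exposing the internal buffer at all rules this pattern out.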
Changes
- Improved performance of kernel launchers by passing packed argument structures (#358, #372).
- Graduated different optimizations from O2 to O1 (release mode) to improve performance in release builds using an additional set of stable optimization passes (#344).
- Graduated O2 optimizations in the Cuda backend to the O1 pipeline to generate vectorized IO operations in release builds (#350).
- Added support for the managed sizeof IL instruction (#380).
- Added PrintInformation method to Accelerator instances to print detailed accelerator information (#389).
- Added enhanced assertions and out-of-bounds checks to all ArrayView accesses on GPU devices (#375). Use the flag ContextFlags.EnableAssertions or attach a debugger to your application to enable assertion checks. Make sure to use the portable debug information format for detailed source location information.
- Added support for printf-like output in kernels for CPU, Cuda and OpenCL accelerators (#342).
- Added new utility Launch/LaunchAutoGrouped methods to immediately launch kernels using a separate strong-reference cache (#336).
- Added new AlignTo alignment methods to explicitly align ArrayView instances to a particular alignment in bytes (#316).
- Added enhanced support for local memory via a new LocalMemory class (#316).
- Added support for several PopCount, CLZ and CTZ operations (#324).
- Added new MemSet functions to all memory buffers (#338).
- Added new IfConditionalConversion to the O2 pipeline to fold nested and-also and or-else block chains (#328).
- Added new local memory optimizations to simplify array accesses (#317).
- Added simple 64-bit-based LongGlobalIndex helper to simplify correct computations using 64-bit integers (#337).
- Added new CLPlatformVersion and fixed OpenCL 1.2 compatibility issues (#335).
- Removed support for .NET Core 2.0 (#353).
- Prevent using SharedMemory in implicitly grouped kernels (#354).
- Prevent using CudaAccelerator and CLAccelerator instances to run on non-native OS .NET versions (#396).
- Fixed critical GC-related resource deallocation issues (#376, #393).
- Fixed incorrect length returned for dynamic shared memory buffers (#357).
- Fixed invalid alignment information in the presence of reinterpret casts (#386).
- Fixed invalid address computations of fixed array buffers (#361).
- Fixed invalid PTX calling convention (#362).
- Fixed edge cases in LoopUnrolling (#373).
- Fixed invalid printf formats for int64 and uintX types (#391).
- Fixed invalid DebugArrayView implementations (#345).
- Fixed invalid initializations of local memory arrays (#287).
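To illustrate why 64-bit index arithmetic matters for large launches, here is a minimal, language-neutral sketch in Python. It does not use the ILGPU API: `wrap_int32`, `global_index_32` and `global_index_64` are hypothetical helpers that emulate a linear global index computed as grid index times group size plus the index within the group, once with 32-bit wraparound and once without.

```python
def wrap_int32(value):
    """Emulate two's-complement wraparound of a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def global_index_32(grid_index, group_size, group_index):
    """Linear index computed entirely in (emulated) 32-bit arithmetic."""
    return wrap_int32(wrap_int32(grid_index * group_size) + group_index)


def global_index_64(grid_index, group_size, group_index):
    """Same computation with enough bits; Python integers do not overflow."""
    return grid_index * group_size + group_index


# 3,000,000 groups of 1,024 threads address more than 2**31 - 1 elements.
narrow = global_index_32(3_000_000, 1024, 5)
wide = global_index_64(3_000_000, 1024, 5)
```

With these inputs `wide` is 3072000005, while `narrow` wraps around to a negative value and no longer identifies the intended element. For small launches both variants agree.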
Major internal changes:
- Removed singleton instance of RuntimeSystem to avoid concurrency/reflection-API issues (#393).
- Updated default optimizations for ILGPU debug builds (#384).
- Added support for unit tests running on .NET Framework 4.7 (#355).
- Migrated from FxCop analyzers to .NET analyzers (#352).
- Redesigned internal address-space inference passes (#364).
Special thanks
Special thanks to @MoFtZ, @Ruberik and @jgiannuzzi for their contributions to this release and to the entire ILGPU community for providing feedback, submitting issues and feature requests.