Release v1.2.0
This release includes bug fixes and a significantly improved O2
optimization pipeline (get the ILGPU NuGet package and the ILGPU Algorithms NuGet package).
Changes
- Reviewed ILGPU documentation (#750, #776).
- Added Cuda ISA 7.5, ISA 7.6 and SM 8.7 (#778).
- Added support to fold Shuffle and Broadcast operations (#764).
- Added new sample demonstrating the use of ILGPU in Blazor Apps (#779).
- Improved performance by using uniform branches for NVIDIA GPUs (#765).
- Improved LoopUnrolling to cover more cases (#766).
- Improved inline PTX to support multiple output and by-ref parameters (#760).
- Fixed multi-dimensional random number generation (#808).
- Fixed issues with LibDevice integration (#784).
- Fixed issue with unsigned nested conversions (#772, #774).
- Fixed sample project target frameworks (#771).
Internal changes
- Bump FluentAssertions from 6.5.1 to 6.7.0 in /Src (#785, #807).
- Bump Microsoft.NET.Test.Sdk from 17.1.0 to 17.2.0 in /Src (#805).
- Bump xunit.runner.visualstudio from 2.4.3 to 2.4.5 in /Src (#804).
- Bump System.Memory from 4.5.4 to 4.5.5 in /Src (#785).
- Reset baseline for 1.2.0 (#777).
- Fixed several CI issues (#796, #809, #812).
Special thanks
Special thanks to @hokb, @jgiannuzzi, @kilngod, @MoFtZ, @pavlovic-ivan and @Ruberik for their contributions to this release in the form of code, feedback, ideas and proposals. Furthermore, we would like to thank the entire ILGPU community (especially @Joey9801, @MPSQUARK, @NullandKale and @Yey007) for providing feedback and submitting issues and feature requests.