Releases: intel/llvm

DPC++ daily 2021-11-25

25 Nov 18:54
d8761fd
Pre-release
[BuildBot] Add option to set compile target (#5024)

The full SYCL toolchain does not always need to be built, so this change adds
an option to set the build target.

DPC++ daily 2021-11-24

24 Nov 18:55
Pre-release
LLVM and SPIRV-LLVM-Translator pulldown (WW46-47)

LLVM: llvm/llvm-project@0f652d8f527f
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@05e183d

DPC++ daily 2021-11-23

23 Nov 18:52
826c569
Pre-release
[SYCL] Fix vec class alignment on windows platform (#4953)

Currently the sycl::vec type can be copied in a way that doesn't
preserve the default alignment on Windows. This can cause crashes
since the sycl::vec code expects the vector to be aligned and uses
vector instructions. We used default alignment because we cannot
set correct alignment in all cases. The patch adds alignment of
vector types; if the required alignment is larger than 64, it is limited to 64.
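
The capping rule can be illustrated with a minimal sketch. It assumes the natural alignment of a vector is its byte size rounded up to a power of two, which the patch description does not state. The names capped_vec_alignment and aligned_vec are hypothetical and are not the actual sycl::vec implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical upper bound, taken from the "limited to 64" wording above.
constexpr std::size_t kMaxVecAlign = 64;

// Round n up to the next power of two (assumed natural-alignment rule).
constexpr std::size_t next_pow2(std::size_t n) {
  std::size_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Alignment for a vector of N elements of type T, capped at kMaxVecAlign.
template <typename T, std::size_t N>
constexpr std::size_t capped_vec_alignment() {
  return next_pow2(sizeof(T) * N) < kMaxVecAlign ? next_pow2(sizeof(T) * N)
                                                 : kMaxVecAlign;
}

// Illustrative storage type that carries the capped alignment.
template <typename T, std::size_t N>
struct alignas(capped_vec_alignment<T, N>()) aligned_vec {
  T data[N];
};

// A 128-byte vector would naturally want 128-byte alignment; it gets 64.
static_assert(alignof(aligned_vec<double, 16>) == 64, "alignment is capped");
static_assert(alignof(aligned_vec<char, 16>) == 16, "small vectors keep theirs");
```

Because the alignment is part of the type, copies of aligned_vec keep it regardless of how the copy is made.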

DPC++ daily 2021-11-22

22 Nov 18:48
5f562be
Pre-release
[SYCL][ESIMD] Add ESIMD-specific IR verification pass (#4965)

Signed-off-by: Sergey Dmitriev <serguei.n.dmitriev@intel.com>

DPC++ daily 2021-11-20

20 Nov 18:49
2af6ccd
Pre-release
[SYCL] Fix memory leak in online compiler (#4963)

The experimental online compiler may leak memory in compileToSPIRV.
These changes address this leak by storing the SPIR-V binary information
directly in the vector that will later be returned.

Signed-off-by: Steffen Larsen <steffen.larsen@intel.com>
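
The general pattern described in this fix, writing the result straight into the vector that is returned instead of into a separately allocated buffer, might look like the sketch below. The backend function fake_backend_emit is a hypothetical stand-in and does not reflect the real online compiler interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical C-style backend: returns the number of bytes it produces and,
// when given a large enough buffer, copies them into it.
std::size_t fake_backend_emit(std::uint8_t *out, std::size_t capacity) {
  // SPIR-V magic number 0x07230203 stored as little-endian bytes.
  static const std::uint8_t bytes[] = {0x03, 0x02, 0x23, 0x07};
  if (out != nullptr && capacity >= sizeof(bytes))
    std::memcpy(out, bytes, sizeof(bytes));
  return sizeof(bytes);
}

// The vector owns the storage from the start, so there is no intermediate
// new[] buffer that could be leaked on any return path.
std::vector<std::uint8_t> compile_to_binary() {
  std::vector<std::uint8_t> binary(fake_backend_emit(nullptr, 0));
  fake_backend_emit(binary.data(), binary.size());
  return binary;
}
```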

oneAPI DPC++ Compiler 2021-09

19 Nov 05:34
bd68232

New features

SYCL Compiler

SYCL Library

Documentation

Improvements

SYCL Compiler

  • Added default device triple spir64 when the compiler encounters any
    incoming object/libraries that have been built with the spir64 target.
    -fno-sycl-link-spirv can be used for disabling this behaviour [1342360]
  • Added support for non-uniform IMul and FMul operation for ptx-nvidiacl
    [98a339d]
  • Added splitting modules capability when compiling for NVPTX and AMDGCN
    [c1324e6]
  • Added -fsycl-footer-path=<path> command-line option to set path where to
    store integration footer [155acd1]
  • Improved read only accessor handling - added readonly attribute to the
    associated pointer to memory [3661685]
  • Improved the output project name generation. If the output value has one of
    .a .o .out .lib .obj .exe extension, the output project directory name will
    omit extension. For any other extension the output project directory will
    keep this extension [d8237a6]
  • Improved handling of default device with AOCX archive [e3a579f]
  • Added support for NVPTX device printf [4af2eb5]
  • Added support for non-const private statics in ESIMD kernels [bc51fe0]
  • Improved diagnostic generation when incorrect accessor format is used
    [a292214]
  • Allowed passing -Xsycl-target-backend and -Xsycl-target-link when
    default target is used [d37b832]
  • Disabled kernel function propagation up the call tree to callee when in
    SYCL 2020 mode [2667e3e]
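
The output project name rule from the list above can be sketched as a small string function. This is an illustration of the stated rule only; the helper name project_dir_name is hypothetical and the compiler driver's actual code may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Returns the project directory name derived from the output value:
// the extension is dropped only when it is one of the six listed ones.
std::string project_dir_name(const std::string &output) {
  static const std::array<const char *, 6> stripped = {
      ".a", ".o", ".out", ".lib", ".obj", ".exe"};
  const std::size_t dot = output.rfind('.');
  const std::size_t slash = output.find_last_of("/\\");
  // No extension, or the last dot belongs to a directory component.
  if (dot == std::string::npos ||
      (slash != std::string::npos && dot < slash))
    return output;
  const std::string ext = output.substr(dot);
  for (const char *s : stripped)
    if (ext == s)
      return output.substr(0, dot);
  return output; // any other extension is kept
}
```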

SYCL Library

  • Improved information passed to XPTI subscribers [2af0599] [66770f0]
  • Added event interoperability to Level Zero plugin [ef33c57]
  • Enabled blitter engine for in-order queues in Level Zero plugin [904967e]
  • Removed deprecation warning for SYCL 1.2.1 barriers [18c80fa]
  • Moved free function queries to experimental namespace
    sycl::ext::oneapi::experimental [63ba1ce]
  • Added info query for device::info::atomic_memory_order_capabilities and
    context::info::atomic_memory_order_capabilities [9b04f41]
  • Improved performance of generic shuffles [fb08adf]
  • Renamed ONEAPI/INTEL namespace to ext::oneapi/intel [d703f57] [ea4b8a9]
    [e9d308e]
  • Added Level Zero interoperability which allows specifying ownership of the
    queue [4614ee4] [6cf48fa]
  • Added support for reqd_work_group_size attribute in CUDA plugin [a8fe4a5]
  • Introduced SYCL_CACHE_DIR environment variable which allows specifying a
    directory for persistent cache [4011775]
  • Added version of parallel_for accepting range and a reduction variable
    [d1556e4]
  • Added verbosity to some error handling [84ee39a]
  • Added SYCL 2020 sycl::errc_for API [02756e3]
  • Added SYCL 2020 byte_size method for sycl::buffer and sycl::vec
    classes. get_size was deprecated [282d1de]
  • Added support for USM pointers for sycl::joint_exclusive_scan and
    sycl::joint_inclusive_scan [2de0f92]
  • Added copy and move constructors for
    sycl::ext::intel::experimental::esimd::simd_view [daae147]
  • Optimized memory allocation when sub-devices are used in Level Zero plugin
    [6504ba0]
  • Added constexpr constructors for vec and marray classes
    [e7cd86b][449721b]
  • Optimized kernel cache [c16705a]
  • Added caching of device properties in Level Zero plugin [a50f45b]
  • Optimized CUDA plugin work with small kernels [07189af]
  • Optimized submission of kernels [441dc3b][33432df]
  • Aligned implementation of SYCL_EXT_ONEAPI_LOCAL_MEMORY extension
    with the updated document [b3db5e5]
  • Improved sycl::accessor initialization performance on device [a10199d]
  • Added support for sycl::get_kernel_ids and a cache for sycl::kernel_id
    objects [491ec6d]
  • Deprecated ::half since it should not be available in global
    namespace, sycl::half can be used instead [6ff9cf7]
  • Deprecated sycl::interop_handler, sycl::handler::interop_task,
    sycl::handler::run_on_host_intel, sycl::kernel::get_work_group_info and
    sycl::spec_constant APIs [5120763]
  • Marked sycl::marray device copyable [6e02880]
  • Made Level Zero interoperability API SYCL 2020 compliant for
    sycl::platform, sycl::device and sycl::context [c696415]
  • Deprecated unstable keys of SYCL_DEVICE_ALLOWLIST [b27c57c]
  • Added predefined vendor macro SYCL_IMPLEMENTATION_ONEAPI and
    SYCL_IMPLEMENTATION_INTEL [6d34ebf]
  • Deprecated sycl::ext::intel::online_compiler,
    sycl::ext::intel::experimental::online_compiler can be used instead
    [7fb56cf]
  • Deprecated global_device_space and global_host_space values of
    sycl::access::address_space enumeration, ext_intel_global_device_space
    and ext_intel_host_device_space can be used instead [7fb56cf]
  • Deprecated sycl::handler::barrier and sycl::queue::submit_barrier,
    sycl::handler::ext_oneapi_barrier and
    sycl::queue::ext_oneapi_submit_barrier can be used instead [7fb56cf]
  • Removed sycl::handler::codeplay_host_task API [9a0ea9a]

Tools

  • Added support for ROCm devices in get_device_count_by_type [03155e7]

Documentation

Bug fixes

SYCL Compiler

  • Fixed emission of integration header with type aliases [e3cfa19]
  • Fixed compilation for AMD GPU with -fsycl-dead-args-optimization [5ed48b4]
  • Removed faulty implementations for atomic loads and stores for acquire,
    release and seq_cst memory orders in libclc for NVPTX [4876443]
  • Fixed having two specializations for the specialization_id, one of which
    was invalid [f71a1d5]
  • Fixed context destruction in HIP plugin [6042d3a]
  • Changed queue::mem_advise and handler::mem_advise to take int instead
    of pi_mem_advice [af2bf96]
  • Prevented passing of -fopenmp-simd to device compilation when used along
    with -fsycl [226ed8b]
  • Fixed generation of the integration header when non-base-ascii chars are
    used in the kernel name [91f5047]
  • Fixed a problem which could lead to picking up incorrect kernel at runtime in
    some situations when unnamed lambda feature is used [27c632e]
  • Fixed suggested target triple in the warning message [7cc89fa]
  • Fixed identity for multiplication on CUDA backend [a6447ca]
  • Fixed a problem with dependency file generation [fd6d948] [1d5b2cb]
  • Fixed builtins address space type for CUDA backend [1e3136e]
  • Fixed a problem which could lead to incorrect user header to be picked up
    [c23fe4b]

SYCL Library

  • Added assignment operator to specializations of
    sycl::ext::oneapi::atomic_ref [c6bc5a6]
  • Fixed the way managed memory is freed in CUDA plugin [e825916]
  • Changed names of some SYCL internal enumerations to avoid possible
    conflicts with user macro [1419415]
  • Fixed status which was returned for host events by
    event::get_info<info::event::command_execution_status>() call [09715f6]
  • Fixed memory ordering used for barriers [73455a1]
  • Fixed several places in CUDA and HIP plugins where bool was used instead
    of uint32_t [764b6ff]
  • Fixed event pool memory leak in Level Zero plugin [0e95e5a]
  • Removed redundant memcpy call for copying struct using fpga_reg
    [a5d290d]
  • Fixed an issue where the native memory object passed to interoperability
    memory object constructor was ignored on devices without host unified memory
    [da19678]
  • Fixed a bug in simd::replicate_w API [d36480d]
  • Fixed group operations for (u)int8/16 types [6a055ec]
  • Fixed a problem with non-native specialization constants being undefined if
    they are not explicitly updated to non-default values [3d96e1d]
  • Fixed a crash which could happen when a default constructed event is passed
    ...

DPC++ daily 2021-11-19

19 Nov 18:55
f074774
Pre-release
sycl-nightly/20211119

[libclc] Delete the wrong file name in the SOURCE file, and add a new…

DPC++ daily 2021-11-18

18 Nov 18:57
c855fd1
Pre-release
[SYCL] Fix sub-group mask for smaller SG sizes (#4916)

Fix accessing the sub-group mask when the sub-group size is less than 32. Make sure that false is returned for positions at or beyond the sub-group size.

Update the test to check this case.
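
The intended behaviour can be modelled with a toy mask type. This is a minimal sketch assuming a 32-bit backing word; toy_sub_group_mask is a hypothetical name and is not the sub-group mask class touched by the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model: a 32-bit word plus the number of valid positions.
struct toy_sub_group_mask {
  std::uint32_t bits;
  std::size_t sg_size; // number of work-items in the sub-group, at most 32

  constexpr toy_sub_group_mask(std::uint32_t b, std::size_t size)
      : bits(size >= 32 ? b : (b & ((std::uint32_t{1} << size) - 1))),
        sg_size(size > 32 ? 32 : size) {}

  // Positions at or beyond the sub-group size always read as false.
  // The bounds check short-circuits, so the shift amount stays below 32.
  constexpr bool test(std::size_t pos) const {
    return pos < sg_size && ((bits >> pos) & 1u) != 0;
  }
};
```

With a sub-group size of 8 and all input bits set, positions 0 through 7 report true and every higher position reports false.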

DPC++ daily 2021-11-17

17 Nov 18:53
6e5dd48
Pre-release
[SYCL] Generate and install stripped PDBs for SYCL libraries (#4915)

Adds stripped PDB files for SYCL library and the PI plugins when
building with MSVC. Full PDB files will also be generated, but only the
stripped variants will be installed.

The stripped PDB files will only be generated and installed if the used
linker supports the /PDBSTRIPPED option. LLD does not currently support
this option. If the stripped PDB is not generated, no PDB files are
installed for the SYCL libraries and PI plugins.

Signed-off-by: Steffen Larsen <steffen.larsen@intel.com>

DPC++ daily 2021-11-16

16 Nov 18:48
3205368
Pre-release
[SYCL] group algorithm routines with broadened supported types (#4910)

Signed-off-by: Chris Perkins <chris.perkins@intel.com>