Releases: uxlfoundation/oneCCL

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.15.4

@Maria1Petrova Maria1Petrova released this 31 Jul 17:44
57a9306

This ccl_2021.15.4-arc branch introduces several enhancements for Intel ARC A and B Series GPUs:

  • Support for Reduce-Scatter and Point-To-Point in addition to previously enabled Allreduce and Allgather
  • Support for 8 bit datatypes (int8, uint8)
  • Bug fixes. Setting IGC_VISAOptions=-activeThreadsOnlyBarrier, which was previously required, is no longer needed.

The cmake commands are the same as before.

An example of the cmake command for Intel ARC A Series GPU:
cmake .. -DCMAKE_INSTALL_PREFIX=_install -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=dpcpp -DCCL_ENABLE_ARCA=1

An example of the cmake command for Intel ARC B Series GPU:
cmake .. -DCMAKE_INSTALL_PREFIX=_install -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=dpcpp -DCCL_ENABLE_ARCB=1

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.16

@nikitaxgusev nikitaxgusev released this 02 Jul 13:05
303b41a

What's New 2021.16:

  • Added SYCL graph support for Record and Replay for Allgather, Allreduce, Alltoall, ReduceScatter and Broadcast
  • Added SYCL-based implementation of ring algorithm for Allgather
  • Added SYCL-based implementation for Broadcast
  • Added multithreading support for the Allgather and ReduceScatter scaleup implementations
  • Added a communicator attribute to specify blocking operations on CPU
  • Bug fixes

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.15.3

@dycz0fx dycz0fx released this 12 Jun 20:40
def8705

This ccl_2021.15.3-arc branch adds support for Intel ARC A and B Series GPUs and includes some bug fixes.

An example of the cmake command for Intel ARC A Series GPU:
cmake .. -DCMAKE_INSTALL_PREFIX=_install -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=dpcpp -DCCL_ENABLE_ARCA=1

An example of the cmake command for Intel ARC B Series GPU:
cmake .. -DCMAKE_INSTALL_PREFIX=_install -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DCOMPUTE_BACKEND=dpcpp -DCCL_ENABLE_ARCB=1

If the system does not have GPU Peer-to-Peer (P2P) support, set the IGC_VISAOptions environment variable before compiling (export IGC_VISAOptions=-activeThreadsOnlyBarrier). On such a system, run the same export in the shell again before launching the application.
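
For scripted builds, the configuration described above can be assembled programmatically. The following is a minimal sketch and not part of oneCCL: the helper names cmake_args and build_env and the has_p2p parameter are hypothetical, while the cmake options and the IGC_VISAOptions value are copied from the commands above.

```python
import os

# cmake options shared by both ARC series, copied from the release notes.
COMMON_ARGS = [
    "-DCMAKE_INSTALL_PREFIX=_install",
    "-DCMAKE_C_COMPILER=icx",
    "-DCMAKE_CXX_COMPILER=icpx",
    "-DCOMPUTE_BACKEND=dpcpp",
]

# Series-specific switch (hypothetical lookup table for this sketch).
SERIES_FLAG = {"A": "-DCCL_ENABLE_ARCA=1", "B": "-DCCL_ENABLE_ARCB=1"}


def cmake_args(series):
    """Return the cmake argument list for ARC series "A" or "B"."""
    return ["cmake", "..", *COMMON_ARGS, SERIES_FLAG[series]]


def build_env(has_p2p, base_env=None):
    """Copy the environment, adding IGC_VISAOptions when GPU P2P is missing."""
    env = dict(os.environ if base_env is None else base_env)
    if not has_p2p:
        env["IGC_VISAOptions"] = "-activeThreadsOnlyBarrier"
    return env
```

From a build directory, the result could be passed to subprocess.run(cmake_args("B"), env=build_env(has_p2p=False), check=True). The same environment would also apply when launching the application.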

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.15.2

@nikitaxgusev nikitaxgusev released this 23 May 14:44
59bf593

What's new:

  • Bug fix: improved the user experience when setting environment variables.

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.15.1

@nikitaxgusev nikitaxgusev released this 06 May 18:41
10e0e57

What's new:

  • Bug fixes

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.15

@nikitaxgusev nikitaxgusev released this 26 Mar 18:30
aea2b36

What's new:

  • Support for Average for Allreduce and Reduce-Scatter
  • Extended the Group API to also support collective operations.
  • New split_communicator API with updated parameters.
  • Performance optimizations for scaleup for Alltoall
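
To illustrate what an Average reduction computes, here is a plain-Python reference model that makes no oneCCL calls. It assumes that Average means the element-wise sum across ranks divided by the number of ranks, and that Reduce-Scatter gives each rank an equal, contiguous chunk of the result. The function names are hypothetical, and the sketch does not model how oneCCL handles integer types or rounding.

```python
def allreduce_avg(per_rank_buffers):
    """Every rank receives the element-wise mean of all ranks' buffers."""
    n_ranks = len(per_rank_buffers)
    mean = [sum(vals) / n_ranks for vals in zip(*per_rank_buffers)]
    return [list(mean) for _ in range(n_ranks)]


def reduce_scatter_avg(per_rank_buffers):
    """Rank r receives the r-th equal-sized chunk of the element-wise mean."""
    n_ranks = len(per_rank_buffers)
    mean = [sum(vals) / n_ranks for vals in zip(*per_rank_buffers)]
    if len(mean) % n_ranks != 0:
        raise ValueError("buffer length must be divisible by the number of ranks")
    chunk = len(mean) // n_ranks
    return [mean[r * chunk:(r + 1) * chunk] for r in range(n_ranks)]
```

Each inner list stands for one rank's buffer, so a two-rank input such as [[1.0, 2.0], [3.0, 6.0]] models two ranks that each contribute two elements.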

Removals:

  • split_communicators is deprecated as of 2021.15.0 and will be removed in a later release

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.14

@nikitaxgusev nikitaxgusev released this 06 Nov 10:44
3afa1bb

What's New:

  • Optimizations on key-value store support to scale up to 3000 nodes
  • New APIs for Allgather, Broadcast and group API calls
  • Performance optimizations for Allgather, Allreduce, and Reduce-scatter for both scaleup and scaleout
  • Performance Optimizations for CPU single node
  • Optimizations to reuse Level Zero events.
  • Change of the default mechanism for IPC exchange to pidfd

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.13.1

@nikitaxgusev nikitaxgusev released this 08 Aug 08:58
c80317f

What's new:

  • Bug fixes

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.13

@nikitaxgusev nikitaxgusev released this 24 Jun 10:42
0eb5987

What's New:

  • Optimizations to limit the memory consumed by oneCCL
  • Optimizations to limit the number of file descriptors kept open by oneCCL.
  • Aligned in-place support for the Allgatherv and Reduce-scatter collectives with the behavior of NCCL. In particular:
      • Allgatherv is in-place when send_buff == recv_buff + rank_offset, where rank_offset = sum(recv_counts[i]) for all i < rank.
      • Reduce-scatter is in-place when recv_buff == send_buff + rank * recv_count.
  • When the environment variable CCL_WORKER_AFFINITY is set, oneCCL requires the length of its list to equal the number of workers.
  • Bug fixes.
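
The in-place conditions above can be expressed as offset arithmetic. The sketch below is illustrative plain Python rather than oneCCL code: buffers are modeled as element offsets into one allocation instead of pointers, and all function names are hypothetical.

```python
def allgatherv_rank_offset(recv_counts, rank):
    """rank_offset = sum(recv_counts[i]) for all i < rank."""
    return sum(recv_counts[:rank])


def allgatherv_is_inplace(send_buff, recv_buff, recv_counts, rank):
    """True when send_buff == recv_buff + rank_offset (offsets in elements)."""
    return send_buff == recv_buff + allgatherv_rank_offset(recv_counts, rank)


def reduce_scatter_is_inplace(send_buff, recv_buff, recv_count, rank):
    """True when recv_buff == send_buff + rank * recv_count (offsets in elements)."""
    return recv_buff == send_buff + rank * recv_count
```

For example, with recv_counts = [3, 5, 2], rank 2 is in-place for Allgatherv only if its send buffer starts 8 elements into the receive buffer.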

Intel(R) oneAPI Collective Communications Library (oneCCL) 2021.12

@ksenyako ksenyako released this 27 Mar 12:52

What's New

  • Performance improvements for scaleup for all message sizes for AllReduce, Allgather, and Reduce_Scatter.
  • Optimizations also include small message sizes that appear in inference apps.
  • Performance improvements for scaleout for Allreduce, Reduce, Allgather, and Reduce_Scatter.
  • Optimized memory usage of oneCCL.
  • Support for PMIx 4.2.6.
  • Bug fixes.

Removals

  • oneCCL 2021.12 removes support for PMIx 4.2.2