Just Released: CUTLASS 3.8

CUTLASS 3.8 extends support to the NVIDIA Blackwell SM100 architecture with 99% of peak performance for Tensor Core operations, bringing essential features