Just Released: NVIDIA TensorRT-LLM 0.13.0

Updates include tensor parallel support for Mamba2, sparse mixer normalization for MoE models, and more.

Source: NVIDIA