
Broadcom’s Tomahawk silicon has been a familiar component of data center networking for a decade, sitting inside switching gear from multiple vendors.
With each new generation of Tomahawk, the goal has been to provide more bandwidth and scale. The new Tomahawk 6, which is now shipping, marks the first major update for Broadcom’s data center networking silicon since the Tomahawk 5 debuted in 2022. The stated goal with the new silicon is much the same as it was three years ago: to accelerate AI workloads. What has changed is the urgency and sheer scale of those workloads. It’s that scale that Broadcom (Nasdaq:AVGO) is now chasing.
The Tomahawk 6 doubles the bandwidth of its predecessor from 51.2 Terabits per second (Tbps) to a staggering 102.4 Tbps. The announcement comes as hyperscale operators are planning deployments exceeding 100,000 XPUs (processing units including GPUs, TPUs, and other AI accelerators) and preparing for clusters that could scale to one million XPUs.
“In terms of raw bandwidth, this is the biggest leap yet. Tomahawk 5 delivered 51.2 Tbps, Tomahawk 6 doubles that to 102.4 Tbps,” Pete Del Vecchio, product manager for the Tomahawk switch family at Broadcom, told Network World.
Architectural complexity beyond linear scaling
While the bandwidth doubling from Tomahawk 5’s 51.2 Tbps represents a significant performance increase, the engineering complexity extends far beyond simple linear scaling.
Del Vecchio noted that many on-chip structures, like the memory management unit (MMU), grow in complexity by approximately 4x when bandwidth doubles. “So this is also the biggest increase in chip complexity for the Tomahawk family,” he said.
The MMU complexity increase reflects the challenges of managing packet buffering, queue scheduling and congestion control at these extreme bandwidths. Traditional approaches to packet switching become increasingly difficult as the number of ports, queues and simultaneous flows multiplies.
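A rough back-of-the-envelope model shows why roughly 4x complexity follows from a 2x bandwidth increase. This is an illustrative sketch, not Broadcom’s actual MMU design: if per-lane speed is held fixed, doubling bandwidth doubles the port count, and any on-chip structure tracked per (ingress port, egress queue) pair grows with the square of that count. The port and queue figures below are assumptions for illustration.

```python
# Illustrative sketch (not Broadcom's actual design): why doubling switch
# bandwidth at a fixed per-lane speed can roughly quadruple MMU state.

def mmu_state_entries(ports: int, queues_per_port: int = 8) -> int:
    """Count the (ingress port, egress queue) pairs the MMU must track."""
    egress_queues = ports * queues_per_port
    return ports * egress_queues  # ingress ports x egress queues

th5_like = mmu_state_entries(ports=512)   # 51.2 Tbps at 100G per lane
th6_like = mmu_state_entries(ports=1024)  # 102.4 Tbps at 100G per lane
print(th6_like / th5_like)  # -> 4.0
```

Doubling the ports doubles both factors of the product, which is where the quadratic growth Del Vecchio alludes to comes from.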
The Tomahawk 6 addresses these challenges through several key architectural innovations. The chip supports configurations with up to 1,024 100G SerDes lanes or higher-speed 200G SerDes options, providing flexibility for different deployment scenarios. For AI clusters requiring extended reach, the 100G SerDes configuration enables longer passive copper interconnects, reducing both power consumption and total cost of ownership compared to optical solutions. (Read more: Copper-to-optics technology eyed for next-gen AI networking gear)
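The two SerDes options described above are simply different factorings of the same aggregate: more slower lanes or fewer faster ones. A quick arithmetic check:

```python
# Both SerDes configurations multiply out to the same 102.4 Tbps aggregate.

def aggregate_tbps(lanes: int, gbps_per_lane: int) -> float:
    """Total switch bandwidth in Tbps for a given lane configuration."""
    return lanes * gbps_per_lane / 1000  # Gbps -> Tbps

print(aggregate_tbps(1024, 100))  # -> 102.4 (longer copper reach)
print(aggregate_tbps(512, 200))   # -> 102.4 (fewer, faster lanes)
```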
Unified scale-up and scale-out architecture
One of Tomahawk 6’s most significant technical achievements is its ability to handle both scale-up and scale-out networking requirements within a unified Ethernet framework.
Scale-up networking refers to high-bandwidth, low-latency connections within individual AI training pods, typically supporting up to 512 XPUs in the Tomahawk 6’s case. Scale-out networking connects these pods together into larger clusters, with Tomahawk 6 supporting deployments exceeding 100,000 XPUs.
This unified approach eliminates the need for separate networking technologies and protocols between scale-up and scale-out tiers, simplifying network operations.
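The pod figures above imply a straightforward sizing calculation for the scale-out tier. The sketch below is a back-of-the-envelope estimate only; real topologies also depend on switch radix, oversubscription ratios and rail design, none of which are specified in the source.

```python
import math

# Back-of-the-envelope sizing: with up to 512 XPUs per scale-up pod,
# how many pods must the scale-out fabric stitch together for a given
# cluster size? (Illustrative only; ignores topology constraints.)

def pods_needed(total_xpus: int, xpus_per_pod: int = 512) -> int:
    return math.ceil(total_xpus / xpus_per_pod)

print(pods_needed(100_000))    # -> 196 pods for today's hyperscale plans
print(pods_needed(1_000_000))  # -> 1954 pods for million-XPU clusters
```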
AI-optimized routing and congestion control
The Tomahawk 6 incorporates Cognitive Routing 2.0, an enhanced version of Broadcom’s adaptive routing technology specifically designed for AI workloads. This system provides advanced telemetry, dynamic congestion control, rapid failure detection and packet trimming capabilities that enable global load balancing across the network fabric.
These features address specific challenges in AI training workloads, where collective communication patterns like all-reduce operations can create temporary but severe congestion hotspots. The system’s ability to dynamically reroute traffic and provide fine-grained flow control helps maintain consistent performance across large-scale distributed training jobs.
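The general idea behind congestion-aware adaptive routing can be sketched in a few lines. To be clear, this is not Broadcom’s Cognitive Routing implementation, just a toy contrast between static ECMP hashing and telemetry-driven path selection; the path names and queue depths are invented for illustration.

```python
# Toy contrast: static ECMP pins a flow to one path regardless of load,
# while adaptive routing consults per-path telemetry (here, queue depth)
# and steers traffic toward the least-congested candidate.

def static_ecmp_path(flow_id: int, paths: list[str]) -> str:
    """Classic ECMP: the same flow always hashes to the same path."""
    return paths[hash(flow_id) % len(paths)]

def adaptive_path(queue_depths: dict[str, int]) -> str:
    """Adaptive routing: pick the path with the shallowest queue."""
    return min(queue_depths, key=queue_depths.get)

paths = ["spine-a", "spine-b", "spine-c"]
depths = {"spine-a": 40, "spine-b": 3, "spine-c": 17}
print(adaptive_path(depths))  # -> spine-b
```

Under an all-reduce hotspot, the static scheme keeps hammering the congested path, while the adaptive scheme drains traffic around it, which is the behavior the telemetry and load-balancing features above are built to provide at hardware speed.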
The chip also supports modern AI transport protocols and congestion signaling mechanisms defined by the Ultra Ethernet Consortium, ensuring compatibility with emerging industry standards for AI networking.
Competitive positioning and development timeline
The development effort behind Tomahawk 6 represents a significant engineering investment.
“The development took over three years, with significant engineering investment across architecture, silicon design, software and system validation,” Del Vecchio said.
From a competitive standpoint, Broadcom is also claiming a substantial lead in switching bandwidth. “There is no other Ethernet switch chip close to TH6’s bandwidth,” Del Vecchio said. “The nearest competitors from NVIDIA, Marvell and Cisco top out at half the bandwidth.”
The company also emphasizes its time-to-market advantage: “Note that with Tomahawk 5, Broadcom was shipping in volume before any competitor even sampled,” Del Vecchio said. “We expect a similar time-to-market advantage with Tomahawk 6.”
Source: Network World