- InfiniBand’s long dominance faces real pressure from Ethernet’s open-standard movement
- Meta and Nvidia are betting on openness to scale AI networks
- The ESUN project links industry rivals through shared networking ambitions
The Open Compute Project (OCP) has announced a new initiative known as Ethernet for Scale-Up Networking (ESUN), aimed at developing open standards for high-performance connections within artificial intelligence clusters.
This collaboration brings together companies such as Meta, Nvidia, AMD, Cisco, and OpenAI to explore how Ethernet can rival existing interconnects like InfiniBand in large-scale data centers.
Other companies joining the collaboration include Arista, ARM, Broadcom, HPE Networking, Marvell, Microsoft, and Oracle.
Open networking for AI clusters
InfiniBand has long dominated the market for high-speed AI networking, accounting for roughly 80% of the infrastructure connecting GPUs and accelerators.
However, the ESUN group believes that Ethernet’s maturity, cost-effectiveness, and interoperability make it a strong candidate for scaling up AI clusters.
Unlike proprietary interconnects, Ethernet is widely familiar to network engineers, which could help reduce the complexity of managing massive AI workloads.
Supporters argue that using Ethernet as an open standard will allow operators to scale infrastructure while lowering costs.
The new initiative builds on earlier work under OCP’s SUE-Transport (SUE-T) program, which explored Ethernet transport for multi-processor systems.
ESUN’s participants will meet regularly to define standards for switch behavior, including protocol headers, error handling, and lossless data transfer.
The group will also study how network design affects load balancing and memory ordering within GPU-based systems.
It plans to coordinate with the Ultra Ethernet Consortium and the IEEE 802.3 standards body to ensure alignment across the wider Ethernet ecosystem.
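ESUN has not yet published its frame formats, but whatever scale-up transport it defines will sit on top of the baseline Ethernet framing that network engineers already know. As a point of reference, here is a minimal Python sketch of a standard 14-byte Ethernet II header; the MAC addresses and the experimental EtherType are placeholders, not ESUN-defined values.

```python
import struct

def ethernet_header(dst_mac: bytes, src_mac: bytes, ethertype: int) -> bytes:
    """Pack a standard Ethernet II header: 6-byte destination MAC,
    6-byte source MAC, and a 2-byte big-endian EtherType."""
    assert len(dst_mac) == 6 and len(src_mac) == 6
    return struct.pack("!6s6sH", dst_mac, src_mac, ethertype)

# Placeholder locally administered MAC addresses and the IEEE "local
# experimental" EtherType 0x88B5 -- illustrative only, not ESUN values.
hdr = ethernet_header(bytes.fromhex("02aabbccdd01"),
                      bytes.fromhex("02aabbccdd02"),
                      0x88B5)
print(len(hdr), hdr.hex())  # 14 02aabbccdd0102aabbccdd0288b5
```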
Several firms have already developed Ethernet-based products targeting AI scale-up: Broadcom’s Tomahawk Ultra switch, for example, supports up to 77 billion packets per second, while Nvidia’s Spectrum-X platform combines Ethernet with acceleration hardware for AI clusters.
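To put that packet rate in context, a rough back-of-envelope calculation (assuming minimum-size 64-byte Ethernet frames plus the standard 20 bytes of preamble and inter-packet gap per frame) shows roughly how much wire bandwidth 77 billion packets per second implies:

```python
# Back-of-envelope: wire bandwidth implied by 77 billion packets per second,
# assuming minimum-size frames. Illustrative arithmetic, not a vendor spec.
packets_per_second = 77e9
min_frame_bytes = 64       # smallest legal Ethernet frame, including FCS
overhead_bytes = 20        # 8-byte preamble/SFD + 12-byte inter-packet gap
bits_per_packet = (min_frame_bytes + overhead_bytes) * 8

tbps = packets_per_second * bits_per_packet / 1e12
print(f"~{tbps:.1f} Tb/s on the wire")  # ~51.7 Tb/s at line rate
```

In other words, that packet rate corresponds to a switch ASIC in the roughly 50 Tb/s class forwarding small packets at line rate.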
For its part, Meta, which co-founded OCP in 2011, views ESUN as a natural extension of its push for open hardware within data centers.
Even so, observers note that replacing established InfiniBand networks would require Ethernet to prove itself under the most demanding AI workloads, where latency and reliability are critical.
ESUN’s success will depend on balancing openness with performance. Advocates see a future where AI systems run on interoperable hardware using standardized Ethernet technologies.
Yet, given the scale and sensitivity of AI infrastructure, it remains uncertain whether industry momentum will shift decisively away from proprietary interconnects.
For now, ESUN represents an ambitious effort, and whether it can match InfiniBand’s performance in practice is still an open question.