OpenAI’s MRC Protocol Matters More Than the Industry Realises

Akash Sharma
12 May 2026

The AI infrastructure conversation has a hardware bias. GPUs, custom silicon, transformers, and power capacity dominate the analytical framework that most of the industry uses to understand where the scaling ceiling is and what will determine who hits it first. That bias is producing a blind spot. As GPU clusters scale from tens of thousands of chips toward hundreds of thousands, the constraint that is increasingly governing actual training performance is not compute capacity. It is network reliability. A single delayed data transfer in a tightly synchronised 100,000-GPU training cluster causes the entire cluster to wait, leaving millions of dollars of compute sitting idle while one straggler node catches up.
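The arithmetic behind that claim can be sketched in a few lines of Python. This is an illustration rather than OpenAI's own model: it assumes every GPU's transfer is delayed independently with the same small probability per training step, and the function names and figures are hypothetical.

```python
def step_time(transfer_times):
    """A synchronised step finishes only when the slowest transfer does."""
    return max(transfer_times)


def prob_step_delayed(num_gpus, p_delay):
    """Chance that at least one of num_gpus transfers is delayed in a step.

    Assumes delays are independent and equally likely on every GPU,
    which is a simplification made for illustration.
    """
    return 1.0 - (1.0 - p_delay) ** num_gpus


if __name__ == "__main__":
    # Three transfers finish in 1 ms, one straggler takes 5 ms:
    # the whole step takes 5 ms.
    print(step_time([1.0, 1.0, 1.0, 5.0]))

    # Hypothetical per-transfer delay probability of 1 in 100,000.
    for n in (1_000, 10_000, 100_000):
        print(n, round(prob_step_delayed(n, 1e-5), 3))
```

Under these assumed numbers, a delay that is a one-in-100,000 event for any single GPU affects roughly 63% of steps on a 100,000-GPU cluster, which is why reliability overtakes raw compute as the limiting factor at that scale.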

OpenAI calls this the straggler effect, and in a technical post published May 5 alongside the release of the Multipath Reliable Connection protocol, the company described network congestion, link failures, and device failures as the most common sources of delay and jitter in large-scale AI training. MRC is OpenAI’s answer to that problem, and the fact that it released it as an open standard through the Open Compute Project, developed collaboratively with AMD, Broadcom, Intel, Microsoft, and Nvidia, is the detail that makes it consequential beyond OpenAI’s own infrastructure.

Why MRC Changes Ethernet Networking Economics

MRC is a transport protocol that replaces the single-path model of conventional RDMA networking with multipath packet spraying across hundreds of independent network paths simultaneously. Where traditional RoCE networking assigns each data transfer to a single path, creating congestion when multiple transfers collide on the same link, MRC distributes a single transfer across as many paths as the network offers. When one path fails, only the fraction of packets on that path needs retransmitting rather than the entire transfer. Microsecond-level failure detection and rerouting eliminate the cascade of cluster stalls that conventional Ethernet fabrics experience during hardware failures.
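The difference between the two models can be illustrated with a toy load calculation in Python. This is a minimal sketch, not the MRC algorithm: the round-robin spraying policy, the function names, and the traffic figures are all assumptions chosen for clarity, and the published specification defines the real path selection and retransmission behaviour.

```python
def single_path_load(flow_sizes, path_of_flow, num_paths):
    """Per-path packet count when each transfer is pinned to one path."""
    load = [0] * num_paths
    for size, path in zip(flow_sizes, path_of_flow):
        load[path] += size
    return load


def sprayed_load(flow_sizes, num_paths):
    """Per-path packet count when packets are spread round-robin
    over every path (a stand-in for a real spraying policy)."""
    load = [0] * num_paths
    next_path = 0
    for size in flow_sizes:
        for _ in range(size):
            load[next_path] += 1
            next_path = (next_path + 1) % num_paths
    return load


def packets_on_failed_path(load, failed_path):
    """Packets that were in flight on the path that failed."""
    return load[failed_path]


if __name__ == "__main__":
    flows = [400, 400, 400, 400]   # four transfers, in packets
    pinned = [0, 0, 1, 2]          # two transfers collide on path 0

    single = single_path_load(flows, pinned, num_paths=4)
    sprayed = sprayed_load(flows, num_paths=4)

    print(single, max(single))     # [800, 400, 400, 0] 800
    print(sprayed, max(sprayed))   # [400, 400, 400, 400] 400

    # If path 0 fails, compare how much traffic was riding on it.
    print(packets_on_failed_path(single, 0))   # 800
    print(packets_on_failed_path(sprayed, 0))  # 400
```

In this toy case the pinned layout leaves one path idle while another carries double load, and a failure on path 0 takes out two entire transfers. With spraying, the same failure touches only a quarter of each transfer's packets.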

OpenAI already deploys MRC across its largest Nvidia GB200 supercomputers, including the Stargate site with Oracle Cloud Infrastructure in Abilene, Texas, and Microsoft’s Fairwater supercomputers. MRC is not a research concept. It is production infrastructure already shaping the training performance of some of the world’s most consequential AI systems.

Why the Open Compute Project Donation Changes Everything

The strategic significance of MRC is not its technical design, which is sophisticated and well-documented. It is the decision to contribute the MRC specification to the Open Compute Project rather than keeping it proprietary. OpenAI’s Stargate infrastructure requires Nvidia, AMD, Broadcom, and Intel all to implement MRC in their networking hardware for the protocol to deliver its full performance benefits across heterogeneous cluster environments. No proprietary standard could achieve that cross-vendor implementation breadth. By contributing MRC to the OCP, OpenAI created the governance structure needed for universal hardware support without requiring each vendor to trust a competitor’s proprietary specification.

The OCP contribution also signals something important about OpenAI’s infrastructure strategy more broadly. A company that contributes a protocol it developed and deployed in production to an open standards body is making a calculated bet that the value it captures from universal MRC adoption exceeds the value it would capture from MRC as a proprietary advantage. That calculation is consistent with how foundational infrastructure standards have historically emerged: when a technology is bottlenecking the entire market rather than just one player, the market leader who removes the bottleneck through open standardisation captures more value from the resulting acceleration than it would from maintaining a proprietary solution. Sameh Boujelbene, vice president at Dell’Oro Group, described the OCP donation as a strong signal that hyperscalers are leaning harder into Ethernet for AI fabrics, particularly as clusters push toward 100,000 to 500,000-plus GPUs.

The InfiniBand Displacement Implication

MRC’s emergence as an open standard has a specific and commercially significant implication for Nvidia’s InfiniBand networking business that the hardware-focused AI infrastructure analysis has not fully processed. InfiniBand has dominated large-scale AI training networking for a decade because its low latency and tightly integrated performance made it the reliable choice for the GPU cluster sizes that mattered until recently. The conventional wisdom was that Ethernet, despite its cost and vendor diversity advantages, could not match InfiniBand’s performance for tightly synchronised AI training workloads. MRC directly challenges that conventional wisdom by addressing the specific failure modes, congestion characteristics, and latency variability that made Ethernet less suitable than InfiniBand for large-scale training.

Boujelbene stated directly that MRC reinforces Ethernet’s growing position in hyperscale AI infrastructure, and that historically, ultra-large training clusters were dominated by Nvidia’s InfiniBand, but Ethernet is now rapidly evolving into a serious foundation for the largest AI supercomputers. That shift has direct revenue implications for Nvidia. InfiniBand is a significant and high-margin component of Nvidia’s networking business. A market that increasingly solves its AI networking challenges through MRC-enabled Ethernet rather than InfiniBand is a market that reduces its dependence on Nvidia networking hardware while potentially increasing its dependence on Nvidia compute hardware. As covered in our analysis of the custom silicon AI accelerator race entering its most consequential phase, the competitive dynamics around AI hardware are shifting in ways that compound over time. MRC is the networking dimension of that shift.

What Infrastructure Operators Need to Understand

For data center operators, colocation providers, and enterprise AI infrastructure teams, MRC has practical implications that extend beyond the protocol itself into how AI infrastructure gets designed, procured, and operated. The shift from single-path RDMA networking to multipath packet spraying changes the requirements for switches, network interface cards, and the optical connectivity infrastructure that carries AI training traffic. Facilities designed around conventional three-layer network topologies face the same architectural pressure that Google’s Virgo fabric addresses on the compute side: the general-purpose network design that served previous generations of workloads is not the optimal design for the specific traffic patterns that AI training generates.

MRC’s open standard status means that data center operators can plan around it with confidence that hardware support will be broadly available rather than dependent on a single vendor’s product roadmap. MRC extends RDMA over Converged Ethernet and draws on techniques from the Ultra Ethernet Consortium while adding SRv6-based source routing for large-scale AI networking fabrics. The technical specification is public. The OCP governance structure ensures the collective needs of the industry, rather than any single vendor’s competitive interests, will drive MRC’s evolution. Operators designing AI data center networking infrastructure today should incorporate MRC support into their hardware specifications rather than wait for commodity equipment vendors to standardise it, because the clusters that need MRC most are already under construction, not arriving two years from now.

As covered in our analysis of the AI infrastructure workforce crisis nobody is planning for, the skills required to design, implement, and operate next-generation AI infrastructure are in critically short supply. MRC adds a new networking design competency to that list that most infrastructure teams are not yet building.

Akash Sharma

Kiara Mandavia is the Content Manager at Compute Forecast, a publication covering the data centre industry. She brings a background in technology and editorial strategy, with a focus on making complex infrastructure trends accessible and meaningful for industry audiences. Her work explores the business, innovation, and sustainability stories shaping how the world builds and scales its digital foundations. At Compute Forecast, Kiara leads feature stories, industry analysis, and thought leadership content that keeps readers ahead of the curve in a rapidly evolving sector.
