Nvidia dominated the AI arena in 2024, with shipments of its Hopper GPUs more than tripling to over two million among its 12 largest customers, according to estimates from Omdia.
But while Nvidia remains an AI infrastructure titan, it's facing stiffer competition than ever from rival AMD. Among early adopters of its Instinct MI300-series GPUs, AMD is quickly gaining share.
Omdia estimates that Microsoft purchased approximately 581,000 GPUs in 2024, the most of any cloud or hyperscale customer in the world. Of those, one in six was built by AMD.
At Meta – by far the most enthusiastic adopter of the barely year-old accelerators, according to Omdia's findings – AMD accounted for 43 percent of GPU shipments at 173,000 versus Nvidia's 224,000. Meanwhile, at Oracle, AMD accounted for 23 percent of the database giant's 163,000 GPU shipments.
Nvidia remained the dominant supplier of AI hardware in 2024. Credit: Omdia
Despite rising share among key customers like Microsoft and Meta, AMD's share of the broader GPU market remains comparatively small next to Nvidia's.
Omdia’s estimates tracked MI300X shipments across four vendors – Microsoft, Meta, Oracle, and GPU bit barn TensorWave – which totaled 327,000.
AMD's MI300X shipments remained a fraction of Nvidia's in 2024. Credit: Omdia
AMD's ramp is no less notable given that its MI300-series accelerators have only been on the market for a year. Before that, AMD's GPUs were predominantly used in more traditional high-performance computing applications like Oak Ridge National Laboratory's (ORNL) 1.35 exaFLOPS Frontier supercomputer.
“They managed to prove the effectiveness of the GPUs through the HPC scene last year, and I think that helped,” Vladimir Galabov, research director for cloud and datacenter at Omdia, told The Register. “I do think there was a thirst for an Nvidia alternative.”
Why AMD?
How much of this thirst is driven by limited supply of Nvidia hardware is hard to say, but at least on paper, AMD's MI300X accelerators offered a number of advantages. Launched a year ago, the MI300X claimed 1.3x higher floating-point performance for AI workloads, as well as 60 percent higher memory bandwidth and 2.4x the capacity of the venerable H100.
The latter two attributes make the part particularly attractive for inference workloads, the performance of which is more often dictated by how much memory you have and how fast it is, rather than how many FLOPS the GPU can throw around.
Generally speaking, most AI models today are trained at 16-bit precision, which means that to run them you need approximately 2 GB of vRAM for every one billion parameters. With 192 GB of HBM3 per GPU, a single eight-GPU server boasts 1.5 TB of vRAM. This means that large models, like Meta's Llama 3.1 405B frontier model, can run on a single node. A similarly equipped H100 node, on the other hand, lacks the memory necessary to run the model at full resolution. The 141 GB H200 doesn't suffer from this same limitation, but capacity isn't the MI300X's only party trick.
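For the curious, here is a minimal back-of-envelope sketch of that rule of thumb in Python. It assumes 16-bit weights only and an eight-GPU node, and ignores KV cache, activations, and framework overhead; the 405B parameter count and per-GPU capacities are the figures quoted above, everything else is a simplifying assumption.

```python
# Back-of-envelope check: 2 GB of vRAM per billion parameters at 16-bit precision
# (weights only; ignores KV cache, activations, and framework overhead).

def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate vRAM needed just to hold the model weights, in GB."""
    return params_billion * bytes_per_param

def fits_on_node(params_billion: float, gpu_gb: float, gpus_per_node: int = 8) -> bool:
    """Do the weights alone fit in a single node's combined vRAM?"""
    return weights_gb(params_billion) <= gpu_gb * gpus_per_node

llama_405b = 405  # billions of parameters
for name, gpu_gb in [("MI300X", 192), ("H100", 80), ("H200", 141)]:
    print(f"{name}: {gpu_gb * 8} GB per node, model needs ~{weights_gb(llama_405b):.0f} GB "
          f"-> fits: {fits_on_node(llama_405b, gpu_gb)}")
```

Run as written, this reproduces the article's point: an eight-way MI300X or H200 node clears the roughly 810 GB needed for 405 billion 16-bit parameters, while an eight-way H100 node at 640 GB does not.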
The MI300X boasts 5.3 TBps of memory bandwidth, versus 3.3 TBps on the H100 and 4.8 TBps for the 141 GB H200. Together, this means that the MI300X should in theory be able to serve larger models faster than Nvidia's Hopper GPUs.
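To see why bandwidth matters, here is a rough, hedged estimate of the bandwidth-imposed ceiling on single-stream token generation: each new token requires reading every weight once, so tokens per second can't exceed memory bandwidth divided by model size in bytes. The 70B-parameter example model is an illustrative assumption, not a figure from Omdia, and the calculation ignores KV-cache traffic, compute time, and interconnect overhead.

```python
# Bandwidth-bound ceiling on batch-size-1 decode speed:
# tokens/sec <= memory bandwidth / bytes of weights read per token.
# Simplification: ignores KV-cache reads, compute, and multi-GPU overhead.

MODEL_BYTES = 70e9 * 2  # hypothetical 70B-parameter model at 16-bit precision

for name, tbps in [("MI300X", 5.3), ("H100", 3.3), ("H200", 4.8)]:
    ceiling = (tbps * 1e12) / MODEL_BYTES  # tokens/sec per GPU, batch size 1
    print(f"{name}: ~{ceiling:.0f} tokens/sec ceiling")
```

Under those assumptions the MI300X's 5.3 TBps works out to roughly 60 percent more headroom than the H100, which is the gap the spec sheet advertises.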
Even with Nvidia's Blackwell, which is only just starting to reach customers, pulling ahead on performance and memory bandwidth, AMD's new MI325X still holds a capacity advantage at 256 GB per GPU. Its more powerful MI355X, slated for release late next year, will push this to 288 GB.
As such, it's no surprise that Microsoft and Meta, both of which are deploying large frontier models measuring hundreds of billions or even trillions of parameters, have gravitated to AMD's accelerators.
This, Galabov notes, has been reflected in AMD’s guidance, which has consistently inched upward quarter after quarter. As of Q3, AMD now expects Instinct to power $5 billion in revenues in fiscal 2024.
Going into the New Year, Galabov believes that AMD has an opportunity to gain even more share. "AMD executes well. It communicates well with clients, and it's good at talking about its strengths and its weaknesses transparently," he said.
One potential driver is the emergence of GPU bit barns, like CoreWeave, which are deploying tens of thousands of accelerators a year. "Some of these are going to purposely try to build a business model around an Nvidia alternative," Galabov said, pointing to TensorWave as one such example.
Custom silicon hits its stride
It's not just AMD chipping away at Nvidia's empire. At the same time cloud providers and hyperscalers are buying up massive quantities of GPUs, many are also deploying custom AI silicon of their own.
Cloud providers deployed massive quantities of custom AI silicon in 2024, but it's important to bear in mind that not all of these parts are designed for GenAI. Credit: Omdia
Omdia estimates that shipments of Meta's custom MTIA accelerators, which we looked at in more detail earlier this year, crested 1.5 million in 2024, while Amazon placed orders for 900,000 Inferentia chips.
Whether or not this poses a challenge to Nvidia depends fairly heavily on the workload. That's because these parts are designed to run more traditional machine learning tasks, like the recommender systems used to match ads to users and products to buyers.
While Inferentia and MTIA may not have been designed with LLMs in mind, Google's TPUs certainly were, and have been used to train many of the search giant's language models, including both its proprietary Gemini and open Gemma models.
As best Omdia can figure, Google placed orders for about a million TPU v5e accelerators and 480,000 TPU v5p accelerators this year.
In addition to Inferentia, AWS also has its Trainium chips, which despite their name have been tuned for both training and inference workloads. In 2024, Omdia figures Amazon ordered about 366,000 of these parts. This aligns with its plans for Project Rainier, which will supply model builder Anthropic with "hundreds of thousands" of its Trainium2 accelerators in 2025.
Finally, there are Microsoft's MAIA parts, which were first teased shortly before AMD debuted the MI300X. Similar to Trainium, these parts are tuned for both inference and training, something Microsoft obviously does a fair bit of as OpenAI's main hardware partner and a model builder in its own right. Omdia believes Microsoft ordered roughly 198,000 of these parts in 2024.
- Million GPU clusters, gigawatts of power – the scale of AI defies common sense
- Humanoid robots coming soon, initially under remote control
- Boffins trick AI model into giving up its secrets
- Just how deep is Nvidia's CUDA moat really?
The AI market is bigger than hardware
Nvidia's monumental revenue gains over the past two years have understandably shone a spotlight on the infrastructure behind AI, but it's only one piece of a much larger puzzle.
Omdia expects Nvidia to struggle over the next year to grow its share of the AI server market as AMD, Intel, and the cloud service providers push alternative hardware and services.
“If we’ve learned anything from Intel, once you’ve reached 90-plus percent share, it’s impossible to continue to grow. People will immediately look for an alternative,” Galabov said.
Rather than fighting for share in an increasingly competitive market, Galabov suspects that Nvidia will instead focus on expanding the total addressable market by making the technology more accessible.
The introduction of Nvidia Inference Microservices (NIMs), containerized models designed to function like puzzle pieces for building complex AI systems, is just one example of this pivot.
“It’s the Steve Jobs strategy. What made the smartphone successful is the App Store. Because it makes the technology easy to consume,” Galabov said of NIMs. “It’s the same with AI; make an app store and people will download the app and they’ll use it.”
Having said that, Nvidia remains grounded in hardware. Cloud providers, hyperscalers, and GPU bit barns are already announcing massive clusters based on Nvidia's powerful new Blackwell accelerators, which pull well ahead of anything AMD or Intel has to offer today, at least in terms of performance.
Meanwhile, Nvidia has accelerated its product roadmap to a yearly cadence of new chips in order to maintain its lead. It appears that while Nvidia will continue to face stiff competition from its rivals, it's in no danger of losing its crown any time soon. ®