SC23 As the SC23 conference in Denver, USA, kicks off in earnest, Intel is spilling the tea on the two-phase Dawn supercomputer it’s building for the UK with Dell and the University of Cambridge.
The chipmaker touted the system earlier this month during the UK’s AI Summit, claiming it will be “the UK’s fastest AI supercomputer.”
Emphasis on AI, we think, because at 19 petaFLOPS of benchmarked FP64 performance, Dawn in its first phase only just about matches today’s publicly known fastest UK supercomputer, Scotland’s Archer2, which is currently ranked 39th in the world’s publicly known Top500. Archer2 manages to top out at 26 petaFLOPS as a theoretical maximum, or 20 petaFLOPS in benchmarks.
So, Dawn right now isn’t really the fastest in Britain at FP64 in a general sense. If you lower its precision to something like FP16 for AI work, then yes, its performance will in theory be greater, and it will therefore be the fastest AI machine in the country (assuming Archer2 couldn’t pull off the same feat if its operators so desired). See below for more on that.
And it’s still not clear whether Intel thinks the first or second phase of Dawn will be the “fastest” in the UK at AI. The second phase is set to be ten times as fast as the first. The first phase is supposed to be going online soon, if it hasn’t already; the second phase is due next year.
In a press briefing ahead of SC23, Intel execs said that at least the first-phase system would feature 512 4th-gen Xeon Scalable processors and 1,024 Datacenter GPU Max accelerators spread across 256 liquid-cooled Dell PowerEdge XE9640 systems.
Each node is equipped with 1TB of DDR5 memory and 512GB of high bandwidth memory. We’ve also learned each node will make use of four of Nvidia’s Infiniband HDR200 interconnects.
While neither Intel nor Dell has shared the details of what the second phase of the project will look like, it’s supposed to, as we said, boost the system’s capacity tenfold.
As it stands, the first phase of the system is rated for a peak output of 53 petaFLOPS of double precision performance. However, in its first Linpack run, Dawn managed less than half of that. At 19 petaFLOPS of real-world FP64 performance, the system comes in at 41st place in the global Top500.
Intel’s peak performance claims would appear to indicate the chipmaker has managed to work out the kinks in its Ponte Vecchio GPUs, which on paper are good for about 52 teraFLOPS at FP64.
As our sibling publication The Next Platform pointed out earlier, the Ponte Vecchio parts delivered to Argonne National Lab for integration into the US-based system were only capable of delivering 31.5 teraFLOPS of FP64 performance, about 61 percent of what the datasheet claims.
We’ve asked Intel for clarification on the GPU Max 1550’s performance; we’ll let you know if we hear anything back.
This means that if and when Dawn’s second phase is complete, its peak theoretical performance should be closer to 532 petaFLOPS at FP64. That would be a big step up from the UK’s Archer2.
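For those who want to check the sums, here is a rough sketch of the peak FP64 arithmetic using the figures quoted above. The variable and function names are ours, and the calculation assumes the peak rating comes entirely from the GPUs, with no contribution from the Xeon CPUs. That assumption is ours rather than Intel’s, though it does reproduce the numbers.

```python
# Figures quoted in the article; all names here are illustrative.
GPUS_PHASE_ONE = 1_024
DATASHEET_FP64_TFLOPS = 52.0   # per GPU, on paper
ARGONNE_FP64_TFLOPS = 31.5     # per GPU, as reported by The Next Platform
PHASE_TWO_SCALE = 10           # the promised tenfold capacity boost


def peak_petaflops(gpus: int, tflops_per_gpu: float) -> float:
    """Aggregate peak in petaFLOPS, assuming GPUs are the only contributors."""
    return gpus * tflops_per_gpu / 1_000


phase_one_peak = peak_petaflops(GPUS_PHASE_ONE, DATASHEET_FP64_TFLOPS)
phase_two_peak = phase_one_peak * PHASE_TWO_SCALE
argonne_ratio = ARGONNE_FP64_TFLOPS / DATASHEET_FP64_TFLOPS

print(f"Phase one peak: {phase_one_peak:.1f} petaFLOPS")
print(f"Phase two peak: {phase_two_peak:.0f} petaFLOPS")
print(f"Argonne parts vs datasheet: {argonne_ratio:.0%}")
```

Swap `ARGONNE_FP64_TFLOPS` in for the datasheet figure and the first-phase peak drops to roughly 32 petaFLOPS, which is why the per-GPU number matters so much here.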
If Intel and Dell can improve the efficiency of the system, Dawn’s second phase should rank among the top 10 fastest supercomputers officially recorded, with performance within spitting distance of the Fugaku system, which is rated for 537 petaFLOPS of peak FP64 performance.
With that said, actual performance in the Linpack benchmark usually comes in a fair bit lower. While Fugaku is rated for 537 petaFLOPS of peak performance, in the real world it’s closer to 442 petaFLOPS.
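To put that gap in perspective, the sketch below works out Linpack efficiency (benchmarked divided by peak) for the three machines, using only the rounded petaFLOPS figures quoted in this article. The names are ours, and the results are only as precise as those rounded inputs.

```python
# (benchmarked petaFLOPS, peak petaFLOPS) as quoted in the article
systems = {
    "Dawn (phase one)": (19, 53),
    "Archer2": (20, 26),
    "Fugaku": (442, 537),
}


def linpack_efficiency(benchmarked: float, peak: float) -> float:
    """Fraction of theoretical peak achieved in the Linpack run."""
    return benchmarked / peak


efficiency = {
    name: linpack_efficiency(benchmarked, peak)
    for name, (benchmarked, peak) in systems.items()
}

for name, value in efficiency.items():
    print(f"{name}: {value:.0%} of peak")
```

On these numbers, Dawn’s first run lands at a little over a third of peak, while Archer2 and Fugaku both clear three quarters.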
Further analysis
Intel’s claim is that Dawn is the UK’s “fastest AI supercomputer,” and this is where things get a little interesting. Those GPU Max 1550s are good for 832 teraFLOPS of Brain Float 16 (BF16) math, according to Intel’s datasheet. In its first phase, that puts its AI performance at 852 petaFLOPS. Unless the claim is based on the chip’s integer performance, in which case we’re looking at 1.7 exaOPS of INT8. Fully built, the system will land somewhere between 8.5 exaFLOPS of BF16 and 17 exaOPS of INT8.
Nvidia has made similar claims about the AI performance of the Isambard-AI supercomputer being deployed in collaboration with the University of Bristol, which will be made up of 5,448 Nvidia GH200 Grace-Hopper Superchips. These parts support nearly 4 petaFLOPS of sparse FP8 performance, putting its peak AI performance at about 21 exaFLOPS.
Compare pure BF16 performance and the completed Dawn system should come out ahead. But if your workload can leverage FP8, then the Isambard-AI machine is the one to beat.
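The AI figures for both machines come from the same multiplication, sketched below with the per-chip numbers quoted above. The names are ours. Note that we plug in a flat 4 petaFLOPS per GH200, whereas the chips are rated at “nearly” 4, so the Isambard-AI result is an upper bound rather than a precise figure.

```python
# Per-chip figures as quoted in the article; names are illustrative.
DAWN_GPUS_PHASE_ONE = 1_024
DAWN_BF16_TFLOPS_PER_GPU = 832.0
DAWN_PHASE_TWO_SCALE = 10

ISAMBARD_SUPERCHIPS = 5_448
ISAMBARD_FP8_PFLOPS_PER_CHIP = 4.0  # assumption: rounded up from "nearly 4"

# 1 exaFLOPS = 1,000 petaFLOPS = 1,000,000 teraFLOPS
dawn_phase_one_ef = DAWN_GPUS_PHASE_ONE * DAWN_BF16_TFLOPS_PER_GPU / 1_000_000
dawn_full_ef = dawn_phase_one_ef * DAWN_PHASE_TWO_SCALE
isambard_upper_bound_ef = ISAMBARD_SUPERCHIPS * ISAMBARD_FP8_PFLOPS_PER_CHIP / 1_000

print(f"Dawn phase one, BF16: {dawn_phase_one_ef * 1_000:.0f} petaFLOPS")
print(f"Dawn fully built, BF16: {dawn_full_ef:.1f} exaFLOPS")
print(f"Isambard-AI, sparse FP8 (upper bound): {isambard_upper_bound_ef:.1f} exaFLOPS")
```

Bear in mind the two totals are not like for like: Dawn’s is dense BF16, while Isambard-AI’s is sparse FP8.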
Of course, all of these estimates assume that Dawn will eventually swell to 10,000-plus GPUs and that Intel is actually getting 52 teraFLOPS of FP64 performance from the accelerators. ®