The HN crowd would probably prefer reading the many technical details at the ORNL press release: https://www.ornl.gov/news/frontier-supercomputer-debuts-worl... which I just submitted here: https://news.ycombinator.com/item?id=31573066
Also, yesterday Tom's Hardware had a detailed article: https://www.tomshardware.com/news/amd-powered-frontier-super... 29 MW total, 400 kW per rack(!)
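Those two power figures together imply a rough rack count, assuming the quoted numbers really are the machine total and the per-rack draw (a back-of-the-envelope check, not an official spec):

```python
# Back-of-the-envelope: total power / per-rack power ~ rack count.
# Both figures come from the Tom's Hardware article; the rack count
# derived here is an estimate, not an official number.
total_power_kw = 29_000   # 29 MW total
rack_power_kw = 400       # 400 kW per rack

racks = total_power_kw / rack_power_kw
print(racks)  # 72.5, i.e. roughly 70-some racks
```
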
Is anyone else like me, wanting to see actual pictures or videos of the supercomputer instead of a rendering like the one in the VentureBeat article? Well, head here; ORNL has a very short video: https://www.youtube.com/watch?v=etVzy1z_Ptg We can see, among other things, that it's water-cooled (the blue and red tubing), and at 0m3s there's a PCB labelled "Cray Inc Proprietary ... Sawtooth NIC Mezzanine Card"
Since they are using AMD's accelerators as well [1], I do wonder whether any usage of these will trickle down and give us improvements in ROCm.
Surely the people at these labs will want to run ordinary DL frameworks at some point - or do they have the money and time to always build entirely custom stacks?
[1] AMD Instinct MI250x in this case.
What an incredible achievement. Good for AMD. The Epyc is a fantastic processor.
And there are another 2 (3?) faster systems coming online in the next year or so.
I wonder whether having one supercomputer with x chips, or eight supercomputers each with x/8 chips, would be the more practical working setup. Weather forecasting, for example, is basically a complex probabilistic computation, and there's a notion that running eight models in parallel and then comparing and contrasting the results gives better estimates of actual outcomes than running one model on a much more powerful machine.
Is it feasible to run eight models on one supercomputer, or is that inefficient?
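The ensemble idea in the comment above can be sketched in a few lines: run the same model several times with slightly perturbed initial conditions, then look at the spread of outcomes. A minimal illustration using a toy chaotic map as a stand-in for a forecast model (everything here is hypothetical, not a real weather code):

```python
import random

def toy_model(x0, steps=50):
    """Hypothetical stand-in for a forecast run: a chaotic logistic map
    where tiny differences in the initial state diverge over time."""
    x = x0
    for _ in range(steps):
        x = 3.9 * x * (1 - x)
    return x

random.seed(0)
base = 0.5
# Eight "ensemble members": same model, slightly perturbed starting states.
ensemble = [toy_model(base + random.uniform(-1e-4, 1e-4)) for _ in range(8)]

mean = sum(ensemble) / len(ensemble)
spread = max(ensemble) - min(ensemble)
print(f"ensemble mean={mean:.3f}, spread={spread:.3f}")
```

A wide spread signals low forecast confidence, information a single deterministic run simply cannot provide, which is one reason ensembles are considered worth the extra compute.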
Congratulations to AMD, HPE and ORNL! This is an amazing achievement. Can't wait to see the spectacular science results coming from this installation.
Intel was supposed to build the first exascale system for ANL [1] [2], to be installed by 2018. They completely and utterly messed up the execution, partly driven by the 10nm failure, went back to the drawing board multiple times, and now Raja has switched the whole thing to GPUs, a technology Intel has no previous success with, and rebased it to 2 exaflops peak, meaning they probably expect ~1 EF sustained performance, i.e. 50% efficiency. No other facility would ever consider Intel as a prime contractor again. ANL hitched their wagon to the wrong horse.
1. https://www.alcf.anl.gov/aurora 2. https://insidehpc.com/2020/08/exascale-exasperation-why-doe-...
I am still kicking myself every time I look at AMD's share price. I sold a not-insignificant-to-me amount of shares when the price was basically below 10 a share. Now it's above 100. All this is to say that the turnaround at AMD is good to see, and the missteps at Intel are hilarious.
This is like the time the Athlon 64 and its on-die memory controller were kicking the Pentiums around.
While AMD gets top billing for the compute cores, HPE used the Slingshot network from its Cray acquisition to build this heterogeneous supercomputer. It has a 64-port, 12.8 Tb/s switch, it scales to >250,000 host ports with a maximum of 3 hops, and it uses Ethernet "plus optimized HPC functionality".
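The per-switch numbers are self-consistent if each of the 64 ports runs at 200 Gb/s, which is the commonly reported Slingshot port speed (a quick sanity check; the per-port rate is my assumption, not from the comment above):

```python
ports = 64
port_speed_gbps = 200  # assumed per-port rate for Slingshot

# 64 ports x 200 Gb/s = 12,800 Gb/s = 12.8 Tb/s per switch
total_tbps = ports * port_speed_gbps / 1000
print(total_tbps)  # 12.8, matching the quoted switch bandwidth
```
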
This reads more or less like a corporate press release - (edit: actually, it reads exactly like a corporate press release) - is there a more substantive article on the topic?
China has two exaflop supercomputers. It's doubtful whether this is the world's most powerful supercomputer.
https://www.nextplatform.com/2021/10/26/china-has-already-re...
Seems I've heard nothing but good things about AMD for the last 10 years or so.
I once had a terrible experience with AMD ~10 years ago that made me swear off them for good. It had something to do with software; I remember it taking several days of work to resolve.
Willing to give them another try soon though. I never seem to even use the full power of whatever CPU I get, lol.
One petaflop DP Linpack was achieved in 2008. Supercomputing's "Moore's Law" is a doubling of speed every 1.5 years: an order of magnitude every five years, a thousand-fold every 15. Pretty close to schedule.
Onward to a zettaflop around 2037?
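The arithmetic behind that projection, taking the 1.5-year doubling period as a given (a rough extrapolation, not a prediction):

```python
import math

doubling_years = 1.5                      # assumed supercomputing doubling period
years_per_1000x = math.log2(1000) * doubling_years
print(round(years_per_1000x, 1))          # ~14.9 years per thousand-fold

# 1 PF Linpack in 2008 -> ~1 EF around 2023 (Frontier hit it in 2022),
# and 1 EF -> 1 ZF another ~15 years on, i.e. the late 2030s.
print(round(2008 + 2 * years_per_1000x))
```
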
I read somewhere that this means the US now has the world's fastest supercomputer.
Does this No. 1 position have something to do with the ban on exporting advanced technology to China?
Can someone please explain how software is made at this scale?
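Not an authoritative answer, but the dominant model on machines like this is SPMD: one program runs on every node (typically via MPI), and each copy uses its rank to work out which slice of the global problem it owns. The core index arithmetic looks roughly like this (pure Python, no MPI; ranks are simulated in a loop, and the function name is my own):

```python
def local_range(rank, nranks, n):
    """Split n global items across nranks as evenly as possible;
    return the half-open [start, stop) range owned by this rank."""
    base, extra = divmod(n, nranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

# Simulate 8 "ranks" dividing a 1,000,003-cell 1-D grid among themselves.
n = 1_000_003
ranges = [local_range(r, 8, n) for r in range(8)]
assert ranges[0][0] == 0 and ranges[-1][1] == n               # full coverage
assert all(a[1] == b[0] for a, b in zip(ranges, ranges[1:]))  # no gaps/overlaps
print(ranges[:2])
```

In a real code, each rank then exchanges only its boundary cells with neighboring ranks ("halo exchange"), so communication stays small relative to computation; much of the rest is tuned libraries (BLAS, FFT, solvers) rather than hand-written code.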
How much of that performance will get undone by the software though? Either through AMD's lack of effort or Intel's compiler "sabotage".
The more powerful processors become, the less I feel there's a need to build supercomputers.
Thinking about it, the most powerful supercomputer in the world is pretty much a million consumer processors working in parallel. That ratio is going to stay fairly constant, since cost scales roughly linearly.
If X is the processing power of $1k of consumer hardware, the bigger X gets, the less there is a difference in the class of problems that you can solve with X or X * 1e6 processing power.
Since Cray stopped making their own CPUs, they have been back and forth between AMD and Intel several times.
The most powerful, and unfortunately unusable, supercomputer in the world. AMD's approach to GPUs has been on a failing track since its inception. The only software stack available is super fragile, buggy, and barely supported. Rather than building an HPL machine, I would have preferred to see public money spent in a different way.
Now the real question: Can it run Crysis... without hardware acceleration?
hmm
Thank you to the authors for not calling it the fastest computer in the world :-) and instead, as they should, the most powerful. Clock speed is not the only factor of course, since instructions per cycle and cache sizes also matter, but by the pure measure of clock frequency, the fastest is still:
- For practical, non-overclocked use, the IBM zEC12 at 5.5 GHz: https://www.redbooks.ibm.com/redbooks/pdfs/sg248049.pdf
or
- An AMD FX-8370 floating in liquid nitrogen at 8.7 GHz: https://hwbot.org/benchmark/cpu_frequency/rankings#start=0#i...
What blows my mind is that the newest NOAA supercomputer (which triples the speed of the last one) is a whopping 12 petaflops. It comes online this summer.
It kind of shows the difference in spending priorities, when nuclear labs get >1,000-petaflop supercomputers and the weather service (which helps with disasters that affect many Americans each year) gets a new one that is 1.2% of the speed.
https://www.noaa.gov/media-release/us-to-triple-operational-....