Great article. I'm in the process of building this business myself, so I'm intimately aware of everything in the article. Just keep in mind that a lot of the math is back-of-the-napkin guessing. All of this is managed through personal relationships and deals with the vendors, and much of the pricing that actually gets set isn't made public.
Our first offering is the AMD MI300X instead of Nvidia or Intel products. But unlike all of my competitors, we are not picking sides: our unique value prop is that we eventually plan to offer the best-of-the-best compute of anything our customers want. Effectively, we're building a varied supercomputer for rent, with bare-metal access to everything you need and white-glove service and support. In essence, we are the capex/opex for businesses that don't want to take on this risk themselves.
What people don't understand is how difficult and capital intensive it is to deploy and manage large-scale compute, especially on the cutting edge of the high end. Gone are the days of just racking and stacking a few servers. This stuff is far more complicated and involved, and it is rife with firmware bugs, limitations, and hardware failures.
The end of the article says some nonsense about a glut of GPU capacity. I do have to call that out. It isn't going to happen for a long while at least. The need for compute is growing exponentially. Given the complexities of just deploying this stuff, it isn't physically possible to get enough of it out there, fast enough, to satisfy the demand.
I love every challenge of this business. Happy to answer questions where I can.
Just adding some information: the article claims Tesla has 15k H100s, but they actually have 40k H100s, with 85k expected by the end of the year. https://www.reddit.com/r/NVDA_Stock/comments/1cbwvnr/tesla_4...
As usual, it pays to keep a bit of doubt about third-party estimates.
The AI boom, combined with bottlenecks in foundry capacity and advanced packaging, has created contango in the GPU market: the value of a used H100 has gone up after the sale.
The GPU supply-demand curve is nowhere close to where Nvidia would like it to be. Demand is so high that Nvidia would make at least 2X the profit if it could sell 3X more GPUs. TSMC just can't build new fabs fast enough.
Great article. If this sustainable arbitrage exists from renting GPU time instead of selling and shipping GPUs, why doesn't Nvidia become a cloud provider itself?
So should we buy CoreWeave and Lambda shares after IPO or not?
For enthusiasts, even renting from Runpod, Salad, or Vast.ai, which are an order of magnitude cheaper than the established cloud providers, is much more expensive than buying an RTX 3090 or 4090. Which got me thinking: why don't companies that need training capacity pool their money, buy some H100s, and share them?
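To put rough numbers on the rent-vs-buy gap (all prices here are ballpark assumptions for illustration, not quotes from any provider):

```python
# Ballpark break-even: buying a consumer GPU vs. renting by the hour.
# Both prices below are assumptions, not quotes from any marketplace.
card_price = 1600.0   # ~street price of an RTX 4090, USD (assumed)
rental_rate = 0.40    # $/hr on a cheap GPU marketplace (assumed)

break_even_hours = card_price / rental_rate
print(break_even_hours)        # 4000.0 GPU-hours
print(break_even_hours / 24)   # ~167 days of continuous 24/7 use
```

Anyone running the card more than a few months at high utilization comes out ahead buying, even before counting resale value (and before subtracting electricity, which pushes break-even out somewhat).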
Nvidia is a manufacturer. They’ve read all the books of all of the giants of logistics. They know that throughput means nothing if it doesn’t include sales. They know how to reduce in-process work and get from raw materials to boxes on store shelves as quickly as possible.
Which is to say: they know inventory is a liability and they know how to get rid of it.
Renting out equipment is an inventory management problem, which manufacturers don’t understand on purpose. That’s somebody else’s domain.
> there are only 35,064 hours in a year with 365.25 days
Wut. 96 hours per day? I don't trust the maths in this article.
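For what it's worth, 35,064 is exactly four years' worth of hours, and the article's revenue figures are quoted "over four years," so this looks like a mislabeled period rather than broken arithmetic:

```python
# A Julian year is 365.25 days; check the article's 35,064 figure against it.
hours_per_year = 365.25 * 24
print(hours_per_year)       # 8766.0 hours in one year
print(hours_per_year * 4)   # 35064.0 -> the article's number is 4 years of hours
```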
This is actually not an accurate report. https://twitter.com/glennklockwood/status/178624248776766284...
The math seems off to me here. In particular:
>> So here is the deal: If you have 16,000 GPUs, and you have a blended average of on-demand (50 percent), one year (30 percent), and three year (20 percent) instances, then that works out to $5.27 billion in GPU rental income over four years with an average cost of $9.40 per hour for the H100 GPUs.
This makes a very strong assumption that the rental cost of an H100 will not change over the next four years. This is wildly optimistic. Instead, we can infer expected prices by looking at the differential rates for one and three-year commitments:
>> We estimated the one year reserved instance cost $57.63 per hour, or $7.20 per GPU-hour, and we know that the published price for the three year reserved instance is $43.16, or $5.40 per GPU-hour.
On the margin, the cloud provider should be indifferent between an on-demand rental, a one-year reservation, and a three-year reservation. That implies that three consecutive one-year reservations should provide about the same income as the three-year reservation.^[1]
A three-year commitment adds up to $16.20 in per-hour rate, summed over its three years (3 × $5.40). The one-year commitment sums to $7.20 over its single year. Subtracting the two leaves a residual of $9.00 covering the two remaining years of the contract; dividing by two gives $4.50.
With this rough calculation, the two-year, one-year-forward price of the H100 is about $4.50/hr. If we further assume that the price changes per year with a constant ratio (0.72), we can break up the per-hour, one-year reservation prices as $7.20 (today), $5.22 (one year from now), and $3.78 (two years from now).
Going further into speculation and applying this ratio to rental revenue on the whole, that "$5.27b over four years" instead becomes $3.47b. Still a reasonable multiple of the purchased infrastructure cost, but it's less outlandish and emphasizes the profit potential of moving first in this sector (getting those long-term commitments while it's still a seller's market).
[1 — I'm ignoring the option value in the one-year commitment, which allows the customer to seek a better deal after twelve months. This option value is minimal if the GPU cloud is expected to be at capacity forever, such that the provider can replace customers easily.]
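As a sanity check, the whole chain fits in a few lines. The only inputs are the article's $7.20 and $5.40 rates and its $5.27B figure; the constant year-over-year price ratio is my simplifying assumption:

```python
from math import sqrt

one_year = 7.20    # $/GPU-hour, one-year reservation (article's estimate)
three_year = 5.40  # $/GPU-hour, three-year reservation (published price)

# Residual rate covering years 2-3 of the three-year deal
residual = 3 * three_year - one_year   # 9.00
forward_avg = residual / 2             # 4.50 $/hr average over years 2-3

# Constant year-over-year ratio r satisfying one_year*(r + r^2) = residual,
# i.e. r^2 + r - residual/one_year = 0; take the positive root.
c = residual / one_year                # 1.25
r = (-1 + sqrt(1 + 4 * c)) / 2         # ~0.7247, quoted as 0.72 above

prices = [one_year * r**i for i in range(3)]  # ~[7.20, 5.22, 3.78]

# Apply the same decay to the article's flat-price four-year revenue figure
revenue_flat = 5.27                    # $B over four years at today's prices
revenue_decayed = revenue_flat * sum(r**i for i in range(4)) / 4  # ~3.47

print(forward_avg, round(r, 4), [round(p, 2) for p in prices],
      round(revenue_decayed, 2))
```

Keeping the exact root rather than the rounded 0.72 is what lands the final figure on $3.47B.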