During the current AI boom, no sum seems too large to pay for new GPU computing capacity. When Silicon Valley executives complain about shortages, investors pour billions into cloud providers.
In computing, floating-point operations per second (FLOPS) is a measure of computer performance: how many arithmetic operations on real numbers a machine can execute each second.
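To make the unit concrete, here is a minimal sketch of how achievable FLOPS can be estimated with a matrix multiplication benchmark. The choice of NumPy, the matrix size, and single-precision floats are illustrative assumptions, not a standard benchmark.

```python
# Rough FLOPS estimate via a single matrix multiplication (illustrative only).
import time
import numpy as np

n = 4096
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

a @ b  # warm-up run so timing excludes one-time setup costs

start = time.perf_counter()
a @ b
elapsed = time.perf_counter() - start

flop = 2 * n**3  # an n x n matmul needs ~2*n^3 floating-point operations
print(f"~{flop / elapsed / 1e12:.2f} TFLOPS achieved")
```

An H100's advertised throughput is orders of magnitude above what a CPU run of this sketch will report, which is the gap the buyers below are paying for.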
“Today, FLOPS is one of the things you can spend dollars on. Tomorrow, dollars are one of the things you can spend FLOPS on. A reading from the church of FLOPS,” tweets Andrej Karpathy, former AI director at Tesla, now working at OpenAI.
He’s one of the voices claiming that the “top gossip of the valley” is “who’s getting how many H100s and when.” The H100 is the latest and fastest chip from Nvidia for AI training.
Elon Musk has said that “GPUs are harder to buy than drugs,” and he also claimed that “it seems like everyone and their dog is buying GPUs at this point.”
At $30,603, a single H100 with 80GB of HBM2e memory is already very expensive. Listings on eBay, however, go as high as $51,000, hinting at a profitable opportunity for scalpers.
And tech giants are hunting for tens of thousands of such GPUs.
“To train a model of probably GPT-5 size, I wouldn’t be surprised if they use at least 30,000, maybe 50,000 H100,” Musk estimated.
A similar number of previous-generation A100 GPUs was required to train ChatGPT, research firm TrendForce estimated, adding that demand will increase significantly.
The increased demand for GPUs to train generative AI and large language models is reflected in Nvidia’s record first-quarter data center sales, up 14% from a year ago.
“One reason the AI boom is being underestimated is the GPU/TPU shortage. This shortage is causing all kinds of limits on product rollouts and model training but these are not visible. Instead, all we see is Nvidia spiking in price. Things will accelerate once supply meets demand,” Adam D'Angelo, CEO of Quora, writes.
The AI revolution, with chatbots at its forefront, is expected to deliver new tools for drug discovery, forecasting financial or weather trends, and a significantly higher level of automation.
Generative AI alone could add the equivalent of $2.6 trillion to $4.4 trillion in value to the global economy annually, according to consultancy McKinsey. That is more than the entire GDP of the United Kingdom, which stands at $3.1 trillion. About 75 percent of the value AI delivers falls across four areas: customer operations, marketing and sales, software engineering, and R&D.
Investors pouring in billions
Specialized GPU cloud provider CoreWeave, which describes itself as a “small, growing team of intelligent, genuine people,” recently closed a $2.3 billion financing round led by Magnetar Capital and Blackstone. The money will be committed to hardware for contracts already executed with clients, with Nvidia H100 GPUs used as collateral.
The company bought and plugged in its first GPU in 2017, back when it was a cryptocurrency mining provider. Now it delivers infrastructure to companies for AI and model training.
“We’re mobilizing quickly to expand capacity – we recently announced a new $1.6 billion data center in Texas, our first facility in the state. Our products are powering some of the world's most sophisticated and computationally intensive workloads that benefit from infrastructure purpose-built to support it, and we have some of the highest caliber customers on board,” CoreWeave writes in a press release.
The startup has also recently signed a deal with Microsoft for computing power.
Nvidia has invested $100 million in CoreWeave. However, Nvidia is also building its own DGX Cloud. A single DGX Cloud instance features eight 80GB GPUs for 640GB of GPU memory per node. Thousands of GPUs are available online on Oracle Cloud Infrastructure and Nvidia's own infrastructure in the US and UK.
Nvidia-dominated market
There is no way around GPUs if you want to train large AI models. And Nvidia dominates this market.
Microsoft, which invested $10 billion in OpenAI, the maker of ChatGPT, even warned about potential service disruptions, as data centers depend on the availability of components, including GPUs, for which “there are very few qualified suppliers.”
“Extended disruptions at these suppliers could impact our ability to manufacture devices on time to meet consumer demand,” Microsoft writes in its annual report.
Nvidia’s flagship H100 offers an all-around upgrade over the previous A100 chip: it delivers a fourfold speedup on language models, is twice as fast on vision models, and specific large language models that need parallelization can gain even more, according to Lambda Labs.
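To see what such speedup factors mean in practice, here is a toy calculation. The 30-day baseline is a made-up figure for illustration, not a measured training run.

```python
# Illustrative arithmetic only: how vendor speedup factors translate into
# wall-clock training time. The baseline is a hypothetical A100 run.
baseline_days_a100 = 30
speedups = {"language model": 4.0, "vision model": 2.0}

for workload, factor in speedups.items():
    print(f"{workload}: ~{baseline_days_a100 / factor:.1f} days on H100")
```

A month-long language-model run shrinking to roughly a week is why buyers queue for the newer chip even at scalper prices.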
The only real alternative is AMD, which offers AI chips with large amounts of HBM memory. AMD recently unveiled the MI300X GPU with 192GB of memory.
“Good software has been the Achilles’ heel for most ML training chip companies,” generative AI platform MosaicML writes. Data scientists and ML engineers are used to working with frameworks such as TensorFlow that are built on Nvidia’s CUDA software stack, so a far more extensive collection of software and open user code exists on the Nvidia side.
Charlie Boyle, general manager of DGX at Nvidia, told VentureBeat that Nvidia is building plenty of GPU chips; the shortages arise from other components in the supply chain and from increased demand among cloud providers.
Competition for GPUs is global. Last year, the US government ordered Nvidia and AMD to stop selling their best AI technology to China, including Nvidia’s A100, H100, and future generations of integrated circuits achieving equal or higher performance. The new rules, aimed at China’s militarization efforts, were meant to slow China’s advances in key technology sectors. A reprieve was granted later.
And Nvidia announced in March that it had modified its state-of-the-art H100 chip into a cut-down version named H800 that is legal to export to China. The export version has a halved chip-to-chip data transfer rate.
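Why a halved interconnect matters: in data-parallel training, each step ends with an all-reduce that synchronizes gradients across GPUs, and its duration scales with gradient size divided by chip-to-chip bandwidth. The sketch below uses the standard ring all-reduce traffic formula; the bandwidth figures, GPU count, and model size are illustrative assumptions, not published H100 or H800 specifications.

```python
# Back-of-envelope: time for one ring all-reduce of gradients across a node.
# All numbers below are illustrative assumptions, not official Nvidia specs.
def allreduce_seconds(model_params, bytes_per_param, n_gpus, bw_bytes_per_s):
    grad_bytes = model_params * bytes_per_param
    # A ring all-reduce moves ~2*(N-1)/N of the data through each link.
    traffic = 2 * (n_gpus - 1) / n_gpus * grad_bytes
    return traffic / bw_bytes_per_s

params = 70e9  # hypothetical 70B-parameter model, fp16 gradients (2 bytes)
for name, bw in [("full interconnect", 900e9), ("halved interconnect", 450e9)]:
    t = allreduce_seconds(params, 2, 8, bw)
    print(f"{name}: ~{t:.2f} s per gradient sync")
```

Halving the link bandwidth roughly doubles the time every training step spends synchronizing, which is exactly the capability the export rules target.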