How Nvidia became the superpower of AI training


Nvidia has a near-monopoly market share in AI computing, but the competition is heating up.

On Wednesday, Nvidia announced its Q1 financial results, which were, yet again, above analyst expectations. During the quarter, which ended April 28th, the company's profit soared 262% year over year, briefly sending the stock to record highs above $1,000.

Now, with a market cap of over $2.3 trillion, Nvidia is the third most valuable company in the world, surpassing even big tech players like Alphabet and Amazon.

The main driver of this impressive financial leap, which happened over a period of just a few years, was Nvidia's data center chips, whose revenue more than quadrupled year over year in Q1.

This surge also illustrates the latest trend in the tech world: companies are investing heavily in training AI.

Even though Nvidia's main competitors, Intel and AMD, as well as Google and Meta, offer alternative chips, none of them can deliver the same performance. Nvidia has a near monopoly in the data center chip market.

However, with increasing competition and chip shortages, maintaining the same market dominance will be difficult, if not impossible.

How it all started

The idea for Nvidia was born at a Denny's restaurant in California, where Jensen Huang, an American immigrant from Taiwan, had worked when he was 15 years old.

Before founding Nvidia, Huang held various humble jobs, including washing dishes and cleaning toilets.

"I was definitely the best toilet cleaner the world has ever seen," he said, half-jokingly, in an interview with Stripe's co-founder at a company event.

Huang, who has been steering Nvidia for over 31 years and is the longest-serving CEO in the tech world, doesn't shy away from his early work experience and credits it for his strong work ethic.

In 1993, Huang met with Chris Malachowsky and Curtis Priem at Denny's to discuss creating a chip that would enable realistic 3D graphics on personal computers. There, the three co-founders started their journey.

The company's first chip, the NV1, was released in 1995. It was a multimedia accelerator that combined 2D and 3D graphics capabilities with audio support. However, the chip tried to do too many things and failed to win many paying customers. The company nearly went bankrupt and had to lay off roughly half of its employees.

"Building Nvidia turned out to have been a million times harder than I expected it to be – than any of us expected to be," said Huang in an interview with the Acquired podcast. He confessed that he wouldn't start a company again if he knew what was ahead.

"At that time, if we realized the pain and suffering, and how vulnerable you gonna feel, and the challenges you gonna endure, the embarrassment and the shame, and the list of all the things that go wrong, I don't think that anybody would start a company," he said.

After the NV1's unsuccessful launch and the near-bankruptcy that followed, the company launched the RIVA 128 in 1997. It was Nvidia's first major successful product, four times faster than any other graphics processor at the time.

In 1999, the company introduced the GeForce 256, the first of its famous line of graphics cards, and with it the term "graphics processing unit" (GPU), which the industry has been using to this day.

Pivot to data centers

While Nvidia's GeForce cards are still widely used in the latest gaming PCs, GPUs for gamers now represent only around a tenth of the company's total revenue, with the lion's share coming from data centers.

Nvidia's expertise in creating GPUs for gamers certainly contributed to its becoming a powerhouse of AI training.

In contrast to central processing units, which execute calculations largely one at a time, GPUs can perform many smaller tasks simultaneously. This technique is called parallel processing.

This was one of the key factors that made GPUs suitable for programming AI software.
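The difference between the two execution models can be sketched in plain Python. This is only a conceptual illustration using a thread pool on a CPU, not real GPU code; the `scale` function and the data are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def scale(x):
    # One small, independent task: the kind of per-element work
    # a GPU dispatches across thousands of cores at once.
    return x * 2.0

data = list(range(1_000))

# CPU-style: one calculation after another.
sequential = [scale(x) for x in data]

# GPU-style (simulated with threads): the same independent tasks
# handed out concurrently, because no task depends on another.
with ThreadPoolExecutor(max_workers=8) as pool:
    parallel = list(pool.map(scale, data))

assert parallel == sequential  # same result, different execution model
```

The key property is that each task is independent, so the work can be split across as many processing units as the hardware offers.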

However, this "lucky coincidence" was only a small part of the success. The real reason Nvidia was able to dominate the AI training market and build such an edge over its competitors is that the company started preparing for it almost two decades ago.

Nvidia started improving its chips and making them more suitable for AI training when almost nobody was talking about AI. In 2014, it launched the Tesla K80, its first data center GPU aimed at AI training.

Making hardware was just one part of the success. In 2006, the company launched CUDA, a parallel computing platform and programming model that harnessed the power of GPU accelerators. CUDA allows developers to utilize the parallel processing capabilities of Nvidia GPUs to accelerate applications beyond graphics, such as scientific simulations and AI.
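The core idea of the CUDA programming model can be sketched in pure Python. This is illustrative only, not actual CUDA code: a "kernel" describes the work for a single element, and a launch dispatches that kernel across the whole array. The function names here are invented for the example.

```python
def vector_add_kernel(i, a, b, out):
    # In real CUDA, this body runs once per GPU thread, with `i`
    # derived from the thread and block indices.
    out[i] = a[i] + b[i]

def launch(kernel, n, *args):
    # A real CUDA launch dispatches n threads at once on the GPU;
    # here we loop sequentially just to show what each thread computes.
    for i in range(n):
        kernel(i, *args)

a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
out = [0.0] * 3
launch(vector_add_kernel, 3, a, b, out)
print(out)  # [11.0, 22.0, 33.0]
```

The appeal of this model is that the developer writes only the per-element logic, and the platform handles spreading it across thousands of cores.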

During his interview on stage at the Stripe event, Huang said that Nvidia wouldn't be as successful without the software.

"It is potentially one of the most important inventions in modern computing," he said. "We invented the idea called accelerated computing, an idea so simple but deeply profound: a small percentage of a program's code occupies 99 percent of the runtime, and this is true for very important applications. That small little kernel can be accelerated."
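Huang's observation is essentially Amdahl's law. With hypothetical numbers: if the hot kernel accounts for 99% of runtime and a GPU accelerates it 50x, the whole program speeds up by roughly 33x even though the remaining 1% runs at the original speed.

```python
def overall_speedup(hot_fraction, kernel_speedup):
    # Amdahl's law: the un-accelerated fraction runs at 1x,
    # the hot kernel runs kernel_speedup times faster.
    return 1.0 / ((1.0 - hot_fraction) + hot_fraction / kernel_speedup)

print(round(overall_speedup(0.99, 50), 1))  # 33.6
```

This is why accelerating even a "small little kernel" can transform the economics of an entire workload.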

Near-monopoly market share

With years of experience in making hardware for AI training, Nvidia gained the know-how, so when the AI boom started, it had precisely what companies needed.

According to Germany-based IoT Analytics, Nvidia currently has a 92% market share in data center GPUs.

The huge demand for Nvidia's chips has also caused a chip shortage.

Last year, Daniel Newman, an analyst at Futurum Group, told The New York Times that companies were waiting 18 months to get Nvidia's latest Hopper-architecture chips, the H100, rather than buying from competitors.

Even though the chip shortage has eased, Nvidia's chips are still in high demand. All the major players, including OpenAI, Google, Meta, and Amazon, are training their AI models on Nvidia's offerings.

This year, the company introduced chips based on the Blackwell architecture. Following the Q1 financial results, Huang said that the company is poised for the next wave of growth.

The Blackwell-architecture chips are expected to be twice as powerful as the latest Hopper generation and cost around $30,000 to $40,000 each.

While tens of thousands of dollars may seem high compared to the GPUs found in PCs, for customers, the investment pays off.

"If chips can reduce the time of training a large language model in a five-billion-dollar data center, then the savings are more than the cost of all the chips," he explained to The New York Times.

Competition heats up

While Nvidia is poised for the next phase of growth, competition in the AI training market is heating up. Nobody wants to be dependent on one supplier and wait months to get the hardware for AI training in such a fast-moving field.

In April, Google and Meta also announced new in-house chips for training AI. While they lag behind Nvidia, their chips have one advantage: the hardware can be specifically tailored to their own AI models. Over time, the deep-pocketed tech giants will likely close the gap.

Some of the biggest tech companies, including Meta, Google, and Microsoft, are contributing to the development of Triton, software released by OpenAI that is designed to make code run on a wide range of AI chips, positioning it as a competitor to CUDA.

Intel and AMD, Nvidia's main competitors, are trying to catch up. Last month, Intel announced its latest Gaudi chip, which the company says delivers 50% better inference performance and 40% better power efficiency on average than Nvidia's H100, at a fraction of the cost.

Startups are also offering some promising chips. Cerebras Systems recently introduced its CS-3, which the company says delivers twice the performance of Nvidia's H100 at the same cost.

In the future, competition from Chinese manufacturers will also increase. Huawei offers its Ascend chips for AI training, which are currently inferior to Western offerings. But with billions of dollars being invested in China's semiconductor industry, they will likely improve.

For now, Nvidia holds a crucial advantage over its competitors: years of expertise and know-how. Whether it will be enough to maintain the lead is yet to be seen.

