NVIDIA Corporation (NVDA.US) earnings call: the GB200 ramp is progressing smoothly, and the company will continue to scale up to meet customer demand for Blackwell.
27/02/2025
GMT Eight
After the market closed on Wednesday Eastern Time, NVIDIA Corporation (NVDA.US) said on its fiscal fourth quarter 2025 earnings conference call that the GB200 ramp is progressing smoothly and that, despite complex challenges, demand is strong. The company has successfully ramped production of Grace Blackwell, generating about $11 billion in revenue last quarter, and will continue to scale up to meet customer demand for Blackwell systems.
Regarding gross margin, NVIDIA Corporation's Executive Vice President and Chief Financial Officer, Colette Kress, said that during the Blackwell ramp, gross margin will be in the low 70s. Right now the company is focused on expediting manufacturing to ensure prompt delivery to customers. Once Blackwell is fully ramped, the company can reduce costs and improve gross margin, which is expected to reach the mid-70s later this year.
Regarding the follow-on product, Blackwell Ultra, Jensen Huang said that the minor issues with the first-generation Blackwell have been fully resolved, and the next wave will launch on an annual rhythm. Blackwell Ultra comes with new networking, new memory, and new processors, all of which will be coming online. At NVIDIA Corporation's GTC event in March, Blackwell Ultra, Vera Rubin, and more exciting new products will be unveiled.
Regarding future market demand, he expressed confidence across short-term, medium-term, and long-term signals. Short-term signals include purchase orders and forecasts. Medium-term signals involve the expansion of infrastructure and capital expenditure compared with previous years. Long-term signals stem from the fact that software has fundamentally shifted from hand-written code running on CPUs to machine learning and AI-based software running on GPUs and accelerated computing systems.
Q&A
Q: I would like to ask Jensen: with test-time compute and reinforcement learning showing such great potential, we are clearly seeing the boundary between training and inference blur. What does this mean for potential future dedicated inference clusters, and how do you see the overall impact on NVIDIA and your customers?
A: There are several scaling laws now. There is the pre-training scaling law, which will continue to scale as we bring in multimodal data and data derived from reasoning that now feeds pre-training. Then there is the second one, post-training scaling, using reinforcement learning with human feedback, reinforcement learning with AI feedback, and reinforcement learning with verifiable rewards; the amount of computation used for post-training is actually higher than for pre-training.
In a sense this makes sense, because with reinforcement learning you can generate an enormous amount of synthetic data, or synthetically generated tokens, and the AI model is essentially trained on the tokens it generates.
The third part, which you mentioned, is test-time compute, or long-thinking reasoning: you have chains of thought, you have search. The number of tokens generated, and the inference computation required, is already 100 times what the one-shot examples and capabilities of the original large language models demanded, and that is just the beginning. The next generation could require thousands of times more, and hopefully we will see very thoughtful, simulation-based and search-based models whose computation could be tens of thousands to millions of times today's. So that is what lies in our future. The question is how to design such architectures: some models are autoregressive, some are diffusion-based; sometimes you want your data center to have disaggregated inference, sometimes it is compacted.
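To put rough numbers on that multiplication: a common back-of-the-envelope estimate (an assumption here, not a figure from the call) is that a transformer forward pass costs on the order of 2 × parameter-count FLOPs per generated token, so inference compute grows roughly linearly with the tokens generated. A minimal sketch, with a hypothetical model size and token counts:

```python
# Back-of-the-envelope: why long chains of thought inflate inference compute.
# Assumes the common ~2 * params FLOPs-per-generated-token estimate for a
# transformer forward pass; the model size and token counts are illustrative.

PARAMS = 70e9  # hypothetical 70B-parameter model

def inference_flops(tokens_generated: int, params: float = PARAMS) -> float:
    """Approximate forward-pass FLOPs to generate the given number of tokens."""
    return 2 * params * tokens_generated

one_shot = inference_flops(200)           # short, one-shot answer
long_reasoning = inference_flops(20_000)  # chain-of-thought trace, 100x the tokens

print(f"one-shot:       {one_shot:.2e} FLOPs")
print(f"long reasoning: {long_reasoning:.2e} FLOPs")
print(f"ratio: {long_reasoning / one_shot:.0f}x")  # -> 100x
```

Under this assumption, a 100x increase in generated tokens means roughly a 100x increase in inference compute per query, which is the multiplication the answer is pointing at.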
So it is difficult to determine the best configuration for a data center, and that is why NVIDIA's architecture is so popular. We run every model. We excel at training. The vast majority of our computing today is actually inference, and Blackwell takes all of this to a new level. We designed Blackwell with reasoning models in mind. When you look at training, it is many times more performant, but what is truly amazing is that for long-thinking, test-time-scaling reasoning AI models it is tens of times faster, with 25 times higher throughput. So Blackwell will be incredible across the board. When you have a data center, it lets you configure and use the same machines for more pre-training, more post-training, or scaled-out inference; our architecture is fungible and easy to use in all of these different ways. As a result, we are actually seeing a far more concentrated use of a unified architecture than ever before.
Q: Can you talk about the GB200 ramp since CES? You spoke to some of the complexity of rack-scale systems and the challenges you face, and, as you said in the prepared remarks, we have seen broad general availability. Where are you on that ramp? Beyond the chip level, are there still bottlenecks to consider at the system level? And are you still as optimistic about the NVL72 platform as you were?
A: I am more optimistic today than I was at CES, and the reason is that we are shipping more. Roughly 350 plants manufacture the 1.5 million components that go into each Grace Blackwell rack. Yes, it is extremely complicated, but we have successfully and incredibly ramped up Blackwell, delivering about $11 billion in revenue last quarter. We will have to keep scaling up because demand is very high, and customers are anxious to get their Blackwell systems. You may have seen online quite a few celebrations of Grace Blackwell systems coming up; CoreWeave has publicly announced its successful bring-up, as have Microsoft and OpenAI, and you are starting to see many coming online. What we are doing is very complex, we are delivering on time, and our partners are doing very well too.
Q: Is the first quarter the bottom for gross margin? What gives you confidence that strong demand can continue into next year? And has anything from DeepSeek and its innovations changed that view in any way?
A: During the Blackwell ramp, our gross margin will be in the low 70s. At this point we are focused on expediting our manufacturing, to make sure we can get product to customers quickly. Once Blackwell is fully ramped, we can reduce our costs and improve our gross margin, so we expect to be in the mid-70s later this year. Recall what you heard Jensen say about these systems and their complexity: in some cases they are customizable, and they have multiple networking and cooling options. So we know there are opportunities to improve gross margin going forward, but for now we are going to concentrate on completing manufacturing and delivering to our customers as quickly as possible.
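As a sketch of the margin mechanics described here: gross margin is (revenue − cost of goods sold) / revenue, so with pricing held constant, lower unit costs after the ramp lift the margin. The "low 70s" and "mid-70s" endpoints are from the call; the revenue units and the size of the cost reduction below are hypothetical:

```python
# Toy model of the gross-margin trajectory described on the call.
# gross margin = (revenue - COGS) / revenue. The ~71% and ~75% endpoints
# mirror "low 70s" and "mid-70s"; the revenue units and the size of the
# post-ramp cost reduction are hypothetical.

def gross_margin(revenue: float, cogs: float) -> float:
    return (revenue - cogs) / revenue

revenue = 100.0          # hypothetical revenue units
cogs_during_ramp = 29.0  # implies a ~71% margin ("low 70s")
print(f"during ramp: {gross_margin(revenue, cogs_during_ramp):.1%}")

# Once Blackwell is fully ramped, lower unit costs lift the margin.
cogs_after_ramp = 25.0   # a ~14% cost reduction -> ~75% margin ("mid-70s")
print(f"after ramp:  {gross_margin(revenue, cogs_after_ramp):.1%}")
```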
We have a fairly good understanding of the capital investment going into data center construction. We know that, going forward, the vast majority of software will be based on machine learning, and that there are new opportunities across AI, whether agentic AI, reasoning AI, or physical AI. The startup scene remains very active, and each of these companies requires a significant amount of computing infrastructure. So whether the signals are short-term, medium-term, or long-term, we fundamentally know that software has changed: from hand-written code running on CPUs to machine learning and AI-based software running on GPUs and accelerated computing. Accelerated computing systems are the future of software. In terms of applications, first came search, some consumer generative AI, and advertising, a bit like the early days of software. The next waves are coming: agentic AI for enterprises, physical AI for robotics, and sovereign AI as different regions build AI for their own ecosystems. These are all just beginning, and we are just getting started.
Q: Blackwell Ultra is scheduled to launch in the second half of this year, consistent with the team's annual product cadence. Given that you are still ramping the current-generation Blackwell solution, can you help us understand the demand dynamics for Ultra? How will your customers and the supply chain manage ramping these two products at the same time? And is the team still on track to execute Blackwell Ultra in the second half of this year?
A: Blackwell Ultra is on for the second half of the year. As you know, with the first Blackwell we hit some minor issues that may have cost us a couple of months. We have certainly fully recovered; the team did a great job, and all our supply chain partners and many others helped us recover at the speed of light. So we have now successfully ramped production of Blackwell. But that does not stop the next train, which runs on an annual rhythm: Blackwell Ultra, with new networking, new memory, and of course new processors, is all coming online.
This time the system architecture is identical between Blackwell and Blackwell Ultra. The transition from Hopper to Blackwell was much harder, because we moved from an NVLink 8 system to an NVL72-based system; the chassis, the system architecture, the hardware, the power delivery, everything had to change. That was quite a challenging transition.
The next transition will be faster: Blackwell Ultra will slot right in. We have also already revealed, and have been working closely with all our partners on, the product after that, Vera Rubin, and all our partners are getting up to speed on that transition. So come GTC, I will walk you through Blackwell Ultra, Vera Rubin, and some really exciting new products.
Q: Can you talk about the balance between custom ASICs and merchant GPUs? We have heard some news about heterogeneous superclusters that use both GPUs and ASICs. Do customers plan to build out such infrastructures, or will these infrastructures remain fairly homogeneous?
A: What we build is very different from an ASIC; in some ways completely different, in some areas overlapping. We differ in several respects. First, NVIDIA's architecture is general purpose. Whether you are optimizing for autoregressive models, diffusion-based models, vision-based models, multimodal models, or text models, it performs well across all of them. We are good at all of it because our software stack is so rich and our architecture is so flexible that we are the initial target for most of the exciting innovations and algorithms. So by definition we are far more general than narrow. Second, we are very good end to end: from data processing and training-data curation, to training, to reinforcement learning for post-training, all the way to test-time scaling for inference. We are general, end to end, and we are everywhere. Because we are not in just one cloud, we are in every cloud; we can be on-premises; we can be in a robot. Our architecture is far more accessible, and for anyone starting a new company it is a great initial target. The third thing I would say is that our performance and our cadence are incredibly fast. Remember, these data centers are always fixed in size.
They are fixed in size, or fixed in power. If our performance per watt is 2 times, 4 times, or 8 times better, which is not uncommon, that translates directly into revenue. So if you have a 100-megawatt data center and the performance or throughput of that 100-megawatt data center is 4 or 8 times higher, your revenue from that data center is 4 or 8 times higher. The reason this is so different from data centers of the past is that AI factories directly monetize the tokens they generate. So the token throughput of our architecture being so fast is incredibly valuable to all the companies building these things for revenue generation and a fast return on investment. That, I think, is the third reason: performance.
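The underlying arithmetic is simple: with the power budget fixed, tokens per watt multiplies one-for-one into token revenue. A minimal sketch; the throughput and token-pricing figures are hypothetical, and only the 100 MW budget and the 4x/8x factors echo the call:

```python
# Fixed-power revenue arithmetic: at a fixed power budget, any gain in
# tokens-per-watt passes straight through to token revenue. Throughput
# and price are hypothetical; only the 100 MW budget and the 4x/8x
# performance-per-watt factors echo the call.

POWER_WATTS = 100e6           # 100 MW data center, fixed
PRICE_PER_M_TOKENS = 2.0      # hypothetical $ per million tokens served
SECONDS_PER_YEAR = 365 * 24 * 3600

def annual_revenue(tokens_per_watt_second: float) -> float:
    tokens_per_year = POWER_WATTS * tokens_per_watt_second * SECONDS_PER_YEAR
    return tokens_per_year / 1e6 * PRICE_PER_M_TOKENS

baseline = annual_revenue(0.01)  # hypothetical baseline throughput
print(f"baseline: ${baseline:,.0f}/year")
for factor in (4, 8):
    improved = annual_revenue(0.01 * factor)
    print(f"{factor}x perf/watt -> {improved / baseline:.0f}x revenue")
```

Because power is the binding constraint, the ratio of revenues equals the ratio of performance per watt, which is the point being made.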
The last thing I would say is that the software stack is incredibly hard. Building an ASIC is no different from what we do: we build a new architecture, and the ecosystem that sits on top of our architecture today is 10 times more complex than it was two years ago. That is fairly obvious, because the amount of software being built on top of our architecture is growing exponentially and AI is advancing very quickly. Bringing that whole ecosystem up on top of multiple chips is very hard. So those are the four reasons. Finally, just because a chip is designed does not mean it gets deployed. You have seen this over and over again: plenty of chips get built, but when the time comes, a business decision has to be made, and that decision is about deploying a new engine, a new processor, into an AI factory that is limited in size, and [unclear] and refined. Our technology is not only more advanced and more performant, with much better software capability, but very importantly, our ability to deploy is at the speed of light.
Q: You did a great job explaining some of the underlying demand factors behind the roughly $5 billion increase in the U.S. I think some people are concerned about whether, if other regions face regulation, the U.S. can make up that gap. As we move through the year, will the U.S. continue to see this surge, and can it? If that is the basis of your growth, how do you maintain such rapid growth as this shift toward the U.S. plays out? Your guidance also suggests that China may continue to grow. I just wanted your read on this dynamic.
A: China accounts for roughly the same percentage as in the fourth quarter and in prior quarters, about half of what it was before export controls. On geography, the key point is that AI is software. It is modern software, incredibly modern software, and AI has gone mainstream. AI is used everywhere in delivery services and everywhere in shopping services; if you buy a quart of milk, AI was involved in delivering it to you. AI has become pervasive.
So AI is at the core of almost everything consumers [unclear]. Every student will use AI as a tutor. Healthcare services use AI, and financial services use AI; there is not a fintech company out there that is not using AI, every fintech company makes use of it, and every higher-education institution, every university, is using AI. So I think it is fairly safe to say that AI is being integrated into every application [unclear][unclear]. Our hope, of course, is that the technology continues to advance safely and usefully [unclear]. With that, I truly believe we are at the beginning of this new transition. By the beginning, I mean that behind us [unclear] are decades of data centers and computers built for a world of hand coding, general-purpose computing, CPUs, and so on. Looking ahead, I think it is fairly safe to say that almost all the software of that world will be infused with AI. All software and all services will ultimately be based on machine learning, and the data flywheel will be part of improving that software and those services.
And future computers will be accelerated; future computers will be based on AI. We have really been on this journey for the past three years, modernizing computers that took decades to build out. So I am quite sure we are at the beginning of this new era. Finally, no technology has ever had the chance to address a larger part of the world's GDP than AI has. No software tool ever has. This is now a software tool that can address a much larger share of the world's GDP than at any time in history, and that is how we think about growth and about whether something is big or small.