GTC 2026: Nvidia Is the AI Infrastructure General Contractor and the Future Token Factory

2026.03.17 · 4 min read · Original

Company Breakdown · MINTOVIEW

Trillion-Dollar Order Pipeline

The biggest bombshell in Huang's keynote was pushing data center revenue visibility to a staggering $1 trillion. This is more than scale expansion; it resets prior market expectations. Nvidia had previously guided to "$500 billion in revenue by 2026"; the new forecast extends the outlook by a year and doubles the cumulative figure. The jump in guidance eases investor anxiety that AI capex might peak in 2026 and suggests that long-term demand for compute remains strong.

The confidence behind this trillion-dollar figure comes from a diversified customer base. Beyond traditional hyperscalers, Nvidia is accelerating into two new markets: Sovereign AI and Industrial AI. Goldman Sachs notes that Nvidia's data center business now has extremely high long-term revenue visibility, covering full-stack compute and networking across the Blackwell and Rubin architectures. In other words, Nvidia is shifting from selling "tools" to selling "productivity": every order represents an investment that enterprises and nations regard as mandatory to secure a ticket to the AI era.

From Selling Chips to Delivering an Entire "AI Factory"

The most critical narrative shift at this GTC is that Nvidia no longer sells cards; it sells entire "AI factories." Huang emphasized that Vera Rubin is not a single chip but a complete AI supercomputer platform composed of 7 chips and 5 rack systems, built through "extreme co-design" at infrastructure scale. The competitive logic has therefore moved from single-card performance to system integration capability. Previously, the market compared who had the stronger engine; now the question is who can build a smart assembly line that runs fast, dissipates heat well, draws little power, and delivers high output.

In this system, the Vera CPU has been elevated to an unprecedented role. This newly designed Arm-based processor integrates 256 cores per rack, doubling compute efficiency and improving speed by 50% compared to traditional CPUs. It is no longer a mere assistant but is deeply coupled with the GPU, managing complex agent logic, data coordination, and efficient network flow. This full-stack layout allows Nvidia to provide comprehensive solutions spanning chips, systems, and software scheduling.

Heterogeneous Inference and the Strategic Role of Groq 3

To prepare for the explosion of agentic AI, Huang repeatedly emphasized the concept of the "Token Factory." In the future, competition will center less on parameter counts or peak performance and more on how many tokens can be produced per kilowatt-hour and how much revenue each data center can generate. To this end, Nvidia introduced a heterogeneous inference solution, officially launching the Groq 3 LPX rack system based on Groq infrastructure. Manufactured by Samsung, the system integrates 256 LPU processors, and its headline specifications are 128 GB of on-chip SRAM and 640 TB/s of scale-up bandwidth.
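The two "Token Factory" metrics named above, tokens per kilowatt-hour and revenue per data center, can be made concrete with a small calculation. The sketch below is illustrative only: the `TokenFactory` class, its field names, and the sample throughput, power, and price figures are assumptions invented for this example, not Nvidia data. The only numbers taken from the article are the rack-level totals (256 LPUs, 128 GB of SRAM, 640 TB/s of bandwidth), which are divided evenly across the LPUs as a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class TokenFactory:
    """Hypothetical description of an inference site; every field is illustrative."""
    tokens_per_second: float         # sustained output across the whole site
    power_kw: float                  # average electrical draw in kilowatts
    price_per_million_tokens: float  # revenue per 1e6 tokens, in USD

    def tokens_per_kwh(self) -> float:
        # A site drawing power_kw for one hour consumes power_kw kWh,
        # so divide the tokens produced in that hour by power_kw.
        return self.tokens_per_second * 3600 / self.power_kw

    def revenue_per_hour(self) -> float:
        # Tokens produced in one hour, in millions, times the price per million.
        return self.tokens_per_second * 3600 / 1e6 * self.price_per_million_tokens


def per_lpu_share(rack_total: float, lpus: int = 256) -> float:
    """Split a rack-level total evenly across LPUs (an even split is assumed)."""
    return rack_total / lpus


# Made-up site: 1M tokens/s, 1 MW average draw, $2 per million tokens.
site = TokenFactory(tokens_per_second=1_000_000, power_kw=1_000,
                    price_per_million_tokens=2.0)
print(site.tokens_per_kwh())    # tokens produced per kWh
print(site.revenue_per_hour())  # USD per hour

# Rack totals quoted in the article, divided across 256 LPUs.
print(per_lpu_share(128))  # GB of SRAM per LPU
print(per_lpu_share(640))  # TB/s of bandwidth per LPU
```

Under the even-split assumption, the quoted rack totals work out to 0.5 GB of on-chip SRAM and 2.5 TB/s of bandwidth per LPU.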

The core logic of this design is "specialization": tasks requiring large memory and heavy computation go to Rubin, while decode tasks requiring ultra-low latency and fast token generation go to Groq. The combination of the Rubin platform and LPX racks delivers a staggering 35x improvement in inference throughput over the Blackwell platform. This marks a shift in the AI infrastructure bottleneck from "is there enough compute" to "can the entire system produce tokens cheaper, faster, and more stably at scale."
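The "specialization" split described above can be pictured as a simple routing rule. The following is a minimal sketch, not Nvidia's scheduler: the names `Phase`, `Pool`, `route`, and `schedule` are hypothetical, and labeling the memory- and compute-heavy stage "prefill" is an assumption borrowed from common inference terminology (the article only names the decode side).

```python
from enum import Enum


class Phase(Enum):
    PREFILL = "prefill"  # assumed name for the memory- and compute-heavy stage
    DECODE = "decode"    # latency-sensitive, token-by-token generation


class Pool(Enum):
    RUBIN = "rubin"  # large-memory, heavy-compute racks
    LPX = "lpx"      # Groq 3 LPX racks for low-latency decode


def route(phase: Phase) -> Pool:
    """Send heavy work to Rubin and latency-critical decode to LPX."""
    return Pool.RUBIN if phase is Phase.PREFILL else Pool.LPX


def schedule() -> list[tuple[Phase, Pool]]:
    """Ordered (phase, pool) steps for one request: prefill first, then decode."""
    return [(phase, route(phase)) for phase in (Phase.PREFILL, Phase.DECODE)]


for phase, pool in schedule():
    print(f"{phase.value} -> {pool.value}")
```

A real system would also have to hand the intermediate state produced by the first stage over to the decode pool; that transfer is omitted here to keep the routing rule itself visible.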

Agent Infrastructure and Graphics' "GPT Moment"

Huang is also pushing the deployment barrier for AI agents as low as possible. The newly launched NemoClaw positions itself as the infrastructure layer for agent platforms. It supports "one-click deployment" of always-on AI assistants and, by integrating the Nemotron model and sandbox capabilities, addresses the security and privacy concerns that enterprises care about most. This "safe agent farming" ecosystem lets AI evolve from a mere conversational tool into an autonomous entity capable of planning and executing tasks 24/7.

At the same time, Nvidia showcased its push into the physical world and outer space. The launch of the Space-1 Vera Rubin module marks the first deployment of data-center-grade AI compute on satellites and orbital data centers. It supports on-orbit inference and real-time geospatial intelligence and provides localized compute for autonomous space missions. In graphics, Huang hailed the release of DLSS 5 as the field's "GPT moment." It uses generative AI models to predict and complete scenes, allowing GPUs to render highly realistic characters and details without generating every pixel from scratch, redefining the boundaries of computer graphics.

Deep Revaluation of the Industry Investment Landscape: The Certainty of Physics

For investors, the post-GTC 2026 mindset must shift from simply staring at GPUs to deconstructing the entire "AI factory" line. The report provides a clear ranking of opportunities, with the core logic being "who is closest to order fulfillment":

Liquid cooling and power management are ranked first, as the Rubin architecture has shifted to 100% liquid cooling and full-rack deployment. This is no longer a concept story but a hard requirement driven by the laws of physics.
Interconnect and storage sectors also face revaluation. Despite market concerns about "optical replacing copper," Nvidia clarified that in the short to medium term, copper cables, AEC, and high-speed backplanes remain key to reducing cost and power consumption in rack-scale expansion, so optical and copper will coexist. In storage, because KV Cache management is becoming an inference bottleneck in long-context and agent scenarios, storage technologies close to the inference chain, including CXL memory tiering, are gaining significantly in importance.

Below are screenshots of some important messages: