
GTC 2026: Nvidia Is the AI Infrastructure General Contractor and the Future Token Factory

2026.03.17 · 4 min · Original
  1. Trillion-Dollar Order Pipeline

The biggest bombshell in Huang's keynote was extending data center revenue visibility to $1 trillion. This is more than growth in scale; it resets prior market expectations. Nvidia had previously guided to "$500 billion in revenue by 2026"; the new forecast extends the outlook by a year and doubles the cumulative figure. This step-up in guidance should ease investor anxiety that AI capex might peak in 2026, and it suggests that the long-term demand engine for compute remains strong.

The confidence behind this trillion-dollar figure comes from a diversified customer base. Beyond traditional hyperscalers, Nvidia is accelerating into two new battlefields: Sovereign AI and Industrial AI. Goldman Sachs notes that Nvidia's data center business now has extremely high long-term revenue visibility, covering the full-stack compute and networking of the Blackwell and Rubin architectures. This means Nvidia is shifting from selling "tools" to selling "productivity": every order represents an investment that enterprises and nations treat as mandatory to secure a ticket to the AI era.

  2. From Selling Chips to Delivering an Entire "AI Factory"

The most critical narrative shift at this GTC is that Nvidia no longer sells cards; it sells entire "AI factories." Huang emphasized that Vera Rubin is not a single chip but a complete AI supercomputer platform composed of 7 chips and 5 rack systems, built through "extreme co-design" at infrastructure scale. The competitive logic has therefore moved from single-card performance to system integration capability. Previously, the market compared who had the stronger engine; now the question is who can build a smart assembly line that runs fast, dissipates heat well, draws little power, and produces high output.

In this system, the Vera CPU takes on a more prominent role than any previous Nvidia CPU. This newly designed Arm-based processor integrates 256 cores per rack, doubling compute efficiency and improving speed by 50% compared to traditional CPUs. It is no longer a mere assistant to the GPU but is deeply coupled with it, managing complex agent logic, data coordination, and network traffic. This full-stack layout lets Nvidia offer complete solutions spanning chips, systems, and software scheduling.

  3. Heterogeneous Inference and the Strategic Role of Groq 3

To prepare for the surge of the Agentic AI era, Huang repeatedly emphasized the concept of the "Token Factory." Competition will no longer center on parameter counts or peak performance, but on how many tokens can be produced per kilowatt-hour and how much revenue each data center can generate. To this end, Nvidia introduced a heterogeneous inference solution, officially launching the Groq 3 LPX rack system based on Groq infrastructure. Manufactured by Samsung, the system integrates 256 LPU processors, and its headline feature is 128 GB of on-chip SRAM and 640 TB/s of expansion bandwidth.
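The two "Token Factory" metrics named above, tokens per kilowatt-hour and revenue per data center, reduce to simple arithmetic. The following is a minimal sketch, not a method from the keynote: the function names, the utilization parameter, and every input value a caller passes in are hypothetical and chosen only for illustration.

```python
def tokens_per_kwh(tokens_per_second: float, power_kw: float) -> float:
    """Tokens produced per kilowatt-hour of energy drawn.

    A system drawing power_kw kilowatts for one hour consumes power_kw kWh,
    and in that hour it produces tokens_per_second * 3600 tokens.
    """
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    return tokens_per_second * 3600 / power_kw


def annual_token_revenue(
    tokens_per_second: float,
    price_per_million_tokens: float,
    utilization: float = 1.0,
) -> float:
    """Yearly revenue from selling tokens at a flat price (an assumed pricing model)."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    seconds_per_year = 365 * 24 * 3600
    tokens = tokens_per_second * seconds_per_year * utilization
    return tokens / 1_000_000 * price_per_million_tokens
```

Under this framing, two data centers with the same power budget are compared by tokens per kilowatt-hour first, and by the price those tokens command second.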

The core logic of this design is specialization: tasks requiring large memory and heavy computation go to Rubin, while decode tasks requiring ultra-low latency and fast token generation go to Groq. The combination of the Rubin platform and LPX racks delivers a 35x improvement in inference throughput over the Blackwell platform. This marks a shift in the AI infrastructure bottleneck from "is there enough compute" to "can the entire system produce tokens cheaper, faster, and more stably at scale."
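The division of labor described above can be pictured as a small dispatcher. This is only a sketch under stated assumptions: it treats the "large memory and heavy computation" work as the prefill phase of inference, and the `Phase`, `Pool`, and `route` names are hypothetical, not part of any Nvidia or Groq API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    PREFILL = "prefill"  # assumed to be the memory- and compute-heavy stage
    DECODE = "decode"    # latency-sensitive token-by-token generation


@dataclass
class Pool:
    """A hypothetical pool of accelerators with a simple work queue."""
    name: str
    queue: list = field(default_factory=list)


def route(request_id: str, phase: Phase, rubin: Pool, lpx: Pool) -> str:
    """Send prefill work to the Rubin pool and decode work to the LPX pool."""
    target = rubin if phase is Phase.PREFILL else lpx
    target.queue.append(request_id)
    return target.name
```

A real scheduler would also have to move the intermediate state between the two pools, which this sketch leaves out.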

  4. Agent Infrastructure and Graphics' "GPT Moment"

Huang is lowering the deployment barrier for AI agents as far as it will go. The newly launched NemoClaw positions itself as the infrastructure layer for agent platforms. It supports "one-click deployment" of always-on AI assistants and, by integrating the Nemotron model and sandbox capabilities, addresses the security and privacy concerns that enterprises care about most. This "safe agent farming" ecosystem lets AI evolve from a conversational tool into an autonomous entity capable of planning and executing tasks 24/7.

At the same time, Nvidia showcased its push into the physical world and outer space. The launch of the Space-1 Vera Rubin module marks the first deployment of data-center-grade AI compute on satellites and orbital data centers. It supports on-orbit inference and real-time geospatial intelligence, providing local compute for autonomous space missions. In graphics, Huang hailed the release of DLSS 5 as the field's "GPT moment." It uses generative AI models to predict and complete scenes, allowing GPUs to render highly realistic characters and details without generating every pixel from scratch, and in doing so it redefines the boundaries of computer graphics.

  5. Deep Revaluation of the Industry Investment Landscape: The Certainty of Physics

For investors, the post-GTC 2026 mindset must shift from watching GPUs alone to breaking down the entire "AI factory" production line. The report provides a clear ranking of opportunities, with the core logic being "who is closest to order fulfillment":

  • Liquid cooling and power management rank first, because the Rubin architecture has shifted to 100% liquid cooling and full-rack deployment. This is no longer a concept story but a hard requirement driven by the laws of physics.
  • Interconnect and storage also face revaluation. Despite market concerns about "optical replacing copper," Nvidia clarified that in the short to medium term, copper cables, AEC, and high-speed backplanes remain key to reducing cost and power consumption in rack-scale expansion, so optical and copper will coexist. In storage, because KV cache management is becoming an inference bottleneck in long-context and agent scenarios, technologies that sit close to the inference path, such as HBM and CXL memory tiering, are gaining significantly in importance.
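To see why the KV cache becomes a bottleneck in long-context scenarios, it helps to estimate its size. The sketch below uses the usual accounting for a transformer that caches one key and one value tensor per layer; the function name and any model dimensions passed to it are hypothetical and do not describe a specific Nvidia or Groq product.

```python
def kv_cache_bytes(
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    seq_len: int,
    batch_size: int = 1,
    bytes_per_value: int = 2,  # 2 bytes assumes 16-bit values such as FP16 or BF16
) -> int:
    """Estimated KV cache size in bytes for one batch of sequences.

    The leading factor of 2 accounts for storing both keys and values
    in every layer.
    """
    return (
        2
        * num_layers
        * num_kv_heads
        * head_dim
        * seq_len
        * batch_size
        * bytes_per_value
    )
```

Because the estimate grows linearly with both sequence length and batch size, long-context and multi-agent workloads quickly push the cache beyond what fits in the fastest memory tier, which is the pressure behind the interest in memory tiering noted above.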

Below are screenshots of some important messages:

Same as last time, I recommend watching the original video:

【NVIDIA CEO Jensen Huang Keynote | GTC 2026】 https://www.bilibili.com/video/BV1AbwSzeEKD/?share_source=copy_web&vd_source=87e6382a87d2f6199e818090fd0cdfd7

明投 Minto
Investment analysis · Long-termist