NVIDIA cut through the noise at CES 2026 with the unveiling of its new Rubin platform, a major leap forward in AI computing. NVIDIA CEO Jensen Huang took the stage in Las Vegas to announce the new architecture, which promises to drive down the cost of AI significantly while raising performance. With Rubin, NVIDIA is aggressively positioning itself to lead the next generation of AI infrastructure.
Rubin platform delivers unprecedented AI performance gains
The Rubin platform is NVIDIA's first extreme-codesigned, six-chip AI platform, and it is now entering mass production. The design promises a 10x reduction in inference token costs compared with the previous Blackwell platform and a 75 percent reduction in the GPUs needed to train Mixture-of-Experts models. The platform's six chips work together to remove bottlenecks in AI workloads.
Codesigning hardware and software across the full solution is what enables these dramatic performance gains. The platform comprises Rubin GPUs delivering 50 petaflops of NVFP4 inference performance, Vera CPUs optimized for data movement, NVLink 6 scale-up interconnects, Spectrum-X Ethernet Photonics, ConnectX-9 SuperNICs, and BlueField-4 DPUs. Together, this codesign maximizes the efficiency of deployed AI infrastructure.
Six-chip architecture revolutionizes AI supercomputing design
The Rubin architecture introduces innovations across five technology segments that redefine AI computing. Sixth-generation NVLink provides 3.6 TB/s of bandwidth per GPU, and the Vera Rubin NVL72 rack delivers a staggering 260 TB/s in aggregate. The architecture builds in in-network compute acceleration, serviceability improvements, and resiliency features that make AI training and inference markedly more efficient.
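The rack-level figure follows directly from the per-GPU bandwidth. A quick back-of-the-envelope check, assuming the "72" in NVL72 denotes 72 GPUs per rack (an inference from the product name, not stated explicitly above):

```python
# Sanity check of the NVL72 aggregate NVLink bandwidth quoted above.
# Assumption: an NVL72 rack aggregates 72 GPUs (inferred from the name).

PER_GPU_TBPS = 3.6    # NVLink 6 bandwidth per GPU, from the announcement
GPUS_PER_RACK = 72    # assumed GPU count per Vera Rubin NVL72 rack

aggregate_tbps = PER_GPU_TBPS * GPUS_PER_RACK
print(f"Aggregate rack bandwidth: {aggregate_tbps:.1f} TB/s")
```

The product of the two numbers comes to 259.2 TB/s, consistent with the rounded 260 TB/s figure in the announcement.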
The platform is named in memory of Vera Florence Cooper Rubin, the renowned American astronomer whose discoveries transformed humanity's understanding of the universe and dark matter. The name reflects NVIDIA's dedication to advancing knowledge of the world through technological innovation. The Rubin architecture also incorporates third-generation Transformer Engines, advanced confidential computing, and second-generation RAS Engines.
AI-native storage speeds up processing of inference contexts
The Inference Context Memory Storage Platform, built on BlueField-4 chips, introduces a new generation of AI-native storage infrastructure designed for the demands of gigascale computing. It allows key-value cache data to be shared across the AI infrastructure, delivering faster response times, higher throughput, and scalable processing for agentic AI workloads that demand intensive context handling.
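To see why sharing the key-value cache helps, consider that every request with a repeated prompt prefix would otherwise recompute the same attention state. A conceptual sketch of the idea, with hypothetical names that are not NVIDIA's actual API:

```python
# Conceptual sketch of sharing inference KV-cache entries across workers.
# Names and structure are hypothetical illustrations, not NVIDIA's API.
from typing import Dict, Optional, Tuple

class SharedKVCache:
    """Toy shared store mapping a prompt prefix to its cached KV state."""

    def __init__(self) -> None:
        self._store: Dict[str, Tuple[list, list]] = {}

    def put(self, prefix: str, keys: list, values: list) -> None:
        # Persist the attention keys/values computed for this prefix so
        # any worker can reuse them instead of redoing the prefill pass.
        self._store[prefix] = (keys, values)

    def get(self, prefix: str) -> Optional[Tuple[list, list]]:
        # A real system would match the longest cached prefix; this toy
        # version checks exact matches only to keep the sketch short.
        return self._store.get(prefix)

cache = SharedKVCache()
cache.put("You are a helpful assistant.", keys=[0.1, 0.2], values=[0.3, 0.4])
hit = cache.get("You are a helpful assistant.")
print("cache hit" if hit else "cache miss")
```

The payoff is that a cache hit skips the expensive prefill computation entirely, which is where the response-time and throughput gains come from.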
Industry leaders adopt next-generation AI infrastructure
Major tech companies and AI research labs are lining up to use the Rubin platform for their future computing needs. OpenAI, Anthropic, Meta, Microsoft, Google, and Amazon Web Services are among the leading companies planning to build Rubin systems into their AI infrastructure. These commitments signal deep confidence in NVIDIA's technical capability and in the Rubin platform's ability to accelerate AI development.
Partners will begin offering Rubin systems in the second half of 2026, and the top cloud providers are on track to deliver Rubin-based instances in response to customer demand for new AI capabilities. Microsoft will deploy Vera Rubin NVL72 rack-scale systems in its next-generation AI data centers, including its future Fairwater AI superfactory sites. CoreWeave will incorporate Rubin systems into its AI cloud service, enabling customers to harness new reasoning capabilities in mixture-of-experts models.
