In what was likely the most watched keynote in CES history, Nvidia CEO Jensen Huang took the stage at a packed Michelob Ultra Arena with a dizzying array of new technology announcements, from consumer devices like the new GeForce RTX 50 Series of gaming graphics cards, to a new autonomous vehicle platform called Thor built on the company’s latest Blackwell GPU technology, and more – much more. However, another star of the show, in my opinion, was a new generative Nvidia AI technology called Cosmos, which some people might have overlooked due to its complexity. I’d even venture to say that if Cosmos works out the way the company intends, it could be a launching pad for Nvidia’s robotics and autonomous vehicle business.
Understanding Nvidia Cosmos for physical AI
Nvidia calls Cosmos a “platform to accelerate physical AI development.” Simply put, you can think of physical AI as the brain behind anything robotic, whether it’s humanoid robots designed to navigate the world we live in, factory automation robots, or autonomous vehicles – robots optimized for navigating our roads while carrying people or cargo. However, training robotic AI is very labor- and resource-intensive, often requiring the capture, labeling, and categorization of millions of hours of human interaction in real-world environments, or millions of miles traveled on real roads around the world.
Nvidia Cosmos aims to partially solve this resource problem with a family of what the company calls “world foundation models,” or WFMs: AI neural networks that can generate accurate, physics-aware video of the future state of a virtual environment – or a multiverse, if you will. You can go ahead and cue Doctor Strange now; Jensen even referenced the Marvel character in his keynote presentation. It all sounds very deep, but it’s actually quite straightforward. WFMs are similar to large language models, but where LLMs are AI models trained for natural language recognition, generation, translation, and so on, WFMs use text, images, video content, and motion data to generate simulated virtual worlds and virtual world interactions with accurate spatial awareness, physics and physical interaction, and even object persistence. For example, if a bolt rolls off a table in a factory and can’t be seen in the current camera view, the AI model knows it’s still there – just on the floor now.
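To make that object-persistence idea concrete, here’s a toy, purely illustrative Python sketch (this is not Cosmos’s actual API, just a minimal world-model analogy): the world state keeps stepping physics for every object, whether or not the simulated camera can currently see it.

```python
from dataclasses import dataclass, field

GRAVITY = -9.8  # m/s^2

@dataclass
class Obj:
    name: str
    y: float          # height above the floor, in meters
    vy: float = 0.0   # vertical velocity

@dataclass
class World:
    """Toy 'world model': advances physics for all objects, seen or unseen."""
    objects: dict = field(default_factory=dict)

    def step(self, dt: float = 0.1) -> None:
        for o in self.objects.values():
            o.vy += GRAVITY * dt
            o.y = max(0.0, o.y + o.vy * dt)  # the floor is at y = 0
            if o.y == 0.0:
                o.vy = 0.0                   # object has landed

    def camera_view(self, min_height: float = 0.5) -> list:
        # This camera only sees objects at table height or above
        return [o.name for o in self.objects.values() if o.y >= min_height]

world = World()
world.objects["bolt"] = Obj("bolt", y=0.9)  # bolt at the edge of a 0.9 m table

for _ in range(20):  # simulate two seconds; the bolt falls out of frame
    world.step()

print(world.camera_view())       # [] -- the camera no longer sees the bolt...
print(world.objects["bolt"].y)   # 0.0 -- ...but the model knows it's on the floor
```

The point of the sketch is the separation between the full world state and the partial camera view: the “bolt” leaves the frame, but its position is still tracked, which is exactly the persistence property the generated training video needs to respect.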
Still with me? Good, because this is where it gets even more interesting. This new form of generating synthetic data to train physical AI, or robots, must be based on ground truth to be accurate. In other words, bad data means a corrupt model that is misleading or otherwise unreliable for generating training data for robotic AI. This is where Nvidia Omniverse, which the company announced a few years ago, comes into play.
Cosmos is designed to interface with Nvidia Omniverse Digital Twins
Nvidia’s Omniverse digital twin operating system allows companies and developers from almost any industry to simulate products, factories, robots, vehicles, and more in an environment designed to connect with industry-standard tools, from computer-aided design to animation and beyond. In fact, Nvidia also unveiled new Omniverse Blueprints at CES 2025 to help developers simulate fleets of robots for factories and warehouses (a blueprint called Mega), simulate autonomous vehicles, spatially stream large-scale industrial digital twins to the Apple Vision Pro headset, and visualize computer-aided engineering and physics in real time. The company pairs these with free tutorials for OpenUSD, or Universal Scene Description, the language that underpins Omniverse and allows the integration of industry-standard tools and content. Nvidia announced that several major players are adopting its Omniverse platform, from Cadence for semiconductor EDA design tools, to Altair and Ansys for computational fluid dynamics, among many others.
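For a sense of what OpenUSD actually looks like, here is a minimal, hand-written sketch of its human-readable text form (a `.usda` layer; the prim names here are made up for illustration) describing a simple robot body the way Omniverse-compatible tools might author it:

```usda
#usda 1.0
(
    defaultPrim = "FactoryRobot"
    metersPerUnit = 1.0
    upAxis = "Y"
)

def Xform "FactoryRobot" (
    kind = "component"
)
{
    def Cube "Chassis"
    {
        double size = 0.5
        double3 xformOp:translate = (0, 0.25, 0)
        uniform token[] xformOpOrder = ["xformOp:translate"]
    }
}
```

Because USD is a layered, composable scene description rather than a single monolithic file, CAD, animation, and simulation tools can each contribute their own layers to the same digital twin, which is what makes it a natural interchange format for Omniverse.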
Returning to Cosmos, we can now see Nvidia’s complete, unified solution for physical AI in robotics. Cosmos’ models take data from a digitized version of the real world and then generate AI training content from it. Although Cosmos’ models were trained on over 20 million hours of video data, according to Huang in his keynote, developers who want to train physical or robotic AI on their own digital twins and data can simulate in Omniverse, and then let Cosmos play out a multitude of synthetic realities for those robot AIs to train on.
Is Cosmos Another CUDA Moment for Nvidia?
At this point, I know what you’re thinking: training robots on simulated data and in simulated worlds – what could go wrong? No doubt, this technology is still in its infancy, but as the old saying goes, you have to start somewhere. The beauty of machine learning – though it’s prone to hallucinations and needs guardrails (for which Nvidia has well-documented tools and policies) – is that you can train and keep training until you’re sure you got it right. And the machine doesn’t sleep or take coffee breaks, not to mention it’s far more efficient than manually training an AI on human-generated and human-categorized content.
That said, years ago, when Nvidia first announced CUDA – the programming model that sparked the era of machine learning on GPU accelerators – the company went Johnny Appleseed, so to speak, making its tools available to developers from all walks of life, ultimately allowing CUDA to become the de facto standard for accelerating AI workloads in the data center. With Cosmos, Nvidia is once again making these world foundation models available to developers for free under its open model license; they are accessible on Hugging Face and in the company’s NGC catalog. The models will also soon be available as Nvidia Inference Microservices (NIMs), all of which will be accelerated on its DGX AI data center platforms and in edge AI devices, robots, and autonomous vehicles, via its DRIVE AGX Orin and Thor computing platforms for cars. Or, as Huang and company call it, Nvidia’s “three-computer solution” for robotics.
Nvidia notes that several big-name players in physical AI have already adopted Cosmos, from humanoid robot companies like 1X and XPENG, to Hillbot and Skild AI for general-purpose robots, to rideshare giant Uber, which is pairing Cosmos with its massive sets of driving data to help build AI models for the AV industry.
It might be a stretch to call this a “CUDA moment” for Nvidia, but the world leader in AI just dropped some seriously powerful new tools for physical AI developers, for free. I personally think it’s another masterstroke for Jensen Huang and his band of AI wizards. We’ll have to see how far down this artificial intelligence, multiverse robot rabbit hole Cosmos goes, and it should be fascinating to watch.