Join top executives in San Francisco on July 11-12, to hear how leaders are integrating and optimizing AI investments for success. Learn More

Vision AI is set to take a leap forward with new Metropolis framework capabilities from Nvidia. This application development framework focuses on microservices, as well as a suite of cloud-native workflows that empower users to build more efficient vision AI models.

Announced at this week’s Nvidia GTC 2023, Metropolis is joined by TAO Toolkit 5.0, a new version of the company’s toolkit for creating highly customized AI models, as well as an expanded Nvidia DeepStream, its data pipeline builder for vision AI applications and services.

Nvidia said that more than 1,000 companies seeking to create vision AI applications are presently utilizing the Metropolis developer tools to address operational challenges, sensor processing and IoT with vision AI. 

Enhanced workflow for industrial AI ecosystem

Nvidia presented the GTC community with significant expansions of Metropolis workflows aimed at placing greater AI capabilities and research within the reach of more developers. These expansions comprise the Nvidia TAO Toolkit, Metropolis microservices, and DeepStream SDK, as well as the Nvidia Isaac Sim synthetic data generation tool and robotics simulation applications.



Nvidia representatives said TAO 5.0 seeks to democratize cutting-edge vision AI capabilities, empowering individuals and businesses with advanced image processing and analysis tools. 

“All major infrastructure will become a robot, as Nvidia Metropolis helps automate the world’s most valuable physical processes and infrastructure,” Adam Scraba, director of product marketing at Nvidia, said during a GTC pre-briefing.  

With no-code accessibility to state-of-the-art vision AI models, Metropolis users can integrate AI-based workflows into their training methods, leveraging the capabilities of TAO 5.0 and Isaac Sim. 

PepsiCo employs digital twins

These workflows can be trained and subsequently deployed on any device utilizing CPU, GPU, MCU or DLA using the TAO 5.0 ONNX export service. In addition, Nvidia’s reference applications help users create fine-tuned workflows, generating AI-enhanced API calls for computer vision integrations.

Convenience food and beverage industry titan PepsiCo is already leveraging the capabilities of Nvidia Metropolis to streamline its operations. The company has successfully developed AI-driven digital twins for its distribution centers, employing the Nvidia Omniverse platform to visualize different setups within its facilities and determine their impact on operational efficiency before implementation in real-world scenarios. 

Likewise, Siemens, a prominent digitalization and industrial automation enterprise, has adopted Nvidia Metropolis to achieve next-level perception within its edge-based applications. By utilizing millions of sensors distributed across its factories, Siemens connected fleets of IoT devices and robots via the Metropolis ecosystem, ultimately integrating AI into its industrial computer vision workflow.

Empowering computer vision with low-code

The Nvidia TAO Toolkit is a low-code AI framework that supercharges vision AI model development for practically any developer, in any service, on any device. TAO 5.0 adds new features including vision transformer pretrained AI models, the ability to deploy models on any platform via standard ONNX export, automatic hyperparameter tuning through automated machine learning (AutoML), and AI-assisted data annotation. 

“TAO doesn’t generate any code but supports ‘bring your own model,’ where developers can import their custom model architectures and perform training, fine-tuning and optimization,” said Scraba.

TAO 5.0 now supports vision transformers for computer vision models and has been made open source for developers. Through REST APIs, developers can integrate TAO into any AI service, and TAO has already been integrated into services such as Google Vertex AI, AzureML, Azure Kubernetes and Amazon EKS. In addition, the service’s AutoML feature automates hyperparameter tuning for AI models. 
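TAO’s AutoML automates this kind of search internally; as a rough conceptual illustration only (none of these function names are TAO’s API), a minimal random-search loop over a hyperparameter space looks like this:

```python
import random

def random_search(train_and_score, space, trials=50, seed=0):
    """Minimal random-search AutoML loop: sample hyperparameters,
    score each candidate, keep the best."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(trials):
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = train_and_score(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy objective standing in for "train a model, return validation score";
# it peaks at lr=0.01 and batch_size=32.
def toy_objective(p):
    return -abs(p["lr"] - 0.01) - 0.1 * abs(p["batch_size"] - 32) / 32

space = {"lr": [0.001, 0.01, 0.1], "batch_size": [16, 32, 64]}
best, score = random_search(toy_objective, space)
print(best, score)
```

Production AutoML systems add smarter search strategies (Bayesian optimization, early stopping of weak trials), but the outer loop they automate has this shape.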

Empowering IoT and edge use cases

Embedded microcontroller firm STMicroelectronics has integrated TAO into its STM32Cube AI developer workflow, enabling the company to run sophisticated AI in the widespread IoT and edge use cases that STM32 microcontrollers power, within those chips’ compute and memory budgets. 

“We are excited to integrate TAO into our STM32 development workflow,” said Matthieu Durnerin, head of STM32 Cube.AI tools at STMicroelectronics. “Bringing the latest AI training tools to developers who have already developed over 11 billion STM32 MCUs will have a major impact on IoT and edge computing.”

The Nvidia DeepStream SDK is a key tool for developers looking to create vision AI applications across a spectrum of industries. The latest update in Nvidia DeepStream SDK is a new graph execution runtime (GXF) that allows developers to expand beyond the open-source GStreamer multimedia framework. DeepStream’s GXF allows users to build applications that require tight execution control, enabling advanced scheduling and critical thread management.
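To make the idea of explicit execution control concrete, here is a toy staged pipeline in Python, not DeepStream or GXF code: each stage runs on its own thread, bounded queues provide backpressure between stages, and a sentinel shuts everything down cleanly. The stage functions are hypothetical stand-ins for steps like decode, infer and annotate:

```python
import queue
import threading

def run_pipeline(frames, stages, maxsize=4):
    """Run `frames` through `stages`, one worker thread per stage.
    Bounded queues throttle fast producers; a sentinel signals shutdown."""
    SENTINEL = object()
    queues = [queue.Queue(maxsize) for _ in range(len(stages) + 1)]

    def worker(stage, q_in, q_out):
        while True:
            item = q_in.get()
            if item is SENTINEL:
                q_out.put(SENTINEL)  # propagate shutdown downstream
                break
            q_out.put(stage(item))

    threads = [threading.Thread(target=worker, args=(s, queues[i], queues[i + 1]))
               for i, s in enumerate(stages)]
    for t in threads:
        t.start()
    for f in frames:
        queues[0].put(f)
    queues[0].put(SENTINEL)

    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results

# Hypothetical two-stage pipeline standing in for decode -> annotate.
decoded = run_pipeline(range(5), [lambda x: x * 2, lambda x: x + 1])
print(decoded)  # [1, 3, 5, 7, 9]
```

Frame order is preserved here because each stage is a single FIFO worker; the kind of control a graph execution runtime adds is exactly this, scheduling, thread placement and queue sizing, made explicit and tunable.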

Perception is a vision thing

Adding perception to physical spaces often requires applying vision AI to numerous cameras covering multiple regions. For example, challenges in computer vision include monitoring the flow of packaged goods across a warehouse or analyzing individual customer flow across a large retail space. Metropolis Microservices aims to make these sophisticated vision AI tasks easier to integrate into users’ applications.

Metropolis Microservices comprises a suite of cloud-native tools to build multi-camera tracking apps through computer vision, utilizing a matrix of sensors to generate a centralized common perception. 
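The core fusion step can be sketched in a few lines. This is a simplified greedy illustration, not the Metropolis implementation: per-camera detections are assumed already projected into a shared floor-plane coordinate frame, and detections within a radius of an existing track are merged into one global track:

```python
import math

def fuse_detections(camera_detections, radius=1.0):
    """Greedy cross-camera fusion sketch: a detection joins the first
    global track within `radius` of it; otherwise it starts a new track.
    Coordinates are assumed to be in a shared world frame (meters)."""
    tracks = []
    for cam_id, detections in sorted(camera_detections.items()):
        for x, y in detections:
            for track in tracks:
                tx, ty = track["pos"]
                if math.hypot(x - tx, y - ty) <= radius:
                    track["cameras"].add(cam_id)
                    break
            else:
                tracks.append({"pos": (x, y), "cameras": {cam_id}})
    return tracks

# Two cameras see the same person near (2, 3); a second person is only
# visible to cam_b. All coordinates are hypothetical.
tracks = fuse_detections({
    "cam_a": [(2.0, 3.0)],
    "cam_b": [(2.4, 3.2), (10.0, 1.0)],
})
print(len(tracks))  # 3 detections collapse into 2 global tracks
```

Real multi-camera tracking replaces the greedy distance check with appearance re-identification and temporal association, but the goal is the same: many overlapping camera views reduced to one consistent set of tracked objects.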

Leading IT services company Infosys is using Nvidia Metropolis to build vision AI applications. The Nvidia TAO low-code training framework and pretrained models helped Infosys reduce its AI training efforts. 

“Metropolis enables us to deploy solutions faster and rapidly scale across stores and product lines while also getting much higher levels of accuracy than before,” said Balakrishnan DR, executive VP and head of AI and automation at Infosys. 

Metropolis Microservices, along with the DeepStream SDK, optimized the company’s vision processing pipeline throughput and cut overall solution costs, he said. Infosys can also generate troves of synthetic data with the Nvidia Omniverse replicator SDK to easily train AI models with new stock-keeping units and packaging. 

VentureBeat's mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact.