Landing AI, a company that provides enterprise-wide transformation programs and solutions for industrial AI applications, today announced the launch of Visual Prompting, a capability that makes it easier for users to build computer vision applications.
VentureBeat sat down with Andrew Ng, the founder and CEO of Landing AI, who is known worldwide for his expertise in the field of AI. In addition to learning more about Landing AI’s new Visual Prompting tool, we discussed how generative AI is inspiring innovations in other fields of AI, especially machine learning.
Ng said the company’s Visual Prompting capability simplifies and accelerates the creation of computer vision models. With this significant shift in the AI-development workflow, developers can leverage visual cues to quickly and efficiently label data, reducing the time needed for this crucial step.
How Visual Prompting works
“In the prompting workflow, you come up with a simple ‘visual prompt’ and, in seconds, can start getting predictions. This enables a much faster speed of development of applications,” Ng told VentureBeat. “The GPT-3 moment — where prompting makes it easy to develop new applications — isn’t here yet for computer vision, but I believe visual prompts will get us closer.”
Visual Prompting technology, he explained, enables users to specify a visual prompt by painting over object classes they wish the system to detect, using just one or a few images. The algorithm then immediately begins making inferences based on the user-provided prompt. If initial results are subpar, users can immediately re-enter the prompt to further refine their models, guiding the AI system toward better recognition by highlighting the specific pixels they want to improve.
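To make the paint-then-predict loop concrete, here is a deliberately simple sketch in Python. It is not Landing AI's algorithm, which the company says builds on pretrained vision transformers. This toy version assumes that each class can be told apart by color alone and assigns every pixel to the class whose painted pixels have the closest average color. The function names `class_means` and `predict`, the tiny image and the class labels are all hypothetical and exist only for illustration.

```python
# Toy illustration only: NOT Landing AI's algorithm. Pixels are RGB tuples,
# and a "prompt" is a handful of painted pixels tagged with a class name.
from math import dist


def class_means(image, prompt):
    """Average the colour of the painted pixels for each class."""
    sums, counts = {}, {}
    for (row, col), label in prompt.items():
        r, g, b = image[row][col]
        sr, sg, sb = sums.get(label, (0, 0, 0))
        sums[label] = (sr + r, sg + g, sb + b)
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(channel / counts[label] for channel in total)
        for label, total in sums.items()
    }


def predict(image, prompt):
    """Label every pixel with the class whose mean colour is closest."""
    means = class_means(image, prompt)
    return [
        [min(means, key=lambda label: dist(pixel, means[label])) for pixel in row]
        for row in image
    ]


# A 3x4 "image": dark background with a bright region at the top right.
DARK, BRIGHT = (20, 20, 20), (230, 220, 40)
image = [
    [DARK, DARK, BRIGHT, BRIGHT],
    [DARK, DARK, DARK, BRIGHT],
    [DARK, DARK, DARK, DARK],
]

# The user "paints" just two pixels, one per class.
prompt = {(0, 0): "background", (0, 3): "defect"}
labels = predict(image, prompt)

# Refinement: painting more pixels extends the prompt, then predict runs again.
prompt[(2, 3)] = "background"
refined = predict(image, prompt)
```

Because predictions come straight from the painted pixels with no separate training step, adding a few more strokes and re-running `predict` is the whole refinement loop in this sketch. A color-only rule like this one also cannot separate classes that differ only in shape.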
Visual Prompting is a feature of LandingLens, the company’s flagship product that makes computer vision easy for everyone to implement. LandingLens is an intuitive software platform that allows users to create, deploy and scale AI-powered industrial computer-vision applications — such as defect detection — faster and with higher accuracy.
How generative AI inspired Landing AI’s new tool
The technology is inspired by recent advances in generative AI text interfaces like ChatGPT, which make it quick to iterate on a text prompt until it yields useful results.
Traditionally, building artificial intelligence (AI) models has been a complex and lengthy endeavor, often involving multiple steps such as data labeling, model training and deployment before getting any predictions. Visual Prompting is designed to shorten that path, transforming the way computer vision systems are built.
“This powerful functionality empowers users to fine-tune their models with ease, resulting in faster iterations and greater accuracy,” Ng said. “If the results look good, you can also deploy to a cloud API endpoint in the cloud, perhaps in tens of seconds. This means that you can get a first model up and running perhaps in minutes or, at most, a small number of hours, and use that to keep iterating and improving its performance.”
Beyond ChatGPT: Applying NLP to computer vision
Ng believes that the success of text prompting in transforming natural language processing (NLP) has paved the way for explosive innovation in the field of computer vision.
“Many groups in computer vision are exploring how to take the ideas of text prompting and adapt them to vision. For example, Meta’s recent work on SAM (Segment Anything Model) was a great piece of work on using prompting for the task of image segmentation,” he said. “There’s still much work ahead, and I expect this technology to continue to rapidly improve through our work and the work of many others.”
As part of this expansive work, Landing AI is testing its Visual Prompting tool across multiple industries for various use cases. In one example, a leading multinational pharmaceutical company utilized the technology to estimate crystal shape and growth from laboratory images. This enabled them to develop models on a small dataset while removing the annotation burden.
Recently, therapeutic antibody discovery firm OmniAb used the LandingLens platform with Visual Prompting to analyze individual cells in honeycombs, a process that in the past has generally required hours of hand-labeling hundreds of hexagonal shapes.
“The greatest impact of Visual Prompting has been in use cases where it would be laborious to exhaustively label all features, such as cells in our high-throughput screening platform,” Bob Chen, senior director at OmniAb, said in a press release. “Thanks to Visual Prompting’s intuitive prompt interface, we can achieve high-quality results in a fraction of the time and with significantly reduced effort.”
Current challenges and Landing AI’s future plans
Ng said that, because the software is a beta release, it does not yet work for every use case. However, out of 40 use cases that Landing AI analyzed, Visual Prompting and its post-processing capabilities were sufficient for over two-thirds of them.
“One limitation of our current system is that it is better at distinguishing between classes with different textures/colors than shape features; this reflects a limitation of the pretrained vision transformers we’re using,” he said. “We’re continuing to work to improve the system.”
“We’ll keep improving Visual Prompting and are eager to engage with the community to keep developing this technology together,” Ng said. “As vision transformers improve, we’ll also keep looking into how to incorporate the latest ideas to further help our customers.”
Landing AI’s Visual Prompting tool is available as a public beta now. The company invites people to demo the tool on its website and see how it works to fine-tune AI models.
This story was revised 4/25/23 at 2:25 pm ET.