HumanSignal launches Adala open source framework for autonomous data labeling agents

HumanSignal, the company behind the widely used open source data labeling tool Label Studio, is expanding its efforts today with the launch of Adala, an open source framework for autonomous data labeling agents.

HumanSignal was previously known as Heartex and rebranded in June 2023 to draw attention to its core value proposition of adding humans into the loop for machine learning (ML) training. Data labeling is a foundational activity for training models and has historically been a very labor-intensive process. With Label Studio, data scientists get the tools to label different types of data, including text and video. With machine learning rapidly evolving, HumanSignal is aiming to shape the future of reliable, efficient data processing through its new open source Adala framework.

Adala is an acronym for Autonomous Data Labeling Agent. It is an approach that uses AI agents in a novel way to help accelerate and improve the data labeling process.

"We started to ask ourselves what it would mean to build what we call a reliable AI agent that you can trust," Michael Malyuk, cofounder and CEO of HumanSignal, told VentureBeat. "Adala is our response and is meant to help build autonomous reliable agents that are focused specifically on data processing tasks."

Image credit: HumanSignal

How Adala works to help accelerate the data labeling process

Adala agents are designed to learn and improve at data tasks like classification and labeling when provided with ground truth datasets. A ground truth dataset is the foundation for defining the data labels and can be developed using the Label Studio technology.

Malyuk explained that the Adala framework includes the concept of an environment, which defines how the agent learns; the ground truth is part of that environment. An Adala agent interacts with the environment and learns from it, and after multiple learning iterations the agent becomes a prediction engine. In the initial target use case for Adala, those predictions are used to label the rest of a dataset that isn't already labeled.

Adala agents are powered by what Malyuk referred to as a runtime, which is essentially a large language model (LLM). The runtime executes the task that has been assigned to the agent and returns the responses.
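To make the agent, environment and runtime relationship concrete, here is a minimal toy sketch in Python. It does not use Adala's actual API: the class names `Environment`, `KeywordRuntime` and `Agent` are hypothetical, and a simple keyword-voting function stands in for the LLM runtime purely for illustration. The agent predicts on the ground truth, the environment reports the mistakes, and the loop repeats until the agent can be used to label new data.

```python
from collections import Counter, defaultdict


class Environment:
    """Hypothetical environment: holds ground truth and reports the agent's mistakes."""

    def __init__(self, ground_truth):
        self.ground_truth = ground_truth  # list of (text, label) pairs

    def feedback(self, predict):
        # Return every ground truth example the agent currently gets wrong.
        return [(text, label) for text, label in self.ground_truth
                if predict(text) != label]


class KeywordRuntime:
    """Stand-in for an LLM runtime (assumption): votes using word -> label counts."""

    def __init__(self):
        self.votes = defaultdict(Counter)

    def run(self, text, default="unknown"):
        tally = Counter()
        for word in text.lower().split():
            tally.update(self.votes[word])
        return tally.most_common(1)[0][0] if tally else default


class Agent:
    """Hypothetical agent: learns from environment feedback, then predicts."""

    def __init__(self, environment, runtime):
        self.environment = environment
        self.runtime = runtime

    def predict(self, text):
        return self.runtime.run(text)

    def learn(self, max_iterations=5):
        # Repeat predict -> feedback -> update until no errors remain.
        for iteration in range(max_iterations):
            errors = self.environment.feedback(self.predict)
            if not errors:
                return iteration
            for text, label in errors:
                for word in text.lower().split():
                    self.runtime.votes[word][label] += 1
        return max_iterations


ground_truth = [
    ("great product love it", "positive"),
    ("terrible waste of money", "negative"),
    ("love the fast shipping", "positive"),
    ("terrible support", "negative"),
]
agent = Agent(Environment(ground_truth), KeywordRuntime())
iterations = agent.learn()

# After learning, the agent acts as a prediction engine on unlabeled data.
unlabeled = ["love this product", "terrible quality"]
labels = [agent.predict(text) for text in unlabeled]
print(iterations, labels)
```

In a real deployment the runtime would be an LLM call rather than a word counter, and how the agent updates itself from feedback would depend on the framework; the sketch only illustrates the loop the article describes.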

Nikolai Liubimov, CTO of HumanSignal, explained that the Adala framework architecture requires some form of storage, which will typically be a vector database. He noted that retrieving a data label that can be applied to new data is similar in many respects to how retrieval-augmented generation (RAG) works for LLMs.
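The retrieval idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Adala's implementation: `LabelStore` is a hypothetical in-memory stand-in for a vector database, and `embed` is a toy bag-of-words embedding where a real system would use learned embeddings. As in the retrieval step of RAG, the new item is embedded, the most similar stored example is found, and its label is reused.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class LabelStore:
    """Hypothetical in-memory stand-in for a vector database of labeled examples."""

    def __init__(self):
        self.items = []  # list of (vector, label) pairs

    def add(self, text, label):
        self.items.append((embed(text), label))

    def nearest_label(self, text):
        # Return the label of the most similar stored example, or None if empty.
        if not self.items:
            return None
        query = embed(text)
        _, label = max(self.items, key=lambda item: cosine(query, item[0]))
        return label


store = LabelStore()
store.add("refund not received for my order", "billing")
store.add("app crashes when I open settings", "bug")

print(store.nearest_label("the app crashes on startup"))
print(store.nearest_label("where is my refund"))
```

A production vector database would replace the linear scan with an approximate nearest-neighbor index, but the lookup pattern is the same.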

Adala isn't just about data labeling

Malyuk noted that the Label Studio user community has been asking for all sorts of automations.

The initial capability enabled by Adala is data labeling, but he emphasized that it can serve as a generalized agent for a variety of data processing tasks. With Adala released as open source, his hope is that users will contribute ideas and code for how they want the project to expand.

"One year from now there are going to be different types of agents with different types of skills that can interact and get feedback from different types of environments," Malyuk said. "And that is an extremely powerful approach that we want to share with the broader community."
