Labelbox, a startup developing a data annotation and labeling platform, today announced it has raised $40 million, bringing its total raised to $79 million. The company says the funds will be used to acquire new customers, expand its solutions, and grow its workforce around the globe.

Training AI and machine learning algorithms requires plenty of annotated data, but data rarely comes with annotations. The bulk of the work often falls to human labelers, whose efforts tend to be expensive, imperfect, and slow. By some estimates, most enterprises that adopt machine learning spend over 80% of their time on data labeling and management.

Labelbox was founded in 2018 by Manu Sharma and Brian Rieger, who both worked in the aeronautics industry, designing and testing flight control systems and experimenting with machine learning models. The San Francisco-based company offers a web service and API that allows data science teams to work with annotation teams from a single dashboard. Users can customize the tools to support specific use cases, including instances, custom attributes, and more, and label directly on photos, text strings, conversations, paragraphs, documents, and videos.

Using Labelbox, admins can manage team members' access to data and projects, keeping access controls in place when working with a labeling service. They also get labeler performance metrics and a catalog of available labeling services, in addition to feature counts and object analytics to improve model capabilities.

Labelbox is in a category adjacent to companies like Scale AI, which has raised over $100 million for its suite of data labeling services, and CloudFactory, which says it offers labelers growth opportunities and “metric-driven” bonuses. That’s not to mention Hive, Alegion, Appen, SuperAnnotate, Dataloop, and Cognizant.

But Labelbox, which has 150 customers and just over 100 employees, says it reduces the time and cost associated with annotation through pre-labeling, where unlabeled data is initially seeded with machine learning model predictions. The company also claims to employ active learning, which dynamically prioritizes data labeling queues. From Labelbox, customers can search, browse, and curate training data to investigate poor or inconsistent labels.
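
To make the idea of a dynamically prioritized labeling queue concrete, here is a minimal sketch of uncertainty sampling, one common active learning strategy. The article does not say which strategy Labelbox uses, so the entropy-based ranking, the function names, and the sample asset IDs below are all assumptions for illustration and are not part of Labelbox's API.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (higher means less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def prioritize(predictions):
    """Order asset IDs so the model's least certain predictions come first.

    `predictions` maps a hypothetical asset ID to the model's class probabilities.
    """
    return sorted(predictions, key=lambda asset_id: entropy(predictions[asset_id]), reverse=True)


# Hypothetical model output for three unlabeled images.
predictions = {
    "img_001": [0.98, 0.01, 0.01],  # model is nearly certain
    "img_002": [0.40, 0.35, 0.25],  # model is unsure
    "img_003": [0.70, 0.20, 0.10],
}

queue = prioritize(predictions)
print(queue)
```

In this sketch the asset the model is least sure about reaches human labelers first, which is where a new label is likely to teach the model the most.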

When these tools are used in conjunction with each other, Labelbox asserts they enable customers to automate labeling where confidence is high and spotlight assets where performance remains low. This ostensibly lets labelers confirm, reject, or edit pre-labeled annotations rather than labeling from scratch.
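
The split between automated labeling and human review can be pictured as a simple confidence threshold. The sketch below is an illustration only: the `route` function, the 0.9 cutoff, and the record fields are hypothetical and do not come from Labelbox.

```python
def route(prelabels, threshold=0.9):
    """Split model pre-labels into auto-accepted and needs-review buckets.

    Each item is a dict with hypothetical keys `asset_id` and `confidence`.
    """
    auto_accept, needs_review = [], []
    for item in prelabels:
        if item["confidence"] >= threshold:
            auto_accept.append(item["asset_id"])
        else:
            needs_review.append(item["asset_id"])
    return auto_accept, needs_review


# Hypothetical pre-labels produced by a model.
prelabels = [
    {"asset_id": "doc_1", "label": "invoice", "confidence": 0.97},
    {"asset_id": "doc_2", "label": "receipt", "confidence": 0.55},
    {"asset_id": "doc_3", "label": "invoice", "confidence": 0.91},
]

auto_accept, needs_review = route(prelabels)
print(auto_accept, needs_review)
```

Where the cutoff sits is a trade-off: a higher threshold sends more assets to human reviewers, while a lower one accepts more model output unchecked.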

“While software is built with code, AI is built with data. Algorithms and compute power have now been commoditized, which means the way to differentiate your AI in the market is via your training data,” Rieger told VentureBeat via email. “But converting your proprietary data into revenue-generating AI has been a difficult process, full of delays and false starts. Our training data platform allows organizations to build their own AI ‘data engine’ extremely quickly at significant cost savings.”

B Capital Group led the series C investment Labelbox announced today. Previous investors Andreessen Horowitz, First Round Capital, Gradient Ventures (Google’s AI venture fund), Kleiner Perkins, and ARK Invest CEO Catherine Wood also participated.
