Cord raises $4.5M to automate computer vision annotation processes

Cord, a startup automating annotation processes for computer vision, today announced that it raised $4.5 million in a seed round led by CRV. CEO Eric Landau says that the capital will be put toward expanding Cord's customer base and platform as the company looks to hire additional employees.

Training AI and machine learning algorithms requires plenty of annotated data. But data rarely comes with annotations. The bulk of the work often falls to human labelers, whose efforts tend to be expensive, imperfect, and slow. It’s estimated that most enterprises that adopt machine learning spend over 80% of their time on data labeling and management. In fact, in a survey conducted by startup CrowdFlower, data scientists said that they spend 60% of their time just organizing and cleaning data compared with 4% on refining algorithms.

"The company started when my cofounder, Ulrik Stig Hansen, left his job at JP Morgan to go do a degree program in computer science at Imperial College London," Landau told VentureBeat via email. "My background was in physics as a Ph.D. dropout from Harvard, but I'd been working in quantitative trading where I was putting thousands of models into production. I met Hansen at an entrepreneurship network in London, and over a few pints at the pub, we realized that our perspective from finance could be applied to the process of labeling data."

Cord offers a computer vision annotation platform that automates a number of manual labeling tasks. Its suite of tools is designed for collaboration across roles and teams, from domain-expert annotators to project managers and machine learning engineers.

Cord was cofounded in 2020 by Landau, who left a physics Ph.D. program at Harvard, alongside Leeho Lim and Ulrik Stig Hansen. Landau left a job in the fintech industry to start the company, with the goal of applying quantitative finance principles to the data labeling process.

Features

With Cord's web app, users can annotate, classify, and segment images and video as well as perform quality assurance reviews and train "state-of-the-art" models. The platform's automation API lets developers automate data sampling, augmentation, transformation, labeling, and evaluation with custom training data algorithms. The Python SDK, meanwhile, trains models, composes data programs, collates training data algorithms, and ingests and processes data.

Cord offers keypoint tracking features that speed up annotation for teams bringing human pose estimation models to production. Complementary tools let developers create training data for modeling human movement and interaction, while object tracking and interpolation labeling algorithms leverage the temporal features in video data. A dashboard creates labels for object detection and image segmentation, generating unique instance IDs in individual frames. And vector labeling tools allow users to annotate relevant image and video data.
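
To make the idea of interpolation labeling concrete, here is a minimal sketch of the generic technique, not Cord's actual algorithm or API. It assumes an annotator has drawn a bounding box on two keyframes and fills in the frames between them linearly. The `Box` type and the `interpolate_boxes` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixels (hypothetical type for this sketch)."""
    x: float
    y: float
    w: float
    h: float


def interpolate_boxes(start_frame, start_box, end_frame, end_box):
    """Linearly interpolate a box for every frame between two keyframes.

    Returns a dict mapping each frame index (both keyframes included)
    to a Box. This is plain linear interpolation, an assumption made
    for illustration; real tools may use tracking models instead.
    """
    if end_frame <= start_frame:
        raise ValueError("end_frame must come after start_frame")

    span = end_frame - start_frame
    boxes = {}
    for frame in range(start_frame, end_frame + 1):
        t = (frame - start_frame) / span  # 0.0 at the first keyframe, 1.0 at the last
        boxes[frame] = Box(
            x=start_box.x + t * (end_box.x - start_box.x),
            y=start_box.y + t * (end_box.y - start_box.y),
            w=start_box.w + t * (end_box.w - start_box.w),
            h=start_box.h + t * (end_box.h - start_box.h),
        )
    return boxes


if __name__ == "__main__":
    # Two hand-drawn keyframes, four frames apart.
    labels = interpolate_boxes(0, Box(0, 0, 10, 10), 4, Box(8, 4, 10, 20))
    for frame, box in labels.items():
        print(frame, box)
```

In this sketch, two manual labels yield five labeled frames, which is the kind of saving that temporal methods aim for on video data.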

Cord can apply nested classifications, set up label structures with hierarchical relationships, assign custom attributes, and preserve conditional relationships at the individual object level. This helps to keep track of object and classification counts in training data in addition to class and attribute composition, according to Landau.
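
As a rough illustration of what nested classifications and composition tracking could look like in code, the sketch below models attributes that hang off a parent attribute and then counts classes and attribute paths across a set of labels. It is not Cord's data model or SDK; `Attribute`, `LabeledObject`, `attribute_paths`, and `composition` are all hypothetical names, and the example labels are made up.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """A name/value pair; children only apply given this parent value."""
    name: str
    value: str
    children: list[Attribute] = field(default_factory=list)


@dataclass
class LabeledObject:
    """One annotated object with its class and nested attributes."""
    object_class: str
    attributes: list[Attribute] = field(default_factory=list)


def attribute_paths(attrs, prefix=""):
    """Yield 'name=value' paths, with children nested under their parent."""
    for attr in attrs:
        path = f"{prefix}{attr.name}={attr.value}"
        yield path
        yield from attribute_paths(attr.children, prefix=path + "/")


def composition(objects):
    """Count objects per class and attribute paths per class."""
    class_counts = Counter(obj.object_class for obj in objects)
    attr_counts = Counter()
    for obj in objects:
        for path in attribute_paths(obj.attributes):
            attr_counts[(obj.object_class, path)] += 1
    return class_counts, attr_counts


if __name__ == "__main__":
    # Made-up labels for illustration only.
    objects = [
        LabeledObject("person", [Attribute("pose", "standing",
                                           [Attribute("occluded", "no")])]),
        LabeledObject("person", [Attribute("pose", "sitting")]),
        LabeledObject("vehicle"),
    ]
    class_counts, attr_counts = composition(objects)
    print(class_counts)
    print(attr_counts)
```

Keeping the parent's value in each path is one way to preserve the conditional relationship: the `occluded` attribute here is only counted under `pose=standing`, the value it was attached to.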

Growth

Cord is in a category adjacent to companies like Scale AI, which has raised hundreds of millions for its suite of data labeling services, and CloudFactory, which says it offers labelers growth opportunities and "metric-driven" bonuses. That's not to mention Hive, Alegion, Appen, SuperAnnotate, Dataloop, Labelbox, Superb AI, and Cognizant, all of which occupy a global data annotation tools market valued at over $494 million in 2020, according to Grand View Research.

But Cord has managed to nab about a dozen customers including King's College London, a "leading" restaurant automation provider, and a human behavior AI company.

One of Cord's clients, Stanford University's Division of Nephrology, claims to have reduced experiment duration by 80% while processing 3 times more images. Prior to deploying Cord, Stanford was using three different pieces of software to identify, annotate, and count podocytes (kidney cells) and glomeruli (clusters of capillaries in the kidney) in microscopy images. After the nephrology group started using Cord's training data platform and SDK to automate segmentation, counting, and segment size calculation, it managed to reduce experiment duration from an average of 21 days to 4 days.

"Any company that's trying to build an AI model needs a lot of labeled training data to do so. This process is often time-consuming and expensive due to being highly manual with existing tools. Using our [platform], companies are able to generate training data much faster and cheaper while also not having to ship the data anywhere else," Landau said.

Cord, which is based in London, raised $125,000 in a pre-seed round prior to this latest funding. Y Combinator, Crane Venture Partners, and the Harvard Management Company also participated in the seed round.
