Computer vision-powered workplace safety systems could lead to bias and other harms

Increasingly, AI is being pitched as a way to prevent the more than 340 million workplace accidents estimated to occur worldwide every year. Using machine learning, startups are analyzing camera feeds from industrial and manufacturing facilities to spot unsafe behaviors, alerting managers when employees make a dangerous mistake.

But while marketing materials breathlessly highlight their life-saving potential, the technologies threaten to violate the privacy of workers who aren't aware their movements are being analyzed. Companies may disclose to staff that they're subjected to video surveillance in the workplace, but it's unclear whether those deploying -- or providing -- AI-powered health and safety platforms are fully transparent about the tools' capabilities.

Computer vision

The majority of AI-powered health and safety platforms for workplaces leverage computer vision to identify potential hazards in real time. Fed hand-labeled images from cameras, the web, and other sources, the systems learn to distinguish between safe and unsafe events -- like when a worker steps too close to a high-pressure valve.
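
To make the valve example concrete, here is a minimal sketch of the kind of proximity check that could run on an object detector's output. It assumes the detector has already produced bounding boxes for a worker and a hazard; the `Box` class, the `is_too_close` function, and the 100-pixel threshold are hypothetical and not drawn from any vendor's product.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical detector output)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def center(self) -> tuple[float, float]:
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def is_too_close(worker: Box, hazard: Box, min_distance_px: float) -> bool:
    """Return True when the two box centers are closer than min_distance_px."""
    wx, wy = worker.center()
    hx, hy = hazard.center()
    return hypot(wx - hx, wy - hy) < min_distance_px


# Made-up coordinates for illustration only.
valve = Box(400, 200, 440, 240)
near_worker = Box(380, 260, 420, 340)
far_worker = Box(50, 260, 90, 340)

print(is_too_close(near_worker, valve, min_distance_px=100))  # True
print(is_too_close(far_worker, valve, min_distance_px=100))   # False
```

A production system would need to convert pixel distances into real-world distances and cope with occlusion, neither of which this sketch attempts.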

For example, Everguard.ai, an Irvine, California-based joint venture backed by Boston Consulting Group and SeAH, claims its Sentri360 product lowers incidents and injuries using a combination of AI, computer vision, and industrial internet of things (IIoT) devices. The company's platform, which was developed for the steel industry, ostensibly learns "on the job," improving safety and productivity as it adapts to new environments.

"Before the worker walks too close to the truck or load in the process, computer vision cameras capture and collect data, analyze the data, recognize the potential hazard, and within seconds (at most), notify both the worker and the operator to stop via a wearable device," the company explains in a recent blog post. "Because of the routine nature of the task, the operator and the worker may have been distracted causing either or both to become unaware of their surroundings."

But Everguard doesn't disclose on its website how it trained its computer vision algorithms or whether it retains any recordings of workers. Absent this information, how -- or whether -- the company ensures data remains anonymous is an open question, as is whether Everguard requires its customers to notify employees that their movements are being analyzed.

"By virtue of data gathering in such diverse settings, Everguard.ai naturally has a deep collection of images, video, and telemetry from ethnographically and demographically diverse worker communities. This diverse domain specific data is combined from bias-sensitive public sources to make the models more robust," Everguard CEO Sandeep Pandya told VentureBeat via email. "Finally, industrial workers tend to standardize on protective equipment and uniforms, so there is an alignment around worker images globally depending on vertical -- e.g. steel workers in various countries tend to have similar 'looks' from a computer vision perspective."

Everguard competitor Intenseye, a 32-person company that's raised $29 million in venture capital, similarly integrates with existing cameras and uses computer vision to monitor employees on the job. Incorporating federal and local workplace safety laws as well as organizations' rules, Intenseye can identify 35 kinds of scenarios within workplaces, including the presence of personal protective equipment, area and vehicle controls, housekeeping, and various pandemic control measures.

"Intenseye's computer vision models are trained to detect ... employee health and safety incidents that human inspectors cannot possibly see in real time. The system detects compliant behaviors to track real-time compliance scores for all use cases and locations," CEO Sercan Esen told VentureBeat via email. "The system is live across over 15 countries and 40 cities, having already detected over 1.8 million unsafe acts in 18 months."

When Intenseye spots a violation, health and safety professionals receive an alert immediately via text, smart speaker, smart device, or email. The platform also aggregates compliance data across a facility to generate a score and flag potential problem areas.
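
Intenseye doesn't publish how its score is calculated. Purely as an illustration, a facility-level score could be as simple as the share of observations in each area that were compliant. The function names, the sample observations, and the 0.9 threshold below are assumptions made for this sketch.

```python
from collections import defaultdict


def compliance_scores(observations):
    """Map each area to the fraction of its observations that were compliant.

    observations: iterable of (area, is_compliant) pairs.
    """
    totals = defaultdict(int)
    compliant = defaultdict(int)
    for area, is_compliant in observations:
        totals[area] += 1
        if is_compliant:
            compliant[area] += 1
    return {area: compliant[area] / totals[area] for area in totals}


def problem_areas(scores, threshold=0.9):
    """Return the areas whose score falls below the (hypothetical) threshold."""
    return sorted(area for area, score in scores.items() if score < threshold)


# Made-up observations for illustration only.
observations = [
    ("loading dock", True), ("loading dock", False),
    ("loading dock", False), ("loading dock", True),
    ("furnace", True), ("furnace", True),
    ("furnace", True), ("furnace", True),
]

scores = compliance_scores(observations)
print(scores)                 # {'loading dock': 0.5, 'furnace': 1.0}
print(problem_areas(scores))  # ['loading dock']
```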

Unlike Everguard, Intenseye is transparent about how it treats and retains data. On its website, the company writes: "Camera feed is processed and deleted on the fly and never stored. Our system never identifies people, nor stores identities. All the output is anonymized and aggregated and reported by our dashboard and API as visual or tabular data. We don't rely on facial recognition, instead taking in visual cues from all features across the body."

"Our main priority at intenseye is to help save lives but a close second is to ensure that workers' privacy is protected," Esen added. "Our AI model is built to blur out the faces of workers to ensure anonymity. Privacy is, and will continue to be, a top priority for Intenseye and it is something that we will not waiver on."

San Francisco, California-based Protex AI claims its workplace monitoring software is "privacy-preserving," plugging into existing CCTV infrastructure to identify areas of high risk based on rules. But public information is scarce. On its website, Protex AI doesn't detail the steps it's taken to anonymize data, or clarify whether it uses the data to fine-tune algorithms for other customers.

Training computer vision models

Computer vision algorithms require lots of training data. That's not a problem in domains with many examples, like apparel, pets, houses, and food. But when photos of the events or objects an algorithm is being trained to detect are sparse, it becomes more challenging to develop a system that's highly generalizable. Training models on small datasets without sufficiently diverse examples runs the risk of overfitting, where the algorithm can't perform accurately against unseen data.
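
Overfitting is easiest to see in a toy setting. The sketch below is not how any of these vendors build their models: it uses a one-nearest-neighbor classifier, which simply memorizes its training set, on a made-up one-dimensional feature (distance to a hazard, in meters) where two of the six training labels are wrong. The model scores perfectly on the data it memorized and poorly on examples it hasn't seen.

```python
def predict_1nn(train, x):
    """Return the label of the training example whose feature is closest to x."""
    nearest_feature, nearest_label = min(train, key=lambda example: abs(example[0] - x))
    return nearest_label


def accuracy(train, examples):
    """Fraction of examples that the memorizing model labels correctly."""
    correct = sum(1 for x, label in examples if predict_1nn(train, x) == label)
    return correct / len(examples)


# Assumed ground truth: closer than 2.0 m is unsafe (label 1), otherwise safe (label 0).
# The points at 1.5 and 3.0 are deliberately mislabeled to mimic a small, noisy dataset.
train = [(0.5, 1), (1.0, 1), (1.5, 0), (2.5, 0), (3.0, 1), (3.5, 0)]
held_out = [(0.7, 1), (1.4, 1), (1.6, 1), (2.2, 0), (2.9, 0), (3.1, 0), (3.6, 0)]

print(accuracy(train, train))     # 1.0 on the memorized training data
print(accuracy(train, held_out))  # 3 of 7 correct on unseen data
```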

Fine-tuning can address this "domain gap" -- somewhat. In machine learning, fine-tuning involves making small adjustments to boost the performance of an AI algorithm in a particular environment. For example, a computer vision algorithm already trained on a large dataset (e.g., cat pictures) can be tailored to a smaller, specialized corpus with domain-specific examples (e.g., pictures of a cat breed).
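
In code, one common form of fine-tuning keeps the pretrained part of the model frozen and trains only a small new "head" on the specialized data. The sketch below imitates that pattern with plain Python: `frozen_features` is a hand-written stand-in for a pretrained network, and the four tiny "images" and their labels are invented. A real system would use a deep learning framework and far more data.

```python
import math


def frozen_features(pixels):
    """Stand-in for a pretrained backbone; it is never updated during fine-tuning."""
    mean = sum(pixels) / len(pixels)
    contrast = max(pixels) - min(pixels)
    return [mean, contrast]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(examples, epochs=500, learning_rate=0.5):
    """Train only a logistic-regression head on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    featurized = [(frozen_features(pixels), label) for pixels, label in examples]
    for _ in range(epochs):
        for features, label in featurized:
            z = sum(w * f for w, f in zip(weights, features)) + bias
            error = sigmoid(z) - label
            weights = [w - learning_rate * error * f for w, f in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


def predict(head, pixels):
    """Return 1 if the fine-tuned head scores the input at 0.5 or above, else 0."""
    weights, bias = head
    z = sum(w * f for w, f in zip(weights, frozen_features(pixels))) + bias
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical domain-specific examples: label 1 for high-contrast patches, 0 for flat ones.
examples = [
    ([0.1, 0.9, 0.2, 0.8], 1),
    ([0.0, 1.0, 0.1, 0.9], 1),
    ([0.5, 0.5, 0.6, 0.4], 0),
    ([0.4, 0.5, 0.5, 0.6], 0),
]

head = fine_tune_head(examples)
print(predict(head, [0.0, 1.0, 0.0, 1.0]))
print(predict(head, [0.5, 0.5, 0.5, 0.5]))
```

Because only the head's three parameters are updated, a handful of examples is enough to adapt it, which is the appeal of fine-tuning when domain-specific data is scarce.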

Another approach to overcome the data sparsity problem is synthetic data, or data generated by algorithms to supplement real-world datasets. Among others, autonomous vehicle companies like Waymo, Aurora, and Cruise use synthetic data to train the perception systems that guide their cars along physical roads.
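
The appeal of synthetic data is that labels come for free, because the program that draws the image knows exactly what it drew. The sketch below is a deliberately crude illustration, nothing like the simulators autonomous vehicle companies use: it renders a bright square on a blank grayscale frame and records the square's position as the label. All names and sizes are assumptions.

```python
import random


def make_synthetic_sample(rng, size=8, box=2):
    """Render a blank frame with one bright square and return (image, label)."""
    top = rng.randrange(size - box + 1)
    left = rng.randrange(size - box + 1)
    image = [[0] * size for _ in range(size)]
    for row in range(top, top + box):
        for col in range(left, left + box):
            image[row][col] = 255
    # The label is known exactly because the generator placed the object itself.
    label = {"top": top, "left": left, "height": box, "width": box}
    return image, label


def make_dataset(count, seed=0):
    """Generate a reproducible list of (image, label) pairs."""
    rng = random.Random(seed)
    return [make_synthetic_sample(rng) for _ in range(count)]


dataset = make_dataset(5, seed=1)
image, label = dataset[0]
print(label)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```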

But synthetic data isn't the be-all and end-all. In the worst case, it can introduce undesirable biases into training datasets, a problem that real-world image collections already have. A study conducted by researchers at the University of Virginia found that two prominent research-image collections displayed gender bias in their depiction of sports and other activities, showing images of shopping linked to women while associating things like coaching with men. Another computer vision corpus, 80 Million Tiny Images, was found to have a range of racist, sexist, and otherwise offensive annotations, such as nearly 2,000 images labeled with the N-word, and labels like "rape suspect" and "child molester."

Bias can also arise from other sources, like differences in the sun path between the northern and southern hemispheres and variations in background scenery. Studies show that even differences between camera models -- e.g., resolution and aspect ratio -- can cause an algorithm to be less effective in classifying the objects it was trained to detect. Another frequent confounder is technology and techniques that favor lighter skin, which include everything from sepia-tinged film to low-contrast digital cameras.

Recent history is filled with examples of the consequences of training computer vision models on biased datasets, like virtual backgrounds and automatic photo-cropping tools that disfavor darker-skinned people. Back in 2015, a software engineer pointed out that the image recognition algorithms in Google Photos were labeling his Black friends as "gorillas." And the nonprofit AlgorithmWatch has shown that Google's Cloud Vision API at one time automatically labeled thermometers held by a Black person as "guns" while labeling thermometers held by a light-skinned person as "electronic devices."

Proprietary methods

Startups offering AI-powered health and safety platforms are often reluctant to reveal how they train their algorithms, citing competition. But the capabilities of their systems hint at the techniques that might've been used to bring them into production.

For example, Everguard's Sentri360, which was initially deployed at SeAH Group steel factories and construction sites in South Korea and in Irvine and Rialto, California, can draw on multiple camera feeds to spot workers who are about to walk under a heavy load being moved by construction equipment. Everguard claims that Sentri360 can improve from experience and new computer vision algorithms -- for instance, learning to detect whether a worker is wearing a helmet in a dimly lit part of a plant.

"A camera can detect if a person is looking in the right direction," Pandya told Fastmarkets in a recent interview.

In the way that health and safety platforms analyze features like head pose and gait, they're akin to computer vision-based systems that detect weapons and automatically charge brick-and-mortar customers for goods placed in their shopping carts. Reporting has revealed that some of the companies developing these systems have engaged in questionable behavior, like using CGI simulations and videos of actors -- even employees and contractors -- posing with toy guns to feed algorithms made to spot firearms.

Insufficient training leads the systems to perform poorly. SN Technologies' facial recognition and weapon-detecting platform was found to misidentify Black children at a higher rate and frequently mistook broom handles for guns. Meanwhile, Walmart's AI- and camera-based anti-shoplifting technology, which is provided by Everseen, came under scrutiny last May over its reportedly poor detection rates.

The stakes are higher in workplaces like factory floors and warehouses. If a system were to fail to identify a worker in a potentially hazardous situation because of their skin color, for example, they could be put at risk -- assuming they were aware the system was recording them in the first place.
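
One way to check for this kind of failure is to compare how often a system misses real hazards for different groups of workers. The sketch below assumes an audit log in which each record notes a group label, whether a hazard was actually present, and whether the system raised an alert. The group names and records are invented for illustration.

```python
from collections import defaultdict


def miss_rate_by_group(records):
    """Map each group to the fraction of its real hazards that raised no alert.

    records: iterable of (group, hazard_present, alert_raised) tuples.
    """
    hazards = defaultdict(int)
    missed = defaultdict(int)
    for group, hazard_present, alert_raised in records:
        if not hazard_present:
            continue  # only real hazards count toward the miss rate
        hazards[group] += 1
        if not alert_raised:
            missed[group] += 1
    return {group: missed[group] / hazards[group] for group in hazards}


# Made-up audit records for illustration only.
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, True), ("group_a", True, False),
    ("group_b", True, True), ("group_b", True, False),
    ("group_b", True, False), ("group_b", True, True),
    ("group_b", False, False),
]

print(miss_rate_by_group(records))  # {'group_a': 0.25, 'group_b': 0.5}
```

A gap like the one in this invented data, where one group's hazards are missed twice as often, is the kind of disparity such an audit is meant to surface.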

Mission creep

While the purported goal of computer vision-based workplace monitoring products on the market is health and safety, the technology could be co-opted for other, less humanitarian purposes. Many privacy experts worry that these products will normalize greater levels of surveillance, capturing data about workers' movements and allowing managers to chastise employees in the name of productivity.

In the U.S., each state has its own surveillance laws, but most give wide discretion to employers so long as the equipment they use to track employees is plainly visible. There's also no federal legislation that explicitly prohibits companies from monitoring their staff during the workday.

"We support the need for data privacy through the use of 'tokenization' of sensitive information or image and sensor data that the organization deems proprietary," Pandya said. "Where personal information must be used in a limited way to support the higher cause or worker safety, e.g. worker safety scoring for long term coaching, the organization ensures their employees are aware of and accepting of the sensor network. Awareness is generated as employees participate in the training and on-boarding that happens as part of post sales-customer success. Regarding duration of data retention, that can vary by customer requirement, but generally customers want to have access to data for a month or more in the event insurance claims and accident reconstruction requires it."

The latitude that employers enjoy under the law has permitted companies like Amazon to adopt algorithms designed to track productivity at a granular level. For example, the tech giant's notorious "Time Off Task" system dings warehouse employees for spending too much time away from the work they're assigned to perform, like scanning barcodes or sorting products into bins. The requirements imposed by these algorithms gave rise to California's proposed AB-701 legislation, which would prevent employers from counting health and safety law compliance against workers' productive time.

"I don't think the likely impacts are necessarily due to the specifics of the technology so much as what the technology 'does,'" University of Washington computer scientist Os Keyes told VentureBeat via email. "[It's] setting up impossible tensions between the top-down expectations and bottom-up practices ... When you look at the kind of blue collar, high-throughput workplaces these companies market towards -- meatpacking, warehousing, shipping -- you're looking at environments that are often simply not designed to allow for, say, social distancing, without seriously disrupting workflows. This means that technology becomes at best a constant stream of notifications that management fails to attend to -- or at worse, sticks workers in an impossible situation where they have to both follow unrealistic distancing expectations and complete their job, thus providing management a convenient excuse to fire 'troublemakers.'"

Startups selling AI-powered health and safety platforms present a positive spin, pitching the systems as a way to "[help] safety professionals recognize trends and understand the areas that require coaching." In a blog post, Everguard notes that its technology could be used to "reinforce positive behaviors and actions" through constant observation. "This data enables leadership to use 'right behaviors' to reinforce and help to sustain the expectation of on-the-job safety," the company asserted.

But even potential customers that stand to benefit, like Big River Steel, aren't entirely sold on the promise. CEO David Stickler told Fastmarkets that he was concerned a system like the one from Everguard would become a substitute for proper worker training and trigger too many unnecessary alerts, which could impede operations and even decrease safety.

"We have to make sure people don't get a false sense of security just because of a new safety software package," he told the publication, adding: "We want to do rigorous testing under live operating conditions such that false negatives are minimized."
