For International Women’s Day last month in Berlin, some ticket machines used automatic gender recognition, a form of facial recognition software, to give female riders discounts on their tickets.

As well-intentioned as that may seem, AI researcher Os Keyes is worried about how such systems will negatively impact the lives of transgender people or those who do not adhere to strict binary definitions of male or female.

A recipient of the Ada Lovelace Fellowship from Microsoft Research, Keyes served as an expert witness for facial recognition software regulation being considered by lawmakers in the state of Washington and was cited earlier this month by a group of more than two dozen AI researchers who say Amazon should stop selling its facial recognition software Rekognition to law enforcement agencies.

In the instance of, say, rent-stabilized apartment buildings in New York where facial recognition systems are being proposed for entry, poorly made systems could provide negative user experiences for transgender or gender-neutral people who may encounter trouble opening the door. But Keyes also fears such systems could lead to increased encounters with law enforcement that lead to trans people getting discriminated against, overly monitored, or killed.

Keyes is especially concerned because their analysis of historic facial recognition software research found that the industry has rarely considered transgender or gender-fluid people in its work. This has led Keyes to believe facial recognition software is an inherently transphobic technology.

Although the National Institute of Standards and Technology’s (NIST) facial recognition software testing system has been called a gold standard, Keyes staunchly opposes the organization, which is part of the U.S. Department of Commerce.

They take issue with NIST’s mandate to establish federal AI standards detailed in Trump’s executive order, the American AI Initiative. NIST is the wrong organization to lead the creation of federal AI standards, they said, because it employs no ethicists, it has a poor history with matters of gender and race, and its Face Recognition Vendor Test (FRVT) uses photos of exploited children and others who did not provide their consent.

Research by Keyes and others due out later this year examines the history of facial recognition software and organizations like NIST, some of which was shared last month in a Slate article.

Keyes recently spoke with VentureBeat about the potential dangers of automatic gender recognition (AGR), facial recognition software regulation being considered by the U.S. Senate, and why they believe all government use of facial recognition software should be banned.

This interview has been edited for brevity and clarity.

VentureBeat: So who are you? Tell me a bit about yourself.

Keyes: So I’m a PhD student at the University of Washington in the Department of Human-Centered Design and Engineering… I mostly work around trying to ground data ethics in the material world, and trying to ground industry in the material world in the sense that ethics should be applicable to what people are actually doing, and what people are actually doing should ideally not kill people.

Most of my work these days is centered around facial recognition, which honestly originated purely by accident, or more accurately through spite.

I read a paper that used facial recognition that made me so mad that I stayed mad for two years, and when I started grad school, I decided to write a paper on it, which I published, and I ended up, as a side effect of that, finding out just enough about the background story and history of facial recognition to get me curious. And to be honest, nobody’s really adequately explored where this technology comes from and what the implications of that are — the infrastructural politics of it all.

And so I’m currently working on a few projects that explore where this technology comes from. Who’s developed it? For what purposes? How’s it being evaluated and treated differently over time? The answers to all of these questions are really depressing.

VentureBeat: One of the things that caught my eye recently was this news about getting train tickets in Berlin, and the AGR that’s deployed there. What are your thoughts on that? Are there similar examples that come to mind? Seems like it makes it coherent why AGR can be an issue or a problem.

Keyes: It was for International Women’s Day, which is nice, but it’s an exclusionary stunt that legitimizes the technology.

There are obviously damaging uses of it and other uses that may be well intentioned, but the problem is that those legitimize the technology, like it becomes something familiar, something convenient, and that’s how you make it OK to do something damaging.

Another example I’m aware of is there’s a housing development in Hong Kong that’s about to open that uses gender recognition for gaining admission. Basically, it’s a component of the system working out if someone actually lives there. So I hope you never go out and get a haircut or look different when you come back, because otherwise you’re in trouble.

VentureBeat: Understanding the nightmare scenario for facial recognition software as it relates to it being deployed in communities of color, where predictive policing can be driven by historically biased data — that’s pretty clear to me. I can see the through line to that.

Any specific nightmare scenarios come to mind for you that may come about as a result of automatic gender recognition, or automated bias that could take place? This doesn’t have to be something that exists today, but something that you worry could be on the horizon.

Keyes: Well, since gender recognition is embedded in how facial recognition systems generally tend to work, exactly the same nightmare scenario can make an appearance. Like we know these systems don’t work for trans people, we know that facial recognition generally doesn’t work very well for people of color. Recently NIST put out a report saying they detected that facial recognition systems don’t work the same. This was offered by the head of their facial recognition testing program.

But in 2017 he said one of the hypotheses for the differences in those traits was maybe photos of black people just naturally look much more similar to each other than photos of white people, which has to be the most scientific way of saying “I think black people look the same and so does my computer” that I’ve ever heard.

Like, not only did you write that down, you said it, and then you put the slides on the internet. There’s a cascading series of what-the-fucks here.

So anyway, trans people tend to live in overpoliced areas, which tend to be the cheapest areas, because being trans isn’t great for your income. Trans people of color can expect to be doubly burdened. And I say that as if it’s entirely hypothetical, but it’s one of the scenarios NIST uses to justify why they were researching this and the use case a lot of research papers point at.

The specific scenario, the worst case scenario, is some bright spark has the idea to fit it to bathrooms, then we can alert an operator when a bloke tries to walk into the women’s bathroom or something.

So they put these cameras in and then they alert an operator when a bloke walks in — except the problem is that it doesn’t work for trans people and it doesn’t work for gender non-conforming people, so in practice when a trans person tries to use the bathroom, an operator gets alerted to respond in some way.

Now maybe they respond by asking for ID, which is still gross and difficult because updating the gender markers on ID cards is hard, particularly for people who are poor; or maybe they respond by calling security or calling the cops.

There are a lot of cases, particularly for trans people of color, of the cops getting called on you just for using the bathroom by someone else in the bathroom.

And that’s even without having an algorithmic system designed to let someone know and be like, “Hey, there’s a trans person in there. You might want to call the police.” To be exceedingly deadpan, the police’s record with trans people of color is not great, so yeah — the worst case scenario is someone tries to go to the bathroom because they just want to piss and they end up shot or arrested or harassed, or shot and then arrested and then harassed.

VentureBeat: Have you had a chance to look at the Commercial Facial Recognition Privacy Act?

Keyes: Yeah, I read it. I think I get what it’s trying to do and [understand] that it’s well intentioned, but I would describe it in a single word as milquetoast. It has no restrictions on law enforcement or government usage at all, and while I appreciate constraining surveillance capitalism for a tiny bit, surveillance capitalism was not the place where this technology started. This technology started for policing, border control, the military, explicitly racialized and gendered things.

I don’t feel much better because the government says we’re listening, and instead of [not] having facial recognition deployed by the police in areas already subject to overpolicing and predictive policing, [at least] you can be assured that your Starbucks won’t be doing the same. Technically that’s improvement, but not much. Honestly, I think an actual bill would constrain the government as well and put in standards and regulatory systems coming from a body which is not one of the usual suspects — is not NIST — but a new entity that has facial recognition experts, ethicists, and people representing overpoliced communities, those subjected to structural oppression by both the industry and the state. And that’s not what the bill provides.

At the same time, it is a start. It is improvable, I think, but we can’t just start by banning the commercial usage and keeping the state usage. This bill needs substantial amendments and strengthening.

I was one of the expert witnesses on the Washington state bill, and it started off banning government from using facial recognition, and two amendment sets later it ended up with “police aren’t allowed to use facial recognition as the only reason they arrested someone.” And that’s it.

And if this is the starting point, I’m cynical about it getting stronger. I’m not used to bills going that way, but I would very much appreciate the Senate surprising me.

I guess I would say I’m used to legislators getting bored, and it would be good to make sure that, if we can get one bill through, that it’s a really good bill.

VentureBeat: Any other essential elements that you think should be included in facial recognition software regulation from the federal government?

Keyes: I think people collecting facial recognition data sets should be obliged to let people who are in it withdraw and should be obliged to ensure they explicitly consent to being in those data sets.

And I’d also strongly argue that the data sets should basically have an expiration date. Like, you shouldn’t be able to keep people’s photos around indefinitely after their death for the purposes of designing surveillance systems.

At the same time, my base opinion is frankly that the technology just needs to be banned. I don’t think there’s a way of making it good. These are all ways of making it less actively bad, and you know, on a pragmatic basis that’s probably the most I allow myself to hope for, but in the long term, I would like to see facial recognition development and usage just made straight-up illegal, because I don’t think this is a technology with redeeming features. Nobody has been able to point me to a use case that directly benefits humanity that can’t be solved with other means.

It’s so obviously ripe for abuse and has already been [so] abused that it’s not worth doing.

VentureBeat: The positive example I hear often, and I’ve used from time to time, is the idea of finding missing kids; I believe it was the New Delhi police department that did a trial to find missing kids that apparently was rather effective. I also think of somebody who has Alzheimer’s wandering away from their home, and other missing-person scenarios. I can think of a million bad examples too, but there’s a couple positive ones.

Keyes: Absolutely. I’m interested in the New Delhi one because the government reports on facial recognition for the last 15 years have said, amongst other things, that it’s consistently bad at identifying children. In fact, basically the only thing the entire field of facial recognition agrees on is “facial recognition sucks on kids.” But ok, let’s take the missing person example, right?

Let’s say I want to design and deploy a system for finding missing kids so [that] you can go to the police and say “my kid’s missing, here’s a picture of my kid,” and they go out and put it in the system, and if it gets a hit they go over there, pick the kid up, and check that they’re safe, right?

You have to trust that the parents are never lying. You also have to trust the police would never use it in an untoward manner, which is difficult to imagine, and you have to believe that having built this massive infrastructure for surveillance and control, it wouldn’t then get used for other things.

Because first you’re tracking missing kids and elderly people. Everyone can agree that missing kids or elderly people is a bad thing. But what about terrorists? Well everyone agrees that terrorists are a bad thing.

Ok, so we’ll allow you to track terrorists, but what’s the definition of a terrorist? Does it mean someone who’s committed a crime? Someone who looks a bit funny, and “looks a bit funny” usually means brown, and before long, it slides and slips down.

Once it’s brought into existence, people look for excuses to use it because it’s right there. It’s so much easier, cheaper, and more convenient, and I don’t think tracking and surveilling people should be that easy, whoever they are. I don’t think that people should have to live every day constantly observed and monitored by unseen faces all around them, because speaking as a trans person, I already do that, and it’s shit. I don’t think that should be the standard experience of all of humanity.

So yeah, there are definitely some use cases you can point to where it’s like, “yes, in that situation it would help,” but the fact of the matter is that those are a small minority of use cases, and the technology can be put to more use cases than that.

I don’t think the positives outweigh the negatives, and society will get noticeably worse.

VentureBeat: Well, it’s helpful to understand your perspective on this, because not a lot of people say it should be banned outright.

Keyes: Yeah, people seem to be scared to say that, even in academia. I don’t know why.

VentureBeat: So I read some of the Slate article. What do you think is wrong with the NIST facial recognition vendor testing program?

Keyes: I have serious trouble that it exists at all, but there’s sort of two issues: one problem with the program and one problem with NIST. The problem with the program is that the images it uses are deeply worrisome, and are gathered largely without consent. And in some ways, they come from places and situations where consent can’t be gathered, and they are extremely biased.

So they’re using three data sets. The first is photos from immigrant visas. They’re doing much better on bigotry and bias now, because historically, it was explicitly Mexican immigrant visas. Just in case anyone missed it, the entire point of facial recognition was Border Patrol.

They’re also using a data set of child abuse pictures, specifically child abuse pictures from unsolved cases.

And then the third is the multiple encounters data set. They picked the name because “multiple encounters [deceased subjects]” was a bit grim. It’s mugshots of people who were arrested by the police multiple times and are now dead. And in all these cases, there was no consent in using this data for facial recognition.

Because of the way immigration in this country works, and because of the way policing in this country works, there are massive racial biases in the data sets.

And a lot of these data sets are just kind of gross in the sense that they use pictures of dead people who the government arrested to train government surveillance systems used to arrest people. It’s sort of like a closed loop; it magnifies biases that are already present in the data and the system.

The child abuse one is similarly non-consensual as far as we’re aware, and is almost a re-violation, taking people whose traumatizing experience was not being able to say no and using their data to your purposes without letting them say no.

And the overall issue with NIST and all of this is that under Trump’s new AI executive order, the American AI Initiative, NIST’s job is to develop regulatory standards in AI, broadly construed. I don’t know what that covers, but I’m guessing that includes facial recognition, because I’ve never heard of a definition of AI that does not.

To me, NIST’s involvement in standardizing regulation of this technology is so terrifying not just because of their long history of developing the software, and of racism in how this technology is being developed and discussed, but because they don’t have any ethicists. From my perspective, there’s a whole variety of reasons why NIST shouldn’t be doing this, but the first is that they employ zero ethicists.

The … facial recognition system for identifying race was in fact invented in part by the person who headed up NIST’s facial recognition program for 20 years.

VentureBeat: Well, if it’s not NIST, what federal agency do you feel would be appropriate to be part of the process of figuring out what facial recognition software regulation should look like?

Keyes: I can’t point to one where I think they’re doing an amazing job. It shouldn’t be NIST, amongst other things because of NIST’s funding model. It’s a sub agency of the Department of Commerce — technically speaking, its job is to make sure people are making money — and is partially funded by the FBI and by the National Institute of Justice. Asking them to regulate the FBI and commercial use of this technology is sort of like asking you to be your best mate’s boss.

The [FTC’s Bureau of] Consumer Protection … might kind of make sense, but this is wider than just consumers or business. This is also about government malpractice, so I suspect they need to set up a new program entirely.

Actually, there is an organization that would be good for this — the GAO [Government Accountability Office]. I think it would be a good patchwork, a good like “patching a hole in the dam,” and takes advantage of the fact that the government is, in fact, the facial recognition industry’s primary contractor.

The government very much, through its purchasing, sets much of the agenda for facial recognition, and for biometric authentication more generally.

In the long term, like I said, I think this software should be banned, but if we want to put constraints on what it can be used for, and constraints in how it must function, then having purchasing requirements would make sense.

A government agency should have to go to this body when it wants to buy facial recognition and explain what it’s going to use it for and how it’s going to be constrained. Then commercial vendors who want to sell to the government have to go to this agency, explain how the technology works, and allow the agency to audit it, sort of FDA style, that kind of thing.

Once you do that — once you have that entity requiring things like transparency in how the system works, deliberate constraints in what it can do — the companies are going to get in line, because if Microsoft can’t sell facial recognition to any part of the federal government, they’re going to either change or stop making facial recognition software.

The GAO might not make sense for that, but it makes the least nonsense of any of the organizations I can currently point to, because I feel like congressional auditing people are like the only people where everyone kind of agrees that they’re actually neutral — or at least they must be because their reports manage to somehow piss everybody off.


VentureBeat spoke to NIST prior to running this interview. A NIST spokesperson claimed the Slate article referenced above had inaccuracies, but the organization did not respond to repeated requests for elaboration.

A NIST spokesperson did, however, answer a few specific questions concerning the sensitive datasets that Keyes referenced. “All of the image datasets used by NIST are used in accordance with Human Subjects Protection regulations,” the spokesperson said. “Most of the data used in the FRVT program is collected by other government agencies per their respective missions. Other datasets are from images of volunteers who have consented to having their images used. In both cases, the data is stripped of identifiable private information before it comes to NIST.”

In specific reference to the images of exploited children, the spokesperson said, “NIST has provided testing assistance to the Department of Homeland Security to help it determine if facial recognition could be used to combat child abuse, through the Child Exploitation Image Analytics program and the Face Recognition Vendor Test.”

When asked if the organization attempted to acquire consent from the individuals used in the FRVT training datasets, the spokesperson said that the FRVT datasets are not used to train either the test or facial recognition algorithms. Instead, “The test provides independent government evaluations of prototype face recognition technologies. These evaluations provide developers, technologists, policy makers and end-users with technical information that can help them determine whether and under what conditions facial recognition technology should be deployed.”