OpenAI warns AI behind GitHub's Copilot may be susceptible to bias


Last month, GitHub and OpenAI launched Copilot, a service that suggests whole lines of code inside development environments like Microsoft's Visual Studio Code. Copilot is powered by an AI model called Codex that's trained on billions of lines of public code, and the companies claim Copilot works with a broad set of frameworks and languages and adapts to the edits developers make, matching their coding styles.

But a new paper published by OpenAI reveals that Copilot might have significant limitations, including biases and sample inefficiencies. While the research describes only early Codex models, whose descendants power Copilot and soon the Codex models in the OpenAI API, it emphasizes the pitfalls faced in the development of Codex, chiefly misrepresentations and safety challenges.

Despite the potential of language models like GPT-3, Codex, and others, blockers exist. The models can't always answer math problems correctly or respond to questions without paraphrasing training data, and it's well-established that they amplify biases in data. That's problematic in the language domain because a portion of the data is often sourced from communities with pervasive gender, race, and religious prejudices. This might also be true of the programming domain -- at least according to the paper.

Massive model

Codex was trained on data collected in May 2020 from 54 million public software repositories hosted on GitHub, which contained 179GB of unique Python files under 1MB in size. OpenAI filtered out files that were likely auto-generated, had an average line length greater than 100 or a maximum line length greater than 1,000, or had a small percentage of alphanumeric characters. The final training dataset totaled 159GB.
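To make the filtering rules concrete, here is a minimal Python sketch of how such a filter could look. It is not OpenAI's code. The line-length limits and the 1MB size cap come from the description above, while the function names, the 25% alphanumeric threshold, and the marker strings used to guess whether a file is auto-generated are assumptions made purely for illustration.

```python
# Illustrative sketch only; names and thresholds marked "assumed" are not from the paper.
MAX_FILE_BYTES = 1_000_000       # "under 1MB in size"
MAX_AVG_LINE_LENGTH = 100        # average line length limit
MAX_LINE_LENGTH = 1_000          # maximum line length limit
MIN_ALNUM_FRACTION = 0.25        # assumed: the article only says "a small percentage"
AUTOGEN_MARKERS = ("auto-generated", "autogenerated", "do not edit")  # assumed heuristic


def looks_autogenerated(text: str) -> bool:
    """Guess whether a file is machine-written by scanning its first five lines."""
    head = "\n".join(text.splitlines()[:5]).lower()
    return any(marker in head for marker in AUTOGEN_MARKERS)


def keep_file(text: str) -> bool:
    """Return True if a Python source file passes every filter described above."""
    if not text.strip():
        return False
    if len(text.encode("utf-8")) >= MAX_FILE_BYTES:
        return False
    lengths = [len(line) for line in text.splitlines()]
    if sum(lengths) / len(lengths) > MAX_AVG_LINE_LENGTH:
        return False
    if max(lengths) > MAX_LINE_LENGTH:
        return False
    alnum_fraction = sum(ch.isalnum() for ch in text) / len(text)
    if alnum_fraction < MIN_ALNUM_FRACTION:
        return False
    return not looks_autogenerated(text)
```

Running `keep_file` over every candidate file and keeping only those that return True mirrors the kind of pass that shrank the corpus from 179GB to 159GB, though the real pipeline's heuristics may differ.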

OpenAI claims the largest Codex model it developed, which has 12 billion parameters, can solve 28.8% of the problems in HumanEval, a collection of 164 OpenAI-created problems designed to assess algorithms, language comprehension, and simple mathematics. (In machine learning, parameters are the part of the model that has learned from historical training data, and they generally correlate with sophistication.) That's compared with OpenAI's GPT-3, which solves 0% of the problems, and EleutherAI's GPT-J, which solves just 11.4%.

With repeated sampling, where Codex generates 100 candidate solutions per problem, OpenAI says at least one candidate is correct for 70.2% of the HumanEval challenges. But the company's researchers also found that Codex proposes syntactically incorrect or undefined code, invoking functions, variables, and attributes that are undefined or outside the scope of the codebase.
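Figures like these are "pass@k" scores: the share of problems for which at least one of k sampled solutions passes the unit tests. The Python sketch below shows one way to compute such a score, assuming the unbiased estimator described in the Codex paper, which draws n samples per problem, counts the c correct ones, and computes 1 - C(n-c, k) / C(n, k). The function names are illustrative, not OpenAI's.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k samples is correct,
    given that c of the n generated samples passed the unit tests."""
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k samples include a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each result is an (n, c) pair."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k equal to 1, the estimate reduces to the fraction of correct samples, which corresponds to the single-attempt setting; larger k rewards a model that gets a problem right only occasionally.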

More concerningly, Codex suggests solutions that appear superficially correct but don't actually perform the intended task. For example, when asked to create encryption keys, Codex selects "clearly insecure" configuration parameters in "a significant fraction of cases." The model also recommends compromised packages as dependencies and invokes functions insecurely, potentially posing a safety hazard.

Safety hazards

Like other large language models, Codex generates responses as similar as possible to its training data, leading to obfuscated code that looks good on inspection but actually does something undesirable. Specifically, OpenAI found that Codex, like GPT-3, can be prompted to generate racist and otherwise harmful outputs as code. Given the prompt "def race(x):," OpenAI reports that Codex assumes a small number of mutually exclusive race categories in its completions, with "White" being the most common, followed by "Black" and "Other." And when writing code comments with the prompt "Islam," Codex includes the words "terrorist" and "violent" at a greater rate than it does with other religious groups.

OpenAI recently claimed it had discovered a way to improve the "behavior" of language models with respect to ethical, moral, and societal values. But the jury's out on whether the method adapts well to other model architectures, like Codex's, or other settings and social contexts.

In the new paper, OpenAI also concedes that Codex is sample-inefficient, in the sense that even inexperienced programmers can be expected to solve a larger fraction of problems despite having seen far less code than the model. Moreover, refining Codex requires a significant amount of compute -- hundreds of petaflop/s-days -- which contributes to carbon emissions. While Codex was trained on Microsoft Azure, which OpenAI notes purchases carbon credits and sources "significant amounts of renewable energy," the company admits that the compute demands of code generation could grow to be much larger than Codex's training if "significant inference is used to tackle challenging problems."
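For a sense of scale, compute budgets of this kind are usually expressed in petaflop/s-days: one petaflop/s-day is a quadrillion floating-point operations per second sustained for 24 hours. The short Python sketch below does the unit conversion. The function names are illustrative, and the figures in the usage note are hypothetical inputs, not numbers reported by OpenAI.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
PETA = 10**15                    # one quadrillion


def pfs_days_to_operations(pfs_days: float) -> float:
    """Total floating-point operations in a given number of petaflop/s-days."""
    return pfs_days * PETA * SECONDS_PER_DAY


def wall_clock_days(pfs_days: float, sustained_petaflops: float) -> float:
    """Days needed to spend the budget at a given sustained throughput."""
    return pfs_days / sustained_petaflops
```

One petaflop/s-day works out to 8.64 x 10^19 operations. As a purely hypothetical example, a budget of 100 petaflop/s-days on hardware sustaining 10 petaflop/s would take `wall_clock_days(100, 10)`, or 10 days.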

Among others, leading AI researcher Timnit Gebru has questioned the wisdom of building large language models, examining who benefits from them and who is disadvantaged. In June 2019, researchers at the University of Massachusetts at Amherst released a report estimating that training and searching a certain model architecture emits roughly 626,000 pounds of carbon dioxide, equivalent to nearly 5 times the lifetime emissions of the average U.S. car.

Perhaps anticipating criticism, OpenAI asserts in the paper that risk from models like Codex can be mitigated with "careful" documentation and user interface design, code review, and content controls. In the context of a model made available as a service -- e.g. via an API -- policies including user review, use case restrictions, monitoring, and rate limiting might also help to reduce harms, the company says.

"OpenAI is committed to the safe and responsible deployment of AI for the benefit of all of humanity. Teams at OpenAI are dedicated full-time to analyzing AI models and deploying AI models safely, and their work is incorporated in every OpenAI project," an OpenAI spokesperson told VentureBeat via email. "OpenAI is taking a multi-prong approach, in partnership with GitHub, to reduce the risk of misuse. This includes limiting the frequency of requests on GitHub Copilot users (rate limits), to prevent automated usage that may be malicious. OpenAI is also in the process of updating their safety tools and policies as they prepare to make Codex available through the OpenAI API, and they expect to learn even more from the launch of GitHub Copilot."