GitHub Copilot has been the subject of some controversy since Microsoft announced it in the summer of 2021. Most recently, Microsoft has been sued by programmer and lawyer Matthew Butterick, who alleges that GitHub’s Copilot violates the terms of open-source licenses and infringes the rights of programmers. Despite the lawsuit, my sense is that Copilot is likely here to stay in some form or another. That got me thinking: if developers are going to use an AI-assisted code generation tool, it would be more productive to think about how to improve it than to fight over its right to exist.
Behind the Copilot controversy
Copilot is a predictive code generator that relies on OpenAI Codex to suggest code, and even entire functions, as coders compose their own code. It is much like the predictive text seen in Google Docs or Google Search. As you begin to compose a line of original code, Copilot suggests code to complete the line or fragment, based on patterns learned from a large body of existing code. You can choose to accept the suggestion or override it with your own, potentially saving time and effort.
The controversy comes from Copilot deriving its suggestions from a vast training set of open-source code that it has processed. The idea of monetizing the work of open-source software contributors without attribution has irked many in the GitHub community. It has even resulted in a call for the open-source community to abandon GitHub.
There are valid arguments on both sides of this controversy. The developers who freely shared their original ideas likely did not intend them to end up packaged and monetized. On the other hand, it could be argued that what Microsoft has monetized is not the code but the AI technology for applying that code in a suitable context. Anyone with a free GitHub account can access the code, copy it and use it in their own projects, without attribution. In this regard, Microsoft isn’t using the code any differently from how it has been used all along.
Taking Copilot to the next level
As someone who has used Copilot and observed how it saves time and increases productivity, I see an opportunity for Microsoft to improve Copilot and address some of the complaints coming from its detractors.
What would enhance the next generation of Copilot is a greater sense of context for its suggestions. To make usable recommendations, Copilot could base them on more than a simple GitHub search. The suggestions could work in the specific context of the code being written. There must be some significant AI technology at work behind the suggestions. This is both the unique value of Copilot and the key to improving it.
Programmers want to know where the suggestions come from before accepting them, and to be confident that the code fits their specific purposes. The last thing we want is to use suggested code that works well enough to compile and run, but is inefficient or, worse, prone to failure or security risks.
By providing more context to its Copilot suggestions, Microsoft could give the coder the confidence to accept them. It would be great to see Microsoft offer a peek into the origin of the suggested code. A trail back to the original source — including some attribution — would achieve this, and also share some of the credit that is due. Just knowing there is a window into the original open-source repository could bring some calm to the open-source community, and would also help Copilot users make better coding decisions as they work. I was pleased to see Microsoft reaching out to the community recently to understand how to build trust in AI-assisted tooling, and I am looking forward to seeing the results of that effort.
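To make the idea of a trail back to the original source more concrete, here is a minimal sketch of what a suggestion carrying its own provenance might look like. It is purely illustrative and does not describe how Copilot works today: the names `SourceRef`, `Suggestion`, `attribution_lines` and `needs_review`, the similarity score and the 0.8 threshold are all hypothetical, and the repository names are made up.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical pointer back to public code that resembles a suggestion."""
    repository: str    # made-up example: "example-org/example-repo"
    path: str          # file within that repository
    license: str       # SPDX-style identifier such as "MIT"
    similarity: float  # assumed score from 0.0 to 1.0


@dataclass
class Suggestion:
    """A suggested snippet plus the sources it closely resembles (if any)."""
    code: str
    sources: list[SourceRef] = field(default_factory=list)


def attribution_lines(suggestion: Suggestion, min_similarity: float = 0.8) -> list[str]:
    """Return credit lines for close matches, best match first."""
    close = [s for s in suggestion.sources if s.similarity >= min_similarity]
    close.sort(key=lambda s: s.similarity, reverse=True)
    return [
        f"{s.repository}/{s.path} ({s.license}, {s.similarity:.0%} match)"
        for s in close
    ]


def needs_review(suggestion: Suggestion, allowed_licenses: set[str],
                 min_similarity: float = 0.8) -> bool:
    """Flag a suggestion that closely matches code under a license not on the allow list."""
    return any(
        s.similarity >= min_similarity and s.license not in allowed_licenses
        for s in suggestion.sources
    )
```

With something along these lines, an editor could show the credit lines next to a suggestion and warn the coder before they accept code whose closest match carries a license their project does not allow.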
As I said, it is hard to imagine that GitHub Copilot is going to go away merely because a portion of its community is upset with Microsoft’s repackaging of their work behind a paywall. But Microsoft would have everything to gain by extending a digital olive branch to the open-source community — while at the same time improving its product’s effectiveness.
Coty Rosenblath is CTO at Katalon.