Microsoft, GPT-3, and the future of OpenAI

One of the biggest highlights of Build, Microsoft’s annual software development conference, was the presentation of a tool that uses deep learning to generate source code for office applications. The tool uses GPT-3, a massive language model developed by OpenAI last year and made available to select developers, researchers, and startups through a paid application programming interface (API).

Many have touted GPT-3 as the next-generation artificial intelligence technology that will usher in a new breed of applications and startups. Since GPT-3’s release, many developers have found interesting and innovative uses for the language model. And several startups have announced that they will use GPT-3 to build new products or augment existing ones. But creating a profitable and sustainable business around GPT-3 remains a challenge.

Microsoft’s first GPT-3-powered product provides important hints about the business of large language models and the future of the tech giant’s deepening relationship with OpenAI.

A few-shot learning model that must be fine-tuned?

Microsoft uses GPT-3 to translate natural language commands to data queries

According to the Microsoft Blog, “For instance, the new AI-powered features will allow an employee building an e-commerce app to describe a programming goal using conversational language like ‘find products where the name starts with “kids.”’ A fine-tuned GPT-3 model [emphasis mine] then offers choices for transforming the command into a Microsoft Power Fx formula, the open source programming language of the Power Platform.”

I didn’t find technical details on the fine-tuned version of GPT-3 Microsoft used. But there are generally two reasons you would fine-tune a deep learning model. In the first case, the model doesn’t perform the target task with the desired precision, so you need to fine-tune it by training it on examples for that specific task.

In the second case, your model can perform the intended task, but it is computationally inefficient. GPT-3 is a very large deep learning model with 175 billion parameters, and the costs of running it are huge. Therefore, a smaller version of the model can be optimized to perform the code-generation task with the same accuracy at a fraction of the computational cost. A possible tradeoff will be that the model will perform poorly on other tasks (such as question-answering). But in Microsoft’s case, the penalty will be irrelevant.
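To get a rough sense of scale, here is a back-of-envelope sketch of the memory needed just to hold a model's weights. It assumes 2 bytes per parameter (16-bit floats), which is my assumption and not a detail Microsoft or OpenAI has shared, and the 1-billion-parameter model is a hypothetical point of comparison. It ignores activations, batching, and everything else a real deployment needs.

```python
# Back-of-envelope estimate of the memory needed to store model weights.
# Assumption: 2 bytes per parameter (16-bit floats); real deployments vary.

def weight_memory_gb(num_parameters: int, bytes_per_parameter: int = 2) -> float:
    """Return the size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 10**9

GPT3_PARAMETERS = 175_000_000_000       # figure cited in this article
SMALL_MODEL_PARAMETERS = 1_000_000_000  # hypothetical smaller model

gpt3_gb = weight_memory_gb(GPT3_PARAMETERS)
small_gb = weight_memory_gb(SMALL_MODEL_PARAMETERS)

print(f"GPT-3 weights: ~{gpt3_gb:.0f} GB")
print(f"1B-parameter model weights: ~{small_gb:.0f} GB")
```

Under these assumptions, the full model's weights alone would not fit on a single accelerator, which is one reason a smaller task-specific model is attractive.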

In either case, a fine-tuned version of the deep learning model seems to be at odds with the original idea discussed in the GPT-3 paper, aptly titled, “Language Models are Few-Shot Learners.”

Here’s a quote from the paper’s abstract: “Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art finetuning approaches.” This basically means that, if you build a large enough language model, you will be able to perform many tasks without the need to reconfigure or modify your neural network.
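In practice, few-shot use means placing a handful of worked examples in the prompt and letting the model continue the pattern, with no change to its weights. The sketch below only builds such a prompt as a string. The example requests, the table and column names, and the prompt layout are all hypothetical illustrations of mine, not Microsoft's actual prompts or data, and no API call is made.

```python
# Illustrative few-shot prompt builder. The example pairs, table names and
# column names are hypothetical; they are not taken from Microsoft's product.

EXAMPLES = [
    ("show orders where the customer is Contoso",
     'Filter(Orders, Customer = "Contoso")'),
    ("find products where the name starts with kids",
     'Filter(Products, StartsWith(Name, "kids"))'),
]

def build_few_shot_prompt(examples, query):
    """Concatenate solved examples and leave the last formula blank."""
    lines = ["Translate the request into a Power Fx formula.", ""]
    for request, formula in examples:
        lines.append(f"Request: {request}")
        lines.append(f"Formula: {formula}")
        lines.append("")
    # The model is expected to complete the text after the final "Formula:".
    lines.append(f"Request: {query}")
    lines.append("Formula:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(EXAMPLES, "list products that cost more than 20")
print(prompt)
```

Fine-tuning, by contrast, would use many such request-formula pairs as training data to update the model itself, so the examples no longer have to be sent with every request.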

So, what’s the point of a few-shot machine learning model that must be fine-tuned for new tasks? This is where the worlds of scientific research and applied AI collide.

Academic research vs commercial AI

There’s a clear line between academic research and commercial product development. In academic AI research, the goal is to push the boundaries of science. This is exactly what GPT-3 did. OpenAI’s researchers showed that with enough parameters and training data, a single deep learning model could perform several tasks without the need for retraining. And they have tested the model on several popular natural language processing benchmarks.

But in commercial product development, you’re not competing on benchmarks such as GLUE and SQuAD. You must solve a specific problem, solve it ten times better than the incumbents, and be able to run your solution at scale and in a cost-effective manner.

Therefore, if you have a large and expensive deep learning model that can perform ten different tasks at 90 percent accuracy, it’s a great scientific achievement. But when there are already ten lighter neural networks that perform each of those tasks at 99 percent accuracy and a fraction of the price, then your jack-of-all-trades model will not be able to compete in a profit-driven market.

Here’s an interesting quote from Microsoft’s blog that confirms the challenges of applying GPT-3 to real business problems: “This discovery of GPT-3’s vast capabilities exploded the boundaries of what’s possible in natural language learning, said Eric Boyd, Microsoft corporate vice president for Azure AI. But there were still open questions about whether such a large and complex model could be deployed cost-effectively at scale to meet real-world business needs [emphasis mine].”

And those questions were answered by optimizing the model for that specific task. Since Microsoft wanted to solve a very specific problem, the full GPT-3 model would be overkill that would waste expensive resources.

Therefore, the plain vanilla GPT-3 is more of a scientific achievement than a reliable platform for product development. But with the right resources and configuration, it can become a valuable tool for market differentiation, which is what Microsoft is doing.

Microsoft’s advantage

In an ideal world, OpenAI would have released its own products and generated revenue to fund its own research. But the truth is, developing a profitable product is much more difficult than releasing a paid API service, even if your company’s CEO is Sam Altman, the former president of Y Combinator and a product development legend.

And this is why OpenAI enlisted the help of Microsoft, a decision that will have long-term implications for the AI research lab. In July 2019, Microsoft made a $1 billion investment in OpenAI, with some strings attached.

From the OpenAI blog post that announced the Microsoft investment: “OpenAI is producing a sequence of increasingly powerful AI technologies, which requires a lot of capital for computational power. The most obvious way to cover costs is to build a product, but that would mean changing our focus [emphasis mine]. Instead, we intend to license some of our pre-AGI technologies, with Microsoft becoming our preferred partner for commercializing them.”

Alone, OpenAI would have a hard time finding a way to enter an existing market or create a new market for GPT-3.

On the other hand, Microsoft already has the pieces required to shortcut OpenAI’s path to profitability. Microsoft owns Azure, the second-largest cloud infrastructure, and it is in a suitable position to subsidize the costs of training and running OpenAI’s deep learning models.

But more important, and this is why I think OpenAI chose Microsoft over Amazon, is Microsoft’s reach across different industries. Thousands of organizations and millions of users are using Microsoft’s paid applications such as Office, Teams, Dynamics, and Power Apps. These applications provide perfect platforms for integrating GPT-3.

Microsoft’s market advantage is fully evident in its first application for GPT-3. It is a very simple use case targeted at a non-technical audience. It’s not supposed to do complicated programming logic. It just converts natural language queries into data formulas in Power Fx.

This trivial application is irrelevant to most seasoned developers, who will find it much easier to directly type their queries than describe them in prose. But Microsoft has plenty of customers in non-tech industries, and its Power Apps are built for users who don’t have any coding experience or are learning to code. For them, GPT-3 can make a huge difference and help lower the barrier to developing simple applications that solve business problems.

Microsoft has another factor working to its advantage. It has secured exclusive access to the code and architecture of GPT-3. While other companies can only interact with GPT-3 through the paid API, Microsoft can customize it and integrate it directly into its applications to make it efficient and scalable.

By making the GPT-3 API available to startups and developers, OpenAI created an environment to discover all sorts of applications with large language models. Meanwhile, Microsoft was sitting back, observing all the different experiments with growing interest.

The GPT-3 API basically served as a product research project for Microsoft. Whatever use case any company finds for GPT-3, Microsoft will be able to do it faster, cheaper, and with better accuracy thanks to its exclusive access to the language model. This gives Microsoft a unique advantage to dominate most markets that take shape around GPT-3. And this is why I think most companies that are building products on top of the GPT-3 API are doomed to fail.

The OpenAI Startup Fund

Microsoft CEO Satya Nadella (left) and OpenAI CEO Sam Altman (right) at Microsoft Build 2021

And now, Microsoft and OpenAI are taking their partnership to the next level. At the Build conference, Altman announced a $100 million fund, the OpenAI Startup Fund, through which OpenAI will invest in early-stage AI companies.

“We plan to make big early bets on a relatively small number of companies, probably not more than 10,” Altman said in a prerecorded video played at the conference.

What kind of companies will the fund invest in? “We’re looking for startups in fields where AI can have the most profound positive impact, like healthcare, climate change, and education,” Altman said, to which he added, “We’re also excited about markets where AI can drive big leaps in productivity like personal assistance and semantic search.” The first part seems to be in line with OpenAI’s mission to use AI for the betterment of humanity. But the second part seems to be the type of profit-generating applications that Microsoft is exploring.

Also from the fund’s page: “The fund is managed by OpenAI, with investment from Microsoft and other OpenAI partners. In addition to capital, companies in the OpenAI Startup Fund will get early access to future OpenAI systems, support from our team, and credits on Azure.”

So, basically, it seems like OpenAI is becoming a marketing proxy for Microsoft’s Azure cloud and will help spot AI startups that might qualify for acquisition by Microsoft in the future. This will deepen OpenAI’s partnership with Microsoft and make sure the lab continues to get funding from the tech giant. But it will also take OpenAI a step closer toward becoming a commercial entity and eventually a subsidiary of Microsoft. How this will affect the research lab’s long-term goal of scientific research on artificial general intelligence remains an open question.

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics.
