Head over to our on-demand library to view sessions from VB Transform 2023. Register Here
It’s been two years since OpenAI announced the arrival of GPT-3, its seminal natural language processing (NLP) application. Users were blown away by the AI language tool’s uncanny ability to generate insightful, considered and colorful prose — including poems, essays, song lyrics and even detailed manifestos — using the briefest of prompts.
Known as a “foundation model,” OpenAI’s GPT-3 was trained by feeding it practically the entire internet, from Wikipedia to Reddit to the New York Times and everything in between. It uses this vast dataset to predict which words are the most plausible given any prompt. Because such a research undertaking is so vast and costly, only a handful of these foundation models exist. Others include Meta’s RoBERTa and Google’s BERT, plus others developed by startups, such as AI21.
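The “predict the most plausible next word” idea can be illustrated with a toy bigram model. This is a deliberate simplification for intuition only: GPT-3 uses a vastly larger neural network and dataset, and the corpus and function names below are illustrative, not OpenAI’s implementation.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for each word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    words = corpus.lower().split()
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def most_plausible_next(followers, word):
    """Return the word most often observed after `word`, or None."""
    candidates = followers.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# A tiny illustrative corpus; a foundation model trains on billions of words.
corpus = "the cat sat on the mat and the cat slept on the mat"
model = train_bigrams(corpus)
print(most_plausible_next(model, "the"))
```

A foundation model does the same kind of thing in spirit, but conditions on the entire preceding prompt rather than a single word, which is what lets it produce coherent long-form text.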
Almost all commercial AI text generation applications, from blog creation to headline generators, are built on top of one of these foundation models, via an API. The foundation models are the tracks, and the commercial applications are the trains that run along them. And the traffic on these tracks is starting to build, fueled by recent VC investment. This has included an $11 million Series A round for AI copywriting tool Copy.ai, $10 million in seed funding for AI content generator Copysmith, and a $21 million Series A round for AI writing assistant writer.com.
Given the huge number of AI text gen use cases now being catered to in the marketing and communications industries, some of the everyday content that marketing and comms teams produce is now being generated by AI, including ad copy, social media captions and blog posts.
While much of this is fairly prosaic and raises few concerns — if AI can write search ad titles that increase clicks, then so be it — legitimate concerns over the effect of this technology in other uses need to be acknowledged and addressed by those of us developing these platforms.
Scale alone doesn’t equal success
At the heart of the matter for many is the tremendous opportunity for scale that this technology provides, and what the consequences of this could be. When thinking about this, it’s useful to distinguish between short-form content and long-form content. The negative consequences from organizations scaling short-form content creation, such as ad copy or landing page copy, are negligible. Companies can reduce costs and improve conversions, with few downsides. It’s in the realm of longer-form content that issues arise.
At the lowest end of the long-form content food chain, such as travel and lifestyle blogs, there may seem to be little harm in using NLP to plumb vast datasets to generate blogs for SEO. After all, is there any difference between NLP using its dataset to generate a 500-word blog post on the “5 best things to do in Denver” and a human writer researching (and slightly plagiarizing) the first few results on Google? Not much, most likely.
But does the world need more “meh” content, written by someone (or something) with no actual expert knowledge of the subject matter? And who gets any benefit from this apart from the platform that’s being paid to generate it? True, we’ve had this problem already, with content mills churning out articles by writers with zero knowledge of what they’re writing about. But the cost and time constraints of human writers have kept at least a partial lid on this. Removing these constraints could open the floodgates.
Google addressing AI text generation
Google is clearly planning for this, as is evident in some of its recent search updates. It recently announced the rollout of its “Helpful Content” algorithm update, which will devalue content with “low added value,” among other things, in its search rankings. Meanwhile, Google has also gradually been increasing the importance of what it terms E-A-T — Expertise, Authoritativeness and Trustworthiness — when evaluating content. In a nutshell, Google is focusing more on the demonstrable expertise of the author and the authority and trustworthiness of the website.
To put it another way, Google is placing greater emphasis on content written by proven subject matter experts, which offers unique insights, findings or other added value published on websites with demonstrably robust editorial policies. Since NLP tools generate content based on what’s already been written about a topic, they’ll struggle to provide anything unique. So while it’s never been easier to generate a blog post on literally any topic, the bar has never been higher for getting a blog to the first page of Google.
Marketers must understand this new paradigm and use AI text gen tools where they add value and avoid using them in areas where they don’t — for example, to generate optimized ad and sales copy, or content briefs, rather than relying on them to generate longer-form technical blog content.
Automation is not a replacement for expertise
Another area of communications where we’re seeing rapid growth of AI text gen tools is PR, with a number of NLP-powered media pitching platforms now on the market. These range from light personalization based on the target recipient’s LinkedIn profile to full-on pitch writing on behalf of the company.
Yet, I speak from direct experience when I say that it’s vital that platform developers and users fully appreciate the problem these platforms have set out to solve and the part of the process where there is no replacement for human expertise. The role of these platforms is essentially to remove friction between an organization or an individual and the media, helping users to send pitches to journalists with more speed and accuracy. Or in other words, to act as a door opener.
Users must still be subject matter experts on the topics they are pitching to journalists, and they must be offering valuable insights in their pitches, rather than simply spamming reporters. Platform developers have some control over the latter, by setting sensible limits on how often a reporter can be contacted, while the former is the responsibility of the users.
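A sensible contact limit of the kind described above can be sketched as a simple rolling-window rate limiter. The class name, the cap of two pitches per week and the identifiers below are hypothetical choices for illustration, not any real platform’s policy.

```python
import time
from collections import defaultdict, deque


class ContactLimiter:
    """Hypothetical guardrail: cap how often any single reporter
    can be pitched within a rolling time window."""

    def __init__(self, max_contacts=2, window_seconds=7 * 24 * 3600):
        self.max_contacts = max_contacts
        self.window = window_seconds
        self.history = defaultdict(deque)  # reporter -> pitch timestamps

    def allow(self, reporter, now=None):
        """Record and permit a pitch only if the reporter has not
        already hit the cap within the rolling window."""
        now = time.time() if now is None else now
        timestamps = self.history[reporter]
        # Discard pitches that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        if len(timestamps) >= self.max_contacts:
            return False
        timestamps.append(now)
        return True
```

A pitching platform would call `allow()` before sending each pitch, silently queuing or rejecting anything over the cap; the right cap and window are product decisions, not technical ones.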
And platform developers must educate users on the responsible use of these platforms. There is only negative value in connecting a journalist with a user if that user is not qualified to talk about the topic at hand. They’ll quickly be found out, and they’ll damage their personal and professional reputations within media circles.
A case for the human editor
If language models were ever to achieve true sentience, one of their defining personality traits would be that of a pathological liar. As anyone who experiments with one of these models will quickly realize, they make it up as they go along, generating whatever is the most plausible response to a given prompt.
It’s not difficult, therefore, to envision some of the problems of AI-generated text being left unchecked across numerous use cases. For example, content providing inaccurate financial or health advice could lead to serious harm.
This is why the role of the human editor will need to grow in importance, as an ever greater amount of content is AI-generated. Editors will be crucial within both media outlets and regular organizations. What’s more, the role of the editor will need to evolve to focus more on fact-checking and verification. A large focus of an editor’s job will also be on training AI models to the desired tone of voice, technical level and narrative style, in much the same way they train human writers.
Ultimately, we’re still only at the very beginning of the adoption curve of AI text generation technology. It’s all but inevitable that usage will become mainstream throughout this decade, given the technology’s enormous ability to scale and achieve efficiencies. What we can expect to see, in reaction to this, is even more value placed on genuine subject matter experts, who are providing original and unique content. And those who can leverage AI text gen tools in the right use cases, to remove friction and to scale, will benefit the most.
Welcome to the VentureBeat community!
DataDecisionMakers is where experts, including the technical people doing data work, can share data-related insights and innovation.