As AI agents like Auto-GPT speed up generative AI race, we all need to buckle up | The AI Beat

If you thought the pace of AI development had sped up since the release of ChatGPT last November, well, buckle up. Thanks to the rise of autonomous AI agents like Auto-GPT, BabyAGI and AgentGPT over the past few weeks, the race to get ahead in AI is just getting faster. And, many experts say, more concerning.

It all started in late March, when developer Toran Bruce Richards, under the name @significantgravitas, launched Auto-GPT, an "experimental open-source application" connected to OpenAI's GPT-4 by API. Running on Python, Auto-GPT had internet access, long/short-term memory and, by stringing together GPT calls in loops, could act autonomously without requiring a human agent to prompt every action. With just a goal in mind — such as preparing a podcast — it could research information online, for example, and then without being prompted take further action towards the goal, like preparing a list of topics and titles.
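
To make the idea of "stringing together GPT calls in loops" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Auto-GPT's actual code: `fake_llm`, `run_agent` and the text format of the actions are hypothetical names invented for this example, and the scripted stub stands in for a real GPT-4 API call.

```python
from typing import Callable, List


def fake_llm(goal: str, memory: List[str]) -> str:
    """Hypothetical stand-in for a GPT-4 API call; returns the next action as text."""
    script = ["search: podcast topics", "write: list of topics and titles", "done"]
    # Choose the next scripted action from how many steps are already in memory.
    return script[min(len(memory), len(script) - 1)]


def run_agent(goal: str, llm: Callable[[str, List[str]], str], max_steps: int = 10) -> List[str]:
    """Call the model in a loop, feeding its past actions back in, until it answers 'done'."""
    memory: List[str] = []
    for _ in range(max_steps):  # hard cap so the loop cannot run forever
        action = llm(goal, memory)
        if action == "done":
            break
        # A real agent would carry out the action here and store its result.
        memory.append(action)
    return memory


history = run_agent("prepare a podcast", fake_llm)
print(history)
```

The human supplies only the goal; every later step is chosen by the model from its own history, which is why a step cap matters in this sketch.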

Then, on March 26, a tweet by Yohei Nakajima went viral, garnering over a million views.

A few days later, Nakajima launched BabyAGI, a "task-driven autonomous agent" that leverages GPT-4, Pinecone's vector search, and LangChainAI's framework to "autonomously create and perform tasks based on an objective," such as planning and automatically executing a campaign to grow your Twitter following or creating and running a small content marketing business.
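
The phrase "create and perform tasks based on an objective" can be pictured as a task queue that feeds itself. The Python sketch below is only a rough illustration, not BabyAGI's code: `execute_task`, `create_tasks` and `run_task_loop` are hypothetical names, both model calls are replaced with stubs, and it leaves out vector search and any framework entirely.

```python
from collections import deque
from typing import Deque, List, Tuple


def execute_task(objective: str, task: str) -> str:
    """Hypothetical stub for a model call that performs one task."""
    return f"result of '{task}'"


def create_tasks(objective: str, task: str, result: str) -> List[str]:
    """Hypothetical stub for a model call that proposes follow-up tasks."""
    follow_ups = {"draft a plan": ["write three tweets", "schedule the tweets"]}
    return follow_ups.get(task, [])


def run_task_loop(objective: str, first_task: str, max_tasks: int = 10) -> List[Tuple[str, str]]:
    """Pop a task, perform it, queue any new tasks it suggests, and repeat."""
    queue: Deque[str] = deque([first_task])
    completed: List[Tuple[str, str]] = []
    while queue and len(completed) < max_tasks:  # cap the total work done
        task = queue.popleft()
        result = execute_task(objective, task)
        completed.append((task, result))
        queue.extend(create_tasks(objective, task, result))
    return completed


done = run_task_loop("grow a Twitter following", "draft a plan")
for task, result in done:
    print(task, "->", result)
```

Starting from a single seed task, the loop keeps going for as long as finished tasks generate new ones, so the `max_tasks` cap is the only brake in this sketch.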

Fast-forward a couple of weeks, and now Auto-GPT has more GitHub stars than PyTorch (82K at the moment) and is the "fastest growing GitHub repo in history, eclipsing decade old open source projects in 2 weeks." Fortune says BabyAGI is "taking Silicon Valley by storm," and OpenAI's Andrej Karpathy, who was formerly director of AI at Tesla, called the tools the "next frontier of prompt engineering."

Are AI agents a game-changer?

Jay Scambler, an Oklahoma City-based consultant and strategist building AI tools for small businesses and creatives, told me last week by Twitter message that the tools feel like a game-changer. “I don’t mean to sound dramatic, but we now have the power and responsibility of managing a coordinated team of AI agents at our fingertips without much effort,” he said. “This team doesn’t have fatigue, executes code *almost* flawlessly (depending on who you ask), and can find answers to almost any problem using tools like LangChain.”

Others aren't as optimistic. Nvidia AI scientist Jim Fan tweeted: "I see AutoGPT as a fun experiment, as the authors point out too. But nothing more. Prototypes are not meant to be production-ready. Don't let media fool you --- most of the 'cool demos' are heavily cherry-picked."

Either way, there is a catch: at the moment both Auto-GPT and BabyAGI require developer skills and are not accessible to the average ChatGPT user. Even Nicola Bianzino, chief technology officer at EY, who told me in an interview that Auto-GPT is "fascinating," admitted that he doesn't yet understand the details of how it actually works. This is moving so quickly, he explained, that there are already a host of versions built on top of the original. "I don't personally know the different variations that are out there in the wild," he said.

Serious concerns about AI agents in the wild

While the AI agents are "profound," there are also serious concerns. Daniel Jeffries, former chief information officer at Stability AI and managing director of the AI Infrastructure Alliance, told me last week that "the challenge becomes that we don't really know what an error looks like. Currently Auto-GPT fails 15-30% of the time in reasoning. I think we get less tolerant of errors as they become more autonomous."

And even though the current use cases are limited, as Fortune's article pointed out, there are other risks coming down the pike — including the AI agent's continuous chains of prompts quickly running up substantial bills with OpenAI; the possibility of malicious use cases like cyberattacks and fraud; and the danger of autonomous bots taking action in ways the user didn't intend, including buying items, making appointments or even selling stock.

More AI agent tools are quickly being developed

That doesn't seem to be slowing down the race to develop AI agent tools, however. Last week, for example, HyperWrite, a startup known for its generative AI writing extension, unveiled an experimental AI agent that can browse the web and interact with websites much like a human user.

Matt Shumer, CEO of HyperWrite, said his team is very focused on issues of safety. “We want to figure out the right way to do it, and that’s sort of the common theme through all this, we’re taking our time to do this the right way,” he said.

I also had a chance to speak to the developers behind AgentGPT, a browser-based AI agent launched on April 8 that is easier for non-technical users to access.

The trio of developers, who all have day jobs, had been working on autonomous agents in their spare time, with an eye toward use cases for internal tooling. When they saw the explosive popularity of Auto-GPT and BabyAGI, they decided to push out their project and get some feedback. In just nine days, AgentGPT has gained more than 14,000 stars on GitHub and over 280,000 users, and its developers are sleeping only a couple of hours a night trying to keep it all going.

"It's been pretty crazy," said Srijan Subedi, one of AgentGPT's founders. "We've been doing like 2X every day."

While AgentGPT doesn't yet have web browsing capabilities, the developers say that is something they will implement within a week or two. But the larger vision behind AgentGPT, they say, is to go beyond web browsing and integrate the AI agent with other tools, such as Slack, email and even Facebook.

One of AgentGPT's other founders, Adam Watkins, has been helping develop the tool while backpacking in Europe, and says he's been using it to build his own travel itinerary. But he emphasized there are clear limits to what it can do.

"Because this is just a demo version, it doesn't have access to other tools or other platforms," he said. "As they [get] access to these we're going to be paying close attention to exactly what they can do within these systems and providing guardrails to ensure that the actions that they take aren't going to be harmful. One big thing is allowing not only just a log of everything they're doing but keeping humans in the loop, so as you're about to perform actions, you'll be able to take a look and confirm whether or not that's something you really want to do."

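The two guardrails Watkins describes, a log of everything the agent does and a human confirming each action, can be sketched in a few lines of Python. This is a hedged illustration only, not AgentGPT's implementation: `run_with_approval` and `cautious_reviewer` are hypothetical names, and the reviewer is a hard-coded rule standing in for a real person clicking "confirm."

```python
from typing import Callable, List


def run_with_approval(actions: List[str], approve: Callable[[str], bool]) -> List[str]:
    """Ask for approval before each proposed action and log every decision."""
    log: List[str] = []
    for action in actions:
        if approve(action):
            # A real agent would perform the action here, after approval.
            log.append(f"APPROVED: {action}")
        else:
            log.append(f"SKIPPED: {action}")
    return log


def cautious_reviewer(action: str) -> bool:
    """Hypothetical approval rule; an interactive tool could prompt the user instead."""
    return not action.startswith("buy")


audit_log = run_with_approval(["search flights to Rome", "buy ticket"], cautious_reviewer)
print(audit_log)
```

In this sketch nothing happens without a recorded decision, so the log doubles as an audit trail of what the agent proposed and what the human allowed.
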
Are AI agents just hype and hustle?

Some are saying that the new focus on AI agents is just another example of "hustle bros," with hyperbolic claims by "get-rich schemers" looking to play off the excitement around the potential of these tools.

That may be true — but to me, it seems like the pace of AI development in this space is real. That means it's worth keeping a close eye on, especially as the risks and dangers become crystal clear. It may be impossible to fully keep up with what's going on right now — but with developers starting to run Auto-GPT on their phones, I think we all need to buckle up for a fast ride.