OpenAI is rolling out GPT-5-Codex, a new, fine-tuned version of its GPT-5 model designed specifically for software engineering tasks in its AI-powered coding assistant, Codex.

The release is part of a broader set of product updates that OpenAI says will help unify its Codex AI assistant across developers' terminals, integrated development environments (IDEs), GitHub, web, and the ChatGPT app.

As of today, September 15, GPT-5-Codex becomes the default engine for cloud tasks and code review, with developers able to use it locally through the Codex command line interface (CLI) and IDE extension.

The company said it plans to add the model to the dozens already available in its application programming interface (API) at an unspecified future date, which would make it easier for developers to build their own applications atop it or pipe it into existing ones.

Codex has steadily evolved since the company first introduced its CLI in April and web version in May.

Two weeks ago, OpenAI consolidated these into a single product experience linked to ChatGPT accounts, enabling developers to switch seamlessly between local and cloud environments without losing context.

GPT-5-Codex now extends this unified setup with deeper engineering capabilities, even running "independently for more than 7 hours at a time on large, complex tasks," according to the company.

What GPT-5-Codex Brings to the Table

Unlike GPT-5, which is built as a general-purpose AI model, GPT-5-Codex is optimized for what OpenAI calls "agentic coding": essentially, the AI agent functions as an autonomous colleague to a developer, able to go off and write code, run it, and make alterations all on its own for hours at a time.

It sounds similar to, though distinct from, the "agentic swarm" trend that VentureBeat founder Matt Marshall wrote about last week, in which human users harness multiple agents with defined roles (three or more) to collaborate on software writing and testing projects all on their own.

As for GPT-5-Codex, OpenAI says it was trained on real-world engineering work, from debugging and adding tests to building projects from scratch and executing large-scale refactors.

The model is also trained specifically for code review, where it can navigate codebases, reason over dependencies, and run tests to validate correctness.

One of the model’s key design features is adaptive thinking: that is, GPT-5-Codex adjusts how long it spends reasoning based on the complexity of the task.

For short interactions, such as bug fixes or quick Q&A, it responds more quickly than earlier versions. For more demanding projects, it is able to run independently for extended periods — in testing, it successfully executed refactor jobs lasting more than seven hours, iterating until it delivered a working implementation.

But don't get it twisted: GPT-5-Codex does not rely on a router to switch between sub-models. Instead, the model itself dynamically decides how much "thinking" to do, a behavior learned during training rather than handled by external routing.

This lets it adjust mid-task if needed, for example revising a plan halfway through execution. For developers, that translates into more consistent handling of both short and long tasks without manual tweaking.
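OpenAI has not published how this adaptive-thinking mechanism works internally, but the idea it describes can be illustrated with a toy sketch: a single system that scales its own reasoning budget with task complexity instead of routing between separate models. Everything below (function names, thresholds, the notion of "steps") is hypothetical and purely illustrative, not OpenAI's implementation.

```python
# Purely illustrative sketch of adaptive thinking: one model-like routine
# that picks its own reasoning budget per task. All names and thresholds
# here are invented for illustration; they do not describe GPT-5-Codex.

def reasoning_budget(task_tokens: int, files_touched: int) -> int:
    """Scale the number of 'thinking' steps with rough task complexity."""
    # Small interactions (quick Q&A, one-line bug fixes) get a minimal budget,
    # so the response comes back fast.
    if task_tokens < 200 and files_touched <= 1:
        return 1
    # Larger refactors earn proportionally more reasoning steps,
    # capped so a single task cannot run unbounded.
    return min(64, 4 + task_tokens // 500 + 2 * files_touched)

def run_task(task_tokens: int, files_touched: int) -> list[str]:
    """Iterate reasoning steps; a real system could revise the budget mid-task."""
    budget = reasoning_budget(task_tokens, files_touched)
    steps = []
    for i in range(budget):
        steps.append(f"step {i + 1}")
        # Mid-task revision (e.g., the task turns out larger than expected)
        # would extend the budget here; omitted for brevity.
    return steps

print(len(run_task(100, 1)))    # quick bug fix: 1 step
print(len(run_task(5000, 12)))  # large refactor: 38 steps
```

The point of the sketch is only the shape of the behavior: one decision-maker, variable effort, no hand-off between sub-models.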

In daily use, OpenAI says engineers have relied on Codex for reviewing pull requests, debugging, adding new features, and refactoring existing code. The company reports that GPT-5-Codex has already been integrated into internal review processes, catching hundreds of issues each day.

Prior to this announcement, Codex was already running on GPT-5, the general-purpose model, for about the past month.

With today’s launch, GPT-5-Codex becomes available as a specialized variant trained specifically for coding tasks inside Codex. This marks a shift from a general AI model powering Codex to one purpose-built for agentic software engineering.

Updates Across Codex

Alongside the new model, OpenAI is releasing enhancements to the broader Codex product:

  • Codex CLI: Now rebuilt around agentic workflows, the CLI allows developers to attach images like wireframes or screenshots directly, share context, and track progress with built-in to-do lists. It includes simplified approval modes, improved formatting for tool calls and diffs, and support for longer sessions. OpenAI has open-sourced the CLI, which means teams can inspect or extend the harness that connects the model to developer environments. The company sees this as key to helping other organizations align product development with research improvements.

  • Codex IDE extension: Available in VS Code, Cursor, and similar forks, the extension enables developers to preview local changes and edit code with Codex inside their editor. It maintains context when moving tasks between cloud and local environments, allowing smoother collaboration. According to OpenAI, the IDE extension has quickly become the most popular way to use Codex, reflecting developers’ preference for working directly alongside their code.

  • Codex cloud environment: Behind the scenes, OpenAI has made infrastructure improvements that reduce median completion times by 90 percent. The cloud agent can now auto-configure its environment, install dependencies as needed, and handle screenshots for UI work. Developers can delegate frontend tasks, and Codex will spin up a browser, inspect progress, and attach screenshots to GitHub pull requests. A new feature launching with GPT-5-Codex also enables Codex to output its own screenshots directly within web tasks, giving developers visual confirmation of results without needing to run code locally.

  • Code review automation: Once enabled in a GitHub repository, Codex automatically reviews pull requests as they progress, flagging issues and even implementing suggested fixes within the same review thread. Developers can also request specialized reviews, such as checks for security vulnerabilities. Codex’s review process differs from static analyzers in that it takes a full copy of the codebase, reasons across dependencies, and runs tests before posting feedback — a level of diligence that even thorough human reviewers may not always apply.
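Because the CLI harness is open source, its configuration is inspectable. As a rough sketch, selecting the new model and one of the simplified approval modes might look like the following in the CLI's `~/.codex/config.toml`; the key names and values are assumptions based on the open-source harness and may differ between versions, so verify them against the CLI's own documentation:

```toml
# ~/.codex/config.toml — hypothetical sketch; check key names against
# the open-source Codex CLI docs before relying on them.

# Use the coding-specialized model locally instead of general-purpose GPT-5.
model = "gpt-5-codex"

# Simplified approval behavior: ask before actions the CLI considers risky,
# rather than prompting on every step.
approval_policy = "on-request"

# Keep the sandbox scoped to the current workspace by default.
sandbox_mode = "workspace-write"
```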

Developer Experiences

Early customers highlight how Codex is reshaping their workflows. Aaron Wang, Senior Software Engineer at Duolingo, noted that it outperformed other tools in backend code review benchmarks, catching compatibility issues and bugs that others missed.

Richard Masters, VP Data & AI at Virgin Atlantic, described how his team can now leave a simple comment in a pull request and Codex will generate a clean update almost instantly, reducing back-and-forth cycles.

At Cisco Meraki, Tech Lead Tres Wong-Godfrey used Codex to handle a complex refactor for another team’s codebase, freeing him to focus on other priorities while Codex delivered tested, high-quality code.

Kevin Royer, Senior AI/ML Engineer at Vanta, said Codex helps him maintain flow by taking on tasks in the background and returning clean diffs.

Austin Ray, who leads the AI Dev X team at Ramp, reported that Codex caught an active bug other review tools had missed.

Internally, OpenAI teams use Codex across domains like security, frontend, and infrastructure, treating it as a teammate for daily development. Usage of Codex has grown tenfold in the past month, reflecting both external adoption and heavier reliance inside the company itself.

Safety and Controls

OpenAI emphasizes that Codex runs in a sandboxed environment with network access disabled by default, whether in the cloud or locally.

Developers can grant permissions for additional capabilities, such as web search or external tool connections, but are encouraged to review Codex’s output before deploying.

Each task comes with logs, test results, and citations to help ensure accuracy. The company also classifies GPT-5-Codex as "high capability" in sensitive domains such as biology and chemistry, and has implemented safeguards to minimize associated risks.

Procurement teams may note that Codex is not positioned as a replacement for human review but as an additional layer of diligence. OpenAI recommends keeping Codex as a reviewer alongside human engineers, particularly for critical production systems.

Pricing and Availability

Codex is already bundled with OpenAI's paid ChatGPT subscriptions: Plus ($20 per month), Pro ($200 per month), and Business ($25 per user per month), as well as Edu and Enterprise plans (no public pricing; interested parties need to contact OpenAI).

Usage scales by tier: Plus, Edu, and Business plans cover a limited number of focused sessions with the new model per week; Pro supports a full week of development across multiple projects; Business offers add-on credits for expanded use; and Enterprise provides a shared credit pool for large teams.

For Plus users, limits are structured around a handful of coding sessions per week, while Pro users should rarely encounter limits in normal use.

OpenAI has acknowledged developer feedback on wanting better visibility into usage caps and is considering more transparent indicators in future updates.

With GPT-5-Codex, OpenAI positions Codex as more than just an autocomplete or debugging tool. By adapting to tasks of different sizes, working autonomously when needed, and integrating tightly into developer workflows, the company aims to provide a consistent coding partner across environments.

Importantly, OpenAI does not intend for Codex to compete with tab-completion tools like GitHub Copilot. Instead, the company views Codex as complementary, with many developers using both — one for keystroke-level acceleration, the other for higher-level task delegation.