Artificial intelligence is rapidly reshaping software development, with many programming tasks now handled by networks of AI agents. As developers explore new interfaces for human-AI collaboration, even the most advanced AI labs find it challenging to keep pace.
Currently, the focus is on agentic software development, in which AI agents autonomously tackle coding assignments. The trend is exemplified by applications like Claude Code and Cowork. Meanwhile, OpenAI has been enhancing its Codex tool, which debuted as a command-line utility last April and expanded to a web interface shortly thereafter.
In a significant move, OpenAI has launched a new macOS application for Codex that incorporates many of the agentic features that have gained traction over the past year. The application is designed to let multiple agents work simultaneously and to integrate advanced workflows that enhance productivity. The launch follows closely on the introduction of GPT-5.2-Codex, OpenAI's most robust coding model to date, with which the company hopes to attract users away from Claude Code.
"For complex tasks, GPT-5.2 is the most powerful model available," stated CEO Sam Altman during a press briefing. "However, its complexity has made it less user-friendly, so we're excited to offer this capability in a more accessible format."
While Altman's enthusiasm for GPT-5.2 is evident, coding benchmarks present a more nuanced picture. GPT-5.2 currently ranks first on TerminalBench, a benchmark that evaluates AI performance on command-line programming tasks. However, competing models such as Gemini 3 and Claude Opus have achieved comparable results, indicating a close race. Results on SWE-bench, which assesses an AI's ability to resolve real-world software bugs, likewise show no distinct advantage for GPT-5.2. That said, agentic applications remain difficult to measure effectively, and user experiences can vary widely among state-of-the-art models.
The Codex application introduces a variety of new functionalities that OpenAI claims will help it match or even surpass the Claude applications. It allows users to schedule automations that run in the background, with results queued for review upon the user's return. Additionally, users can choose different agent personalities, ranging from pragmatic to empathetic, aligning with their preferred working styles.
Ultimately, the standout feature for OpenAI is the accelerated development pace facilitated by AI. "You can start from scratch and create a sophisticated software solution in just a few hours," Altman remarked. "The speed at which ideas can be transformed into reality is unprecedented."