Codex, Operator and Deep Research: What can the 3 AI agents on ChatGPT do?

OpenAI released its latest AI agent, Codex, on Friday, marking the third such release by the company this year. While Deep Research and Operator are aimed at a wider audience, Codex is designed to take on many of the tasks of a software engineer and to help users with little coding experience build new tools. Here’s a look at what the 3 AI agents from OpenAI offer.

1) Codex:

Codex is a software engineering agent on ChatGPT that can work on multiple coding-related tasks at the same time. As per OpenAI, the new tool is capable of writing features, fixing bugs and answering questions about the user’s codebase, with each task running in its own sandbox (a private coding environment).

Codex is powered by a version of OpenAI’s latest reasoning model, o3, that was optimised for software engineering tasks. OpenAI says this model was trained using reinforcement learning on real-world coding tasks in a variety of environments to generate code that “closely mirrors human style and PR preferences, adheres precisely to instructions, and can iteratively run tests until it receives a passing result”.

2) Operator:

Operator is powered by the Computer-Using Agent (CUA) model, which combines GPT-4o’s vision capabilities with reasoning abilities from the company’s more advanced models. OpenAI says CUA can break tasks into multi-step plans and self-correct when faced with challenges.

One of Operator’s key features is its ability to interact seamlessly with graphical user interfaces (GUIs), including buttons, menus, and text fields. It operates within a dedicated browser, allowing it to execute tasks independently while the user focuses on other activities. Additionally, it accepts both text and image inputs, enabling more versatile task management.

Unlike traditional AI assistants, Operator analyses raw pixel data from the screen and interacts using a virtual keyboard and mouse within a controlled sandbox environment.

3) Deep Research:

Deep Research is powered by OpenAI’s latest o3 reasoning model, optimised for web browsing and data analysis. The AI agent searches, interprets and analyses vast amounts of text, images and PDFs on the web to produce a comprehensive report that is close to the level of a research analyst.

Unlike normal searches on ChatGPT, Deep Research queries take between 5 and 30 minutes to return a result, and the chatbot sends users a notification when their research is complete.

OpenAI says Deep Research can synthesise hundreds of online sources to produce a report at the level of a research analyst.
