OpenAI has made its first major foray into AI agents with the release of a research preview of Operator. The AI assistant can act autonomously on a user’s behalf in a web browser, handling tasks such as navigating web pages, downloading lectures, ordering groceries and combining PDFs.
How does Operator work?
Operator is powered by the Computer-Using Agent (CUA) model, which combines GPT-4o’s vision capabilities with reasoning abilities from the company’s more advanced models. OpenAI says CUA can break tasks into multi-step plans and correct itself when it runs into challenges.
“CUA is trained to interact with graphical user interfaces (GUIs) - the buttons, menus, and text fields people see on a screen - just as humans do. This gives it the flexibility to perform digital tasks without using OS- or web-specific APIs,” the Microsoft-backed AI startup explained in a blog post.
Operator is currently available as a research preview for ChatGPT Pro users in the US. The AI agent can be accessed at operator.chatgpt.com.
What can’t OpenAI’s AI agent do?
OpenAI says it has implemented safeguards that prevent Operator from undertaking certain tasks, in order to mitigate the risks posed by a first-generation AI agent.
Operator will refuse commands related to “harmful tasks” and “illegal or regulated activities”. It is also barred from accessing gambling, adult, and drug- or gun-related websites. Moreover, OpenAI says Operator will decline certain high-risk tasks such as banking transactions and tasks requiring “sensitive decision making.”
The CUA model running Operator has also been trained to ask for user confirmation before finalizing tasks that could have serious repercussions, such as submitting an order or sending an email.
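The safeguards described above amount to a three-way decision: refuse, ask the user to confirm, or proceed. The short Python sketch below illustrates that logic only. It is not OpenAI’s implementation; the function name `gate`, the `Decision` enum, and the category and action labels are all hypothetical, chosen to mirror the examples named in this article.

```python
from enum import Enum


class Decision(Enum):
    REFUSE = "refuse"
    CONFIRM = "ask_user_to_confirm"
    PROCEED = "proceed"


# Hypothetical labels mirroring the restrictions listed in the article.
BLOCKED_CATEGORIES = {
    "harmful",
    "illegal_or_regulated",
    "gambling",
    "adult",
    "drugs_or_guns",
}
DECLINED_ACTIONS = {"banking_transaction", "sensitive_decision"}

# Hypothetical labels for actions the article says require user confirmation.
CONFIRM_ACTIONS = {"submit_order", "send_email"}


def gate(category: str, action: str) -> Decision:
    """Decide how an agent should treat a requested action (illustrative only)."""
    # Refusals take priority over everything else.
    if category in BLOCKED_CATEGORIES or action in DECLINED_ACTIONS:
        return Decision.REFUSE
    # Consequential final steps are paused until the user approves them.
    if action in CONFIRM_ACTIONS:
        return Decision.CONFIRM
    return Decision.PROCEED


if __name__ == "__main__":
    print(gate("groceries", "add_to_cart"))
    print(gate("groceries", "submit_order"))
    print(gate("gambling", "open_page"))
```

In this sketch, adding an item to a grocery cart goes ahead, submitting the order waits for the user, and anything on a gambling site is refused outright. How the real model classifies tasks has not been described in this much detail.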
What has Sam Altman said about AI agents?
At the start of the year, OpenAI CEO Sam Altman made a bold prediction about the future of AI agents, stating that the new technology will “join the workforce” this year.
“We are now confident we know how to build AGI as we have traditionally understood it. We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies,” Altman wrote in his blog post.
Altman also sounded positive about the impact of AI agents, writing, “We continue to believe that iteratively putting great tools in the hands of people leads to great, broadly-distributed outcomes.”