Saturday, October 18, 2025

Introducing the Gemini 2.5 Computer Use model



Earlier this year, we mentioned that we’re bringing computer use capabilities to developers via the Gemini API. Today, we are releasing the Gemini 2.5 Computer Use model, our new specialized model built on Gemini 2.5 Pro’s visual understanding and reasoning capabilities that powers agents capable of interacting with user interfaces (UIs). It outperforms leading alternatives on multiple web and mobile control benchmarks, all with lower latency. Developers can access these capabilities via the Gemini API in Google AI Studio and Vertex AI.

While AI models can interface with software through structured APIs, many digital tasks still require direct interaction with graphical user interfaces, such as filling and submitting forms. To complete these tasks, agents must navigate web pages and applications just as humans do: by clicking, typing and scrolling. The ability to natively fill out forms, manipulate interactive elements like dropdowns and filters, and operate behind logins is a crucial next step in building powerful, general-purpose agents.

How it works

The model’s core capabilities are exposed through the new `computer_use` tool in the Gemini API and should be operated within a loop. Inputs to the tool are the user request, a screenshot of the environment, and a history of recent actions. The input can also exclude functions from the full list of supported UI actions, or add custom functions to it.
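
The loop described above can be sketched in a few lines of Python. This is an illustration only and does not call the Gemini API: the action names, the `ToolInput` fields, and the `propose_action`, `take_screenshot` and `execute` callables are all hypothetical stand-ins. It also assumes that the caller executes each proposed action and that the loop ends when the model proposes nothing further, neither of which is stated in the text above.

```python
from dataclasses import dataclass, field

# Hypothetical action names for illustration; not the tool's actual list.
SUPPORTED_ACTIONS = ("click", "type_text", "scroll")


@dataclass
class Action:
    name: str
    args: dict = field(default_factory=dict)


@dataclass
class ToolInput:
    user_request: str
    screenshot: bytes
    history: list           # recent actions only
    allowed_actions: tuple  # supported actions minus exclusions, plus custom ones


def build_input(user_request, screenshot, history,
                excluded=(), custom=(), history_window=5):
    """Assemble one turn's input: request, screenshot, recent history."""
    allowed = tuple(a for a in SUPPORTED_ACTIONS if a not in excluded)
    allowed += tuple(custom)
    recent = list(history[-history_window:])
    return ToolInput(user_request, screenshot, recent, allowed)


def run_agent(user_request, propose_action, take_screenshot, execute,
              excluded=(), custom=(), max_steps=10):
    """Drive the loop until the (stand-in) model returns None or steps run out."""
    history = []
    for _ in range(max_steps):
        tool_input = build_input(user_request, take_screenshot(), history,
                                 excluded, custom)
        action = propose_action(tool_input)  # stand-in for the model call
        if action is None:                   # assumed stop signal
            break
        if action.name not in tool_input.allowed_actions:
            raise ValueError(f"action not allowed: {action.name}")
        execute(action)                      # assumed: caller performs the action
        history.append(action)
    return history
```

In this sketch a fresh screenshot is captured on every iteration, so the stand-in model always sees the state produced by the previous action. The `max_steps` cap is a defensive choice for the example, not something the article specifies.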


