Friday, December 5, 2025
Introducing the Gemini 2.5 Computer Use model

Earlier this year, we mentioned that we’re bringing computer use capabilities to developers via the Gemini API. Today, we are releasing the Gemini 2.5 Computer Use model, our new specialized model built on Gemini 2.5 Pro’s visual understanding and reasoning capabilities that powers agents capable of interacting with user interfaces (UIs). It outperforms leading alternatives on multiple web and mobile control benchmarks, all with lower latency. Developers can access these capabilities via the Gemini API in Google AI Studio and Vertex AI.

While AI models can interface with software through structured APIs, many digital tasks still require direct interaction with graphical user interfaces, such as filling out and submitting forms. To complete these tasks, agents must navigate web pages and applications just as humans do: by clicking, typing, and scrolling. The ability to natively fill out forms, manipulate interactive elements like dropdowns and filters, and operate behind logins is a crucial next step in building powerful, general-purpose agents.

How it works

The model’s core capabilities are exposed through the new `computer_use` tool in the Gemini API and should be operated within a loop. Inputs to the tool are the user request, a screenshot of the environment, and a history of recent actions. The input can also exclude functions from the full list of supported UI actions or specify additional custom functions to include.
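
To make the loop concrete, here is a minimal Python sketch of how such an agent loop might be structured. It does not call the Gemini API: `StepInput`, `FakeForm`, `scripted_model`, and `run_loop` are all hypothetical names invented for illustration, and the action names (`type_text`, `click`) are assumptions rather than the tool's actual action list. The scripted function stands in for the model call so the control flow can be run and inspected locally.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Action:
    name: str                          # illustrative action name, e.g. "click"
    args: dict = field(default_factory=dict)


@dataclass
class StepInput:
    """Inputs sent on each turn (hypothetical structure, not the real API)."""
    user_request: str
    screenshot: bytes
    history: list                      # most recent actions only
    excluded_actions: frozenset = frozenset()
    custom_actions: frozenset = frozenset()


class FakeForm:
    """Stand-in environment: one text field and a submit button."""

    def __init__(self):
        self.text = ""
        self.submitted = False

    def screenshot(self) -> bytes:
        # A real environment would return image bytes; here we encode state as text.
        return f"text={self.text!r};submitted={self.submitted}".encode()

    def execute(self, action: Action) -> None:
        if action.name == "type_text":
            self.text += action.args["text"]
        elif action.name == "click" and action.args.get("target") == "submit":
            self.submitted = True
        else:
            raise ValueError(f"unsupported action: {action.name}")


def scripted_model(step: StepInput) -> Optional[Action]:
    """Placeholder for the model call; returns None once the task looks done."""
    state = step.screenshot.decode()
    if "text=''" in state:
        return Action("type_text", {"text": "hello"})
    if "submitted=False" in state:
        return Action("click", {"target": "submit"})
    return None


def run_loop(user_request, env, model, max_steps=10, history_window=5,
             excluded_actions=frozenset(), custom_actions=frozenset()):
    """Observe, ask the model for an action, execute it, and repeat."""
    history = []
    for _ in range(max_steps):
        step = StepInput(user_request, env.screenshot(),
                         history[-history_window:],
                         excluded_actions, custom_actions)
        action = model(step)
        if action is None:             # model signals completion
            break
        if action.name in excluded_actions:
            raise RuntimeError(f"model proposed excluded action: {action.name}")
        env.execute(action)
        history.append(action)
    return history


env = FakeForm()
actions = run_loop("Fill in the form and submit it", env, scripted_model)
print([a.name for a in actions], env.submitted)
```

In this sketch, each iteration takes a fresh screenshot and passes only the last few actions as history, mirroring the three inputs described above. The `max_steps` cap is a design choice of the sketch, added so a misbehaving policy cannot loop forever.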
