OpenAI has recently begun renting Google’s Tensor Processing Unit (TPU) AI chips via Google Cloud, marking its first significant shift away from its previously exclusive reliance on Nvidia hardware, The Information reported.
OpenAI is using the chips for inference, the process of running already-trained models to respond to user requests, as it seeks a more cost-effective way of delivering its AI services, according to The Information’s report.
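For readers unfamiliar with the distinction, the short sketch below illustrates roughly what an inference workload looks like in code, using JAX, a Google-developed library that runs on TPUs. The model, weights and inputs are hypothetical stand-ins for illustration only; they do not represent OpenAI’s or Google’s actual systems.

```python
# Minimal JAX sketch: "inference" means running an already-trained model to
# produce outputs, as opposed to training, which updates the model's weights.
# The tiny network and random weights below are hypothetical placeholders.
import jax
import jax.numpy as jnp

def model(params, x):
    # A small two-layer network standing in for a trained model.
    hidden = jax.nn.relu(x @ params["w1"] + params["b1"])
    return hidden @ params["w2"] + params["b2"]

# Pretend these weights came from an earlier training run.
key = jax.random.PRNGKey(0)
params = {
    "w1": jax.random.normal(key, (8, 16)) * 0.1,
    "b1": jnp.zeros(16),
    "w2": jax.random.normal(key, (16, 4)) * 0.1,
    "b2": jnp.zeros(4),
}

# jax.jit compiles the forward pass for whatever accelerator is available
# (CPU, GPU or TPU); serving requests this way, without updating weights,
# is the kind of workload the article describes as inference.
serve = jax.jit(model)
print(serve(params, jnp.ones((2, 8))))
```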
The ChatGPT developer previously used Nvidia chips through partnerships with Microsoft and Oracle to train and deploy its models.
AI inference
Google is not providing OpenAI with access to its most powerful TPU chips, suggesting those remain reserved for internal use, the report said.
Google uses the chips to train its own AI models for its Gemini service, which competes with OpenAI’s ChatGPT.
It’s unclear whether OpenAI will use Google’s chips only for inference or whether it may expand their use to the more compute-intensive process of training, the report said.
OpenAI’s move is part of a broader strategy by the start-up to decrease its reliance on Microsoft’s Azure cloud infrastructure, which was previously its exclusive provider of data centre capacity under a deal worth more than $13 billion (£9.5bn) that the two companies signed in early 2023.
The companies have been renegotiating their relationship as OpenAI seeks to convert from a non-profit into a for-profit public-benefit corporation and to pave the way for a future IPO, according to previous reports.
Infrastructure diversification
OpenAI has been diversifying its infrastructure in part because its compute-intensive AI models have generated spiralling demand for computing capacity that Microsoft has struggled to supply, according to reports.
Its other efforts include the Stargate AI data centre project, which is also backed by Oracle and SoftBank.
The use of Google chips is a win for the search giant, which has been gradually opening up external access to TPUs that were until recently reserved for its own internal use.