Wednesday, October 22, 2025

OpenAI’s GPT-5 demo shows error-riddled charts


FILE PHOTO: OpenAI’s live demo of GPT-5 last night included graphs with errors that users immediately pointed out. 
| Photo Credit: Reuters

OpenAI’s live demo of GPT-5 last night included graphs with errors that users immediately pointed out. In a chart comparing GPT-5’s accuracy with that of OpenAI’s older AI models, o3 and GPT-4o, the bar for GPT-5’s 52.8% accuracy (with thinking) was drawn taller than the bar for o3’s 69.1%.

Meanwhile, the bar for o3’s 69.1% accuracy was drawn at the same height as the bar for GPT-4o’s 30.8%.

Another bar graph, titled “Deception Evals across models,” showed GPT-5 with thinking at 50%, yet its bar was much shorter than the one for o3’s 47.4%.

After several users on X pointed out the errors, CEO Sam Altman responded to them saying, “wow a mega chart screwup from us earlier.” He noted that the same chart was accurate on the blog post for the release. 

Users speculated that OpenAI had used its own AI models to generate these graphs, but OpenAI hasn’t clarified whether that was the case.

Altman has called GPT-5 a “PhD-level expert,” unlike the previous flagship models, which felt more like speaking with a “student.”


