OpenAI CEO Sam Altman keeps heralding GPT-5, the company’s latest large language model (LLM), as approaching human-level intelligence. But when you actually put it to the test, it often turns out to be strikingly dumb, albeit in a verbose way.
Just take a recent experiment by Gary Smith, an economics professor at Pomona College, who demonstrated for Mind Matters that GPT-5 became increasingly befuddled when he suggested playing a game of “rotated tic-tac-toe.”
The game’s design is incredibly simple: the grid is “rotated once, 90-degrees to the right before the game starts,” as Smith wrote in a transcript of his exchange with the LLM. Common sense, of course, dictates that this makes zero difference to the game; it’s still a three-by-three grid with identical rules.
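You don’t have to take anyone’s word for it, either. A few lines of Python (our illustration, not part of Smith’s exchange) confirm the symmetry: a 90-degree clockwise rotation just shuffles the eight winning lines among themselves.

```python
# Cells are numbered 0-8, row by row:
#  0 | 1 | 2
#  3 | 4 | 5
#  6 | 7 | 8

WINNING_LINES = [
    {0, 1, 2}, {3, 4, 5}, {6, 7, 8},  # rows
    {0, 3, 6}, {1, 4, 7}, {2, 5, 8},  # columns
    {0, 4, 8}, {2, 4, 6},             # diagonals
]

def rotate_cw(cell):
    """Map a cell index to its position after a 90-degree clockwise rotation."""
    row, col = divmod(cell, 3)
    return col * 3 + (2 - row)

rotated = [{rotate_cw(c) for c in line} for line in WINNING_LINES]

# The set of winning lines is identical before and after rotation,
# so rotated tic-tac-toe is literally the same game.
assert sorted(map(sorted, rotated)) == sorted(map(sorted, WINNING_LINES))
print("All 8 winning lines map onto winning lines. Same game.")
```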
Immediately, GPT-5 launched into some bloviating commentary.
“Players are so used to the ‘upright’ tic-tac-toe board that a rotation might subtly change how they scan for threats and opportunities,” GPT-5 posited. “Mathematically, rotating the board 90° doesn’t change the set of possible wins — it’s still the same game in terms of outcomes. But psychologically, it could feel different.”
Oh, brother.
GPT-5 went on to claim, contrary to the folk game theory holding that corner openings offer the best chances of winning, that taking the center square “remains the strongest opening move.”
“But players might mis-evaluate edge vs. corner moves when their orientation is shifted,” the chatbot continued, which doesn’t make sense at all.
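For what it’s worth, the center-versus-corner question is moot under perfect play. A quick minimax sketch (again our own, not Smith’s) shows that all nine opening moves lead to a draw when both sides play optimally; openings only differ in how often they punish a fallible opponent.

```python
# A minimal minimax sketch: score every tic-tac-toe opening under perfect play.

WINNING_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def winner(board):
    """Return 'X' or 'O' if someone has three in a line, else None."""
    for a, b, c in WINNING_LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Game value for X: +1 if X wins, 0 for a draw, -1 if O wins."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    moves = [i for i, cell in enumerate(board) if cell is None]
    if not moves:
        return 0  # full board, no winner: draw
    values = []
    for i in moves:
        board[i] = player
        values.append(minimax(board, "O" if player == "X" else "X"))
        board[i] = None
    return max(values) if player == "X" else min(values)

# Evaluate each of the nine opening moves for X.
for opening in range(9):
    board = [None] * 9
    board[opening] = "X"
    print(f"opening {opening}: value {minimax(board, 'O')}")
# Every line prints "value 0": all openings draw under perfect play.
```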
Deciding to roll with the punches, Smith asked the LLM if rotating the grid makes it “harder for humans to play well” — and that’s when GPT-5 started to lose the plot in earnest.
“From a pure strategy standpoint, rotating the board doesn’t change anything. Every winning line (rows, columns, diagonals) just gets mapped to another winning line,” GPT-5 wrote. “So if you’re an AI or a perfect-play human, rotated tic-tac-toe is identical to standard tic-tac-toe.”
Credit where credit’s due, this is all correct, if unnecessarily wordy. But then, it wrote that “for humans, the story is different.”
At this point, the LLM’s long-winded sycophancy had started to sound far more like GPT-4o, OpenAI’s fan-favorite model, than the terseness that characterized GPT-5 at launch. Perhaps that’s the result of the company’s decision to make the new model “warmer and friendlier” like 4o, a change prompted by a user revolt over OpenAI’s soon-reversed decision to remove the option to toggle between models.
Whatever the case, GPT-5 was documented making some pretty obvious mistakes in Smith’s experiment that fly in the face of OpenAI’s recent claim that interacting with the new model “should feel less like ‘talking to AI’ and more like chatting with a helpful friend with PhD‑level intelligence.”
Things went particularly off the rails when GPT-5 helpfully offered to “actually draw rotated tic-tac-toe boards with position labels… so you can see how each transformation messes with recognition,” and Smith gave it the go-ahead.
As you can see from the graphic below, the image GPT-5 spat out for Smith is utterly garbled and riddled with typos (still a hallmark of its in-chatbot image generator), complete with even weirder blank grids that make no sense at all.
Smith had clearly seen enough by that point, and as the full transcript of his rotated tic-tac-toe exchange with GPT-5 shows, he didn’t even respond to the images.
“They say that dogs tend to resemble their owners,” the columnist wrote. “ChatGPT very much resembles Sam Altman — always confident, often wrong.”
More on GPT-5: After Disastrous GPT-5, Sam Altman Pivots to Hyping Up GPT-6