Why your favourite AI chatbot might need a maths tutor
“8.8 – 8.11 = -0.31”
No, this isn’t the setup for a niche joke. It’s the head-scratching answer you’ll get from some of the most advanced Large Language Models (LLMs) out there. Yes, really.
I asked ChatGPT and Google Gemini to do some basic subtraction. You know, like a pocket calculator from 1985. And they both confidently spat out: -0.31.
For the record, the correct answer is 0.69 (insert your own wink here).
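If you’d rather not take my word for it, here’s the sum in Python (rounded, because plain binary floats carry a sliver of their own fuzz):

```python
# Float subtraction picks up a little binary-float noise,
# so round to two decimal places for a clean answer.
result = round(8.8 - 8.11, 2)
print(result)  # 0.69
```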
So what’s going on here? Have these AI geniuses skipped maths class? Are they just taking the mickey? Or is something deeper at play? Spoiler alert: it’s the last one.
1. LLMs: The Shakespeare of Sentences, the Clown of Calculators
LLMs like GPT and Gemini are built for language – words, phrases, vibes. They’re not designed to crunch numbers with laser accuracy.
They don’t calculate. They predict. Specifically, they predict the most likely next token (that’s a fragment of a word or number) based on what came before. So when you throw a maths problem at them, they’re not really solving it – they’re guessing what answer looks right in the context of human-like text.
And when they’ve seen loads of dodgy maths in their training data (and trust me, the internet is full of it), guess what happens?
2. The Token Trap: Why AI Sees Numbers Like Puzzle Pieces
To your calculator, “8.11” is a number. To an LLM? It’s a chopped-up string of symbols – something like [“8”, “.”, “11”] or [“8”, “.”, “1”, “1”], depending on how its tokenizer was trained.
Now imagine trying to subtract puzzle pieces. You’re not working with values – you’re juggling fragments. Drop one or misplace it, and the whole answer goes sideways. Literally.
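Here’s a toy sketch of the trap. The `toy_tokenize` function below is illustrative only – real LLM tokenizers (BPE and friends) vary by model – but it shows how “8.11” splits into fragments, and why the fragment “11” can make 8.11 *look* bigger than 8.8:

```python
import re

def toy_tokenize(s):
    """Illustrative only: real LLM tokenizers (BPE etc.) differ by model.
    Split a string into runs of digits and single non-digit characters."""
    return re.findall(r"\d+|\D", s)

print(toy_tokenize("8.11"))  # ['8', '.', '11']
print(toy_tokenize("8.8"))   # ['8', '.', '8']

# The fragment trap: compare the fractional parts as standalone pieces
# and 11 "beats" 8, so 8.11 seems larger than 8.8 ...
print(int("11") > int("8"))  # True  -> 8.11 "looks" bigger
# ... even though as decimal fractions it's the other way round.
print(0.11 > 0.8)            # False -> 8.11 is actually smaller
```

Treat .11 as “bigger” than .8 and you’ll compute 8.11 − 8.8 = 0.31, slap a minus sign on it, and arrive at exactly the blunder above.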
3. Why -0.31 Isn’t Random – It’s Systematic (and a Bit Sad)
What’s fascinating is that both GPT and Gemini give the same wrong answer. That’s not coincidence – it’s training-data déjà vu. These models were built differently, but trained on very similar data. So they pick up the same bad habits. Like kids copying off the same dodgy maths worksheet.
And they really love patterns. If a certain sequence – like “8.8 – 8.11 = -0.31” – pops up enough times in their training data (correct or not), that pattern gets reinforced. Truth has nothing to do with it.
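A crude sketch of that reinforcement effect, assuming a toy “model” that simply parrots the most frequent completion it has seen (real LLMs are vastly more sophisticated, but the failure mode rhymes – and the training snippets here are made up for illustration):

```python
from collections import Counter

# Hypothetical training snippets: the wrong answer appears more often.
training_data = [
    "8.8 - 8.11 = -0.31",
    "8.8 - 8.11 = -0.31",
    "8.8 - 8.11 = -0.31",
    "8.8 - 8.11 = 0.69",
]

def most_likely_completion(prompt, corpus):
    """Return the most frequent continuation of prompt seen in corpus."""
    completions = Counter(
        line[len(prompt):].strip()
        for line in corpus
        if line.startswith(prompt)
    )
    return completions.most_common(1)[0][0]

print(most_likely_completion("8.8 - 8.11 =", training_data))  # -0.31
```

Popularity wins; correctness doesn’t get a vote.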
4. So, Should We Panic?
Nope. But we should be smart.
LLMs are amazing at human-sounding chat. But for things like arithmetic? Trust but verify. Or better yet, let them hand over the maths bit to a tool that actually knows how to subtract.
This is already happening – advanced models now use external tools like Python to double-check their answers. But until that becomes universal, just remember:
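That hand-off can be as simple as routing the arithmetic to Python’s `decimal` module, which does exact base-10 arithmetic with none of the guesswork:

```python
from decimal import Decimal

# Exact decimal arithmetic: no token-guessing, no binary-float fuzz.
answer = Decimal("8.8") - Decimal("8.11")
print(answer)  # 0.69
```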
If you’re using a chatbot to do your taxes, may the refund gods be with you.
5. In Conclusion: Don’t Fire the AI, Just Don’t Make It Your Accountant
The “8.8 – 8.11 = -0.31” blunder is more than a blooper – it’s a lesson in how LLMs work under the hood. These systems are text-generation engines, not number-crunching savants.
Want precision? Pair your LLM with a real calculator (or Python script). Want wit, insight, and beautifully phrased nonsense? The AI’s your mate.
Now, if you’ll excuse me, I’m off to teach ChatGPT long division using interpretive dance and sarcasm.