Don’t be fooled: AI is a long way from being able to think for itself

Large language models are still far from multilayered thinking

Learning from data and using it to reason, essentially to think, is still beyond AI's capacity. Illustration: iStock

One thing the surge of interest in generative artificial intelligence such as ChatGPT has proven beyond doubt is the sheer wealth of dystopian fiction humanity has produced.

In 1967 ITV aired The General, the sixth episode of The Prisoner. In it, a new technology called Speed Learn instils a three-year university degree in just three minutes over a TV screen.

At least, that’s what Patrick McGoohan’s Number Six is meant to believe. In reality, a sterile data set has simply been uploaded into the minds of the would-be students. There’s no ability to break down what the information means, no way of providing context or analysing it. In short, it is rote learning without the capacity for reason.

Apple’s recent paper The Illusion of Thinking has brought the issue into focus for several generative AI platforms, including OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude 3.7. The study covered both large language models (LLMs) and large reasoning models (LRMs).

The results were illuminating. The study found that while memorising oodles of data is relatively easy for LLMs and LRMs alike, learning from that data and using it to reason, to essentially think, remains beyond their capabilities.

The problem is not so much that they failed as how they tried. These models are, at heart, focused on providing an answer first and foremost. Whether that answer is correct is at best secondary.

There are ways to improve the probability of an accurate response. These include using normal human forms of speech when asking questions, such as “please” or “thank you”, or specifying that you only want accurate information and would prefer no answer to a wrong one.

These are far from sure-fire routes to a useful response; they merely reduce the chance of error. With the possible exception of the worst people-pleasers among us, the Apple research is a dramatic rebuke to any notion that the AI we know today is close to human-like reasoning.

The research also raises stark questions for us, the users. Life can be many things but it’s often exhausting. Finding a way to make life easier is usually welcomed; that impulse goes back to the invention of the wheel.

When AI models became open to a wider user base, the ability to take shortcuts and reduce the onerous nature of some tasks was obviously appealing to many. This wasn’t a matter of people being lazy. Like the wheel, it freed up time for less monotonous matters.

The more data these tools had available, the more intelligent they felt to the user, because the responses became more relevant. Moreover, they were programmed to adopt a conversational tone that made it feel as if the user had a real sounding board to work with.

That’s where the illusion becomes a problem. These programs are designed to simulate, not solve. They can pore over the information available to them, but the kind of multilayered thinking that you or I might use even while getting groceries is not yet on the table.

There’s a tendency to trust machines because they lack emotion, but that doesn’t mean they are built to be honest or objectively accurate. All of these platforms have one thing in common: humans built them, and the biases of their builders naturally bleed into the programming.

AI is better at fooling us than it was even six months or a year ago. Whereas once an image of a person could end up giving them three six-fingered left hands, the results now tend merely to sit in the uncanny valley: almost human, but just a little bit off.

Fortunately, the best pushback against this over-reliance is coming from the would-be end users themselves. Duolingo has been forced to roll back hard on plans to be AI-first.

The language-learning app’s founder, Luis von Ahn, said in April that he wanted to phase out contractors and replace them with AI. Anyone who has used even a rudimentary LLM can see an issue here. Although they have improved, they still don’t speak quite like humans.

That was only part of the reason for the backlash from users. Duolingo has an active user base that feels it has real agency, and the idea that lessons are developed by real native speakers is core to the app’s appeal.

Von Ahn has since backtracked, at least for now. Still, it’s important that those seeking to embrace AI realise its limitations. For all it knows, it can’t think.

One day it might be able to do so. That day is further away than its biggest advocates care to admit. Number Six, after all, broke the General with an insoluble question, one that anyone reading this can ponder: why?