The Hallucinations of Large Language Models: Can They Be Overcome?

"Hallucinations", a term coined by Google AI researchers in 2018, refer to mistakes in AI-generated text that are plausible but incorrect.

By Daniel Detlaf

One-man flea circus, writer, sci-fi nerd, news junkie and AI tinkerer.

Pssst. Would you like a quick weekly dose of AI news, tools and tips to your inbox? Sign up for our newsletter, AIn't Got The Time.


Have you ever wondered how accurate AI-generated text really is? While ChatGPT has amazed us all with its depth of knowledge and fluency in responses, there’s one major issue holding it back: hallucinations.

Hallucinations, a term coined by Google AI researchers in 2018, refer to mistakes in AI-generated text that are plausible but incorrect or nonsensical. The AI system produces content that looks convincing but cannot be trusted, creating challenges for code-generation tools like OpenAI’s Codex or GitHub Copilot.

Even high school students have to be cautious when using ChatGPT for book reports or essays, as they may contain erroneous “facts.”

Leading AI Researchers Disagree on Hallucinations

As reported in IEEE Spectrum, OpenAI’s Chief Scientist, Ilya Sutskever, believes that hallucinations can be eliminated over time by improving reinforcement learning with human feedback (RLHF), a technique pioneered by OpenAI and Google’s DeepMind. RLHF is an iterative process in which human evaluators rate ChatGPT’s responses and the model is updated based on that feedback.

Deep learning pioneer Yann LeCun has argued that there’s a more fundamental flaw causing hallucinations. He believes large language models (LLMs) need to learn from observation to acquire nonlinguistic knowledge, which is essential for understanding the underlying reality of language.

“Large language models have no idea of the underlying reality that language describes,” LeCun is quoted as saying in the IEEE article. “There is a limit to how smart they can be and how accurate they can be because they have no experience of the real world.”

Sutskever, on the other hand, argues that text already contains all the necessary knowledge about the world. He believes abstract ideas can still be learned from text, given the billions of words used to train LLMs like ChatGPT: “Our pretrained models already know everything they need to know about the underlying reality,” he said.

Mathew Lodge, CEO of Diffblue, reportedly thinks that “traditional” reinforcement-learning systems can solve complex, error-prone problems more accurately than LLMs, and at a fraction of the cost. Many engineers have since come to agree that LLMs are best used where errors and hallucinations are not high impact.

It remains to be seen whether RLHF can eliminate hallucinations in LLMs. In the meantime, the usefulness of these models for generating precise outputs is still limited.

Sutskever remains optimistic, believing that improved generative models will have a deep understanding of the world as seen through the lens of text.

Updated 5/27/2025

Since that interview, the most effective safeguard has been retrieval‑augmented generation (RAG). That’s where a model is forced to ground each answer in a small bundle of documents pulled from an external search.
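To make the idea concrete, here is a minimal sketch of a RAG loop under simplified assumptions: the retriever is a toy keyword scorer over an in-memory document list, and `call_llm`, `retrieve`, and `build_grounded_prompt` are hypothetical stand-ins for whatever vector store and model client a real pipeline would use.

```python
# Minimal RAG sketch: retrieve supporting documents, then force the model to
# answer only from them. The retriever is a toy keyword scorer and `call_llm`
# is a hypothetical placeholder for a real model client.

DOCUMENTS = [
    "Retrieval-augmented generation grounds model answers in retrieved documents.",
    "Toolformer teaches a language model to call external APIs such as calculators.",
    "Diffblue Cover uses reinforcement learning to write unit tests for Java code.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Score documents by naive keyword overlap and return the top k."""
    query_terms = set(query.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: -len(query_terms & set(d.lower().split())))
    return scored[:k]

def build_grounded_prompt(query: str, docs: list[str]) -> str:
    """Instruct the model to answer only from the retrieved context."""
    context = "\n".join(f"- {d}" for d in docs)
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say you don't know.\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (OpenAI, Cohere, a local model, etc.)."""
    return "(model response would go here)"

if __name__ == "__main__":
    question = "What does retrieval-augmented generation do?"
    grounded = build_grounded_prompt(question, retrieve(question))
    print(call_llm(grounded))
```

The key design point is that the model never sees the bare question; it only ever sees the question plus the retrieved context, which is what makes its answers checkable against sources.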

Enterprise teams that swapped “pure” GPT outputs for a RAG pipeline reported double‑digit drops in error rates, and researchers found the same in controlled tests. The idea has become mainstream. Microsoft, Google, Amazon, Nvidia, and Cohere all ship RAG toolkits, and Patrick Lewis (one of the original RAG authors) now leads a group at Cohere focused on source‑cited responses.

Nvidia’s Jensen Huang even called hallucinations “very solvable” once RAG is in the loop.

A second line of defense is self‑reflection. After drafting an answer, the model is prompted to critique its own reasoning and regenerate anything it flags as dubious. An EMNLP 2023 paper showed that a single reflection pass cut false statements in medical Q&A by more than 30 percent.
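A single reflect-and-regenerate pass might look like the sketch below. This is an illustration of the general pattern, not the paper’s exact setup: `call_llm` is a hypothetical stand-in for a real model client, and the critique prompt is an assumption.

```python
# Sketch of one self-reflection pass: draft an answer, ask the model to
# critique it, and regenerate only if the critique flags a problem.
# `call_llm` is a hypothetical placeholder for a real model client.

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return "(model response would go here)"

def answer_with_reflection(question: str) -> str:
    draft = call_llm(f"Answer the question:\n{question}")

    critique = call_llm(
        "Review the answer below for factual errors or unsupported claims. "
        "Reply with 'OK' if it looks sound, otherwise list the problems.\n"
        f"Question: {question}\nAnswer: {draft}"
    )

    if critique.strip().upper().startswith("OK"):
        return draft

    # Regenerate, feeding the critique back so the model can correct itself.
    return call_llm(
        f"Question: {question}\nPrevious answer: {draft}\n"
        f"Problems found: {critique}\nWrite a corrected answer:"
    )
```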

External tool calling attacks the same problem from another angle. A Meta AI research model called Toolformer showed that letting an LLM delegate math, code execution, or search queries to specialized APIs noticeably trimmed hallucinations, especially in numeric or domain‑specific answers. Follow‑up work on “reliability alignment” teaches models to decide when a tool call is actually needed, further reducing mistaken invocations.
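In practice, tool delegation usually means the model emits a structured request that the application executes on its behalf and feeds back. The sketch below shows that pattern with an assumed calculator tool and a `CALC:` convention of my own; it is not Toolformer’s actual training recipe, which modifies the training data itself.

```python
# Sketch of delegating arithmetic to a tool instead of letting the model guess.
# The model emits a tool request; the application runs the tool and returns the
# result to the model. `call_llm` is a hypothetical placeholder.

import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a simple arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; imagine it replies 'CALC: 23 * 7'."""
    return "CALC: 23 * 7"

def answer(question: str) -> str:
    response = call_llm(
        "If the question needs arithmetic, reply exactly 'CALC: <expression>'. "
        f"Otherwise answer directly.\nQuestion: {question}"
    )
    if response.startswith("CALC:"):
        # Run the calculation ourselves and hand the exact result back.
        result = safe_eval(response.removeprefix("CALC:").strip())
        return call_llm(f"Question: {question}\nTool result: {result}\nFinal answer:")
    return response
```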

Finally, in what might be one of the bigger practical developments in LLM reliability, developers are taming free‑form text with hard output constraints. By providing a JSON schema or function signature in the system prompt and validating every response against it, applications can reject or auto‑repair ill‑formed content.
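A bare-bones version of that guardrail, using only Python’s standard library, might look like the sketch below. The `REQUIRED_FIELDS` schema, the retry policy, and `call_llm` are all assumptions for illustration; production systems typically use a full JSON Schema validator or the model provider’s structured-output mode instead.

```python
# Sketch of a hard output constraint: the model must return JSON matching an
# expected shape, and anything that doesn't parse or validate is rejected and
# retried. `call_llm` is a hypothetical placeholder for a real model client.

import json

REQUIRED_FIELDS = {"title": str, "priority": int}  # assumed schema for illustration

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return '{"title": "Fix login bug", "priority": 2}'

def validate(raw: str) -> dict | None:
    """Return the parsed object if it matches the expected shape, else None."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not all(isinstance(obj.get(k), t) for k, t in REQUIRED_FIELDS.items()):
        return None
    return obj

def structured_answer(prompt: str, max_retries: int = 2) -> dict:
    for _ in range(max_retries + 1):
        raw = call_llm(
            f"{prompt}\nRespond with JSON containing exactly these fields: "
            '{"title": <string>, "priority": <integer>}.'
        )
        obj = validate(raw)
        if obj is not None:
            return obj
    raise ValueError("model never produced valid JSON")
```

The point of the loop is that a malformed or hallucinated structure never reaches downstream code; it is either repaired by a retry or rejected outright.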