What exactly did 2025 AI hallucinate for you? The last time I saw a hallucination from these things was a year ago. For questions that a kid or a student is going to answer, I'm not sure any reasonable person should be worried about this.
Just a couple of days ago, I submitted a few pages from the PDF of a PhD thesis written in French to ChatGPT and asked it to translate them into English. The first 2-3 pages were perfect; then the LLM started hallucinating, inserting new sentences and dropping parts. The interesting thing is that the added sentences were correct and generally on point: the resulting text sounded plausible, and only a careful sentence-by-sentence comparison revealed the truth. Near the end of the chapter, virtually nothing of what ChatGPT produced was directly related to the original text.
Transformer models are excellent at translation, but plain next-token prediction is not the right setup for it. You want something more like a seq2seq (encoder-decoder) model. Next-token prediction cares more about local consistency (i.e., going off on a tangent with a self-consistent but totally fabricated "translation") than about faithfulness to the source.
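For the curious, a minimal sketch of what a dedicated seq2seq translator looks like in practice (assuming the Hugging Face transformers library and its Helsinki-NLP/opus-mt-fr-en MarianMT checkpoint; the sample sentence is made up):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# MarianMT is an encoder-decoder: the decoder attends to an encoding of the
# whole French source at every step, instead of just continuing a text stream.
model_name = "Helsinki-NLP/opus-mt-fr-en"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

src = "Vers la fin du chapitre, le modèle a commencé à inventer des phrases."
inputs = tokenizer(src, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

That conditioning on the full source is what keeps long translations anchored to the input rather than drifting into plausible continuation.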
I use it every day for work and every day it gets stuff wrong of the "that doesn't even exist" variety. Because I'm working on things that are complex + highly verifiable, I notice.
Sure, Joe Average who's using it to look smart in Reddit or HN arguments, or to find out how to install a mod for their favorite game, isn't going to notice anymore, because it's much more plausible much more often than two years ago. But if you're asking it things that aren't trivially easy for you to verify, you have no way of telling how frequently it hallucinates.
OpenAI's o3/4o models completely spun out when I was trying to write a tiny little TUI with ratatui; they couldn't handle writing a render function. No idea why. I spent about 15 minutes trying to get it to work and ended up pulling up the docs.
I haven't spent any money on Claude for this project, and realistically it's not worth it, but I've run into little things like that a fair amount.
>Thanks all for the replies, we’re hardcoding fixes now
-LLM devcos
Jokes aside, get deep into the domains you know. Or ask it for movie titles based on specific parts of uncommon films. And definitely ask for instructions for using specific software tools (“no, actually, Opus/o3/2.5, that menu isn’t available in this context”, etc.).
Are you using them daily? I find that for maybe 3 or 4 of the programming questions I ask per day, they simply cannot provide a correct answer even with hand-holding. They often resort to extreme gymnastics to gaslight you, no matter how much proof you provide.
For example, today I was asking an LLM how to configure a GitHub Action to install an SDK version that had just recently gone out of support. It kept hallucinating about my config, claiming that when you provide multiple SDK versions, the action only installs the most recent one. That's false, and the documentation I linked says specifically that it installs every version you list. When I explained this to Copilot, it kept doubling down, ignoring the docs, and even went as far as asking me to have the action output the installed SDKs. Every version I requested showed up as installed, at which point it gaslit me by claiming that a `--list-sdks` command can print out the wrong SDKs.
ChatGPT hallucinates things all the time. I will feed it info on something and have a conversation. At first it's mostly fine, but eventually it starts just making stuff up.
Two days ago, when my boomer mother-in-law tried to justify her anti-cancer diet, the one that killed Steve Jobs. On the bright side, my partner will be inheriting soon by the looks of it.
Not defending your mother-in-law here (because I agree with you that it is a pretty silly and maybe even potentially harmful diet), but AFAIK it wasn’t the diet itself that killed Steve Jobs. It was his decision to follow that diet instead of getting actual cancer treatment until it was too late.
>>No, jumping off high buildings is perfectly safe as long as you land skillfully.
Not really, because no matter how you spin it, the person in your scenario dies.
However, doing Steve Jobs’ diet might actually be fine (or at least not deadly) for an average person, as long as they don’t have late-stage pancreatic cancer and don’t decide to forgo chemotherapy treatment.
Which is what killed Jobs, not the diet. For all we know, he might’ve been alive today even if he followed the same diet, as long as he also did the chemo treatment.
Indeed if you're a base jumper with a parachute, you might survive the landing.
Ackshually, this seems analogous to Jobs' diet and refusal of cancer treatment! And it was the cancer that put him at the top of the building in the first place.
The anti-cancer diet absolutely works if you want to reduce the odds of getting cancer. It probably even works to slow cancer compared to the average American diet.
Will it stop and reverse a cancer? Probably not.
I thought it was high-fiber diets that reduce the risk of cancer (ever so slightly), because of reduced inflammation, not fruit-heavy diets, which are high in carbohydrates.
Most (all?) of the AI models I work with are literally deterministic. If you give them the exact same input, you get the exact same output every single time.
What most people call “non-deterministic” in AI is that one of those inputs is a _seed_ that is sourced from a PRNG because getting a different answer every time is considered a feature for most use cases.
Edit: I’m trying to imagine how you could get a non-deterministic AI and I’m struggling because the entire thing is built on a series of deterministic steps. The only way you can make it look non-deterministic is to hide part of the input from the user.
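A toy sketch of that point, assuming PyTorch and using a single linear layer as a stand-in for an LLM's forward pass: the forward pass is repeatable bit for bit, and even the "random" sampling step is repeatable once you treat the seed as just another input:

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(16, 8)   # stand-in for a model's forward pass
x = torch.randn(1, 16)

# Same weights + same input -> identical logits, every single time.
assert torch.equal(model(x), model(x))

def sample(logits, seed):
    # The "randomness" is entirely determined by the seed we pass in.
    gen = torch.Generator().manual_seed(seed)
    return torch.multinomial(torch.softmax(logits, dim=-1), 1, generator=gen)

logits = model(x)
assert torch.equal(sample(logits, seed=42), sample(logits, seed=42))
```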
This is an incredibly pedantic argument. The common interfaces for LLMs set their temperature value to non-zero, so they are effectively non-deterministic.
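For example (a toy sketch over made-up logits, not a real LLM): with temperature > 0 and no pinned seed, repeated sampling from the same distribution picks different tokens, whereas the temperature → 0 limit always picks the argmax:

```python
import torch

logits = torch.tensor([2.0, 1.5, 1.0, 0.5])  # made-up scores for 4 candidate tokens

def sample(temperature: float) -> int:
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

print([sample(1.0) for _ in range(10)])           # varies from run to run
print([int(logits.argmax()) for _ in range(10)])  # the temperature -> 0 limit: always the argmax
```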
Unless something has fundamentally changed since then (which I've not heard about), all sparse models are only deterministic at the batch level, rather than at the sample level.
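A quick sketch of why the batch matters, using plain numpy rather than an actual MoE kernel: floating-point addition isn't associative, so reducing the same values in a different order (which is effectively what happens when your request is grouped with different neighbours in a batch) gives slightly different results:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000).astype(np.float32)

s1 = x.sum()                                  # reduce in one order
s2 = x.reshape(100, 1000).sum(axis=1).sum()   # reduce the same values in another order
print(float(s1), float(s2), s1 == s2)         # typically differ in the last few bits
```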