Hallucinations are an interesting problem - in both humans and statistical models. If you asked an average person 500 years ago how the universe works, they would have confidently told you the earth is flat and rests on a giant turtle (or something like that). And that there are very specific creatures - angels and demons - who meddle in human affairs. And a whole lot more which has no grounding in reality.
How did we manage to reduce that type of hallucination?
>> And the problem is more - how can an LLM tell us it doesn't know something, instead of just making up good-sounding but completely delusional answers?
I think the mistake lies in the belief that the LLM "knows" things. As humans, we have a strong tendency to anthropomorphize. And so, when we see something behave in a certain way, we imagine that thing to be doing the same thing that we do when we behave that way.
I'm writing, and the machine is also writing, but what I'm doing when I write is very different from what the machine does when it writes. So the mistake is to say, or think, "I think when I write, so the machine must also think when it writes."
We probably need to address the usage of the word "hallucination", and maybe realize that the LLM is always hallucinating.
Not: "When it's right, it's right, but when it's wrong, it's hallucinating." It's more like, "Sweet! Some of these hallucinations are on point!"
There are probably many, but the most glaring one is that an LLM has to write a word every time it thinks, meaning it can't solve a problem before it starts writing down the solution. That is an undeniable limitation of current architectures: the way the LLM answers your question is also its thinking process, so you have to trigger a specific style of response if you want it to be smart with its answer.
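To make that concrete, here is a toy sketch of the autoregressive loop in Python (the next_token_distribution function is a made-up stand-in, not any real model's API): all the model ever does is pick the next token given what has already been written, so there is no separate "solve first, write later" phase.

    import random

    def next_token_distribution(context):
        # Stand-in for a real model's forward pass: returns a probability
        # distribution over the vocabulary, given the tokens written so far.
        # In a real LLM this is one pass through the network per token.
        vocab = ["yes", "no", "maybe", "because", "...", "<eos>"]
        weights = [random.random() for _ in vocab]
        total = sum(weights)
        return {tok: w / total for tok, w in zip(vocab, weights)}

    def generate(prompt_tokens, max_new_tokens=20):
        context = list(prompt_tokens)
        for _ in range(max_new_tokens):
            dist = next_token_distribution(context)
            # The model's entire "thinking step" is choosing this one token;
            # any intermediate reasoning only persists if it is emitted as text.
            token = max(dist, key=dist.get)  # greedy choice
            if token == "<eos>":
                break
            context.append(token)
        return context

    print(generate(["What", "is", "2+2", "?"]))

That is also why "think step by step" style prompting tends to help: the only scratch space the model has is the text it has already produced.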
>but I just focus on something and the answer pops into my head.
It's perfectly valid to say "I don't know", because no one really understands these parts of the human mind.
The point here is that saying "Oh, the LLM thinks word by word, but I have a magical black box that just works" isn't good science, nor is it a good means of judging what LLMs are or are not capable of.
That's a difficult question to answer, since I must be doing a lot of very different things while thinking. For one, I'm not sure I'm ever not thinking. Is thinking different from "brain activity"? We can shut down the model, store it on disk, and boot it back up. Shut down my brain and I'm a goner.
I'm open to saying that the machine is "thinking", but I do think we need clearer language to distinguish between machine thinking and human thinking.
EDIT: I chose the wrong word with "thinking", when I was trying to point out the logical fallacy of anthropomorphizing the machine. It would have been clearer if I had used the word "breathing": When I write I'm breathing, so the machine must also be breathing.
I don’t think that “think” is the wrong word here. I believe people are machines - more complicated than GPT4, but machines nevertheless. Soon GPT-N will become more complicated than any human, and it will be more capable, so we might start saying that whatever humans do when they think is simpler or otherwise inferior to what future AI models will do when they “think”.
And the problem is more - how can an LLM tell us it doesn't know something, instead of just making up good-sounding but completely delusional answers?
Which arguably isn't about being smart, and is only tangentially about having more or less (external) knowledge. It's about self-knowledge.
Going down the first path is about knowing everything (in the form of facts, usually). Which hey, maybe?
Going down the second path is about knowing oneself. Which hey, maybe?
They are not the same.