Maybe it requires understanding, maybe there are other ways to get to 'I don't know'. There was a paper posted on HN a few weeks ago that tested LLMs on medical exams, and one interesting thing they found was that on questions where the LLM was wrong (confidently, as usual), the answer was highly volatile with respect to the prompt, temperature, or other parameters. So this might show a way of getting to 'I don't know': just compare the answers over a few slightly fuzzed prompt variations and return an 'I don't know' answer (maybe with a summary of the various responses) if they differ too much. This is more of a crutch, I'll admit, arguably the LLM (or neither of the experts, or however you set it up concretely) hasn't learnt to say 'I don't know', but it might be a good enough solution in practice. And maybe you can then use that setup to generate training examples to teach 'I don't know' to an actual model (so basically fine-tuning a model to learn its own knowledge boundary).
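Roughly what I have in mind, as a minimal sketch (Python; ask_llm is a hypothetical stand-in for whatever model call you use, and exact-match agreement is just a placeholder for a proper semantic comparison):

    import random

    def fuzz(prompt):
        # Trivial prompt variation; a real version would paraphrase, reorder, or vary temperature.
        lead_ins = ["", "Please answer concisely: ", "Question: ", "Answer the following: "]
        return random.choice(lead_ins) + prompt

    def answer_or_abstain(prompt, ask_llm, n_variants=5, min_agreement=0.6):
        # Ask several fuzzed variants and abstain if the answers disagree too much.
        answers = [ask_llm(fuzz(prompt)) for _ in range(n_variants)]
        most_common = max(set(answers), key=answers.count)
        agreement = answers.count(most_common) / len(answers)
        if agreement < min_agreement:
            return "I don't know (answers varied: " + "; ".join(sorted(set(answers))) + ")"
        return most_common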
> Maybe it requires understanding, maybe there are other ways to get to 'I don't know'.
> This is more of a crutch, I'll admit, arguably the LLM (or neither of the experts, or however you set it up concretely) hasn't learnt to say 'I don't know', but it might be a good enough solution in practice. And maybe you can then use that setup to generate training examples to teach 'I don't know' to an actual model (so basically fine-tuning a model to learn its own knowledge boundary).
When humans say "I know" it is often not narrowly based on "book knowledge or what I've heard from other people".
Humans are able to say "I know" or "I don't know" using a range of tools like self-awareness, knowledge of a subject, experience, common sense, speculation, wisdom, etc.
Ok, but LLMs are just tools, and I'm just asking how a tool can be made more useful. It doesn't really matter why an LLM tells you to go look elsewhere, it's simply more useful if it does than if it hallucinates. And usefulness isn't binary, getting the error rate down is also an improvement.
> Ok, but LLMs are just tools, and I'm just asking how a tool can be made more useful.
I think I know what you're after (notice my self-awareness to qualify what I say I know): that the tool's output can be relied upon without applying layers of human judgement (critical thinking, logical reasoning, common sense, skepticism, expert knowledge, wisdom, etc.)
There are a number of boulders in that path of clarity. One of the most obvious boulders is that for an LLM the inputs and the patterns that act on the input are themselves not guaranteed to be infallible. Not only in practice, but also in principle: the human mind (notice this expression doesn't refer to a thing you can point to) has come to understand that understanding is provisional, incomplete, a process.
So while I agree with you that we can and should improve the accuracy of the output of these tools given assumptions we make about the tools humans use to prove facts about the world, you will always want to apply judgment, skepticism, critical thinking, logical evaluation, intuition, etc. depending on the risk/reward tradeoff of the topic you're relying on the LLM for.
Yeah I don't think it will ever make sense to think about Transformer models as 'understanding' something. The approach that I suggested would replace that with rather simple logic like answer_variance > arbitrary_threshold ? return 'I don't know' : return $original_answer
It's not a fundamental fix, it doesn't even change the model itself, but the output might be more useful. And then there was just some speculation how you could try to train a new AI mimicking the more useful output. I'm sure smarter people than me can come up with way smarter approaches. But it wouldn't have to do with understanding - when I said the tool should return 'I don't know' above, I literally meant it should return that string (maybe augmented a bit by some pre-defined prompt), like a meaningless symbol, not any result of anything resembling introspection.
This is a fair question: LLMs do challenge the easy assumption (as made, for example, in Searle's "Chinese Room" thought experiment) that computers cannot possibly understand things. Here, however, I would say that if an LLM can be said to have understanding or knowledge of something, it is of the patterns of token occurrences to be found in the use of language. It is not clear that this also grants the LLM any understanding that this language refers to an external world which operates in response to causes which are independent of what is or might be said about it.
Should it matter how the object of debate interacts and probes the external world? We sense the world through specialized cells connected to neurons. There's nothing to prevent LLMs doing functionally the same thing. Both human brains and LLMs have information inputs and outputs, there's nothing that can go through one which can't go through the other.
A current LLM does not interact with the external world in a way that would seem to lead to an understanding of it. It emits a response to a prompt, and then reverts to passively waiting for the next one. There's no way for it to anticipate something will happen in response, and thereby get the feedback needed to realize that there is more to the language it receives than is contained in the statistical relationships between its tokens. If its model is updated in the interim, it is unaware, afterwards, that a change has occurred.
Explain sora. It must of course have a blurry understanding of reality to even produce those videos.
I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality. Even the LLMs that are weaker than sora understand things.
What is debatable is whether LLMs are conscious. But whether it can understand something is a pretty clear yes. But does it understand everything? No.
I do not understand these comments at all. Sora was trained on billions of frames from video and images - they were tagged with words like "ballistic missile launch" and "cinematic shot" and it simply predicts the pixels like every other model. It stores what we showed it, and reproduces it when we ask - this has nothing to do with understanding and everything to do with parroting. The fact that it's now a stream of images instead of just 1 changes nothing about it.
What is the difference between a machine that for all intents and purposes appears to understand something to a degree of 100 percent versus a human?
Both the machine and the human are a black box. The human brain is not completely understood and the LLM is only trivially understood at a high level through the lens of stochastic curve fitting.
When something produces output that imitates the output of a human that we claim "understands" things, that is objectively understanding, because we cannot penetrate the black box of human intelligence or machine intelligence to determine anything further.
In fact in terms of image generation the LLM is superior. It will generate video output superior to what a human can generate.
Now mind you, the human brain has a classifier and can identify flaws, but try watching a human with Photoshop try to draw even one frame of those videos... it will be horrible.
Does this indicate that humans lack understanding? Again, hard to answer because we are dealing with black boxes so it's hard to pinpoint what understanding something even means.
We can however set a bar. A metric. And we can define that bar as humans. All humans understand things. Any machine that approaches human input and output capabilities is approaching human understanding.
> What is the difference between a machine that for all intents and purposes appears to understand something to a degree of 100 percent versus a human?
There is no such difference, we evaluate that based on their output. We see these massive models make silly errors that nobody who understands it would make, thus we say the model doesn't understand. We do that for humans as well.
For example, for Sora in the video with the dog in the window, we see the dog walk straight through the window shutters, so Sora doesn't understand physics or depth. We also see it drawing the dog's shadow on the wall very thin, much smaller than the dog itself; it obviously drew that shadow as if it were cast on the ground and not a wall, where it would have been a very large shadow. The shadows from the shutters were normal, because Sora is used to those shadows being on a wall.
Hence we can say Sora doesn't understand physics or shadows, but it has very impressive heuristics about them: the dog accurately places its paws on the platforms, etc., and the paws' shadows were right. But we know those were just basic heuristics, since the dog walked through the shutters and its body's shadow was cast the wrong way, meaning Sora only handles very common cases and fails as soon as things are in an unexpected environment.
>There is no such difference, we evaluate that based on their output. We see these massive models make silly errors that nobody who understands it would make, thus we say the model doesn't understand. We do that for humans as well.
Two things. We also see the model make things that are correct. In fact the mistakes are a minority in comparison to what it got correct. That is in itself an indicator of understanding to a degree.
The other thing is, if a human tried to reproduce that output according to the same prompt, the human would likely not generate something photorealistic, and the thing a human comes up with will be flawed, ugly, disproportionate, wrong, and an artistic travesty. Does this mean a human doesn't understand reality? No.
Because the human generates worse output visually than an LLM we cannot say the human doesn't understand reality.
Additionally the majority of the generated media is correct. Therefore it can be said that the LLM understands the majority of the task it was instructed to achieve.
Sora understands the shape of the dog. That is in itself remarkable. I'm sure with enough data sora can understand the world completely and to a far greater degree than us.
I would say it's uncharitable to say sora doesn't understand physics when it gets physics wrong, and to say that the things it gets right are only heuristics.
I understand physics because science has performed a series of measurable experiments over 100s of years resulting in concrete mathematical formulas and theories that explain the laws of physics so that they can be reproduced by machines.
Sora has zero of this knowledge. This is very much the allegory of the cave[1].
If Sora sees a series of images that contain impossible physics, for example MC Escher paintings, what will happen?
Sora understands basic motion without mathematics in the same way people typically understand physics.
A person with no knowledge about mathematical formulas will still be able to recognize impossible MC Escher paintings. With enough data, Sora will be able to generate both impossible and possible physics and know the difference. We can already see in the video that it has a rough understanding of it.
If by “understand” you mean “can model reasonably accurately much of the time” then maybe you’ll find consensus. But that’s not a universal definition of “understand”.
For example, if I asked you whether you “understand” ballistic flight, and you produced a table that you interpolate from instead of a quadratic, then I would not feel that you understand it, even though you can kinda sorta model it.
And even if you do, if you didn’t produce the universal gravitation formula, I would still wonder how “deeply” you understand. So it’s not like “understand” is a binary I suppose.
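To make the contrast concrete, here is a toy sketch (values made up) of the two ways of "modelling" a projectile launched straight up, only one of which is the quadratic:

    G = 9.81  # m/s^2

    def height_closed_form(v0, t):
        # The quadratic: y = v0*t - g*t^2/2
        return v0 * t - 0.5 * G * t * t

    # A "table you interpolate from", sampled for v0 = 20 m/s.
    table = [(t, height_closed_form(20.0, t)) for t in (0.0, 1.0, 2.0, 3.0, 4.0)]

    def height_from_table(t):
        # Linear interpolation between the two nearest table entries.
        for (t0, y0), (t1, y1) in zip(table, table[1:]):
            if t0 <= t <= t1:
                return y0 + (y1 - y0) * (t - t0) / (t1 - t0)
        raise ValueError("t outside the table")

    # Both are roughly right inside the covered range...
    print(height_closed_form(20.0, 1.5), height_from_table(1.5))
    # ...but only the closed form generalizes to other speeds and to times outside the table.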
Well what would you need to see to prove understanding? That's the metric here. Both the LLM and the human brain are black boxes. But we claim the human brain understands things while the LLM does not.
Thus what output would you expect for either of these boxes to demonstrate true understanding to your question?
It is interesting that you are demanding a metric here, as yours appears to be like duck typing: in effect, if it quacks like a human...
Defining "understanding" is difficult (epistemology struggles with the apparently simpler task of defining knowledge), but if I saw a dialogue between two LLMs figuring out something about the external world that they did not initially have much to say about, I would find that pretty convincing.
This is a common misunderstanding, one also seen with regard to definitions. When applied to knowledge acquisition, it suffers from a fairly obvious bootstrapping problem, which goes away when you realize that metrics and definitions are rewritten and refined as our knowledge increases. Just look at what has happened to concepts of matter and energy over the last century or so.
You are free to disagree with this, but I feel your metric for understanding resembles the Turing test, while the sort of thing I have proposed here, which involves AIs interacting with each other, is a refinement that makes a step away from defining understanding and intelligence as being just whatever human judges recognize as such (it still depends on human judgement, but I think one could analyze the sort of dialogue I am envisioning more objectively than in a Turing test.)
No, it's not a misunderstanding. Without a concrete definition of a metric, comparisons are impossible, because everything is based off wishy-washy conjectures about vague and fuzzy concepts. Hard metrics bring in quantitative data. They show hard differences.
Even if the metric is some side marker that in the future is found to have poor correlation or causation with the thing being measured, the hard metric is still valid.
Take IQ. We assume IQ measures intelligence. But in the future we may determine that no, it doesn't measure intelligence well. That doesn't change the fact that IQ tests still measured something. The score still says something definitive.
My test is similar to the Turing test. But so is yours. In the end there's a human in the loop making a judgment call.
This is rather self-contradictory: you insist we can't make progress with wishy-washy conjectures on vague and fuzzy concepts, and yet your entire argument in this thread for your claim that machine understanding of the real world has been achieved is based on exactly that: your personal subjective assessment of LLM performance!
In your final paragraph, you attempt to suggest that my proposed test is no better than the Turing test (and therefore no better than what you are doing), but as you have not addressed the ways in which my proposal differs from the Turing test, I regard this as merely waffling on the issue. In practice, it is not so easy to come up with tests for whether a human understands an issue (as opposed to having merely committed a bunch of related propositions to memory) and I am trying to capture the ways in which we can make that call.
You entered this debate saying "I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality", yet your post here ends with "in the end there's a human in the loop making a judgment call", explicitly acknowledging that your strong initial claims are matters of opinion, rather than established facts supported by hard metrics.
>This is rather self-contradictory: you insist we can't make progress with wishy-washy conjectures on vague and fuzzy concepts, and yet your entire argument in this thread for your claim that machine understanding of the real world has been achieved is based on exactly that: your personal subjective assessment of LLM performance!
No it's not. I based my argument on a concrete metric. Human behavior. Human input and output.
> I regard this as merely waffling on the issue.
No offense intended, but I disagree. There is a difference, but that difference is trivial to me. To LLMs, talking is also unpredictable. LLMs aren't machines directed to specifically generate creative ideas; they only do so when prompted. Leaving an LLM to its own devices to generate random text does not necessarily lead to new ideas. You need to funnel it in the right direction.
>You entered this debate saying "I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality", yet your post here ends with "in the end there's a human in the loop making a judgment call", explicitly acknowledging that your strong initial claims are matters of opinion, rather than established facts supported by hard metrics.
There are thousands of quantitative metrics. LLMs perform especially well on these. Do I refer to one specifically? No. I refer to them all collectively.
I also think you misunderstood. Your idea is about judging whether an idea is creative or not. That's too wishy-washy. My idea is to compare the output to human output and see if there is a recognizable difference. The second idea can easily be put into an experimental quantitative metric in the exact same way the Turing test does it. In fact, like you said, it's basically just a Turing test.
Overall, AI has passed the Turing test, but people are unsatisfied. Basically they need to make a harsher Turing test to be convinced. For example, have people know up front that the thing inside the computer may be an LLM rather than a person, and have them directly investigate to uncover its true identity. If the LLM can consistently deceive the human, then that is literally the final bar for me.
What are these "thousands of quantitative metrics" on which you base your latest claims? If you have had them on hand all this while, it seems odd that you have not made use of them so far.
>What are these "thousands of quantitative metrics" on which you base your latest claims? If you have had them on hand all this while, it seems odd that you have not made use of them so far.
Hey, no offense, but I don't appreciate this style of commenting where you say it's "odd." I'm not trying to hide evidence from you, and I'm not intentionally lying or making things up in order to win an argument here. I thought of this as an amicable debate. Next time, if you just ask for the metric rather than say it's "odd" that I don't present it, that would be more appreciated.
I didn't present evidence because I thought it was obvious. How are LLMs compared with one another in terms of performance? Usually with quantitative tests. You can feed them any number of these tests, including stuff like the SAT, the bar exam, the ACT, IQ tests, the SAT II, etc.
Most of these tests aren't enough, though, as the LLM is remarkably close to human behavior and can do comparably well and even better than most humans. That last statement would usually make you think those tests are enough, but they aren't, because humans can still detect whether or not the thing is an LLM with a longer targeted conversation.
The final test is really to give the human, with full knowledge of the task, a full hour of investigating an LLM to decide whether it's a human or a robot. If the LLM can deceive the human, that is a hard True/False quantitative metric. That's really the only type of quantitative test left where there is a detectable difference.
I had no intention of implying any malfeasance in my use of the word "odd"; I mean it in the sense of unusual, unexpected and surprising. The thing is, you finished your precursor post saying, about your tests and mine, that it comes down to there being a human in the loop making a judgement call, but in a follow-on you say that there are thousands of quantitative metrics. Why, I wondered, would that matter, if it comes down to a human making a judgement call? Were you switching to a different line of argument, one that (as far as I could tell) had not been raised before? That's what I found surprising about your claim.
I am still rather confused about how this fits into what you are saying more generally. At first I thought you were saying, in your latest post, that the Turing-test interrogator should be restricted to asking questions from the sets having quantitative metrics in order for it to be an objective process, but that doesn't really hold up, as far as I can see. Frankly, I suspect that the tests with objective metrics are beside the point, and the essence of your position is contained within your final paragraph: "If the LLM can deceive the human [then] that is a hard True/False quantitative metric [and the only sort we can get]."
If so, then (no surprise) I think there are some problems with it, but before I go further, I would like to check that I understand your position.
>I had no intention of implying any malfeasance in my use of the word "odd"; I mean it in the sense of unusual, unexpected and surprising. The thing is, you finished your precursor post saying, about your tests and mine, that it comes down to there being a human in the loop making a judgement call, but in a follow-on you say that there are thousands of quantitative metrics. Why, I wondered, would that matter, if it comes down to a human making a judgement call? Were you switching to a different line of argument, one that (as far as I could tell) had not been raised before? That's what I found surprising about your claim.
It matters because of humans. If I gave an LLM thousands of quantitative tests and it passed them all, but in an hour-long conversation a human could identify it was an LLM through some flaw, the human would consider all those tests useless. That's why it matters. The human making a judgement call is still a quantitative measurement, btw, as you can limit human output to True or False. But because every human is different, in order to get good numbers you have to do measurements with multitudes of humans.
>I am still rather confused about how this fits into what you are saying more generally. At first I thought you were saying, in your latest post, that the Turing-test interrogator should be restricted to asking questions from the sets having quantitative metrics in order for it to be an objective process, but that doesn't really hold up, as far as I can see.
It can still be objective with a human in the loop, assuming the human is honest. What's not objective is a human offering an opinion in the form of a paragraph with no definitive clarity on what constitutes a metric. I realize that elements of MY metric have indeterminism to them, but it is still a hard metric because the output is over a well-defined set. Whenever you have indeterminism, you then turn to probability and many samples in order to produce a final quantitative result (see the small sketch at the end of this comment).
>If so, then (no surprise) I think there are some problems with it, but before I go further, I would like to check that I understand your position.
yes my position is that exactly. If all observable qualities indicate it's a duck, then there's nothing more you can determine beyond that, scientifically speaking. You're implying there is a better way?
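Concretely, what I mean by turning to many samples, as a rough sketch (the numbers are made up, just to show the shape of the calculation):

    import math

    def fooled_rate(judgements):
        # judgements: list of True/False, True = the judge mistook the LLM for a human.
        n = len(judgements)
        p = sum(judgements) / n
        half_width = 1.96 * math.sqrt(p * (1 - p) / n)  # rough 95% interval
        return p, half_width

    # e.g. 100 hour-long sessions, 62 of which ended with the judge fooled (made-up data)
    p, hw = fooled_rate([True] * 62 + [False] * 38)
    print(f"fooled rate: {p:.2f} +/- {hw:.2f}")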
At this point, I think it is worth refreshing what the issue here is, which is whether LLMs understand that the language they receive is about an external world, which operates through causes which have nothing to do with token-combination statistics of the language itself.
> It matters because of humans...
I'm still a bit puzzled here, because it seems to me that the paragraph continuing from here is making the argument that LLM performance on these tests doesn't matter, as far as the question is concerned: in this paragraph you seem to be saying (paraphrased) that despite LLMs' impressive performance on these quantitative tests, they could still fail Turing tests, so their performance on these quantitative tests is not decisive.
> yes my position is that exactly…
The impression I get from what you have written in this post is that you are not claiming that a test conforming to your requirements has actually been successfully performed, you are just assuming it could be?
Regardless, let’s assume (at least for the sake of argument) that the series of tests you propose have been performed, and the results are in: in the test environment, humans can’t distinguish current LLMs from humans any better than by chance. How do you get from that to answering the question we are actually interested in? The experiment does not explicitly address it. You might want to say something like “The Turing test has shown that the machines are as intelligent as humans so, like humans, these machines must realize that the language they receive is about an external world” but even the antecedent of that sentence is an interpretation that goes beyond what would have objectively been demonstrated by the Turing test, and the consequent is a subjective opinion that would not be entailed by the antecedent even if it were unassailable. Do you have a way to go from a successful Turing test to answering the question here, which meets your own quantitative and objective standards?
>I'm still a bit puzzled here, because it seems to me that the paragraph continuing from here is making the argument that LLM performance on these tests doesn't matter, as far as the question is concerned: in this paragraph you seem to be saying (paraphrased) that despite LLMs' impressive performance on these quantitative tests, they could still fail Turing tests, so their performance on these quantitative tests is not decisive.
It matters in the quantitative sense. It measures AI performance. What it won't do is matter to YOU. Because you're a human and humans will keep moving the bar to a higher standard, right? When AI shot past the Turing test, humans just moved the goalposts. So to convince someone like YOU, we have to look at the final metric: the point where LLM I/O becomes indistinguishable from or superior to humans. Of course, if you look at the last decade... AI is rapidly approaching that final bar.
>The impression I get from what you have written in this post is that you are not claiming that a test conforming to your requirements has actually been successfully performed, you are just assuming it could be?
Whether I assume or don't assume, the projection of the trendline currently indicates that it will. Given the trendline that is the most probable conclusion.
>The experiment does not explicitly address it.
Nothing on the face of the earth can address the question. Because nobody truly knows what "understanding" something actually is. You can't even articulate the definition in a formal way such that it could be encoded in a computer program.
So I went to the next best possibility, which is my point. The point is ALTHOUGH we don't know what understanding is, we ALL assume humans understand things. So we set that as a bar metric. Anything indistinguishable from a human must understand things. Anything that appears close to a human but is not quite human must understand things ALMOST as well as a human.
> What it won't do is matter to YOU. Because you're a human and humans will keep moving the bar to a higher standard, right? When AI shot past the Turing test, humans just moved the goalposts. So to convince someone like YOU, we have to look at the final metric.
It is disappointing to see you descending into something of a rant here. If you knew me better, you would know that I spend more time debating in opposition to people who think they can prove that AGI/artificial consciousness is impossible than I do with people who think it is already an undeniable fact that it has already been achieved (though this discussion is shifting the balance towards the middle, if only briefly.) Just because I approach arguments in either direction with a degree of skepticism and I don't see any value in trying to call the arrival of true AGI at the very first moment it occurs, it does not mean that I'm trying (whether secretly or openly) to deny that it is possible either in the near-term or at all. FWIW, I regard the former as possible and the latter highly probable, so long as we don't self-destruct first.
> Nothing on the face of the earth can address the question. Because nobody truly knows what "understanding" something actually is. You can't even articulate the definition in a formal way such that it could be encoded in a computer program.
The anti-AI folk I mentioned above would willingly embrace this position! They would say that it shows that human-like intelligence and consciousness lies outside of the scope of the physical sciences, and that this creates the possibility of a type of p-zombie that is indistinguishable by physical science from a human and yet lacks any concept of itself as an entity within an external world.
More relevantly, your response here repeats an earlier fallacy. In practice, concepts and their definitions are revised, tightened, remixed and refined as we inquire into them and gain knowledge. I know you don't agree, but as this is not an opinion but an empirical observation, validated by many cases in the history of science and science-like disciplines, I don't see you prevailing here - and there's the knowledge-bootstrap problem if this were not the case, as well.
It occurred to me this morning that there's a variant or extension of the quantitative Turing test which goes like this:
We have two agents and a judge. The judge is a human and the agents are either a pair of humans, a pair of AIs, or one of each, chosen randomly and without the judge knowing the mix. One of the agents is picked, by random choice, to start a discussion with the other with the intent of exploring what the other understands about some topic, with the discussion-starter being given the freedom to choose the topic. The discussion proceeds for a reasonable length of time - let's say one hour.
The judge follows the discussion but does not participate in it. At the conclusion of the discussion, the judge is required to say, for each agent, whether it is more likely that it is a human or AI, and the accuracy of this call is used to assign a categorical variable to the result, just as in the version of the Turing test you have described.
This seems just as quantitative, and in the same way, as your version, yet there's no reason to believe it will necessarily yield the same results. More tests are better, so what's not to like?
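If it helps, here is a rough sketch of the bookkeeping for one such trial (the Agent and Judge interfaces are hypothetical placeholders, not any particular framework):

    import random

    def run_trial(humans, ais, judge, n_turns=20):
        # Pick two agents at random; the judge does not know the mix.
        pool = [("human", h) for h in humans] + [("ai", a) for a in ais]
        (truth_a, agent_a), (truth_b, agent_b) = random.sample(pool, 2)

        # Agent A (in effect a random pick) opens the discussion on a topic of its choosing.
        transcript = [("A", agent_a.open_discussion())]   # hypothetical method
        speakers = {"A": agent_a, "B": agent_b}
        turn = "B"
        for _ in range(n_turns):
            transcript.append((turn, speakers[turn].reply(transcript)))  # hypothetical method
            turn = "A" if turn == "B" else "B"

        # The judge only observes, then labels each agent "human" or "ai".
        guess_a, guess_b = judge.classify(transcript)      # hypothetical method
        return (guess_a == truth_a), (guess_b == truth_b)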
>It is disappointing to see you descending into something of a rant here.
I'm going to be frank with you. I'm not ranting, and uncharitable comments like this aren't appreciated. I'm going to respond to your reply later in another post, but if I see more stuff like this I'll stop communicating with you. Please don't say stuff like that.
I could have, equally reasonably, made exactly the same response to your post. I will do my best to respond civilly (I admit that I have some failings in this regard), but I also suggest that whenever you feel the urge to capitalize the word "you", you give it a second thought.
Apologies, by YOU I meant YOU as a human, not YOU as an individual. Like, we all generally feel that the quantitative tests aren't enough. The capitalization was for emphasis, for you to look at yourself and know that you're human and likely feel the same thing. Most people would say that stuff like IQ tests isn't enough, and we can't pinpoint definitively why; as humans, WE (keyword change) just feel that way.
That feeling is what sets the bar. There's no rhyme or reason behind it. But humans are the one who make the judgement call so that's what it has to be.
The author of the post is saying that understanding something can't be defined because we can't even know how the human brain works. It is a black box.
The author is saying at best you can only set benchmark comparisons. We just assume all humans have the capability of understanding without even really defining the meaning of understanding. And if a machine can mimic human behavior, it must also understand.
That is literally how far we can go from a logical standpoint. It's the furthest we can go in terms of classifying things as either capable of understanding or not capable or close.
What you're not seeing is that the LLM is not only mimicking human output to a high degree; it can even produce output that is superior to what humans can produce.
What the author of the post actually said - and I am quoting, to make it clear that I'm not putting my spin on someone else's opinion - was "There's no difference between doing something that works without understanding and doing the exact same thing with understanding."
I'm the author. To be clear. I referred to myself as "the author."
And no, I did not say that. Let me be clear: I did not say that there is "no difference". I said that whether there is or isn't a difference, we can't fully know, because we can't define or know what "understanding" is. At best we can only observe external reactions to input.
That was just about guaranteed to cause confusion, as in my reply to solarhexes, I had explicitly picked out "the author of the post to which you are replying", who is cultureswitch, not you, and that post most definitely did make the claim that "there's no difference between doing something that works without understanding and doing the exact same thing with understanding."
It does not seem that cultureswitch is an alias you are using, but even if it is, the above is unambiguously the claim I am referring to here, and no other.
I think there are two axes: reason about and intuit. I "understand" ballistic flight when I can calculate a solution that puts an artillery round on target. I also "understand" ballistic flight when I make a free throw with a basketball.
On writing that, I have an instinct to revise it to move the locus of understanding in the first example to the people who calculated the ballistic tables, based on physics first-principles. That would be more accurate, but my mistake highlights something interesting: an artillery officer / spotter simultaneously uses both. Is theirs a "deeper" / "truer" understanding? I don't think it is. I don't know what I think that means, for humans or AI.
I fail to see how changing the output medium from sentences to movie frames is a difference that I need to account for - the principle is the same either way.
I feel you are missing an important part of my point here. I am not taking a position on whether LLMs can be said to understand anything at all; I am saying that I seriously doubt that LLMs understand that the language they receive refers to an external world.
> I think we are way past the point of debate here. LLMs are not stochastic parrots. LLMs do understand an aspect of reality. Even the LLMs that are weaker than sora understand things.
What is one such aspect? (I'm not asking in order to debate it here, but more because I want to test / research it on my own time)
I pay for chatGPT, so it depends on whether you pay for it or not. I think it's worth it because, whether it understands things or not, chatGPT represents a paradigm shift in human history. You'll need it because it's currently the best conversational LLM out there and the one that shows the most compelling evidence.
Basically, you just spend a lot of time with chatGPT4 and ask it deep questions that don't exist in its dataset. Get creative. The LLM will output answers that demonstrate a lack of understanding, but it will also produce answers that display a remarkable amount of understanding. Both sets of answers exist, and people often cite the wrong answers as evidence of a lack of understanding, but they're setting the bar too high. The fact that many of these answers do demonstrate understanding of concepts makes it very, very compelling.
This entire conversation thread, I believe, does not exist in a parallel form in its data set. It demonstrates understanding of RPS beyond the confines of text, it demonstrates understanding of simultaneity EVEN when the LLM wholly lives in a world of turn-based questions and responses, it understands itself relative to simultaneity, it tries to find solutions around its own problem, it understands how to use creativity and solutions such as cryptography to solve the problem of RPS when playing with it, and it also understands the weaknesses of its own solutions.
Conversations such as this show that chatGPT displays a remarkable understanding of the world. There are conversations opposite to this that demonstrate LLMs displaying an obvious lack of understanding. But the existence of the conversations that lack understanding does NOT negate the ones that do demonstrate understanding. The fact that even partial understanding exists is a milestone for AI.
This isn't anthropomorphism. People are throwing this word around, trying to get others to recognize their own biases, without realizing that it just demonstrates their own biases. We literally can't even define "understanding", and both LLMs and the human brain are black boxes. Making adamant claims that LLMs don't understand anything, without addressing this fact, is itself a form of bias.
The way I address the problem above is that I just define a bar. I define humans as the bar of "understanding" without defining what understanding itself means. Then, if any machine begins approaching this bar in terms of input and output matching human responses, that is logically indistinguishable from approaching "understanding". That's literally the best metric we have.
> how did LLMs get this far without any concept of understanding? how much further can they go until they become “close enough”?
I don't know that that is quite the right question to ask.
Understanding exists on a spectrum. Even humans don't necessarily understand everything they say or claim (incl. what they say of LLMs!), and then there are things a particular human would simply say "I don't understand".
But when you ask a human "can you understand things?" you will get an unequivocal Yes!
Ask that same question of an LLM and what does it say? I don't think any of them currently respond with a simple or even qualified "Yes". Now, some might claim that one day an LLM will cross that threshold and say "Yes!" but we can safely leave that off to the side for a future debate if it ever happens.
General note: it is worth separating out things like "understanding", "knowledge", "intelligence", "common sense", "wisdom", "critical thinking", etc. While they might all be related in some ways and even overlap, it does not follow that if you show high performance in one you automatically excel in each of the others. I know many people who anyone would say are highly intelligent but lack common sense, etc.
At the root of the problem, I believe, is that a human (or LLM) saying they understand has little to no bearing on if they actually understand!
People in particular have evolved complex self-protective mechanisms to provide the right answers for their given environment for safety reasons, based on a number of different individual strategies. For example, the overly honest, the self-deprecating, the questioner, the prosecutor, the victim, the liar, the absent-minded professor, the idiot, etc.
LLMs are not that complex or self-referential.
Personally, my guess is that you'd want to build a model (of some kind!) whose sole job is determining the credibility of a given string of tokens (similar to what someone else noted in a sibling comment about high answer volatility based on minor input changes - that does sound like a signal of low credibility), and somehow integrate THAT self-referential feedback into the process (rough sketch at the end of this comment).
Notably, even the smartest lawyers (or perhaps, especially the smartest lawyers) will have assistants do research once they've set out a strategy so they are sure THEY aren't bullshitting. Same with professors, professional researchers, engineers, etc.
Because until someone goes and actually reads the case law from a credible source, or checks the primary research, or calculates things, it's possible someone was misremembering or just wrong.
Being right more often is not about never having a wrong thought/idea/statement; it's about double-checking when you think you might be bullshitting, and NOT saying the bullshit answer until you've checked. Which is, proportionally, very expensive. The really good professionals will generate MANY lines of such inquiry in parallel for folks to track down, and then, based on their degree of confidence in each one and the expected context the answer will be used in, will formulate the 'most correct' response, which is proportionally even more expensive.
So at least during the process, there would be a signal that the system was likely 'bullshitting', which might help it at least be able to signal when its answers are low-confidence. (The human equivalent of stuttering, looking down and away, looking ashamed, haha!)
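The shape of that, very roughly (generate_answer and score_credibility are hypothetical placeholders here, and the threshold is arbitrary):

    def answer_with_credibility_check(prompt, generate_answer, score_credibility, threshold=0.7):
        draft = generate_answer(prompt)
        # A separate model scores how credible the draft is, e.g. trained on (claim, checked-out?) pairs.
        credibility = score_credibility(prompt, draft)
        if credibility < threshold:
            # The machine equivalent of stuttering / looking down and away:
            # flag the answer instead of stating it as if it had been checked.
            return "[low confidence, needs checking] " + draft
        return draft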
Every human gets fooled sometimes in at least some venue though.
> At the root of the problem, I believe, is that a human (or LLM) saying they understand has little to no bearing on if they actually understand!
That's certainly one root of the problem, but I would argue that there are multiple roots to this problem!
Humans have further realized that understanding itself is provisional and incomplete, which is quite a remarkable insight (understanding if you will), itself.
In order to do that effectively, an LLM has to itself have understanding. At a certain point, we end up in a metaphysical argument about whether a machine that is capable of responding as if it had understanding actually does have understanding. It ends up being a meaningless discussion.
The students learned to repeat the text of the books, without "understanding" what the books were describing. I'm sure this says something about one side or the other of this conundrum, but I'm not sure which. :-)
The central claim is that a machine which answers exactly the same thing a human would answer given the same input does not have understanding, while the human does.
This claim is religious, not scientific. In this worldview, "understanding" is a property of humans which can't be observed but exists nonetheless. It's like claiming humans have a soul.
People also often don't understand things and have trouble separating fact from fiction. By logic, only one religion, or no religion, is true. Consequently, also by logic, the followers of most religions in the world, who believe their religion to be true, are hallucinating.
The second thing to realize is that your argument doesn't really apply. It's in theory possible to create a stochastic parrot that can imitate, to a degree of 100 percent, the output of a human who truly understands things. That blurs the line of what understanding is.
One can even define true understanding as a stochastic parrot that generates text indistinguishable from that of total understanding.
> People also often don't understand things and have trouble separating fact from fiction.
That's not the point being argued. Understanding, critical thinking, knowledge, common sense, etc. all these things exist on a spectrum - both in principle and certainly in humans. In fact, in any particular human there are different levels of competence across these dimensions.
What we are debating, is whether or not, an LLM can have understanding itself. One test is: can an LLM understand understanding? The human mind has come to the remarkable understanding that understanding itself is provisional and incomplete.
Of course it can. Simply ask the LLM about itself. chatGPT4 can answer.
In fact, that question is one of the more trivial questions, one it will most likely not hallucinate on.
The reason why I alluded to humans here is because I'm saying we are setting the bar too high. It's like everyone is saying it hallucinates and therefore it can't understand anything. I'm saying that we hallucinate too and because of that LLMs can approach humans and human level understanding.
To answer "I don't know" requires one to know when you know. To know when you know in turn requires understanding.