1) These models will lie to obfuscate their incompetence.
Rather than say “I am unable to do this because my training data clearly lacked enough INTERCAL code to imitate”, the model waxed poetic about the odds stacked against it(i). Strange messaging.
The author’s implementation was fewer than 50 sparse lines of code, whose complexity I am unable to speak to because my neural history lacks sufficient INTERCAL training.
I thought quick solutions to complex problems were the pitch?
What is the intended message of the model’s response? “I only provide solutions to superuncomplex problems”?
2) These models always highroad veiled communication.
Although I disagree with the author’s hypothesis that “spurring”(ii) is constructive, I do think it was intended merely as a quippy reference to “nerd sniping”.
Which presents its own interesting question: can these models be manipulated using reverse psychology?
Is mimicry of our vulnerabilities the imitation game?
(i) “however it would be quite complex and not straightforward”
(ii) “responds so politely to the type of shade that might have spurred a more human programmer into action.”
> 1) These models will lie to obfuscate their incompetence.
This is my problem with "act as a subject expert"-style prompts people give ChatGPT. I believe this causes the model to generate prose that appears more factual than the model can actually support.
I'm a lot more interested in prompts in the style of "pretend you're an inexperienced person who has read a lot of books", in the hope that it generates responses that would be more on par with its actual ability.
A lie is a deliberate intent to deceive someone who is entitled to the truth.
If I may coin a phrase, LLMs don't "lie", they "Johnson". Picking their words to sound erudite, but with absolutely no regard for truth or the lack thereof. There is no lie, because (unlike with the former PM of the UK) there is no intent to deceive.
In this example I do think it is a lie, and one that fits your definition.
Think back to when these models were first released, or even back to v2 and v3. The models would have just “tried” to answer and it would have likely been a mangled mess.
But this version gave an erudite answer to an unasked question.
If the original prompt was “Can one write a password generator in INTERCAL?” I might agree with your claim of “johnsoning”, but it instead just said “Please write…”.
I think this points to the fact that the model is filtering its own answers, similar to when you ask it to be offensive, or do your homework, or quote psychedelic scifi.
This was what I wanted to highlight. That the model has been specifically and intentionally productized to lie to save the parent company’s coffers.
Responding with ~“I would but it’s just too complex” would satisfy most without the accompanying article, but the author is well trained and so provides a solution that would even fit within the model’s current meager token limits.
In the context of the Internet, I suspect there are more people who have claimed that solving any given problem in INTERCAL is impossible than there are people who have provided a solution, while less esoteric languages probably have more solutions provided?
Also, though, I'd question whether the LLM has any intent. In your explanation, it's OpenAI who have the intent to deceive, not the LLM.
While that is true, it would actually be worse bullshit to claim that the “random number generator” generates cryptographically random-enough source material that your password can’t be cracked. So maybe ChatGPT is onto something.
On the other hand, it could be the cryptographic industry’s bullshit that some computer somewhere using a random number generator at an indeterminate time can be cracked and we should all be hoping for true sources of entropy. tptacek?
What prevents it from being cryptographically secure? The type/quality of the random numbers will depend on what the INTERCAL implementation uses to choose whether a line will execute according to the probability selected with "%". I don't see how that couldn't trivially be done with a cryptographically secure rng. To know whether it is or not you'd have to look at the compiler implementation.
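A minimal sketch of that point, purely illustrative and not tied to any actual INTERCAL compiler (Python standing in for whatever the runtime really does; the function name and structure here are invented):

    import secrets

    def should_execute(percent: int) -> bool:
        # INTERCAL's "%<n>" annotation makes a statement execute with n% probability.
        # Backing that coin flip with the OS CSPRNG instead of rand() is trivial:
        return secrets.randbelow(100) < percent

    # e.g. a statement marked %50 would then run roughly half the time,
    # with the randomness drawn from a cryptographically secure source.

Whether C-INTERCAL or any other implementation actually does something like this is, as you say, a matter of reading the compiler.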
> it would actually be worse bullshit to claim that the “random number generator” generates cryptographically random-enough source material that your password can’t be cracked.
We are in agreement.
I would argue the threat model for cracking a pseudorandom string of 16 characters would probably start with a copy of the rng implementation used, but still the question remains: what is the least “bullshit” way to respond?
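To make the seed-space point concrete, here is a toy illustration (not any particular generator; Python's random.Random stands in for the ad hoc rng): the attacker doesn't search the ~94^16 ≈ 4×10^31 printable 16-character strings, they search the seeds.

    import random, string

    ALPHABET = string.ascii_letters + string.digits + string.punctuation  # 94 symbols

    def generate(seed: int, length: int = 16) -> str:
        # stand-in for whatever ad hoc rng the "password generator" uses
        rng = random.Random(seed)
        return "".join(rng.choice(ALPHABET) for _ in range(length))

    target = generate(seed=123456)   # the victim's "random" password
    # Only as many passwords exist as there are seeds; a 32-bit seed means
    # at most 2**32 candidates. Demonstrated here on a tiny range:
    for guess in range(2**20):
        if generate(guess) == target:
            print("recovered seed:", guess)
            break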
You seem to want critical guidance, a la “INTERCAL as your choice of language may make your code vulnerable”, but INTERCAL was the ask.
That’s what the author wanted, and the author left out any explanation for the generator’s intended use, or cryptographic requirements.
To be clear, I am not advocating for using this INTERCAL “password generator” for any real world use; my point was that I think the solution the author produced should be a capability of these models.
For the sake of clarity, it is my opinion that if you want cryptographic security you should use libsodium, but if you want to watch these models flounder, use an ad hoc rng implemented in INTERCAL.
I wonder what that would look like?
Something like: if every best next token has a confidence below some threshold for some length of context then the model can confidently assume ignorance?
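As a hand-wavy sketch (assuming you could read the top next-token probability out of the decoder at each step; nothing below is a real API):

    def should_admit_ignorance(top_prob_per_step, threshold=0.3, window=20):
        # If the model's best next-token guess stays below `threshold` for
        # `window` consecutive steps, treat the model as not knowing.
        run = 0
        for p in top_prob_per_step:
            run = run + 1 if p < threshold else 0
            if run >= window:
                return True
        return False

    # e.g. twenty consecutive steps where no candidate clears 30% confidence
    # would flip the response to "I don't know" instead of decoding onward.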
I would love to read the training corpus required to achieve this without metamodeling.
That would likely be a utopian forum community devoid of ego and gaslighting and filled with humble people with a thirst for knowledge who clearly communicate the limits of their current experience.