If LLMs still produce code that is eventually compiled down to a very low level...that would mean it can be checked and verified, the process just has additional steps.
JavaScript has a ton of behavior that is very uncertain at times and I'm sure many JS developers would agree that trusting what you're standing on is at times difficult. There is also a large percentage of developers that don't mathematically verify their code, so the verification is kind of moot in those cases, hence bugs.
The current world of LLM code generation lacks the verification you are looking for; however, I am guessing that such tools will soon emerge in the market. For now, building as incrementally as possible and having good tests seems to be a decent path forward.
There are 4 important components to describing a compiler: the source language, the target language, and the meaning (semantics in compiler-speak) of each of those languages.
We call a C->asm compiler "correct" if it turns every valid C program into an assembly program with equivalent meaning.
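That definition can be written down roughly as follows. The notation is an assumption for illustration, not something from this thread: the double brackets stand for "the meaning of", `compile` is the compiler, and `ValidC` is the set of valid C programs.

```latex
% Sketch of compiler correctness under the assumed notation:
% the compiled program means the same thing as the source program.
\forall p \in \mathrm{ValidC}:\quad
  [\![\,\mathrm{compile}(p)\,]\!]_{\mathrm{asm}} \;=\; [\![\,p\,]\!]_{\mathrm{C}}
```

The statement only makes sense if both sets of brackets are well defined, which is why the semantics of both languages count as components.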
The reason LLMs don't work like other compilers is not that they're non-deterministic; it's that the source language is ambiguous.
LLMs can never be "correct" compilers, because there's no definite meaning assigned to English. Even if English had precise meaning, LLMs will never be able to accurately turn any arbitrary English description into a C program.
Imagine how painful development would be if compilers produced incorrect assembly for 1% of all inputs.
English does have precise meaning when it is constructed to be precise. The issue is that LLMs do not assign meaning the way humans do. Humans assign English meaning to code every day just fine, and sometimes that results in bugs as well.
The LLM in this loop is the equivalent of a human, which also has ambiguous source language if we’re going by your theory of English being ambiguous. So it sounds like you’re saying that if a human produces a C program, it is not verifiable and testable because the human used an ambiguous source language?
I guess for some reason people thought I meant that the compiler would be LLM > machine code, when actually I meant the compiler would still take whatever language the LLM produces down to machine code. It's just that the language the LLM produces can be checked through things like TDD or a human, etc...
I would probably agree! I came off sounding as if there is no human in the loop. What I meant is that the input is still the programming language that is produced and the output is the result, not that the LLM is the initial input. A human in the loop can clean up the code produced or create tests that check for an end result (or intermediate results as well).
I understand that the same input to an LLM will produce a different result in many cases, making the output non-deterministic, but that doesn't mean we can't use probability to arrive at results eventually.
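As a rough sketch of what "using probability" could look like: keep generating candidates and only accept one that passes a deterministic check. Everything here is hypothetical. `fake_generator` is a stand-in for an LLM call that is right about 30% of the time, and `passes_tests` is a toy check, so this illustrates the loop rather than any real tool.

```python
import random
from typing import Callable, Optional, Tuple

Candidate = Callable[[int], int]

def passes_tests(candidate: Candidate) -> bool:
    """Deterministic check: the candidate must double its input."""
    return all(candidate(x) == 2 * x for x in (-3, 0, 1, 10))

def fake_generator(rng: random.Random) -> Candidate:
    """Hypothetical stand-in for an LLM: correct only some of the time."""
    if rng.random() < 0.3:
        return lambda x: 2 * x   # correct
    return lambda x: x + 2       # plausible-looking but wrong

def generate_until_verified(
    rng: random.Random, max_attempts: int = 50
) -> Optional[Tuple[Candidate, int]]:
    """Retry generation until a candidate passes the check or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        candidate = fake_generator(rng)
        if passes_tests(candidate):
            return candidate, attempt
    return None

result = generate_until_verified(random.Random(0))
```

If each attempt passes independently with probability p, the chance that at least one of n attempts passes is 1 - (1 - p)^n. Note that the accepted candidate is only as trustworthy as the check itself: passing four test inputs is not a proof of correctness.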
I mean the things _producing_ the code can be checked and verified, meaning the code generated is guaranteed to be correct. You're talking about verifying the code _produced_. That's the big difference.
Would be curious as to how you check and verify LLMs? And how you get guaranteed correct code?
Verifying code produced is a much simpler task for some code because I, as a human, can look at a generated snippet and reason about it and determine if it is what I want. I can also create tests to say “does this code have this effect on some variable” and then proceed to run the test.
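A minimal sketch of that kind of test, assuming a made-up generated snippet (`apply_discount` and the `cart` dict are hypothetical names, not from any real codebase):

```python
def apply_discount(cart: dict, percent: float) -> None:
    """Hypothetical generated snippet: reduces cart['total'] in place."""
    cart["total"] = round(cart["total"] * (1 - percent / 100), 2)

def test_apply_discount_changes_total() -> None:
    """Does this code have the expected effect on the variable?"""
    cart = {"total": 200.0}
    apply_discount(cart, 25)
    assert cart["total"] == 150.0

def test_zero_discount_leaves_total_alone() -> None:
    cart = {"total": 80.0}
    apply_discount(cart, 0)
    assert cart["total"] == 80.0

# Run the checks directly; a test runner would normally discover them.
test_apply_discount_changes_total()
test_zero_discount_leaves_total_alone()
```

The test says nothing about how the snippet was produced. It only pins down the effect I care about, which is the point being made here.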