LLMs can never provide 100% reliability - there's a random number generator in the mix after all (reflected in the "temperature" setting).

For answering questions about documentation, the newer long context models are wildly effective in my experience. You can dump a million tokens (easily a full codebase or two for most projects) into Gemini 2.5 Pro and get great answers to almost anything.
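
As a rough sketch of that workflow using the google-generativeai Python SDK (the model id, file filter and question below are my own placeholder assumptions):

    import os
    import google.generativeai as genai

    genai.configure(api_key=os.environ["GEMINI_API_KEY"])

    # Concatenate a whole codebase into one long-context prompt.
    chunks = []
    for root, _dirs, files in os.walk("my-project"):
        for name in files:
            if name.endswith((".py", ".md")):
                path = os.path.join(root, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    chunks.append(f"### {path}\n{f.read()}")
    context = "\n\n".join(chunks)

    # Long-context models like Gemini 2.5 Pro accept on the order of a
    # million input tokens, so the entire dump can go in one request.
    model = genai.GenerativeModel("gemini-2.5-pro")  # exact model id may differ
    response = model.generate_content(
        [context, "Question: where is the retry logic implemented?"]
    )
    print(response.text)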

There are some new anonymous preview models with 1m token limits floating around right now which I suspect may be upcoming OpenAI models. https://openrouter.ai/openrouter/optimus-alpha

I actually use LLMs for command line arguments for tools like ffmpeg all the time; I built a plugin for that: https://simonwillison.net/2024/Mar/26/llm-cmd/
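
The plugin itself works at the shell level; here's a minimal sketch of the same idea via the llm library's Python API (the model name and prompts are my own assumptions):

    import llm

    # Ask a model to turn a plain-English request into an ffmpeg invocation,
    # then print it for review rather than executing it blindly.
    model = llm.get_model("gpt-4o-mini")  # assumes a key is configured for llm
    response = model.prompt(
        "Convert input.mov to an mp4 scaled to 720p height, keeping the audio.",
        system="Reply with a single ffmpeg command and nothing else.",
    )
    print(response.text())  # review before running it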



> random number generator

But the use of randomness inside the system should not, in theory, prevent effectively full reliability - which suggests the architecture may be unfinished, as I expressed with the example of RAG. (E.g.: well trained natural minds apply checking systems to their provisional output, however it was obtained.)
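
To make that concrete, a hypothetical generate-then-check loop - every name here is made up for illustration, not an existing API:

    def reliable_answer(question, generate, validate, max_attempts=3):
        # generate: any stochastic LLM call; validate: a deterministic check
        # (run the tests, dry-run the command, verify citations against the
        # retrieved documents, etc.)
        reason = "no attempts made"
        for _ in range(max_attempts):
            draft = generate(question)    # provisional output, randomness inside
            ok, reason = validate(draft)  # deterministic check over the output
            if ok:
                return draft
        raise RuntimeError(f"no draft passed validation: {reason}")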

> newer long context models

Practical question: if the query-relevant documentation needs to be part of the input (I am not aware of a more efficient way), doesn't that massively impact the processing time? Suppose you have to interactively examine the content of a Standard Hefty Document of 1MB of text... If so, that would make local LLM use prohibitive.
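
Back-of-envelope, with throughput figures that are guesses rather than benchmarks:

    doc_bytes = 1_000_000                 # the 1MB document
    tokens = doc_bytes / 4                # ~4 chars per token for English text
    local_prefill = 200                   # assumed local prompt tokens/sec
    hosted_prefill = 5_000                # assumed hosted prompt tokens/sec
    print(tokens / local_prefill / 60)    # ~21 minutes just to read the prompt
    print(tokens / hosted_prefill)        # ~50 seconds on hosted hardware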


Longer context is definitely slower, especially for local models. Hosted models running on who knows what kind of overpowered hardware can crunch through them pretty fast though. There's also token caching available for OpenAI, Anthropic, Gemini and DeepSeek which can dramatically speed up processing of long context prompts if they've been previously cached.
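
For example, with Anthropic's Python SDK caching is opt-in via a cache_control marker on the large block (the model id and document source below are placeholders); repeat calls that reuse that prefix are processed much faster:

    import anthropic

    big_documentation_string = open("docs.md", encoding="utf-8").read()

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-3-7-sonnet-latest",  # placeholder model id
        max_tokens=1024,
        system=[
            {
                "type": "text",
                "text": big_documentation_string,      # the long context to cache
                "cache_control": {"type": "ephemeral"},
            }
        ],
        messages=[{"role": "user", "content": "Where is the retry logic documented?"}],
    )
    print(response.content[0].text)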



