
Fun fact: the "right here, right now" sample in the eponymous Fatboy Slim song is Angela Bassett (in the film Strange Days) trying to make Ralph Fiennes come to his senses and forget Juliette Lewis.

> Compilers will produce working output given working input literally 100% of my time in my career. I've never personally found a compiler bug.

The first compilers were created in the fifties. I doubt those were bug-free.

Give LLMs fifty or so years, then let's see how (un)reliable they are.


What I don't understand about these arguments is that the input to the LLMs is natural language, which is inherently ambiguous. At which point, what does it even mean for an LLM to be reliable?

And if you start feeding an unambiguous, formal language to an LLM, couldn't you just write a compiler for that language instead of having the LLM interpret it?


1) Determinism isn't the same as reliability.

Compilers are deterministic (modulo bugs), but most things in life are not, yet they can still be reliable.

The opposite also holds: "npm install && npm run build" can work today and fail in a year (due to ecosystem churn) even though every single component in that chain is deterministic.
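
To make that concrete (a hypothetical project with a caret version range and no committed lockfile; the package name is made up):

  # package.json declares "some-lib": "^1.2.0", so any 1.x >= 1.2.0 satisfies it
  npm install && npm run build   # today: resolves some-lib 1.2.3, build passes
  # a year later the same range resolves some-lib 1.9.0, which changed behavior
  npm install && npm run build   # fails, even though nothing in the repo changed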

2) Reliability is a continuum, not a discrete yes/no. In practice, we want things to be reliable enough (where "enough" is determined per domain).

I don't presume this will immediately change your mind, but hopefully will open your eyes to looking at this a bit differently.


> I don't presume this will immediately change your mind

I'm not saying that AI isn't useful. I'm just claiming it's not analogous to a compiler. If it was, you would treat your prompts as source code, and check them into source control. Checking the output of an LLM into source control is analogous to committing the machine code output from a compiler into source control.

My question still stands though. What does it mean for a tool to be reliable when the input language is ambiguous? This isn't just about the LLM being nondeterministic. At some point those ambiguities need to be resolved, either by the prompter, or the LLM. But the resolution to those ambiguities doesn't exist in the original input.


For those wondering how that looks in practice, here's one of OP's past blog posts describing a coding session to implement a non-trivial feature: https://mitchellh.com/writing/non-trivial-vibing (covered on HN here: https://news.ycombinator.com/item?id=45549434)

We already have Reddit.

The unironic use of "off brand office suite" here is hysterically tragic.

Also, the productivity suite formerly known as Office is these days called "365 Copilot".



OP here, no worries, loved the comment and appreciate the feeling :)

Don't give it access to your ssh keys!

Yes, it should have its own dedicated key instead of sharing one of your own.
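
Something like this (the key filename and host alias are just placeholders):

  # generate a dedicated key for the agent (no passphrase, for unattended use)
  ssh-keygen -t ed25519 -f ~/.ssh/agent_key -C "coding-agent" -N ""

  # ~/.ssh/config: only this alias uses the agent key
  # Host agent-github
  #   HostName github.com
  #   User git
  #   IdentityFile ~/.ssh/agent_key
  #   IdentitiesOnly yes

Then you can revoke that one key (e.g. a deploy key on the forge side) without touching your personal keys.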

There's https://code.claude.com/docs/en/sandboxing that uses something called Seatbelt on Mac and bubblewrap (the same thing I used here) on Linux.

No idea how customizable that is.


Article author here. I used trial and error - manual inspection it is.

This took me a few minutes, but I feel more in control of what's being exposed and how. The AI recommended just exposing the entire /etc, for example. That's probably okay in my case, but I wanted to be more precise.
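
Roughly the shape of it (the paths here are just examples, and the trailing command is a placeholder for however you launch the agent):

  # bind only the specific /etc entries the tools actually need, read-only
  bwrap \
    --ro-bind /usr /usr \
    --symlink usr/bin /bin --symlink usr/lib /lib --symlink usr/lib64 /lib64 \
    --ro-bind /etc/resolv.conf /etc/resolv.conf \
    --ro-bind /etc/ssl /etc/ssl \
    --bind "$PWD" "$PWD" \
    --dev /dev --proc /proc --tmpfs /tmp \
    --unshare-all --share-net \
    -- your-agent-command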

On the network access part, I let it fully loose (no restrictions, it can access anything). I might want to tighten that in the future (or at least disallow 192.168/16 and 10/8; a rough sketch of one way to do that is below), but for now I'm not very concerned.

So there are levels of how tight you want to set it.
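
If I do tighten it later, one option (just a sketch, assuming the sandboxed agent runs as its own dedicated user, here called "agent") is host firewall rules rejecting that user's traffic to the private ranges:

  # reject the agent user's traffic to RFC1918 ranges, allow everything else
  iptables -A OUTPUT -m owner --uid-owner agent -d 10.0.0.0/8 -j REJECT
  iptables -A OUTPUT -m owner --uid-owner agent -d 192.168.0.0/16 -j REJECT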


> I feel more in control of what's being exposed and how

Makes complete sense. Thanks for your insights!

