
Fun fact: the "right here, right now" sample in the eponymous Fatboy Slim song is Angela Bassett (in the film Strange Days) trying to make Ralph Fiennes come to his senses and forget Juliette Lewis.

> Compilers will produce working output given working input literally 100% of my time in my career. I've never personally found a compiler bug.

The first compilers were created in the fifties. I doubt those were bug-free.

Give LLMs fifty or so years, then let's see how (un)reliable they are.


What I don't understand about these arguments is that the input to the LLMs is natural language, which is inherently ambiguous. At which point, what does it even mean for an LLM to be reliable?

And if you start feeding an unambiguous, formal language to an LLM, couldn't you just write a compiler for that language instead of having the LLM interpret it?


1) Determinism isn't the same as reliability.

Compilers are deterministic (modulo bugs), but most things in life are not, yet they can still be reliable.

The opposite also holds: "npm install && npm run build" can work today and fail in a year (due to ecosystem churn) even though every single component in that chain is deterministic.
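
To make that concrete (a hypothetical project with a caret version range and no committed lockfile; the package name is made up):

  # package.json declares "some-lib": "^1.2.0", so any 1.x >= 1.2.0 satisfies it
  npm install && npm run build   # today: resolves some-lib 1.2.3, build passes
  # a year later the same range resolves some-lib 1.9.0, which changed behavior
  npm install && npm run build   # fails, even though nothing in the repo changed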

2) Reliability is a continuum, not a discrete yes/no. In practice, we want things to be reliable enough (where "enough" is determined per domain).

I don't presume this will immediately change your mind, but hopefully will open your eyes to looking at this a bit differently.


> I don't presume this will immediately change your mind

I'm not saying that AI isn't useful. I'm just claiming it's not analogous to a compiler. If it was, you would treat your prompts as source code, and check them into source control. Checking the output of an LLM into source control is analogous to committing the machine code output from a compiler into source control.

My question still stands though. What does it mean for a tool to be reliable when the input language is ambiguous? This isn't just about the LLM being nondeterministic. At some point those ambiguities need to be resolved, either by the prompter, or the LLM. But the resolution to those ambiguities doesn't exist in the original input.


For those wondering how that looks in practice, here's one of OP's past blog posts describing a coding session to implement a non-trivial feature: https://mitchellh.com/writing/non-trivial-vibing (covered on HN here: https://news.ycombinator.com/item?id=45549434)

We already have Reddit.

The unironic use of "off brand office suite" here is hysterically tragic.

Also, the productivity suite formerly known as Office is these days called "365 Copilot".



OP here, no worries, loved the comment and appreciate the feeling :)

Don't give it access to your ssh keys!

Yes, it should have its own dedicated key instead of sharing one of your own.
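
Something like this (the key filename and host alias are just placeholders):

  # generate a dedicated key for the agent (no passphrase, for unattended use)
  ssh-keygen -t ed25519 -f ~/.ssh/agent_key -C "coding-agent" -N ""

  # ~/.ssh/config: only this alias uses the agent key
  # Host agent-github
  #   HostName github.com
  #   User git
  #   IdentityFile ~/.ssh/agent_key
  #   IdentitiesOnly yes

Then you can revoke that one key (e.g. a deploy key on the forge side) without touching your personal keys.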

There's https://code.claude.com/docs/en/sandboxing that uses something called Seatbelt on Mac and bubblewrap (the same thing I used here) on Linux.

No idea how customizable that is.


Article author here. I used trial and error - manual inspection it is.

This took me a few minutes, but I feel more in control of what's being exposed and how. The AI recommended just exposing the entire /etc, for example. That's probably okay in my case, but I wanted to be more precise.
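
Roughly the shape of it (the paths here are just examples, and the trailing command is a placeholder for however you launch the agent):

  # bind only the specific /etc entries the tools actually need, read-only
  bwrap \
    --ro-bind /usr /usr \
    --symlink usr/bin /bin --symlink usr/lib /lib --symlink usr/lib64 /lib64 \
    --ro-bind /etc/resolv.conf /etc/resolv.conf \
    --ro-bind /etc/ssl /etc/ssl \
    --bind "$PWD" "$PWD" \
    --dev /dev --proc /proc --tmpfs /tmp \
    --unshare-all --share-net \
    -- your-agent-command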

On the network access part, I let it fully loose (no restrictions, it can access anything). I might want to tighten that in the future (or at least disallow 192.168/16 and 10/8; a rough sketch of one way to do that is below), but for now I'm not very concerned.

So there are levels of how tight you want to set it.
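
If I do tighten it later, one option (just a sketch, assuming the sandboxed agent runs as its own dedicated user, here called "agent") is host firewall rules rejecting that user's traffic to the private ranges:

  # reject the agent user's traffic to RFC1918 ranges, allow everything else
  iptables -A OUTPUT -m owner --uid-owner agent -d 10.0.0.0/8 -j REJECT
  iptables -A OUTPUT -m owner --uid-owner agent -d 192.168.0.0/16 -j REJECT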


> I feel more in control of what's being exposed and how

Makes complete sense. Thanks for your insights!

