Hacker News | raesene9's comments

I've just been looking into this, as I've got quite a lot of older hardware lying around that'll be fine for running some websites.

My ISP has a static IP option for £5/month, but I reckon I can save £30/month+ on server costs even before any price rises.

Ofc it does mean I have to do my own sysadmining, but a combination of my general knowledge + an LLM should make that relatively easy.


Watch out for the energy usage. What's electric now, 27p/kWh?

I'm going to guess the key differentiator here is "major ISPs". I can see the page fine using a Zen Internet connection, but from my phone, which uses EE, it's blocked.

I can access it from both my mobile and fiber connections, different ISPs. I'm with smaller players so maybe that's it.

I think avoiding filling the context up with too much pattern information is partly where agent skills come from: the idea is that each skill has a set of triggers, and the main body of the skill is only loaded into context if one of those triggers is hit.

You could still overload things with too many skills, but it helps at least.
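As a sketch of that trigger idea (the name, description, and steps below are all made up for illustration, not from any real project), a skill is typically a short file whose frontmatter description acts as the trigger, with the body only pulled into context when it matches:

```markdown
---
name: deploy-checks
description: Use when the user asks about deploying or releasing this project.
---

# Deploy checks

Only the frontmatter above sits in context by default; these steps are
loaded once the description matches what the user is asking for.

1. Run the test suite.
2. Check for uncommitted changes.
3. Build and tag the release artifact.
```

So the per-skill cost in the default context is a couple of lines, not the whole procedure.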


Of course it depends on exactly what you're using Claude Code for, but if your use case involves cloning repos and then running Claude Code on them, I would definitely recommend isolating it (the same goes for other similar tools).

There's a load of ways that a repository owner can get an LLM agent to execute code on users' machines, so it's not a good plan to let them run on your main laptop/desktop.

Personally, my approach has been to put all my agents in a dedicated VM, and then provide them with a scratch test server with nothing on it when they need to do something that requires bare metal.


In what situations would it require bare metal?


In my case I was using Claude Code to build a PoC of a Firecracker-backed virtualization solution, so bare metal was needed for nested-virtualization support.


Firecracker can solve the kind of problems where you want more isolation than Docker provides, and it's pretty performant.

There's not a tonne of tooling for that use case right now, although it's not too hard to put together. I vibe-coded something that works for my use case fairly quickly (CC + Opus 4.5 seemed to understand what was needed).
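For anyone curious what "putting something together" involves, Firecracker is driven over a REST API on a Unix socket. A rough sketch, assuming `firecracker` and `curl` are installed and with placeholder kernel/rootfs paths (wrapped in a function so nothing actually boots unless you call it):

```shell
#!/bin/sh
# Sketch: boot a Firecracker microVM via its API socket.
# Paths under /srv are placeholders for your own kernel and rootfs images.
start_microvm() {
  api=/tmp/firecracker.sock
  rm -f "$api"
  firecracker --api-sock "$api" &

  # Point the VM at a kernel and a root filesystem image.
  curl --unix-socket "$api" -X PUT http://localhost/boot-source \
    -H 'Content-Type: application/json' \
    -d '{"kernel_image_path": "/srv/vmlinux",
         "boot_args": "console=ttyS0 reboot=k panic=1"}'
  curl --unix-socket "$api" -X PUT http://localhost/drives/rootfs \
    -H 'Content-Type: application/json' \
    -d '{"drive_id": "rootfs", "path_on_host": "/srv/rootfs.ext4",
         "is_root_device": true, "is_read_only": false}'

  # Boot it.
  curl --unix-socket "$api" -X PUT http://localhost/actions \
    -d '{"action_type": "InstanceStart"}'
}
```

Most of the work in a real tool is around this: building rootfs images, wiring up networking, and cleaning VMs up afterwards.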


Reading the article, it seemed to me that both the professor and the students were interested in the material being taught and therefore actively wanted to learn it, so using an LLM isn't the best tactic.

My feeling is that for many/most students, getting a great understanding of the course material isn't the primary goal; passing the course so they can get a good job is. For this group, using LLMs makes a lot of sense.

I know that when I was a student doing a course I wasn't particularly interested in (because my parents/school told me it was the right thing to do), if LLMs had been around, I absolutely would have used them :).


Yeah, they definitely can both be true (IME), as the quality of the output varies massively depending on how the LLMs are used.

For example, if you just ask an LLM in a browser with no tool use to "find a vulnerability in this program", it'll likely give you something, but it's very likely to be hallucinated or irrelevant.

However, if you use the same LLM model via an agent, and provide it with concrete guidance on how to test its success, plus the environment needed to prove that success, you are much more likely to get a good result.

It's like with Claude Code: if you don't provide a test environment, it will often make mistakes in the code and tell you all is well, but if you provide a testing loop it'll iterate until it actually works.
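That testing loop can be as simple as one check script the agent is told to run after every change. A sketch of the idea, where `run_tests` is a made-up stand-in for your real suite:

```shell
#!/bin/sh
# Sketch of an agent verification loop: one command that either passes
# or fails, so the agent has something concrete to iterate against.
run_tests() {
  # Stand-in for your real checks, e.g. `pytest -q` or `make check`.
  echo "all tests passed"
}

verify() {
  if run_tests; then
    echo "OK"
  else
    echo "FAIL - fix and re-run" >&2
    return 1
  fi
}
```

The point is less the script than the contract: a single pass/fail signal the agent can't easily talk its way around.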


What specifically are you concerned about when running an LLM agent in a container versus a VM?

Assuming a standard Docker/Podman container with just the project directory mounted inside it, what vectors are you expecting the LLM to use to break out?
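For what it's worth, if you do stay with containers, the defaults can be tightened a fair bit. A sketch of a more locked-down invocation (image name and resource limits are just examples; the flags are standard `docker run` options), wrapped in a function so nothing runs here:

```shell
#!/bin/sh
# Sketch: run an agent in a container with only the project directory
# mounted, all capabilities dropped, and some resource limits applied.
run_agent_container() {
  docker run --rm -it \
    --cap-drop ALL \
    --security-opt no-new-privileges \
    --pids-limit 512 \
    --memory 4g \
    -v "$PWD":/work -w /work \
    ubuntu:24.04 bash
}
```

None of this makes a container a VM, but it does shrink what a confused or malicious agent can reach.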


From “How it works” in the readme:

> yolobox uses container isolation (Docker or Podman) as its security boundary…

I have no issue with running agents in containers FWIW, just in framing it as a security feature.

> what vectors are you expecting the LLM to use to break out?

You can just search for “Docker CVE”.

Here's one from late last year, just as an example: https://nvd.nist.gov/vuln/detail/CVE-2025-9074


Everything has CVEs; you can find CVEs in VM hypervisors if you like (the one you linked is in Docker Desktop, not Docker Engine, which is what this project uses).

There are valid criticisms of Docker/Podman isolation but it's not a binary "secure/not secure" thing, and honestly in this use case I don't see a major difference, apart from it being easier for a user to weaken the isolation provided by the container engine.

Docker/Podman security is essentially Linux security, it just uses namespaces+cgroups+capabilities+apparmor/SELinux+seccomp filters. There's a larger attack surface for kernel vulns when compared to VM hypervisors, but I've not heard of an LLM trying to break out by 0-day'ing the Linux kernel as yet :)
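Those layers are easy to see from inside a container. A sketch of checking a couple of them (this needs a working Docker install, so it's wrapped in a function rather than run directly):

```shell
#!/bin/sh
# Sketch: show seccomp and capability restrictions inside a container.
show_isolation() {
  # "Seccomp: 2" means a seccomp filter (Docker's default profile) is active.
  docker run --rm alpine grep Seccomp /proc/self/status
  # CapEff is the effective capability bitmask the container actually holds.
  docker run --rm alpine grep CapEff /proc/self/status
}
```

Comparing the same output on the host versus in the container makes the reduced kernel surface fairly concrete.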


I'm not so much worried about a malicious agent, more a confused deputy, if that makes sense. The agent itself seems like a juicy RCE vector with a larger attack surface than an unpatched binary. And think of all the side channels for delivering your exploits: you don't need to bake them into an executable payload, well-crafted wording in a README would probably do.

Like you say, there's a larger attack surface for the kernel versus a hypervisor. If it's easy to do, why wouldn't you take advantage of the extra isolation of a VM?

It’s 2026 and microVMs are a thing. The DevX gap between VMs and containers is shrinking.


Or buying CDs packed with software from markets. In Glasgow you could get copies of loads of high-end software from traders in the Barras market.


Claude Code has a YOLO mode, and from what I've seen, a lot of heavy users use it.

Fundamentally, any security mechanism which relies on users reading and intelligently responding to approval prompts is doomed to fail over time, even if the prompts are well designed. Approval fatigue will kick in, and people will just start either clicking through without reading or preferring systems that let them disable the warnings (just as YOLO mode is a thing in Claude Code).


Yes, it basically does! My point was that I really doubt Anthropic will fail to make it clear to users that this is manipulating their computer.


Users are asking it to manipulate their computer for them, so I don't think that part's being lost.

