Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

To a certain extent it has already - models are already very good at picking tools to use: ask for a video transformation and it uses ffmpeg, ask it to edit an Excel sheet and it uses Python with openpyxl, etc.

My post is more about how sometimes you still need to make environment design decisions yourself. My favorite example is the Fly.io one, where I created a brand new Fly organization with a $5 spending limit and issue an API token that could create resources in that organization purely so the coding agent could try experiments to optimize cold start times without messing with my production Fly environment.

An agent might be able to suggest that pattern itself, but it would need a root Fly credential in order to create itself the organization and restricted credentials and given how unsafe agents with root credentials are I'd rather keep that step to myself!



It's amusing to think that the endgame is that the humans in the loop are parents with credit cards.

I suppose you could never be sure that an agent would explicitly follow your instruction "Don't spend more than $5".

But maybe one could build a tool that provides payment credentials, and you get to move further up the chain. E.g., what if an MCP tool could spin up virtual credit cards with spending caps, and then the agent could create accounts and provide payment details that it received from the tool?




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: