My take from working at a Big Corp is that individuals using coding agents can increase velocity substantially and produce good quality work, assuming they are proficient with the tools. But it falls apart quickly when you have a team trying to work together.
imo we either need to centralize the agent and submit plan, spec, and reference doc MRs rather than code changes, or develop SCM systems/workflows that incorporate plan/spec/reference/prompt metadata with the code so intent can be factored into merges.
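A rough sketch of the second option, assuming the intent metadata rides along as git-style trailers at the end of each commit message. The trailer keys (`Plan`, `Spec`, `Reference`, `Prompt`) and the helper names are made up for illustration; this is not an existing convention or tool:

```python
# Sketch only: pull hypothetical intent trailers out of a commit message so a
# merge tool (or a reviewer) can see which plan/spec a change was made under.
# The keys below are illustrative assumptions, not a standard.
INTENT_KEYS = {"plan", "spec", "reference", "prompt"}


def parse_intent_trailers(message: str) -> dict[str, list[str]]:
    """Return intent trailers found in the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if len(paragraphs) < 2:
        return {}  # a lone subject line has no trailer block
    found: dict[str, list[str]] = {}
    for line in paragraphs[-1].splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in INTENT_KEYS and value.strip():
            found.setdefault(key.strip().lower(), []).append(value.strip())
    return found


def same_intent(ours: str, theirs: str) -> bool:
    """True when two commit messages cite at least one common plan or spec."""
    a, b = parse_intent_trailers(ours), parse_intent_trailers(theirs)
    return any(set(a.get(k, [])) & set(b.get(k, [])) for k in ("plan", "spec"))
```

Something like `same_intent` could flag merges where two branches touch the same code under different plans, which is where I'd expect most of the team-level friction to show up.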
I do the same. Codex manages a per-project flake.nix and uses `nix develop` for all testing. nix-direnv for my own convenience. I generally have it generate Dockerfiles or other deployment assets at some point.
The distinction is scale. "AI Datacenters" are a new level of scale with new levels of power consumption and heat generation. Sure, you could run regular compute and w/e in them, but it's not practical to build these mega sites for regular compute. GPU compute / AI workloads require network/interconnect bandwidth and latencies where distance matters, so you're forced to solve problems you wouldn't otherwise have to. Those problems are mostly solved with money.
I'm building the same stuff I've always built. Just faster and with less dependence on others. Not having to argue with devs that have their own agendas has been my biggest benefit from coding agents.
> Not having to argue with devs that have their own agendas
Agendas like, "let's not check our API key into a public GitHub repo" or "let's not store passwords in plaintext" or "don't expose customer data via a public API"?
Did you install the network connector OrcaSlicer prompted you to download? It's a closed-source blob that runs on your PC, which I'm presuming you haven't air-gapped as well.
TBF this was the case prior to the firmware change. It wasn't a bait and switch. It just wasn't obvious to someone buying a printer they thought worked with open source slicers.
There are a few dimensions you can look at for GPU load. Probably the easiest indirect metric to watch for GPU load is power usage.
But if you really care about this, you should actually profile your application. Nsight Systems makes this pretty simple to do. Dunno how many actually care about having a TUI.
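For the power-usage route, a minimal sketch, assuming an NVIDIA GPU with `nvidia-smi` on the PATH. The function names are my own, and the ratio is only a coarse hint of load, not a utilization number:

```python
import subprocess

# Ask nvidia-smi for per-GPU power draw and power limit as bare CSV numbers.
QUERY = ["nvidia-smi", "--query-gpu=index,power.draw,power.limit",
         "--format=csv,noheader,nounits"]


def power_fractions(csv_text: str) -> dict[int, float]:
    """Map GPU index -> power draw as a fraction of that GPU's power limit."""
    fractions = {}
    for line in csv_text.strip().splitlines():
        try:
            index, draw, limit = (field.strip() for field in line.split(","))
            fractions[int(index)] = float(draw) / float(limit)
        except (ValueError, ZeroDivisionError):
            continue  # skip rows where power is reported as "[N/A]" or malformed
    return fractions


def read_power_fractions() -> dict[int, float]:
    """Run nvidia-smi once and parse its output (needs an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return power_fractions(out.stdout)
```

Polling `read_power_fractions()` every few seconds and watching for GPUs sitting far below their limit is usually enough to spot the obviously idle ones.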
Power is useful as a second-order metric and can help catch drastic underutilization, but it has similar problems to SM Active (DCGM) -- it tends to overestimate utilization and doesn't distinguish between useful compute and memory traffic. It's very possible to be in a memory-bound workload with high power draw while still underutilizing compute. Our goal was to separate these bottlenecks out so there's more visibility into where to optimize.
On nsys, agreed it's great, but we wanted something that could run continuously instead of an offline analysis tool. We think there's room for both to be useful.
I previously worked at a managed database-as-a-service company. On more than one occasion during my time there, a junior engineer deleted a customer's database, and at least once one of our most senior DBAs made it unrecoverable. Never got such straightforward confessions out of them.
Programmers mostly don't. Ordinary people see figuring out how to use the computer as a hindrance rather than as empowering; they want Star Trek. They want "computer, plan my next vacation to XYZ for me" to lay out a full itinerary and offer to buy the tickets and make the reservations.
Knowledge work is work most people don't really want to deal with. Ordinary people don't put much value into ideas regardless of their level of refinement.
I have been a programmer for 30 years and have loved every minute of it. I love figuring out how to get my computers to do what I want.
I also want Star Trek, though. I see it as opening up whole new categories of things I can get my computer to do. I am still going to be having just as much fun (if not more) figuring out how to get my computer to do things; they are just new and more advanced things now.
I was talking about this "plan a trip" example somewhere else, and I don't think we're prepared for the amount of scams and fleecing that will sit between "computer, make my trip so" and what it comes back with.
> They want "computer, plan my next vacation to XYZ for me" to lay out a full itinerary and offer to buy the tickets and make the reservations.
Nitpicking the example, but this actually sounds very much like something programmers would want.
Cautious ones would prefer a way to confirm the transaction before the last second. But IMO that goes for anyone, not just programmers.
Also I get the feeling the interest in "computers" is 50/50 for developers. There are the extreme ones who are crazy about vim, and the others who have only ever used Macs.
People want to do stuff, and they want to get it done fast and in a pretty straightforward manner. They don’t want to follow complicated steps (especially ones with conditionals) and they don’t want to relearn how to do it (because the vendor changed the interface).
So the only thing they want is a very simple interface (best if it’s a single button or a knob), and then for the expected result to happen. Whatever exists in the middle doesn’t matter as long as the job is done.
So an interface to the above may be a form with the start and end date, a location, and a plan button. Then all the activities are shown, and the user selects the one he wants and clicks a final Buy button. Then a confirmation message is displayed.
Anything other than that, or anything that obscures what is happening (ads, network errors, agents malfunctioning, …), is a hindrance and falls under the general “this product does not work”.
Ordinary people absolutely hate AI and AI products. There is a reason why all these LLM providers are absolutely failing at capturing consumers. They would rather force both federal and state governments to regulate them into being the only players in town, then force said governments to buy long-term, lucrative contracts.
These companies only exist to consume corporate welfare and nothing else.
Everyone hates this garbage, it's across the political spectrum. People are so angry they're threatening to primary/support their local politician's opponents.
giving these things control over your actual computer is a nightmare waiting to happen – i think it's irresponsible to encourage it. there ought to be a good, real sandbox sitting between this thing and your data.
Hard agree. I'm on vacation in Mexico atm and when I get back I get to repair my OS because I gave Codex full control over my system before I left. Was rushing trying to reorganize my project files to get them up to GitHub before I left. Instead it deleted my OS user profile and bonked my system.
Local models on different machines with multiple RTX Pro 6000 or multiple DGX Sparks or a 512GB RAM Mac Studio; the agents themselves run on that Pentium J NUC and just use exposed endpoints for local models. Forgejo for Git runs on another server. Therefore I don't really care if that NUC goes kaboom and can test everything quickly (OpenClaw, Hermes, Claude Code, Codex, OpenCode, Pi etc.). Or I can just use an OpenRouter API key and access 10-100x cheaper models than Opus.
I want it, yes. I already feel like I'm the one doing the dumb work for the AI: manually clicking windows and typing in a command here or there that it can't do.
I've also been getting increasingly annoyed with how tedious it is to do the same repetitive actions for simple tasks.
I don’t think clicking buttons on a Mac is a particularly scary barrier. It’s not any scarier than running an LLM in agent mode with a very large number of auto-approved programs and walking away for 15 minutes.
I did some work on an agent that was supposed to demonstrate a learning pipeline. I figured having it fix broken Linux servers with some contrived failures would make for a good example of it getting stuck, having to get some assistance to progress, and then having a better capability for handling that class of failure in the future.
I couldn't come up with a single failure mode the agent with a gpt5.x model behind it couldn't one-shot. I created socket overruns.. dangling file descriptors.. badly configured systemd units.. busted route tables.. "failed" volume mounts..
Had to start creating failures of internal services the models couldn't have been trained on, and it was still hard to come up with scenarios it couldn't one-shot.