people like to say this like they’re apples to apples but this comparison isn’t remotely how the brain actually works - and even if it did, the brain does it automatically without direction and at an infitesimal percentage of the power required.
And we’re just talking about cognition - it completely ignores the automatic processes such as maintaining and regulating the body and it’s hormones, coordinating and maintaining muscles, visual/spacial processing taking in massive amounts of data at a very fine scale, and informing the body what to do with it - could go on.
One of the more annoying things about this conversation is you don’t even need to make this argument to make the point you’re trying to make, but people love doing it anyway. It needlessly reduces how amazing the human brain is to a bunch of catchy sci fi sounding idioms.
It can be simultaneously true that transformer based language models can be very smart and that the human brain is also very smart. It genuinely confuses me why people need to make it an either/or.
Thank you, this comparison has been a huge annoyance of mine for the past 3 years of... this same debate over and over.
I think it's the hubris that I find most offensive in this argument: a guy knows one complex thing (programming) and suddenly thinks he can make claims about neuroscience.
I’ve seen you/anthropic comment repeatedly over the last several months about the “thinking” in similar ways -
“most users dont look at it” (how do you know this?)
“our product team felt it was too visually noisy”
etc etc. But every time something like this is stated, your power users (people here for the most part) state that this is dead wrong. I know you are repeating the corporate line here, but it’s bs.
They don’t want to officially disclose the reality because while some users will understand the realities of protecting a product while innovating, many will just realize it means one can go looking for claude 4.5 performance elsewhere.
building for the loud users on a forum is generally a losing move. if we built notion for angry HN users, we'd probably be a great obsidian competitor with end to end encryption, have zero ai features, and make zero money.
Anecdotally the “power users” of AI are the ones who have succumbed to AI psychosis and write blog posts about orchestrating 30 agents to review PRs when one would’ve done just fine.
The actual power users have an API contract and don’t give a shit about whatever subscription shenanigans Claude Max is pulling today
Generalisations and angry language but I almost agree with the underlying message.
New tools, turbulent methods of execution. There's definitely something here in the way of how coding will be done in future but this is still bleeding edge and many people will get nicked.
Whatever makes you feel better about yourself, I guess. My account history on this topic is pretty easily searchable, but I guess it's easier to make driveby comments like this than be informed.
Eh, that’s not at all how I do it. I like to design the architecture and spec and let them implement the code. That is a fun skill to exercise. Sometimes I give a little more leeway in letting them decide how to implement, but that can go off the rails.
imho “tell them what you want and let them come up with a solution” is a really naive way to use these tools nearly guaranteed to end up with slopware.
the more up front design I’ve given thought to, they are usually very accurate in delivering to the point I dont need to spend very much time reviewing at all. and, this is a step I would have had to do anyway if doing it by hand, so it feels natural, and results in far more correct code more often than I could have on my own, and allows multitasking several projects at once, which would have been impossible before.
Modern "skills" and Markdown formats of the day are no different than "save the kittens". All of these practices are promoted by influencers and adopted based on wishful thinking and anecdata.
Uh, this couldn't be more false. I've implemented these from scratch at my company and rolled them out org-wide and I've yet to watch a youtube video and don't consume any influencers. Mostly by just using the tools and reading documentation - as any other technical tool.
Perhaps your blanket statement could be wrong, and I would encourage you to let your mind be a bit more open. The landscape here is not what it was 6 months ago. This is an undeniable fact that people are going to have to come to terms with pretty soon. I did not want to be in this spot, I was forced to out of necessity, because the stuff does work.
To be fair, if you have never watched a YouTube video in your life then how can you say the OP was wrong about what influencers are peddling? Side note, have you ever seen that Onion article on the man that can't stop telling people he doesn't own a TV?
Great, so how do you know this stuff works? Did you evaluate it against other approaches? How do you know it's actually reliable?
The Vercel team had some interesting findings[1]:
> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it.
Others had different findings for commonly accepted practices[2], some you may have adopted from reading documentation, which surely didn't come from influencers.
And yet others swear by magical Markdown documents[3].
So... who is the ultimate authority on what actually works, and who is just cargo culting the trendy practice of the week? And how is any of this different from what was being done a few years ago?
Sorry, but from your first comment, I don’t particularly feel inclined to help you figure this out. I was just offering I’ve already deployed these things at a scale with success using many of the configuration options offered as documentation in the op here. this stuff isn’t some mystical blackbox, although you seem to think it is.
I measure the tooling success with a suite of small prompt tests performing repeatable tasks, measuring the success rate over time, educating the broader team, and providing my own tried and tested in the field skills that I’ve shared to similar successes to the broader teams. We’ve seen a huge increase in velocity and lower bug rate, which are also very easily measurable (and long evaluated stats) enough to put me in the position I am, which was not a reluctant one. You’re perfectly free to view my long history on this topic on this forum to see I am a complete skeptic on this topic, and wouldn’t be here unless I had to.
everyone is figuring this out still. There is no authority, I am my own authority on what I have seen work and what hasn’t. Feel free to take of that what you will. I just wanted to provide a counterpoint to your initial claim. I’m certainly not going to expose to a fine degree what has worked for my org and what hasn’t due to obvious reasons.
what? non techies are most at risk. There are a huge number of malicious skills. Not knowing or caring how to spot malicious behavior doesn’t mean someone shouldn’t be concerned about it, no matter how much they can’t or don’t want to do it.
I am an adminstrator of this stuff at my company and it’s an absolute effing nightmare devising policies that protect people from themselves. If I heard this come out of someone’s mouth underneath me I’d tell them to leave the room before I have a stroke.
And this is stuff like, if so and so’s machine is compromised, it could cost the company massive sums of money. for your personal use, fine, but hearing this cavalier attitude like it doesn’t matter is horrifying, because it absolutely does in a lot of contexts.
I run a small local non-profit which is essentially security hardening guide with some helper tooling that simplifies some concepts for non-techies (FDE, MFA, password managers etc).
LLMs have completely killed my motivation to continue running it. None of standard practices apply anymore
reply