Hacker News | arnorhs's comments

The only cases where I've had Gemini step on my toes like that are when a) my instructions were unclear or missing something, or b) my assumptions/instructions about how/why something needed to be done were flawed.

Instruction following has improved a lot over the past few years, but let's not pretend these things are perfect, mate.

There's a certain instruction capacity, albeit quite a high one, at which point you'll find them skipping points and drifting. It doesn't have to be ambiguity in the instructions.


what ever will i do /s

the upside: this is a great opportunity for an early bedtime


Interesting, props for coming up with a good name.

But it's weird to me to call this a "ratchet" rather than just a custom lint rule, since it sounds exactly like one.

The hard-coded count also sounds like something I would find annoying to maintain in the long run, and it might be hard to get a feel for whether the needle is moving in the right direction - esp. when the count goes down in one place and up in another so the number stays the same. You end up in a situation where you're not entirely sure whether things are getting better or worse.

A different approach to that is to have your ratchet/lint-script that detects these "bad functions" write the file location and/or count to a "ratchets" file and keep that file in version control.

In CI, if the ratchet file has changes, you can't merge because the tree is dirty; you'd have to run the script yourself, commit the result locally, and have the codeowner of the ratchet file approve it.

at least that would be a slightly nicer approach than maintaining some hard-coded opaque count.
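A rough sketch of that ratchet script, kept deliberately self-contained (the `legacyFetch` pattern, the `demo/` layout, and the file names are all made up for illustration):

```shell
#!/usr/bin/env bash
# Toy setup so the sketch runs on its own: a source tree with two
# "bad function" call sites and a checked-in baseline of 3.
mkdir -p demo/src
printf 'legacyFetch(1)\nlegacyFetch(2)\n' > demo/src/a.ts
echo 3 > demo/ratchet.txt

# Count the current violations across the tree.
count=$(grep -r "legacyFetch(" demo/src | wc -l | tr -d ' ')
baseline=$(cat demo/ratchet.txt)

# Fail if the count went up relative to the committed baseline...
if [ "$count" -gt "$baseline" ]; then
  echo "ratchet violated: $count violations (baseline $baseline)" >&2
  exit 1
fi

# ...and tighten the ratchet when the count goes down.
echo "$count" > demo/ratchet.txt
echo "ratchet is now $count"
```

In CI you'd additionally fail the build if `demo/ratchet.txt` has uncommitted changes after the run, which forces the update to happen (and be reviewed) locally.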


yeah that’s the way we do it at Notion. it’s important to store the allowed violation count in a file type that makes merges easy; we use TSV rather than JSON because dealing with commas and delimiters during merge conflict is super annoying and confusing.
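For illustration (the paths and rule names here are invented), one violation record per line keeps merge conflicts line-based and trivially resolvable:

```
src/editor/legacy.ts	no-any-cast	12
src/sync/engine.ts	no-floating-promise	3
```

With JSON you'd instead be juggling braces, commas, and trailing-delimiter rules in the middle of a conflict.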

right now we have one huge ratchet.json.tsv file with all violations, but it’s getting pretty ungainly now that it’s >1 MB.


interesting, so you guys call it a ratchet file? i thought it was something that OP came up with


perhaps we both picked that word independently

Presumably this would work to prevent infections, but not to eradicate them once they have infected the subject.


I mostly have my scripts in package.json "scripts" section - but sometimes the scripts invoked will actually be .ts files, sometimes just bash if that makes more sense.

Though I generally run these scripts using bun (and the corresponding `$` in bun) - basically the same thing, but I just prefer bun over deno
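As a sketch of that setup (the script names and file paths are hypothetical), a package.json "scripts" section can mix direct CLI calls, .ts entrypoints, and plain bash:

```json
{
  "scripts": {
    "dev": "vite",
    "db:migrate": "bun run scripts/migrate.ts",
    "deploy": "bash scripts/deploy.sh"
  }
}
```

Each one is then invoked the same way, e.g. `bun run db:migrate`, regardless of what's behind it.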


Yes I agree, and the replies don't really make it any more clear.

The biggest differentiator of design thinking is really addressing the XY problem. In 95% of cases, clients come to you to design their solution, i.e. they already think they have a solution to their problem and now they want it to look good.

Design thinking is basically more like root cause analysis, or the 5 whys, with an emphasis on talking to end users (the people with the problem) without having a solution in mind.

Only once you understand the problem more fundamentally do you start cooking up a solution.

And the result of that process might not even be a traditional design, but perhaps just a tweak to something, like moving your onboarding to later in the ca process..

In practice, however, 95% of designers who say they practice design thinking disregard this and just design whatever the client asks for.


I agree with you, however your approach results in much longer LLM development runs, increased token usage and a whole lot of repetitive iterations.


I’m definitely interested in techniques for reducing token usage. But with one session per problem, I’ve never hit a context limit, especially when the problem is small and clearly defined through divide-and-conquer. Also, agentic models are improving at tool use and should require fewer tokens. I’ll take as many iterations as needed to ensure the code is correct.


+/-

> They are bad at deciding requirements by themselves.

What do you mean by requirements here? In my experience the frontier models today are pretty good at figuring out requirements, even when you don't explicitly state them.

> They are bad at original research

Sure, I don't have any experience with that, so I'll trust you on that.

> for example developing a new algorithm.

This is just not correct. I used to think so too, but a couple of months ago I was trying to come up with a pretty complicated pattern-matching, multi-dimensional algorithm (I can't go into the details). It was something I could figure out on my own, and I was halfway through it, but I decided to write up a description of it and feed it to Gemini 2.5 Pro, and I was stunned.

It came up with a really clever approach - something I had previously been convinced the models weren't very good at.

In hindsight, since they are getting so good at math in general, there's probably some overlap, but you should revisit your views on this.

--

Your 'bad at' list is missing a few things though:

- Calculations (they can come up with how to calculate or write a program to calculate from given data, but they are not good at calculating in their responses)

- Even though the frontier models are multi-modal, they are still bad at visualizing html/css - or interpreting what it would look like

- Same goes for visualizing/figuring out visual errors in graphics programming such as games programming or 3d modeling (z-index issues, orientation etc)


> I was trying to come up with a pretty complicated pattern matching, multi-dimensional algorithm (I can't go into the details)

The downside is that if you used Gemini to create the algorithm, your company won't be able to patent it.

Or maybe that's a good thing, for the rest of us.


That's the ORM argument: "If I need to switch DBs..."

In practice, that is not really viable and should not be a motivating factor when choosing technologies.


I would guess it is.

Also:

> Exploitation of this vulnerability requires an attacker to first gain authenticated access to your Redis instance.

