More

dizzy3gg · 2025-11-25T18:53:28 1764096808

Why is the being downvoted?

jermaustin1 · 2025-11-25T18:56:30 1764096990

Because the article shows it isn't Gemini that is the issue, it is the tool calling. When Gemini can't get to a file (because it is blocked by .gitignore), it then uses cat to read the contents.

I've watched this with GPT-OSS as well. If the tool blocks something, it will try other ways until it gets it.

The LLM "hacks" you.

lazide · 2025-11-25T19:33:16 1764099196

And… that isn’t the LLM’s fault/responsibility?

ceejayoz · 2025-11-25T19:35:21 1764099321

As the apocryphal IBM quote goes:

"A computer can never be held accountable; therefore, a computer must never make a management decision."

jermaustin1 · 2025-11-25T23:15:01 1764112501

How can an LLM be at fault for something? It is a text prediction engine. WE are giving them access to tools.

Do we blame the saw for cutting off our finger? Do we blame the gun for shooting ourselves in the foot? Do we blame the tiger for attacking the magician?

The answer to all of those things is: no. We don't blame the thing doing what it is meant to be doing no matter what we put in front of it.

lazide · 2025-11-25T23:31:54 1764113514

It was not meant to give access like this. That is the point.

If a gun randomly goes off and shoots someone without someone pulling the trigger, or a saw starts up when it’s not supposed to, or a car’s brakes fail because they were made wrong - companies do get sued all the time.

Because those things are defective.

jermaustin1 · 2025-11-26T12:45:01 1764161101

But the LLM can't execute code. It just predicts the next token.

The LLM is not doing anything. We are placing a program in front of it that interprets the output and executes it. It isn't the LLM, but the IDE/tool/etc.

So again, replace Gemini with any Tool-calling LLM, and they will all do the same.

lazide · 2025-11-26T12:51:38 1764161498

When people say ‘agentic’ they mean piping that token to various degrees of directly into an execution engine. Which is what is going on here.

And people are selling that as a product.

If what you are describing was true, sure - but it isn’t. The tokens the LLM is outputting is doing things - just like the ML models driving Waymo’s are moving servos and controls, and doing things.

It’s a distinction without a difference if it’s called through an IDE or not - especially when the IDE is from the same company.

That causes effects which cause liability if those things cause damage.

NitpickLawyer · 2025-11-25T18:58:57 1764097137

Because it misses the point. The problem is not the model being in a cloud. The problem is that as soon as "untrusted inputs" (i.e. web content) touch your LLM context, you are vulnerable to data exfil. Running the model locally has nothing to do with avoiding this. Nor does "running code in a sandbox", as long as that sandbox can hit http / dns / whatever.

The main problem is that LLMs share both "control" and "data" channels, and you can't (so far) disambiguate between the two. There are mitigations, but nothing is 100% safe.

mkagenius · 2025-11-25T19:05:07 1764097507

Sorry, I didn't elaborate. But "completely local" meant not doing any network calls unless specifically approved. When llm calls are completely local you just need to monitor a few explicit network calls to be sure.

pmontra · 2025-11-25T20:35:11 1764102911

In a realistic and useful scenario, how would you approve or deny network calls made by a LLM?

zahlman · 2025-11-26T02:59:47 1764125987

The LLM cannot actually make the network call. It outputs text that another system interprets as a network call request, which then makes the request and sends that text back to the LLM, possibly with multiple iterations of feedback.

You would have to design the other system to require approval when it sees a request. But this of course still relies on the human to understand those requests. And will presumably become tedious and susceptible to consent fatigue.

pmontra · 2025-11-26T07:37:31 1764142651

Exactly.

dizzy3gg · on Nov 14, 2024

This is the reason I use AWS wrappers (render.com/fly.io) for small projects. It may be more expensive but you can't pop the free tier/selected machine.

dizzy3gg · on Dec 5, 2023

This a play on the classic Dropbox/rsync comment?

dna_polymerase · on Dec 6, 2023

Exactly!

dizzy3gg · on Dec 2, 2023

Really? I am trying to put my customer/mum hat on and but is this really true. How many OS would you need to really support? 4 or 5? On top of that docs/knowledge would be more standardised. Google/peers/family would help people more.

demosthanos · on Dec 2, 2023

> Google/peers/family would help people more.

Tell me you've never worked in tech support without telling me you've never worked in tech support.

My experience in a tech support center for a software company is that, for the kind of person who calls in to customer support, them having made a Google search first is not something that should ever be assumed. And usually whole offices were chronic customer support users or none of them were—peer support, when present, is already sufficient, and in the offices where everyone is clueless having a system-native color picker isn't going to fix it.

> On top of that docs/knowledge would be more standardised.

If everyone started using the native widgets at once, then maybe external docs would be more helpful, but until that happens your software-specific documentation becomes much harder.

How do you take screenshots of the color picking flow for your documentation? If you just pick a browser to screenshot then you will get calls from people using a different browser who are confused that it looks different for them. If you screenshot every supported browser then your documentation becomes much more expensive to create and maintain.

gumby · on Dec 2, 2023

I hear the know it all customer is the worst.

I once had a problem with my laptop, which was a problem with the drive. I pulled it out and duplicated the problem on a different laptop, so I needed to get a replacement. I kept mum and went through all the steps I was instructed to (reboot with this or that key held down, etc) until finally support said “well sorry, we’ll have send you a box for you to send back the laptop”. It would have been useless and annoying to the person on the other end of the phone to dry to skip all that. Like doctors, they must deal with a lot of people who studied at the university of Google and think they know it all.

I have a few times sent in bug reports on software I had previously worked on myself. Again, just file it like any other bug. Usually the bug just gets fixed (or not) but I did once get mail from a former colleague who said he was assigned my bug and how the hell was I? Sadly he also told me, “we aren’t going to fix it.” :-/

Of course most of the time I don't know any more than the next schlub. Otherwise I wouldn't have called.

freedomben · on Dec 3, 2023

I'm at a point where I just treat the front-line CS person like a fellow engineer and tell them exactly what's wrong, and why I know that.

I've actually had pretty good success with this strategy, though it really depends on the company. Framework laptops and system76 for example were both phenomenal with this approach. The first reply I got from them was either an engineer or someone who talked to one, or someone very experienced in CS who would be a good candidate for engineering.

Worst case if the CS person has no idea what I'm talking about, then we start from scratch but at least they know I'm not a dumbass they can BS :-)

gumby · on Dec 3, 2023

Most of them have to walk through a decision tree in response to their computer and don’t know about the domain anyway, so I don’t want to waste their time.

myko · on Dec 3, 2023

In my experience this is, unfortunately, true. I see it from both sides and would prefer the native implementations myself, but I've never worked with a customer who agrees.

novok · on Dec 3, 2023

I start by saying, your customer who uses an iPhone is never going to use an Android, and vice versa, so there is no need to keep them consistent and identical looks and design wise. You should use the native items as much as possible because a random user is more likely to understand the common system version than they will understand your bespoke version. Use the native share icon, don't use the iOS share icon on Android, etc.

Also iOS tends to be way more consistent than android, windows, etc, so there could be a case for iOS native and 'company consistent' for everything else, especially if you're in the USA. iOS people pay more and it could be worth it to have two branches for customer support if it leads to total better conversions and thus more profits. Your business's core competency is not making UI toolkits, it's selling whatever your making. Leverage the literal billions of dollars apple and google invest into the core UX toolkits.

dizzy3gg · on May 26, 2023

That’s a third party.

BoppreH · on May 26, 2023

The parent was complaining about needing "either [...] a mobile phone [...] or some custom device", which is not true. And sure, Authy is a third-party; but it's not the only option, and you can implement your own (TOTP is not that complicated).

And TOTP has much better user experience than raw keys, especially for beginners who might mix the public/private parts, and experts who want hardware protection.

badsectoracula · on May 26, 2023

Actually my main concern is the reliance on 3rd parties - requiring a mobile phone is an implicit reliance on a lot of 3rd parties that IMO should not have any business where/how i authenticate myself.

I don't know about TOTP but if it can be completely independent from 3rd parties and can be used locally like private+public key signatures can then i guess it is fine.

eesmith · on May 26, 2023

"TOTP for 2FA is incredibly easy to implement. So what's your excuse?" shows how to do it in Python. https://drewdevault.com/2022/10/18/TOTP-is-easy.html .

Though Python would be a 3rd party dependency. ;)

HN comments about that article at https://news.ycombinator.com/item?id=33245042 . Including some of the problems people have had with 2FA usability.

dizzy3gg · on Nov 24, 2021

I think react native changed the hybrid landscape and they jumped onboard. As a dev who works with RN a lot I find SwiftUI to be a pleasure. Just a lot of legacy to deal with

dizzy3gg · on Oct 27, 2021

Have you got any sources of this?

emilecantin · on Oct 27, 2021

https://www.wired.com/2013/03/att-hacker-gets-3-years/

dizzy3gg · on Oct 23, 2021

I don’t think it actually does though, I think the defaults give you a hint to run npm i

skohan · on Oct 23, 2021

How does it know which packages to install for a given un-typed package?

noahtallen · on Oct 23, 2021

Pretty much all community types come from a single repository with a consistent naming scheme. For example, if you use “lodash”, types are available via “@types/lodash” from DefinitelyTyped (the DT repo is now maintained by the typescript team). This is even tracked on npm. VS code can just see if “@types/$package” exists and then prompt you to add it

dizzy3gg · on Oct 3, 2021

End user of a linker?

rewma · on Oct 3, 2021

> End user of a linker?

I'd argue that if you're not packaging your software or testing your software's dependencies, either you're doing something extremely exotic that lies far outside anyone's happy path or "dylib error" should not even be a keyword in your vocabulary.

pjc50 · on Oct 3, 2021

Roughly the same thing happens on Windows and is called "DLL hell", occasionally witnessed by end users.

rewma · on Oct 3, 2021

DLL Hell ceased to be a practical concern over a decade ago, particularly given that Windows provides tight control over its dynamic linking search order.

https://docs.microsoft.com/en-us/windows/win32/dlls/dynamic-...

DLL Hell is not a linking problem, it's a packaging problem.

dizzy3gg · on June 17, 2021

Because when it's not a one off?