Hacker News | bachmeier's comments

I think it's been clear from the beginning that the per-token price of usage was far below what it will be when firms have implemented their profit-maximizing price plans. "AI winter" will happen when these firms start maximizing profit. At that point it'll be too expensive for all but certain use cases to use the best technology for work.

We'll see AI chat replace Google, we'll see companies adopting AI in high-value areas, and we'll see local models like Gemma 4 get used heavily.

AI winter will see a disappearance of the clickbait headlines about everyone losing their jobs. Nobody making those claims is taking into account that pricing to this point has been far below the profit-maximizing level.


I'm happy I invested in setting up Codex CLI and getting it to work with ollama. For the toughest jobs I can use GitHub Copilot (free as an academic) or Gemini CLI. If we see the per-token price increase 5x or 10x as these companies shift their focus to revenue, local models will be the way to go, so long as stuff like Gemma 4 keeps getting released.

> We're fine, the trick is to remember to GET OFF THE INTERNET and remember that reality isn't the same as the Internet.

That works fine, except in the cases where the bad news reflects reality, or understates how bad the reality is. In that case it's like saying cancer isn't the problem, the problem is that you visited the doctor and listened as he told you bad news.


> That works fine, except in the cases where the bad news reflects reality

The issue is that the 24/7 Internet chatrooms/forums shift the "bad news" target on a daily basis. Sometimes it's war, other times it's a natural disaster, other times it's a horrific crime, etc. If you've only been seeing bad news since Covid, then it makes you (read: made me) think the world's in a terrible place. I stopped spending allll my time in the 24/7 chatroom, and when I say this IN the chatroom everyone thinks I'm completely unaware. I'm not. I just engage on other matters, like cheering on my buddies when they release something.


The world is (and the US is) a measurably more terrible place than only a few years ago, and a big part of the reason is that, whether or not they remain online, people are helplessly detached from events. Being blissfully ignorant is not substantively different in societal impact from being paralyzed by oversaturation with a mix of real information, misinformation, and disinformation, even if it is more enjoyable in the near term.

Shutting off the feeds (especially those that are becoming more extremely manipulated to produce ineffective rage, which is part of how the world is worse) may be an effective way to manage the near-term stress of the present combination of media and material conditions, but it doesn't do anything to actually address the material conditions. Heck, detachment and demobilization to reduce resistance to arbitrary exercise of power is a big part of what you are being manipulated for. It's not an accident that that works as stress relief; that's part of the design of the manipulation.


> The world is (and the US is) a measurably more terrible place than only a few years ago

I neither agree nor disagree (if that makes sense), but I certainly agree that the modern Internet has warped people's views on things. I hear it called a "screen detox" via my Spotify BetterHelp ads, and while I never used that service, I get what they mean.

Back during Digg 4.0 last year, one of the core users referred to it as "trying to have a conversation while attending a riot". It's a lot of third parties and faceless usernames chiming in, and if you don't answer all of them, the impression of the original intent of the conversation can get warped. Even how the conversation gets steered after the original comment is interesting to see.

I just think Covid made us all "get on the same wavelength", then someone(s) tainted that through things like heavy Reddit moderation. Like, we were all doing our own little things, then "everyone" is refreshing Johns Hopkins' dashboard, wondering if they have enough toilet paper because of the Suez Canal, or watching all of the protests/riots unfold in other states.

But what got lost: people stopped going out to things, saved/gambled their money on the next short squeeze, and stopped supporting local stuff. If anything, GET OFF THE INTERNET is my attempt at a manipulation/psyop/marketing campaign. And, locally, yeah, we're offline, openly talking about what we see on the different platforms since Reddit and Twitter are politically skewed, and sort of remembering a time before the pandemic.

I go to Magic: The Gathering events at my LGS now. It's pretty cool to meet the nerds in that "missing third space". We're still talking about tariffs and global conflicts. We're just doing it respectfully, and not trying to ruin the game at the same time, because not everyone agrees. I can even tell when someone is fresh off Arena because they play some of those insta-win meta combos. I just make tribal decks; I don't have time to study all that.


You can still read print media like WAPO, NYT, or WSJ. Stay away from opinion and editorial sections and you'll still be informed about what matters but not manipulated so much that it gives you anxiety.

Yeah, here's what Bezos wrote. I seriously doubt it ends with the opinion section:

> I’m writing to let you know about a change coming to our opinion pages...We are going to be writing every day in support and defense of two pillars: personal liberties and free markets. We’ll cover other topics too of course, but viewpoints opposing those pillars will be left to be published by others...

I'll leave it to others to make a decision on whether WAPO qualifies as a propaganda outfit.


While those listed papers may not be outright fabrications, they are very much manipulated by what their billionaire owners want you to know.

Part of the problem here is that you can only list a few papers that might tell you the truth at all, when in the past there were far more independent news organizations vying against each other. Now they need to check in with their shareholders first.


Late checking in on the conversation. I agree with you to a degree, but it's better than the rage bait online. Also, with a physical piece of paper you can reach the end; it doesn't scroll forever. I was subscribed to the paper WSJ for a while and that was my favorite part: I could reach the end.

Every WaPo reporter and editor doesn't check in with Bezos before a story goes to print. Yeah, the owners steer some stuff and kill some articles, but for the most part there's still very good reporting going on at the major US papers. It's a convenient fallacy to handwave away all established journalism because billionaire owners are chipping away around the edges.

>WaPo reporter and editor doesn't check in with Bezos before a story goes to print.

Reporters are at the bottom of the list; there's a pile of middle and upper management that does all this work for Bezos without his needing to keep an eye on it.

All it takes is one phone call from him saying they need to be careful around a topic and that's it. Funds dry up for investigations into that topic.

Now, I never said 'throw away' journalism, I said to ensure you understand the bias of the paper in question. Just because WaPo isn't reporting on Bezos doesn't mean there isn't anything to report on said guy.


So is there something I can take from that table if I have a 24 GB video card? I'm honestly not sure how to use those numbers.

I just tried with llama.cpp on an RTX 4090 (24GB), using unsloth's UD-Q4_K_XL GGUF quants. You can probably run them all: G4 31B runs at ~5 tok/s, G4 26B A4B runs at ~150 tok/s.

You can run Q3.5-35B-A3B at ~100 tok/s.

I tried G4 26B A4B as a drop-in replacement for Q3.5-35B-A3B for some custom agents, and G4 doesn't respect the prompt rules at all. (I added <|think|> in the system prompt as described, but have not spent time checking if the reasoning was effectively on.) I'll need to investigate further, but it doesn't seem promising.

I also tried G4 26B A4B with images in the webui, and it works quite well.

I have not yet tried the smaller models with audio.
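For anyone wondering why all of these fit on a 24 GB card, here's a back-of-envelope sketch. The figures are assumptions: ~4.5 bits/weight as a rough average for a UD-Q4_K_XL-style quant, and KV cache and activation overhead are ignored, so real usage will be somewhat higher.

```python
# Back-of-envelope check of whether quantized weights fit in 24 GB of VRAM.
# Assumption: ~4.5 bits/weight for a UD-Q4_K_XL-style quant; KV cache and
# activation overhead are ignored, so actual memory use will be higher.
def weight_footprint_gb(params_billion, bits_per_weight=4.5):
    """Approximate size of the quantized weights alone, in GB."""
    return params_billion * bits_per_weight / 8

for name, params in [("G4 31B (dense)", 31), ("G4 26B A4B (MoE)", 26)]:
    gb = weight_footprint_gb(params)
    verdict = "fits" if gb < 24 else "does not fit"
    print(f"{name}: ~{gb:.1f} GB of weights, {verdict} in 24 GB before KV cache")
```

On paper both come in under 24 GB, which matches "you can probably run them all"; a long context's KV cache can still push you over.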


> I'll need to investigate further but it doesn't seem promising.

That's what I meant by "waiting a few days for updates" in my other comment. At the Qwen 3.5 release, I remember a lot of complaints like "tool calling isn't working properly", etc.

That was fixed shortly after: there was some template-parsing work in llama.cpp, and unsloth pulled some models and re-uploaded better ones, improving something I can't quite remember, better quantization or something...

coder543 pointed out the same is happening regarding tool calling with gemma4: https://news.ycombinator.com/item?id=47619261


The model does call tools successfully, giving sensible parameters, but it seems not to pick the right ones in the right order.

I'll try again in a few days. It's great to be able to test it already, a few hours after the release. It's the bleeding edge, as I had to pull the latest from main. And with all the supply chain issues happening everywhere, bleeding edge is always riskier from a security point of view.

There is always also the possibility to fine-tune the model later to make sure it can complete the custom task correctly. But the code for doing LoRA on gemma4 is probably not yet available. The 50% extra speed seems really tempting.
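For reference, the core LoRA trick is simple even while gemma4 support is pending. A minimal numpy sketch with made-up toy dimensions (nothing here is gemma4's real layer shape): instead of updating the full weight matrix W, you train a low-rank delta B @ A.

```python
import numpy as np

# Minimal sketch of the LoRA idea (toy dimensions, not gemma4-specific):
# the pretrained weight W stays frozen; only the low-rank factors A and B
# are trained, and the effective weight is W + (alpha/r) * B @ A.
d_out, d_in, r = 64, 64, 4
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight
A = rng.standard_normal((r, d_in))      # trainable rank-r factor
B = np.zeros((d_out, r))                # trainable; zero-init so the delta starts at 0

alpha = 8.0
W_eff = W + (alpha / r) * (B @ A)       # effective weight in the forward pass

full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} (LoRA) vs {full_params} (full fine-tune)")
```

Only A and B get gradients, which is why LoRA fine-tuning fits on far smaller hardware than full fine-tuning.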


If you are running on a 4090 and getting 5 tok/s, then you exceeded your VRAM and are offloading to the CPU (or there is some other serious performance issue).

Thank you. I have the same card, and I noticed the same ~100 TPS when I ran Q3.5-35B-A3B. G4 26B A4B running at 150TPS is a 50% performance gain. That's pretty huge.
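A rough sanity check on those numbers, assuming decode is memory-bandwidth bound (each generated token has to stream the active weights from VRAM roughly once). The figures are assumptions: ~1000 GB/s of bandwidth for an RTX 4090 and ~4.5 bits/weight for a Q4-style quant.

```python
# Rough ceiling on decode speed when generation is memory-bandwidth bound.
# Assumed figures: ~1000 GB/s bandwidth (RTX 4090), ~4.5 bits/weight (Q4-style).
BANDWIDTH_GB_S = 1000.0

def decode_ceiling_tok_s(active_params_billion, bits_per_weight=4.5):
    """Upper bound on tokens/sec if the active weights stream once per token."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return BANDWIDTH_GB_S * 1e9 / bytes_per_token

dense = decode_ceiling_tok_s(31)  # dense model: all 31B weights active
moe = decode_ceiling_tok_s(4)     # MoE: only ~4B parameters active per token
print(f"dense 31B ceiling: ~{dense:.0f} tok/s, 4B-active MoE ceiling: ~{moe:.0f} tok/s")
```

The observed ~150 tok/s sits plausibly under the ~440 tok/s MoE ceiling, while 5 tok/s is an order of magnitude below the ~57 tok/s dense ceiling, which is consistent with CPU offloading.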

> Fairly straightforward but a ton of bitch work. The LLM blasted through it like it was nothing.

One might argue that this is a substitute for metaprogramming, not software developers.


It's interesting more people haven't talked about this. A lot of so-called agentic development is really just a very roundabout way to perform metaprogramming.

At my own firm, we generally have a rule that we do almost everything through metaprogramming.
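A toy example of the kind of thing meant here: generate a family of near-identical functions from a table at runtime, instead of hand-writing (or having an LLM write) each one. The names and conversion factors are made up for illustration.

```python
# Toy metaprogramming standing in for boilerplate: one loop generates a
# family of converter functions from a table, so adding a row to the table
# adds a function, with no per-function code to write or generate.
CONVERSIONS = {
    "km_to_miles": 0.621371,
    "kg_to_lbs": 2.20462,
    "l_to_gal": 0.264172,
}

def make_converter(factor):
    def convert(x):
        return x * factor
    return convert

for name, factor in CONVERSIONS.items():
    globals()[name] = make_converter(factor)

print(f"10 km is about {km_to_miles(10):.2f} miles")
```

The same shape of repetitive "bitch work" that an agent blasts through is often a table plus a loop away from not existing at all.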


> Using a coding agent seems quite low skill to me.

I agree if that's all you can do. Using a coding agent to complement a valuable domain-specific skill is gold.


That's why many technical business-facing people are super excited about AI (at the cost of developers).

I agree with your conclusion, but that's by design. The goal is not to tell people the truth (how would they even do that?). The goal is to give the answer that would have come from the training data if that question were asked. And the reality is that confirmation is part of life. You may even struggle to stay married if you don't learn to confirm your wife's perspectives.

> The goal is to give the answer that would have come from the training data if that question were asked.

Or, more cynically, the goal is to give you the answer that makes you use the product more. Fine-tuning is really diverging the model from what's in the training set and toward what users "prefer".


The loss function is based on predicting the response based on the training data, or based on subsequent RLHF. The goal is usually to make money. Not only does the training data contain a lot of "you're absolutely right" nonsense, but that goal tends to push more of it in the RLHF step.
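Concretely, the pretraining objective is just the negative log-probability of the token that actually came next. A minimal sketch with a made-up distribution (real models do this over a vocabulary of tens of thousands of tokens):

```python
import math

# Sketch of the next-token training objective: the loss is the negative
# log-probability the model assigned to the token that actually came next.
# The distribution below is made up for illustration.
probs = {"right": 0.70, "wrong": 0.20, "banana": 0.10}  # model's prediction
target = "right"                                        # what the data says

loss = -math.log(probs[target])
print(f"cross-entropy loss: {loss:.3f} nats")
# Training text full of agreeable replies rewards assigning high probability
# to agreeable continuations, and RLHF tends to push in the same direction.
```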

> You may even struggle to stay married if you don't learn to confirm your wife's perspectives.

I don't dispute that, but man, that is some shitty marriage. Even rather submissive guys are not happy in such a setup, not at all. Remember, it's supposed to be for life or divorce/breakup, nothing in between.

A lifelong situation like that... why don't folks do more due diligence on the most important aspect of long-term relationships, personality match? It's usually not rocket science: observe behavior in conflicts, don't desperately appease in situations where one is clearly not to blame. Masks fall off quickly in heated situations, when people are tired, and so on. It's not perfect, but it's pretty damn good and covers >95% of the scenarios.


Not everyone has a supportive family or the requisite childhood / life experiences to do “due diligence”.

All this, and yet, people are so angered by the term "stochastic parrot".

I use LLMs every day, I use Claude, Gemini, they're great. But they are very elaborate autocomplete engines. I'm not really shaking off that impression of them despite daily use.


It's weird. It's literally what they are. It's a gigantic mathematical function that takes input and assigns probabilities to tokens.

Maybe they can also be smart. I'm skeptical that the current LLM approach can lead to human-level intelligence, but I'm not ruling it out. If it did, then you'd have human-level intelligence in a very elaborate autocomplete. The two things aren't mutually exclusive.
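That "gigantic mathematical function" can be sketched in miniature: a function from context to a probability distribution over next tokens. The logits below are made up; a real model computes them from the input with billions of parameters.

```python
import math

# An LLM reduced to its skeleton: turn raw scores (logits) for candidate
# next tokens into a probability distribution, then pick one. The logits
# here are invented for illustration.
def next_token_probs(logits):
    """Softmax: convert raw scores into a probability distribution."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {t: math.exp(v - m) for t, v in logits.items()}
    z = sum(exps.values())
    return {t: e / z for t, e in exps.items()}

logits = {"dog": 2.0, "cat": 1.0, "the": 0.1}  # model output for some context
probs = next_token_probs(logits)
token = max(probs, key=probs.get)              # greedy "autocomplete" step
print(token, round(probs[token], 3))
```

Everything else, from sampling temperature to chat behavior, is layered on top of this one step applied repeatedly.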


People are hung up on what they “really” are. I think it matters more how they interact with the world. It doesn’t matter if they are really intelligent or not, if they act as if they are.

Totally agreed. Although the difference between sounding intelligent and being intelligent is proving to be a bit troublesome.

Yes, it is. But those distinctions are going to be a lot less relevant with robotics. It won’t matter if it’s impatient or just acting impatient. Feels slighted, or just acting like it feels slighted. Afraid, or just acting afraid. For better or for worse, we are modeling AI after ourselves.

I am hearing this term for the first time but I love it. It is novel and creates a picture. Exactly what Scott Adams says about labels used for persuasion. I usually say "highly trained autocomplete" in discussions at work, but I am going to say "stochastic parrot" from now on.

Oh, OK. You should google the term to see where it comes from. It's from someone who is essentially an anti-LLM activist, and it was meant as a slur. That's likely why people consider it one, due to its origins.

You can't make a "slur" against software. It isn't a person, it doesn't have feelings.

"stochastic parrot" describes what an LLM does, that it (like a parrot) generates coherent human language without understanding its meaning.

Being offended on behalf of software is weird.


> Being offended on behalf of software is weird.

Yes. I can really recommend this essay by PG on that:

https://paulgraham.com/identity.html


> You may even struggle to stay married if you don't learn to confirm your wife's perspectives

Nope. You picked the wrong wife if that is the situation you find yourself in. My partner and I accept each other's perspectives even if we disagree. I would never date a woman who can't accept that different opinions exist and that we both will sometimes be wrong.


>And the reality is that confirmation is part of life.

Sycophantic agreement certainly is, as is lying, manipulation, abuse, gaslighting.

Those aren't the good parts of life.

Those aren't the parts I want the machine to do to people on a mass scale.

>You may even struggle to stay married if you don't learn to confirm your wife's perspectives.

Sorry what?

The important part is validating the way someone feels, not "confirming perspectives".

A feeling or a perspective can be valid ("I see where you're coming from, and it's entirely reasonable to feel that way"), even when the conclusion is incorrect ("however, here are the facts: ___. You might think ___ because ____, and that's reasonable. Still, this is how it is.")

You're doing nobody a favor by affirming they are correct in believing things that are verifiably, factually false.

There's a word for that.

It's lying.

When you're deliberately lying to keep someone in a relationship, that's manipulation.

When you're lying to affirm someone's false views, distorting their perception of reality - particularly when they have doubts, and you are affirming a falsehood, with intent to control their behavior (e.g. make them stay in a relationship when they'd otherwise leave) -

... - that, my friend, is gaslighting.

This is exactly what the machine was doing to the colleague who asked "which of us is right, me or the colleague that disagrees with me".

It doesn't provide any useful information, it reaffirms a falsehood, it distorts someone's reality and destroys trust in others, it destroys relationships with others, and encourages addiction — because it maximizes "engagement".

I.e., prevents someone from leaving.

That's abuse.

That, too is a part of life.

>I agree with your conclusion, but that's by design

All I did was name the phenomena we're talking about (lying, gaslighting, manipulation, abuse).

Anyone can verify the correctness of the labeling in this context.

I agree with your assertion, as well as that of the parent comment. And putting them together we have this:

LLM chatbots today are abusive by design.

This shit needs to be regulated, that's all. FDA and CPSC should get involved.


That's probably true to some extent, but I'm not completely on board.

> 1. Anything that is in the world when you’re born is normal and ordinary and is just a natural part of the way the world works.

Television and calculators were in the world when I was born, but I never viewed them as "natural". TV always seemed to be a way to distract yourself from the world.

> 2. Anything that’s invented between when you’re fifteen and thirty-five is new and exciting and revolutionary and you can probably get a career in it.

I was happy to get on board with the WWW, the web browser, and widespread email usage. Those were revolutionary technologies with immense value. On the other hand, I'm still not on board with text messaging, phone scrolling, or social media. If I could, I'd eliminate social media from society.

> 3. Anything invented after you’re thirty-five is against the natural order of things.

I'm over 50 and a strong believer in the value of the LLM. It's a work tool that I can use at work and put away when I'm at home (or not, depending on my mood). It's new and exciting and revolutionary and a move in the right direction for humanity.


> Why is that argument always applied against Linux, and never against for instance macOS, which also can't run Windows software?

There's a certain type of technical user that gets joy from coming up with arguments, good, bad, or just pulled out of their butt, explaining why people can't use Linux. I'm not going to spend my day trying to understand people's unusual preferences.


Another day in the post-copyright world. Surely someone somewhere is already using this to test the effect of copyright laws, should we decide to go back to that world.

