fourside · 2026-05-21T14:48:02 1779374882

> it was born of some arrogance that they were speeding towards the inevitability of AGI

I think it was also partly PR. Google, OpenAI and Anthropic are fighting for mindshare, and DALL-E, Sora, Nano Banana, etc. generated a lot of media buzz for Google and OpenAI at various points in time.

fourside · 2026-05-18T14:00:11 1779112811

He was on stage and had a mic. I don’t know that the students had a lot of options to make their voices heard in the situation. And since folks like Schmidt already have access to channels to spread their opinions and this was the students’ graduation I think they get a pass.

ryandrake · 2026-05-18T17:38:54 1779125934

This is exactly the meme: “I am being silenced! says the man with the microphone, book deal, and celebrity-level media access.”

fourside · 2026-05-08T15:49:33 1778255373

Sounds like the parent comment probably doesn’t know much about how the auto industry works and should refrain from commenting.

fourside · 2026-05-01T15:44:51 1777650291

This is a very loaded question. I’m not sure what type of answer you expect to get.

fourside · 2026-04-28T20:03:08 1777406588

People who reach outlier-level success in a field tend to have strong opinions and an emotional connection to said field. It’s probably a non-trivial part of why they are so successful.

fourside · 2026-04-24T16:51:52 1777049512

Maybe for folks who are deep into this, but it’s not exactly accessible. I tried reading up on it a couple of months ago, but between parsing through what hardware I needed, the model and how to configure it (model size vs quantization), and how I’d get access to the hardware (which for decent results in coding, new hardware runs $4k-$10k last I checked), it had a non-trivial barrier to entry. I was trying to do this over a long weekend and ran out of time. I’ll have to look into it again because having the local option would be great.

Edit: the replies to my comment are great examples of what I’m talking about when I say it’s hard to determine what hardware I’d need :).

jonaustin · 2026-04-24T18:09:32 1777054172

Just get a decent MacBook, use LM Studio or omlx and the latest Qwen model you can fit in unified RAM.

Hooking up Claude Code to it is trivial with omlx.

https://github.com/jundot/omlx

imetatroll · 2026-04-24T21:35:02 1777066502

For me the big hangup is the hardware. If I could find a simple guide to putting together a machine that I can run off an outlet in my home, I am sold. The problem is that I haven't found this yet (though I suppose I haven't looked very hard either).

root_axis · 2026-04-24T17:18:11 1777051091

> new hardware runs $4k-$10k last I checked

Starting closer to $40k if you want something that's practical. $10k can't run anything worthwhile for SDLC at useful speeds.

zozbot234 · 2026-04-24T17:29:17 1777051757

$10K should be enough to pay for a 512GB RAM machine, which, in combination with partial SSD offload for the remaining memory requirements, should be able to run SOTA models like DS4-Pro or Kimi 2.6 at workable speed. It depends on whether MoE weights have enough locality over time that the SSD offload part is ultimately a minor factor.

(If you are willing to let the machine work mostly overnight/unattended, with only incidental and sporadic human intervention, you could even decrease that memory requirement a bit.)

SwellJoe · 2026-04-24T17:59:49 1777053589

You can't put "SSD offload" and "workable speed" in the same sentence.

zozbot234 · 2026-04-24T18:49:11 1777056551

As a typical example, DeepSeek v4-pro has 59B active params at mostly FP4 size, so it needs to "find" around 30GB worth of params in RAM per inferred token. On a 512GB total RAM machine, most of those params will actually be cached in RAM (model size on disk is around 862GB), so assuming for the sake of argument that MoE expert selection is completely random and unpredictable, around 15GB in total have to be fetched from storage per token. If MoE selection is not completely random and there's enough locality, that figure actually improves quite a bit and inference becomes quite workable.
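
For anyone who wants to check the arithmetic above, here is a rough back-of-the-envelope sketch. It assumes expert selection is uniformly random, so the share of active parameters missing from RAM equals the share of the model that does not fit in cache. The function names, the 0.5 bytes per parameter for FP4, the 430 GB "usable cache" case, and the 7 GB/s SSD read rate are all illustrative assumptions, not figures from the comment or measurements of any real system.

```python
def expected_fetch_gb(active_gb: float, model_gb: float, cached_gb: float) -> float:
    """Expected GB read from storage per token.

    Assumes uniformly random expert selection, so the miss rate is simply
    the fraction of the model that is not held in RAM.
    """
    miss_fraction = max(0.0, 1.0 - cached_gb / model_gb)
    return active_gb * miss_fraction


def tokens_per_second(fetch_gb: float, ssd_gb_per_s: float) -> float:
    """Upper bound on tokens/s if storage reads are the only bottleneck."""
    if fetch_gb <= 0:
        return float("inf")
    return ssd_gb_per_s / fetch_gb


# Figures taken from the comment: 59B active params, 862 GB on disk.
ACTIVE_PARAMS = 59e9
BYTES_PER_PARAM = 0.5            # assumption: FP4 is roughly half a byte per param
MODEL_GB = 862.0

# Assumption: hypothetical sequential read rate for a fast NVMe drive.
SSD_GB_PER_S = 7.0

active_gb = ACTIVE_PARAMS * BYTES_PER_PARAM / 1e9

# 512 GB: all RAM used as cache. 430 GB: hypothetical case where the OS,
# KV cache and activations take up part of the RAM.
for cached_gb in (512.0, 430.0):
    fetch = expected_fetch_gb(active_gb, MODEL_GB, cached_gb)
    rate = tokens_per_second(fetch, SSD_GB_PER_S)
    print(f"cache={cached_gb:.0f} GB  fetch={fetch:.1f} GB/token  "
          f"storage-bound rate={rate:.2f} tok/s")
```

With all 512 GB used as cache, the estimate comes out near 12 GB per token; a figure of around 15 GB would correspond to roughly 430 GB of usable cache. Under these worst-case assumptions a single 7 GB/s drive stays well under one token per second, which is why the amount of locality in expert selection matters so much to the argument.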

SwellJoe · 2026-04-25T01:21:17 1777080077

I've never seen reports of this kind of setup being able to deliver more than low single-digit tokens per second. That's certainly not usable interactively, and only of limited utility for "leave it to think overnight" tasks. Am I missing something?

Also, I don't know of a general solution to streaming models from disk. Is there an inference engine that has this built-in in a way that is generally applicable for any model? I know (I mean, I've seen people say it, I haven't tried it) you can use swap memory with CPU offloading in llama.cpp, and I can imagine that would probably work...but definitely slowly. I don't know if it automatically handles putting the most important routing layers on the GPU before offloading other stuff to system RAM/swap, though. I know system RAM would, over time, come to hold the hottest selection of layers most of the time as that's how swap works. Some people seem to be manually splitting up the layers and distributing them across GPU and system RAM.

Have you actually done this? On what hardware? With what inference engine?

fourside · 2026-04-22T13:56:22 1776866182

Of the big three, Gemini gives me the worst responses for the type of tasks I give it. I haven’t really tried it for agentic coding, but the LLM itself often gives long, meandering answers and adds weird little bits of editorializing that are unnecessary at best and misleading at worst.

hyperbovine · 2026-04-22T14:39:07 1776868747

Same. The tone is really off. Here is a response I just got from Gemini 3.1: "Your simulation results are incredibly insightful, and they actually touch on one of the most notoriously difficult aspects of ..." It's pure bullshit, my simulation results are in fact broken, GPT spotted it immediately.

fourside · 2026-04-17T15:10:20 1776438620

Except there are several obvious ways in which LLMs are not indistinguishable from humans.

fourside · 2026-04-07T13:59:29 1775570369

I get that credit cards are a barrier to entry, but I’m more willing to give providers a break now that AI agents make it much easier to abuse free tiers. It’s also harder for smaller companies to offer free tiers. If we want a more diverse set of service providers, we as customers need to be willing to accept some trade-offs.

fourside · 2026-04-07T10:24:08 1775557448

The irony in your comment is that you accuse the OP of interpreting the world based on his own warped view of it rather than what’s actually in front of him, yet you’re doing precisely that. The OP did not call Altman racist and made a point to draw the distinction. He also claims his is not the only example of this and is effectively encouraging an investigative journalist and the rest of HN to look into it and verify for ourselves.

Some degree of skepticism is healthy here. An online comment is not definitive proof, and it’s all too easy to pile on accusations in a comment thread that’s already critical of someone. But the way you readily armchair psychoanalyze and dismiss the OP tells me you’re not engaging in an honest way.