zbendefy's comments | Hacker News

>The user writes the data to CPU mapped GPU memory first and then issues a copy command, which transforms the data to optimal compressed format.

Wouldn't this mean double GPU memory usage when uploading a potentially large image? (Even if just until the copy is finished.)

Vulkan lets the user copy from CPU (host_visible) memory to GPU (device_local) memory without an intermediate GPU buffer; AFAIK there is no double VRAM usage there, but I might be wrong on that.
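
For context, roughly the path I mean (a minimal sketch, not the article's code; the command buffer, staging buffer and image handles are assumed to have been created elsewhere, with the image already transitioned to TRANSFER_DST_OPTIMAL). The host_visible staging buffer can live in system RAM, so VRAM only ever holds the final optimally-tiled image:

    // Records a copy from a host-visible staging buffer into a
    // device-local, optimally tiled image.
    #include <vulkan/vulkan.h>

    void recordUpload(VkCommandBuffer cmd, VkBuffer staging,
                      VkImage image, uint32_t width, uint32_t height)
    {
        VkBufferImageCopy region{};
        region.bufferOffset = 0;
        region.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
        region.imageSubresource.mipLevel = 0;
        region.imageSubresource.baseArrayLayer = 0;
        region.imageSubresource.layerCount = 1;
        region.imageExtent = {width, height, 1};

        // The staging buffer was filled via vkMapMemory on host_visible memory;
        // this is the only GPU-side copy, straight into the final image.
        vkCmdCopyBufferToImage(cmd, staging, image,
                               VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL, 1, &region);
    }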

Great article btw. I hope something comes out of this!


What has changed in the memory landscape/AI workloads in recent months compared to the summer or spring?


Apparently OpenAI locked down 40% of the global DRAM supply for their Stargate project, which then caused everyone else to start panic-buying, and now we're here: https://pcpartpicker.com/trends/price/memory/


It's kind of depressing to see that it takes just one asshole to screw up the entire electronics market. If you read this, Sam, FU.


I got one of the Newegg circulars in my email advertising a sweet little uATX AMD server board and got to thinking that my home FreeBSD server could use a CPU bump and more memory. As soon as I saw how much 128GB of DDR5 ECC would cost, my jaw dropped and I noped the fuck out. The cheapest 32GB modules are around $300, ranging upwards of $500. Thought I was going to gift myself early this Christmas. Depressing indeed.


Indeed, it makes mini computers with soldered RAM end up being quite cheap by comparison. HP will currently sell you 128GB AMD or Nvidia boxes for 1.7-2.8k depending on your flavor of choice. Not ECC though.


The Strix Halo mini workstation has 128GB of ECC. Good machine. Source: https://h20195.www2.hp.com/v2/GetPDF.aspx/c09086887


Necro reply - I don't think they actually shipped it with ECC... If you look up a Z1 Mini G1a repair video: https://youtu.be/1i_PfH05ekw?si=MpQ0Uc9QVwhgsFzi&t=267 It has 8 chips, not 10. It needs 10 for true ECC.

I should check this next time I take it apart, thanks for the heads up :)

I’d be really curious to see - that would possibly sell me on one! Linux should also report EDAC being up in dmesg if it has ECC.

...while supplies last. Which won't be long when people do exactly that (hey, that mini PC is now cheaper than building a similar setup).


Indeed…


Exactly this.

I'd been planning to upgrade my desktop as a christmas present for myself.

Now that I have the cash and was looking at buying my PCPartPicker list, the cost of the 64GB DDR5-6000 RAM I planned to buy has gone from £300-400 to £700-800+, a difference of almost the price of the 9070 XT I just bought to go in the computer.

I guess I'll stick with my outdated AM4/X370 setup and make the best of the GPU upgrade until RAM prices stop being a complete joke.


Literally every market is like that. If you've got market-cap amounts of money and place a market buy order for all of it, you'll quickly learn what slippage is.


That really isn't unprecedented. We need high RAM prices for manufacturers to expand fabs; supply overshoots demand because the AI bubble will contract to some extent, and then we'll have cheap RAM once again. Classic cycle.


> We need high RAM prices for manufacturers to expand fabs

Manufacturers aren't dumb; they lost a lot of money in the last cycle and aren't playing that game anymore. No additional capacity is planned; OEMs are simply redirecting existing capacity towards high-margin products (HBM) instead of chasing fragile demand.


The proles will get dumb screens tethered to their sanctioned models; and we will be grateful!


I understand hating on people like Musk who destroy human lives, but what is Sam Altman doing?

Because of the copyright of images, or just because he bought RAM?


Inefficiently (from society’s perspective) allocating massive amounts of resources? Why he specifically is being singled out, I'm not absolutely certain.


People should be happy that commercial entities invest that much money into compute, especially on HN.

This will leapfrog cancer research, materials research, etc.


> Apparently OpenAI locked down 40% of the global DRAM supply for their Stargate project

That sounds like a lot, and almost unbelievable, but the scale of all of this kind of sits in that space, so what do I know.

Nonetheless, where are you getting this specific number and story from? I've seen it echoed before, but no one has been able to trace it to any sort of reliable source that doesn't boil down to "secret insider writing on Substack".


Samsung directly announced that OpenAI expects to procure up to 900,000 DRAM wafers every month. That number being 40% of global supply comes from third-party analysis, but the market is going to notice nearly a million wafers being diverted each month however you slice it. That's a shitload of silicon.

https://news.samsung.com/samsung-and-openai-announce-strateg...

https://www.tomshardware.com/pc-components/dram/openais-star...


> Samsung directly announced that OpenAI expects to procure up to 900,000 DRAM wafers every month

The article says: "OpenAI’s memory demand projected to reach up to 900,000 DRAM wafers per month", but not by when, or what current demand is. If this is based on OpenAI's >$1T of announced capex over the next 5 years, it's not clear that money will ever actually materialize.


I don't even get this trend; wouldn't OpenAI be buying ECC RAM only anyway? Who in their right mind runs this much infrastructure on non-ECC RAM??? Makes no sense to me. Same with GPUs: they aren't buying your 5090s. People's perception is wild to me.


OpenAI bought out Samsung's and SK Hynix's DRAM wafers in advance, so they'll prioritize producing whatever OpenAI wants to deploy, whether that's DDR/LPDDR/GDDR/HBM, with or without ECC. That means way fewer wafers for everything else, so even if you want a different spec you're still shit out of luck.


You forgot to mention that everyone else also raised their prices because, you know, who doesn't like free money?

Last year I bought two 8GB DDR3L RAM sticks made by Gloway for around $8 each; now the same stick is priced around $22, roughly 2.75x the price.

SSD makers are also increasing their prices, but that started one or two years ago, and they did it again recently (of course).

It looks like I won't be buying any first-hand computers/parts before prices go back to normal.


> you know, who doesn't like free money?

Yes, but otherwise you'd get huge shortages and would be unlikely to be able to buy it at all. Also, a significant proportion of the surplus currently going to manufacturers etc. would go to various scalpers and resellers.


ECC memory is a bit like RAID: A consumer-level RAM stick will (traditionally) have 8 8-bit-wide chips operating basically in RAID-0 to provide 64-bit-wide access, whereas enterprise-level RAM sticks will operate with 9 8-bit-wide chips in something closer to RAID-4 or -5.

But they are all exactly the same chips. The ECC magic happens in the memory controller, not the RAM stick. Anyone buying ECC RAM for servers is buying on the same market as you building a new desktop computer.


> enterprise-level RAM sticks will operate with 9 8-bit-wide chips

Since DDR5 has 2 independent subchannels, 2 additional chips are needed.


> Anyone buying ECC RAM for servers is buying on the same market as you building a new desktop computer.

Even when the sticks are completely incompatible with each other? I think servers tend to use RDIMMs, desktops UDIMMs. Personally I'm not seeing as steep an increase in (B2B) RDIMMs compared to the same stores selling UDIMMs (B2C), but I'm also looking at different stores tailored towards different types of users.


The expensive part is DRAM chips. They drive prices for sticks.


At the chip level there’s no difference as far as I’m aware; you just have 9 bits per byte rather than 8 bits per byte physically on the module. More chips, but not different chips.


> you just have 9 bits per byte rather than 8 bits per byte physically on the module. More chips but not different chips.

For those who aren't well versed in the construction of memory modules: take a look at your DDR4 memory module and you'll see 8 identical chips per side if it's a non-ECC module, and 9 identical chips per side if it's an ECC module. That's because, for every byte, each bit is stored in a separate chip; the address and command buses are connected in parallel to all of them, while each chip gets a separate data line on the memory bus. For non-ECC memory modules, the data line which would be used for the parity/ECC bit is simply not connected, while on ECC memory modules it's connected to the 9th chip.

(For DDR5, things are a bit different, since each memory module is split in two halves, with each half having 4 or 5 chips per side, but the principle is the same.)
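
To make the chip counts above concrete, here's the back-of-the-envelope arithmetic (my own illustration, assuming x8 chips and 8 ECC bits per DDR5 subchannel; x4 parts would double the counts). It also explains the 8-vs-10 observation about the HP box upthread:

    #include <cstdio>

    int main() {
        // DDR4: one 64-bit channel per rank, plus 8 ECC bits on ECC modules.
        constexpr int ddr4_nonEcc = 64 / 8;               // 8 chips per rank
        constexpr int ddr4_ecc    = 64 / 8 + 8 / 8;       // 9 chips per rank

        // DDR5: two independent 32-bit subchannels, each with its own 8 ECC bits.
        constexpr int ddr5_nonEcc = 2 * (32 / 8);         // 8 chips per rank
        constexpr int ddr5_ecc    = 2 * (32 / 8 + 8 / 8); // 10 chips per rank

        std::printf("DDR4: %d non-ECC, %d ECC\n", ddr4_nonEcc, ddr4_ecc);
        std::printf("DDR5: %d non-ECC, %d ECC\n", ddr5_nonEcc, ddr5_ecc);
    }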


I seriously doubt that single-bit errors at the scale of OpenAI workloads really matter very much, particularly for a domain that is already noisy.


Till they hit your program memory. We just had a really interesting incident where one of the Ceph nodes didn't fail but started acting erratically, bringing the whole cluster to a crawl, once a failing RAM module had some uncorrectable errors.

And that was caught because we had ECC. If not for that we'd be replacing drives, because the metrics made it look like one of the OSDs was slowing to a crawl, for which the usual reason is a dying drive.

Of course, the chance of that is pretty damn small, but also their scale is pretty damn big.


Random bit flips are their best path to AGI.


On the flip side, LLMs are so inconsistent you might argue ECC is a complete waste of money. But OpenAI wasting money is hardly anything new.


Using digital chips instead of some novel analog approach is an even greater waste.

> China's AI Analog Chip Claimed to Be 3000X Faster Than Nvidia's A100 GPU (04.11.2023)

https://news.ycombinator.com/item?id=38144619

> Q.ANT’s photonic chips – which compute using light instead of electricity – promise to deliver a 30-fold increase in energy efficiency and a 50-fold boost in computing speed, offering transformative potential for AI-driven data centers and HPC environments. (24.02.2025)

https://qant.com/press-releases/q-ant-and-ims-chips-launch-p...


ECC modules use the same chips as non-ECC modules, so it eats into the consumer market too.


Good point! But they are slightly more energy hungry. At these scales I wonder if Stargate could go with one fewer nuclear reactor simply by switching to non-ECC RAM.


Penny-wise and pound foolish. Non-ECC RAM might save on the small amount of RAM power, but if a bit-flip causes a failed computation then an entire forwards/backwards step – possibly involving several nodes – might need to be redone.


Linus Torvalds was recently on Linus Tech Tips to build a new computer and he insisted on ECC RAM. Torvalds was convinced that memory errors are a much greater problem for stability than generally reported, and he's spent an inordinate amount of time chasing phantom bugs because of them.

https://www.youtube.com/watch?v=mfv0V1SxbNA


>but if a bit-flip causes a failed computation then an entire forwards/backwards step – possibly involving several nodes – might need to be redone.

Which for the most part would be an irrelevant cost of doing business compared to the huge savings from non-ECC and how inconsequential it is if some ChatGPT computation fails...


The 5090 is the same chip as the workstation RTX 6000.

Of course OpenAI isn't buying that either, but rather B200 DGX systems; that is still the same process at TSMC, though.


ECC RAM's utility is overblown. Major companies often use off-the-shelf, non-enterprise parts for huge server installations, including regular RAM. The rare bit flip is hardly a major concern at their scale, and for their specific purposes.


Most server CPUs require RDIMMs, and while non-ECC RDIMMs exist, they are not a high-volume product and are intended for workstations rather than servers. The used parts market would look very different if there were lots of large-scale server deployments using non-ECC memory modules.


Do you have a source for this?

I would not want to redo a whole run just because of bit flips, and bit flips become a lot more relevant the more servers you need.


That DDR5-4800 2x16GB price trend is crazy. It tripled from August/September until now.


Even DDR4. I just checked: I bought a non-ECC 1x32GB stick for my homelab on August 25th, priced at 78€ on Amazon. The same offer is now at 229€. Yeah, I guess I'll wait before upgrading to 64GB then.


It reminds me very much of the crypto mining craze, when there was a run on GPUs and one couldn't be had for any less than 5x its MSRP. I know that eventually passed and so too will this, but it still sucks if you had been planning to purchase RAM or anything needing it.


I don't think DDR4 is even being manufactured anymore, so the rush is clearing out that inventory for good.


It is still being manufactured. Older memory standards continue to be manufactured long after they stop being used in computers, e.g. for use in embedded devices.


What will happen once the bubble pops and OpenAI is not able to pay for all the useless stuff they ordered?


Ideally the consumer market gets flooded with surplus server-grade hardware at or below cost, flowing out in going-out-of-business fire sales.


Not much use for the 100GB+ AI boards or server RAM for consumers. Though homelab guys will be thrilled.

Enterprise-wise, used servers have kinda always been cheap (at least compared to MSRP, or even the after-discount price), just because there are enough companies that want the good feeling of having a warranty on equipment and yeet it after 5 years.


Nowadays old-gen server hardware can be a viable alternative to a new HEDT or workstation, which would typically use top-of-the-line consumer parts. The price and performance are both broadly comparable.


Isn't the typical server much noisier than, e.g., a high-end desktop (HEDT) with Noctua fans?


No. It's up to you to cool it. I use an Epyc-based system as a home server and you can't hear it. At a previous employer we built a cluster out of these and just water-cooled them. Very easy.

This is a chassis and fan problem, not a CPU problem. Some devices do need their own cooling if your case is not a rack mount; e.g. if you have a Mellanox chip, those run hot unless you cool them specifically. In the rackmount use case that happens anyway.


Depends how big the fans are. Tiny 1U rack-mountable hardware = lots of noise; huge fans = near silent with better heat removal capacity.


Oof the RAM in my computers is apparently worth more than I paid for the entire thing...


DRAM prices have skyrocketed recently


There is also a VR mod for HL1.


My takeaway is that sandboxing should be more readily available, and integrated into the OS.

I used Sandboxie a while ago for stuff like this, but AFAIK Windows has had a sandbox built into it for a few years now, which I didn't think about until now.


Yeah, Windows Sandbox is available on Win 10/11 Pro and Enterprise and it's actually pretty neat. I used to use it in a previous job where I was forced to run Windows.

However, I think OP might be using WSL and I'm not sure that's available in Sandbox.


Windows Sandbox looks like an alpha. It's nowhere near where Microsoft's valuation is.

That said with enough attacks of this kind we may actually get real security progress (and a temporary update freeze maybe), fucking finally.


Microsoft's valuation? Update freeze?


Why not get the wifi-enabled fridge and just not hook it up to your router?

Genuinely asking, because I plan to do this once I have to get new appliances; is there something I'm missing with that approach?


Why not buy the fridge that doesn't have wifi smarts to begin with?

If I want to monitor my fridge's temperature, I can buy a widget that does that for a dozen or so dollars and have that sensor talk to the home automation system of my choice. And when the fridge dies or otherwise gets replaced, I can move the sensor to the new fridge. (And when a new sensor comes out that I like better, I can spend another McDonald's Value Meal worth of money to use that instead.)

Besides: We here on HN should all have a certain amount of distrust for devices that self-report problems.

This distrust is part of the reason why ZFS doesn't trust hard drives to self-report issues and does its own checksums instead.

---

But that's a general rant. To answer your question more-directly, if somewhat-tangentially: One of the popular open-source-oriented YouTube dudes (Jeff Geerling?) recently bought a dishwasher that had functional modes that could not be accessed without a wifi connection to The Clown.

And that's... that's not good: In order to be able to use the functions that the thing natively includes, one must always allow it to call home to mother.


My folks have one of these: it bitches at you for internet.

Also the device is generally designed with internet in mind, so certain local-only functions don’t work properly without internet.


At some point they'll put limits in place until you connect. It might go from just not working to limiting the temperature or whatever.


Thing is (correct me if I'm wrong), if you use modules, all of your code needs to use modules (e.g. you can't have mixed #include <vector> and import <vector>; in your project), which rules out a lot of 3rd-party code you might want to depend on.


You're wrong; you can simply mix modules with includes. If you #include <vector> inside your module purview, then you will just get a copy of <vector> in each translation unit. Not good, but it works. On the other hand, if you include <vector> inside the global module fragment, then the number of definitions will actually be 1, even if you include it in two different modules.


Well, the standard says you can, but it doesn't actually work in practice in MSVC, which is the only compiler that has supported modules for over a year.


GCC and Clang have implemented them too, but only partially.

My comment was about this absolutely wrong point:

> all of your code needs to use modules

With all three major compilers you can, right now, use modules and at the same time #include other dependencies.
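
For reference, a minimal sketch of what that mixing looks like (illustrative only; the module and function names are made up). The #include goes into the global module fragment, before the export module declaration, so the standard header stays attached to the global module instead of being baked into the purview:

    // geometry.cppm
    module;                    // start of the global module fragment
    #include <vector>          // classic include, attached to the global module

    export module geometry;    // everything below is the module purview

    export std::vector<float> unitSquare() {
        // plain std::vector is usable here without importing anything extra
        return {0.f, 0.f, 1.f, 0.f, 1.f, 1.f, 0.f, 1.f};
    }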


How different is CBOR compared to BSON? Both seem to be binary JSON-like representations.

Edit: BSON seems to contain more data types than JSON, and as such it is more complex, whereas CBOR doesn't add to JSON's existing structure.
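
For a feel of the difference, here's the JSON value {"a": 1} hand-encoded in both formats (my own worked example; BSON has no generic "number" type, so int32 is assumed). Both are self-describing, but BSON always wraps a top-level document and carries explicit lengths, while CBOR packs the type and length into the first byte of each item:

    #include <cstdint>

    // CBOR, 4 bytes: map(1) { text(1) "a" : unsigned(1) }
    const std::uint8_t cbor[] = {
        0xA1,        // major type 5 (map), 1 key/value pair
        0x61, 'a',   // major type 3 (text string), length 1, "a"
        0x01         // major type 0 (unsigned int), value 1
    };

    // BSON, 12 bytes: document { "a": int32 1 }
    const std::uint8_t bson[] = {
        0x0C, 0x00, 0x00, 0x00,  // total document length = 12 (little-endian int32)
        0x10,                    // element type 0x10 = int32
        'a', 0x00,               // element name as a NUL-terminated cstring
        0x01, 0x00, 0x00, 0x00,  // value 1 (little-endian int32)
        0x00                     // end-of-document marker
    };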


That's not entirely true: with CBOR you can add custom data types through custom tags. A central registry of them is here:

https://www.iana.org/assignments/cbor-tags/cbor-tags.xhtml

This is, for example, used by IPLD (https://ipld.io) to express references between objects through native types (https://github.com/ipld/cid-cbor/).
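
Concretely, a tag is just CBOR major type 6 prefixed to any data item. A hand-encoded illustration, using the registered tag 1 (epoch-based date/time) from that registry rather than a custom tag:

    #include <cstdint>

    // Tag 1 applied to the unsigned integer 1000000000 (2001-09-09T01:46:40Z).
    const std::uint8_t tagged[] = {
        0xC1,                    // major type 6 (tag), tag number 1
        0x1A,                    // major type 0 (unsigned), 32-bit value follows
        0x3B, 0x9A, 0xCA, 0x00   // 1000000000, big-endian
    };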


I think parsing BSON is simpler than parsing JSON; BSON has additional types, but the top level is always a document, whereas the following are all valid JSON:

- `null`

- `"hello"`

- `[1,2,NaN]`

Additionally, BSON will just tell you what the type of a field is. JSON requires inferring it.


NaN is not part of JSON by any spec. Top level scalar values were disallowed by RFC 4627.


Fair enough. I'm not sure how much JSON parsers in the wild care about that spec. I just tried with Python and it was happy to accept scalars and NaN. JavaScript rejected NaN but was happy to accept a scalar. But sure, compliant parsers can disregard those cases.


It's 12 years, not 22.

An embedded device bought today may easily be in use 12 years from now.


Oops, fixed that.


Offtopic: why does the Copyright © icon shake like crazy at the bottom of the page?

Edit: Oh, I guess it's intentional; I clicked around and I like the rgbcube site map.


<copyright intensifies>


This is such a good analogy. Awareness about social media should be like awareness about the junk food you consume.

