What every programmer should know about solid-state drives (2014)

NovemberWhiskey · on May 24, 2022

I feel like maybe this is "what all filesystem developers should know about solid-state drives"; it's not very obvious how most other developers would interact with a device at the level of abstraction where they have the necessary kind of control.

mmmpetrichor · on May 24, 2022

If some typical write pattern from a typical app is wearing out the SSD really fast, I'd say that's the SSD firmware engineer's problem? And I think they've actually done a great job in general, judging by the typical lifespan of SSDs and the typically great performance. I'd argue that if the drive is designed correctly, most programmers shouldn't have to care about low level details. (I did say MOST).

Sakos · on May 24, 2022

I think you misspelled "it's the user's problem". I don't think most companies care until it becomes something that materially affects them. Until then, users are reliant on the developers of the applications they use to make up for the deficiencies in lower layers.

lazide · on May 25, 2022

A reputation for drives that fail faster than their competitors will definitely affect them!

melissalobos · on May 25, 2022

> A reputation for drives that fail faster than their competitors

How can they get that if they stuff enough fake reviews, plus the legion of consumers who would have no idea that the drive was the issue and not "viruses".

lazide · on May 25, 2022

Neither of those apply to enterprise customers.

dotancohen · on May 25, 2022

That's a completely different market, to which are marketed different drives.

lazide · on May 25, 2022

True mostly! OEMs are the largest non-enterprise market, and they do (roughly) the same types of testing and validation an enterprise customer would though.

No one is going to be selling a million laptops with a drive from RandoDriveManuGoodBrand off Amazon with no track record and no validation.

Anyone buying the non-name brand types of drives knows they're getting (at best) something that might only work a little while before exploding.

The name brands like Samsung et al. work hard to make their firmware not grenade something (and the drives overall to be AT LEAST as reliable as their competitors) BECAUSE they want the name to mean something. It is what drives customers their way, most of the time.

If they get a reputation as a company selling junk (cough Deskstar/Deathstar) that costs them billions over many years.

grogers · on May 25, 2022

The firmware's job is wear leveling - making sure all the sectors wear out at about the same time, which they do a great job at. But SSDs can write so fast that you can burn out the drive in months (maybe even weeks?) if you wanted to. There's nothing the firmware can do to fix the limitations of flash itself. The important thing is that for write heavy workloads, you need to keep write amplification in mind.

I remember an adjacent team to mine that had to store several gigs of data which changed often, but only a small percent changed at any one time. They needed to recover quickly from a crash so they wrote it to disk. But they wrote the entire data set out to disk after every update, instead of keeping it in e.g. rocksdb or even sqlite. Their entire fleet burnt through their SSDs at about the same rate, so machines were dying in rapid succession, ouch. Write amplification is a real problem, but SSDs' great performance often masks it until down the road.
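To put a number on the anecdote above, here is a minimal sketch of the application-level amplification you get from rewriting everything on each update. The sizes (a 4 GiB data set, 4 MiB changed per update) and the function name are hypothetical, chosen only to match "several gigs" and "a small percent"; the drive's own internal write amplification would come on top of this.

```python
def app_level_write_amplification(dataset_bytes: int, changed_bytes: int) -> float:
    """Bytes written per byte that actually changed, when the whole
    data set is rewritten on every update."""
    if changed_bytes <= 0:
        raise ValueError("changed_bytes must be positive")
    return dataset_bytes / changed_bytes


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical numbers: a 4 GiB data set where each update touches 4 MiB.
factor = app_level_write_amplification(4 * GIB, 4 * MIB)
print(f"a full rewrite writes {factor:.0f}x more than the changed data")
```

A store that only persists the changed records (the rocksdb or sqlite route mentioned above) avoids most of that factor.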

eyegor · on May 25, 2022

> But SSDs can write so fast that you can burn out the drive in months (maybe even weeks?)

You can burn out a modern consumer drive in 2 days if you want to. Write perf ~6 GB/s, rated endurance ~700 TB written on a 1 TB drive. The TLC/QLC cells have very poor endurance imo.
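The arithmetic behind that estimate, as a rough sketch. It assumes the drive sustains its peak write speed for the whole run and uses decimal units (1 TB = 1000 GB); both are simplifications, and the function name is made up for illustration.

```python
def hours_to_exhaust(endurance_tb: float, write_gb_per_s: float) -> float:
    """Hours of continuous writing needed to reach the rated endurance."""
    total_gb = endurance_tb * 1000          # decimal TB -> GB
    seconds = total_gb / write_gb_per_s
    return seconds / 3600


hours = hours_to_exhaust(endurance_tb=700, write_gb_per_s=6)
print(f"{hours:.1f} hours, i.e. {hours / 24:.2f} days")
```

At the quoted figures this comes out a bit under a day and a half; a drive that cannot hold peak speed continuously would take longer.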

Helmut10001 · on May 25, 2022

I am surprised the article doesn't mention monitoring TBW (TeraBytes Written). I found it a good indicator for how much data is actually written to the SSD. In my case, I decided to buy cheap consumer SSDs (from WD), because I calculated I have only about 50TBW per year on my VM-drive, a ZFS-Mirror. In reality, it is even less - so far 14TBW in 2022, with 16 VMs. See a blog post here for how I monitor the stats in InfluxDB [1]. WD says the drive has an average lifetime of about 300 to 400 TBW, so I can expect at least another 5 years.

[1]: https://du.nkel.dev/blog/2021-05-05_proxmox_influxdb/#config...
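A back-of-the-envelope version of that lifetime estimate, using only the figures from the comment (300 to 400 TBW rated, about 50 TBW per year). The helper name is hypothetical, and treating rated TBW as a hard limit is an assumption; drives can fail earlier or outlive the rating.

```python
def years_of_rated_endurance(rated_tbw: float, tbw_per_year: float) -> float:
    """Years until the rated TBW figure is reached at a constant write rate."""
    if tbw_per_year <= 0:
        raise ValueError("tbw_per_year must be positive")
    return rated_tbw / tbw_per_year


low = years_of_rated_endurance(300, 50)
high = years_of_rated_endurance(400, 50)
print(f"between {low:.0f} and {high:.0f} years at 50 TBW/year")
```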

belter · on May 25, 2022

"Storage study finds SSDs might not be much more reliable than HDDs after all"

https://www.pcgamer.com/storage-study-finds-ssds-might-not-b...

"Are SSDs Really More Reliable Than Hard Drives?"

https://www.backblaze.com/blog/are-ssds-really-more-reliable...

jbverschoor · on May 24, 2022

> Cells are grouped into a grid, called a block, and blocks are grouped into planes. The smallest unit through which a block can be read or written is a page. Pages cannot be erased individually, only whole blocks can be erased. The NAND-flash page size can vary, and most drives have pages of size 2 KB, 4 KB, 8 KB or 16 KB. Most SSDs have blocks of 128 or 256 pages, which means that the size of a block can vary between 256 KB and 4 MB. For example, the Samsung SSD 840 EVO has blocks of size 2048 KB, and each block contains 256 pages of 8 KB each.

Very confusing and might be incorrect. What are planes? And are pages made out of blocks or vice-versa? If blocks are grouped in pages, erasing sounds very different: "only whole blocks" sounds like blocks are bigger than pages.

jasonwatkinspdx · on May 24, 2022

It's correct.

Planes reflect the physical structure of the storage chips: there's multiple layers that share a common vertical bus.

Plane > Block > Page, that is to say Blocks are always made up of multiple pages (commonly 128 or 256 as the quote mentions). Pages are the unit of read and write, while blocks are the unit of erasure. The FTL tries to hide this page write vs block erase mismatch as best it can, but as the original article points out you may need to be aware of what it's doing in very high performance systems.
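The size relationship in the quoted passage can be checked with a few lines. This is only a sketch of the arithmetic (block size = page size × pages per block); the class name is made up, and the numbers are the ones quoted from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockGeometry:
    page_kb: int          # smallest unit of read/write
    pages_per_block: int  # a block is the smallest unit of erase

    @property
    def block_kb(self) -> int:
        return self.page_kb * self.pages_per_block


evo_840 = BlockGeometry(page_kb=8, pages_per_block=256)
smallest = BlockGeometry(page_kb=2, pages_per_block=128)
largest = BlockGeometry(page_kb=16, pages_per_block=256)

print(evo_840.block_kb, smallest.block_kb, largest.block_kb)
```

This reproduces the 2048 KB block of the 840 EVO and the 256 KB to 4 MB range given in the quote.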

wtallis · on May 25, 2022

A single NAND die is only divided into two or four planes. It's a function of how many copies of the peripheral circuitry for accessing the array are included, not how many layers are in the 3D NAND array. More planes means the die can do more things in parallel (subject to constraints).

A drive with 8 dies each having 512Gbit capacity divided into four planes per die will perform almost as well as one with 16 dies of 256Gbit divided into two planes, other things being equal (eg. number and speed of the channels between the SSD controller and the NAND, page and block sizes and access times, all of which are subject to change at the same time a generational change increases die capacity and number of planes).

jbverschoor · on May 25, 2022

Ok that makes more sense. A small diagram outlining the 3d structure would be helpful.

grogers · on May 24, 2022

Page = minimum read/write unit, block = minimum erase unit. Blocks are composed of some integer number of pages. Planes don't matter (probably).

rsaxvc · on May 25, 2022

Planes allow you to perform multiple parallel operations on a single die (assuming it's the same as the raw SLC I work with).

tenebrisalietum · on May 24, 2022

> Splitting cold and hot data as much as possible into separate pages will make the job of the garbage collector easier.

How do I tell my SSD to write stuff to specific pages? You can't really tell the SSD to do anything except read, write, or trim LBAs.

Does NVMe support this with its queues?

> 27. Over-provisioning is useful for wear leveling and performance

I thought most if not all SSDs were already overprovisioned. Does additional overprovisioning help?

> To ensure that logical writes are truly aligned to the physical memory, you must align the partition to the NAND-flash page size of the drive.

I think this is false. This assumes there is a one-to-one mapping of LBA to SSD PBA which you don't know. LBA 2048 could go to any PBA on any page/block/flash line in the unit and as things are written and rewritten, any correspondence that might happen due to sequential assignment of PBAs->LBAs would gradually diminish, IF you knew for sure that was happening in the first place. Because you wouldn't really know what the SSD is doing without reverse engineering or seeing the source code of firmware, unless there's things going on in NVMe land that are new and I don't yet know.
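For reference, the alignment advice being questioned boils down to a modulo check on the partition's starting byte offset. The sketch below assumes 512-byte logical sectors and a known page size, and the function name is hypothetical. It only checks the logical offset; as the comment points out, it says nothing about where the FTL actually places the data.

```python
SECTOR_BYTES = 512  # assumed logical sector size


def partition_is_aligned(start_lba: int, page_bytes: int,
                         sector_bytes: int = SECTOR_BYTES) -> bool:
    """True if the partition's starting byte offset is a multiple of the page size."""
    return (start_lba * sector_bytes) % page_bytes == 0


aligned_modern = partition_is_aligned(2048, 16 * 1024)  # 1 MiB offset, 16 KiB pages
aligned_legacy = partition_is_aligned(63, 4 * 1024)     # old 63-sector offset, 4 KiB pages
print(aligned_modern, aligned_legacy)
```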

wtallis · on May 24, 2022

I wrote a series of articles that covered the new features defined for NVMe drives. The general pattern is that there are now lots of optional hints that drives and host systems can exchange about data placement, alignment and lifetime. But there are also alternative paradigms available like Zoned Storage that break compatibility to offer explicit control. These features are mostly only implemented in enterprise SSDs, and often only if a big customer specifically asks for them.

https://www.anandtech.com/show/11436/nvme-13-specification-p...

https://www.anandtech.com/show/14543/nvme-14-specification-p...

https://www.anandtech.com/show/16702/nvme-20-specification-r...

https://www.anandtech.com/show/15959/nvme-zoned-namespaces-e...

thfuran · on May 24, 2022

>I thought most if not all SSDs were already overprovisioned. Does additional overprovisioning help?

I think a big extra helping of overprovisioning is one of the major differences between consumer and enterprise SSDs.

bob1029 · on May 24, 2022

I've been thinking about the possibility of "dumb" SSD devices.

All of the current HW-level performance hacks could actually get in the way if your software already enforces things like single writer, chunky writes and/or append-only log structures.

Give me a drive that only writes in 1 linear direction (until it's full) and has a big red button to clean the entire thing all at once (which would clearly require some offline processing time & multiple disks for a realistic system).

jerdfelt · on May 24, 2022

Does the ZNS (Zoned Namespaces) spec come close enough?

https://nvmexpress.org/new-nvmetm-specification-defines-zone...

bob1029 · on May 24, 2022

Yes, actually. This looks like a realistic/practical path. Had no idea this was a thing.
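A toy model of the "write in one direction, reset all at once" idea discussed here. This is an in-memory illustration only, not the NVMe ZNS command set or any real driver API; the class and method names are invented.

```python
class Zone:
    """Append-only region: writes land only at the write pointer, and
    space is reclaimed only by resetting the whole zone."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._data = bytearray()

    @property
    def write_pointer(self) -> int:
        return len(self._data)

    def append(self, payload: bytes) -> int:
        """Write at the current write pointer and return the offset used."""
        if self.write_pointer + len(payload) > self.capacity:
            raise OSError("zone full: reset it or move on to another zone")
        offset = self.write_pointer
        self._data += payload
        return offset

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self._data[offset:offset + length])

    def reset(self) -> None:
        """The 'big red button': discard everything and rewind the pointer."""
        self._data.clear()


zone = Zone(capacity=8)
first = zone.append(b"abcd")
second = zone.append(b"ef")
print(first, second, zone.read(0, 6))
```

There is deliberately no overwrite-in-place method; software built on a log structure would never need one.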

mbjorling · on May 24, 2022

There is more technical information at zonedstorage.io which also offers drives for academia and open-source projects.

https://zonedstorage.io/docs/community/devices

MichaelZuo · on May 25, 2022

I think that's roughly what the flash storage modules on Apple's new Mac Studio are.

rsaxvc · on May 25, 2022

Have you seen SMR spinning disks? You can get them today in host-managed flavors.

bruce343434 · on May 24, 2022

Sure! Go ahead and order some memory cells.

nonrandomstring · on May 24, 2022

From a low level programmatic standpoint, managing size and alignment with (potentially unknown) page sizes throws the same challenges as for AV buffers and network packet MTU/sizes - either side of "just right" is suboptimal.

amelius · on May 24, 2022

From Wikipedia:

> In December 2012, Taiwanese engineers from Macronix revealed their intention to announce at the 2012 IEEE International Electron Devices Meeting that they had figured out how to improve NAND flash storage read/write cycles from 10,000 to 100 million cycles using a "self-healing" process that used a flash chip with "onboard heaters that could anneal small groups of memory cells."

So can I apply this myself by placing an SSD drive in an oven?

rasz · on May 25, 2022

Yes, if you have manufacturer software to factory format blank drives. Heating up heals cells being written (probably filled as writing empties cells while erasing stores max charge value), but also speeds up data degradation in all the other cells not being written to.

dang · on May 24, 2022

dtgriscom · on May 24, 2022

A question: do you have a tool that searches the history for previous links, or do you just have a really good memory?

dang · on May 25, 2022

Here's a pointer to past explanations: https://news.ycombinator.com/item?id=29370676.

dmurray · on May 24, 2022

There's a "past" link on every HN story that shows you previous submissions of the same story.

metadat · on May 24, 2022

What sorts of programmers should be concerned about these matters? Page cache doesn't seem too important or interesting in my day to day app and distributed systems development.

Maybe it's useful if you want to make something like a more performant version of grep? (aka ripgrep?)

loxias · on May 24, 2022

> Page cache doesn't seem too important or interesting in my day to day app and distributed systems development

This is why we can't have nice things.

tshaddox · on May 24, 2022

How so? Isn't the only point of developing these systems and abstractions so that other people don't have to worry about them?

chrisandchris · on May 24, 2022

IMHO, today too many people think "don't have to worry about them" equals "don't need to know anything about it".

tshaddox · on May 24, 2022

I would argue that in most cases you "don't need to know anything about it" either. It's reasonable to deliberately treat abstractions as if they are not leaky, as long as you're aware that all abstractions in fact are leaky and you're equipped to investigate and learn about them if the leaks cause problems.

dotopotoro · on May 24, 2022

“don't need to know anything about it” is acceptable, but should not be encouraged.

It’s not like reading 10 bullet points on the subject is “diving deep” and making huge time investment.

It’s just getting the minimal context, so later on at least some keywords are known.

tshaddox · on May 24, 2022

> It’s not like reading 10 bullet points on the subject is “diving deep” and making huge time investment.

True, but you're using so many abstractions that the rule can't feasibly be "read a short summary of every abstraction you're using." There are just too many. At some point you have to choose a threshold where the likelihood of an abstraction leakage is sufficiently low. When you're debugging a CSS selector you will almost certainly never need to know about even the existence of, say, Fermi–Dirac statistics.

dotopotoro · on May 25, 2022

> True, but you're using so many abstractions that the rule can't feasibly be "read a short summary of every abstraction you're using."

Rule - no. Goal - yes.

Some topics are more stable and valuable than others, so prioritisation helps.

“How utf8 generally works” vs “implementation details of js-node-utf-related-library-X.”

macintux · on May 24, 2022

10 bullet points on every conceivable computer-related topic is, well, a lot more than 10.

dotopotoro · on May 25, 2022

One topic at high level (like in the article, 10-20 minutes?) per week, results in ~50 topics per year.

Not sure how many computer related topics you know/want (“The more you know, the more you know you don't know”), but for me, 50 topics on programming seems sufficiently high at frankly a very low effort/commitment.

macintux · on May 26, 2022

Fair enough, I’m just not sure how many years would go by before I even thought about SSD performance. Irrelevant to most of my career.

the_only_law · on May 24, 2022

I love how people say this, when the reality is, all the software from the oh-so-coveted is the biggest shit show I’ve seen.

But it’s rarely because some developer didn’t understand page caches, and usually because it obviously didn’t receive enough QA or UX input.

eschneider · on May 24, 2022

People who read from disks and people who write to them. How SSDs organize data definitely has read and write performance implications and if you're writing to disk, some write habits that are perfectly reasonable on regular disks can cause catastrophically fast wear on SSDs.

KennyBlanken · on May 24, 2022

Yes, but the number of people who need to be worried about aligning their writes and such is pretty small; certainly not "every" programmer. The author gets into the weeds about certain things application level programmers almost never need to know or concern themselves about. He really doesn't understand what's useful information and what isn't.

If you're programming at enterprise scale, this sort of stuff is the responsibility of architect-level programmers and senior systems engineers.

Even most Linux sysadmins know all about block alignment (well, if they predate most of the various tools figuring out block size/alignment stuff for you.) It's nothing new - RAID arrays work best when properly aligned, for example.

jeffbee · on May 24, 2022

> doesn't seem too important or interesting in my day to day app and distributed systems development.

Makes sense to me. At Google we were told to stop thinking about all this stuff, that the storage hardware and software people were responsible for hiding things like wearout from application developers. This article is really "things you should know if you plan to directly access an NVMe device" but there is a huge class of programmers who are better off not knowing.

rasz · on May 24, 2022

>At Google we were told to stop thinking about all this

and as a result Chrome slams SSD by writing cached Youtube videos to disk .... except Youtube never reuses cached video data (not even when rewinding more than couple minutes to already watched spot in same video), it explicitly generates hashed requests with custom URL parameters googlevideo.com/videoplayback?expire (~6hour shelf life) &range &sig &lsig. Heavy YT viewing results in wearing out your SSD by tens of gigabytes per day for no particular reason. This is just one small example of side effects from such brilliant decisions.

MichaelZuo · on May 25, 2022

Yikes, is that really the case? No wonder the 128GB drive on my 2013 MacBook Air wore out to 60% of its original performance...

bombcar · on May 24, 2022

There was an article by varnish talking about how you should leave the caching and memory management to the OS - even if you can beat the virtual memory manager today you’ll stop improving your home grown solution while RAM and the kernel keep marching on.

wmf · on May 24, 2022

https://varnish-cache.org/docs/trunk/phk/notes.html

pavon · on May 24, 2022

My take:

1-13) General background info that informs the rest.

14-25) Important for any programmer that does enough file IO that they need to optimize it.

26-29) Important for any system admin to ensure they aren't inadvertently limiting the performance of their hardware.

yourapostasy · on May 24, 2022

Not just programmers. Anyone using ZFS with SSD, whether as the pool itself or in various caches like slog(zil) is going to find this information of use when tuning for better SSD citizenship. Programmers treating SSD like faster spinning rust is like programmers treating S3 like another POSIX filesystem; you can do it, but you're trading away compounding future advantages for that one moment of expedience.

dekhn · on May 24, 2022

In my career I have found that file system tuning for the devices is an anti-pattern that almost always ends up causing more problems than it's worth.

philjohn · on May 24, 2022

Are you writing low-level software, such as filesystems, or raw block backed database storage engines? If not, then that's definitely a decent maxim to live by.

golergka · on May 24, 2022

Don't your distributed systems use databases of some sort?

alpaca128 · on May 24, 2022

And why does a DB user need to know those details? Isn't it the whole point of DB systems to provide an optimized solution that allows users to focus on other things?

dotopotoro · on May 24, 2022

Databases always try to flush something to disk after transaction, just in case unexpected reboot happens. So your writes to db have direct correlation to disk writes.

Choice of db schema impacts physical layout on ssd. E.g. Different tables are more likely to be on different ssd pages resulting in random writes.

Databases are insanely complex, but not magic.
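A minimal sketch of the "flush after every transaction" pattern described above, using a plain append-only log file. It is not how any particular database implements durability; the file name and the `commit` helper are made up. The point is that each commit forces one trip to the device via `os.fsync`.

```python
import os
import tempfile


def commit(log_fd: int, record: bytes) -> None:
    """Append one record and force it to stable storage before returning."""
    os.write(log_fd, record)
    os.fsync(log_fd)  # one forced device write per committed transaction


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.log")  # hypothetical log file
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for i in range(3):
            commit(fd, f"txn {i}\n".encode())
    finally:
        os.close(fd)
    size = os.path.getsize(path)

print(f"{size} bytes of payload, 3 fsync calls")
```

Batching several transactions per `fsync` trades a little durability window for far fewer forced writes, which is the kind of knob a DB user can end up caring about.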

Gordonjcp · on May 24, 2022

By the looks of the article? People writing SSD firmware, or SSD drivers.

There is probably a small but non-zero number of these on here.

jqcoffey · on May 24, 2022

The author appears to be an EM at Booking.com. It seems unlikely that anyone at Booking would be working on SSD firmware or drivers, but a CDN seems like a reasonable assumption and also a useful place to plumb the depths of SSD implementations.

B1FF_PSUVM · on May 24, 2022

This guy used to hammer a good point about databases:

"In a time of SSD, multi-core/processor, two terabyte memory and Optane App Direct Mode machines, there is no reason not to build from BCNF data. Time to do what Dr. Codd demonstrated. Technology has finally caught up with the maths."

https://drcoddwasright.blogspot.com (skip the distractions)

dekhn · on May 24, 2022

I treat ssds like faster hard drives and I have never been disappointed.

fuzzfactor · on May 26, 2022

Well, there's still flashbench:

https://github.com/bradfa/flashbench

Plus, alternatively, there's FlashBench:

https://github.com/JonghyeokPark/FlashBench

These might be found useful for determining the underlying structure.

thehappypm · on May 25, 2022

Personally I feel like files are an abstraction that is too low-level for your typical new programmer. I find it odd that a typical script use case that you learn in Python 101 is reading a bunch of junk from a file and then writing it into another file. Files are finicky and we have much better abstractions than them, like databases.

yopawngungmstyl · on May 25, 2022

Since code is typically stored in files, I hope that new programmers would be expected to learn what a file is.

nine_k · on May 25, 2022

Look at the file system as a key-value store, only tree-shaped.

thehappypm · on May 25, 2022

A key value store has simple CRUD primitives, does a file system?

SoftTalker · on May 25, 2022

I'd say pretty much yes. If the file name is the "key" then most languages have straightforward ways to create and delete files, read data from them, and write data to them.
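The file-name-as-key view, sketched with Python's `pathlib` in a temporary directory. The key name is arbitrary, and this only covers whole-value operations on small text files.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    key = Path(tmp) / "greeting.txt"                   # the "key"

    key.write_text("hello", encoding="utf-8")          # create
    first = key.read_text(encoding="utf-8")            # read
    key.write_text("hello, world", encoding="utf-8")   # update (replaces the whole value)
    second = key.read_text(encoding="utf-8")
    key.unlink()                                       # delete
    exists_after = key.exists()

print(first, "|", second, "|", exists_after)
```

Note that the "update" here replaces the whole value, and the encoding has to be stated by the caller; editing part of a very large file or guessing an unknown encoding is where the analogy gets strained.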

thehappypm · on May 25, 2022

Not really.. try editing a giant file or reading a file with an odd encoding

wly_cdgr · on May 24, 2022

How relevant is this in 2022? What's changed and what still applies?

wolverine876 · on May 24, 2022

A serious question: What has changed?

rasz · on May 25, 2022

Geometry got smaller, thus wear endurance got a LOT worse.

fomine3 · on May 25, 2022

No, 3D NAND helped a lot for durability.

MichaelZuo · on May 25, 2022

3D QLC NAND, which is what all cheap consumer SSDs are transitioning to, is pretty bad, like 1/10th the durability of common 3D TLC NAND from 3 years ago. And 1/100th the durability of even non 3d MLC NAND from 2014.

Enterprise class 3D TLC NAND is relatively close to enterprise class non 3D MLC NAND, the gap is bigger for consumer drives.

But I think as of 2022 only Apple still sells consumer desktops/laptops with entirely TLC NAND. Everyone else is racing to the bottom for their consumer stuff.

mhh__ · on May 24, 2022

Is anyone aware of a book-length equivalent of this?

the_only_law · on May 24, 2022

Doubt it, books on niche technical subjects don’t seem to be much of a thing anymore unless you’re willing to pay extortionist prices for university textbooks.

mhh__ · on May 24, 2022

There is a book on DRAM, caches and hard drives by Bruce Jacob.

Basically I want what every programmer should know about storage but in the style of Drepper's original article.

wolverine876 · on May 24, 2022

> books on niche technical subjects don’t seem to be much of a thing anymore

Why not? Blog posts aren't nearly as valuable.

SketchySeaBeast · on May 24, 2022

I assume someone would be writing that book in the hope they'd make money back and that's hard to do with a super niche subject few will be interested in and even fewer would be willing to pay for.

wolverine876 · on May 25, 2022

How does that differ from 10 years ago?

SketchySeaBeast · on May 25, 2022

I would assume smaller and smaller niches and the information is now easier to find online.

the_only_law · on May 24, 2022

> Blog posts aren't nearly as valuable.

Also since we’re talking about hardware, I imagine a lot of people with the necessary domain knowledge can’t share what they’ve learned because of IP restrictions.

wolverine876 · on May 25, 2022

That was true in the past, when books were, if the GGP is accurate, more common.

pkaye · on May 25, 2022

If you want to know about some internals of SSD, the only book I know is "Inside Solid State Drives (SSDs)" by G. Wong. Its an old book though.

tester756 · on May 24, 2022

are speeds of bleeding edge mem devices getting close to RAM?

cogman10 · on May 24, 2022

Not really.

Max throughput is around 6gbps with a fairly high latency. DDR5 has speeds of 52gbps, lower latency, AND your CPU will almost undoubtedly have a cache on it to increase that speed further.

This is all assuming you are putting your mem device on a pci-express bus.

KennyBlanken · on May 24, 2022

> Max throughput is around 6gbps with a fairly high latency.

In the consumer market, a number of performance NVMe drives will hit over 5GB/sec, which would be 40 Gbps.

The latency isn't anywhere near as good as even quite-old RAM, but modern SSDs are considerably less than an order of magnitude off in transfer speed from even current, common RAM (DDR4) and "only" about a hundred times higher in latency than RAM.

That's pretty stunning from mass storage. So is well over 500,000 IOPS.

fomine3 · on May 25, 2022

GP uses wrong unit for both, both GiB/s.
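Since the disagreement in this subthread is mostly about bytes versus bits, here is the conversion spelled out. It is a trivial sketch using decimal units; the function name is made up.

```python
def gigabytes_to_gigabits_per_s(gb_per_s: float) -> float:
    """Convert GB/s (bytes) to Gbps (bits): 8 bits per byte."""
    return gb_per_s * 8


nvme_gbps = gigabytes_to_gigabits_per_s(5)
print(f"5 GB/s is {nvme_gbps:.0f} Gbps")
```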

MichaelZuo · on May 25, 2022

You're forgetting Optane DIMMs for enterprise.

Scene_Cast2 · on May 24, 2022

In terms of bandwidth or latency? All conditions, worst case, best case?

user3939382 · on May 24, 2022

What you should know is that I had an Apple OEM 1TB SSD in my late-2013 MBP and one day it failed so catastrophically under normal conditions that 2 of the best data recovery teams in the world told me there was nothing they could do.

Back up your stuff.

toast0 · on May 24, 2022

From my experience, SSDs tend to just disappear from the bus when they're done. If there's JTAG pins, maybe it's OEM recoverable, but good luck. At least with spinning disks, they usually have a media failure which often has warning signs. Bearing failures are usually seized at startup and there are ways to get them moving and then do a full dump. If the electronics fail, often you can pull a board from a working unit and attach it to the media and get good results. I don't think it's reasonable to swap flash chips onto another board (but maybe, I dunno?).

Dwedit · on May 24, 2022

Get an 8TB backup drive (Costco has them really cheap), and run Macrium Reflect to clone your HDD onto the backup drive. Macrium Reflect makes use of Volume Shadow Copy, so you can continue using your computer while it's backing things up.

Those big backup HDDs use shingled storage, so they're not any good as general purpose hard drives, but they're excellent for strictly sequential writes, such as a full disk backup to a single file.

eli · on May 24, 2022

Pair that with an online/remote backup and you're all set. I like Backblaze because the software client is very good but you could just as well push your own encrypted backup to S3 or a VPS.

thekrendal · on May 24, 2022

You can also use BackBlaze B2 to push your own backups with whatever software will support it, similarly to how you'd use S3.

samatman · on May 24, 2022

I'll admit my memories of 2013 are hazy, but I do recall TRIM being an issue early in the Macbook's history†.

Back up your stuff! I happen to also back up to an SSD these days, because the difference between minutes and hours is hard to argue with.

†edit: history of shipping with an SSD standard, that is.

lostlogin · on May 24, 2022

> because the difference between minutes and hours is hard to argue with.

If the backups are incremental it shouldn’t take hours.

samatman · on May 24, 2022

For a given backup an SSD will be much faster, less susceptible to drop and vibration damage, and pocketable where a portable hard drive is pouchable at best.

Retric · on May 24, 2022

Incremental backups are slightly higher risk.

avgcorrection · on May 24, 2022

Wow, you didn’t have a backup routine. That’s so basic. Why not?

-

Oh, what my routine is? Uh. I `cp -a ~ /mnt/backup/date` a couple of times a month.

... Testing backups?
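On the "testing backups" point, one cheap check is to compare checksums of the source tree and the copy. The sketch below is an assumption-laden illustration: it reads each file fully into memory (fine for small files, not for disk images), ignores metadata and permissions, and the helper names are invented.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def tree_digests(root: Path) -> dict:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def backup_matches(source: Path, backup: Path) -> bool:
    return tree_digests(source) == tree_digests(backup)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "home"
    backup = Path(tmp) / "backup"
    source.mkdir()
    (source / "notes.txt").write_text("important", encoding="utf-8")
    shutil.copytree(source, backup)

    ok_before = backup_matches(source, backup)
    # Simulate silent corruption in the backup copy.
    (backup / "notes.txt").write_text("bit rot", encoding="utf-8")
    ok_after = backup_matches(source, backup)

print(ok_before, ok_after)
```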

twofornone · on May 24, 2022

Speaking about backing up...if one were interested in long term archiving, do magnetic platters offer longer lasting data integrity than SSDs in cold storage?

UI_at_80x24 · on May 24, 2022

>..do magnetic platters offer longer lasting data integrity than SSDs in cold storage?

Yes. With an SSD the enemy is electron leakage. Minute quantities of electrons trying to escape an unnatural state and return to equilibrium. (yes, I just anthropomorphized electrons.) Magnets however are more stable by nature. (yes there is nothing natural about hard-drive storage. SMR doubly so!)

Anecdote/anecdata: I have been able to retrieve full drives worth of data off of drives that have sat in a cardboard box for 10 years. I also have trouble accessing data on 1-year old USB flash drives.

supertrope · on May 25, 2022

The JEDEC standard specifies client SSDs have to retain data powered off for a year under worst case temperature. Enterprise drives have a relaxed requirement of three months. This is because lower programming voltages are used to achieve higher total bytes written endurance.

Even hard disks should be powered on occasionally to test backups.

bombcar · on May 24, 2022

In general I trust the older tech more than newer for long-term archiving. So that would mean HDD (the oldest tech thereof you can find still sold, probably) or tape or DVD over SSD.

But multiple copies in multiple formats cannot hurt, and the most important stuff should have multiple live copies.

unilynx · on May 24, 2022

it really depends on the format. pressed DVDs will outlast your VHS tapes

doublepg23 · on May 25, 2022

I've been coming around to the POV that "cold storage" is a bad idea and it's best to keep everything hot. It's been discussed on 2.5admins.com a lot.

Melatonic · on May 24, 2022

Not sure about that but I do know that the new sealed helium filled drives are much harder to take apart and do backup recovery on