
While I get that there are use cases for physical media, as both a data hoarder and data paranoiac (good things in my line of work), I've moved on. It's the data that matters, not the media.

To that end, I have automated two separate, but complementary, processes for my personal data:

- Sync to multiple machines, at least one of which is offsite, in near realtime. This provides rapid access to data in the event of a disk/system/building/etc failure. Use whatever software you want (Dropbox, Resilio, rsync scripts, etc), but in the event of failure, this solves 99% of my issues - I have another device that has fast access to my most recent data, current to within seconds. This is especially important when bringing up a new, upgraded system - just sync over the LAN. (Currently this is 4 devices, 2 offsite, but it flexes up/down over time occasionally.) There's a rough sketch of both processes after this list.

- Backup to multiple cloud providers on a regular cadence (I do hourly encrypted incrementals). This protects me against data loss, corruption, malware attacks, my stupidity deleting something, etc. This solves the remaining 1% of my issues, enabling point-in-time recovery for any bit of data in my past, stretching back many years. I've outsourced the "media" issue to the cloud providers in this case, so they handle whatever is failing, and the cost is getting absurdly cheap for consumers and keeps dropping. My favorite software is Arq Backup, but there are lots of options. (Currently this is 4 discrete cloud providers, and since this is typically non-realtime, it uses the coldest storage options available.)
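
A rough sketch of the two processes, purely illustrative: the host, bucket, paths, and storage class below are assumptions, not my actual setup, and real tools (Arq, restic, etc.) add the client-side encryption and true incrementals that this toy full-snapshot version skips.

    # Tier 1: near-realtime mirror to other machines; Tier 2: hourly push to
    # cold cloud storage. Assumes rsync is installed and boto3 credentials are
    # configured; "backup-host" and "my-backup-bucket" are hypothetical.
    import subprocess, tarfile, time
    import boto3

    DATA_DIR = "/home/me/data"                    # hypothetical data root
    MIRRORS = ["me@backup-host:/srv/mirror/"]     # hypothetical offsite mirror

    def sync_offsite():
        # Mirror the data directory to each target over SSH.
        for target in MIRRORS:
            subprocess.run(["rsync", "-az", "--delete", DATA_DIR + "/", target], check=True)

    def hourly_cold_backup():
        # Snapshot the data directory and push it to the coldest storage class.
        snapshot = f"/tmp/snapshot-{int(time.time())}.tar.gz"
        with tarfile.open(snapshot, "w:gz") as tar:
            tar.add(DATA_DIR, arcname="data")
        boto3.client("s3").upload_file(
            snapshot, "my-backup-bucket", snapshot.rsplit("/", 1)[-1],
            ExtraArgs={"StorageClass": "DEEP_ARCHIVE"},
        )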

Between these two complementary, fully automated approaches, I no longer have to worry about the mess of media failure, RAID failures, system failures, human error, getting a new system online, cloud providers being evil, etc.



> This protects me against…malware attacks

Are you sure about that? Many ransomware attackers do recon for some time to find the backup systems and then render those unusable during the attack. In your case your cloud credentials (with delete permissions?) must be present on your live source systems, leaving the cloud backups vulnerable to overwrite or deletion.

There are immutable options in the bigger cloud storage services but in my experience they are often unused, used incorrectly, or incompatible with tools that update backup metadata in-place.

I’ve encountered several tools/scripts that mark a file as immutable for 90 days the first time it is backed up, but don't extend that date correctly on the next incremental, leaving older but still critical data vulnerable to ransomware.
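
One way to avoid that failure mode, at least with S3-style object lock, is to re-apply the retention date on every object the latest snapshot still references, not just the newly written ones. A rough sketch (bucket and key names are hypothetical, and the bucket must have been created with Object Lock enabled):

    # Push the retain-until date forward on each incremental run so older but
    # still-referenced objects stay immutable. COMPLIANCE-mode retention can
    # be extended but never shortened.
    from datetime import datetime, timedelta, timezone
    import boto3

    s3 = boto3.client("s3")

    def extend_retention(bucket, key, days=90):
        s3.put_object_retention(
            Bucket=bucket,
            Key=key,
            Retention={
                "Mode": "COMPLIANCE",
                "RetainUntilDate": datetime.now(timezone.utc) + timedelta(days=days),
            },
        )

    # After each incremental:
    #     for key in keys_still_referenced:
    #         extend_retention("my-backup-bucket", key)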


I discovered recently that Microsoft OneDrive will detect a ransomware attack and give you the option to restore your data to a point before the attack!

MS need to advertise this feature more, because I'd never heard of it and assumed all the files on the PC were toast!

Of course, the fact that a script on Windows can be accidentally run and then quietly encrypt all the user's files in the background is another matter entirely!


Most cloud providers do this now. Encryption operations like this are relatively easy to detect.
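
Encrypted output is statistically close to random bytes, so even a crude per-file entropy check over a burst of modified files catches most of it. A toy illustration (the sample size and threshold are made up, and compressed formats like zip/jpg will also trip it, which is why real detectors combine multiple signals):

    # Toy heuristic: flag files whose contents have near-maximal Shannon
    # entropy (max is 8 bits/byte), as ciphertext does.
    import math
    from collections import Counter

    def shannon_entropy(data):
        counts = Counter(data)
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    def looks_encrypted(path, sample_bytes=4096, threshold=7.5):
        with open(path, "rb") as f:
            sample = f.read(sample_bytes)
        return len(sample) > 0 and shannon_entropy(sample) > threshold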


Actually I think almost all ransomware worms are totally automated. The attacker knows nothing about your network and backups; the worm just encrypts and deletes absolutely everything it has write access to.


Not true, even though that might depend on how valuable you are to a ransomware threat actor.

The DFIR report has some insights: https://thedfirreport.com/category/ransomware/


There are no delete credentials, and the WORM option is enabled when a provider supports it. I can ~always get back to a point in time.


Having no delete credentials does present a cost issue when moving away from a provider... I've accidentally left data behind (and kept paying for it) after I thought I'd deleted it. Worth the risk, though, and I learned my lesson.


You can set a lifecycle rule. You don't need credentials to delete.


Only if you grant permission to set a lifecycle rule...


You don't set the lifecycle rule at runtime; you set it at environment setup time. The credentials that put your objects don't have to have the power to set lifecycle rules.

You obviously don't put your environment setup user in your app. That would be utterly foolish.
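
As a concrete (entirely hypothetical) example of that split: the setup/admin credential applies the lifecycle rule once, and the runtime credential the backup tool uses only ever needs to put objects.

    # Run once at environment setup with the admin credential; the day-to-day
    # backup credential needs only s3:PutObject. Bucket name and the ~7-year
    # window are purely illustrative.
    import boto3

    admin_s3 = boto3.client("s3")   # assumed: this session uses the setup/admin credential

    admin_s3.put_bucket_lifecycle_configuration(
        Bucket="my-backup-bucket",
        LifecycleConfiguration={
            "Rules": [{
                "ID": "expire-old-backups",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},       # apply to the whole bucket
                "Expiration": {"Days": 2555},   # ~7 years
            }]
        },
    )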


Lifecycle rules aren't useful for me at environment setup time because I never want any of my data deleted. The only time would be if I decide to abandon that cloud provider.


And when you're moving providers you use your application credentials to do that? That makes no sense. This is nonsensical engineering. You'd use your environment credentials to alter the environment.


I'm not "engineering" anything - I'm just stopping a service. I close the account, or disable billing, or whatever that step requires. I don't even read the data back out or anything - just cancel. Doesn't really require "engineering".


You seem well placed to answer this one: how is the cost for this resilience? Compared to the cost of the storage itself? Including the cost of migrating from solutions that are withdrawn from the market?


The cost (I assume you're talking about "my time" cost?) is unbelievably low, in large part due to good software. It "just works".

Arq Backup, for example, lets you simply add/remove providers at will. It's happened multiple times: Amazon Drive changed (or went away? I forget...), Google Drive changed their Enterprise policies, etc... No big deal, I just deleted the provider and added another one. I still had plenty of working provider backups, so I wasn't worried while it took a day or two to fill the next provider. (Good argument for having 2+, I'd argue 3+, providers...)

Using notifications from your sync apps/systems/scripts/whatever is essential, of course, in case something fails... but all the good software has that built-in (including email and other notifications, not just OS notifications, which helps for remote systems).

At this point, it's nearly idiot proof. (Good for idiots like me ;)


I meant more the monetary cost. Nominally, cloud storage for one unit of storage and one unit of time is perfectly "fine". Except that it adds up: more storage, indefinitely held, multiple hosts. Data that needs to be copied from one provider to another incurs costs from both. Add to this routine retrieval costs, if you "live this way", and routine test retrieval costs if you know what's good for you.

So last time I looked, unit costs were low - sure. But all-included costs were high.


Certainly some of this simply comes down to "how valuable is my data?".

Currently, given the extremely low (and dropping YoY) cost of storing cold data at rest, the essentially free cost of ingest, and the high cost of retrieving cold data which I almost never have to do, the ROI is wildly positive. For me.

And since all of these things (how many providers, which providers, which storage classes, how long to retain the data, etc) are fine-tunable, you can basically do your own ROI math, then pick the parameters which work for you.
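
If it helps, the back-of-the-envelope version is just a few multiplications; every number below is a placeholder, not a quote from any provider:

    # Toy cost model for "do your own ROI math" -- all prices are hypothetical.
    def monthly_cost(tb_stored, providers=4, cold_price_per_tb=1.0,
                     tb_retrieved=0.0, retrieval_price_per_tb=20.0):
        storage = tb_stored * cold_price_per_tb * providers   # stored at every provider
        retrieval = tb_retrieved * retrieval_price_per_tb     # rare, but pricey when needed
        return storage + retrieval

    # e.g. 2 TB kept at 4 cold providers, nothing retrieved this month:
    # monthly_cost(2) -> 8.0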


I get some peace of mind (in both professional and business settings) from having my backups include a physically separable and 100% offline component. I like knowing an attacker would need to resort to kinetic means to completely destroy all copies of the data.

The physically separable component often lags behind the rest of the copies. It may only be independently verified on an air-gapped machine periodically. It's not the best copy, for sure.

I still take comfort in knowing attackers generally won't launch a kinetic attack.


That's fair. Nothing like peace of mind. :)

Conceptually, though, I think my separation of "sync vs backup", plus the use of discrete providers (both software and supplier), accomplishes the same goal. It's not very different, or is possibly just a level up, from "online media vs archive media". At least, it seems that way to me.


Mr Meteorite can launch such an attack, but as long as you have two physically gapped backups at a distance greater than the likely blast radius, you'll be fine.


Backups and archival are different things, with similar requirements, but different priorities. A backup doesn't care about data you think you'll never need again.


Are your backups encrypted and if so, how do you manage and backup the keys?



