I owe most of my career growth to the HN community. I never thought of this place as warm when I first joined. But now I feel attached to the HN crowd, especially for the unique perspective I get from the comments. No echo chambers!
What would be the best way to host a status page, apart from third-party services? I remember one project that used IPFS as a distributed status page. https://news.ycombinator.com/item?id=16273609
That third-party service has still got to be hosted somewhere though. What happens when the cloud they're running on goes down?
The rule should be: host your status page on your competitor's cloud. If you're AWS, host it in Azure; if you're Azure, host it in GCP; if you're GCP, host it in AWS. (Linode, DigitalOcean, OVH, etc. can do their own dance.)
This is mind-boggling. The flight is almost perfect, and considering this is a 6000 fps shot, I cannot even imagine how evolution could work such wonders. This is an insane level of optimization for my brain to comprehend.
How can you explain yesterday's outage (Facebook, Instagram, WhatsApp) to your parents?
You are feeling hungry and go to a food court. The food court (an open area) has a lot of options. You sit down in front of Domino's (Facebook), since you want to eat garlic bread. Now, you can't order from the counter directly; a waiter comes to your seat and takes your order. You order garlic bread from the waiter, but the guy at the Domino's counter has gone missing. Your order never reaches the chef in the kitchen because the Domino's counter guy is not there.
This explains why the Domino's (Facebook) ecosystem was down, but what about other vendors? They had nothing to do with Facebook.
To understand this, we need to go back to our food court again. Now, there are a lot of hungry people sitting outside Domino's. Since they were not getting an answer from one waiter as to why their food was not on their table, they started pestering all the waiters. Due to this, the majority of the waiters were busy trying to figure out where the Domino's counter guy went, and other food joints (read: websites) were not able to fulfil their own orders.
So although only Domino's was down, it appeared as if the whole food court (the Internet) was facing issues.
Counter Guy at Domino's - Facebook Nameservers
Waiters - DNS Servers (Cloudflare, Google, Akamai)
I'd just say that you had an address for Facebook in your address book. The page somehow vanished and you don't know their address any more. So you start phoning other people and knocking on their door to try and find what their address is. Everyone else is doing this and no one knows what their address is. So you've got millions of people phoning each other and knocking on doors.
Facebook being down was already an issue, but everyone phoning and knocking on doors was causing disruption to everyone else.
There is no need to make it any more complex than "Facebook, the company, messed up, now their properties are broken". An overly elaborate analogy just makes you sound condescending.
The Internet is just a bunch of computers interconnected via tons of cables (hence the name: "inter-networked computers").
To be reachable, every piece of equipment and every computer constantly needs to tell the others about its existence (to publicly announce on which network cable it can be reached).
Facebook engineers wanted to optimise that system but accidentally broke it during the update.
As a consequence, after a few minutes, other computers didn't know on which cables they could reach Facebook.
Facebook had to call the technicians sitting in the datacenter to cancel the last change that was done (because the Facebook engineers couldn't connect from the office themselves) and everything was fine again.
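For anyone who prefers code to cables, the announce/withdraw idea above can be sketched roughly as below. This is a toy model, not how real BGP routers work: the `RoutingTable` class, the "cable-A" label and the 192.0.2.0/24 documentation prefix are all made up for illustration, and none of it reflects Facebook's actual prefixes or tooling.

```python
import ipaddress


class RoutingTable:
    """Toy table mapping announced prefixes to the "cable" they were announced on."""

    def __init__(self):
        self.routes = {}

    def announce(self, prefix, via):
        # A network tells its neighbours: "this prefix is reachable via me".
        self.routes[ipaddress.ip_network(prefix)] = via

    def withdraw(self, prefix):
        # The announcement is taken back; neighbours forget the route.
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        # Pick the most specific announced prefix containing the address.
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # nobody knows which cable leads there
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", via="cable-A")  # hypothetical prefix and cable
before = table.lookup("192.0.2.10")            # route is known

table.withdraw("192.0.2.0/24")                 # the bad update
after = table.lookup("192.0.2.10")             # route is gone
```

Once the only announcement for a prefix is withdrawn, the lookup comes back empty even though the servers behind it are still running, which is the situation described above.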
That would have been accurate for a DNS outage; but with my layman's understanding of BGP, I would say the analogy would be something between "...but their phone line is broken" and "...but they disappeared from the phone book because they don't have a phone line any more".
Actually, it probably is, especially if you dial the analogy back a couple of decades, before the automated "We're sorry, that number has been disconnected" responses. Facebook's phone line went down, and when you call the Operator, even if you have the phone number, they can't connect you. But this is weird, and you aren't the only one trying to call Facebook, so now they are calling in other Operators to diagnose the problem, because surely someone has heard from Facebook recently.
That analogy includes the snowball effect on the other websites and services, as the Switchboard Operators get tied up puzzling out Facebook's problem instead of servicing calls for phone numbers that still work.
Has anyone tried to migrate to Backblaze? Their pricing seems really aggressive, but I am not sure if we can compare Amazon and Backblaze when it comes to reliability.
I love the folks at Backblaze but the single datacenter thing really worries me (and again, disclosure, I work on Google Cloud). If you're just using it as another backup, maybe that's less of a concern: your house would have to burn down at the same time that they have a catastrophic failure. But it is part of the reason you see S3 and GCS costing more (plus the massive amount of I/O you can do to either S3 or GCS; I'd be curious what happens when there's a run on the Backblaze DC).
Sorry if I wasn't clear: your bytes on GCS and S3 are stored across multiple buildings (GCS Regional, S3 Standard). More copies is more dollars not less ;).
As far as I am aware GCS does erasure coding across sites?
Backblaze could do multiple tiers of erasure coding and they would still be able to reduce prices given more scale, ceteris paribus.
It's not a question of number of replicas, data centers or technical implementation, but a question of pricing policy.
Does one want to use volume and scale to drive prices down (and cheaper prices to increase volume) or does one want to use volume and scale to bloat margins? Backblaze are arguably doing the former.
Does one want to lock customers into an ecosystem by enforcing excessive bandwidth prices or does one want to pass on bandwidth cost-savings to customers? Backblaze are arguably doing the latter.
Backblaze would continue to be cheaper because their pricing policy serves customers across all dimensions.
More scale is definitely less dollars not more (even if it means a fraction of a few more erasure coded shards across sites).
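To put rough numbers on the "fraction of a few more shards" point, here is a back-of-the-envelope sketch. The shard counts are made up for illustration and are not a claim about what Backblaze, GCS or S3 actually run; the only thing taken as given is the standard ratio for erasure coding, where k data shards plus m parity shards store (k + m) / k raw bytes per logical byte.

```python
def storage_overhead(data_shards, parity_shards):
    """Raw bytes stored per logical byte for a k+m erasure-coded layout."""
    return (data_shards + parity_shards) / data_shards


# Three full copies behave like 1 data shard + 2 "parity" shards.
triple_replication = storage_overhead(1, 2)

# Hypothetical 10+4 layout: any 4 shards can be lost without losing data.
erasure_10_4 = storage_overhead(10, 4)

# Hypothetical wider layout with extra shards spread across more sites.
erasure_10_6 = storage_overhead(10, 6)
```

Under these assumed layouts, going from 10+4 to 10+6 raises raw storage from 1.4x to 1.6x, still well short of the 3x that three full copies would cost.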
Thank you pg and dang!