
It is absolutely better to catch some errors than none.

In this case it gives me vibes of something going wrong after the CI pipeline, during the rollout. Maybe they needed advice a bit more specific than "just use a staging environment bro", like "use checksums to verify a release was correctly applied before cutting over to the new definitions" and "do staged rollouts, and ideally release to some internal canary servers first".
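To make those two suggestions concrete, here is a rough Python sketch. Everything in it is hypothetical: `verify_release`, `staged_rollout`, and the `deploy`/`healthy` callbacks are names made up for illustration, and nothing here reflects how the vendor actually ships updates.

```python
import hashlib
from typing import Callable, List, Sequence, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_release(payload: bytes, expected_sha256: str) -> bool:
    """Check that the bytes that landed on a host match the published checksum."""
    return sha256_hex(payload) == expected_sha256.lower()


def staged_rollout(
    stages: Sequence[Tuple[str, Sequence[str]]],
    deploy: Callable[[str], None],   # hypothetical: pushes the release to one host
    healthy: Callable[[str], bool],  # hypothetical: post-deploy health check
) -> List[str]:
    """Deploy stage by stage; halt at the first stage with an unhealthy host.

    Returns the names of the stages that completed successfully.
    """
    completed: List[str] = []
    for name, hosts in stages:
        for host in hosts:
            deploy(host)
        if not all(healthy(host) for host in hosts):
            return completed  # stop here; later stages never receive the release
        completed.append(name)
    return completed
```

With stages ordered as internal canaries, then a small early group, then everyone else, a bad release that fails the health check in an early stage never reaches the global stage.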



"Have these idiots even heard of CI/CD???" strangely seems to be a common condescending comment in this thread.

I honestly thought HN was slightly higher quality than most of the comments here. I am proven wrong.


Agreed. The worst part is that most of the people making these unhelpful comments are probably doing the same sorts of things that caused this outage.


Big threads draw a lot of people; we regress toward the mean.


> I honestly thought HN was slightly higher quality

HN reminds me of nothing so much as Slashdot in the early 2000s, for both good and ill. Fewer stupid memes about Beowulf Clusters and Natalie Portman tho.


I don't understand why you wouldn't do staged rollouts at this scale. Even a few hours' delay might have been enough to stop the release from going global.


They almost certainly have such a process, but it got bypassed by accident; the update probably got put into a "minor updates" channel (you don't run your model checker every time you release a new signature file, after all). Surprise: business processes have bugs too.

But naw, every random commenter on HN must know how to run the company better.


> (you don't run your model checker every time you release a new signature file after all)

Wonder if the higher-ups who mandated this software to be installed in their hospitals were informed about that fact.



