
Yes, this fits in with what I heard on the grapevine about this bug from a friend who knows someone working for Crowdstrike. The bug had been sitting there in the kernel driver for years before being triggered by this flawed data, which actually was added in a post-processing step of the configuration update - after it had been tested but before being copied to their update servers for clients to obtain.

Apparently, Crowdstrike's tests passed for the configuration data itself, but they didn't catch the flaw before it went out to production, because they were testing the wrong thing. Hopefully they own up to this, and explain what they're going to do to prevent another global-impact process failure, in whatever post-mortem writeup they may release.



You need to be a very special kind of stupid to think postprocessing anything after you've tested it is a good idea.


"We need to ship this by Friday. Just add a quick post-processing step, and we'll fix it next week properly" - how these things tend to happen.


In my first engineering job ever, I worked with this snarky boss who was mean to everyone and just said NO every time to everything. She also had a red line: NO RELEASES ON THE LAST DAY OF THE WEEK. I couldn't understand why. Now, 10 years later, I understand I just had the best boss ever. I miss you, Vik.


I still have a 10-year-old screenshot of a colleague breaking production on a Friday afternoon and posting "happy weekend everyone!" just as the errors from production started to flood into the same chat. And he really did just fuck off, leaving us to mop up the hurricane of piss he unleashed.

He was not my favourite colleague.


There’s someone from a company I worked at a few years ago who pumped out some trash in the last week before their two-month sabbatical (or three?). I remember how pissed I was seeing the username in the commit because I recognized it from their “see you in a few months” email lmao


How would the situation have been any different if this was released on a Monday?


> How would the situation have been any different if this was released on a Monday?

I work 9 to 5 Monday to Friday


Sorry, I understand. It's just that this was a huge outage, that's all.


"I heard on the grapevine from a friend who knows someone working for Crowdstrike" is perhaps not the most reliable source of information, due to the game of telephone effect if nothing else.

And post-processing can mean many things. It could be something relatively simple such as "testing passed, so let's mark the file with a version number and release it".


> Could be something relatively simple such as "testing passed, so let's mark the file with a version number and release it".

I'd argue you shouldn't even do that. When I've built CI systems that push stuff to customers, we've always versioned everything automatically, and the version that's in the artifact you've found to be good is exactly the one you're going to release.
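
To make that concrete, here is a rough sketch of the idea, not anything Crowdstrike or any particular CI system actually does. The names `ReleaseGate`, `record_tested` and `release` are made up for illustration. The point is only that the release step accepts the exact bytes that passed testing, identified by hash, so any post-processing after the test run gets rejected.

```python
import hashlib


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest that identifies an artifact's exact bytes."""
    return hashlib.sha256(data).hexdigest()


class ReleaseGate:
    """Hypothetical gate: only artifacts whose digest was recorded at test time may ship."""

    def __init__(self) -> None:
        self._approved: set[str] = set()

    def record_tested(self, data: bytes) -> str:
        # Called by the test stage once the artifact has passed.
        digest = artifact_digest(data)
        self._approved.add(digest)
        return digest

    def release(self, data: bytes) -> str:
        # Called by the publish stage; refuses anything that was not tested byte-for-byte.
        digest = artifact_digest(data)
        if digest not in self._approved:
            raise ValueError(f"artifact {digest[:12]} was never tested, refusing to release")
        return digest


gate = ReleaseGate()
tested = b"channel-file version=42 payload=..."  # illustrative bytes, version already baked in
gate.record_tested(tested)
gate.release(tested)  # fine: same bytes that were tested

try:
    gate.release(tested + b"\x00")  # "just a quick post-processing step"
except ValueError as err:
    print(err)
```

The version number goes into the artifact before testing, so there is nothing left to stamp on afterwards.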


Hmm, I post-process autonomous vehicle logs probably daily.

Why is this stupid? It’s pretty useful to see a graph of coolant temp vs ambient temp vs motor speed vs roll/pitch.

I must be especially stupid I suppose. Nuts.


That is not remotely what was meant.


Perhaps word choice and sentence structure are important then.


I don't think people should restate the basic context of the thread for every post... That's a lot of work and noise, and probably the same people who ignore the thread context would also ignore any context a post provided.


The post-processing in question is comparable to modifying the system under test after it has been validated, not to simply looking at recorded data.


At last an explanation that makes a bit of sense to me.

>Hopefully they own up to this, and explain what they're going to do to prevent another global-impact process failure

They probably needn't bother: every competent sysadmin from Greenland to New Zealand is probably disabling the autoupdate feature right now, firewalling it off and hatching a plan to get the product off their server estate ASAP.

Marketing budgets for competing products are probably going to get a bump this quarter.


I think Crowdstrike is due for more than just "owning up to it". I sincerely hope for serious investigation, fines and legal pursuits despite having "limited liabilities" in their agreements.

Seriously, I don't even know how to do the math on the amount of damage this caused (not including the TIME wasted for businesses as well as ordinary people, for instance those taking flights).

There have to be consequences for this kind of negligence.



