> hence you need to query _all_ the nodes or can end up in a split-brain situation and lose consistency

I am not sure this is true. It is prohibitively expensive to read from all replicas. All you need is a way to guarantee that the replica you read from has the latest state. See how Azure Storage does it [0]. Another example is Spanner [1], which uses timestamps rather than consensus for consistent reads.

[0] https://sigops.org/s/conferences/sosp/2011/current/2011-Casc... [1] https://cloud.google.com/spanner/docs/whitepapers/life-of-re...
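
To make the "you don't need to query all nodes" point concrete, here is a toy quorum-overlap check (my own illustration, not taken from either paper): with N replicas, if a write must reach W of them and a read must contact R of them, then R + W > N guarantees every read overlaps the latest write.

    from itertools import combinations

    # Toy check that read and write quorums always intersect when R + W > N.
    # The sizes are illustrative, not Azure Storage's or Spanner's actual settings.
    N, W, R = 5, 3, 3
    replicas = set(range(N))
    for write_set in combinations(replicas, W):
        for read_set in combinations(replicas, R):
            assert set(write_set) & set(read_set), "quorums must intersect"
    print("every read quorum includes at least one replica with the latest write")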


Yes, so to quote from it [0]:

> This decoupling and targeting a specific set of faults allows our system to provide high availability and strong consistency in face of various classes of failures we see in practice.

What that says is: for the failures we have seen so far, we guarantee consistency and availability. But what about other kinds of failures? The answer is clear: strong consistency is only guaranteed for _certain expected_ failures, not in the general case. Hence, generally speaking, strong consistency is _not_ guaranteed. As simple as that. Saying it is guaranteed "in practice" is, pardon, stupid wording.


Absolutely, the vast majority (95%+) of logs are never read by a human. Therefore, processing them up front is enormously wasteful. A good architecture writes once and doesn't touch anything until it is needed.

I spent years working on a system handling 50+ PB/day of logs. No database or ELK stack can handle that, and even if it could, it would be prohibitively expensive.


Where did you work? CERN?


It's adorable when people think scientific computing has the same scale as a Google or Microsoft.



Ignoring the fantasy b.s. in the second half of the article, the stuff at the top is exactly what I mean.

A mighty 400 GB/s: i.e. much less than the > 50 PB/day of logs the other person mentioned;

1600 hours of SD video per second: i.e. about 1-2 million concurrent HD streams, which is much less than the amount actually served by YouTube.

IBM Summit, "world's most powerful supercomputer": < 5000 nodes, i.e. well below the median cell size described in the 2015 Borg paper. Summit is a respectable computer, but it would get lost in a corner of a FAANG datacenter.


CERN is a correct example. The LHC reportedly generates 1PB per second: https://home.cern/news/news/computing/cern-data-centre-passe...


If you define “generates” to mean “discards” then yes.


It still gets processed, though; only the non-interesting events get discarded.

Otherwise the tape alone to store it on would exceed their total operating budget in a day, so they have to be a bit clever about it.


I think that even Google cannot store 1 PB/s in 2020.


The numbers are not fantasy at all - this will be a huge radio telescope: one square kilometre of pure collecting area and thousands of receiving antennas (for reference, Arecibo has around 0.073 km^2). We are talking data input to the correlator on the terabit/s scale. Technology demonstrations with ASKAP are well under way, and ALMA is working quite well by now as well (> 600 Gb/s with just a 50-antenna array).


it’s adorable how proud you are to have worked at FAANG and how angry you get at the idea some other organisation handles equivalent scale


touché


400GB/s is about 35 PB/day

not quite as big a difference
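
The rough conversion, for anyone checking (decimal petabytes, assuming that rate is sustained around the clock):

    # 400 GB/s sustained for a day, expressed in PB/day
    print(400e9 * 86_400 / 1e15)  # ~34.56 PB/day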


So, YouTube puts whole streams into their logs? Interesting.


> Expect at minimum 2 Medium difficulty questions in a phone interview that need to be able to be solved in optimal space time complexity in under 45 mins

I see this being repeated on Leetcode and Blind, but it is not my experience at all.

In 2019, I got Senior level (L5/E5) offers from everywhere I applied, including Facebook and Google. I never solved more than a single medium problem in 45m-1h. I was asked a hard problem maybe twice, and not at FB & G.

I also interviewed 100+ candidates at Microsoft and Google. I have seen hundreds of interview questions and feedback reports and I simply can't corroborate that statement.


Oh nice, I've been waiting for someone like you to reply. Thanks for replying; I need a perspective like yours as well.

Why do you think it happens this way? I have some other anecdata that I gathered from the forums, but I am not going to spew it here because it could be offensive/racist or whatever.

But what I think the more reasonable answer is: interviewers are humans, and a lot of it is based on luck. Maybe the interviewer is having a bad day? Maybe the interviewer has a favorite question that is super tricky and really wants to test candidates with it.

I also think that, due to your seniority, the FAANG companies already know that you know your stuff, since getting to L5/E5 is definitely not easy. Therefore you don't get a really hard question, because they don't need to determine whether you are a risk. They already know that you will perform great at the company.

Btw, if you are doing your interviews like this, I commend you. Thank you for not making life hard on people. I myself have gone through hundreds of Leetcode problems but still failed, due to the harsh questions that I've seen these days. And because of Leetcode I postponed learning other things that could serve me better in my career. It is stupid, but it is what it is. I know a lot of people can relate to this.

Anyway, please keep doing great work interviewing people! Maybe one day I will get my chance.


Googler here. I've probably done 150 interviews.

I've never once adjusted my question based on where the candidate was coming from. The idea that people would ease up if they saw FAANG on a resume or a 10+ year career is just not a thing I've ever seen.

There are a few truths in modern tech interviewing:

1. There is a large amount of randomness. Between the question that an interviewer chooses to ask and whether the interviewer is having a really bad day, you might just get screwed.

2. A lot of people are genuinely working really hard to do a good job interviewing. I've seen careful and detailed interview feedback and talked with a huge number of people who really care about interviewing well.

3. Interviewing has less oversight than ordinary work, so some people are bad at it and don't get feedback. Some people get bad luck and are assigned one of these interviewers.

4. Interviewees suck at evaluating their experience. This means that when somebody says "I did everything well except I missed some impossible optimal algorithm and they rejected me so that's bullshit" you should become skeptical.

5. Angry people get signal boosted. Online discourse around interviewing is dominated by people with bad experiences.


You answered some of my questions in some other HN posts. I remember you.

Thanks for the replies.

Btw, I am curious about one thing. Say an engineer has X years of experience at a FAANG company and then quits. When he/she reapplies to the company, is there anything that makes him/her preferred over other candidates? The reason I'm asking is this: I imagine that a FAANG engineer with more than X years of experience, say X = 10, must have better things to do than memorizing 1000 Leetcode problems. Therefore, if judged strictly on Leetcode-style questions, I imagine he/she would not do well.


I only have experience with this at Google.

If you leave and then re-apply, hiring managers will look at your past performance reviews when considering you as a candidate. If your past performance reviews were excellent, it'll be a lot easier to come back. If you left another FAANG company... it doesn't matter beyond maybe making it easier to get a phone screen.

I expect that everybody has better things to do than memorizing Leetcode problems. That's because I don't expect people to memorize problems at all. I strongly believe that candidates should be able to succeed at the question I ask despite never having seen it before in their lives.


That makes sense. Thank you.


#5 is very true, but also remember that outrage the online masses find particularly true and relatable gets signal-boosted even more.


What is considered a medium problem? Is this in terms of the problem levels found on Leetcode and such? I have yet to do the grind. My current job is great and pays at FAANG level; I hope I never have to engage in this crap.


Medium means a Leetcode problem tagged in the Medium category. It is quite blurry, to be honest, because some Medium questions should be Easy, some Easy questions should be Medium, some Medium should be Hard, and some Hard should be Medium.

That's good, keep that! Don't get involved in this crap. I have to do it out of necessity.


On leetcode they are literally marked "Medium". Mostly questions that involve some sort of BFS or DFS on graphs, figuring out the next available time slot in a meeting calendar, figuring out the best time to buy and sell a stock given the daily price, stuff like that.
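
To give a sense of the level, here is the usual single-pass answer to the buy-and-sell-a-stock one (my own sketch; names are illustrative, not from any particular interview):

    def max_profit(prices):
        # Track the cheapest price seen so far and the best profit
        # achievable by selling on the current day.
        min_price = float("inf")
        best_profit = 0
        for price in prices:
            min_price = min(min_price, price)
            best_profit = max(best_profit, price - min_price)
        return best_profit

    # max_profit([7, 1, 5, 3, 6, 4]) == 5  (buy at 1, sell at 6)

O(n) time and O(1) space, which is roughly what interviewers mean by "optimal space-time complexity" for a Medium.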


> Standing is denser than walking, so the throughput is greater

Doesn't this depend on the walking/standing density ratio and walking speed/escalator speed ratio?

Let's say the walking density is 1/2 of the standing density. If the walkers walk at the speed of the escalator (i.e. their speed relative to the ground is 2x the speed of the escalator), then the throughput is the same.
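
A quick back-of-the-envelope check of that claim, with made-up numbers chosen only to illustrate the ratios:

    # Throughput = density (people per metre) x speed relative to the ground.
    standing_density = 2.0    # people per metre of escalator (assumed)
    escalator_speed = 0.75    # metres per second (assumed)

    walking_density = standing_density / 2      # walkers leave more spacing
    walking_ground_speed = 2 * escalator_speed  # walking at escalator speed

    print(standing_density * escalator_speed)      # 1.5 people/s
    print(walking_density * walking_ground_speed)  # 1.5 people/s

Under those particular ratios the two lanes move the same number of people; change either ratio and one lane wins.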


Holborn's escalators are over 23 metres tall, as tall as a six or seven storey building. Very few people will walk up the entire escalator, essentially limiting utilisation to 50%.


"Automatic" is way to broad to be correct.

A DCT (Dual Clutch Transmission) like the Porsche PDK is not remotely comparable to a torque converter based transmission. Yet, both are categorized as "automatic".


And one of those costs approximately my annual salary and the other doesn't.

Telling people automatics are great when you are referencing a type that the majority of people will never have a chance of owning is a classic Hacker News manual-vs-automatic discussion trope.


Plenty of cars cheaper than a Porsche have a DCT.

Golf R, many Kias: e.g. a brand-new Kia Soul for $22k.


I bet you dollars to donuts that the average person doesn't spend 5 figures on a car. They spend 4 figures on a second-hand model.


In France, anyone can show up and take part in the manual counting.

It is done on-site, so there is no risk of tampering.

If you wanted, you could stand next to the transparent ballot box the entire day and then count yourself.

The results are available a couple hours later. I can't think of any downside.


It is Toulouse[0], not Tolouse.

[0] https://en.wikipedia.org/wiki/Toulouse


Many of the spellings seem to be historical and/or purposely antiquated.


The Occitan/Latin name is Tolosa. I have never seen any other intermediate form.


This is not on purpose. It is a misspelling on my part; fast fingers.


Disclaimer: I work at Microsoft, not on Service Fabric, but I have built complex stateful services on top of Service Fabric.

As zapita said, Service Fabric now handles containers but I think it is just because containers became trendy and FOMO kicked in.

Where Service Fabric is decades ahead of the container orchestration solutions is as a framework for building truly stateful services, meaning the state is entirely managed by your code through SF, not externalized to a remote disk, Redis, some DB, etc.

It offers high-level primitives like reliable collections [0], as well as very low-level primitives like a replicated log for implementing custom replication between replicas [1]. I feel this is not advertised enough publicly, which is unfortunate because it is a key differentiator for Service Fabric that the competitors won't have for a while, if ever, because it is a completely opposite approach: containers are all about isolation, being self-contained and platform independent, while SF stateful services are deeply integrated with Service Fabric.

[0] https://docs.microsoft.com/en-us/azure/service-fabric/servic...

[1] https://docs.microsoft.com/en-us/dotnet/api/system.fabric.fa...
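
To make the distinction concrete, here is a very rough sketch of the programming model in Python pseudocode (the real API is .NET's IReliableDictionary with transactions supplied by the SF runtime; every name below is illustrative, not the actual Service Fabric surface):

    # Toy "reliable collection" style counter: state lives in the service
    # process and is replicated by the framework, instead of being fetched
    # from Redis or a remote DB on every request.

    class ToyReliableDictionary:
        def __init__(self):
            self._data = {}

        def get(self, key, default=0):
            return self._data.get(key, default)

        def set(self, tx, key, value):
            # In the real framework the write goes to a replicated log and
            # only becomes visible once a quorum of replicas has it.
            tx.pending.append((key, value))

    class ToyTransaction:
        def __init__(self, store):
            self.store = store
            self.pending = []

        def commit(self):
            # Stand-in for quorum replication followed by local apply.
            for key, value in self.pending:
                self.store._data[key] = value

    def handle_increment(store, key):
        # Request handler mutates local, replicated state in a transaction.
        tx = ToyTransaction(store)
        store.set(tx, key, store.get(key) + 1)
        tx.commit()
        return store.get(key)

The point is that the hot path never leaves the replica, which is exactly what externalized-state container setups give up.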


I have had the exact opposite experience.

The first thing I was asked when I interviewed with Microsoft was what I wanted to do and what I wanted to avoid. I said, no frontend, something related to distributed systems and they put me up for interviews with an Azure team.

With Google, it was like "interview first, accept the offer before knowing where you are going to be placed, and we will put you in some team".


These are good anecdotes to share. A few years ago when I was a Microsoft college hire I got my impression from gathering anecdotes from others, but maybe it's better now!

Google absolutely has the problem you described and I forgot to mention it in my prior comment. With Google you have no clue what team you'll be on when you're evaluating the offer, but once you've accepted the offer you're given more choice with regards to placement.


No, but they are constrained to be less than or equal to 999,999.

See http://gitweb.dragonflybsd.org/dragonfly.git/blob/586c43085f...

