Anyone with experience in telecom will immediately recognize this as a traffic engineering problem easily solved with the Erlang formula: http://en.wikipedia.org/wiki/Erlang_(unit)
Not to be confused with the Erlang language. Erlang C assumes queuing, which is probably applicable to this case.
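For anyone who wants to try the numbers, here is a minimal sketch of the Erlang C calculation. It assumes Poisson arrivals and exponentially distributed service times (an M/M/c queue). The function names are made up for illustration; the 5.8 arrivals per hour and 10-minute service time are the figures quoted elsewhere in this thread.

```python
from math import factorial


def erlang_c(servers: int, offered_load: float) -> float:
    """Erlang C: probability that an arriving customer has to queue.

    offered_load is in erlangs: arrival rate divided by per-server service rate.
    """
    if offered_load >= servers:
        return 1.0  # demand meets or exceeds capacity, so everyone waits
    a, c = offered_load, servers
    waiting_term = (a ** c / factorial(c)) * (c / (c - a))
    no_wait_terms = sum(a ** k / factorial(k) for k in range(c))
    return waiting_term / (no_wait_terms + waiting_term)


def mean_wait_hours(servers: int, arrivals_per_hour: float, served_per_hour: float) -> float:
    """Average time spent queuing (excluding service) in an M/M/c queue."""
    load = arrivals_per_hour / served_per_hour
    if load >= servers:
        return float("inf")  # the queue never stops growing
    return erlang_c(servers, load) / (servers * served_per_hour - arrivals_per_hour)


if __name__ == "__main__":
    for tellers in (1, 2):
        wait = mean_wait_hours(tellers, arrivals_per_hour=5.8, served_per_hour=6.0)
        print(f"{tellers} teller(s): about {wait * 60:.0f} minutes in the queue")
```

Under these assumptions one teller comes out at roughly 4.8 hours of average queuing and two tellers at about 3 minutes, which is the jump the article describes.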
I think this would be far more informative if they started by explaining the single teller case: it isn't obvious how customers that arrive at a rate of slightly less than 1 every 10 minutes and take 10 minutes to serve would have an average waiting time of 5 hours.
I think it's non-obvious because it quickly gets into queuing theory. To simplify (and expose a flaw in the case[1]), assume a bank is open 9am to 5pm (8 hours). They'd expect to see 46(.4, but let's round) people in a day.
So there's two factors: flow rate and service time.
FLOW RATE:
Those people aren't coming in the door every 10.3 minutes: they're all coming in at 9:00 before work, 12:00 on their lunch breaks, or 4:45 right before closing. So you'd see 15-20 people all working through the doors within 10-20 minutes of each other. Ouch! This is modeled in queueing theory as a Poisson arrival process: arrivals are random around a steady average rate, so they still bunch up by chance.
SERVICE TIME:
Again, we're dealing with averages and distributions, but this time it's exponential. For every person who comes in with a check to deposit (let's say that takes 3m), there's another extreme case (let's say someone who wants to deposit $2,500 in Canadian pennies). As someone in the back of the line, you have to wait for all customers before you to be served before it's your turn.
All of a sudden you're in a 29-person line and waiting 5 hours. THAT math makes sense: 29 people × 10 minutes per person = 4.8 hours.
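The 29 is not arbitrary, by the way. Here is a quick check, assuming the textbook single-teller model (Poisson arrivals, exponential service, steady state) and the thread's figures of 5.8 arrivals per hour against 6 served per hour. The function name is just for illustration.

```python
def single_teller_snapshot(arrivals_per_hour=5.8, served_per_hour=6.0):
    """Steady-state averages for one teller (M/M/1 assumptions)."""
    utilisation = arrivals_per_hour / served_per_hour
    # Average number of people in the bank, including the one being served.
    people_in_bank = utilisation / (1 - utilisation)
    # A new arrival waits for everyone already there, at one service time each.
    wait_hours = people_in_bank / served_per_hour
    return people_in_bank, wait_hours


if __name__ == "__main__":
    people, wait = single_teller_snapshot()
    print(f"people ahead of you on average: {people:.1f}")
    print(f"average wait: {wait:.2f} hours")
```

So under those assumptions the average arrival finds about 29 people already in the bank, and 29 people at 10 minutes each is the 4.8 hours.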
THE FLAW [1]
Of course, this queue model is continuous: the bank doesn't open or close (as most banks do). Moreover, arrival times are deterministic: you can model based on a distribution, but you could quickly measure expected arrival times.
Process efficiency and queue theory are interesting topics. My favorite case is Toyota's six sigma production line engineers helping a NYC food kitchen cut wait times from 90 minutes to 18 minutes with simple adjustments:
That was depressing. The article details the adjustments Toyota made to the food kitchen's system:
> The kitchen, which can seat 50 people, typically opened for dinner at 4 p.m., and when all the chairs were filled, a line would form outside. Mr. Foriest would wait for enough space to open up to allow 10 people in. The average wait time could be up to an hour and a half.
> [Toyota] eliminated the 10-at-a-time system, allowing diners to flow in one by one as soon as a chair was free.
Talk about low-hanging fruit. This is quite literally a case of cutting waiting times by saying "hey, why don't we just stop telling people they have to wait?"
It's a non-obvious solution if you stop thinking of it in technical terms: they were effectively telling people (who arrived in groups) that they would not be allowed in when their group could fit, but instead would be forced to split up.
They're not arriving in groups. If they were arriving in groups, breaking up those groups so you could form them into platoons of 10 would still make no sense.
I'm reminded of an absolutely abysmal experience one time entering France through CDG. We arrived fairly early in the morning (I don't remember exactly, but let's say 7:30), a fully loaded jumbo jet, and so did about 6 other flights. The line to get through the passport check must have had over a thousand people in it. There was one officer on duty checking passports for the first hour we were in line, then two. Over on the "Schengen entry" side there were 4 or 5 officers so of course that side cleared out fairly quickly and those entry officers sat around doing not much.
We decided to entertain ourselves figuring out how long it would take to get through, assuming 1,000 people were in line (and also to figure out if we needed to make plans for meals while waiting).
Getting through a border usually takes no more than 2-3 minutes per person.
1,000 people will take 33.3 hours to process, assuming no major issues and 2 minutes per person.
When the second officer showed up, around 10:30am, we had moved just a few feet in 3 hours and had taken time to go get breakfast. We estimated that the single border officer had processed around 100 people in that time. More planes had emptied out behind us and the line snaked through as far as we chose to follow it during bathroom and food breaks.
So we assumed we had about 900 people in front of us...with 2 guards that worked out to only 15 more hours of waiting.
Eventually, a little after 1pm (and our lunch), somebody had the bright idea to move some of the officers from the Schengen area over to the international entrance. At this point the entire international terminal at CDG must have been clogged with people, and we saw a handful of major arguments and one fist-fight in our part of the line. We had moved forward about 25% of the way at this point and were trying to figure out which overpriced crappy snack kiosk we wanted to get our dinner at. They added 5 more officers (a total of 7) and they cleared us through fairly quickly after that.
Since then, I've had obvious questions about information asymmetry with respect to capacity planning. Border control could use the arrival schedule (which was known months in advance) to estimate the number of needed officers at any given time of day 90 days in advance. By setting a maximum queue wait time, they could decide capacity. The unknown variable would be international passports vs. Schengen, but past history could provide prior ratios useful for future planning.
I don't know if they ever availed themselves of this information, but I was not impressed with the DCPAF's (or whichever agency it was) planning or execution.
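The back-of-envelope arithmetic in this story is easy to turn into a planning rule. This is only a rough sketch: it treats the line as a fixed backlog processed at a constant 2 minutes per person (the figure assumed above), ignores new arrivals and randomness, and the function names are hypothetical.

```python
from math import ceil


def hours_to_clear(people, officers, minutes_per_person=2.0):
    """Hours for a fixed backlog to get through passport control."""
    return people * minutes_per_person / officers / 60.0


def officers_for_target(people, max_wait_hours, minutes_per_person=2.0):
    """Smallest number of officers that clears the backlog within the target."""
    return ceil(people * minutes_per_person / (max_wait_hours * 60.0))


if __name__ == "__main__":
    print(f"1,000 people, 1 officer: {hours_to_clear(1000, 1):.1f} hours")
    print(f"900 people, 2 officers: {hours_to_clear(900, 2):.1f} hours")
    print(f"officers to clear 1,000 people in 1 hour: {officers_for_target(1000, 1.0)}")
```

With a published arrival schedule you would feed in the expected passenger count per time slot instead of a single backlog figure, then pick the maximum wait you are willing to inflict.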
And this is why CDG regularly wins my "worst airport in the world" award. Well, this and many other reasons like really really poor signage, the long distance from the city with transport options being limited to crowded trains or using road vehicles that have to share the A1 with about 500,000 other road users, and the distinct lack of things to do to distract yourself during the inevitable long waits. It's also the only airport in the world where officials managed to lose my residency papers - lucky I noticed that they weren't handed back to me with the rest of my documents, but I had to argue with the customs officials for 25 minutes before they would even believe me that they hadn't given them back to me. It then took another 25 minutes before they found them on the floor under a desk, which resulted in my missing my flight. No apologies, just a shrug of the shoulders - and this after I had remained extremely calm and pleasant during the whole process because losing it just makes things worse in that type of situation, and may even leave you in gaol.
That's the "steady state" issue. You have to imagine a bank that has been open an infinite amount of time with people arriving at an average of 5.8 times an hour during that entire period. It will eventually reach an equilibrium with a very high wait time (most specifically, the expected value of the wait time approaches 5 hours as the queuing process continues). The key insight here is that clusters of people coming in within a short time (which will happen) propagate into future wait times, while clusters of nobody coming in do not -- if the line is empty, it doesn't matter for your wait how long it has been empty, but if the line is clogged, it matters for your wait time how long it has been clogged.
That's kind of a silly model for a bank, but it makes more sense for a server (which, given the time span of most network requests, might as well have been up for an infinitely long time).
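The asymmetry between clusters and gaps shows up clearly if you simulate it. Below is a small sketch using the standard Lindley recursion for a single server. The exponential arrival gaps and service times, the seed, and the function names are all assumptions for illustration, not something taken from the article.

```python
import random


def lindley_waits(gaps, services):
    """Queue wait of each customer at a single teller (Lindley recursion).

    gaps[i] is the time between customer i-1 and customer i arriving;
    services[i] is how long customer i spends at the teller.
    """
    waits = []
    wait = 0.0
    for i, gap in enumerate(gaps):
        if i > 0:
            # Backlog left by the previous customer, minus the time that
            # passed before this one arrived. Idle time is not banked: the
            # max() throws away any gap longer than the backlog.
            wait = max(0.0, wait + services[i - 1] - gap)
        waits.append(wait)
    return waits


def simulate(customers, arrivals_per_hour=5.8, served_per_hour=6.0, seed=0):
    """Waits (in hours) for a run of customers with random gaps and services."""
    rng = random.Random(seed)
    gaps = [rng.expovariate(arrivals_per_hour) for _ in range(customers)]
    services = [rng.expovariate(served_per_hour) for _ in range(customers)]
    return lindley_waits(gaps, services)


if __name__ == "__main__":
    for customers in (46, 200_000):
        waits = simulate(customers)
        print(f"{customers} customers: mean wait {sum(waits) / len(waits):.2f} hours")
```

The short run is roughly one 8-hour day starting from an empty bank, and the long run is a stand-in for "open forever". At this load the long-run average converges slowly, so expect it to wander with the seed.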
If the teller takes 10 min (on average) to serve someone and people arrive less often than 1 in every 10 minutes, the average wait time is going to be finite. If people arrive more often than 1 in every 10 minutes, the queue is going to grow infinitely large, and the average wait time is infinite.
So as the frequency with which people arrive approaches 1 in 10 minutes (from below), the average wait time will increase dramatically, reaching infinity at 1 person / 10 min.
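To see how sharply it blows up, here is a small sketch that sweeps the arrival rate toward the teller's capacity of 6 per hour. It uses the standard single-server formula and assumes Poisson arrivals and exponential service times; the specific rates in the loop are just sample points I picked.

```python
def single_teller_wait_minutes(arrivals_per_hour, served_per_hour=6.0):
    """Average queuing time for one teller under M/M/1 assumptions."""
    if arrivals_per_hour >= served_per_hour:
        return float("inf")  # arrivals outpace service, the line grows forever
    utilisation = arrivals_per_hour / served_per_hour
    return 60.0 * utilisation / (served_per_hour - arrivals_per_hour)


if __name__ == "__main__":
    for rate in (3.0, 5.0, 5.5, 5.8, 5.9, 5.99, 6.0):
        wait = single_teller_wait_minutes(rate)
        print(f"{rate:>5} arrivals/hour -> {wait:.0f} minutes average wait")
```

The teller is already 50% busy at 3 arrivals per hour, yet nearly all of the wait shows up in the last few percent of utilisation.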
When you have a queue of N people, with service time T, an additional 1 person causes an additional N*T of net wait-time. So the net wait-time grows super-linearly with line length.
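Spelled out, assuming every person takes exactly T minutes (a simplification) and using a made-up function name:

```python
def total_wait_minutes(people, service_minutes):
    """Combined waiting time of everyone in line with a fixed service time.

    The person at the front waits 0, the next waits T, the next 2T, and so on.
    """
    return service_minutes * people * (people - 1) / 2


if __name__ == "__main__":
    before = total_wait_minutes(29, 10)
    after = total_wait_minutes(30, 10)
    print(f"29 people: {before:.0f} minutes of waiting in total")
    print(f"one more person adds {after - before:.0f} minutes")
```

The total is quadratic in the length of the line, which is why a line that is twice as long feels about four times as bad in aggregate.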
There is a good chance that someone eventually arrives towards the middle of a service time. Once there is a one person line, it is very likely that every subsequent arrival has to wait.
The inverse corollary being the essay (I can't find it) about the sociology of lines at Tokyo Disneyland.
Question: what happens when there's no line for a specific ride at Tokyo Disneyland?
Answer: nobody wants to ride the ride!
Second question: what happens when there's a line for a specific ride at Tokyo Disneyland?
Answer: everybody starts queuing up for that ride, even if they don't know which ride it is!
If the one teller is 100% busy serving customers, then that's efficient. The queue is clearly temporary and due to demand spikes.
If two tellers are idle 70% of the time, that's inefficient and one of them should be relocated.
I've had similar discussions with non-tech management about server utilisation (why should we buy another server when we're only using 80% capacity of the ones we've got?)
Appeal to their marketing side: excess peak demand represents angry clients who will take their business to the bank across the street, which does have two tellers and no wait time.
It helped me to visualize the problem. I don't know if HN's formatting will kill this but here goes. The idea is that six customers arrive at 9:00, 01, 02, 03, 04, 05, and 06 - then map them out to "What happens when you add a new teller?" I was trying to understand the author's explanation of how adding one teller can take the wait from 5 hours to 3 minutes.
Edit - the text version was a disaster here. Here is an image of what I used to try to understand:
Queues are everywhere - your messaging queue, the threadpool, the hardware threads, and other layers of the stack and APIs you use. The video adds the interesting detail that as you add more tellers (workers) you learn of impending disaster only in the outlier p99 (or higher) latencies; by the time your p85 latency rises, you're already about to stall out.
This analogy will not resonate with the under-30 crowd. I realized the other day that my kids don't have firsthand recollection of what "going to the bank" might be.
Under-30 here, and the analogy makes sense. I have to go to my bank somewhat regularly for things like check cashing, transferring funds between accounts and getting financial advice. I think you're mistaking people under 30 for people who haven't lived by themselves and handled their own finances, which is a subgroup of under-30s.
I think it would make a -lot- more sense for the banks to be open from 'just after lunch' until normal close time on the weekends, than 'why are we open this early' until lunch.