As previous news articles state, Facebook has some implementation of XMPP going on. XMPP was designed from the ground up to deal with exactly the issues that he highlights, and is the ideal real-time implementation for any system where everyone is expected to be aware of the statuses of all others on the network (versus the traditional "poll the server every x seconds" methods).
Even if Facebook isn't using XMPP per se, they have full access to its implementations and source code for internal use for sure.
Granted, Facebook _does_ have the "slight" challenge of having 70 million active users, next to which nearly everyone else's IM/XMPP networks are a mere pittance, but the core framework and algorithms are wholly addressed and implemented in the XMPP standard.
It's one thing to make a more efficient implementation of an already-existing standard that scales damn decently, versus _designing_ a whole new system to serve their needs.
Note that the article doesn't once mention XMPP though.
Err... as far as I understood their Jabber/XMPP announcement, it is used for interoperability and integration with third-party products only, not for the internal implementation.
So it's only natural that the article doesn't mention XMPP.
I can't say that I prefer in-browser chats, but I can appreciate the complexity of the solution. Scalability is the new manual memory management. I can't help but think that somebody's going to come up with the scalability equivalent of a garbage collector and make our lives a lot easier.
No, they're not sending notifications at every event. As far as I can tell, they're using an asynchronous algorithm that lazily propagates events and provides no responsiveness guarantees. (Sort of ultra mushy, stretchy, "unreal time" guaranteed.)
Sorry, but how is that different to sending all notification events to all users? You are still sending all notification events to all users, whether you do it lazily or not!
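To make the eager vs. lazy distinction above concrete, here is a minimal sketch of one way lazy propagation could cut traffic: buffer status changes and deliver only the latest one per user at flush time. This is a guess at the general idea, not Facebook's actual algorithm; the class name `LazyPresenceFanout`, its methods, and the sample users are all hypothetical.

```python
from collections import defaultdict


class LazyPresenceFanout:
    """Hypothetical sketch: buffer presence changes, deliver only the latest per user."""

    def __init__(self, watchers):
        # watchers: dict mapping a user to the set of users who see that user's status
        self.watchers = watchers
        self.pending = {}  # user -> most recent status not yet delivered
        self.sent = 0      # number of notifications actually delivered

    def update(self, user, status):
        # Overwrite any earlier undelivered status; intermediate states are dropped.
        self.pending[user] = status

    def flush(self):
        # Deliver one notification per (changed user, watcher) pair, then clear the buffer.
        delivered = defaultdict(dict)
        for user, status in self.pending.items():
            for watcher in self.watchers.get(user, ()):
                delivered[watcher][user] = status
                self.sent += 1
        self.pending.clear()
        return dict(delivered)


fanout = LazyPresenceFanout({"alice": {"bob", "carol"}})
for status in ["online", "idle", "online", "offline"]:
    fanout.update("alice", status)
batch = fanout.flush()
```

In this toy run alice changes status four times between flushes. Eager delivery would send 4 x 2 = 8 notifications; the lazy version sends 2, at the cost of watchers never seeing the intermediate states. So under this assumption it is not the same as sending every event to every user.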
"The secret for going from zero to seventy million users overnight is to avoid doing it all in one fell swoop. We chose to simulate the impact of many real users hitting many machines by means of a "dark launch" period in which Facebook pages would make connections to the chat servers, query for presence information and simulate message sends without a single UI element drawn on the page. With the "dark launch" bugs fixed, we hope that you enjoy Facebook Chat now that the UI lights have been turned on"
Oh, they spilled the beans on using Erlang two weeks ago:
This is quite a remarkable piece of software engineering, very impressive stuff. I'm really glad they're open enough about it to share their techniques.
Also, I'd never heard of doing a "dark launch" before, but it sounds like a fantastic way to get early feedback from users.
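For anyone else new to the term, the "dark launch" from the quoted note can be sketched roughly like this: every page load generates real backend traffic, but the chat UI stays hidden. This is only an illustration under my own assumptions; `ChatBackend`, `load_page`, and the `chat_ui_visible` flag are hypothetical names, not anything from Facebook's code.

```python
class ChatBackend:
    """Stand-in for the chat servers; it only counts the load it receives."""

    def __init__(self):
        self.presence_queries = 0
        self.messages = 0

    def query_presence(self, user_id):
        self.presence_queries += 1
        return []  # a real server would return the user's online friends

    def send_message(self, user_id, text):
        self.messages += 1


def load_page(user_id, backend, chat_ui_visible=False):
    """Build a page; during the dark launch the chat traffic runs but no UI is drawn."""
    page = ["<profile>"]
    backend.query_presence(user_id)  # real query, result unused while dark
    if not chat_ui_visible:
        # Synthetic send standing in for messages users cannot type yet.
        backend.send_message(user_id, "simulated message")
    else:
        page.append("<chat-bar>")
    return page


backend = ChatBackend()
pages = [load_page(user_id, backend) for user_id in range(100)]
```

After the loop the backend has seen 100 presence queries and 100 simulated sends, while none of the pages contains the chat bar. Flipping `chat_ui_visible` is the "UI lights turned on" step.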
Also, they did stage the launch over a few weeks. I noticed it appear on my Facebook page (because I'm in the Stanford network) well before it appeared on most of my friends' pages.
I think the take-away here is the "dark launch" mentioned in the last paragraph, not necessarily the behind-the-scenes tech. Although, nice win for Erlang here.
First time I have heard a company publicly mention pushing features behind the scenes and testing in real time. Ajax makes this possible nowadays.
That indeed is amazing innovation. Who would have thought Zuckerberg's team would be open about their innovations? They are usually quite secretive about their future plans. I think having this kind of conversation with their user/developer community is amazing. More companies need to do this and get dirty with the technical stuff, not just give a high-level talk.
I correctly guessed at the use of Erlang for the web servers; persistent connections and pushing is a must and Apache is hardly designed for so many persistent processes. Thrift was also a pretty easy call considering it's a FB project; I still want to check that out, too.
The information wasn't incredibly in-depth, but it's very cool and useful nonetheless to read about implementations like this on such a large scale. The chances of me ever creating something with the scale and resources that FB requires are pretty slim, but it's gratifying to know I've at least got a rough idea of some good ways to do it.
Now, if we could just get Twitter to do the same, perhaps someone could give them a few pointers... ;)
"Did I miss it, or does the note not mention how they actually implemented the notification?"
No, it doesn't go over that implementation, though it piqued my curiosity nonetheless. I would assume it's a time-based check on status rather than a real-time representation.
Considering they have 10,000+ servers, I don't think "scaling" is that big a feat.
So say they have every active user on at the same time (70 million), and say they have 10,000 servers. That's only 7k users per server??? Not great IMHO.
Maybe if they only had 1,000 servers, then it'd be a little more impressive.
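Spelling out the back-of-the-envelope math, using only the figures from this thread (70 million active users, an estimated 10,000 servers). Note the assumptions: everyone is online at once, and every server in the fleet handles chat, which is almost certainly not the case.

```python
ACTIVE_USERS = 70_000_000  # figure quoted in the thread


def users_per_server(servers, active_users=ACTIVE_USERS):
    # Worst case: every active user online at once, spread evenly over all servers.
    return active_users // servers


per_10k = users_per_server(10_000)  # 7,000 users per server
per_1k = users_per_server(1_000)    # 70,000 users per server
```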
Coordinating something that large is a feat. At that scale, everything is more difficult: deploying updates to 10,000 machines, redundancy, upgrading, etc. Not to mention avoiding bandwidth, memory, and process limitations, load balancing, and testing it all, which they needed a clever solution to achieve.
"Facebook does not disclose the number of servers it operates. But research firm Data Center Knowledge puts the tally at about 10,000. The slug of cash will help Facebook buy approximately 50,000 more servers"
60,000 servers? Jesus Christ. Are they planning to scale to take account of alien users or something?
I realized recently that Facebook is trying to rebuild the entire Web inside their site. Home pages, check. Email, check. IM, calendaring, photo sharing, dating, check. Next they will add VoIP, photo editing, an office suite...
So how many servers do you need to replace the entire Web? 60,000 doesn't sound like enough.
The only reason this is considered impressive is because so many other services have set the bar so low. This isn't rocket science, it just requires thinking ahead and designing for scale.
Agreed. I find it ironic, since a piece like this is likely written at least partially in an effort to attract programmers to Facebook. To me it reads: "Come work for Facebook and re-invent online chat. Again." Now, if the article had been about how they pushed the state of the art, I would be pretty interested, but none of this is new technology.
As for the dark launch thing, it is a fancy trick, but there are ways of doing load testing in an automated system by having test servers simulate the load from real users. This usually can give you much better data without wasting bandwidth, slowing down users' experiences, etc.
Yes, it has been done before... But I don't think it has been done in an environment where you need to scale up quickly.
They had a problem, they solved it in a very cool fashion using the best tool for the job (Erlang), they thought of a very good way of testing the application, and they were gracious enough to share their experiences with us, other developers...
Seems like a win all around, so I'm not going to complain.