Solid, well-researched and comprehensive article. One small correction:
> Unicorn is built to cut off long running worker processes after a default of 30 seconds via SIGKILL. This is why Unicorn says it's built for "fast clients" - because anything beyond that cut-off is subject to termination.
This is not what "fast clients" refers to. Unicorn being built for fast clients means that it is only meant to serve local network clients. Or, in other words, it's not meant to communicate with end-user connections/clients directly (slow clients), rather, it's meant to run behind a proxy server like nginx or apache.
The "fast client" thing is basically saying, "this is an application server, not a web server", meaning, it's good at running application code, not managing the complexities of serving web requests to public internet (slow) clients. They're called slow clients because, well, they're slow ;) They have variable quality connections, bandwidth / transfer rates, and client capabilities (http 1.0 vs 1.1, 2.0, spdy, etc). Being a fast client means you can expect that you'll receive http requests at a relatively constant rate, basically whatever the local network or file system will allow, which is extremely fast (comparatively).
So, unicorn being a "fast client" just means that it's good at talking over local network connections (or, preferably, unix sockets) with a web server, and never on port 80 serving public http traffic.
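The "fast client" arrangement described above shows up directly in a typical unicorn config: it listens on a local unix socket and nginx fronts the slow public clients. A minimal sketch (socket path and worker count are illustrative, not recommendations):

```ruby
# config/unicorn.rb -- sketch of the "fast clients only" deployment.
# Unicorn listens on a unix socket; nginx, which handles the slow
# public internet clients, proxies requests to it.
listen "/tmp/unicorn.sock", backlog: 64

worker_processes 4  # one per core is the usual starting point
timeout 30          # the hard worker cut-off discussed above
```

nginx then points an `upstream` block at that socket; unicorn itself never binds to port 80.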
The best and clearest explanation of the different options I've read yet. Thanks guys. I've read lots of threads on SO about the different choices ("Puma vs. Unicorn") and everyone has their suggestions, but this is the first to actually dive in and compare and contrast the differences (and when and where to use each).
As a longtime fan of Thin, I've been very impressed lately by Reel[1], which is built on Celluloid::IO. In particular, Webmachine-ruby[2] is a nice framework that supports Reel.
While arguably not popular enough (yet) to be considered in the comparison here, I have a feeling it will become a major contender as it matures.
If you're looking for high concurrency, or doing things like WebSockets/SSE, I'd recommend taking a look.
In practice, what does this mean for your typical Rails app?
Is Rails (3.2/4.0) thread safe? Are most gems? For app code, presumably you'd have to avoid class variables and class instance variables. Anything else?
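To make the class-level-state concern concrete: shared mutable state on a class is exactly what breaks under concurrent request dispatch, and a Mutex is the usual fix. A self-contained sketch (the `HitCounter` name is made up for illustration):

```ruby
# Hypothetical class-level state shared across threads -- the kind of
# thing that bites you under a threaded app server.
class HitCounter
  @count = 0
  @lock  = Mutex.new

  class << self
    attr_reader :count

    # Without the Mutex, `@count += 1` is a read-modify-write that two
    # threads can interleave, silently losing increments.
    def increment
      @lock.synchronize { @count += 1 }
    end
  end
end

threads = 8.times.map do
  Thread.new { 1_000.times { HitCounter.increment } }
end
threads.each(&:join)

puts HitCounter.count # => 8000 with the Mutex in place
```

Drop the `synchronize` and the final count can come up short on a truly parallel runtime like JRuby or Rubinius.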
Rails 3.2/4 are thread-safe -- or at least, they are architected and intended to be. So few people actually use Rails in a multi-threaded deploy environment -- and concurrency bugs can be so tricky -- that it would not surprise me if there are still some bugs hiding out.
There really is such a huge performance advantage to a multi-threaded deploy environment -- EVEN with MRI, for the typical I/O-bound webapp, although you'd definitely want both multi-process and multi-threaded under MRI GIL. (Both puma and passenger enterprise can give you this, although not passenger free).
I'm hoping people start to catch on to this, and multi-threaded app servers get more popular, making them more mature and feature-rich, and in turn flushing out remaining bugs in Rails, causing yet more interest in multi-threaded deploy environment, virtuous circle blah blah
It's worth pointing out that the 'thread-safe mode' is really "allow concurrent request dispatch" mode. It doesn't make Rails any more or less thread-safe -- the `thread_safe!` declaration is about you telling Rails that YOUR code is thread-safe, and thus it's safe for Rails to allow overlapping concurrent requests to be processed, instead of forcing requests to be processed serially. (Still requires an app server that can dispatch requests concurrently, for it to actually happen. Basically puma or passenger enterprise).
But yes, in Rails 4 that's the default mode, for Rails to allow concurrent overlapping requests. Meaning Rails 4 assumes by default that any local or gem code is thread-safe (may or may not be a safe assumption of course).
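In config terms, the difference between the two versions looks roughly like this (a sketch, not a complete production config):

```ruby
# config/environments/production.rb (sketch)

# Rails 3.2: you opt in to concurrent request dispatch explicitly,
# asserting that your code (and your gems) are thread-safe:
config.threadsafe!

# Rails 4: concurrent dispatch is the default and `threadsafe!` is
# deprecated; to force serial request processing you'd have to opt
# *out* by re-adding the Rack::Lock middleware, e.g.:
#   config.middleware.insert 0, Rack::Lock
```

Either way, you still need an app server that actually dispatches requests concurrently for any of this to matter.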
Both Rails 3.2 and Rails 4 (and really older Rails for quite some time) are intended and architected to be thread-safe. Modulo bugs. There were concurrency bugs in older versions for sure, which we know because so many of them have been fixed.
I'm not sure why you qualified with MRI here. This is true on any runtime. Using Puma without threading (and thus requiring the code you run under it to be threadsafe) would be completely pointless.
I'm not trying to nitpick, I just don't want anyone getting the impression this isn't true under jruby or rubinius.
This article seems to have missed some major differences between Passenger and the other app servers. In my comparisons, I've found Passenger difficult to compare to other app servers on the basis that it aspires to be more than its competition. Specifically, Passenger rolls in its own watchdog [1] and management toolchain (utilities like passenger-status and passenger-memory-stats). Passenger also tries (and succeeds) in minimizing your investment in configuration and tweaking of your app server. These are things you have to roll your own solutions to if you are using other Ruby app servers.
This is not to say you shouldn't use those other app servers, or that Passenger always wins on performance, but I've always forced myself to ask the question: how will I benefit from the performance advantages of other app servers?
Another thing to keep in mind is that you have to look very carefully at your performance issues to understand whether your app server is actually your performance bottleneck. App servers get a lot of attention. There are droves of articles written comparing the maximum request rate achievable under circumstances that very few applications operate. The bottom line is that if your application spends most of its time executing Ruby code, your app server isn't your bottleneck. There is the possibility that you'll see an improvement in memory usage, but you need to test your application rather than rely on generalized assumptions. The improvements might not be worth the trade-offs.
Can you provide any more details on your stack and what you're seeing that blows away alternatives?
I'm running a staging version of our Rails 3.8 app with MRI 2.1 and Puma in clustered/threaded mode and everything looks pretty good .. but I haven't been able to throw a ton of concurrent traffic at it yet. Potential thread safety issues give me the niggles, not quite ready to replace Unicorn on production just yet.
> Can you provide any more details on your stack and what you're seeing that blows away alternatives?
The really big thing is how just one puma worker with 8-16 threads can actually replace a set of about 6-8 unicorn/passenger worker instances and give you about the same level of performance.
That level of memory savings lets you use smaller boxes or put more puma workers on one box and eliminate others. Admittedly I'm not doing twitter request/minute numbers but I still was very pleased with my findings.
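A clustered/threaded Puma setup like the one described is a few lines of `config/puma.rb`; the worker and thread counts below are illustrative, not tuned recommendations:

```ruby
# config/puma.rb -- illustrative numbers, tune for your own app
workers 2        # OS processes, forked after preload
threads 8, 16    # min, max threads per worker

preload_app!     # load the app before forking to share memory (CoW)

on_worker_boot do
  # Per-process resources (like DB connections) must be re-established
  # after fork; connections can't be shared across processes.
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```

Under MRI you generally want both knobs (multiple workers for CPU parallelism, threads within each for I/O concurrency), which matches the "clustered/threaded mode" mentioned above.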
Theoretically, the number of puma threads you're running should be the same as the number of unicorn processes. That is, equal to the number of available CPU cores on the machine. So, if you're running on a machine with 8 cores, you should run 8 processes or threads.
What continues to be the major issue with anything thread-based in ruby is that to reliably reach desired performance at load in a threaded setup, you need to be running an interpreter that can make concurrent use of native system threads. MRI has a global interpreter lock, so two pieces of ruby code will not run at the same time. IO is not subject to the GIL, so a lot of what happens in a web app can run concurrently (DB calls, etc), but other things that all happen in ruby (routing, view rendering) cannot.
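The "IO is not subject to the GIL" point is easy to demonstrate: MRI releases the lock while a thread waits on I/O, so I/O-bound threads genuinely overlap. A small runnable sketch (using `sleep` as a stand-in for a DB call or HTTP request):

```ruby
require "benchmark"

# Four "requests" that each spend 0.2s waiting on I/O. sleep releases
# MRI's GIL, just like a blocking DB or network call would.
io_bound = Benchmark.realtime do
  threads = 4.times.map { Thread.new { sleep 0.2 } }
  threads.each(&:join)
end

# Because the GIL is released during the wait, the four threads
# overlap: total wall time is ~0.2s, not 4 * 0.2 = 0.8s.
puts format("4 x 0.2s of I/O across threads took %.2fs", io_bound)
```

Pure-Ruby work (routing, view rendering) gets no such overlap under MRI, which is why the parent suggests JRuby or Rubinius when you want real thread parallelism.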
So, basically, to reliably achieve similar performance using threads, you need to be using Rubinius or JRuby, which don't have a GIL.
If Unicorn gets you most of the way, e.g. you have mostly fast requests but a handful of long running requests, you might consider using Rainbows[1] in tandem with Unicorn (or even on its own).
I actually was looking for information about this a few days ago, so this was quite timely. On a previous server I used Passenger to host multiple projects with the Apache plugin, but I've not really done any Rails for a while and wasn't certain if Passenger was still the go-to. Seems like for my needs (multiple, small projects) Passenger is still the best choice, but it's nice to know in depth what the others offer over it.
> The second option, a global queue, allows Passenger to put all requests on the same queue "stack". Workers then are given whatever is next on this stack, thus making the aforementioned long request situation a little less problematic.
> workers will be killed off, and then when traffic picks up in the morning, requests will hang inside Passenger while it tries to launch new worker processes in memory. This also happens after nginx is restarted.
He mentions this later, in passing, but in the enterprise version of passenger, there is a mode where on restarting your app, passenger will not spin down worker processes until it has spun up a new one. This stepwise system of spinning up a new worker before spinning down an old one ensures no-downtime deploys/restarts, even after an nginx restart.
Additionally, there is a configuration that will make Passenger not spin down instances at all (passenger_min_instances, just set it to be equal to passenger_max_instances). I'm a bit disappointed that this is seen as a downside for certain apps, as it's just a configuration that makes Passenger useful in some extra scenarios.
That said, killing idle workers is the default configuration, and perhaps we should take this article as feedback that these days that default is not what most people want anymore.
I'd love to see a stability comparison more than a performance comparison. In my experience Passenger (only used the free version) is far and away more stable than either thin or unicorn.
Passenger seems rock solid and always up whereas you have to have something monitoring the processes of the other two as they were liable to go down every now and again. Perhaps I just configured them incorrectly tho...
Like I said, maybe I didn't have it configured properly, but it sounds like you've had more success than I have. I don't think it's the host, as using the same server I've switched to Passenger and haven't had issues.
Hi Jacques, what's the conclusion you drew? In my opinion this article doesn't paint a good picture of the power and flexibility of Passenger. If you want us to call you and talk about it in Dutch let us know, we only employ developers so I promise it won't be a boring conversation!
I don't have a conclusion yet. That will take a while actually, but in the end there will be a 'reference web app' programmed in a bunch of languages on a whole pile of frameworks that can be benchmarked on various setups.
So, for a resource/time-strapped startup, does it make sense to migrate away from Phusion Passenger to Unicorn? The stated advantage of Passenger, multi-tenant deployment, is not applicable to us. Will moving to Unicorn result in increased performance? Or are we better off solving other problems right now?
Your use case is exactly what we built Phusion Passenger for. It could be that applying some configuration could improve startup performance for you, but general runtime performance of Passenger should be on par with a Unicorn solution.
Note that to use Unicorn you would have to configure an nginx instance that proxies for Unicorn. This is where Passenger shines, we already integrate with nginx so there's no extra configuration or management overhead. Passenger is resilient so it will survive any crashes of your application and automatically respawn any failed processes.
edit: The configuration to make Passenger perfect for dedicated single-app servers is:
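Presumably something along these lines in the nginx config, using the open-source Passenger directives mentioned elsewhere in the thread (the value 6 is illustrative):

```nginx
# Pin the pool size so Passenger never spins workers down:
passenger_min_instances 6;
passenger_max_pool_size 6;
```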
You can measure your performance and tweak the numbers; as long as they are equal, Passenger won't spawn and kill processes anymore, making your performance more predictable.
From my experience in running a small PaaS for ruby apps: if you want to have something that 'just works', go for phusion passenger. If you want to take advantage of threads, go with passenger enterprise.
If you want a more complicated setup, for example with varnish and haproxy between nginx and your application servers, thin is rock solid. It runs Rails 2.x-4.x apps without any issues. It's also well suited for handling websockets traffic.
If the workload of your apps can take advantage of threads (e.g. lots of external API calls, or lots of time spent in the DB), a server like Puma will show an advantage. Also, if memory is limited, you can handle more concurrent requests.
I tend to watch what Heroku is currently recommending for people deploying monolithic Ruby apps on them, since it gives you a good sense of what makes best use of a set memory slice (512MB). Right now, they suggest Puma.
Moving to Unicorn will probably not increase performance. The performance characteristics of Unicorn are similar to Passenger's. Plus, Passenger has far better administration tools and documentation, so that in the event of problems you can quickly figure out what's wrong.
Passenger works just fine with single-tenant dedicated instances. Just configure passenger_min_instances to the same number of max_instances or max_pool_size, and it'll behave exactly like Unicorn's default settings.