Not sure if it was intentional, but the article is quite misleading.
The thread pool in node is only used for a limited number of APIs. Pretty much all networking uses native async IO and is unaffected by the size of the thread pool. Things like Oracle's driver are rare exceptions: the typical MySQL/PostgreSQL/redis etc drivers all use native async IO and are unaffected by this.
The author only glosses over this briefly. As a result this article leaves the impression that the problem described is the norm, which is not the case.
It's a completely unscientific method, but searching through one of our large applications (`npm ls|wc -l` -> ~2000 dependencies), the only modules I can find using `uv_queue_work` are:
* kerberos, unused (dependency of mongodb)
* protobuf, for serializing data
* snappy, for compression
kerberos isn't actually used in our app, so it doesn't matter, but we send a lot of data through protobuf and snappy, so it may be worth us profiling this a little more.
You can also experiment with different values for the env variable `UV_THREADPOOL_SIZE`; last time I checked this can even be set from the JS code via `process.env['UV_THREADPOOL_SIZE']` if you make sure to do it before you call something that uses the threadpool.
There's a whole section of the article covering which parts of node may be affected. If the article began by saying it only affects FS and DNS ops, and some drivers, people may be more tempted to stop reading.
Is there any reliable way to check if the libs you're using are subject to this issue?
There is a pretty reliable way to make sure they don't: if they don't install any native modules and aren't filesystem or DNS related, they're not affected. If they do, you may grep the native module's source code for uv_queue_work but I don't know if that will catch everything.
Aren't there some commonly used system calls that don't have asynchronous equivalents, such as open() or stat() or access()? Are threads used for those?
> Pretty much all networking uses native async IO and is unaffected by the size of the thread pool
Are they still using threads to get the "magic" working? I'm referring to this sentence: "But how did that happen? To the best of my knowledge node.js, is not powered by magic and fairy dust and things don’t just get done on their own."
Pretty much all event loop based programs work the same way: instead of blocking on a single request for IO, they use system calls (e.g. epoll_wait) that block until any of the many descriptors (sockets) has some event (data to be read, client connecting, etc). It gets a bit complicated when there are queued tasks for the thread pool and timers involved too, but it's the same principle.
Don't most computers have separate processors for network, hard drive, etc.? So even if you have a single-core processor, you are still running a multi-processor environment? Can anyone give me details on this? Someone told me something like this once and it has confused me ever since...
Yes. There are hundreds of sub-units that are complete processors on your motherboard: DSPs, Ethernet controllers, disk IO controllers, sound controllers, memory controllers... and that's not counting the programmable controllers in every disk drive, SD card reader, USB hub, peripheral, etc.
It's worth noting that DB drivers that actually integrate with libuv are a minority - most DB drivers use Node-level network APIs and are unaffected by such thread pool limits.
Even when the db drivers integrate with libuv, there is little reason to use blocking APIs for that. The threadpool is primarily used for operations for which no non-blocking API is available (mostly filesystem access).
There is the case where the driver (for instance the official libmysql) doesn't expose its network interface (e.g. its sockets can't be used with an external event loop). In that case, it blocks on IO operations, and there is no other choice but to push it to another thread.
That said, using a driver in another thread leads to various complexities that can be avoided when running in the same application (javascript/v8) thread/event loop.
I actually ran into an issue recently with CPU intensive tasks blocking my web server. It turns out that "querystring" (used to parse request bodies in web applications) is a synchronous, blocking operation. You'd never notice much slowness, until your request bodies are massive (think 50 nested JSON objects and some base64 image data for good measure) and you have multiple per second. Now, every request is blocked until the previous one is processed. I'm still trying to figure out a solution, after looking into worker threads, etc.
https://www.npmjs.com/package/fast-url-parser for url parsing (the built in url parser is the main reason why node is so far behind on the TechEmpower benchmark - with this replacement the benchmark shows about 60-80% improvement in served req/s)
For very large post bodies, I use JSON in conjunction with OboeJS - http://oboejs.com/ . It's not too much slower than native JSON.parse (about 5-7 times), however it's non-blocking. Still haven't found a solution that is close enough in speed to native JSON.parse.
Thank you for sharing your findings here. Very pertinent to me right now - could actually drastically minimise the amount of research I have to do today. I love HN.
Is there a detailed overview about which functions in libuv (Nodejs) rely on blocking primitives and thus are using the thread pool to work async?
From [here](http://docs.libuv.org/en/latest/design.html), it sounds like all file IO is always based on blocking primitives, and native async file IO primitives are not used, although such async file IO primitives do exist and were tried out in libtorrent (http://blog.libtorrent.org/2012/10/asynchronous-disk-io/). The result of that experiment however was mostly that the thread pool solution was simpler to code (I guess).
From the libuv design doc, the overview is:
* Filesystem operations
* DNS functions (getaddrinfo and getnameinfo)
* User specified code via uv_queue_work()
I wonder whether this is really the best solution, or if some combination of a thread pool and native async disk IO primitives could perform better.
The article you link to touches on a few of the Linux AIO shortcomings.
But if you have also read the following blunt criticism from Linus http://yarchive.net/comp/linux/o_direct.html he also outlines a better alternative way of implementing asynchronous disk io on Linux at least.
> The result of that experiment however was mostly that the thread pool solution was simpler to code (I guess).
and uniformly asynchronous (native async operations may not cover e.g. file copy or filesystem operations; furthermore, filesystems may block during submission of IO ops, which makes the operations effectively synchronous) and has higher throughput (it supports read/write vectors).
We could use the thread pool for blocking primitives and otherwise use the native AIO primitives, couldn't we?
And the higher throughput seemed only to be a problem on MacOSX, so we could fallback to the thread pool there, but use the async IO on Windows and Linux.
Yes, just read that (the libtorrent post). I somehow feel that this is a sad state. Shouldn't we try to improve the disk AIO API then, if it is not useful at the moment? Or maybe that has changed also? The article is from 2012. Or maybe we could at least partly use it? Of course, code complexity will be a problem then in any case. But for applications depending on a lot of concurrent disk IO, maybe it could be worth it.
Oops, thanks for that - seems the results from the first run of the example somehow got lost in the final version and I didn't notice.
The order of the output is dependent on when each call finished - they run in parallel, so it's not guaranteed that functions will end in the order they were invoked.
var fs = require('fs');
var util = require('util');
var start = process.hrtime();

function namedFunction(id) {
  fs.readdir('.', function () {
    var end = process.hrtime(start);
    console.log(util.format('readdir %d finished in %ds', id, end[0] + end[1] / 1e9));
  });
}

for (var i = 0; i < 3; ++i) {
  namedFunction(i);
}
Does it mean the async functions about file system operations are not really asynchronous in Node.js? The whole node.js server will be blocked if the number of file system operations is bigger than the thread pool size. :-(
It is only for file system operations, and mostly these do not actually block, as the results are available in cache. So it is probably reasonable for casual use.
Now if you want to get good performance on an SSD (i.e. the rated IOPS) you will need a decent queue depth, like 32 or so, so it won't work; but that's a specialist use case.
(noob question) why would the libuv threadpool choose to use a static 4 instead of something like matching the number of processor cores available by default?
For things like filesystem access, you'd want more threads than CPUs because that's not CPU heavy. It still seems like they could choose a saner default though.
To sum it up: node.js runs only 4 background threads. Be aware of this.
No big deal really. I have multiple servers running node.js under load for 3+ years and never had an issue with this.
In fact, I found it helpful. If your database is under load and is already running 4 heavy queries, not giving it any more jobs is actually a good thing.
> In fact, I found it helpful. If your database is under load and is already running 4 heavy queries, not giving it any more jobs is actually a good thing.
Actually the threadpool shouldn't be a factor then, unless there are no non-blocking/async drivers for the database. Internally, libuv uses its threadpool to asyncify operations for which no async/non-blocking version exists (Erlang does the same, IIRC); for the most part that's filesystem operations, getaddrinfo and getnameinfo (I'm guessing it could include other stuff depending on what the underlying OS provides).
Socket or file IO should have native non-blocking APIs, so it does not need to use the threadpool.
There's a reason why queues are more performant than threads. See node.js and nginx for real-world examples.
The OS is dumb and does not understand the nature of load. If we applied your logic, everyone would still run Apache instead of nginx.
Your user-interfaces 101 falls flat on its face when you have hundreds of short-lived jobs that need to happen. Each one is insignificant on its own, but if you overload the server with all of them at the same time, you can starve the system's RAM (forcing it to swap) or increase disk seeks by an order of magnitude if the short-lived jobs are asking for different data that is all over the place.
Well, it's not perfect, but you could do a lot worse than use JavaScript. Lots of R&D at companies and organisations like Google, Microsoft, Mozilla and Intel is going into JavaScript (likely billions), and as far as I'm concerned that development is resulting in a pretty useful (and increasingly performant) language for doing what people use it for: web platform applications.
Calm down man, nobody's breaking anything. Everyone is entitled to his/her opinion. You can express your thinking on the matter by up/down voting. Relax :)
Heh - maybe you're right, and I wish the world were as you describe. :-)
There is however a deeper underlying issue; decorum is important and communities that exhibit genuine 'niceness' are nice. Communities that allow, or worse, overlook dark behaviour degenerate.
Flagging and down voting is one part of the solution, but when the nastiness reaches a level that the nice people start to disengage and go elsewhere, it's clear to me that we need another element of control. Perhaps algorithmically detecting repeat offenders? Perhaps more granularity with down votes?
There are differences between a down vote because one disagrees with the author, and a down vote because one believes the author is ill-informed and spreading misinformation, and a down vote because the author is being downright juvenile.
A number of hits on the third case against a given author on multiple comments could conceivably constitute an automatic warning and / or banning system.
I don't want people to be unable to express their views, but when the mean-spirited people who contribute nothing but nonsense start to represent a large percentage of a community, it's reasonable to see if anything can be done.
> There are differences between a down vote because one disagrees with the author, and a down vote because one believes the author is ill-informed and spreading misinformation, and a down vote because the author is being downright juvenile.
The difference is that the first two should not be voted down. If you vote down, you should not comment. If you comment, it means at the very least the comment added to the conversation, unless your comment is also not worth posting and you should be voted down as well.
It's fairly simple: does the comment bring value to the conversation? If it does so directly, vote up. If only indirectly, then don't. If it does not, vote down.
Whether you disagree or not is irrelevant. And someone being ill-informed should be corrected. At the very least, by writing an incorrect comment, they are presenting an opportunity to be corrected.
> I don't want people to be unable to express their views, but when the mean-spirited people who contribute nothing but nonsense start to represent a large percentage of a community, it's reasonable to see if anything can be done.
Things can already be done. Vote down and don't reply. That is the best way. Vote down and ignore.
Perhaps they should split the upvote/downvote buttons into those 3 categories? I know that downvoting because of disagreement is a really, really annoying state of affairs, particularly when they don't tell you WHY they disagree but instead just downvote you.
I would down vote much more often if it weren't so ambiguous. If I had more options. Sometimes I want to down vote just 0.1, just to say Walter, you're not wrong: you're just an asshole.
I also think pointed, honest replies like yours above go a long way. The best way to get people to assume good faith is to show it. (Reyk's Second Law)