Personally, I tend to prefer multiple processes over multiple threads. I like my IPC to be a bit more explicit. It seems cleaner, better defined, and easier to distribute across boxes. Sure, it costs a little performance, but you can always use shared memory if the processes are on the same machine.
This seems like the kind of bug that wouldn't hit until production ramped up and it would be a time suck to find/fix.
I'm inclined to agree with you, but simply using multiple processes + IPC wouldn't have really affected this bug: you'd run into the same issue with one process reading from a shared memory region while another process memcpy()'s into it. That might be less likely to occur than with a multi-thread model, but it's still quite possible.
I think the lesson from the original post isn't "multi-threaded == bad"; it is that you should be very careful when accessing the same memory region with two concurrent threads (or processes), even if both of those threads are performing seemingly-innocuous operations.
I agree, the lesson from the post isn't multi-threaded==bad. Processes just make you think more about the data interactions.
However, if you are in a multi-process model and using shared memory, you'd better be getting a semaphore for both memcpy and reading. So, get semaphore, do memcpy, release semaphore (all readers should be blocked during that time).
That's not really efficient, but neither is constantly doing a memcpy on static data.
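As a rough sketch of the get semaphore / memcpy / release semaphore sequence described above, assuming POSIX unnamed semaphores: the `struct region` layout, `REGION_SIZE`, and the `region_*` function names are made up for illustration, and in a real multi-process setup the struct would have to live in memory obtained from something like `shm_open()` plus `mmap()`.

```c
#include <errno.h>
#include <semaphore.h>
#include <string.h>

#define REGION_SIZE 64  /* hypothetical payload size */

/* Hypothetical shared region. For real cross-process use, this struct
   must be placed in shared memory, not on one process's stack or heap. */
struct region {
    sem_t lock;                       /* binary semaphore guarding data */
    unsigned char data[REGION_SIZE];
};

/* Initialise the region. pshared = 1 asks for a semaphore that can be
   shared between processes; the initial value 1 means "unlocked". */
int region_init(struct region *r)
{
    memset(r->data, 0, sizeof r->data);
    return sem_init(&r->lock, 1, 1);
}

/* Take the semaphore, retrying if a signal interrupts the wait. */
static void region_lock(struct region *r)
{
    while (sem_wait(&r->lock) == -1 && errno == EINTR) {
        /* interrupted by a signal: try again */
    }
}

/* Writer: readers are blocked for the duration of the memcpy. */
void region_write(struct region *r, const unsigned char *src)
{
    region_lock(r);
    memcpy(r->data, src, REGION_SIZE);
    sem_post(&r->lock);
}

/* Reader: copies a snapshot out under the same semaphore, so it can
   never observe a half-finished write. */
void region_read(struct region *r, unsigned char *dst)
{
    region_lock(r);
    memcpy(dst, r->data, REGION_SIZE);
    sem_post(&r->lock);
}
```

The point is only that both sides take the same lock. A reader that skips the semaphore because it is "just reading" gets no protection from it.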
I'd go farther: don't try to read from memory at the same time that someone else is writing it. In the absence of explicit guarantees of atomicity, don't assume that any guarantees exist.
The interesting discovery in the article isn't that memcpy() can leave things in an intermediate state, but that releasing a JNI array triggers a copy even if the array is unaltered.
But this looks like a bug in the program (or the VM): either that copy back to the heap should not be occurring, or there should be some locking here. What if the data was changed? Would the Java thread just be expected to deal with all intermediate states? Alternatively, if the data is read-only, don't write to it!
In the comments to the post you can see that the correct way to prevent a copy back would be to release the array using the JNI_ABORT mode. As far as I know, there's no way to enforce immutability on array contents in Java, so it's an implicit contract in the system that the contents never actually change.
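For anyone who hasn't used it, here is a minimal sketch of a read-only native method that releases with JNI_ABORT. The class and method names (`Example`, `sum`) and the choice of an `int[]` are placeholders, not taken from the original post. It needs the JDK's `jni.h` to compile.

```c
#include <stddef.h>
#include <jni.h>

/* Hypothetical static native method: long Example.sum(int[] array).
   It only reads the array, so nothing needs to be copied back. */
JNIEXPORT jlong JNICALL
Java_Example_sum(JNIEnv *env, jclass cls, jintArray array)
{
    (void)cls;  /* unused */

    jsize len = (*env)->GetArrayLength(env, array);

    /* The VM may hand back a pointer into the heap or a private copy. */
    jint *elems = (*env)->GetIntArrayElements(env, array, NULL);
    if (elems == NULL) {
        return 0;  /* an exception is already pending in the VM */
    }

    jlong total = 0;
    for (jsize i = 0; i < len; i++) {
        total += elems[i];
    }

    /* JNI_ABORT frees the native buffer without copying it back.
       Mode 0 would copy the contents back if the VM had made a copy,
       even though nothing was modified. */
    (*env)->ReleaseIntArrayElements(env, array, elems, JNI_ABORT);
    return total;
}
```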
Looking at the man page for memcpy, it states that it is explicitly for non-overlapping memory areas. Depending on implementation, then, it might even be "correct" behavior to leave the array in an inconsistent state (if the results are not well-defined).