Hacker News

No, I understand that. Maybe I'm not explaining my point properly, so I'll try again:

If you issue a write() syscall from a process, and the syscall succeeds, then the data that was written is present in the OS's cached view of the filesystem, even if the process dies a nanosecond later. That view is shared consistently by all processes on the system. It's true that the changes may not actually be stored persistently on disk, but that difference is unobservable unless something happens to make the kernel lose its cached data.

So from the test suite's point of view, unless part of the test involves actually killing VMs and not processes, it should not be possible for the results to depend on whether or not fsync() was called.
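To make that concrete, here is a minimal sketch (Python, POSIX only; the file name and payload are made up for illustration). A child process issues a single write() and then kills itself with SIGKILL, never calling close() or fsync(). A different process then reads the file.

```python
import os
import subprocess
import sys
import tempfile

# Child: one write(), then SIGKILL itself. No close(), no fsync().
CHILD = """
import os, signal, sys
fd = os.open(sys.argv[1], os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
os.write(fd, b"acknowledged")          # data is handed to the kernel here
os.kill(os.getpid(), signal.SIGKILL)   # equivalent to kill -9 on ourselves
"""

def read_after_kill() -> bytes:
    """Run the child, let it die, then read the file from this (different) process."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log")   # hypothetical file name
        subprocess.run([sys.executable, "-c", CHILD, path])
        with open(path, "rb") as f:
            return f.read()

if __name__ == "__main__":
    print(read_after_kill())
```

On a normal Linux box this should return b"acknowledged": the reader is served from the same page cache the write went into, whether or not the data has reached the disk yet.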



Jepsen is just doing a kill -9 on the Java process.

I posted a comment on the blog: https://aphyr.com/posts/323-call-me-maybe-elasticsearch-1-5-...

First I made sure that read() goes through the page cache (it does, as long as O_DIRECT isn't in use). Then I went and checked the write-ahead log in ES.

From my reading, it turns out that ES considers a write durable as soon as it has been put into a userspace buffer.

https://github.com/elastic/elasticsearch/blob/master/src/mai...

Data is pushed to kernel space whenever the buffer fills up, and it is then fsync'd on a timer.
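A rough sketch of why that matters under kill -9 (Python, POSIX only). This is not ES's code: Python's buffered file object stands in for the userspace buffer, and the file name, payload, and 64 KiB buffer size are assumptions for illustration.

```python
import os
import subprocess
import sys
import tempfile

# Child: write into a userspace buffer, optionally flush it to the kernel,
# then SIGKILL itself. fsync() is never called in either case.
CHILD = """
import os, signal, sys
f = open(sys.argv[1], "wb", buffering=1 << 16)  # 64 KiB userspace buffer
f.write(b"acknowledged")                        # stays in the process's own memory
if sys.argv[2] == "flush":
    f.flush()                                   # write() to the kernel, still no fsync()
os.kill(os.getpid(), signal.SIGKILL)
"""

def survives(flush: bool) -> bytes:
    """Return what another process can read back after the child is killed."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "translog")    # hypothetical file name
        mode = "flush" if flush else "noflush"
        subprocess.run([sys.executable, "-c", CHILD, path, mode])
        with open(path, "rb") as f:
            return f.read()

if __name__ == "__main__":
    print(survives(flush=False), survives(flush=True))
```

Without the flush, the bytes die with the process and the file reads back empty. With the flush, they survive the kill even though nothing was fsync'd, which is the distinction the comments above are drawing.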


Nice research.

In case anyone else is wondering why that GitHub link is broken, the file in question was renamed a few hours ago. Here's a working permalink: https://github.com/elastic/elasticsearch/blob/fafd67e1aef091...



