FTR: This gdb script will print full stack traces for every read syscall made:
catch syscall read
commands
backtrace
continue
end
Put that into a file `trace-read.gdb` and attach to a running process like so:
gdb -x trace-read.gdb -p $(pgrep -n tar)
This assumes you are running an executable with built-in debugging symbols (gcc -g). It should be possible to side-load debugging symbols provided in an external package, though I don't have a command at hand. (Anyone?)
This works well enough on Linux to quickly debug situations like the one above. However, it can make the attached process painfully slow, and occasionally bring it down altogether.
The right tool for situations where slowing down or crashing the process is not acceptable is DTrace:
- Tutorial: https://wiki.freebsd.org/DTrace/Tutorial
- ustack: http://dtrace.org/guide/chp-user.html#chp-user-4
Ironically, DTrace is one of the main selling points for Solaris/OmniOS (or FreeBSD) over Linux.
The situation has gotten better recently, with bpftrace becoming available:
- http://www.brendangregg.com/blog/2018-10-08/dtrace-for-linux...
- https://github.com/iovisor/bpftrace
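For instance, a rough bpftrace equivalent of the gdb script above might look like this (a sketch, not tested here; assumes root privileges and a traced process named tar):

```
# Collect a user-space stack trace each time a process named "tar"
# enters the read() syscall; per-stack counts are printed when
# bpftrace exits (Ctrl-C).
bpftrace -e 'tracepoint:syscalls:sys_enter_read /comm == "tar"/ { @[ustack] = count(); }'
```

Unlike the gdb approach, this doesn't stop the process at each syscall, so the overhead is far lower.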
You can do this on Linux using perf, if you want less overhead and less impact on the application-under-test:
$ perf record -e syscalls:sys_enter_read -g -- application arg1 arg2 ...
[ application runs while perf writes out a log, recording every read() syscall, and keeping track of the backtrace each time]
$ perf report -g --stdio
[ perf reads the log, writing out the backtraces ]
This is the basic usage. Lots more is available, obviously. This has been available for a LONG time.
fun trivia: gnu tar predates the GNU project. What?
It started out as an implementation of tar written by John Gilmore whose uname is 'gnu'. gnu has written a lot of free software as well, but the fact that his name and the GNU project's name are the same is a complete coincidence.
I see references to John Gilmore being the original author of pdtar, and John Gilmore using the gnu handle, but I don't see any references to his usage of the name predating the GNU project or pdtar being referred to as gnutar before it became part of the GNU project.
I'm not saying that I disbelieve you, I just haven't been able to find any references for your claim.
While the bug is in tar, the API for the read(2) system call doesn't help. Its return value can be positive, which indicates the number of bytes read; negative, which indicates an error, unless errno is EINTR, in which case you should retry; or zero, which indicates either end of file, or the number of bytes read when the buffer length is zero (and in this case you can't distinguish end of file from a normal read). It's easy to forget to handle one of these five cases.
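A defensive loop in C covering those cases could look like the sketch below (read_full is a made-up helper name, not anything from tar):

```c
#include <errno.h>
#include <unistd.h>

/* Read up to len bytes into buf, retrying on EINTR and stopping at EOF.
   Returns the number of bytes read (possibly < len if EOF was hit),
   or -1 on a real error. */
ssize_t read_full(int fd, char *buf, size_t len)
{
    size_t total = 0;
    while (total < len) {
        ssize_t n = read(fd, buf + total, len - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry, not an error */
            return -1;          /* real error: errno is set */
        }
        if (n == 0)
            break;              /* EOF (unambiguous, since len - total > 0) */
        total += (size_t)n;     /* partial read: keep going */
    }
    return (ssize_t)total;
}
```

With a nonzero buffer length, a short return here unambiguously means EOF, so the fifth (zero-length-buffer) ambiguity never arises.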
Issues like this are almost entirely solved with Rust-esque enum pattern matching. The compiler will fail if you don’t have every match case handled. There is ‘_’ to abuse the system, but doing so should be fairly obvious.
——
Rust in general is pretty exciting for systems work. I just wish there were more shops using it professionally.
Unfortunately, Rust kept nearly the same design for its `std::io::Read` trait. It only separates the two "error" cases from the three "success" cases, so you still have to check for `ErrorKind::Interrupted` in the error case, and for `Ok(0)` in the success case (and when the passed buffer has zero length, you still can't distinguish between end-of-file and success, though that's not Rust's fault). A better design could have used a return type with more alternatives than just `Ok(n)` or `Err(e)`.
The issue with EINTR is pretty standard across most syscalls: if you get one, you should retry (or, depending on what flags you have set, this can be done for you automatically in some cases). A return value of zero is an EOF; there is no such thing as an “empty read”. Why would you pass in an empty buffer?
That would be a sixth case I had forgotten above (negative return value with errno EAGAIN and/or EWOULDBLOCK - these two might or might not be the same number).
Apparently the use of an empty read is to probe to see if you get various errors or an immediate 0-size return. This is merely a 'may' requirement in the Single Unix Specification, so I'm not sure how portable it is in practice.
On success, the number of bytes read is returned (zero indicates end
of file), and the file position is advanced by this number. [...]
On error, -1 is returned, and errno is set appropriately. In this
case, it is left unspecified whether the file position (if any)
changes.
I was going to assert that read significantly predates getchar, but it turns out that both may have existed from V1 Unix onward. However, getchar in V1 through V6 had a different EOF value from our familiar modern one of (-1); they all instead returned a 0 byte to indicate EOF.
(The V7 getc/getchar/etc manpage notes this as a BUG, although it doesn't specifically document what the V1-V6 EOF was. Presumably everyone who this was relevant for already knew. All of this is based on the historical Unix trees available through www.tuhs.org.)
With read(), the return value is separate from the data read. It is arguably elegant to read as many characters as possible and to not consider hitting the end of the file as an error—just a point that can't be read beyond, like a stop on a piece of machinery.
With getchar(), the decision was made to return the char directly (to have used a pointer to a single char would be daft, from the point of view of economy) and to use an in-band code to indicate end of file. This "in-band signalling" is considered a bit of a kludge by some people. So I'd say read() is a better/cleaner design.
> if you run GNU Tar with --sparse and a file shrinks while tar is reading it, tar fails to properly handle the resulting earlier than expected end of file.
Is that a bug? Yes. It's different from changing the data of the file, but still.
Clearly it’s a bug, because tar goes into a mad loop reading nothing and outputting empty blocks forever, or until the file changes size again. It needs to check for EOF to avoid this mistake, even if the EOF is unexpected because the file size was different.
I think most gnu utils are written so that they treat input files as a stream of data, so you can also use pipes instead of regular files. This also means they can somewhat handle size changes, since they keep reading until they encounter an EOF rather than relying on an earlier snapshot of the file size or sparse chunks. Whether the results are useful depends on the file format; if it's just temporary padding, as in this case, it works.
Agreed. OTOH, it is not a bug if the fact that tar is not instantaneous causes the tarball to contain combinations of files unlikely to occur during normal operation of the system. e.g., if a file is renamed while tar is running it is not a bug for the tarball to contain zero or two copies of that file.
It is a bug for tar to enter an infinite loop writing an ever expanding file of trailing zeroes because the current file was truncated to a size smaller than its stat returned.
It's basically a classic TOCTOU race; tar needs to honor the EOF returned from read() as authoritative.
This is actually a very common trap for junior *nix programmers: treating stat() and a subsequent mmap() or read() loop as if they were atomic with regard to stat.st_size. The size returned by stat() can be used as an estimate, but otherwise can't be applied to subsequent operations.
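A sketch of the safe pattern in C (copy_fd is an invented name): treat any earlier st_size only as a hint, and let read() decide when the file ends:

```c
#include <errno.h>
#include <unistd.h>

/* Stream fd "in" to fd "out" until read() reports EOF. Any earlier
   stat() st_size is deliberately ignored: the file may grow or shrink
   while we read, so only read() == 0 is authoritative. */
long copy_fd(int in, int out)
{
    char buf[4096];
    long total = 0;
    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted: retry */
            return -1;          /* real error */
        }
        if (n == 0)
            break;              /* EOF, even if stat() promised more */
        if (write(out, buf, (size_t)n) != n)
            return -1;          /* short/failed write (kept simple here) */
        total += n;
    }
    return total;
}
```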
Finding a bug in GNU tar is not hard... just ask Jörg Schilling.
One of the first things I do when I download source code is decompress it and then de-archive it with GNU tar, then promptly re-archive it with a real, AT&T UNIX System V 4.0 tar, because it's a known fact among us old UNIX folks that GNU tar is buggy as hell and only GNU tar can correctly read GNU tar (that wasn't the point of a tape archiver, but GNU people didn't get that memo).
Either from the Heirloom tools, or by running any freeware, open source operating system built from the illumos source code, like SmartOS. The illumos code base is the most direct, canonical descendant of AT&T UNIX System V 4.0. If you run anything based on illumos, not only are you running a true UNIX, but a reference implementation of it.
Additionally, any true System V UNIX like IRIX and HP-UX will have it.
I rather enjoyed reading the adventure of hitting a problem in a large system, and then working towards narrowing down the specific bug or set of bugs.
Such posts have in the past been super helpful for me personally (and to others I imagine), in going from plain weeping that stuff just randomly breaks, to learning to enjoy examining the possible causes and understanding how to explain the problem concisely and with enough detail to make it useful to a developer.
He’ll be able to reuse much of his blog post in a great bug report, with easy steps to reproduce the problem, expected and actual outcomes of following those steps, a scenario where it will happen in real deployments, as well as a workaround.
I think the guy just wants to enjoy the break. Talking and responding to an email/bug thread is a bit more involved than just shouting text to your blog.
Until you have a 4.x kernel with the right configuration options running, I am afraid the above gdb script is your best option on Linux.