Read your Standard Libraries

unwind · on Jan 3, 2014

This is great advice, since in general you would like to assume that the standard library is implemented well, i.e. by people who know the target language.

Unfortunately, for C this is pretty hard to do, at least for the GNU libc. I'm not trying to critique the implementation, but every time I dive in (generally to help someone on Stack Overflow) I'm confused and get to spend a lot of time searching the code.

It's obviously optimized for a great deal of other parameters before ease of reading, which I totally understand yet still am a bit sorry to see.

chubot · on Jan 3, 2014

Yeah, a lot of good programmers have alternatives to libc, either for reasons of history or taste.

I was just reading DJB's daemontools (less than 6K LOC, very tidy!), and he doesn't use libc. Actually his code has been collected in libdjb, which seems quite nice:

http://www.fefe.de/djb/

And also the Plan 9 / Go guys don't use ANSI libc as far as I remember.

Those are two examples of people who don't even use the ANSI libc interface. But there are plenty of people who don't use GNU libc, and use uclibc or various other source-compatible alternatives. I think Debian switched to eglibc a while ago.

rat87 · on Jan 3, 2014

I'm pretty sure eglibc (Embedded GLIBC) is basically a patchset for glibc for certain distros (especially Debian and derivatives) because they disagreed with the maintainer, especially over ARM support.

burntsushi · on Jan 3, 2014

Just FYI, I believe that by default, binaries compiled with the standard Go toolchain do link against your system's libc (which is probably GNU libc on Linux). However, I believe that can be disabled by compiling the Go toolchain with `CGO_ENABLED=0` set. But you might lose some functionality.[1]

[1] - https://groups.google.com/forum/#!msg/golang-nuts/H-NTwhQVp-...

chubot · on Jan 4, 2014

So I'm actually talking about what the Go compilers themselves use for libc, not Go programs. From what I can tell it is:

http://code.google.com/p/go/source/browse/include/libc.h

Which is definitely not ANSI libc. Go programmers won't care about this, because it's an implementation detail, but we're talking about reading source code.

burntsushi · on Jan 5, 2014

Ah! Mea culpa. I completely misunderstood your initial comment then. You're absolutely right.

asveikau · on Jan 3, 2014

I don't remember the last time I looked at glibc source so I am not sure if I can relate to this comment or not. However, I just wanted to throw in that I find the various *BSD projects, including their libcs, very readable. It helps that everything is in a single tree with some consistency in naming patterns etc, and the style is usually decent.

clarry · on Jan 3, 2014

I just have to second this. I have a checkout of the OpenBSD source tree on all my machines. While tinkering with stuff, I often end up browsing the code under libc/, with great curiosity. Though there are many not-quite-so-pretty bits inherited from the dark ages, I can't say I've ever really felt confused. On the contrary, there's plenty of nice, inspiring code just waiting to be read. Oh, and there's superb documentation to go with it.

NAFV_P · on Jan 3, 2014

I was reading stdio.h and stdlib.h yesterday, because I wanted to know the macro definition which prevents multiple includes. The code seems a bit 'organic', like it was cultured in a petri dish.

maaku · on Jan 3, 2014

It's an apt description as that code did grow and evolve, constrained by natural selection - only those changes which could be safely made without breaking or even changing functionality on hundreds of architectures over thousands of platforms ever made it in. Naturally that means it is an ever growing assortment of carefully vetted but otherwise horribly ugly hacks.

NAFV_P · on Jan 4, 2014

I thought the headers looked wicked, like a bizarre fantasy wilderness populated with fairies and leprechauns.

ori_b · on Jan 4, 2014

I suggest looking at musl libc: http://git.musl-libc.org/cgit/musl/tree

userbinator · on Jan 4, 2014

You might be interested in this: http://hg.pdclib.e43.eu/pdclib

It's a public domain C library, much smaller and less of a monster than glibc, plus you don't have to worry about any implications of reading GPL code.

alextingle · on Jan 4, 2014

Tell me more of these "implications of reading GPL code".

arielweisberg · on Jan 3, 2014

Excellent advice. Reading other people's (good) code is one of the best ways to learn how real world problem solvers solve real world problems.

The Java standard library is interesting and easy to jump into if you work with it every day. Guava is another good library to study.

I personally spend a lot of time looking at Riak, Cassandra, Hadoop, and Postgres since developing databases is my day job. I also follow mailing lists for both developers and users to understand the real world outcomes of their design choices and use that to inform how I think, rather than going into every problem as a blank slate.

pmr_ · on Jan 3, 2014

I mostly live in C++ land and the story is different there. The standard library implementations are a place of magick, mystery, and obscure defect reports. They provide plenty of insight into what can go wrong if you are writing a generic library and how difficult programming correctly with templates actually is, but all the technicalities of the language make it hard to see the algorithms. They are definitely worth reading, but only if you have mastered a large amount of C++.

Not to mention the horrible naming conventions (leading double underscores everywhere).

I often think about a C++ standard library written for educational purposes with proper naming, focus on readability instead of optimization, and maybe multiple possible implementations of certain specs.

maaku · on Jan 3, 2014

No one would use it. It's a design flaw of C++ that performant, generic libraries require arcane magic and hideously obfuscated code.

pmr_ · on Jan 3, 2014

Please read carefully: "for educational purposes". I'm fully aware that such a library wouldn't be fit for production, but it might be useful for people learning generic C++ programming. Currently you have to look at Boost or any of the standard libraries, and all you see is #ifdef and all the arcana, so you have a hard time seeing the useful techniques.

maaku · on Jan 4, 2014

What I meant was: no one would use it, so it'll never be created.

saurik · on Jan 4, 2014

When the AP curriculum was in C++ there was, in fact, such a pedagogical version of the library used in classes. (I took AP CS the first year it was in C++.)

http://apcentral.collegeboard.com/apc/members/courses/teache...

davvid · on Jan 4, 2014

Leading double underscore is reserved for the implementation (of the standard library).

http://www.gnu.org/software/libc/manual/html_node/Reserved-N...

pmr_ · on Jan 4, 2014

I'm perfectly aware of that. I was just saying that it doesn't aid readability of the code (rather the opposite). There even have been proposals to solve this problem (scoped macros being just one of them) so this is acknowledged to be a rather real problem instead of just some cosmetic issue.

eonil · on Jan 4, 2014

It would make sense to write an easier wrapper around the current C++ standard library. Because C++ provides very strong inlining, simply wrapping it with better method names should not cause any performance hit.

But IMO, naming is one of the hardest parts.

alextingle · on Jan 4, 2014

I always found the C++ standard library relatively readable. It's inspiring just how short many of the functions are.

avisk · on Jan 3, 2014

I always make sure that I attach the source code of the 3rd-party libraries used in my project to my IDE. The best time to read library source code is while using it in your own code, when you know the context. This also helps in writing better code. I feel that we should make reading the source code of the APIs we use an integral part of the development process, rather than reading the standard library source code for the sake of reading it.

steveklabnik · on Jan 3, 2014

This is generally great advice, but the Ruby standard library isn't exactly the best to read if you want to learn. Don't get me wrong, it works, but the code is generally very, very old. Nobody knew how to write good Ruby code back then.

netghost · on Jan 3, 2014

It may not be the best place to learn current idioms, but it is still worthwhile if you want to really understand the tools you are using.

yawboakye · on Jan 4, 2014

If it has not been rewritten using modern (good) Ruby style I guess it's still turning a perfect cartwheel. Eloquent Ruby (by Russ Olsen) recommends Ruby's set standard library for source reading. It was an awesome read! I think standard libraries are really good at demonstrating at least two things: the language's preferred coding style and implementation of data structures and algorithms using the language's primitives. For these 2 reasons, they're still worth reading, imo.

bcjordan · on Jan 3, 2014

Ah, will certainly trust your assessment there. :)

In your opinion, what are some good starting points for interested Ruby readers?

bradleyland · on Jan 3, 2014

If you're speaking of code reading, I find looking at the source code of Rubinius very interesting.

Ruby has so much utility wrapped up in places like Enumerator that you don't see a lot of examples of low-level data structure implementation in the wild. You could look at MRI Ruby, but you'd be looking at a lot of C, which tells you a lot about how Ruby works but doesn't show you much actual Ruby. That's where Rubinius is really great. You get to see how a smart team would implement Ruby... in Ruby.

https://github.com/rubinius/rubinius

netghost · on Jan 3, 2014

I have to second this, if you use Ruby at all, Rubinius is quite fascinating to read, even if you still use MRI on a day to day basis.

steveklabnik · on Jan 3, 2014

I tend to point people at "Eloquent Ruby" to learn how to write idiomatic Ruby code, rather than point them at actual projects.

haukur · on Jan 3, 2014

Funnily enough, the first chapter recommends the standard library as idiomatic Ruby code.

steveklabnik · on Jan 3, 2014

Hehe. And it comes full circle. It is a few years old... I don't think the social norms of Ruby have changed all that much.

To be clear, there are good and bad parts of the standard library, just like there are in all code. But much of it is basically a time capsule.

wasd · on Jan 3, 2014

How do you feel about the author's other book, Design Patterns in Ruby?

jurassic · on Jan 3, 2014

I loved it. Beyond pure technical chops, Russ Olsen is a great writer which makes his books stand out to me from other programming books I've read. In Design Patterns in Ruby each chapter handles a different pattern with a simple example that's easy to understand. And at the end of each chapter Olsen points out where that pattern is used "in the wild" in popular ruby projects if you'd like to see a larger example.

steveklabnik · on Jan 3, 2014

I have not read it.

mjbellantoni · on Jan 3, 2014

I came here to write this comment!

fit2rule · on Jan 3, 2014

I'm kind of surprised this isn't "a thing". I mean, I'm surprised it's 'a thing'. Because, isn't this standard practice - I mean, isn't it just a part of the coder Creed that you read the code, anywhere and everywhere you can?

edwinnathaniel · on Jan 4, 2014

This is where the Java ecosystem shines the most:

1. Easy access to standard libraries

2. Most (not all, re: java.util.Date) of the standard libraries are well thought-out and the documentation is top-notch

3. Tools like Maven + Eclipse/IntelliJ can provide insight into the source code of Java libraries easily (Maven has built-in capabilities to download the javadoc _and_ the source code from the designated Maven repository if the author publishes them correctly).

The source of any library (3rd party or well-known ones) is a shortcut away.

PS: NuGet gained this capability recently, which is great, but I rarely use it so I don't know how easily and to what extent NuGet can replicate what Maven has.

emilv · on Jan 3, 2014

This is extremely useful, not only for inspiration and learning, but also so you actually know what is happening under the hood. This is great for troubleshooting and for reasoning about your own code. This is one big pro for open source in development!

I learned a lot by reading the source code for Python dicts (which also comes with a lengthy motivation for why it was implemented that way) and any Haskell library (Hackage links from the manual page directly to the source code for the respective function, which makes it very easy to see what is happening).
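For the parts of the Python standard library that are written in Python, you don't even need to leave the interpreter. A small sketch, assuming a stock CPython install with the `.py` files present (the helper name `show_source` is made up for this example):

```python
import inspect
import textwrap


def show_source(obj, max_lines=20):
    """Return the first max_lines lines of obj's Python source, or None."""
    try:
        source = inspect.getsource(obj)
    except (TypeError, OSError):
        # Objects implemented in C (dict, for instance) have no Python
        # source for inspect to find, so fall back to None.
        return None
    return "\n".join(source.splitlines()[:max_lines])


if __name__ == "__main__":
    # textwrap.dedent is plain Python, so its source can be printed.
    print(show_source(textwrap.dedent))
    # dict lives in C, so this prints None; read Objects/dictobject.c instead.
    print(show_source(dict))
```

For C-implemented pieces like dict this only tells you there is no Python source, which is the cue to go to the C tree.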

sg47 · on Jan 3, 2014

Is the Python dict source code available in the link provided in the article?

bcjordan · on Jan 3, 2014

Raw source: http://svn.python.org/view/%2acheckout%2a/python/tags/r266/O...

Walkthrough of the implementation: http://www.laurentluce.com/posts/python-dictionary-implement...

dalke · on Jan 3, 2014

And a PyCon video: http://pyvideo.org/video/276/the-mighty-dictionary-55 .

sg47 · on Jan 3, 2014

Thanks both of you!

bcjordan · on Jan 3, 2014

In case my Wordpress-on-DreamHost-unlimited-plan blog goes down, here is a gist copy: https://gist.github.com/bcjordan/8242593

bradleyland · on Jan 3, 2014

Looks like you had the good sense to install WP Super Cache. I don't know how much traffic you're seeing, but my blog has held up under some HN traffic using the same setup (Dreamhost+WP+WP Super Cache).

bcjordan · on Jan 3, 2014

Hah! The WP Super Cache "Activate" button was actually giving me 503 responses the first few tries. Now it's nice and speedy.

dinosaurs · on Jan 4, 2014

I think this is great advice and something I've been meaning to do for Ruby.

However, what about JavaScript/Node.JS developers? What could/should they read to help them understand their language/frameworks better?

I've been reading through the Express source code lately to understand the module better. I'm wondering what other source code I could read to get a better grasp on Node/JS.

Thoughts?

drblast · on Jan 3, 2014

I always liked Common Lisp for this, since you could so easily inspect data structures like hashes at run time.

I think CMUCL had a really interesting and unique hash implementation that relied on two identically sized arrays, one for the keys and data, and another for a "next" index that showed you where to go for a key collision.
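Roughly what I mean, sketched in Python rather than Lisp. This is not CMUCL's actual code, just one reading of the two-array idea (coalesced chaining); the class name `TwoArrayHash`, the fixed capacity and the lack of deletion are all simplifications of mine:

```python
class TwoArrayHash:
    """Fixed-size table: one array of (key, value) pairs, one of 'next' indices."""

    def __init__(self, capacity=8):
        self.entries = [None] * capacity  # (key, value) pairs, None when free
        self.next = [-1] * capacity       # next slot in the chain, -1 ends it
        self._free = capacity - 1         # free-slot cursor, scans downward

    def _find(self, key):
        """Return (slot holding key or -1, last slot visited)."""
        i = hash(key) % len(self.entries)
        if self.entries[i] is None:
            return -1, i                  # home slot is empty
        last = i
        while i != -1:
            if self.entries[i][0] == key:
                return i, i
            last = i
            i = self.next[i]              # follow the collision chain
        return -1, last

    def put(self, key, value):
        slot, last = self._find(key)
        if slot != -1:                    # key already present: overwrite
            self.entries[slot] = (key, value)
            return
        if self.entries[last] is None:    # home slot free: store in place
            self.entries[last] = (key, value)
            return
        # Collision: take a free slot and link it to the tail of the chain.
        while self._free >= 0 and self.entries[self._free] is not None:
            self._free -= 1
        if self._free < 0:
            raise OverflowError("table is full")
        self.entries[self._free] = (key, value)
        self.next[last] = self._free

    def get(self, key, default=None):
        slot, _ = self._find(key)
        return default if slot == -1 else self.entries[slot][1]


if __name__ == "__main__":
    table = TwoArrayHash(8)
    for k, v in [(1, "a"), (9, "b"), (17, "c")]:  # all hash to slot 1
        table.put(k, v)
    print(table.get(17), table.next)
```

The nice part is that you can print both arrays at any point and see exactly where each collision went.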

mail2vks · on Jan 3, 2014

Completely agree. The Java collections and threading packages are well-written code. In fact, I ask interview questions based on API implementation.

yawboakye · on Jan 4, 2014

I dig into Rails by accident. Most of the time I don't have internet and don't remember exactly how a certain API is used. So I jump into the code with `source_location` or `bundle open ...` and get what I want. I usually see something new. That's how I discovered `Dir[]`.

TazeTSchnitzel · on Jan 4, 2014

PHP's Zend and /ext/standard are unfortunately unreadable and undocumented.

alextingle · on Jan 4, 2014

You read them if you want to regress, as a programmer.