Hello, I am the author. Thank you all for the instructive comments, I made some changes to the article since I first posted it:
- Added a link to this HN post.
- Renamed some variables and types in the code for readability.
- Mention (in a footnote) that we could allocate the 'Pool' structure and the chunk arrays with a single call, or even return the 'Pool' on the stack. Suggested by 'liontwist' and 'unwind'.
- Mention (in a footnote) that we don't always need to store the full address when building the linked list of free chunks. Suggested by 'cb321'.
- Added valgrind support (also to my 'libpool' project). Suggested by 'kazinator'.
FWIW, while it is true that a smaller pointer lets you have smaller minimum fixed size objects in your pool, as your new footnote suggests, the space concern I was mostly alluding to in [1] is all the references in your client code { such as the BST code I linked to where each binary search tree (BST) node has 2 pointers (or 3 if they have parent pointers to give BST nodes "doubly linked list-like autonomy") }.
This can really add up. For example, like 20 years ago, I did a `print sizeof` in gdb for some g++ STL map node with a data payload of like 4 bytes and it was some obscene size approximately >= 1 order of magnitude bigger than necessary (like 80 bytes instead of, say 8 with 2Byte ptrs). While main memory is cheap, CPU cache memory is not and for "linky" structures, it often pays to be more cache-friendly. It's pretty easy to imagine a 10X smaller thing fitting into an L2 CPU cache where the larger thing would not even fit in L3, netting you a big latency boost (though the more root-ward levels of the tree likely always stay in caches in a hot loop).
EDIT: Anyway, you don't have to believe me. Others did an x32 ABI for x86_64 mostly to have "merely 2X" smaller pointers: https://en.wikipedia.org/wiki/X32_ABI
As a last code snippet readability suggestion - maybe extract "malloc" and "free" calls to your own simple wrappers (like my_malloc and my_free), and do "valgrinding" inside of them? You can also pass additional params to these functions to specify type of allocation if needed, or make separate functions.
I am not sure what you mean, exactly. There are some other valgrind macros that need to be called for pointers other than the ones returned by 'malloc'. More importantly, the memory regions need to be marked as 'NOACCESS' after we are done using them from the pool allocator, or valgrind will complain; that's why I set them back to 'DEFINED' before writing to them in other functions.
This is my first time using valgrind, however, so I might just be confused about what you mean, and there is a simpler way of doing all this.
Also the font isn't incredibly easy to read on Windows. I'm assuming Georgia was selected? Which has been on Windows for years but renders so badly it looks like the font color isn't black, even though I think it is black? This isn't really your fault but I would have stuck to Times New Roman for Windows' sake.
- Added a link to this HN post.
- Renamed some variables and types in the code for readability.
- Mention (in a footnote) that we could allocate the 'Pool' structure and the chunk arrays with a single call, or even return the 'Pool' on the stack. Suggested by 'liontwist' and 'unwind'.
- Mention (in a footnote) that we don't always need to store the full address when building the linked list of free chunks. Suggested by 'cb321'.
- Added valgrind support (also to my 'libpool' project). Suggested by 'kazinator'.