1. It doesn't update—attempts to change the value associated with an existing key are silently ignored, although they do consume buckets.
2. It loops forever if the table is full and the key it was looking for is negative, because that invokes conversion of a negative signed value to an unsigned value, which I think is UB in C.
Maybe it has more bugs I haven't found yet.
Additionally the reason it took me 15 minutes to write 21 lines of code was that I changed a bunch of things in midstream, before posting the comment:
1. At first there was no "full" field in the hash table bucket, so there was no way to distinguish full buckets from empty buckets.
2. And then when I added it, I added it as "dead", with the opposite Boolean sense of "full", so I would have needed a separate initialization routine (to set all the dead fields to something nonzero). Instead I swapped the sense.
3. I was trying to detect the case where we'd looped back around to our starting point, indicating that the table was full and the search was unsuccessful, and I realized that the condition I wanted was not b == orig, which would always fail on the first iteration, but (b + 1) % table_size == orig. But doing that on every iteration seemed obscene, so I incremented b (mod table_size) to start out with instead.
4. At some point I think I realized I'd forgotten to set the .full field on newly inserted items.
5. Also I realized I'd forgotten to test the table[b].k == k condition in `get` as well, which would have made every search unsuccessful.
So I think it's reasonable to say that even a pretty limited hash-table implementation has plenty of opportunities to write bugs, more than I expected. But maybe I'm just running on low blood sugar this afternoon or something.
By contrast this binary search tree code I wrote in OCaml seems to have worked as soon as I got it to compile, although its worst-case performance is pretty bad, and, unlike the much longer hash table code, very likely:
let rec getp k = function `Empty -> None
| `Node(k1, v1, x, y) ->
if k1 = k then Some v1 else getp k (if k < k1 then x else y)
and putp k v = function `Empty -> `Node(k, v, `Empty, `Empty)
| `Node(k1, v1, x, y) when k1 = k -> `Node(k, v, x, y)
| `Node(k1, v1, x, y) -> if k < k1 then `Node(k1, v1, putp k v x, y)
else `Node(k1, v1, x, putp k v y)
And it's about a third the size of the hash-table code. OTOH I spent the last hour writing it.
With four more lines of code it's also a sorting algorithm:
and dictofp = function [] -> `Empty | (k, v)::xs -> putp k v (dictofp xs)
and itemsp d = let rec iter d tl = match d with `Empty -> tl
| `Node(k, v, x, y) -> iter x ((k, v) :: iter y tl)
in iter d []
1. It doesn't update—attempts to change the value associated with an existing key are silently ignored, although they do consume buckets.
2. It loops forever if the table is full and the key it was looking for is negative, because that invokes conversion of a negative signed value to an unsigned value, which I think is UB in C.
Maybe it has more bugs I haven't found yet.
Additionally the reason it took me 15 minutes to write 21 lines of code was that I changed a bunch of things in midstream, before posting the comment:
1. At first there was no "full" field in the hash table bucket, so there was no way to distinguish full buckets from empty buckets.
2. And then when I added it, I added it as "dead", with the opposite Boolean sense of "full", so I would have needed a separate initialization routine (to set all the dead fields to something nonzero). Instead I swapped the sense.
3. I was trying to detect the case where we'd looped back around to our starting point, indicating that the table was full and the search was unsuccessful, and I realized that the condition I wanted was not b == orig, which would always fail on the first iteration, but (b + 1) % table_size == orig. But doing that on every iteration seemed obscene, so I incremented b (mod table_size) to start out with instead.
4. At some point I think I realized I'd forgotten to set the .full field on newly inserted items.
5. Also I realized I'd forgotten to test the table[b].k == k condition in `get` as well, which would have made every search unsuccessful.
So I think it's reasonable to say that even a pretty limited hash-table implementation has plenty of opportunities to write bugs, more than I expected. But maybe I'm just running on low blood sugar this afternoon or something.