hashmap

hashmap

zig's hash maps for key-value lookup. two flavors parallel the arraylist split: StringHashMap stores the allocator, StringHashMapUnmanaged passes it to each method.

managed vs unmanaged

// managed — stores allocator, slightly more convenient
var map = std.StringHashMap([]const u8).init(allocator);
defer map.deinit();

try map.put("key", "value");

// unmanaged — pass allocator to each method (like ArrayList)
var map: std.StringHashMapUnmanaged([]const u8) = .empty;
defer map.deinit(allocator);

try map.put(allocator, "key", "value");

unmanaged is the better default for the same reasons as ArrayList — no stored allocator, statically initializable. use managed when passing as function parameter (avoids threading the allocator alongside it).

O(1) block index

the pattern that fixed zat's 79-second MST walk: build a lookup table during CAR parse, query it during tree traversal.

pub const Car = struct {
    blocks: []const Block,
    block_index: std.StringHashMapUnmanaged([]const u8) = .empty,
};

// during parse: populate index alongside the block array
var block_index: std.StringHashMapUnmanaged([]const u8) = .empty;
while (pos < data.len) {
    // ... parse CID and content ...
    try blocks.append(allocator, .{ .cid_raw = cid_bytes, .data = content });
    try block_index.put(allocator, cid_bytes, content);
}

// during walk: O(1) lookup by CID
const data = block_index.get(cid_bytes) orelse return error.BlockNotFound;

the key insight: CID bytes are already unique identifiers and the block data they point to is owned by the CAR buffer (arena-allocated). no copies needed — both key and value are slices into existing memory.

before this: linear scan through 243k blocks, called once per MST node (~50k nodes). ~12 billion comparisons. after: hash lookup.

see: zat/car.zig

key ownership

from the stdlib: "key memory is managed by the caller. keys and values will not automatically be freed."

this means:

  • keys must outlive the map
  • if keys are heap-allocated, you free them — the map won't
  • for string keys pointing into a long-lived buffer (like parsed data), this is free — the buffer already owns the memory

HashMap vs ArrayHashMap

HashMap (and StringHashMap): hash-table backed. O(1) lookup, unordered iteration.

ArrayHashMap (and StringArrayHashMap): array-backed with hash index. O(1) lookup, insertion-order iteration, slightly more memory.

use HashMap when you only care about lookup. use ArrayHashMap when iteration order matters (e.g., rendering parameters in a URL).

generic keys with AutoHashMap

for non-string keys, AutoHashMap generates hash/eql from the key type:

var map = std.AutoHashMap(u64, MyStruct).init(allocator);
defer map.deinit();

try map.put(42, my_value);

works for integers, enums, and types with hash and eql methods.