Overview
With the stdlib index providing your map, you now dive into Zig’s collection types—the workhorses of data manipulation. This chapter explores dynamic arrays (ArrayList), hash tables (HashMap and variants), priority structures (PriorityQueue), linked lists, specialized containers like MultiArrayList and SegmentedList, and sorting algorithms (see array_list.zig and hash_map.zig). Each collection embraces Zig’s explicit allocator model, giving you control over memory lifetime and enabling leak detection during testing.
Unlike languages with implicit garbage collection, Zig collections require you to deinit() or transfer ownership explicitly. This discipline, combined with the standard library’s rich suite of adapters (unmanaged variants, sentinel-aware slices, custom contexts), makes collections both powerful and predictable. By chapter’s end, you’ll confidently choose the right structure for your use case and understand the performance trade-offs inherent in each design (see sort.zig).
Learning Goals
- Use
ArrayList(T)for dynamic arrays: append, insert, remove, iterate, and understand reallocation strategies. - Employ
HashMapandAutoHashMapfor key-value lookups with custom hash and equality functions. - Leverage
PriorityQueuefor min/max heap operations and understand comparison contexts (see priority_queue.zig). - Apply
std.sortfor in-place sorting with stable and unstable algorithms (pdqsort, block sort, insertion sort). - Recognize specialized structures:
MultiArrayListfor structure-of-arrays layout,SegmentedListfor stable pointers, linked lists for intrusive designs (see multi_array_list.zig and segmented_list.zig). - Appreciate allocator impact: how collection growth triggers reallocation and how arenas simplify bulk-free patterns (see 10).
ArrayList: Dynamic Arrays
ArrayList(T) is Zig’s foundational growable array, analogous to C++'s std::vector or Rust’s Vec<T>. It manages a contiguous slice of T values, expanding capacity as needed. You interact with .items (the current slice) and call methods like append, pop, insert, and remove.
Basic Operations
Create an ArrayList by specifying the element type and passing an allocator. Call deinit() when done to free the backing memory.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var list: std.ArrayList(i32) = .empty;
defer list.deinit(allocator);
try list.append(allocator, 10);
try list.append(allocator, 20);
try list.append(allocator, 30);
for (list.items, 0..) |item, i| {
std.debug.print("Item {d}: {d}\n", .{ i, item });
}
const popped = list.pop();
std.debug.print("Popped: {d}\n", .{popped.?});
std.debug.print("Remaining length: {d}\n", .{list.items.len});
}
$ zig build-exe arraylist_basic.zig$ ./arraylist_basicItem 0: 10
Item 1: 20
Item 2: 30
Popped: 30
Remaining length: 2ArrayList doubles capacity when full (exponential growth), amortizing reallocation cost. You can pre-allocate with try list.ensureTotalCapacity(allocator, n) if you know the final size.
Ownership and Unmanaged Variants
By default, ArrayList(T) stores its allocator internally (managed variant). For more explicit control, use the unmanaged form by accessing .items and .capacity directly, or use the deprecated Unmanaged APIs. The modern pattern is to use the simpler managed form unless you need to decouple allocation from the list itself.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
// ArrayList with explicit allocator
var list: std.ArrayList(u32) = .empty;
defer list.deinit(allocator);
try list.append(allocator, 1);
try list.append(allocator, 2);
try list.append(allocator, 3);
std.debug.print("Managed list length: {d}\n", .{list.items.len});
// Transfer ownership to a slice
const owned_slice = try list.toOwnedSlice(allocator);
defer allocator.free(owned_slice);
std.debug.print("After transfer, original list length: {d}\n", .{list.items.len});
std.debug.print("Owned slice length: {d}\n", .{owned_slice.len});
}
$ zig build-exe arraylist_ownership.zig && ./arraylist_ownershipManaged list length: 3
After transfer, original list length: 0
Owned slice length: 3toOwnedSlice() empties the list and returns the backing memory as a slice—you become responsible for freeing it with allocator.free(slice).
Insertion and Removal
Beyond append and pop, ArrayList supports mid-array operations. orderedRemove maintains element order (shifts subsequent elements), while swapRemove is O(1) but doesn’t preserve order (swaps with the last element).
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var list: std.ArrayList(i32) = .empty;
defer list.deinit(allocator);
try list.appendSlice(allocator, &.{ 1, 2, 3, 4 });
// Insert 99 at index 1
try list.insert(allocator, 1, 99);
std.debug.print("After insert at 1: {any}\n", .{list.items});
// Remove at index 2 (shifts elements)
_ = list.orderedRemove(2);
std.debug.print("After orderedRemove at 2: {any}\n", .{list.items});
// Remove at index 1 (swaps with last, no shift)
_ = list.swapRemove(1);
std.debug.print("After swapRemove at 1: {any}\n", .{list.items});
}
$ zig build-exe arraylist_insert_remove.zig && ./arraylist_insert_removeAfter insert at 1: [1, 99, 2, 3, 4]
After orderedRemove at 2: [1, 99, 3, 4]
After swapRemove at 1: [1, 4, 3]orderedRemove is O(n) in the worst case (removing the first element requires shifting all others); use swapRemove when order doesn’t matter for O(1) performance.
HashMap: Key-Value Lookups
Zig’s hash map family provides O(1) average-case lookups via open addressing and linear probing. HashMap(K, V, Context, max_load_percentage) requires a context with hash and eql functions. For convenience, AutoHashMap auto-generates these for hashable types, and StringHashMap specializes for []const u8 keys.
StringHashMap Basics
For string keys ([]const u8), use StringHashMap(V) which provides optimized string hashing. Note that AutoHashMap does not support slice types like []const u8 to avoid ambiguity—use StringHashMap instead.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var map = std.StringHashMap(i32).init(allocator);
defer map.deinit();
try map.put("foo", 42);
try map.put("bar", 100);
if (map.get("foo")) |value| {
std.debug.print("Value for 'foo': {d}\n", .{value});
}
std.debug.print("Contains 'bar': {}\n", .{map.contains("bar")});
std.debug.print("Contains 'baz': {}\n", .{map.contains("baz")});
_ = map.remove("foo");
std.debug.print("After removing 'foo', contains: {}\n", .{map.contains("foo")});
}
$ zig build-exe hashmap_basic.zig && ./hashmap_basicValue for 'foo': 42
Contains 'bar': true
Contains 'baz': false
After removing 'foo', contains: falseUse put to insert or update, get to retrieve (returns ?V), and remove to delete. Check existence with contains without retrieving the value.
StringHashMap for String Keys
When keys are []const u8, use StringHashMap(V) for optimized string hashing. Remember: the map doesn’t copy key memory—you must ensure strings outlive the map or use an arena allocator.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var population = std.StringHashMap(u32).init(allocator);
defer population.deinit();
try population.put("Seattle", 750_000);
try population.put("Austin", 950_000);
try population.put("Boston", 690_000);
var iter = population.iterator();
while (iter.next()) |entry| {
std.debug.print("City: {s}, Population: {d}\n", .{ entry.key_ptr.*, entry.value_ptr.* });
}
}
$ zig build-exe hashmap_string.zig && ./hashmap_stringCity: Seattle, Population: 750000
City: Austin, Population: 950000
City: Boston, Population: 690000String keys are not duplicated by the map—if you pass stack-allocated or temporary strings, they must remain valid. Use an arena allocator or dupe to manage key lifetimes.
Custom Hash and Equality
For types not supported by autoHash, define a context with custom hash and eql functions.
const std = @import("std");
const Point = struct {
x: i32,
y: i32,
};
const PointContext = struct {
pub fn hash(self: @This(), p: Point) u64 {
_ = self;
var hasher = std.hash.Wyhash.init(0);
std.hash.autoHash(&hasher, p.x);
std.hash.autoHash(&hasher, p.y);
return hasher.final();
}
pub fn eql(self: @This(), a: Point, b: Point) bool {
_ = self;
return a.x == b.x and a.y == b.y;
}
};
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var map = std.HashMap(Point, []const u8, PointContext, 80).init(allocator);
defer map.deinit();
try map.put(.{ .x = 10, .y = 20 }, "Alice");
try map.put(.{ .x = 30, .y = 40 }, "Bob");
var iter = map.iterator();
while (iter.next()) |entry| {
std.debug.print("Point({d}, {d}): {s}\n", .{ entry.key_ptr.x, entry.key_ptr.y, entry.value_ptr.* });
}
std.debug.print("Contains (10, 20): {}\n", .{map.contains(.{ .x = 10, .y = 20 })});
}
$ zig build-exe hashmap_custom.zig && ./hashmap_customPoint(10, 20): Alice
Point(30, 40): Bob
Contains (10, 20): trueThe context parameter in HashMap(K, V, Context, max_load_percentage) allows stateful hashing (e.g., salted hashes). For stateless contexts, pass void.
PriorityQueue: Heap-Based Priority Structures
PriorityQueue(T, Context, compareFn) implements a binary min-heap or max-heap depending on your comparison function. It supports add, peek, remove (pop the top element), and removeIndex.
Min-Heap Example
A min-heap pops the smallest element first. The comparison function returns .lt when the first argument should come before the second.
const std = @import("std");
fn lessThan(context: void, a: i32, b: i32) std.math.Order {
_ = context;
return std.math.order(a, b);
}
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var queue = std.PriorityQueue(i32, void, lessThan).init(allocator, {});
defer queue.deinit();
try queue.add(10);
try queue.add(5);
try queue.add(20);
try queue.add(1);
while (queue.removeOrNull()) |item| {
std.debug.print("Popped: {d}\n", .{item});
}
}
$ zig build-exe priorityqueue_min.zig && ./priorityqueue_minPopped: 1
Popped: 5
Popped: 10
Popped: 20For a max-heap, reverse the comparison logic: return .gt when a < b.
Priority Queue for Task Scheduling
Priority queues excel at scheduling: add tasks with priorities, then always process the highest-priority task first.
const std = @import("std");
const Task = struct {
name: []const u8,
priority: u32,
};
fn compareTasks(context: void, a: Task, b: Task) std.math.Order {
_ = context;
// Higher priority comes first (max-heap behavior)
return std.math.order(b.priority, a.priority);
}
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var queue = std.PriorityQueue(Task, void, compareTasks).init(allocator, {});
defer queue.deinit();
try queue.add(.{ .name = "Documentation", .priority = 1 });
try queue.add(.{ .name = "Feature request", .priority = 5 });
try queue.add(.{ .name = "Critical bug", .priority = 10 });
while (queue.removeOrNull()) |task| {
std.debug.print("Processing: {s} (priority {d})\n", .{ task.name, task.priority });
}
}
$ zig build-exe priorityqueue_tasks.zig && ./priorityqueue_tasksProcessing: Critical bug (priority 10)
Processing: Feature request (priority 5)
Processing: Documentation (priority 1)PriorityQueue uses a heap internally, so add is O(log n), peek is O(1), and remove is O(log n).
Sorting
Zig’s std.sort module provides multiple algorithms: insertion (stable, O(n²)), heap (unstable, O(n log n)), pdq (pattern-defeating quicksort, O(n log n) worst-case), and block (stable, O(n log n) with extra memory). The default recommendation is pdq for most use cases.
Basic Sorting
Call std.sort.pdq with a slice, a context, and a lessThan function.
const std = @import("std");
fn lessThan(context: void, a: i32, b: i32) bool {
_ = context;
return a < b;
}
fn greaterThan(context: void, a: i32, b: i32) bool {
_ = context;
return a > b;
}
pub fn main() !void {
var numbers = [_]i32{ 5, 2, 8, 1, 10 };
std.sort.pdq(i32, &numbers, {}, lessThan);
std.debug.print("Sorted ascending: {any}\n", .{numbers});
std.sort.pdq(i32, &numbers, {}, greaterThan);
std.debug.print("Sorted descending: {any}\n", .{numbers});
}
$ zig build-exe sort_basic.zig && ./sort_basicSorted ascending: [1, 2, 5, 8, 10]
Sorted descending: [10, 8, 5, 2, 1]pdq is unstable but fast. Use block if you need stability (equal elements retain their original order).
Sorting Structs
Sort by struct fields by providing a custom comparison function.
const std = @import("std");
const Person = struct {
name: []const u8,
age: u32,
};
fn byAge(context: void, a: Person, b: Person) bool {
_ = context;
return a.age < b.age;
}
pub fn main() !void {
var people = [_]Person{
.{ .name = "Alice", .age = 30 },
.{ .name = "Bob", .age = 25 },
.{ .name = "Charlie", .age = 35 },
};
std.sort.pdq(Person, &people, {}, byAge);
std.debug.print("Sorted by age:\n", .{});
for (people) |person| {
std.debug.print("{s}, age {d}\n", .{ person.name, person.age });
}
}
$ zig build-exe sort_structs.zig && ./sort_structsSorted by age:
Alice, age 30
Bob, age 25
Charlie, age 35The context parameter in sorting functions can hold state (e.g., sort direction flags or comparison modifiers). Use anytype for flexibility.
MultiArrayList: Structure-of-Arrays Layout
MultiArrayList(T) stores structs in a structure-of-arrays (SoA) format: each field is stored in its own contiguous array, improving cache locality when accessing individual fields across many elements.
const std = @import("std");
const Entity = struct {
id: u32,
x: f32,
y: f32,
};
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var entities = std.MultiArrayList(Entity){};
defer entities.deinit(allocator);
try entities.append(allocator, .{ .id = 1, .x = 10.5, .y = 20.3 });
try entities.append(allocator, .{ .id = 2, .x = 30.1, .y = 40.7 });
for (0..entities.len) |i| {
const entity = entities.get(i);
std.debug.print("Entity {d}: id={d}, x={d}, y={d}\n", .{ i, entity.id, entity.x, entity.y });
}
// Access a single field slice for efficient iteration
const x_coords = entities.items(.x);
var sum: f32 = 0;
for (x_coords) |x| {
sum += x;
}
std.debug.print("Sum of x coordinates: {d}\n", .{sum});
}
$ zig build-exe multiarraylist.zig && ./multiarraylistEntity 0: id=1, x=10.5, y=20.3
Entity 1: id=2, x=30.1, y=40.7
Sum of x coordinates: 40.6Use MultiArrayList when you frequently iterate over a single field (e.g., positions in a game engine) but rarely need the entire struct. This layout maximizes CPU cache efficiency.
SegmentedList: Stable Pointers
SegmentedList(T, prealloc_item_count) grows by allocating fixed-size segments rather than reallocating a single contiguous array. This ensures pointers to elements remain valid across insertions.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var list = std.SegmentedList(i32, 4){};
defer list.deinit(allocator);
try list.append(allocator, 10);
try list.append(allocator, 20);
// Take a pointer to the first element
const first_ptr = list.at(0);
std.debug.print("First item: {d}\n", .{first_ptr.*});
// Append more items - pointer remains valid!
try list.append(allocator, 30);
std.debug.print("First item (after append): {d}\n", .{first_ptr.*});
std.debug.print("List length: {d}\n", .{list.len});
}
$ zig build-exe segmentedlist.zig && ./segmentedlistFirst item: 10
First item (after append): 10
List length: 3Unlike ArrayList, pointers to SegmentedList elements remain valid even as you add more items. Use this when you need stable addressing (e.g., storing pointers in other data structures).
Linked Lists
Zig provides DoublyLinkedList(T) and SinglyLinkedList(T) as intrusive linked lists: nodes embed the link pointers directly (see DoublyLinkedList.zig and SinglyLinkedList.zig). This avoids allocator overhead per node and integrates naturally with existing structs.
const std = @import("std");
const Node = struct {
data: i32,
link: std.DoublyLinkedList.Node = .{},
};
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var list = std.DoublyLinkedList{};
var node1 = try allocator.create(Node);
node1.* = .{ .data = 10 };
list.append(&node1.link);
var node2 = try allocator.create(Node);
node2.* = .{ .data = 20 };
list.append(&node2.link);
var node3 = try allocator.create(Node);
node3.* = .{ .data = 30 };
list.append(&node3.link);
var it = list.first;
while (it) |node| : (it = node.next) {
const data_node: *Node = @fieldParentPtr("link", node);
std.debug.print("Node: {d}\n", .{data_node.data});
}
// Clean up
allocator.destroy(node1);
allocator.destroy(node2);
allocator.destroy(node3);
}
$ zig build-exe linkedlist.zig && ./linkedlistNode: 10
Node: 20
Node: 30Intrusive lists don’t own node memory—you allocate and manage nodes yourself. This is powerful but requires discipline to avoid use-after-free bugs.
Specialized Maps
ArrayHashMap
ArrayHashMap stores keys and values in separate arrays, preserving insertion order and enabling iteration by index (see array_hash_map.zig).
StaticStringMap
StaticStringMap(V) is a compile-time perfect hash map for string keys—fast lookups with zero runtime allocation or hashing overhead (see static_string_map.zig).
const std = @import("std");
const status_codes = std.StaticStringMap(u32).initComptime(.{
.{ "ok", 200 },
.{ "created", 201 },
.{ "not_found", 404 },
.{ "server_error", 500 },
});
pub fn main() !void {
std.debug.print("Status code for 'ok': {d}\n", .{status_codes.get("ok").?});
std.debug.print("Status code for 'not_found': {d}\n", .{status_codes.get("not_found").?});
std.debug.print("Status code for 'server_error': {d}\n", .{status_codes.get("server_error").?});
}
$ zig build-exe static_string_map.zig && ./static_string_mapStatus code for 'ok': 200
Status code for 'not_found': 404
Status code for 'server_error': 500Use StaticStringMap for compile-time constant mappings (e.g., keyword tables, command parsers). It compiles to optimal switch statements or lookup tables.
Allocator Impact on Collections
Every collection requires an allocator, either passed at initialization (ArrayList(T).init(allocator)) or per operation (unmanaged variants). Growth strategies trigger reallocations, and failure returns error.OutOfMemory (see 10).
Arena Pattern for Bulk-Free
When building temporary collections that live for a single scope, use an arena allocator to free everything at once.
const std = @import("std");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer _ = gpa.deinit();
const allocator = gpa.allocator();
var arena = std.heap.ArenaAllocator.init(allocator);
defer arena.deinit();
const arena_allocator = arena.allocator();
var list: std.ArrayList(i32) = .empty;
for (0..1000) |i| {
try list.append(arena_allocator, @intCast(i));
}
std.debug.print("List has {d} items\n", .{list.items.len});
var map = std.AutoHashMap(u32, []const u8).init(arena_allocator);
for (0..500) |i| {
try map.put(@intCast(i), "value");
}
std.debug.print("Map has {d} entries\n", .{map.count()});
// No need to call list.deinit() or map.deinit()
// arena.deinit() frees everything at once
std.debug.print("All freed at once via arena.deinit()\n", .{});
}
$ zig build-exe collections_arena.zig && ./collections_arenaList has 1000 items
Map has 500 entries
All freed at once via arena.deinit()The arena doesn’t call individual collection deinit() methods. It frees all memory at once. Use this pattern when you know collections won’t outlive the arena’s scope (see 10).
Performance Considerations
- ArrayList growth: Doubling capacity amortizes reallocation cost, but large allocations may fail. Pre-allocate if size is known.
- HashMap load factor: Default
max_load_percentageis 80%. Higher values save memory but increase collision chains. - Sort stability:
pdqis fastest but unstable. Useblockorinsertionwhen order of equal elements matters. - MultiArrayList cache: SoA layout shines when iterating single fields but adds indirection overhead for full-struct access.
- SegmentedList segments: Smaller
prealloc_item_countmeans more segments (more allocations); larger values waste memory if lists stay small.
Exercises
- Implement a
FrequencyMapusingStringHashMap(u32)that counts word occurrences in a text file, then print the top-10 most frequent words using aPriorityQueue. - Compare
ArrayListvsSegmentedListperformance: create 10,000 items, take pointers to the first 100, then append 10,000 more. Verify pointers remain valid withSegmentedListbut may invalidate withArrayList. - Write an
LRUcache usingHashMapfor lookups andDoublyLinkedListfor eviction order. When capacity is reached, remove the least-recently-used item. - Sort an
ArrayListof structs by multiple keys (e.g., sort byage, then bynamefor ties) using a custom comparator andstd.sort.pdq.
Caveats, Alternatives, Edge Cases
- Unmanaged variants: Most collections have unmanaged counterparts (e.g.,
ArrayListUnmanaged(T)) for manual allocator threading, useful in generic code or when embedding collections in structs. - HashMap key lifetimes: Maps don’t duplicate keys. Ensure key memory outlives the map, or use an arena allocator to manage key storage collectively.
- Iterator invalidation: Like C++, modifying a collection (append, remove) may invalidate iterators or pointers to elements. Always check documentation for each operation.
- Stable vs unstable sort: If your data has equal elements that must maintain relative order (e.g., sorting a table by column but preserving row order for ties), use
std.sort.blockorinsertion, notpdq. - Treap: Zig also provides
std.Treap, a tree-heap hybrid for ordered maps with probabilistic balancing, useful when you need both sorted iteration and O(log n) operations (see treap.zig).