Overview
In this project, we turn the ideas from 15 into a practical workflow: generate small lookup tables at compile time and use them at runtime with zero overhead. This technique removes branches from hot loops, replaces repeated work with constant data, and keeps code simple. We’ll take a “measure-first” mindset and show when a table helps and when it’s not worth the binary size.
We’ll implement three self-contained demos:
- ASCII classification table: constant-time character categorization (digit/alpha/space/punct)
- Popcount table: fast bit counting for bytes, composable for larger aggregates
- Multiplication table: a parameterized N×N matrix rendered compactly
Each example uses Zig’s modern stdout writer (see the Writergate changes) and prints visible results when run directly. See v0.15.2 and ascii.zig.
Learning Goals
ASCII classification table
We construct a 256-entry table mapping bytes to bitmasks for digit/alpha/space/punct. At runtime, we summarize an input string. The “punctuation” set is derived from isPrint && !isAlphanumeric && !isWhitespace (sufficient for ASCII).
const std = @import("std");
/// Helper function to obtain a buffered standard output writer.
/// Uses a static buffer to avoid repeated allocations.
fn stdout() *std.Io.Writer {
const g = struct {
var buf: [4096]u8 = undefined;
var w = std.fs.File.stdout().writer(&buf);
};
return &g.w.interface;
}
/// Bit flags representing ASCII character classes.
/// Multiple flags can be combined using bitwise OR.
const Class = struct {
pub const digit: u8 = 0x01; // 0-9
pub const alpha: u8 = 0x02; // A-Z, a-z
pub const space: u8 = 0x04; // space, newline, tab, carriage return
pub const punct: u8 = 0x08; // punctuation characters
};
/// Builds a lookup table mapping each byte (0-255) to its character class flags.
/// This function runs at compile time, producing a constant table embedded in the binary.
fn buildAsciiClassTable() [256]u8 {
// Initialize all entries to 0 (no class flags set)
var t: [256]u8 = .{0} ** 256;
// Iterate over all possible byte values at compile time
comptime var b: usize = 0;
inline while (b < 256) : (b += 1) {
const ch: u8 = @intCast(b);
var m: u8 = 0; // Accumulator for class flags
// Check if character is a digit (0-9)
if (ch >= '0' and ch <= '9') m |= Class.digit;
// Check if character is alphabetic (A-Z or a-z)
if ((ch >= 'A' and ch <= 'Z') or (ch >= 'a' and ch <= 'z')) m |= Class.alpha;
// Check if character is whitespace (space, newline, tab, carriage return)
if (ch == ' ' or ch == '\n' or ch == '\t' or ch == '\r') m |= Class.space;
// Check if character is punctuation (printable, non-alphanumeric, non-whitespace)
if (std.ascii.isPrint(ch) and !std.ascii.isAlphanumeric(ch) and !std.ascii.isWhitespace(ch)) m |= Class.punct;
// Store the computed flags for this byte value
t[b] = m;
}
return t;
}
/// Counts occurrences of each character class in the input string.
/// Uses the precomputed lookup table for O(1) classification per character.
fn countKinds(s: []const u8) struct { digits: usize, letters: usize, spaces: usize, punct: usize } {
// Build the classification table (happens at compile time)
const T = buildAsciiClassTable();
// Initialize counters for each character class
var c = struct { digits: usize = 0, letters: usize = 0, spaces: usize = 0, punct: usize = 0 }{};
// Iterate through each byte in the input string
var i: usize = 0;
while (i < s.len) : (i += 1) {
// Look up the class flags for the current byte
const m = T[s[i]];
// Test each flag and increment the corresponding counter
if ((m & Class.digit) != 0) c.digits += 1;
if ((m & Class.alpha) != 0) c.letters += 1;
if ((m & Class.space) != 0) c.spaces += 1;
if ((m & Class.punct) != 0) c.punct += 1;
}
// Return the counts as an anonymous struct
return .{ .digits = c.digits, .letters = c.letters, .spaces = c.spaces, .punct = c.punct };
}
pub fn main() !void {
// Get buffered output writer
const out = stdout();
// Define test string containing various character classes
const s = "Hello, Zig 0.15.2! \t\n";
// Count each character class in the test string
const c = countKinds(s);
// Print the input string
try out.print("input: {s}\n", .{s});
// Print the computed counts for each character class
try out.print("digits={} letters={} spaces={} punct={}\n", .{ c.digits, c.letters, c.spaces, c.punct });
// Ensure buffered output is written to stdout
try out.flush();
}
$ zig run ascii_class_table.ziginput: Hello, Zig 0.15.2!
digits=4 letters=8 spaces=6 punct=4Tables like this remove repeated branching in inner loops. Keep the derivation logic easy to audit, and prefer std.ascii helpers where possible. 15
Popcount table for bytes
Rather than call a bit-twiddling routine per byte, we bake a 256-entry popcount table and reduce across inputs. This scales from toy examples to “count set bits in a buffer” primitives.
const std = @import("std");
/// Returns a reference to a buffered stdout writer.
/// The buffer and writer are stored in a private struct to persist across calls.
fn stdout() *std.Io.Writer {
const g = struct {
// Static buffer for stdout writes—survives function returns
var buf: [4096]u8 = undefined;
// Writer wraps stdout with the buffer; created once
var w = std.fs.File.stdout().writer(&buf);
};
// Return pointer to the writer's generic interface
return &g.w.interface;
}
/// Counts the number of set bits (1s) in a single byte using bit manipulation.
/// Uses a well-known parallel popcount algorithm that avoids branches.
fn popcountByte(x: u8) u8 {
var v = x;
// Step 1: Count bits in pairs (2-bit groups)
// Subtracts neighbor bit from each 2-bit group to get counts 0-2
v = v - ((v >> 1) & 0x55);
// Step 2: Count bits in nibbles (4-bit groups)
// Adds adjacent 2-bit counts to get nibble counts 0-4
v = (v & 0x33) + ((v >> 2) & 0x33);
// Step 3: Combine nibbles and mask low 4 bits (result 0-8)
// Adding the two nibbles gives total count, truncate to u8
return @truncate(((v + (v >> 4)) & 0x0F));
}
/// Builds a 256-entry lookup table at compile time.
/// Each entry [i] holds the number of set bits in byte value i.
fn buildPopcountTable() [256]u8 {
// Initialize table with zeros (all 256 entries)
var t: [256]u8 = .{0} ** 256;
// Compile-time loop index (required for inline while)
comptime var i: usize = 0;
// Unrolled loop: compute popcount for each possible byte value
inline while (i < 256) : (i += 1) {
// Store the bit count for byte value i
t[i] = popcountByte(@intCast(i));
}
// Return the fully populated table as a compile-time constant
return t;
}
pub fn main() !void {
// Acquire the buffered stdout writer
const out = stdout();
// Generate the popcount lookup table at compile time
const T = buildPopcountTable();
// Test data: array of bytes to analyze
const bytes = [_]u8{ 0x00, 0x0F, 0xF0, 0xAA, 0xFF };
// Accumulator for total set bits across all test bytes
var sum: usize = 0;
// Sum up set bits by indexing into the precomputed table
for (bytes) |b| sum += T[b];
// Print label for the output
try out.print("bytes: ", .{});
// Print each byte in hex format with spacing
for (bytes, 0..) |b, idx| {
// Add space separator between bytes (not before first)
if (idx != 0) try out.print(" ", .{});
// Format as 0x-prefixed 2-digit hex (e.g., 0x0F)
try out.print("0x{X:0>2}", .{b});
}
// Print the final sum of all set bits
try out.print(" -> total set bits = {}\n", .{sum});
// Flush the buffered writer to ensure all output appears
try out.flush();
}
$ zig run popcount_table.zigbytes: 0x00 0x0F 0xF0 0xAA 0xFF -> total set bits = 20In many workloads, the CPU’s POPCNT instruction (or std.math.popCount) is already fast. Prefer a table only when your profile shows that it helps for your data access pattern and platform. 52
Parameterized multiplication table (N×N)
Here the table dimension is a comptime parameter, so the compiler unrolls generation and stores a compact [N][N]u16. We format a 12×12 “times table” and only print a subset to keep output readable.
const std = @import("std");
/// Returns a reference to a buffered stdout writer.
/// The buffer and writer are stored in a private struct to persist across calls.
fn stdout() *std.Io.Writer {
const g = struct {
// Static buffer for stdout writes—survives function returns
var buf: [8192]u8 = undefined;
// Writer wraps stdout with the buffer; created once
var w = std.fs.File.stdout().writer(&buf);
};
// Return pointer to the writer's generic interface
return &g.w.interface;
}
/// Builds an N×N multiplication table at compile time.
/// Each cell [i][j] holds (i+1) * (j+1) (1-indexed).
fn buildMulTable(comptime N: usize) [N][N]u16 {
// Declare the result table; will be computed entirely at compile time
var t: [N][N]u16 = undefined;
// Outer loop: row index (compile-time variable required for inline while)
comptime var i: usize = 0;
inline while (i < N) : (i += 1) {
// Inner loop: column index
comptime var j: usize = 0;
inline while (j < N) : (j += 1) {
// Store (row+1) * (col+1) in the table
t[i][j] = @intCast((i + 1) * (j + 1));
}
}
// Return the fully populated table as a compile-time constant
return t;
}
pub fn main() !void {
// Acquire the buffered stdout writer
const out = stdout();
// Table dimension (classic 12×12 times table)
const N = 12;
// Generate the multiplication table at compile time
const T = buildMulTable(N);
// Print header line
try out.print("{s}x{s} multiplication table (partial):\n", .{ "12", "12" });
// Print only first 6 rows to keep output concise (runtime loop)
var i: usize = 0;
while (i < 6) : (i += 1) {
// Print all 12 columns for this row
var j: usize = 0;
while (j < N) : (j += 1) {
// Format each cell right-aligned in a 4-character field
try out.print("{d: >4}", .{T[i][j]});
}
// End the row with a newline
try out.print("\n", .{});
}
// Flush the buffered writer to ensure all output appears
try out.flush();
}
$ zig run mult_table.zig12x12 multiplication table (partial):
1 2 3 4 5 6 7 8 9 10 11 12
2 4 6 8 10 12 14 16 18 20 22 24
3 6 9 12 15 18 21 24 27 30 33 36
4 8 12 16 20 24 28 32 36 40 44 48
5 10 15 20 25 30 35 40 45 50 55 60
6 12 18 24 30 36 42 48 54 60 66 72inline while/for constructs need compile-time-known bounds; pairing them with comptime var indices makes intent explicit. Choose ordinary loops unless you have a reason to unroll. 15
Notes & Caveats
- Binary size vs speed: tables cost memory. Use them when they remove meaningful work from a hot path and your binary budget allows it. 41
- Portability: ASCII classification is straightforward; Unicode requires a different strategy (tables of ranges/pages or a library). 47
- I/O: The examples use the Zig 0.15.2
std.Io.Writerinterface with a buffer in the interface—don’t forget to callflush().
Exercises
- Extend the ASCII table with additional classes (hex digits, control) and print a histogram for arbitrary input files.
- Generate a
crc32orcrc16table at compile time and validate against a known test vector at runtime (as a small end-to-end demo). 15 - Parameterize the multiplication table’s cell formatter to align at different widths; measure the impact on readability and code size. 47
Alternatives & Edge Cases
- Table invalidation: if inputs change shape (e.g., switching from ASCII to UTF-8 code points), document assumptions prominently and introduce compile-time assertions to catch misuse early. 37
- Micro-architectural effects: depending on cache behavior, a branchy routine can outperform a table walk; profile with realistic data. 42
- For tables much larger than CPU caches, consider on-demand generation, chunking, or precomputed assets loaded from disk rather than embedding in the binary. 28