Chapter 50: Random and Math

Overview

With compression pipelines in place from the previous chapter, we now zoom in on the numeric engines that feed those workflows: deterministic pseudo-random number generators, well-behaved math helpers, and hashing primitives that balance speed and security. Zig 0.15.2 keeps these components modular: std.Random builds reproducible sequences, std.math provides careful tolerances and constants, and the standard library splits hashing into non-cryptographic and cryptographic families so you can choose the right tool per workload (see math.zig, wyhash.zig, and sha2.zig).

Learning Goals

  • Seed, advance, and reproduce std.Random generators while sampling common distributions (see Xoshiro256.zig).
  • Apply std.math utilities (constants, clamps, tolerances, and geometry helpers) to keep numeric code stable (see hypot.zig).
  • Distinguish fast hashers like Wyhash from cryptographic digests such as SHA-256, and wire both into file-processing jobs responsibly.

Random number foundations

Zig exposes pseudo-random generators as first-class values: you seed an engine, ask it for integers, floats, or indices, and your code owns the state transitions. That transparency gives you control over fuzzers, simulations, and deterministic tests (see Random.zig).

Deterministic generators with reproducible sequences

std.Random.DefaultPrng wraps Xoshiro256++, seeding itself via SplitMix64 when you call init(seed). From there, random() hands you a std.Random facade that exposes high-level helpers (ranges, shuffles, floats) while the engine struct you own keeps the underlying state.

Zig
const std = @import("std");

pub fn main() !void {
    var stdout_buffer: [4096]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buffer);
    const stdout = &stdout_writer.interface;

    const seed: u64 = 0x0006_7B20; // 424,736 in decimal
    var prng = std.Random.DefaultPrng.init(seed);
    const rand = prng.random(); // the facade itself is never mutated, so it must be const

    const dice_roll = rand.intRangeAtMost(u8, 1, 6);
    const coin = if (rand.boolean()) "heads" else "tails";
    var ladder = [_]u8{ 0, 1, 2, 3, 4, 5 };
    rand.shuffle(u8, ladder[0..]);

    const unit_float = rand.float(f64);

    var reproducible = [_]u32{ undefined, undefined, undefined };
    var check_prng = std.Random.DefaultPrng.init(seed);
    const check_rand = check_prng.random();
    for (&reproducible) |*slot| {
        slot.* = check_rand.int(u32);
    }

    try stdout.print("seed=0x{X:0>8}\n", .{seed});
    try stdout.print("d6 roll -> {d}\n", .{dice_roll});
    try stdout.print("coin flip -> {s}\n", .{coin});
    try stdout.print("shuffled ladder -> {any}\n", .{ladder});
    try stdout.print("unit float -> {d:.6}\n", .{unit_float});
    try stdout.print("first three u32 -> {any}\n", .{reproducible});

    try stdout.flush();
}
Run
Shell
$ zig run prng_sequences.zig
Output
Shell
seed=0x00067B20
d6 roll -> 5
coin flip -> tails
shuffled ladder -> { 0, 4, 3, 2, 5, 1 }
unit float -> 0.742435
first three u32 -> { 2135551917, 3874178402, 2563214192 }

The fairness guarantees of uintLessThan hinge on the generator's uniform output, and it removes modulo bias by occasionally drawing again, so its running time varies. Fall back to uintLessThanBiased when constant-time behavior matters more than a perfectly uniform distribution.
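To make that trade-off concrete, here is a minimal sketch that rolls a six-sided die through both helpers and tallies the results. The function names rollFair and rollBiased are hypothetical, chosen for illustration; only uintLessThan and uintLessThanBiased come from std.Random.

```zig
const std = @import("std");

/// Hypothetical helper: roll a die with `sides` faces using the unbiased call.
fn rollFair(rand: std.Random, sides: u8) u8 {
    // uintLessThan returns a value in [0, sides) and may draw more than once
    // from the generator to remove modulo bias.
    return rand.uintLessThan(u8, sides) + 1;
}

/// Hypothetical helper: same range, but a fixed amount of work per call.
fn rollBiased(rand: std.Random, sides: u8) u8 {
    return rand.uintLessThanBiased(u8, sides) + 1;
}

pub fn main() void {
    var prng = std.Random.DefaultPrng.init(0x0006_7B20);
    const rand = prng.random();

    var fair_counts = [_]u32{0} ** 6;
    var biased_counts = [_]u32{0} ** 6;
    for (0..60_000) |_| {
        fair_counts[rollFair(rand, 6) - 1] += 1;
        biased_counts[rollBiased(rand, 6) - 1] += 1;
    }

    std.debug.print("fair   -> {any}\n", .{fair_counts});
    std.debug.print("biased -> {any}\n", .{biased_counts});
}
```

For a bound as small as 6 the bias is far too small to see in 60,000 rolls; it only becomes noticeable when the bound approaches the width of the underlying integer type.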

Working with distributions and sampling heuristics

Beyond uniform draws, Random.floatNorm and Random.floatExp expose Ziggurat-backed normal and exponential samples, ideal for synthetic workloads or noise injection (see ziggurat.zig). Weighted choices come from weightedIndex, while jump() on Xoshiro engines deterministically leaps ahead by 2^128 steps to partition streams across threads without overlap (see Chapter 29). For cryptographic uses, swap to std.crypto.random or std.Random.DefaultCsprng to get a ChaCha-based generator rather than a fast-but-predictable PRNG; keep in mind that DefaultCsprng is only as unpredictable as the seed you hand it (see tlcsprng.zig).
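The sampling helpers above can be sketched as follows. The scaling constants (mean 100, standard deviation 15, mean gap 2.5) and the splitStreams helper are assumptions made up for this illustration; floatNorm, floatExp, weightedIndex, and jump are the calls named in the text.

```zig
const std = @import("std");

/// Hypothetical helper: derive two generators from one seed whose
/// sequences start 2^128 steps apart.
fn splitStreams(seed: u64) [2]std.Random.DefaultPrng {
    const first = std.Random.DefaultPrng.init(seed);
    var second = first; // plain struct copy of the generator state
    second.jump(); // advance the copy by 2^128 steps
    return .{ first, second };
}

pub fn main() void {
    var prng = std.Random.DefaultPrng.init(42);
    const rand = prng.random();

    // floatNorm yields a standard normal sample; rescale it to mean 100, stddev 15.
    const latency_ms = 100.0 + 15.0 * rand.floatNorm(f64);
    // floatExp yields an exponential sample with rate 1; rescale to a mean of 2.5.
    const gap_s = 2.5 * rand.floatExp(f64);

    // Index 0 is chosen roughly 70% of the time, index 2 roughly 10%.
    const weights = [_]f32{ 0.7, 0.2, 0.1 };
    const bucket = rand.weightedIndex(f32, &weights);

    var streams = splitStreams(42);
    const a = streams[0].random().int(u64);
    const b = streams[1].random().int(u64);

    std.debug.print("normal sample  -> {d:.3}\n", .{latency_ms});
    std.debug.print("exp sample     -> {d:.3}\n", .{gap_s});
    std.debug.print("weighted index -> {d}\n", .{bucket});
    std.debug.print("stream A / B   -> {d} / {d}\n", .{ a, b });
}
```

Handing each worker its own element of the returned array keeps the streams independent without any shared state.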

Practical math utilities

The std.math namespace combines fundamental constants with measured utilities: clamps, approximate equality, and geometry helpers all share consistent semantics across CPU targets.

Numeric hygiene toolkit

Combining a handful of helpers (sqrt, clamp, approximate equality, and the golden ratio constant) keeps reporting code readable and portable (see sqrt.zig).

Zig
const std = @import("std");

pub fn main() !void {
    var stdout_buffer: [4096]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buffer);
    const stdout = &stdout_writer.interface;

    const m = std.math;
    const latencies = [_]f64{ 0.94, 1.02, 0.87, 1.11, 0.99, 1.05 };

    var sum: f64 = 0;
    var sum_sq: f64 = 0;
    var minimum = latencies[0];
    var maximum = latencies[0];
    for (latencies) |value| {
        sum += value;
        sum_sq += value * value;
        minimum = @min(minimum, value);
        maximum = @max(maximum, value);
    }

    const mean = sum / @as(f64, @floatFromInt(latencies.len));
    const rms = m.sqrt(sum_sq / @as(f64, @floatFromInt(latencies.len)));
    const normalized = m.clamp((mean - 0.8) / 0.6, 0.0, 1.0);

    const turn_degrees: f64 = 72.0;
    const turn_radians = turn_degrees * m.rad_per_deg;
    const right_angle = m.pi / 2.0;
    const approx_right = m.approxEqRel(f64, turn_radians, right_angle, 1e-12);

    const hyp = m.hypot(3.0, 4.0);

    try stdout.print("sample count -> {d}\n", .{latencies.len});
    try stdout.print("min/max -> {d:.2} / {d:.2}\n", .{ minimum, maximum });
    try stdout.print("mean -> {d:.3}\n", .{mean});
    try stdout.print("rms -> {d:.3}\n", .{rms});
    try stdout.print("normalized mean -> {d:.3}\n", .{normalized});
    try stdout.print("72deg in rad -> {d:.6}\n", .{turn_radians});
    try stdout.print("close to right angle? -> {s}\n", .{if (approx_right) "yes" else "no"});
    try stdout.print("hypot(3,4) -> {d:.1}\n", .{hyp});
    try stdout.print("phi constant -> {d:.9}\n", .{m.phi});

    try stdout.flush();
}
Run
Shell
$ zig run math_inspector.zig
Output
Shell
sample count -> 6
min/max -> 0.87 / 1.11
mean -> 0.997
rms -> 1.000
normalized mean -> 0.328
72deg in rad -> 1.256637
close to right angle? -> no
hypot(3,4) -> 5.0
phi constant -> 1.618033989

Prefer approxEqRel for large-magnitude comparisons and approxEqAbs near zero; both return false when either operand is NaN, so an invalid result never compares as "close enough".
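A short sketch of why the two checks complement each other. The tolerances (1e-12 absolute, 1e-9 relative) and the nearlyEqual helper are illustrative assumptions, not recommendations from the standard library.

```zig
const std = @import("std");
const math = std.math;

/// Hypothetical helper: the absolute check catches values near zero,
/// the relative check scales with large magnitudes.
fn nearlyEqual(x: f64, y: f64, abs_tol: f64, rel_tol: f64) bool {
    return math.approxEqAbs(f64, x, y, abs_tol) or
        math.approxEqRel(f64, x, y, rel_tol);
}

pub fn main() void {
    const tiny_a: f64 = 1e-20;
    const tiny_b: f64 = 2e-20;
    const big_a: f64 = 1e12;
    const big_b: f64 = 1e12 + 0.001;
    const nan = math.nan(f64);

    std.debug.print("tiny pair: abs={} rel={}\n", .{
        math.approxEqAbs(f64, tiny_a, tiny_b, 1e-12),
        math.approxEqRel(f64, tiny_a, tiny_b, 1e-9),
    });
    std.debug.print("big pair:  abs={} rel={}\n", .{
        math.approxEqAbs(f64, big_a, big_b, 1e-12),
        math.approxEqRel(f64, big_a, big_b, 1e-9),
    });
    std.debug.print("NaN pair:  combined={}\n", .{
        nearlyEqual(nan, nan, 1e-12, 1e-9),
    });
}
```

The tiny pair passes only the absolute check, the large pair passes only the relative check, and the NaN pair fails both.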

Tolerances, scaling, and derived quantities

Angular conversions use rad_per_deg/deg_per_rad, while hypot preserves precision in Pythagorean calculations by avoiding overflow and underflow in the intermediate squares. When chaining transforms, keep intermediate results in f64 even if your public API uses narrower floats; the std.math helpers are generic over the float type, so widen at the boundary and narrow once with @floatCast at the end (see Chapter 39).
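The following sketch contrasts the textbook formula with hypot on large inputs and shows one way to keep an f32 API backed by f64 intermediates. The names naiveHypot and distance are hypothetical, and 1e200 is simply a value whose square exceeds the f64 range.

```zig
const std = @import("std");
const math = std.math;

/// Textbook formula: the squares can overflow to infinity long before
/// the true result leaves the f64 range.
fn naiveHypot(x: f64, y: f64) f64 {
    return @sqrt(x * x + y * y);
}

/// Hypothetical f32 API that does its arithmetic in f64 and narrows once.
fn distance(x: f32, y: f32) f32 {
    const wide = math.hypot(@as(f64, x), @as(f64, y));
    return @floatCast(wide);
}

pub fn main() void {
    const big: f64 = 1e200;
    std.debug.print("naive sqrt(x*x + y*y) -> {e}\n", .{naiveHypot(big, big)});
    std.debug.print("std.math.hypot        -> {e}\n", .{math.hypot(big, big)});

    // Degrees to radians and back again.
    const degrees: f64 = 72.0;
    const radians = degrees * math.rad_per_deg;
    std.debug.print("72deg round trip      -> {d:.6}\n", .{radians * math.deg_per_rad});

    std.debug.print("distance(3, 4)        -> {d:.1}\n", .{distance(3.0, 4.0)});
}
```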

Hashing: reproducibility versus integrity

Zig splits hashing strategies sharply: std.hash families target speed and low collision rates for in-memory buckets, whereas std.crypto.hash.sha2 delivers standardized digests for integrity checks or signature pipelines.

Non-cryptographic hashing for buckets

std.hash.Wyhash.hash produces a 64-bit value seeded however you like, ideal for hash maps or bloom filters where avalanche properties matter more than resistance to adversaries. If you need structured hashing with compile-time type awareness, std.hash.autoHash walks your fields recursively and feeds them into whichever hasher you pass in (see Chapter 44 and auto_hash.zig).
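As a minimal sketch of both entry points: the Key struct, the bucket count, and the helper names hashKey and bucketFor are invented for this example; Wyhash.hash, Wyhash.init, and autoHash are the calls described above.

```zig
const std = @import("std");

/// Hypothetical composite key; autoHash visits each field in turn.
const Key = struct {
    tenant: u32,
    shard: u16,
};

/// Hash a structured key by feeding its fields into a seeded Wyhash state.
fn hashKey(seed: u64, key: Key) u64 {
    var hasher = std.hash.Wyhash.init(seed);
    std.hash.autoHash(&hasher, key);
    return hasher.final();
}

/// Map a byte string onto one of `bucket_count` buckets.
fn bucketFor(name: []const u8, bucket_count: u64) u64 {
    return std.hash.Wyhash.hash(0, name) % bucket_count;
}

pub fn main() void {
    const key = Key{ .tenant = 7, .shard = 3 };
    std.debug.print("hashKey(seed=0) -> 0x{x:0>16}\n", .{hashKey(0, key)});
    std.debug.print("hashKey(seed=1) -> 0x{x:0>16}\n", .{hashKey(1, key)});
    std.debug.print("bucket for \"payload preview\" -> {d}\n", .{bucketFor("payload preview", 16)});
}
```

Changing the seed changes the output for the same key, which is useful when you want per-process variation in bucket placement; it does not make the hash safe against deliberate collisions.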

SHA-256 digest pipeline with pragmatic guardrails

Even when your CLI only needs a checksum, treat SHA-256 as an integrity primitive—not an authenticity guarantee—and document that difference for users.

Zig
const std = @import("std");

pub fn main() !void {
    var stdout_buffer: [4096]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buffer);
    const stdout = &stdout_writer.interface;

    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    const input_path = if (args.len > 1) args[1] else "payload.txt";

    var file = try std.fs.cwd().openFile(input_path, .{ .mode = .read_only });
    defer file.close();

    var sha256 = std.crypto.hash.sha2.Sha256.init(.{});
    var buffer: [4096]u8 = undefined;
    while (true) {
        const read = try file.read(&buffer);
        if (read == 0) break;
        sha256.update(buffer[0..read]);
    }

    var digest: [std.crypto.hash.sha2.Sha256.digest_length]u8 = undefined;
    sha256.final(&digest);

    const sample = "payload preview";
    const wyhash = std.hash.Wyhash.hash(0, sample);

    try stdout.print("wyhash(seed=0) {s} -> 0x{x:0>16}\n", .{ sample, wyhash });
    const hex_digest = std.fmt.bytesToHex(digest, .lower);
    try stdout.print("sha256({s}) ->\n  {s}\n", .{ input_path, hex_digest });
    try stdout.print("(remember: sha256 certifies integrity, not authenticity.)\n", .{});

    try stdout.flush();
}
Run
Shell
$ zig run hash_digest_tool.zig -- chapters-data/code/50__random-and-math/payload.txt
Output
Shell
wyhash(seed=0) payload preview -> 0x30297ecbb2bd0c02
sha256(chapters-data/code/50__random-and-math/payload.txt) ->
  0498ca2116fb55b7a502d0bf3ad5d0e0b3f4e23ad919bdc0f9f151ca3637a6fa
(remember: sha256 certifies integrity, not authenticity.)

When hashing large files, stream through a reusable buffer, as the read loop above does. For argument parsing, a single arena allocator avoids churning the general-purpose allocator (see Chapter 10 and fmt.zig).
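Streaming works because feeding SHA-256 in pieces yields the same digest as hashing everything at once. The sketch below checks that on an in-memory message; digestInChunks is a hypothetical helper and the chunk size of one byte is chosen only to exaggerate the effect.

```zig
const std = @import("std");
const Sha256 = std.crypto.hash.sha2.Sha256;

/// Hypothetical helper: hash `data` by feeding it in pieces of `chunk_len` bytes.
fn digestInChunks(data: []const u8, chunk_len: usize) [Sha256.digest_length]u8 {
    std.debug.assert(chunk_len > 0);
    var hasher = Sha256.init(.{});
    var offset: usize = 0;
    while (offset < data.len) {
        const end = @min(offset + chunk_len, data.len);
        hasher.update(data[offset..end]);
        offset = end;
    }
    var out: [Sha256.digest_length]u8 = undefined;
    hasher.final(&out);
    return out;
}

pub fn main() void {
    const message = "abc";

    // One-shot digest over the whole message.
    var one_shot: [Sha256.digest_length]u8 = undefined;
    Sha256.hash(message, &one_shot, .{});

    // Same message, one byte per update call.
    const streamed = digestInChunks(message, 1);

    const hex = std.fmt.bytesToHex(streamed, .lower);
    std.debug.print("sha256(\"abc\") -> {s}\n", .{&hex});
    std.debug.print("matches one-shot? {}\n", .{std.mem.eql(u8, &one_shot, &streamed)});
}
```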

Notes & Caveats

  • Random structs are not thread-safe; give each worker its own generator or guard a shared one with a mutex to avoid shared-state races (see Chapter 29).
  • std.math functions honor IEEE-754 NaN propagation; never rely on comparisons after invalid operations without explicit checks.
  • Cryptographic digests should be paired with signature checks, HMACs, or trusted distribution; SHA-256 alone detects accidental corruption, not deliberate tampering by someone who can also replace the digest (see hash_composition.zig).
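To illustrate the last point, here is a minimal sketch of a keyed tag using HMAC-SHA-256 from std.crypto.auth.hmac. The key and message strings and the tag helper are placeholders invented for this example; a real deployment needs a secret key from a secure source and a constant-time comparison when verifying.

```zig
const std = @import("std");
const HmacSha256 = std.crypto.auth.hmac.sha2.HmacSha256;

/// Hypothetical helper: compute an HMAC-SHA-256 tag over `message` with `key`.
fn tag(key: []const u8, message: []const u8) [HmacSha256.mac_length]u8 {
    var out: [HmacSha256.mac_length]u8 = undefined;
    HmacSha256.create(&out, message, key);
    return out;
}

pub fn main() void {
    const message = "payload preview";

    // Placeholder keys for illustration only.
    const good = tag("shared-secret", message);
    const forged = tag("attacker-guess", message);

    const hex = std.fmt.bytesToHex(good, .lower);
    std.debug.print("hmac-sha256 -> {s}\n", .{&hex});

    // std.mem.eql is used here for brevity; it is not constant-time.
    std.debug.print("forged tag matches? {}\n", .{std.mem.eql(u8, &good, &forged)});
}
```

Unlike a bare digest, the tag cannot be recomputed by someone who alters the payload but does not know the key.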

Exercises

  • Replace DefaultPrng with std.Random.DefaultCsprng in the first example and measure the performance delta across build modes (see Chapter 39 and ChaCha.zig).
  • Extend math_inspector.zig to compute confidence intervals using approxEqRel to flag outliers in a latency report (see Chapter 47).
  • Modify hash_digest_tool.zig to compute and store SHA-256 digests for every file inside a TAR archive from Chapter 49, emitting a manifest alongside the archive (see tar.zig).

Caveats, alternatives, edge cases

  • Jump functions on Xoshiro mutate state irreversibly; snapshot your generator before calling jump() if you need to rewind later.
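  • bytesToHex returns a stack array twice the size of its input, which is fine for a 32-byte digest; avoid it for hex-dumping gigantic buffers and prefer incremental encoders to sidestep large stack allocations.
  • SHA-256 hashes raw bytes, so file size alone (even beyond 4 GiB) does not change how you call it; text that passes through platform-specific encodings does. Normalize UTF-8/UTF-16 earlier in your pipeline to avoid hashing different byte streams (see Chapter 45).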
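The snapshot advice for jump() can be sketched with a plain struct copy, since the Xoshiro engine is an ordinary value. The helper name nextAfterRewind is hypothetical.

```zig
const std = @import("std");

/// Hypothetical helper: jump ahead, draw once, then restore a saved copy
/// and return the next value from the restored state.
fn nextAfterRewind(seed: u64) u64 {
    var prng = std.Random.DefaultPrng.init(seed);
    const snapshot = prng; // copy the full generator state by value

    prng.jump(); // irreversible on its own
    _ = prng.random().int(u64); // consume something from the jumped stream

    prng = snapshot; // rewind by assigning the saved state back
    return prng.random().int(u64);
}

pub fn main() void {
    var fresh = std.Random.DefaultPrng.init(7);
    const expected = fresh.random().int(u64);
    const rewound = nextAfterRewind(7);
    std.debug.print("fresh first value   -> {d}\n", .{expected});
    std.debug.print("value after rewind  -> {d}\n", .{rewound});
}
```

Both lines print the same number, because restoring the snapshot puts the generator back exactly where it was before the jump.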
