Overview
With compression pipelines in place from the previous chapter, we now zoom in on the numeric engines that feed those workflows: deterministic pseudo-random number generators, well-behaved math helpers, and hashing primitives that balance speed and security. Zig 0.15.2 keeps these components modular: std.Random builds reproducible sequences, std.math provides careful tolerances and constants, and the stdlib splits hashing into non-crypto and crypto families so you can choose the right tool per workload. math.zig wyhash.zig sha2.zig
Learning Goals
- Seed, advance, and reproduce std.Random generators while sampling common distributions. Xoshiro256.zig
- Apply std.math utilities (constants, clamps, tolerances, and geometry helpers) to keep numeric code stable. hypot.zig
- Distinguish fast hashers like Wyhash from cryptographic digests such as SHA-256, and wire both into file-processing jobs responsibly.
Random number foundations
Zig exposes pseudo-random generators as first-class values: you seed an engine, ask it for integers, floats, or indices, and your code owns the state transitions. That transparency gives you control over fuzzers, simulations, and deterministic tests. Random.zig
Deterministic generators with reproducible sequences
std.Random.DefaultPrng wraps Xoshiro256++, seeding itself via SplitMix64 when you call init(seed). From there you obtain a Random facade that exposes high-level helpers (ranges, shuffles, floats). The facade holds only a pointer to the engine, so the engine value you own must outlive every Random derived from it.
const std = @import("std");

pub fn main() !void {
    // Buffered stdout; nothing is written until flush() at the end.
    var stdout_buffer: [4096]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buffer);
    const stdout = &stdout_writer.interface;

    const seed: u64 = 0x0006_7B20; // 424,736 in decimal
    var prng = std.Random.DefaultPrng.init(seed);
    // The facade is never reassigned, so it is declared const.
    const rand = prng.random();

    const dice_roll = rand.intRangeAtMost(u8, 1, 6);
    const coin = if (rand.boolean()) "heads" else "tails";

    var ladder = [_]u8{ 0, 1, 2, 3, 4, 5 };
    rand.shuffle(u8, ladder[0..]);

    const unit_float = rand.float(f64);

    // A second engine with the same seed replays the sequence from the start.
    var reproducible = [_]u32{ undefined, undefined, undefined };
    var check_prng = std.Random.DefaultPrng.init(seed);
    const check_rand = check_prng.random();
    for (&reproducible) |*slot| {
        slot.* = check_rand.int(u32);
    }

    try stdout.print("seed=0x{X:0>8}\n", .{seed});
    try stdout.print("d6 roll -> {d}\n", .{dice_roll});
    try stdout.print("coin flip -> {s}\n", .{coin});
    try stdout.print("shuffled ladder -> {any}\n", .{ladder});
    try stdout.print("unit float -> {d:.6}\n", .{unit_float});
    try stdout.print("first three u32 -> {any}\n", .{reproducible});
    try stdout.flush();
}
$ zig run prng_sequences.zig
seed=0x00067B20
d6 roll -> 5
coin flip -> tails
shuffled ladder -> { 0, 4, 3, 2, 5, 1 }
unit float -> 0.742435
first three u32 -> { 2135551917, 3874178402, 2563214192 }

The fairness guarantees of uintLessThan hinge on the generator's uniform output; fall back to uintLessThanBiased when constant-time behavior matters more than perfect distribution.
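To make that trade-off concrete, the following sketch tallies bounded draws with both helpers. The seed (42), the draw count, and the helper name tally are assumptions made for illustration and do not come from the chapter's listings; output goes through std.debug.print (stderr) to keep the example short.

```zig
const std = @import("std");

/// Hypothetical helper: tallies `draws` bucket indices in [0, counts.len)
/// using either the unbiased or the biased bounded-integer routine.
fn tally(rand: std.Random, comptime biased: bool, counts: []u32, draws: usize) void {
    for (0..draws) |_| {
        const i = if (biased)
            rand.uintLessThanBiased(usize, counts.len)
        else
            rand.uintLessThan(usize, counts.len);
        counts[i] += 1;
    }
}

pub fn main() void {
    var prng = std.Random.DefaultPrng.init(42); // arbitrary seed for this sketch
    const rand = prng.random();

    var fair = [_]u32{0} ** 6;
    var fast = [_]u32{0} ** 6;
    tally(rand, false, &fair, 60_000);
    tally(rand, true, &fast, 60_000);

    std.debug.print("uintLessThan       -> {any}\n", .{fair});
    std.debug.print("uintLessThanBiased -> {any}\n", .{fast});
}
```

With a bound as small as 6 the bias is far too small to see in a tally like this; the practical difference is that the biased variant does a fixed amount of work per draw.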
Working with distributions and sampling heuristics
Beyond uniform draws, Random.floatNorm and Random.floatExp expose Ziggurat-backed normal and exponential samples—ideal for synthetic workloads or noise injection. ziggurat.zig Weighted choices come from weightedIndex, while .jump() on Xoshiro engines deterministically leaps ahead by 2^128 steps to partition streams across threads without overlap. 29 For cryptographic uses, swap to std.crypto.random or std.Random.DefaultCsprng to inherit ChaCha-based entropy rather than a fast-but-predictable PRNG. tlcsprng.zig
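The helpers named above can be combined roughly as follows. This is a sketch under stated assumptions: the seed, the latency and rate figures, the weights, and the helper workerPrng are all invented for illustration. It scales a standard normal draw, draws an exponential gap, picks a weighted index, and derives a second stream with jump().

```zig
const std = @import("std");

/// Hypothetical helper: seeds identically for every worker, then jumps
/// `index` times so each worker starts 2^128 steps after the previous one.
fn workerPrng(seed: u64, index: usize) std.Random.DefaultPrng {
    var prng = std.Random.DefaultPrng.init(seed);
    for (0..index) |_| prng.jump();
    return prng;
}

pub fn main() void {
    const seed: u64 = 0xC0FFEE; // arbitrary seed for this sketch
    var prng = std.Random.DefaultPrng.init(seed);
    const rand = prng.random();

    // floatNorm yields mean 0, stddev 1; scale and shift to taste.
    const latency_ms = 100.0 + 15.0 * rand.floatNorm(f64);

    // floatExp yields rate 1; divide by the desired rate.
    const rate_per_s: f64 = 4.0;
    const gap_s = rand.floatExp(f64) / rate_per_s;

    // Index 1 carries 70% of the total weight.
    const weights = [_]f32{ 0.2, 0.7, 0.1 };
    const pick = rand.weightedIndex(f32, &weights);

    var w0 = workerPrng(seed, 0);
    var w1 = workerPrng(seed, 1);

    std.debug.print("latency sample -> {d:.2} ms\n", .{latency_ms});
    std.debug.print("arrival gap    -> {d:.4} s\n", .{gap_s});
    std.debug.print("weighted pick  -> {d}\n", .{pick});
    std.debug.print("worker 0 / 1   -> {d} / {d}\n", .{
        w0.random().int(u32),
        w1.random().int(u32),
    });
}
```

Deriving every worker's engine from one seed plus a jump count keeps a whole parallel run reproducible from a single number, which is usually easier to log than one seed per thread.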
Practical math utilities
The std.math namespace combines fundamental constants with measured utilities: clamps, approximate equality, and geometry helpers all share consistent semantics across CPU targets.
Numeric hygiene toolkit
Combining a handful of helpers—sqrt, clamp, approximate equality, and the golden ratio constant—keeps reporting code readable and portable. sqrt.zig
const std = @import("std");

pub fn main() !void {
    // Buffered stdout; nothing is written until flush() at the end.
    var stdout_buffer: [4096]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buffer);
    const stdout = &stdout_writer.interface;

    const m = std.math;

    // Accumulate sum, sum of squares, and extrema in a single pass.
    const latencies = [_]f64{ 0.94, 1.02, 0.87, 1.11, 0.99, 1.05 };
    var sum: f64 = 0;
    var sum_sq: f64 = 0;
    var minimum = latencies[0];
    var maximum = latencies[0];
    for (latencies) |value| {
        sum += value;
        sum_sq += value * value;
        minimum = @min(minimum, value);
        maximum = @max(maximum, value);
    }

    const mean = sum / @as(f64, @floatFromInt(latencies.len));
    const rms = m.sqrt(sum_sq / @as(f64, @floatFromInt(latencies.len)));
    // Map the mean from the range [0.8, 1.4] onto [0, 1], clamping outliers.
    const normalized = m.clamp((mean - 0.8) / 0.6, 0.0, 1.0);

    const turn_degrees: f64 = 72.0;
    const turn_radians = turn_degrees * m.rad_per_deg;
    const right_angle = m.pi / 2.0;
    const approx_right = m.approxEqRel(f64, turn_radians, right_angle, 1e-12);

    const hyp = m.hypot(3.0, 4.0);

    try stdout.print("sample count -> {d}\n", .{latencies.len});
    try stdout.print("min/max -> {d:.2} / {d:.2}\n", .{ minimum, maximum });
    try stdout.print("mean -> {d:.3}\n", .{mean});
    try stdout.print("rms -> {d:.3}\n", .{rms});
    try stdout.print("normalized mean -> {d:.3}\n", .{normalized});
    try stdout.print("72deg in rad -> {d:.6}\n", .{turn_radians});
    try stdout.print("close to right angle? -> {s}\n", .{if (approx_right) "yes" else "no"});
    try stdout.print("hypot(3,4) -> {d:.1}\n", .{hyp});
    try stdout.print("phi constant -> {d:.9}\n", .{m.phi});
    try stdout.flush();
}
$ zig run math_inspector.zig
sample count -> 6
min/max -> 0.87 / 1.11
mean -> 0.997
rms -> 1.000
normalized mean -> 0.328
72deg in rad -> 1.256637
close to right angle? -> no
hypot(3,4) -> 5.0
phi constant -> 1.618033989

Prefer approxEqRel for large-magnitude comparisons and approxEqAbs near zero; both treat exactly equal operands (including same-signed infinities) as equal and return false when either operand is NaN.
Tolerances, scaling, and derived quantities
Angular conversions use rad_per_deg/deg_per_rad, while hypot preserves precision in Pythagorean calculations by rescaling its inputs so the intermediate squares neither overflow nor underflow. When chaining transforms, keep intermediate results in f64 even if your public API uses narrower floats: helpers such as hypot and clamp take anytype arguments and return their peer type, so widening is implicit, while narrowing back requires an explicit @floatCast (Zig has no compiler warnings, only errors). 39
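A short sketch of these points, with every name and number chosen for illustration only: naiveHypot and distance32 are hypothetical helpers, and the tolerances are arbitrary. It contrasts exact comparison with the two tolerance helpers, shows the textbook Pythagorean formula overflowing where std.math.hypot does not, and keeps an f32-facing function on f64 internally.

```zig
const std = @import("std");
const m = std.math;

/// Textbook formula: x * x overflows to +inf for very large f64 inputs.
fn naiveHypot(x: f64, y: f64) f64 {
    return @sqrt(x * x + y * y);
}

/// Hypothetical f32-facing API that keeps its intermediate math in f64.
fn distance32(dx: f32, dy: f32) f32 {
    const wide = m.hypot(@as(f64, dx), @as(f64, dy)); // widening is lossless
    return @floatCast(wide); // narrowing must be spelled out
}

pub fn main() void {
    // Ten additions of 0.1 accumulate rounding error, so `==` is too strict.
    var total: f64 = 0;
    for (0..10) |_| total += 0.1;
    std.debug.print("total == 1.0         -> {any}\n", .{total == 1.0});
    std.debug.print("approxEqAbs, 1e-12   -> {any}\n", .{m.approxEqAbs(f64, total, 1.0, 1e-12)});

    // At large magnitudes an absolute tolerance is too tight; a relative one scales.
    const big: f64 = 1.0e12;
    const near: f64 = big + 0.001;
    std.debug.print("approxEqAbs, 1e-6    -> {any}\n", .{m.approxEqAbs(f64, big, near, 1e-6)});
    std.debug.print("approxEqRel, 1e-9    -> {any}\n", .{m.approxEqRel(f64, big, near, 1e-9)});

    // The naive formula overflows; hypot rescales and stays finite.
    const huge: f64 = 1.0e200;
    std.debug.print("naive is infinite    -> {any}\n", .{m.isInf(naiveHypot(huge, huge))});
    std.debug.print("hypot is finite      -> {any}\n", .{m.isFinite(m.hypot(huge, huge))});

    std.debug.print("distance32(3, 4)     -> {d:.1}\n", .{distance32(3.0, 4.0)});
}
```

Picking the tolerance is still a judgment call: the values used here only separate the cases in this sketch and are not recommendations.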
Hashing: reproducibility versus integrity
Zig splits hashing strategies sharply: std.hash families target speed and low collision rates for in-memory buckets, whereas std.crypto.hash.sha2 delivers standardized digests for integrity checks or signature pipelines.
Non-cryptographic hashing for buckets
std.hash.Wyhash.hash produces a 64-bit value seeded however you like, ideal for hash maps or bloom filters where avalanche properties matter more than resistance to adversaries. If you need structured hashing with compile-time type awareness, std.hash.autoHash walks your fields recursively and feeds them into a configurable backend. 44 auto_hash.zig
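As a minimal sketch of both entry points, the example below hashes a byte string in one shot and a struct through autoHash with Wyhash as the backend. The Point type, the hashKey helper, the seed values, and the 16-bucket table size are assumptions made for illustration.

```zig
const std = @import("std");

/// Hypothetical key type for this sketch.
const Point = struct { x: i32, y: i32 };

/// Hypothetical helper: feeds any auto-hashable key into a seeded Wyhash.
fn hashKey(seed: u64, key: anytype) u64 {
    var hasher = std.hash.Wyhash.init(seed);
    std.hash.autoHash(&hasher, key);
    return hasher.final();
}

pub fn main() void {
    // One-shot hashing of raw bytes, reduced to a bucket index.
    const name_hash = std.hash.Wyhash.hash(0, "payload preview");
    const bucket = name_hash % 16;

    // Structured hashing: autoHash visits the fields of Point.
    const origin_hash = hashKey(0, Point{ .x = 0, .y = 0 });
    const reseeded_hash = hashKey(1, Point{ .x = 0, .y = 0 });

    std.debug.print("bytes  -> 0x{x:0>16} (bucket {d})\n", .{ name_hash, bucket });
    std.debug.print("seed 0 -> 0x{x:0>16}\n", .{origin_hash});
    std.debug.print("seed 1 -> 0x{x:0>16}\n", .{reseeded_hash});
}
```

Changing the seed changes every hash, which is useful for per-process randomization of a table but means stored hashes are only comparable when the seed is also recorded.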
SHA-256 digest pipeline with pragmatic guardrails
Even when your CLI only needs a checksum, treat SHA-256 as an integrity primitive—not an authenticity guarantee—and document that difference for users.
const std = @import("std");

pub fn main() !void {
    // Buffered stdout; nothing is written until flush() at the end.
    var stdout_buffer: [4096]u8 = undefined;
    var stdout_writer = std.fs.File.stdout().writer(&stdout_buffer);
    const stdout = &stdout_writer.interface;

    var gpa = std.heap.GeneralPurposeAllocator(.{}){};
    defer std.debug.assert(gpa.deinit() == .ok);
    const allocator = gpa.allocator();

    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    // Hash the file named on the command line, or payload.txt by default.
    const input_path = if (args.len > 1) args[1] else "payload.txt";
    // The handle is never reassigned, so it is declared const.
    const file = try std.fs.cwd().openFile(input_path, .{ .mode = .read_only });
    defer file.close();

    // Stream the file through SHA-256 in 4 KiB chunks.
    var sha256 = std.crypto.hash.sha2.Sha256.init(.{});
    var buffer: [4096]u8 = undefined;
    while (true) {
        const read = try file.read(&buffer);
        if (read == 0) break;
        sha256.update(buffer[0..read]);
    }
    var digest: [std.crypto.hash.sha2.Sha256.digest_length]u8 = undefined;
    sha256.final(&digest);

    // Non-cryptographic hash of a short sample string, for contrast.
    const sample = "payload preview";
    const wyhash = std.hash.Wyhash.hash(0, sample);
    try stdout.print("wyhash(seed=0) {s} -> 0x{x:0>16}\n", .{ sample, wyhash });

    const hex_digest = std.fmt.bytesToHex(digest, .lower);
    try stdout.print("sha256({s}) ->\n {s}\n", .{ input_path, hex_digest });
    try stdout.print("(remember: sha256 certifies integrity, not authenticity.)\n", .{});
    try stdout.flush();
}
$ zig run hash_digest_tool.zig -- chapters-data/code/50__random-and-math/payload.txt
wyhash(seed=0) payload preview -> 0x30297ecbb2bd0c02
sha256(chapters-data/code/50__random-and-math/payload.txt) ->
 0498ca2116fb55b7a502d0bf3ad5d0e0b3f4e23ad919bdc0f9f151ca3637a6fa
(remember: sha256 certifies integrity, not authenticity.)

Notes & Caveats
- Random structs are not thread-safe; give each worker its own generator or guard shared access with a mutex to avoid shared-state races. 29
- std.math functions honor IEEE-754 NaN propagation; never rely on comparisons after invalid operations without explicit checks.
- Cryptographic digests should be paired with signature checks, HMACs, or trusted distribution; SHA-256 alone detects corruption, not tampering. hash_composition.zig
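To illustrate the last point, here is a hedged sketch of pairing a payload with an HMAC-SHA-256 tag using std.crypto.auth.hmac.sha2.HmacSha256. The key, the messages, and the helper verifyTag are placeholders invented for this example; a real tool would load the key from protected storage. The comparison uses std.crypto.timing_safe.eql, which is where the author of this sketch expects the constant-time comparison to live in Zig 0.15; treat that location as an assumption if you target another release.

```zig
const std = @import("std");
const HmacSha256 = std.crypto.auth.hmac.sha2.HmacSha256;

/// Hypothetical helper: recomputes the tag for `msg` under `key` and
/// compares it with `tag` without leaking timing information.
fn verifyTag(msg: []const u8, key: []const u8, tag: [HmacSha256.mac_length]u8) bool {
    var expected: [HmacSha256.mac_length]u8 = undefined;
    HmacSha256.create(&expected, msg, key);
    return std.crypto.timing_safe.eql([HmacSha256.mac_length]u8, expected, tag);
}

pub fn main() void {
    const key = "demo-key-not-for-production"; // placeholder secret
    const msg = "payload bytes";

    var tag: [HmacSha256.mac_length]u8 = undefined;
    HmacSha256.create(&tag, msg, key);

    const hex = std.fmt.bytesToHex(tag, .lower);
    std.debug.print("hmac-sha256               -> {s}\n", .{&hex});
    std.debug.print("genuine message verifies  -> {any}\n", .{verifyTag(msg, key, tag)});
    std.debug.print("tampered message verifies -> {any}\n", .{verifyTag("payload bytez", key, tag)});
}
```

Unlike a bare digest, the tag cannot be recomputed by someone who alters the payload but does not hold the key, which is the property the note above asks for.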
Exercises
- Replace DefaultPrng with std.Random.DefaultCsprng in the first example and measure the performance delta across build modes. 39 ChaCha.zig
- Extend math_inspector.zig to compute confidence intervals, using approxEqRel to flag outliers in a latency report. 47
- Modify hash_digest_tool.zig to compute and store SHA-256 digests for every file inside a TAR archive from Chapter 49, emitting a manifest alongside the archive. tar.zig
Caveats, alternatives, edge cases
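- Jump functions on Xoshiro mutate state irreversibly; snapshot your generator before calling jump() if you need to rewind later.
- bytesToHex returns a stack array twice the size of its compile-time-sized input, so reserve it for short values such as digests; prefer incremental encoders when hex-dumping gigantic files to sidestep large stack allocations.
- SHA-256 hashes raw bytes, so any text that feeds a digest or manifest, such as file paths, must use one agreed encoding; normalize UTF-8/UTF-16 earlier in your pipeline to avoid hashing different byte streams, and keep streaming enormous files (>4 GiB) in chunks rather than loading them whole. 45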
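The snapshot advice for jump() can be sketched as follows. Because the engine is a plain struct, copying it captures the full state; the helper peekJumped and the seed are hypothetical and exist only for this example.

```zig
const std = @import("std");

/// Hypothetical helper: returns the first value of the jumped stream while
/// leaving `prng` exactly as it was before the call.
fn peekJumped(prng: *std.Random.DefaultPrng) u64 {
    const snapshot = prng.*; // struct copy captures the whole state
    defer prng.* = snapshot; // restore after the return value is computed
    prng.jump();
    return prng.random().int(u64);
}

pub fn main() void {
    var prng = std.Random.DefaultPrng.init(2024); // arbitrary seed
    var reference = std.Random.DefaultPrng.init(2024);

    const jumped = peekJumped(&prng);
    const resumed = prng.random().int(u64);
    const expected = reference.random().int(u64);

    std.debug.print("first jumped value -> {d}\n", .{jumped});
    std.debug.print("rewind succeeded   -> {any}\n", .{resumed == expected});
}
```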