Chapter 03: Data Fundamentals

Overview

Control flow is only as useful as the data it pilots, so this chapter grounds Zig’s core collection types—arrays, slices, and sentinel-terminated strings—in practical usage while keeping value semantics explicit. See #Arrays and #Slices for reference.

We also make pointers, optionals, and alignment-friendly casts feel routine, showing how to safely reinterpret memory while retaining bounds checks and clarity about mutability. See #Pointers and #alignCast for details.

Zig’s Type System Categories

Before diving into specific collection types, it’s helpful to understand where arrays, slices, and pointers fit within Zig’s type system. Every type in Zig belongs to a category, and each category provides specific operations:

graph TB
    subgraph "Type Categories"
        PRIMITIVE["Primitive Types<br/>bool, u8, i32, f64, void, ..."]
        POINTER["Pointer Types<br/>*T, [*]T, []T, [:0]T"]
        AGGREGATE["Aggregate Types<br/>struct, array, tuple"]
        FUNCTION["Function Types<br/>fn(...) ReturnType"]
        SPECIAL["Special Types<br/>anytype, type, comptime_int"]
    end
    subgraph "Common Type Operations"
        ABISIZE["abiSize()<br/>Byte size in memory"]
        ABIALIGN["abiAlignment()<br/>Required alignment"]
        HASRUNTIME["hasRuntimeBits()<br/>Has runtime storage?"]
        ELEMTYPE["elemType()<br/>Element type (arrays/slices)"]
    end
    PRIMITIVE --> ABISIZE
    POINTER --> ABISIZE
    AGGREGATE --> ABISIZE
    PRIMITIVE --> ABIALIGN
    POINTER --> ABIALIGN
    AGGREGATE --> ABIALIGN
    POINTER --> ELEMTYPE
    AGGREGATE --> ELEMTYPE

Key insights for this chapter:

  • Arrays are aggregate types with compile-time-known length—their size is element_size * length
  • Slices are pointer types that store both a pointer and runtime length—always 2 × pointer size
  • Pointers come in multiple shapes (single-item *T, many-item [*]T, slice []T) with different safety guarantees
  • All types expose their size and alignment, which affect struct layout and memory allocation

This type-aware design lets the compiler enforce bounds checking on slices while allowing pointer arithmetic on many-item pointers when you explicitly opt out of safety.
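
The size rules in the list above can be checked directly with the @sizeOf and @alignOf builtins. The following is a minimal sketch rather than part of the chapter's example files; the helper name layoutRulesHold is hypothetical, and the comments describe relationships instead of fixed byte counts because exact numbers depend on the target.

```zig
const std = @import("std");

/// Hypothetical helper: returns true when the layout relationships
/// described in this section hold on the current target.
fn layoutRulesHold() bool {
    // An array stores its elements inline: size = element size * length.
    const array_ok = @sizeOf([4]i32) == 4 * @sizeOf(i32);
    // A slice is a pointer plus a length, so it occupies two machine words.
    const slice_ok = @sizeOf([]i32) == 2 * @sizeOf(usize);
    // Single-item and many-item pointers are one machine word each.
    const ptr_ok = @sizeOf(*i32) == @sizeOf(usize) and @sizeOf([*]i32) == @sizeOf(usize);
    // An optional pointer uses address 0 for null, so it needs no extra storage.
    const opt_ok = @sizeOf(?*i32) == @sizeOf(*i32);
    return array_ok and slice_ok and ptr_ok and opt_ok;
}

pub fn main() void {
    std.debug.print("[4]i32 size={d} align={d}\n", .{ @sizeOf([4]i32), @alignOf([4]i32) });
    std.debug.print("[]i32  size={d} align={d}\n", .{ @sizeOf([]i32), @alignOf([]i32) });
    std.debug.print("*i32   size={d}, ?*i32 size={d}\n", .{ @sizeOf(*i32), @sizeOf(?*i32) });
    std.debug.print("rules hold: {}\n", .{layoutRulesHold()});
}
```

Because every value here is known at compile time, the same comparisons could also live in a comptime block if you want a build to fail when an assumption about layout stops holding.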

Learning Goals

  • Distinguish array value semantics from slice views, including zero-length idioms for safe fallbacks.
  • Navigate pointer shapes (*T, [*]T, ?*T) and unwrap optionals without sacrificing safety instrumentation (see #Optionals).
  • Apply sentinel-terminated strings and alignment-aware casts (@alignCast, @bitCast, @intCast) when interoperating with other APIs (see #Sentinel-Terminated-Pointers and #Explicit-Casts).

Structuring Collections in Memory

Arrays own storage while slices borrow it, so the two differ in how length, mutability, and lifetime are handled; mastering their interplay keeps iteration predictable, with slice bounds checks active in safety-checked builds (Debug and ReleaseSafe).

Arrays as Owned Storage

Arrays carry length in their type, copy by value, and give you a mutable baseline from which to carve read-only and read-write slices.

Zig
const std = @import("std");

/// Prints information about a slice including its label, length, and first element.
/// If the slice is empty, displays -1 as the head value.
fn describe(label: []const u8, data: []const i32) void {
    // Get first element or -1 if slice is empty
    const head = if (data.len > 0) data[0] else -1;
    std.debug.print("{s}: len={} head={d}\n", .{ label, data.len, head });
}

/// Demonstrates array and slice fundamentals in Zig, including:
/// - Array declaration and initialization
/// - Creating slices from arrays with different mutability
/// - Modifying arrays through direct indexing and slices
/// - Array copying behavior (value semantics)
/// - Creating empty and zero-length slices
pub fn main() !void {
    // Declare mutable array with inferred size
    var values = [_]i32{ 3, 5, 8, 13 };
    // Declare const array with explicit size using an anonymous list literal
    const owned: [4]i32 = .{ 1, 2, 3, 4 };

    // Create a mutable slice covering the entire array
    var mutable_slice: []i32 = values[0..];
    // Create an immutable slice of the first two elements
    const prefix: []const i32 = values[0..2];
    // Create a zero-length slice (empty but valid)
    const empty = values[0..0];

    // Modify array directly by index
    values[1] = 99;
    // Modify array through mutable slice
    mutable_slice[0] = -3;

    std.debug.print("array len={} allows mutation\n", .{values.len});
    describe("mutable_slice", mutable_slice);
    describe("prefix", prefix);
    // Demonstrate that slice modification affects the underlying array
    std.debug.print("values[0] after slice write = {d}\n", .{values[0]});
    std.debug.print("empty slice len={} is zero-length\n", .{empty.len});

    // Arrays are copied by value in Zig
    var copy = owned;
    copy[0] = -1;
    // Show that modifying the copy doesn't affect the original
    std.debug.print("copy[0]={d} owned[0]={d}\n", .{ copy[0], owned[0] });

    // Create a slice from an empty array literal using address-of operator
    const zero: []const i32 = &[_]i32{};
    std.debug.print("zero slice len={} from literal\n", .{zero.len});
}
Run
Shell
$ zig run arrays_and_slices.zig
Output
Shell
array len=4 allows mutation
mutable_slice: len=4 head=-3
prefix: len=2 head=-3
values[0] after slice write = -3
empty slice len=0 is zero-length
copy[0]=-1 owned[0]=1
zero slice len=0 from literal

The mutable slice and the original array share storage, while the []const prefix resists writes—an intentional boundary that forces read-only consumers to stay honest.

Memory Layout: Arrays vs Slices

Understanding how arrays and slices are laid out in memory clarifies why "arrays own storage while slices borrow it" and why array-to-slice coercion is a cheap operation:

graph TB
    subgraph "Array in Memory"
        ARRAY_DECL["const values: [4]i32 = .{1, 2, 3, 4}"]
        ARRAY_MEM["Memory Layout (16 bytes)\n\nstack frame\n| 1 | 2 | 3 | 4 |"]
        ARRAY_DECL --> ARRAY_MEM
    end
    subgraph "Slice in Memory"
        SLICE_DECL["const slice: []const i32 = &values"]
        SLICE_MEM["Memory Layout (16 bytes on 64-bit)\n\nstack frame\n| ptr | len=4 |"]
        POINTS["ptr points to array data"]
        SLICE_DECL --> SLICE_MEM
        SLICE_MEM --> POINTS
    end
    POINTS -.->|"references"| ARRAY_MEM
    subgraph "Key Differences"
        DIFF1["Array: Stores data inline<br/>Size = elem_size × length"]
        DIFF2["Slice: Stores pointer + length<br/>Size = 2 × pointer_size (16 bytes on 64-bit)"]
        DIFF3["Coercion: &array → slice<br/>Just creates {ptr, len} pair"]
    end

Why this matters:

  • Arrays have value semantics: assigning an array copies all elements
  • Slices have reference semantics: assigning a slice copies just the pointer and length
  • Array-to-slice coercion (&array) is cheap—it doesn’t copy data, just creates a descriptor
  • Slices are "fat pointers": they carry runtime length information, enabling bounds checking

This is why functions typically accept slices as parameters—they can work with arrays, slices, and portions of either without copying the underlying data.
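
To make the last point concrete, here is a small sketch of one slice-taking function serving several kinds of caller. The helper name sum and the sample data are assumptions made for illustration; they do not come from the chapter's example files.

```zig
const std = @import("std");

/// Hypothetical helper: sums any contiguous run of i32 values without copying them.
fn sum(data: []const i32) i64 {
    var total: i64 = 0;
    for (data) |value| total += value;
    return total;
}

pub fn main() void {
    const fixed = [_]i32{ 1, 2, 3, 4 };
    var buffer = [_]i32{ 10, 20, 30 };
    buffer[1] = 25; // direct write, so the array genuinely needs `var`
    const view: []i32 = buffer[0..];

    // A whole array (via &), an existing mutable slice, a sub-range, and an
    // empty range all coerce to []const i32 at the call site.
    std.debug.print("whole array:   {d}\n", .{sum(&fixed)});
    std.debug.print("mutable slice: {d}\n", .{sum(view)});
    std.debug.print("tail of array: {d}\n", .{sum(fixed[2..])});
    std.debug.print("empty range:   {d}\n", .{sum(fixed[0..0])});
}
```

Each call passes only a pointer and a length; none of the element data is copied, which is the cost model the section describes.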

Strings and Sentinels in Practice

Sentinel-terminated arrays bridge to C APIs without forfeiting the safety of slices; you can reinterpret the byte stream with std.mem.span and still mutate the underlying buffer when the sentinel convention is preserved.

Zig
const std = @import("std");

/// Demonstrates sentinel-terminated strings and arrays in Zig, including:
/// - Zero-terminated string literals ([:0]const u8)
/// - Many-item sentinel pointers ([*:0]const u8)
/// - Sentinel-terminated arrays ([N:0]T)
/// - Converting between sentinel slices and regular slices
/// - Mutation through sentinel pointers
pub fn main() !void {
    // String literals in Zig are sentinel-terminated by default with a zero byte
    // [:0]const u8 denotes a slice with a sentinel value of 0 at the end
    const literal: [:0]const u8 = "data fundamentals";
    
    // Convert the sentinel slice to a many-item sentinel pointer
    // [*:0]const u8 is compatible with C-style null-terminated strings
    const c_ptr: [*:0]const u8 = literal;
    
    // std.mem.span converts a sentinel-terminated pointer back to a slice
    // It scans until it finds the sentinel value (0) to determine the length
    const bytes = std.mem.span(c_ptr);
    std.debug.print("literal len={} contents=\"{s}\"\n", .{ bytes.len, bytes });

    // Declare a sentinel-terminated array with explicit size and sentinel value
    // [6:0]u8 means an array of 6 elements plus a sentinel 0 byte at position 6
    var label: [6:0]u8 = .{ 'l', 'a', 'b', 'e', 'l', 0 };
    
    // Create a mutable sentinel slice from the array
    // The [0.. :0] syntax creates a slice from index 0 to the end, with sentinel 0
    var sentinel_view: [:0]u8 = label[0.. :0];
    
    // Modify the first element through the sentinel slice
    sentinel_view[0] = 'L';

    // Create a regular (non-sentinel) slice from the first 4 elements
    // This drops the sentinel guarantees but provides a bounded slice
    const trimmed: []const u8 = sentinel_view[0..4];
    std.debug.print("trimmed slice len={} -> {s}\n", .{ trimmed.len, trimmed });

    // Convert the sentinel slice to a many-item sentinel pointer
    // This allows unchecked indexing while preserving sentinel information
    const tail: [*:0]u8 = sentinel_view;
    
    // Modify element at index 4 through the many-item sentinel pointer
    // No bounds checking occurs, but the sentinel guarantees remain valid
    tail[4] = 'X';

    // Demonstrate that mutations through the pointer affected the original array
    // std.mem.span uses the sentinel to reconstruct the full slice
    std.debug.print("full label after mutation: {s}\n", .{std.mem.span(tail)});
}
Run
Shell
$ zig run sentinel_strings.zig
Output
Shell
literal len=17 contents="data fundamentals"
trimmed slice len=4 -> Labe
full label after mutation: LabeX

The sentinel slice keeps the trailing zero intact, so taking a [*:0]u8 for FFI remains sound even after local mutations, while the plain slice gives ergonomic iteration within Zig (see #Type-Coercion).

std.mem.span converts sentinel pointers into ordinary slices without cloning data, making it ideal when you temporarily need bounds checks or slice helpers before returning to pointer APIs.
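
As a small illustration of the two claims above (the sentinel sits just past the reported length, and std.mem.span does not copy), consider the following sketch. The helper hasTerminator is hypothetical and exists only to show that indexing at len is permitted on a sentinel-terminated slice.

```zig
const std = @import("std");

/// Hypothetical helper: reads the element just past the slice's length,
/// which a sentinel-terminated slice guarantees to be the sentinel.
fn hasTerminator(text: [:0]const u8) bool {
    // Indexing at text.len is allowed for sentinel-terminated slices.
    return text[text.len] == 0;
}

pub fn main() void {
    const name: [:0]const u8 = "zig";
    // len counts only the payload bytes, not the sentinel.
    std.debug.print("len={d} terminated={}\n", .{ name.len, hasTerminator(name) });

    // Round trip: slice -> C-style pointer -> slice, with no copy in between.
    const c_ptr: [*:0]const u8 = name.ptr;
    const back = std.mem.span(c_ptr);
    std.debug.print("same storage: {} len={d}\n", .{ back.ptr == name.ptr, back.len });
}
```

The round trip recovers the length by scanning for the zero byte, so its cost grows with the string; keeping the slice around avoids rescanning.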

Immutable and Mutable Views

Prefer []const T when callers only inspect data—Zig will gladly coerce a mutable slice to a const view, giving you API clarity and keeping accidental writes from compiling in the first place.
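
A minimal sketch of that split between read-only and mutating consumers follows. The function names largest and clampNegativesToZero are assumptions chosen for this illustration, not APIs from the standard library.

```zig
const std = @import("std");

/// Read-only consumer (hypothetical): the []const parameter means any write
/// through `data` inside this function is rejected at compile time.
fn largest(data: []const i32) ?i32 {
    if (data.len == 0) return null; // zero-length input is a valid, safe case
    var best = data[0];
    for (data[1..]) |value| {
        if (value > best) best = value;
    }
    return best;
}

/// Mutating consumer (hypothetical): requires []i32, so callers must hand
/// over writable storage.
fn clampNegativesToZero(data: []i32) void {
    for (data) |*value| {
        if (value.* < 0) value.* = 0;
    }
}

pub fn main() void {
    var readings = [_]i32{ 4, -7, 12, -1 };
    const writable: []i32 = readings[0..];

    clampNegativesToZero(writable);
    // The same mutable slice coerces to []const i32 at the call site.
    if (largest(writable)) |max| {
        std.debug.print("largest={d}\n", .{max});
    } else {
        std.debug.print("no readings\n", .{});
    }
}
```

The coercion only goes one way: a []const i32 cannot be passed to clampNegativesToZero, so the signature alone tells callers which functions may modify their data.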

Pointer Patterns and Cast Workflows

Pointers surface when you share storage, interoperate with foreign layouts, or step outside slice bounds; by leaning on optional wrappers and explicit casts, you keep intent clear and allow safety checks to fire whenever assumptions break.

Pointer Shape Reference

Zig offers multiple pointer types, each with different safety guarantees and use cases. Understanding when to use each shape is essential for writing safe, efficient code:

graph TB
    subgraph "Pointer Shapes"
        SINGLE["*T<br/>Single-Item Pointer"]
        MANY["[*]T<br/>Many-Item Pointer"]
        SLICE["[]T<br/>Slice"]
        OPTIONAL["?*T<br/>Optional Pointer"]
        SENTINEL_PTR["[*:0]T<br/>Sentinel Many-Item"]
        SENTINEL_SLICE["[:0]T<br/>Sentinel Slice"]
    end
    subgraph "Characteristics"
        SINGLE --> S_BOUNDS["✓ Bounds: Single element<br/>✓ Safety: Dereference checked<br/>📍 Use: Function parameters, references"]
        MANY --> M_BOUNDS["⚠ Bounds: Unknown length<br/>✗ Safety: No bounds checking<br/>📍 Use: C interop, tight loops"]
        SLICE --> SL_BOUNDS["✓ Bounds: Runtime length<br/>✓ Safety: Bounds checked<br/>📍 Use: Most Zig code, iteration"]
        OPTIONAL --> O_BOUNDS["✓ Bounds: May be null<br/>✓ Safety: Must unwrap first<br/>📍 Use: Optional references"]
        SENTINEL_PTR --> SP_BOUNDS["✓ Bounds: Until sentinel<br/>~ Safety: Sentinel must exist<br/>📍 Use: C strings, null-terminated"]
        SENTINEL_SLICE --> SS_BOUNDS["✓ Bounds: Length + sentinel<br/>✓ Safety: Both length and sentinel<br/>📍 Use: Zig ↔ C string bridge"]
    end

Comparison Table:

| Shape | Example | Length Known? | Bounds Checked? | Common Use |
| --- | --- | --- | --- | --- |
| *T | *i32 | Single element | Yes (implicit) | Reference to one item |
| [*]T | [*]i32 | Unknown | No | C arrays, pointer arithmetic |
| []T | []i32 | Runtime (in slice) | Yes | Primary Zig collection type |
| ?*T | ?*i32 | Single (if non-null) | Yes + null check | Optional references |
| [*:0]T | [*:0]u8 | Until sentinel | Sentinel must exist | C strings (char*) |
| [:0]T | [:0]u8 | Runtime + sentinel | Yes + sentinel guarantee | Zig strings for C APIs |

Guidelines:

  • Default to slices ([]T) for most Zig code; they provide safety and convenience
  • Use single-item pointers (*T) when you need to mutate a single value or pass by reference
  • Avoid many-item pointers ([*]T) unless interfacing with C or in performance-critical inner loops
  • Use optional pointers (?*T) when null is a meaningful state, not for error handling
  • Use sentinel types ([*:0]T, [:0]T) at the C boundary, convert to slices internally
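
The last guideline (sentinel types at the C boundary, slices internally) might look like the sketch below. Both function names are hypothetical; the only standard-library call used is std.mem.span.

```zig
const std = @import("std");

/// Internal logic (hypothetical) works on a bounds-checked slice.
fn countSpaces(text: []const u8) usize {
    var count: usize = 0;
    for (text) |byte| {
        if (byte == ' ') count += 1;
    }
    return count;
}

/// Boundary wrapper (hypothetical): accepts a C-style string and converts it
/// to a slice once, on entry, so everything downstream knows the length.
fn countSpacesC(c_text: [*:0]const u8) usize {
    return countSpaces(std.mem.span(c_text));
}

pub fn main() void {
    const message: [*:0]const u8 = "slices inside, sentinels at the edge";
    std.debug.print("spaces={d}\n", .{countSpacesC(message)});
}
```

Keeping the sentinel pointer confined to the wrapper means the rest of the code never has to rescan for the terminator or index without a known length.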

Optional Pointers for Shared Mutability

Optional single-item pointers make absence explicit in the type: capture the pointer only when it is present, mutate through the capture, and fall back gracefully when it is null.

Zig
const std = @import("std");

/// A simple structure representing a sensor device with a numeric reading.
const Sensor = struct {
    reading: i32,
};

/// Prints a sensor's reading value to debug output.
/// Takes a single-item pointer to a Sensor and displays its current reading.
fn report(label: []const u8, ptr: *Sensor) void {
    std.debug.print("{s} -> reading {d}\n", .{ label, ptr.reading });
}

/// Demonstrates pointer fundamentals, optional pointers, and many-item pointers in Zig.
/// This example covers:
/// - Single-item pointers (*T) and pointer dereferencing
/// - Pointer aliasing and mutation through aliases
/// - Optional pointers (?*T) for representing nullable references
/// - Unwrapping optional pointers with if statements
/// - Many-item pointers ([*]T) for unchecked multi-element access
/// - Converting slices to many-item pointers via .ptr property
pub fn main() !void {
    // Create a sensor instance on the stack
    var sensor = Sensor{ .reading = 41 };
    
    // Create a single-item pointer alias to the sensor
    // The & operator takes the address of sensor
    var alias: *Sensor = &sensor;
    
    // Modify the sensor through the pointer alias
    // Zig automatically dereferences the pointer on field access
    alias.reading += 1;

    report("alias", alias);

    // Declare an optional pointer initialized to null
    // ?*T represents a pointer that may or may not hold a valid address
    var maybe_alias: ?*Sensor = null;
    
    // Attempt to unwrap the optional pointer
    // This branch will not execute because maybe_alias is null
    if (maybe_alias) |pointer| {
        std.debug.print("unexpected pointer: {d}\n", .{pointer.reading});
    } else {
        std.debug.print("optional pointer empty\n", .{});
    }

    // Assign a valid address to the optional pointer
    maybe_alias = &sensor;
    
    // Unwrap and use the optional pointer
    // The |pointer| capture syntax extracts the non-null value
    if (maybe_alias) |pointer| {
        pointer.reading += 10;
        std.debug.print("optional pointer mutated to {d}\n", .{sensor.reading});
    }

    // Create an array and a slice view of it
    var samples = [_]i32{ 5, 7, 9, 11 };
    const view: []i32 = samples[0..];
    
    // Extract a many-item pointer from the slice
    // Many-item pointers ([*]T) allow unchecked indexing without length tracking
    const many: [*]i32 = view.ptr;
    
    // Modify the underlying array through the many-item pointer
    // No bounds checking is performed at this point
    many[2] = 42;

    std.debug.print("slice view len={}\n", .{view.len});
    // Verify that the modification through many-item pointer affected the original array
    std.debug.print("samples[2] via many pointer = {d}\n", .{samples[2]});
}
Run
Shell
$ zig run pointers_and_optionals.zig
Output
Shell
alias -> reading 42
optional pointer empty
optional pointer mutated to 52
slice view len=4
samples[2] via many pointer = 42

The ?*Sensor gate keeps mutation behind an explicit null check, while the many-item pointer ([*]i32) gives up bounds checks entirely, so the caller must track the valid length; that is a deliberate trade-off reserved for tight loops and FFI.
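
Besides the if-capture form used in the example, an optional pointer can be unwrapped with orelse when the null case should exit early. The sketch below assumes two hypothetical helpers, firstNegative and fixFirstNegative, invented for this illustration.

```zig
const std = @import("std");

/// Hypothetical lookup: returns a pointer to the first negative element,
/// or null when no such element exists.
fn firstNegative(data: []i32) ?*i32 {
    for (data) |*value| {
        if (value.* < 0) return value;
    }
    return null;
}

/// Hypothetical helper: replaces the first negative element with zero and
/// reports whether anything changed.
fn fixFirstNegative(data: []i32) bool {
    // `orelse` supplies the fallback path when the optional is null.
    const slot = firstNegative(data) orelse return false;
    slot.* = 0;
    return true;
}

pub fn main() void {
    var samples = [_]i32{ 3, -4, 5 };

    const changed = fixFirstNegative(samples[0..]);
    std.debug.print("changed={} samples[1]={d}\n", .{ changed, samples[1] });

    const changed_again = fixFirstNegative(samples[0..]);
    std.debug.print("changed again={}\n", .{changed_again});
}
```

Here null means "nothing to fix", which is an ordinary outcome rather than a failure, so an optional fits better than an error union.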

Aligning and Reinterpreting Data

When you must reinterpret raw bytes, use the casting builtins to adjust alignment, change pointer element types, and keep integer/float conversions explicit so safety-checked builds (Debug and ReleaseSafe) can catch invalid assumptions (see #bitCast).

Zig
const std = @import("std");

/// Demonstrates memory alignment concepts and various type casting operations in Zig.
/// This example covers:
/// - Memory alignment guarantees with align() attribute
/// - Pointer casting with alignment adjustments using @alignCast
/// - Type punning with @ptrCast for reinterpreting memory
/// - Bitwise reinterpretation with @bitCast
/// - Truncating integers with @truncate
/// - Converting between integer types with @intCast
/// - Floating-point precision conversion with @floatCast
pub fn main() !void {
    // Create a byte array aligned to u64 boundary, initialized with little-endian bytes
    // representing 0x11223344 in the first 4 bytes
    var raw align(@alignOf(u64)) = [_]u8{ 0x44, 0x33, 0x22, 0x11, 0, 0, 0, 0 };

    // Get a pointer to the first byte with explicit u64 alignment
    const base: *align(@alignOf(u64)) u8 = &raw[0];
    
    // Adjust alignment constraint from u64 to u32 using @alignCast
    // This is safe because u64 alignment (8 bytes) satisfies u32 alignment (4 bytes)
    const aligned_bytes = @as(*align(@alignOf(u32)) const u8, @alignCast(base));
    
    // Reinterpret the byte pointer as a u32 pointer to read 4 bytes as a single integer
    const word_ptr = @as(*const u32, @ptrCast(aligned_bytes));
    
    // Dereference to get the 32-bit value (little-endian: 0x11223344)
    const number = word_ptr.*;
    std.debug.print("32-bit value = 0x{X:0>8}\n", .{number});

    // Alternative approach: directly reinterpret the first 4 bytes using @bitCast
    // This creates a copy and doesn't require pointer manipulation
    const from_bytes = @as(u32, @bitCast(raw[0..4].*));
    std.debug.print("bitcast copy = 0x{X:0>8}\n", .{from_bytes});

    // Demonstrate @truncate: extract the least significant 8 bits (0x44)
    const small: u8 = @as(u8, @truncate(number));
    
    // Demonstrate @intCast: convert u32 to i64; every u32 fits, so no data is lost
    // (a widening like this would also coerce implicitly without the builtin)
    const widened: i64 = @as(i64, @intCast(number));
    std.debug.print("truncate -> 0x{X:0>2}, widen -> {d}\n", .{ small, widened });

    // Demonstrate @floatCast: reduce f64 precision to f32
    // May result in precision loss for values that cannot be exactly represented in f32
    const ratio64: f64 = 1.875;
    const ratio32: f32 = @as(f32, @floatCast(ratio64));
    std.debug.print("floatCast ratio -> {}\n", .{ratio32});
}
Run
Shell
$ zig run alignment_and_casts.zig
Output
Shell
32-bit value = 0x11223344
bitcast copy = 0x11223344
truncate -> 0x44, widen -> 287454020
floatCast ratio -> 1.875

By chaining @alignCast, @ptrCast, and @bitCast you assert layout relationships explicitly, and the subsequent @truncate/@intCast conversions keep integer widths honest when narrowing or widening across APIs.
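
The difference between the two narrowing styles is easiest to see side by side. The sketch below is an illustration under stated assumptions: toByte and lowByte are hypothetical helpers, and the checked path uses std.math.cast, which returns null when the value does not fit, in place of a bare @intCast, whose range precondition the caller must otherwise guarantee.

```zig
const std = @import("std");

/// Hypothetical helper: narrows a u32 to u8 only when the value fits.
fn toByte(value: u32) error{OutOfRange}!u8 {
    // std.math.cast yields null instead of tripping a safety check.
    return std.math.cast(u8, value) orelse error.OutOfRange;
}

/// Hypothetical helper: keeps only the low 8 bits, discarding the rest on purpose.
fn lowByte(value: u32) u8 {
    return @truncate(value);
}

pub fn main() void {
    const values = [_]u32{ 200, 0x11223344 };
    for (values) |value| {
        if (toByte(value)) |byte| {
            std.debug.print("{d} fits in u8 as {d}\n", .{ value, byte });
        } else |err| {
            std.debug.print("{d} rejected: {s}; low byte is 0x{X:0>2}\n", .{
                value, @errorName(err), lowByte(value),
            });
        }
    }
}
```

Use @truncate when dropping high bits is the intent, and a checked conversion when an out-of-range value signals a bug or bad input.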

Notes & Caveats

  • Sentinel-terminated pointers are great for C bridges, but within Zig prefer slices so bounds checks stay available and APIs expose lengths.
  • Raising pointer alignment with @alignCast still traps in safety-checked builds (Debug and ReleaseSafe) if the address is misaligned, and is unchecked illegal behavior in the other modes; prove the precondition before promoting.
  • Many-item pointers ([*]T) drop bounds checks; reach for them sparingly and document invariants that a safe slice would have enforced.
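
One way to prove the alignment precondition from the second note before promoting a pointer is sketched below. The helper readWord is hypothetical; it checks length and address alignment first and returns null when the reinterpretation would be invalid.

```zig
const std = @import("std");

/// Hypothetical helper: views the first four bytes as a u32 only when the
/// slice is long enough and its address is suitably aligned.
fn readWord(bytes: []const u8) ?u32 {
    if (bytes.len < @sizeOf(u32)) return null;
    // Check the address up front so @alignCast never sees a misaligned pointer.
    if (@intFromPtr(bytes.ptr) % @alignOf(u32) != 0) return null;
    const word: *const u32 = @ptrCast(@alignCast(bytes.ptr));
    return word.*;
}

pub fn main() void {
    var raw align(@alignOf(u32)) = [_]u8{ 0x44, 0x33, 0x22, 0x11, 0, 0, 0, 0 };

    if (readWord(raw[0..])) |word| {
        // The printed value depends on the target's byte order.
        std.debug.print("aligned read: 0x{X:0>8}\n", .{word});
    }
    // Starting one byte in breaks 4-byte alignment, so the helper declines.
    std.debug.print("offset read is null: {}\n", .{readWord(raw[1..]) == null});
}
```

Returning an optional pushes the decision to the caller, who can fall back to a byte-wise copy when the fast path is unavailable.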

Exercises

  • Extend arrays_and_slices.zig to create a zero-length mutable slice from a runtime array, then append via std.ArrayList to observe which slice views remain valid and which are invalidated when the list reallocates.
  • Modify sentinel_strings.zig to accept a user-supplied [:0]u8 and guard against inputs missing the sentinel by returning an error union.
  • Enhance alignment_and_casts.zig by adding a branch that rejects values whose low byte is zero before truncation, surfacing how @intCast depends on caller-supplied range guarantees.
