Overview
Our second project graduates from arithmetic to text processing: a tiny grep clone that accepts a search pattern and a file path, then prints only the matching lines. The exercise reinforces argument handling from the previous chapter while introducing file I/O and slice utilities from the standard library.
Instead of streaming byte-by-byte, we lean on Zig’s memory-safe helpers to load the file, split it into lines, and surface hits with straightforward substring checks. Every failure path produces a user-friendly message before exiting, so the tool behaves predictably inside shell scripts—a theme we will carry into the next project. See #Command-line-flags and File.zig for related APIs, and #Error-Handling for error-handling patterns.
Learning Goals
- Implement a command-line parsing routine that supports --help, enforces arity, and terminates gracefully on misuse.
- Use std.fs.File.readToEndAlloc and std.mem.splitScalar to load and iterate over file contents (see mem.zig).
- Filter lines with std.mem.indexOf and report results via stdout while directing diagnostics to stderr (see debug.zig).
Building the Search Harness
We start by wiring the CLI front end: allocate arguments, honor --help, and confirm that exactly two positional parameters (pattern and path) are present. Running with no arguments or with --help prints the usage banner and exits successfully; any other argument count prints an error plus the banner and exits with code 1, avoiding stack traces while still signaling failure to the caller.
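The arity rules above can be sketched as a small pure function. This is a minimal illustration, not the chapter's code: ArgCheck and classifyArgs are hypothetical names, and the complete listing later in the chapter inlines the same checks in main. Real argument slices from std.process.argsAlloc use a sentinel-terminated element type, so the parameter type here would need adjusting before reuse.

```zig
const std = @import("std");

// Hypothetical helper type for illustration; the chapter's listing inlines these checks.
const ArgCheck = enum { show_help, misuse, ok };

// Classify an argument vector where args[0] is the program name.
fn classifyArgs(args: []const []const u8) ArgCheck {
    if (args.len == 1) return .show_help;
    if (args.len == 2 and std.mem.eql(u8, args[1], "--help")) return .show_help;
    if (args.len != 3) return .misuse;
    return .ok;
}

pub fn main() void {
    // Sample invocations; the file name is made up for the demonstration.
    const samples = [_][]const []const u8{
        &[_][]const u8{"grep-lite"},
        &[_][]const u8{ "grep-lite", "--help" },
        &[_][]const u8{ "grep-lite", "needle" },
        &[_][]const u8{ "grep-lite", "needle", "notes.txt" },
    };
    for (samples) |args| {
        std.debug.print("{d} args -> {s}\n", .{ args.len, @tagName(classifyArgs(args)) });
    }
}
```

Keeping the decision separate from the printing makes the rules easy to check in a test block before any process exit is involved.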
Validating Arguments and Usage Paths
The skeleton mirrors Chapter 5’s TempConv CLI, but now we emit diagnostics to stderr and exit explicitly whenever input is wrong or a file cannot be opened. printUsage keeps the banner in one place, and std.process.exit guarantees we stop immediately after the message is written.
Loading and Splitting the File
Rather than juggling partial reads, we load the file into memory with File.readToEndAlloc, capping the size to eight megabytes to guard against unexpected giants. A single call to std.mem.splitScalar then produces an iterator over newline-delimited segments, which we trim for Windows-style carriage returns.
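To see the split-and-trim step in isolation, the sketch below runs it over an in-memory buffer instead of a file. The sample text is an assumption standing in for the bytes readToEndAlloc would return, and trimCarriageReturn is a hypothetical name for a helper that mirrors trimNewline from the complete listing.

```zig
const std = @import("std");

// Drop one trailing '\r', mirroring the trimNewline helper in the chapter's listing.
fn trimCarriageReturn(line: []const u8) []const u8 {
    if (line.len > 0 and line[line.len - 1] == '\r') return line[0 .. line.len - 1];
    return line;
}

pub fn main() void {
    // Stand-in for file contents: one CRLF line, two LF lines.
    const contents = "alpha\r\nbeta\ngamma\n";

    var lines = std.mem.splitScalar(u8, contents, '\n');
    var index: usize = 0;
    while (lines.next()) |raw_line| : (index += 1) {
        const line = trimCarriageReturn(raw_line);
        // Each slice points into `contents`; nothing is copied.
        std.debug.print("{d}: \"{s}\" ({d} bytes)\n", .{ index, line, line.len });
    }
}
```

Because the buffer ends with a newline, the iterator yields one final empty segment. That segment never matches a non-empty pattern, so the tool does not need to special-case it.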
Understanding std.fs Structure
Before diving into file operations, it’s helpful to understand how Zig’s filesystem API is organized. The std.fs module provides a layered hierarchy that makes file access portable and composable.
Key concepts:
- Entry Point: std.fs.cwd() returns a Dir handle representing the current working directory
- Dir Type: Provides directory-level operations like opening files, creating subdirectories, and iterating contents
- File Type: Represents an open file with read/write operations
- Chained Calls: You call cwd().openFile() because openFile() is a method on the Dir type
Why this structure matters for Grep-Lite:
// This is why we write:
const file = try std.fs.cwd().openFile(path, .{});
//                      ^     ^
//                      |     +-- Method on Dir
//                      +----------- Returns Dir handle
The two-step process (cwd() → openFile()) gives you control over which directory to open files in. While this example uses the current directory, you could equally use:
- std.fs.openDirAbsolute() for absolute paths
- dir.openFile() for files relative to any directory handle
- std.fs.openFileAbsolute() to skip the Dir entirely
This composable design makes filesystem code testable (use a temporary directory) and portable (the same API works across platforms).
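As a rough illustration of the testability point, the sketch below reads a file through an arbitrary Dir handle and exercises it against a temporary directory. It assumes the same Zig release as the chapter's listing (the writeFile options struct and readToEndAlloc have changed across versions), readAll is a hypothetical helper name, and the block is meant to be run with zig test rather than zig run.

```zig
const std = @import("std");

// Hypothetical helper: read a whole file relative to any directory handle.
fn readAll(dir: std.fs.Dir, allocator: std.mem.Allocator, sub_path: []const u8, max_bytes: usize) ![]u8 {
    const file = try dir.openFile(sub_path, .{});
    defer file.close();
    return file.readToEndAlloc(allocator, max_bytes);
}

test "readAll works against a temporary directory" {
    // tmpDir creates a scratch directory that cleanup() removes again.
    var tmp = std.testing.tmpDir(.{});
    defer tmp.cleanup();

    try tmp.dir.writeFile(.{ .sub_path = "sample.txt", .data = "needle\nhay\n" });

    const contents = try readAll(tmp.dir, std.testing.allocator, "sample.txt", 1024);
    defer std.testing.allocator.free(contents);
    try std.testing.expectEqualStrings("needle\nhay\n", contents);
}
```

Passing std.fs.cwd() as the first argument gives the behavior Grep-Lite uses, while the test never touches the real working directory.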
Scanning for Matches
Once we own a slice for each line, matching is a one-liner with std.mem.indexOf. We reuse the TempConv pattern of reserving stdout for successful output and stderr for diagnostics, making the tool piping-friendly.
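The matching step can be tried on its own with a few lines of code. In this minimal sketch, countMatches is a hypothetical helper and the sample text is made up; the chapter's listing performs the same check inline and prints each hit instead of counting.

```zig
const std = @import("std");

// Hypothetical helper: count the lines of `contents` that contain `pattern`.
fn countMatches(contents: []const u8, pattern: []const u8) usize {
    var count: usize = 0;
    var lines = std.mem.splitScalar(u8, contents, '\n');
    while (lines.next()) |line| {
        // indexOf returns the byte offset of the first hit, or null when absent.
        if (std.mem.indexOf(u8, line, pattern) != null) count += 1;
    }
    return count;
}

pub fn main() void {
    const sample = "const pattern = args[1];\nconst path = args[2];\n";
    std.debug.print("lines containing 'pattern': {d}\n", .{countMatches(sample, "pattern")});
}
```

The comparison is byte-for-byte, so the search is case-sensitive and treats the pattern as a literal string rather than a regular expression.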
Complete Grep-Lite Listing
The full listing below highlights how the helper functions slot together. Pay attention to the comments that tie each block back to the sections above.
const std = @import("std");

// Chapter 6 – Grep-Lite: load a file, scan it line by line, and echo only the
// matches to stdout while errors become clear diagnostics on stderr.

const CliError = error{MissingArgs};

fn printUsage() void {
    std.debug.print("usage: grep-lite <pattern> <path>\n", .{});
}

fn trimNewline(line: []const u8) []const u8 {
    if (line.len > 0 and line[line.len - 1] == '\r') {
        return line[0 .. line.len - 1];
    }
    return line;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    if (args.len == 1 or (args.len == 2 and std.mem.eql(u8, args[1], "--help"))) {
        printUsage();
        return;
    }

    if (args.len != 3) {
        std.debug.print("error: expected a pattern and a path\n", .{});
        printUsage();
        std.process.exit(1);
    }

    const pattern = args[1];
    const path = args[2];

    var file = std.fs.cwd().openFile(path, .{ .mode = .read_only }) catch {
        std.debug.print("error: unable to open '{s}'\n", .{path});
        std.process.exit(1);
    };
    defer file.close();

    // Buffered stdout using modern Writer API
    var out_buf: [8 * 1024]u8 = undefined;
    var file_writer = std.fs.File.writer(std.fs.File.stdout(), &out_buf);
    const stdout = &file_writer.interface;

    // Section 1.2: load the complete file eagerly while enforcing a guard so
    // unexpected multi-megabyte inputs do not exhaust memory.
    const max_bytes = 8 * 1024 * 1024;
    const contents = file.readToEndAlloc(allocator, max_bytes) catch |err| switch (err) {
        error.FileTooBig => {
            std.debug.print("error: file exceeds {} bytes limit\n", .{max_bytes});
            std.process.exit(1);
        },
        else => return err,
    };
    defer allocator.free(contents);

    // Section 2.1: split the buffer on newlines; each slice references the
    // original allocation so we incur zero extra copies.
    var lines = std.mem.splitScalar(u8, contents, '\n');
    var matches: usize = 0;

    while (lines.next()) |raw_line| {
        const line = trimNewline(raw_line);

        // Section 2: reuse `std.mem.indexOf` so we highlight exact matches
        // without building temporary slices.
        if (std.mem.indexOf(u8, line, pattern) != null) {
            matches += 1;
            try stdout.print("{s}\n", .{line});
        }
    }

    if (matches == 0) {
        std.debug.print("no matches for '{s}' in {s}\n", .{ pattern, path });
    }

    // Flush buffered stdout and finalize file position
    try file_writer.end();
}
$ zig run grep_lite.zig -- pattern grep_lite.zig
    std.debug.print("usage: grep-lite <pattern> <path>\n", .{});
        std.debug.print("error: expected a pattern and a path\n", .{});
    const pattern = args[1];
        if (std.mem.indexOf(u8, line, pattern) != null) {
        std.debug.print("no matches for '{s}' in {s}\n", .{ pattern, path });
The output shows every source line containing the literal word pattern. Your match list will differ when run against other files.
Detecting Missing Files Gracefully
To keep shell scripts predictable, the tool emits a single-line diagnostic and exits with a non-zero status when a file path cannot be opened.
$ zig run grep_lite.zig -- foo missing.txt
error: unable to open 'missing.txt'
Notes & Caveats
- readToEndAlloc is simple but loads the entire file; add a streaming reader later if you need to handle very large inputs.
- The size cap prevents runaway allocations. Raise it or make it configurable once you trust your deployment environment.
- This example uses a buffered stdout writer for matches and std.debug.print for diagnostics to stderr; we flush via the writer’s end() at exit (see Io.zig).
Exercises
- Accept multiple files on the command line and print a path:line prefix for each match (see #for).
- Add a --ignore-case flag by normalizing both the pattern and each line with std.ascii.toLower (see ascii.zig).
- Support regular expressions by integrating a third-party matcher after loading the entire buffer.
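As a starting point for the --ignore-case exercise, here is one possible shape for the comparison, shown as a hint rather than a finished solution. containsIgnoreCase is a hypothetical helper; it handles ASCII only and compares bytes through std.ascii.toLower on the fly instead of building lowercased copies, which is an alternative to the normalization the exercise describes.

```zig
const std = @import("std");

// Hypothetical helper: ASCII-only, allocation-free case-insensitive substring test.
fn containsIgnoreCase(haystack: []const u8, needle: []const u8) bool {
    if (needle.len > haystack.len) return false;
    var start: usize = 0;
    // Try every window of needle.len bytes inside the haystack.
    while (start + needle.len <= haystack.len) : (start += 1) {
        var matched = true;
        for (needle, haystack[start .. start + needle.len]) |n, h| {
            if (std.ascii.toLower(n) != std.ascii.toLower(h)) {
                matched = false;
                break;
            }
        }
        if (matched) return true;
    }
    return false;
}

pub fn main() void {
    const line = "Usage: Grep-Lite <pattern> <path>";
    std.debug.print("contains 'grep-lite': {}\n", .{containsIgnoreCase(line, "grep-lite")});
}
```

Wiring this behind a flag still requires extending the argument parsing, which is the remaining part of the exercise.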
Alternatives & Edge Cases
- Windows files often end lines with \r\n; trimming the carriage return keeps substring checks clean.
- Empty patterns currently match every line. Introduce an explicit guard if you prefer to treat an empty string as misuse.
- To integrate with larger builds, replace zig run with a zig build-exe step and package the binary on your PATH.
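The empty-pattern guard mentioned in the list above could look roughly like the sketch below. PatternError and validatePattern are hypothetical names introduced for illustration; the chapter's listing has no such check.

```zig
const std = @import("std");

// Hypothetical error set for illustration.
const PatternError = error{EmptyPattern};

// Reject an empty search pattern, otherwise hand it back unchanged.
fn validatePattern(pattern: []const u8) PatternError![]const u8 {
    if (pattern.len == 0) return error.EmptyPattern;
    return pattern;
}

pub fn main() void {
    // An empty needle is found at the start of any haystack, so every line matches.
    const hit = std.mem.indexOf(u8, "abc", "");
    std.debug.print("empty needle found: {}\n", .{hit != null});

    const inputs = [_][]const u8{ "needle", "" };
    for (inputs) |input| {
        if (validatePattern(input)) |ok| {
            std.debug.print("accepted '{s}'\n", .{ok});
        } else |err| {
            std.debug.print("rejected: {s}\n", .{@errorName(err)});
        }
    }
}
```

In the real tool, the rejected branch would print the usage banner and exit with a non-zero status, matching the other misuse paths.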