patterns
practical patterns for using std.Io in real applications.
backend selection
const std = @import("std");
const Io = std.Io;
const Backend = if (Io.Evented != void) Io.Evented else Io.Threaded;
var backend: Backend = undefined;
pub fn main() !void {
    const allocator = std.heap.smp_allocator;
    if (Backend == Io.Threaded) {
        backend = Io.Threaded.init(allocator, .{});
    } else {
        try Backend.init(&backend, allocator, .{});
    }
    const io = backend.io();
    // pass io to your app
    try app(io, allocator);
}
production: Io.Evented has known bugs as of 0.16.0-dev.3059 (see "Evented production experience" below). application code is identical between backends; only the init call differs.
Threaded InitOptions
Io.Threaded.init(allocator, opts) accepts:
Io.Threaded.init(allocator, .{
    .stack_size = 8 * 1024 * 1024, // default: 16MB (std.Thread.SpawnConfig.default_stack_size)
    .async_limit = .{ .value = 8 }, // default: CPU count - 1
    .concurrent_limit = .{ .value = 64 }, // default: .unlimited
});
| option | default | what it does |
|---|---|---|
| stack_size | 16MB | per-thread stack. affects all spawned threads. |
| async_limit | CPU - 1 | bounded pool for io.async(). overflow runs task inline. |
| concurrent_limit | .unlimited | pool for io.concurrent(). overflow returns error.ConcurrencyUnavailable. |
the thread explosion lesson
with default concurrent_limit = .unlimited, every io.concurrent() call that outlives its parent creates a permanent OS thread. a relay connecting to 2,000+ hosts with 2 concurrent tasks each (read loop + ping loop) creates ~4,000 threads at 16MB stack = 64GB virtual memory.
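the arithmetic behind that figure, as a small sketch. stackReservation is a hypothetical helper written for this note, not part of std, and the "tuned" numbers assume the 8MB stack and 1-task-per-host layout suggested in the mitigations.

```zig
const std = @import("std");

const mib: u64 = 1024 * 1024;

/// hypothetical helper: virtual memory reserved for thread stacks, in bytes,
/// assuming every concurrent task gets its own OS thread.
fn stackReservation(hosts: u64, tasks_per_host: u64, stack_size: u64) u64 {
    return hosts * tasks_per_host * stack_size;
}

pub fn main() void {
    // 2,000 hosts, read loop + ping loop each, default 16MB stacks
    const default_bytes = stackReservation(2000, 2, 16 * mib);
    // same hosts, ping merged into the read loop, 8MB stacks
    const tuned_bytes = stackReservation(2000, 1, 8 * mib);

    std.debug.print("default: {d} MiB\n", .{default_bytes / mib}); // 64000 MiB
    std.debug.print("tuned:   {d} MiB\n", .{tuned_bytes / mib}); // 16000 MiB
}
```

64,000 MiB is roughly the 64GB quoted above. this is reserved address space rather than resident memory, so it shows up as virtual size, not RSS.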
mitigations:
- set concurrent_limit to a bounded value
- set stack_size to what you actually need (8MB is plenty for I/O tasks)
- reduce concurrent tasks per unit of work: merge ping into read loop (1 task per host, not 2)
- use Io.Group for lifecycle management: cancel all subscribers on shutdown
under Evented, io.concurrent() creates fibers (cheap userspace stacks). the 2-tasks-per-host architecture is fine there. Threaded InitOptions let you bound the damage until Evented is production-ready.
debug_io override
std.Options.debug_io is backed by a single-threaded instance. using it for application I/O silently serializes everything.
symptom: coral went from ~60 events/s to ~4/s after migrating to Io.Mutex with std.Options.debug_io.
override in your root source file:
var app_threaded_io: Io.Threaded = undefined;
pub const std_options_debug_threaded_io: ?*Io.Threaded = &app_threaded_io;
pub fn main() !void {
    const allocator = std.heap.smp_allocator; // same allocator as the backend selection example
    app_threaded_io = Io.Threaded.init(allocator, .{});
    // now all std.Options.debug_io usage gets the real threaded instance
}
this works because the Io struct holds a pointer to Threaded — the pointer is stable even though the data is undefined at comptime.
or just pass io explicitly — create Io.Threaded in main, call .io(), thread it through functions. avoids globals entirely but more invasive.
long-lived task lifecycle
replacing std.Thread.spawn with io.concurrent for I/O-bound loops:
// old pattern
self.thread = try std.Thread.spawn(.{ .stack_size = 8 * 1024 * 1024 }, runLoop, .{self});
// ... later:
if (self.thread) |t| t.join();
// new pattern
self.future = try io.concurrent(runLoop, .{self});
// ... later:
_ = self.future.cancel(io);
the task function should exit cleanly on cancellation:
fn runLoop(self: *Self) void {
    while (!self.shouldStop()) {
        self.io.sleep(Io.Duration.fromMilliseconds(100), .awake) catch break;
        // ... work ...
    }
}
io.sleep() is a cancellation point. when future.cancel(io) is called, sleep returns error.Canceled. the catch break exits the loop.
cancel vs await
- cancel(io): requests cancellation + blocks until done. returns the task's result.
- await(io): just blocks until done. no cancellation request.
- both are idempotent and consume the future.
- both are NOT threadsafe: only call from the parent task.
managing dynamic task sets with Group
for a dynamic set of long-lived tasks (e.g., subscriber connections):
var subscribers: Io.Group = .init;
// spawn subscribers as they're discovered
for (hosts) |host| {
    subscribers.concurrent(io, runSubscriber, .{host, io}) catch {
        log.warn("concurrent limit reached for {s}", .{host});
        continue;
    };
}
// on shutdown: cancel all at once
subscribers.cancel(io);
Group resources per task are freed when that task returns, not when the group is awaited. safe for long-lived groups where tasks come and go.
std.net moved to Io.net
const net = Io.net;
// connecting
const host_name = try net.HostName.init(host);
const stream = try host_name.connect(io, port, .{});
// listening
var addr = try net.IpAddress.parse("::", port);
var server = try net.IpAddress.listen(&addr, io, .{ .reuse_address = true });
defer server.deinit(io);
// accepting
const stream = try server.accept(io);
// reading/writing (need wrapper)
var reader = net.Stream.Reader.init(stream, io, &read_buf);
var writer = net.Stream.Writer.init(stream, io, &write_buf);
net.Stream no longer has direct read/writeAll. use Stream.Reader/Stream.Writer.
Evented production experience
field notes from running an AT Protocol relay (~2,800 PDS connections) on
Io.Evented with 0.16.0-dev.3059, kernel 6.8.0-101-generic.
fiber contextSwitch GPF under ReleaseSafe
Io.Evented fibers crash immediately under ReleaseSafe on x86_64. the GPF is
in std.Io.fiber.contextSwitch — the inline asm that saves/restores
rsp/rbp/rip. the optimizer under ReleaseSafe arranges the code differently than
ReleaseFast, causing the restored instruction pointer to fault.
General protection exception (no address available)
lib/std/Io/fiber.zig:30 in contextSwitch
lib/std/Io/Uring.zig:1142 in mainIdle
consequence: Evented currently requires ReleaseFast, which strips all safety checks. any bounds error, null dereference, or use-after-free becomes silent memory corruption instead of a clean panic with stack trace.
status: zig stdlib bug. no workaround other than ReleaseFast. a minimal repro (fiber that returns without yielding) triggers it on the first context switch.
cross-backend bridging (Evented fibers ↔ Threaded workers)
the Io interface is backend-agnostic, but you cannot mix execution contexts. Evented fibers cannot safely lock a Threaded mutex — the scheduler accesses thread-local state that doesn't exist in the fiber context.
pattern: bridge with a lock-free MPSC queue using atomics:
[Evented fibers] --atomics→ [ring buffer] --wake→ [Threaded worker pool]
Evented subscriber fibers enqueue work items via atomic CAS. a bounded set of Threaded workers dequeue and execute (e.g., postgres queries). no mutex crossing between backends.
this is the "DbRequestQueue" pattern — decouples the hot networking path (Evented) from blocking I/O (database) that can't run in fibers.
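a minimal sketch of the queue half of that bridge, assuming a power-of-two capacity and a single consumer. BoundedQueue, tryPush, and tryPop are hypothetical names for illustration; this is not the relay's actual DbRequestQueue, and the step that wakes the Threaded worker is left out. it uses per-slot sequence numbers (the well-known Vyukov bounded-queue scheme) so producers only contend on one CAS.

```zig
const std = @import("std");

/// hypothetical bounded queue: many producers, one consumer.
/// each slot carries a sequence number that says whose turn it is.
pub fn BoundedQueue(comptime T: type, comptime capacity: usize) type {
    if (capacity == 0 or (capacity & (capacity - 1)) != 0)
        @compileError("capacity must be a power of two");

    return struct {
        const Self = @This();
        const Seq = std.atomic.Value(usize);
        const Slot = struct { seq: Seq, value: T };

        slots: [capacity]Slot,
        head: Seq, // next position producers will claim
        tail: usize, // next position to pop; touched by the single consumer only

        pub fn init(self: *Self) void {
            for (&self.slots, 0..) |*slot, i| slot.seq = Seq.init(i);
            self.head = Seq.init(0);
            self.tail = 0;
        }

        /// safe from any number of producers. returns false when full.
        pub fn tryPush(self: *Self, value: T) bool {
            var pos = self.head.load(.monotonic);
            while (true) {
                const slot = &self.slots[pos & (capacity - 1)];
                const diff: isize = @bitCast(slot.seq.load(.acquire) -% pos);
                if (diff == 0) {
                    // slot is free for this position: claim it with a CAS on head
                    if (self.head.cmpxchgWeak(pos, pos +% 1, .monotonic, .monotonic)) |actual| {
                        pos = actual; // lost the race (or spurious failure), retry
                        continue;
                    }
                    slot.value = value;
                    slot.seq.store(pos +% 1, .release); // publish to the consumer
                    return true;
                } else if (diff < 0) {
                    return false; // consumer has not freed this slot yet: queue is full
                } else {
                    pos = self.head.load(.monotonic); // another producer moved head
                }
            }
        }

        /// single consumer only. returns null when empty.
        pub fn tryPop(self: *Self) ?T {
            const slot = &self.slots[self.tail & (capacity - 1)];
            if (slot.seq.load(.acquire) != self.tail +% 1) return null;
            const value = slot.value;
            // hand the slot back to producers for the next lap
            slot.seq.store(self.tail +% capacity, .release);
            self.tail +%= 1;
            return value;
        }
    };
}

pub fn main() void {
    var queue: BoundedQueue(u32, 4) = undefined;
    queue.init();
    for (0..5) |i| {
        const ok = queue.tryPush(@intCast(i));
        std.debug.print("push {d}: {s}\n", .{ i, if (ok) "ok" else "full" });
    }
    while (queue.tryPop()) |item| std.debug.print("pop {d}\n", .{item});
}
```

a full queue returns false instead of blocking, so the producing fiber never waits on a primitive owned by the other backend. what to do on false (drop, retry later, count it) is left to the caller. with several workers draining the queue, tryPop would need the same CAS treatment on tail that tryPush uses on head.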
safety checks matter more under Evented
under Threaded/ReleaseSafe, a bounds error panics with a stack trace pointing to the exact line. under Evented/ReleaseFast (forced by the GPF bug), the same error silently corrupts memory and manifests as a SIGSEGV minutes or hours later with no useful diagnostic.
example: a websocket library assumed \r\n always arrives in a single TCP
read. when TCP splits mid-CRLF, line_start advances past pos and the
next buf[line_start..pos] slice has start > end. under ReleaseSafe this
is an immediate panic:
thread 543 panic: start index 1370 is larger than end index 1369
websocket.zig/src/client/client.zig:766
under ReleaseFast: silent corruption → SIGSEGV every 30-90 min across ~2,800 connections. took switching back to Threaded/ReleaseSafe to get the stack trace that identified the real bug.
lesson: when forced into ReleaseFast by the fiber GPF, you lose the single most valuable debugging tool zig provides. any bug that would be trivially caught by bounds checking becomes a production mystery.
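to make the failure mode concrete, here is a sketch of a line scanner that tolerates a terminator split across reads. nextLine is a hypothetical helper written for this note; it is not the websocket library's code or its actual fix. the idea is to advance the cursor only after a complete \r\n has been seen, so the start index can never pass the end index.

```zig
const std = @import("std");

/// hypothetical helper: returns the next complete CRLF-terminated line in
/// buf[start.*..end] and advances start.* past the terminator. returns null,
/// leaving start.* untouched, when no full "\r\n" has arrived yet.
fn nextLine(buf: []const u8, start: *usize, end: usize) ?[]const u8 {
    std.debug.assert(start.* <= end and end <= buf.len);
    var i = start.*;
    while (i + 1 < end) : (i += 1) {
        if (buf[i] == '\r' and buf[i + 1] == '\n') {
            const line = buf[start.*..i];
            start.* = i + 2; // never exceeds end, because i + 1 < end
            return line;
        }
    }
    return null; // includes the case where the data ends in a lone '\r'
}

pub fn main() void {
    var buf: [64]u8 = undefined;
    var start: usize = 0;
    var end: usize = 0;

    // simulate two TCP reads that split the terminator between '\r' and '\n'
    const reads = [_][]const u8{ "HTTP/1.1 101\r", "\nUpgrade: websocket\r\n" };
    for (reads) |chunk| {
        @memcpy(buf[end..][0..chunk.len], chunk);
        end += chunk.len;
        while (nextLine(&buf, &start, end)) |line| {
            std.debug.print("line: {s}\n", .{line});
        }
        std.debug.print("waiting for more data (start={d}, end={d})\n", .{ start, end });
    }
}
```

after the first read the scanner reports no line and leaves start at 0; both lines come out once the second read completes the terminator. the assert at the top is the kind of check that only fires in safe builds, which is the point of this section.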
thread count: Evented vs Threaded
| backend | OS threads | subscriber tasks | RSS |
|---|---|---|---|
| Threaded (ReleaseSafe) | ~2,830 | ~2,830 | ~1.9 GiB |
| Evented (ReleaseFast) | ~47 | ~2,830 | ~1.2 GiB |
Evented runs the same ~2,800 subscriber tasks on ~47 OS threads (bounded worker pool + io_uring event loop). RSS is lower partly due to fewer thread stacks and partly due to ReleaseFast stripping safety metadata.
uring networking patch
Io.Uring ships with networking functions stubbed out as *Unavailable
(return error.NetworkDown). to use Evented for real networking, you need to
patch Uring.zig to implement netListenIp, netAccept, netConnectIp,
netSend, netRead, netWrite using io_uring opcodes (ACCEPT, CONNECT,
SENDMSG, READV, etc.).
note: bind and listen use sync syscalls because IORING_OP_BIND /
IORING_OP_LISTEN require kernel 6.11+. DNS resolution (netLookup) is
also not patched — subscribers resolve hostnames through a Threaded pool_io
fallback.