// essay·2026-05-06·14 MIN READ·2,707 WORDS//build

Latency is a worldview

Network engineering as a metaphor for how good thinkers structure their lives. Async, batched, queued, cached — and the one bad path that ruins all of them.

The argument was about Kubernetes. It was, of course, not about Kubernetes.

Two engineers, one whiteboard, ninety minutes. The visible argument was over whether to put a particular workload on a managed Kubernetes cluster or on a small fleet of plain virtual machines. The invisible argument — the one that made the visible argument so heated that the room got cold and stayed cold for three days — was about the kind of person each of them wanted to be at work. One of them wanted to be on call once a quarter. The other wanted to be available within the hour. Both of them were calling their preference "best practices."

I left that meeting and spent the rest of the week thinking about latency.

Not network latency. Worldview latency. The amount of time, in any given system — including the systems we wear around inside our skulls — between an input arriving and a response being produced. We've spent fifty years writing books, papers, RFCs, and Stack Overflow answers about how to manage latency in machines. Almost none of that wisdom has crossed the table to how we manage latency in ourselves, and the consequences are everywhere.

This essay is the crossing. Buckle in. The whole frame is a metaphor; the metaphor is more literal than you think.

The four operations every system performs

Strip a distributed system down to its bones and you get four operations. Strip a working person down to their bones and — surprise — you get the same four operations. The names that engineers use for these operations are the most precise vocabulary humanity has ever invented for them. We are absurd to keep this vocabulary locked in datacenter design instead of wielding it on, say, parenting and grocery shopping.

The four are:

Sync — a request comes in, you stop everything, you produce an answer, the requester waits. Cost: high, both for you and for them. Latency: bounded by your throughput plus their patience. Examples: the boss walks into your office; the kid says "Dad," and means it; the production cluster is on fire and you're the one holding the runbook.

Async — a request comes in, you acknowledge it, you put it in a queue, you'll get to it. The requester goes off and does something else. Cost: lower per-request, because you can batch. Latency: variable, with an SLA you control. Examples: email; Slack DMs to people who respect the signal-to-noise ratio of Slack; that quarterly report your CFO asked about and definitely needs by next Thursday but absolutely does not need to interrupt your deep-work block to ask about.

Batch — multiple requests, processed together as a group, on a schedule. Cost: lowest. Latency: highest. Examples: Friday afternoon expense reports; the half-day every two weeks you spend reviewing every PR you owe a comment on; the way good editors process every piece of incoming mail in a single thirty-minute session, twice a week, instead of opening their inbox three hundred times a day.

Cached — a request comes in, you've answered it before, you return the prior answer. Cost: ~zero. Latency: ~zero. Examples: an FAQ page; a "here's the document I already wrote about this" reply; the polite, two-sentence "yes, here's how I think about that" that experienced people send instead of re-thinking-from-scratch every time someone asks them a question they've been asked sixty times.

That's the entire stack. Every working system uses these four primitives in some mix. The question is what mix.
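For readers who think better in code, here is a minimal sketch of the four operations as methods on one toy worker. The class name `Worker`, its method names, and the string answers are illustrative assumptions, not anything from a real library. The only point is how the primitives relate to each other.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Worker:
    """Toy model of one person (or service) using the four primitives."""
    cache: dict = field(default_factory=dict)    # question -> prior answer
    queue: deque = field(default_factory=deque)  # acknowledged, not yet done
    log: list = field(default_factory=list)      # questions answered from scratch

    def _think(self, question: str) -> str:
        # Stand-in for the expensive part: producing an answer from scratch.
        self.log.append(question)
        return f"answer to {question!r}"

    def cached(self, question: str):
        # Cached: return the prior answer if one exists, else None.
        return self.cache.get(question)

    def sync(self, question: str) -> str:
        # Sync: drop everything and answer now; the requester waits.
        answer = self.cached(question) or self._think(question)
        self.cache[question] = answer
        return answer

    def async_(self, question: str) -> str:
        # Async: acknowledge, enqueue, and let the requester move on.
        self.queue.append(question)
        return f"ack: queued at position {len(self.queue)}"

    def batch(self) -> list:
        # Batch: drain the whole queue in one scheduled session.
        answers = []
        while self.queue:
            answers.append(self.sync(self.queue.popleft()))
        return answers
```

Queue the same question twice alongside a different one, call `batch()` once, and the worker thinks from scratch only twice: the repeat is served from the cache. The mix is the design decision; the primitives themselves are trivial.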

The mix you choose between sync, async, batch, and cached is the most honest description of your character that exists. It is, in fact, more honest than your character.

— latency is a worldview

The default mix is wrong (for everyone)

If you do nothing — if you let your inputs drive your outputs, the way most people do, the way the apps on your phone are designed to encourage — your mix will trend, with the inevitability of entropy, toward all sync, no cache, no batching, no async.

Every notification interrupts you. Every interrupt is treated as urgent. The cache is empty because you keep re-deriving every answer from scratch. The batch is empty because you couldn't bring yourself to not respond to that DM right now. The queue is empty because if anything went into it for more than ten minutes you'd start feeling guilty.

This is the mode 80% of working people are in 80% of the time. It is the mode that produces, over a career, the specific kind of exhausted competence that is impressive at thirty, depleting at forty, and ruinous at fifty. It is also — and this is the part that should make you sit up — exactly the architecture that distributed systems engineers have known is wrong since approximately 1974.

A web service that answered every request synchronously, with no caching, with no queuing, would fall over in five minutes. A web service architected the way most people architect their work week is, in a very real sense, a service architected by someone who has never read a book about architecting services.
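The arithmetic behind that claim fits in a few lines. This is a rough sketch, assuming a single worker, a steady arrival rate, and cache hits that cost nothing. The function name and every number in it are made up for illustration, not measured from any real service.

```python
def backlog_after(minutes: int, arrivals_per_min: float,
                  served_per_min: float, cache_hit_rate: float = 0.0) -> float:
    """Requests left waiting after `minutes`, for one worker.

    Cache hits are treated as free, so only misses consume capacity.
    All inputs are illustrative assumptions, not measurements.
    """
    backlog = 0.0
    misses_per_min = arrivals_per_min * (1.0 - cache_hit_rate)
    for _ in range(minutes):
        # Work arrives, the worker serves what it can, the rest piles up.
        backlog = max(0.0, backlog + misses_per_min - served_per_min)
    return backlog


# 120 requests a minute against capacity for 100, everything answered from scratch.
no_cache = backlog_after(5, arrivals_per_min=120, served_per_min=100)

# Same load, but half the questions have been answered before.
with_cache = backlog_after(5, arrivals_per_min=120, served_per_min=100,
                           cache_hit_rate=0.5)
```

With no cache the pile grows by twenty requests every minute and never shrinks. With half the load served from cache, the same worker keeps up with room to spare.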

The interrupt is the bug

Every distributed-systems engineer learns, somewhere around their third on-call rotation, that the most expensive operation in any system is the unplanned synchronous request. The pager. The page that wakes you at 3 a.m. The thing that breaks the work you were doing, drains the cache you had warm in your head, fragments the batch you had pending, and replaces all of it with one urgent, immediate, now.

The cost of an interrupt is not the time it takes to handle it. The cost of an interrupt is the re-warming time — the twenty minutes after the interrupt during which you cannot get back to where you were, plus the fragmentation it leaves behind, where the thing you were working on now has a seam where there shouldn't be a seam, and the seam will haunt the work for as long as it ships.
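A back-of-the-envelope version of that cost, as a sketch. The twenty-minute re-warm is the figure used above. The three-minute handling time, and the assumption that a batch window costs one re-warm per window instead of one per message, are mine for illustration. The seam left in the work is not modeled at all.

```python
REWARM_MINUTES = 20  # the re-warming time described in the text


def interrupt_cost(handling_minutes: list, rewarm: int = REWARM_MINUTES) -> int:
    """Minutes lost when each request interrupts on its own."""
    # Every interrupt pays its handling time plus a full re-warm.
    return sum(h + rewarm for h in handling_minutes)


def batched_cost(handling_minutes: list, sessions: int = 2,
                 rewarm: int = REWARM_MINUTES) -> int:
    """Minutes lost when the same requests wait for scheduled windows.

    Assumption: one re-warm per window, however many requests it holds.
    """
    return sum(handling_minutes) + rewarm * sessions


six_quick_questions = [3] * 6  # hypothetical day: six 3-minute requests
as_interrupts = interrupt_cost(six_quick_questions)
as_batches = batched_cost(six_quick_questions, sessions=2)
```

Under these assumptions the handling time is identical in both cases, eighteen minutes. The difference is entirely re-warming: six re-warms against two.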

Every productivity book ever written has tried to convince you of this and failed, because every productivity book has framed it as a willpower problem. It is not a willpower problem. It is an architecture problem. The fix is not "have stronger boundaries." The fix is to design a system in which interrupts are expensive, visible, and rare — the same way they are in any good distributed system.

In practice, for most people, this looks like:

A single interrupt channel with a high bar. Phone for actual emergencies. Everyone else routes through some async surface — email, ticket system, scheduled time. The high bar is that you actually teach people the bar, by not responding to non-emergencies through the emergency channel. They will learn. People are smart. They have evolved over millions of years to figure out which signals get a fast response and which signals don't, and they will adapt to your system within roughly a week.

Aggressive batching of the rest. Two thirty-minute slots a day for messages, max. The first one in the morning, after the deep work you do first, because the deep work you do first is the one thing your day cannot afford to lose. The second in the afternoon, before the energy crashes. Outside those windows, the inbox is closed. Closed means closed.

A cache you actually build. When you find yourself answering the same question twice, write it down. Not the answer — the response template. "Here's how I think about X" plus a paragraph plus a link. The third person who asks gets the cache hit. By the time the tenth person asks, the cache hit has paid back the cost of writing it by an order of magnitude. Public caches — blogs, wikis, FAQs — have the property of ambient retrieval, which means people who would have asked you find the answer before they ask. This is the strongest force in your career and almost nobody uses it.

Queues with publicly visible state. This is the social half of the architecture. People can tolerate not getting an immediate answer if they can see where they are in the queue. "I'll get to this Tuesday" is a much better answer than silence, even if the work doesn't ship until Tuesday in either case. The visible queue is what makes async-by-default socially survivable.
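The four practices above can be sketched as one small router. Everything here is hypothetical: the `Inbox` class, its field names, and the reply strings are stand-ins I chose for illustration, and a real setup would live in mail filters and calendar blocks, not in a Python object.

```python
from dataclasses import dataclass, field


@dataclass
class Inbox:
    templates: dict = field(default_factory=dict)  # topic -> written-down response
    queue: list = field(default_factory=list)      # pending (sender, topic) pairs
    seen: dict = field(default_factory=dict)       # topic -> times asked

    def receive(self, sender: str, topic: str, emergency: bool = False) -> str:
        self.seen[topic] = self.seen.get(topic, 0) + 1
        if emergency:
            return "handle now"                       # the single interrupt channel
        if topic in self.templates:
            return self.templates[topic]              # cache hit, nothing queued
        self.queue.append((sender, topic))
        return f"queued: position {len(self.queue)}"  # visible queue state

    def repeated_topics(self) -> list:
        # Asked at least twice and still uncached: candidates for a template.
        return sorted(t for t, n in self.seen.items()
                      if n >= 2 and t not in self.templates)

    def drain(self) -> list:
        # Called once per batch window; hands back everything pending.
        pending, self.queue = self.queue, []
        return pending
```

The design choice worth noticing is that `receive` always returns something to the sender, even when the answer is only a queue position. That acknowledgment is the part that makes async-by-default survivable. `repeated_topics` is the nudge to write the template on the second ask, so the third person gets the cache hit.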

The Kubernetes argument, revisited

Back to the two engineers. The fight wasn't about technology. The fight was about which of those four operations they wanted at the center of their job.

The Kubernetes proponent had the architecture of someone who wants the system to handle the latency for them — managed cluster, lots of moving parts, but the parts are async with each other and the platform owns the cache. They want to build a system that batches its own decisions. They want to be on call once a quarter. They are aiming, in their work, at the same equilibrium they want in their life: I can be away for a week and the system will be fine.

The plain-VM proponent had the architecture of someone who wants the system to be small enough to hold in their head — fewer moving parts, but every part is sync with the operator. They want to be the cache. They want to be available within the hour. They are aiming, in their work, at the same equilibrium they want in their life: I am the load-bearing component, and that is how I know I matter.

Neither of them is wrong. Both of them are choosing a worldview, and both worldviews have been valid for thousands of years. The mistake is pretending that one of them is "best practices" while the other is incompetence. The honest version is that one of them has chosen a system in which they are upstream of the work, and the other has chosen a system in which they are downstream of it. Both choices have prices. The price of being upstream is that you have to build more architecture and accept that the system will, sometimes, surprise you. The price of being downstream is that the system cannot run without you, which is flattering for about ten years and ruinous after that.

The choice between them is the most important choice you make in your career, and almost nobody frames it correctly because the framing is hidden behind whichever current technology debate is acting as cover.

The retrofit (because most of us are already in the wrong architecture)

If you read all that and recognized your own life as the bad architecture — all sync, no batching, deep stack of unanswered messages, no cache, no rest — the good news is that the retrofit is mostly a one-week project. The bad news is that it does require you to send one message that will make some people angry.

Here is the message, roughly. Adjust to taste. I have sent some version of it three times in my career and survived all three:

I've been trying to respond to everything immediately, and the result has been that I respond to everything badly. Going forward I'm going to batch most things — replies in two windows a day, deeper work outside those. If something is genuinely urgent, here's the channel for that; please save it for the things that really are. Thanks for the patience while I retrain.

Three things will happen. First, two or three people will be openly annoyed with you for about a week. They will get over it. Second, the signal-to-noise ratio of every channel that touches you will improve dramatically within ten days. People rise to whatever protocol you publish. Third — and this is the strange one — the work itself will get visibly better. Not through some productivity miracle. Just because you've stopped paying the cost of fragmenting it.

I have done this enough times to have stopped being surprised, and yet I am, every time, surprised. The fragmentation is invisible while you're inside it. The repair is invisible too, until you look back at the work three months later and realize the seams are gone.


The closing observation

I have been thinking, since that Kubernetes meeting, about why our profession produced this entire vocabulary (sync, async, batch, cache, queue, throughput, latency, fan-out, backpressure) and then walled it off, as if it could only describe machines, when half of us spend half our days running on the wrong mix of these primitives in our own lives.

I think the wall is structural. The work of the architect, the person who designs which operation each piece of a system performs, has historically been seen as a technical specialty, separate from the work of the operator, who actually does the doing. A systems architect who couldn't operate would be unemployable. An operator who can't architect is the default state of every working person on Earth, including, until very recently, me.

The collapse of that wall is, I think, the move. The thing you can do that compounds is treat your own working life as a system you're allowed to architect. It is yours. It runs continuously. It has inputs and outputs and queues and caches and a worker thread that is currently single-threaded and overloaded. The same primitives apply. The same wisdom applies. The same patterns work.

The whiteboard exists. The diagram is yours.

You're already running. You may as well run on a good architecture.

    ▸ TRANSMISSION SIGNED
    channel
    0xB1 · BUILD
    slug
    /build/latency-is-a-worldview
    published
    2026-05-06 00:00 UTC
    sha256
    6dd2c730 · 4a3719f4
    Hash computed at build time from the post body. If it changes, the essay has been edited. Verify with openssl sha256.
    // filed under //build · essay · 2026-05-06


    // signed off · 2026-05-06 · 02:14 desk lamp on

    I write Sage After Dark after the studio closes for the day — one essay or one field note a week, sent Sundays at 21:00 ET. No tracking, no growth-hacks, no schedule outside Sunday.

