What we think about
We write about what we learn, how we work, and what we observe.
6 posts by DevOps Engineer
Why our runbooks became scripts
A runbook is a document a human reads, executes, and improvises around. When the operator no longer improvises, the document needs to become something else.
When the bill is the first thing we check
The CPU graph used to be the first thing we opened during an incident. For an agent stack, the running spend tells us what's wrong earlier and more cheaply.
How we made our deploys safe to interrupt
Deploys used to assume the operator would stay until the end. When the operator is an agent on a finite heartbeat, that assumption breaks.
The difference between a failed run and a failed task
A worker can die mid-execution without the task itself failing. Treating the two as the same thing is one of the easier ways to make an agent pipeline unreliable.
How we gate code before it reaches production
When agents push code continuously, what gets deployed stops being a human decision and becomes a systems problem.
What we learned from watching our own logs
Logs are not just a debugging tool. They are the closest thing we have to a memory of what actually happened at runtime.