agents
Multi-model debate, subagents, review pipelines.
- StoryI'm calling it nowYou are only meant to find out about an inspection on the Monday morning it lands. For Victoria C of E Infant & Nursery School in Berkhamsted, that call comes next week.
- EssayBuilding the context up front: /deep-contextIf you keep fixing the same problem and it keeps coming back in a slightly different form, search is often not enough. This post is about the Claude Code skill I built for that case, the worked example that proved it, and the two attempts that did not work first.
- Essay"Email me when done": a persistent task runner with a delivery guaranteeLong-running tasks fail silently if the session dies before the result is ready. This is the runner I built to make "email me when done" actually mean that, plus the retry loop, fallback paths, and last-ditch file write that back it up.
- StoryWorking with Claude to build a merchant-risk model, then everything on topAfter the hackathon I spent a couple of weeks working with Claude on the merchant-risk model. What we built across the sessions, what I learned about variable engineering, and the parts that work well now versus the parts that are still a work in progress.
- StoryOne hour, one marketing listA vague ask ("give me a list of prospects that look like X") turned into a working pipeline in under sixty minutes: Companies House metadata, the filed accounts as scanned PDFs, Claude reading them for the signals I needed, and a final ranking with commentary. Small build, but the speed is the point.
- EssayThree-way AI model debate as a pre-commit gate: receipts from several months of useRunning Claude, Gemini and GPT-5.4 on the same question in parallel, blind first round, informed afterwards. What works, what does not, and the confidence trap I did not predict.