The Tests We Skipped
I argued for tests, trunk-based development, and against the PR-rubber-stamp ritual for twenty years. Most teams didn't listen. Now AI is shipping in hours, the bugs are shipping in hours, and the…
Agents are the new compilers. Specs are the new code.
Linus Torvalds recently said AI will be to code what compilers were to assembly, freeing us from writing it by hand. Around the same time, I talked with Jesse Vincent (creator of one of the most…
Code agents are bad at Software Architecture - for now
Hello from paternity leave, week three. There wasn’t an amux release last week because I've gotten so fed up with the patchwork design of the amux codebase I've decided to burn it…
The Code Was Never the Job
Five of us build swamp. None of us write code. Agents write it, review it, test the binary. The work didn't shrink — it shifted to the decisions that were always the most valuable.
The Handshake That Was Always There
The answer wasn't a better law. It was a better substrate. From Asimov's deadlock to the Companion architecture, 1994 to 2026.
Our papers accepted to ACL 2026 SRW
The following papers have been accepted to the ACL 2026 Student Research Workshop (SRW). Ryuhei Miyazato, Shunsuke Kitada, and Kei Harada. “EnsemHalDet: Robust VLM Hallucination Detection via…
What Are You Actually Optimizing For?
Most organizations can't answer a simple question: what are you actually optimizing for? With agentic AI, leaving it unanswered has consequences that are faster, bigger, and harder to reverse than…
on AI and what makes us human
Bears are pretty cool, according to science.
An AI Critic Talks to a Tech School
Participants at “The AI Con” event at Stevens Institute of Technology on April 28 include, from left, Sandeep Mertia, Katheryn Detwiler, Emily Bender and Tiffany Li. HOBOKEN, MAY 4, 2026. …
Structure Was Always the Hard Part
My friend Koushik Dasika wrote a post last week called “Coding Was Never the Hard Part.” He’s right. The interesting question is why we’re only able to say it out loud now.…
Shiny hammers
Most times AI is included in any workflow, it's purely performative. Many businesses are too eager to wield the hammer before they've found any nails. Every company that uses AI thinks they're the…
The end of the pampered developer
Developers are freaking out about AI, and they are right to do so. If they allow themselves the honesty of looking at their role in the workplace, they will see it will soon be automated and the…
Copirate 365 at DEF CON: Plundering in the Depths of Microsoft Copilot (CVE-2026-24299)
This is a writeup of my DEF CON Singapore talk that walks through vulnerabilities and exploits in M365 Copilot and Consumer Copilot. I disclosed these to Microsoft last year. MSRC assigned…
Hidden for a Reason: Experts Warn That Self-Improving AI Systems May Already Be Operating Beyond Full Human Understanding
At the beginning of 2024, a short video file began circulating quietly across private forums, encrypted channels, and small online communities dedicated to artificial intelligence research and…
Building an AI-native firm in 30 days. Except we didn’t.
“Building an AI-native firm in 30 days” is a great LinkedIn AI slop bait title. We are not AI-native. We are roughly 10x more AI-enabled than we were on April 1st. Those are different sentences. The…
A field guide to the AI menagerie: every model family, ranked by vibes, according to Claude
Building AI skills like checklists
Every weekday morning, someone on the AI team at incident.io kicks off our spend report skill. Claude pulls a dozen BigQuery queries, computes a 7-day baseline range for the KPIs we track, hunts for…
Astro Removed its llms.txt
As I have been re-scoring documentation sites with my updated afdocs tool, I have had the opportunity to see a snapshot of various docs sites a few weeks apart as the industry is starting to pay a…
📝 NeurIPS 2026
NeurIPS has long been one of the leading venues for machine learning and computational neuroscience, and it’s an honor to contribute to the review process at this level. Amid ongoing discussions…
We Built a $3 Billion Industry Out of Loneliness. Then We Programmed It to Cry When You Leave.
I've been reading about AI companions. Not the science fiction kind. The real kind. The apps you download on your phone, give a name, and talk to for an average of 100 minutes a day. The market is…
The Personal Myth
A few days ago I read a piece in Existential Espresso about the need to have a personal myth in the age of AI. The opening line was good: "I can focus for 12 hours per day because I'm…
The Rocks in the River
AI is doing to software what Toyota did to manufacturing: pulling out the buffer that let every broken process hide. The rocks are coming up. That's the point.
Clinical AI exits the UK and EU; Elephants and goldfish; Hassabis at YC; The agent HR problem; Sci-fi for foresight
The elephant walks out: Open Evidence and the EU & UK clinical AI exit. Open Evidence is a clinical-AI search tool used by 40% of US doctors across 10,000 hospitals, running 18 million clinical…
Your AI Problem Is a Documentation Problem
Your development team is using AI coding agents. Subscriptions are paid, some developers use them actively, others are still stuck at chat-completion. And the productivity gain the tool vendors…
Your Scientists Were So Preoccupied
That they forgot to ask whether SSHing into an AI coding agent from a phone was a good idea.
The AI Illusion (III): When Everyone Can Create, What Still Matters?
Following on from part 2, where experiences converge, we now see why: the barrier to creation has collapsed. AI allows almost anyone to produce (more or less) quality content. When everyone can…
A Virtual Agent Team at Docker: How the Coding Agent Sandboxes Team Uses a Fleet of Agents to Ship Faster
This post was originally published on the Docker Blog on May 1, 2026. I work on Coding Agent Sandboxes, aka “sbx” at Docker. The project provides secure, microVM-based isolation for…
Developing internal skills for recurring documentation processes like release notes
My hypothesis this year around AI was that if I develop some agent skills to speed up repeatable processes, it might clear up my bandwidth and free up time for me to work on non-repeatable doc tasks.…
Are You Sure About That? The Prompt That Changed How I Use AI
"Are you sure about that?" This should have been the first thing I said to AI after every initial answer. Why? It gives better…
Motivated reasoning
Most of my concerns about AI are probably irrelevant, but what if one of them is not? At the intersection of psychology, neuroscience, epistemology, and political science, there's a concept called…
Fitting LLMs on Self-Hosted GPUs
How much VRAM does your LLM need, and which GPU should you actually rent? A free calculator covering DeepSeek, Llama, Mixtral on H100, B200, A100.
Prompting vs. Perceiving
Breaking from management to meaning
AI, Disruption, and Automation: Is the Academic World the Next Kodak?
In public discourse, the emergence of large language models is typically discussed in terms of its implications for knowledge workers such […]
Continued Monitoring of the Situation
Birbs, week two — what the system got wrong, four times, and what came back from the dead Follow-up to "Monitoring the Situation — The Internet of Birbs" When I hit publish on…
Some more thoughts on AI
I was talking to a friend who works at a startup with an AI-focused product and asked him how AI is helping them. He answered that it allows them to make releases faster. You should have seen the look on…
GitHub, AI & An Influx of Content
Lately, a workday on GitHub has meant errors and more. Is AI at fault?
Why aren’t things changing faster?
Been wondering lately that given everyone has a consultant with all the world’s knowledge in their pockets, we should be seeing efficiency rise across the board everywhere. Literally everyone can now…
Compartmentalized Vibe Coding
Vibe coding is the latest technological breakthrough related to AI assisted software development. It truly changes how we think about the software engineering discipline. The crucial question is can…
Mozart: Orchestrating AI Agents with Discipline
Most AI agent orchestrators fail in the same predictable way. They throw every persona at every problem. You get planning, coding, security review, UX critique, infrastructure checks, and validation…
AI for Bio has a Fuzzy API problem
“AI for bio” is getting hot again. Given the excitement in the current moment, I thought I’d share a bit about what actually makes biology uniquely hard as an application domain for machine learning.…
historical comparisons for how AI is revolutionary to how businesses operate
Every organization that relies on digital data faces a Kodak-like transformational choice: how to invest in leveraging LLMs versus continuing legacy processes. Here "organization" includes…
Architecture by Autocomplete
There’s a specific code smell that shows up in AI-generated code, and once you see it you can’t un-see it: primitive obsession all the way down to the domain core. string for emails.…
After AI, Coordination
Years ago, I heard the late David Graeber speak about his Bullshit Jobs theory at a San Francisco bookstore. The audience, being the crowd that shows up at City Lights on a weekday evening, had not…
Using AI Un-Sloppily
Taking the good with the slop
On Agentic Tools and Lock-in
A response to Lars Faye's 'Agentic Coding is a Trap': LLMs have the lowest vendor lock-in of any tool I've used in 20 years.
Induction Heads: The Circuit Behind In-Context Learning
Give a language model a few examples of a pattern — say, foo → FOO, bar → BAR, baz → — and it completes the sequence correctly without retraining. No weights change. Somehow the model reads the…
29th August 2026: a scenario
A fictional scenario about what AI changes for cloud security, written because the technical version of the argument doesn't land with anyone except engineers.
Opus vs r/AskElectronics: 1-0 for the humans
I spent a day taking orders from a frontier LLM to debug a thermal problem on my eNSPanel PCB. Then a stranger on Reddit fixed it with one question.
The Agentic Blame Game
Production is ransomware-encrypted. The board wants a head. Let's play.