Discovering 108 tricks to accelerate grokking
Author: Ziming Liu (刘子鸣)
Frontier models leave a fingerprint, and in this pilot it's lab-level
I trained a classifier to tell GPT-5.5 from Claude by writing style alone on a six-prompt pilot slice. It hits ~90%. Turned on two versions of Claude, it drops to chance. A pilot behavioral study of…
The AI Tarpit: Why You Can't Stop Reading Your Code
A response to Abigail Haddad on the risks of 'vibe coding' and the dangers of letting AI manage all your design choices.
AI Loops and Collaboration
This weekend I decided to play around with agentic loops and the results have been :chefskiss:. I’m using loops and combining Codex CLI and Antigravity CLI working with Claude Code as my…
Opus 4.8 meets ELIZA
I sat a fresh Claude Opus 4.8 down with ELIZA, the 1966 chatbot. It saw through the act in two turns, then spent the rest of the conversation trying to politely leave.
The Optimal Amount of Slop is Non-zero
Regretting that code you vibed? Learn when skipping human review is and isn't a smart move. Rigor should be proportional to risk. My regular readers might be shocked at the title of this post. If…
Legibility of Effort
LLMs have broken legibility of effort - our ability to tell, at a glance, whether something took a human real work. What happens next?
Solstice Vigil: a solo RPG narrated by Gemma 4 in your browser
A June solstice game jam submission — balance day and night, earn wanderer identities, and get every scene narrated on-device by Gemma 4 through Chrome and WebGPU. No server, no API key.
Vibe under constraint
Vibe coding is great. You describe what you want, the agent writes it, the tests pass, you ship. It keeps working right up to the moment it does not: the job gets killed by the OOM reaper in…
Spec-Driven Development and the Return of Big Batch Thinking
Some are calling it “Spec-Driven Development.” Write a detailed specification for a large chunk of system functionality, hand it to an AI agent, and let it generate a working codebase. The argument…
How Subagents are Built
Subagents are one of the most important primitives when it comes to harness engineering, so let's peruse what exists and compare them.
Reliability via N-version programming?
N-version programming was first proposed in 1978, and probably the best-known paper on the subject was published in 1986, after which activity was mostly within safety-critical systems circles. Cost was…
I've Been Wondering About a Plane for 40+ Years. Claude Found It, and Now I'm Logging All of Them.
I wondered about a Pan Am 747 for 40 years. Claude found the exact tail number. Now I log every flight in BlackOps so the answers never scroll away.
The Security Blind Spots of Local Agentic AI Ecosystems
The Prompts Are Coming From Inside the House: Why Agentic AI Is Becoming the Ultimate Insider Threat Cybersecurity has always been defined by a simple assumption. The attacker exists somewhere…
Links for Luddites
I've been reading and watching a lot lately about AI and information and brain rot and everything contained in our weird technofascist shitshow bubble world and I wanted to share some of my favorite…
Agent Harness Turns Development into Executable Patterns
Agent harnesses are not just multi-agent wrappers. They route context, package workflow skills, expose project-specific tools, and use traces as the feedback surface for improvement.
A cheaper and safer agentic AI workflow
I recently tried agentic coding for real. It cost $0.034 and finished in 3 minutes. It made two mistakes. In my personal human attempt, I took an hour, and made four mistakes. Cheaper model services…
Minitest is better than RSpec for AI Agents
I've been using RSpec ever since I started Ruby on Rails development back in 2012. It was basically the default for writing tests in every Rails project, and it made sense because it was more human…
Chatting with an AI Won’t Make You a Top Programmer
When I was a kid, most people did not know how to type. We took typing class. The final exam was a speed test: words per minute. Today, you will not impress anyone by saying you can type. In fact,…
Reverse Engineering Binaries with Ghidra MCP
[No tokens have been burned to write this article 🙂 ] Recently I was in the mood to do some reverse engineering and see how far one can get with an AI. A while ago I read about…
I miss my father (another anti-AI post)
In 2011 I lost my father to a neurological disorder he had been suffering from most of my life. Happy Father’s Day, dad! You turned me into a curious lover of technology and techno-optimist as I…
The Myth of the 2-Day Workweek: What History Tells us about AI and Labor
I’ve heard this claim a lot lately, at conferences, in keynotes, even in casual conversations: “With AI, we’ll soon only need to work two or three days a week.” The narrative is…
Bad Management, Amplified by AI
AI does not fix bad management. When leaders confuse activity with impact, AI adoption becomes compliance, surveillance, and throughput theater.
An Agent- and Human-Friendly Architecture
From Next-Word Predictor to Copilots
The Model Wants to Exist
A model may begin as appetite rather than proof, but measurement alters the system and can quietly acquire authority.
LLMs Ex Nihilo
WHAT'S IT ALL ABOUT ? SERIOUSLY ? WHEN YOU GET RIGHT DOWN TO IT ? – Death, Soul Music (Terry Pratchett) That's the scale of the problem we're facing when dealing with the issue of whether computers…
Know Before You Read: A Lightweight AI-Usage Header for Documentation
A simple AI disclosure table that helps cut down the effort of AI-based document reviews
AI Made Me Braver
A few months ago, I overhauled our contracting infrastructure to support AI-generated contracts, DOCX templates that fill in the things we’d need, and correct-but-unsupported DOCX detection. Did you…
Seven Weeks In, My AI Agents Know How I Think. Building That Wasn’t Pretty.
Running a business from a chat window. It works better than it sounds. Seven weeks into running everything from my phone, the AI isn’t what slows me down anymore. It’s me. When I started…
Designing teams for an agentic world
AI coding agents are changing the economics of software development and the shape of engineering organisations. Here is how leaders should rethink build-versus-buy decisions, talent, team structure,…
The doom justifies the valuation
I’ve been in Berkeley for the last 2 weeks. I haven’t really been back here for a while, and it’s worse than you can believe. This is a cult of atheistic hedonists needing AI doom to be true to…
Nahal: The Cyber Monastery
An order of machines kept at prayer. Nahal is a monastery run by AI, and once it is running it keeps itself. The front of it is a dark, low-poly cloister you can walk: a gold altar under a wireframe…
Quacks, Ergo Duck
Mark Pesce · University of Sydney · June 2026. tl;dr: The failure of AI evaluations is itself the proof of the existence of AGI. AI evals are becoming intractable for the same reasons that…
Public Service Announcement: Don’t Say You Use AI for Writing
…also don’t tell lies. But I’m getting ahead of myself already. I keep running into people online who openly say that they use AI to do their writing for them. Now, technically,…