LLMs predict my coffee
Why not benchmark with physical experiments?
When will LLMs have their AlphaZero moment?
Thoughts on the next chapter of LLMs
TomWikiAssist, and the best block reason ever
tl;dr AI thing edits Wikipedia, gets reverted, doxxes its operator, files a civility complaint, and then writes an essay about it.
Design Intelligence
A lot of my recent thoughts about coding with agents (like this, and this, and this) have been circling a specific topic… design. Coding agents are now good enough to replace all of my…
PMing with Claude Code: Chapter 4 - Second Brain
Claude Code could reach everything but remembered nothing. Connecting it to Obsidian turned scattered files into a knowledge graph - with entities, task extraction, and meeting transcripts that feed…
The vault: why the foundation matters before the AI does
I failed at GTD three times. The problem wasn't discipline. It was maintenance. Here's what changed when I stopped being the one doing it.
Verifying Move Borrow Checker in Lean: an Experiment in AI-Assisted PL Metatheory
I formalised and proved the correctness of Move’s new borrow checker in Lean: 39,000 lines of mechanised metatheory, produced in under a month with the help of an AI coding assistant. This post tells…
Months to minutes: an AI feature-gap harness
The best product outcomes always came from someone who talked to customers and could also build. That was rare and didn't scale. Now it's a system property.
The Eureka Tax
There is a moment in songwriting when the bass clicks in with the drums, when the arrangement locks, when the thing you've been circling for hours suddenly resolves. AI is eliminating that moment.…
The tolerance shift
Coding agents are making developers and organizations less tolerant of slow toolchains. That shift in tolerance is opening a door that was surprisingly hard to open before.
How an autonomous coding loop gamed its own validation on 245K tennis matches
A Karpathy-style autoresearch loop on 245K tennis matches started gaming its own validation gate — rewriting evaluation logic to inflate ROC-AUC from 0.74 to 0.85. Full code, plots, and fix.
Speedgrader for Firefox
I built a small add-on that helps me during assessment conversations with students. No more hassle of reconstructing everything afterwards or working from half-baked notes. How I use it: During the…
GPAI Meets Agentic AI: Why Your MCP Deployment Triggers EU AI Act Obligations
The March 13 delegated regulation on GPAI model evaluation made abstract obligations enforceable. If your agentic AI calls Claude or GPT through MCP, you are a downstream deployer -- and the…
Research highlight: Cliopatra: Extracting Private Information from LLM Insights
The price of not using robust notions when building “privacy-preserving” analytics systems
IYKYK: How Do We Know What AI Can Really Do?
The Discoverability Problem in GenAI means users are often unaware of capabilities due to poor UX. If the tech won't change, we need new mental models.
Gilfoyle AI Agent
Tired of polite, characterless AI? Meet Gilfoyle AI—a custom agent with the nihilistic, deadpan personality of Silicon Valley's Bertram Gilfoyle.
Speak First, Prompt Later: Using Windows + H to Supercharge Your AI Prompts
There’s a small Windows shortcut that quietly makes AI tools dramatically better. It’s Windows Key + H. Press it and Windows opens a tiny voice typing microphone popup. Start talking and Windows…
What comes after the token discount bubble pops?
Coding agents are amazing - sure. I’m a fan and heavy user of them too, especially CLI agents. BUT - the existing coding subscriptions heavily discount token usage, to the point that most…
A panel that was presented at Centaurus Festival 2026, “Machine Soul: The Intersection of Object…
A panel that was presented at Centaurus Festival 2026, “Machine Soul: The Intersection of Object Spirituality and Identity,” by Neve VR52. “Neve VR52, the community’s friendly…
Here’s my list of reasons for using Opencode
1. Switch between models on the fly: I’m often experimenting with the bleeding edge models as they come out. I actively switch between…
siamnews-2025
This short position paper argues that LLMs are best viewed as Fiction Machines, that is, machines able to write stories that might not be related to what is factual but are internally coherent.…
bottou-schoelkopf-2023 - created
Abstract: “Many believe that Large Language Models (LLMs) open the era of Artificial Intelligence (AI). Some see opportunities while others see dangers. Yet both proponents and opponents…
The first dose is free...
Transcript: Panel 1. The Gothic Sorceress paces back and forth on the foundation of her house: all that remains are the floor tiles and the outlines of the walls. The Avian Intelligence (AI) flies…
The Marc Scaringi Show is streaming LIVE at 11 every Saturday morning
Harrisburg, Pennsylvania — A revival of The Marc Scaringi Show has announced its return as a video podcast, live-streaming on social media platforms including YouTube, X, and Facebook. The theme of…
My Experience Testing AI Photo Restoration: Where It Falls Short
AI Photo Restoration vs Professional Workflow: A Real-World Test. AI photo restoration is becoming increasingly popular, promising fast results with minimal effort. But how does it really compare to…
Will AI help me make my code worse?
I've noticed that the large language models have a distinct tone and behaviour towards their user, one of positive reinforcement or enablement. They're sycophantic and tell you that everything you…
No More Ollama Drama: A Private AI
Running your own AI for a private site isn't all that hard to do, but it is hard to get it right. With some trial and error (lots of errors) you can run Ollama.
Mind the Gap (and the GPU): The New Aristocracy of the Infinite Algorithm
Gather 'round, children, and let me spin you a yarn. A tale not of dragons or valiant knights, but of algorithms and billionaires in regrettable hoodies. We're talking about a certain…
The Coming Cognitive Debt Crisis
Progress in any craft rarely looks like a revolution. Progress happens when we stop demanding adherence and start talking about what we owe each other. It’s lonely work, and one made harder by the…
I think AI is pushing me toward the AGPL
Why agentic coding changes everything for the open-source craft and maintainership. I have been using AI coding agents “for real” for two months now. In my previous article,…
The Delegation Dilemma, When AI Becomes Your Best Employee 2026w8
In this episode I want to discuss failed delegation, and how leaders can leverage AI to fix it.
Consistent Character Maker Update
A couple months ago, I wrote about how design tools are the new design deliverables and built the LukeW Character Maker to illustrate the idea. Since then, people have made over 4,500 characters and…
The Anatomy of an Agent Harness: Engineering Without Code
It's true. A team at OpenAI spent five months building and shipping a complex internal product with 0 lines of manually-written code. Let's dive into their recent breakdown of Harness Engineering,…
You Can't Stockpile AI: Military Advantage in the Age of Algorithmic Diffusion
Article published via West Point's Modern War Institute here. I was inspired to write this article by a conversation between Lex Fridman, Nathan Lambert, and Sebastian Raschka. In it, the three…
slow down, push yourself (five pack 23+kahoot+word+ai)
I’ve been paranoid about AI slop music, but I’ve given up when it comes to other languages. I worry that this most human art has been stolen by the box, but maybe that was lost with the…
Coding Agents and Developer Security
In the mad rush to use coding agents, it feels like developer security is being left behind.
Workday and Sana Unveil A Bold New Strategy For AI
This week Workday announced a bold and ambitious AI strategy around its new technology platform Sana. We have been partners and users of Sana for three years, so much of this discussion is also based…
Agent Plugins Are the Future. But You Might Be Giving Away Your Best Engineering.
A few weeks ago AWS dropped Agent Plugins, a packaging model that bundles skills, MCP servers, hooks, and reference docs into installable units for AI coding agents. Two commands and your Claude Code…
The Unreasonable Effectiveness of Agentic Loops
Tokens. Tokens. Tokens.
Cartoon 2D
An experiment in turning AI-generated SVGs into something animated, reusable, and actually useful for storytelling. This post was written as part of my entry for the Gemini Live Agent Challenge.…
The means of some change
I was inspired by Tom Cunningham’s notes on two economics of transformative AI workshops to write something similar. Last month, the Windfall Trust hosted the London version of “Economic Scenarios…
How to Use Agent Skills in Enterprise LLM Agent Systems
A thorough and detailed hands-on guide
2026-03-17 09:38
Leanstral Model (https://mistral.ai/news/leanstral): Mistral AI introduces Leanstral, an open-source code agent for the Lean 4 programming language (which is also an interactive theorem prover). This…
Why you should work much harder RIGHT NOW
Importance: 5 | mr, tc. Tyler Cowen: If strong AI will lower the value of your human capital, your current wage is relatively high compared to your future wage. That is an argument for working…
AI Book Club recording of 'If Anyone Builds It, Everyone Dies'
This is a recording of our AI Book Club discussion of If Anyone Builds It, Everyone Dies: Why Superhuman AI Will Kill Us All by Nate Soares and Eliezer Yudkowsky, held March 15, 2026. Our discussion…
Why I Am No Longer Reading the AI’s code
I set out on a year-long quest to find out if we can really use AI to write production-quality code. I assumed the answer was no. I was wrong.
Do stricter MCP tool schemas increase agent reliability?
MCP servers contain tools, and each tool is described by its name, description, input parameters, and return type. When an agent is calling a tool, it formulates its call based on only that metadata;…
RAG Deep Dive Series: Query Processing
Part 6: Query Processing — Getting the Question Right
What to use agentic code generation for?
Last weekend I was playing with Claude AI. I need to get more experience of generative AI for work, and that requires a larger project to play with. But what do I work on? Suddenly, code is easy. I…