Goblins, Grapefruit, and the Mythos of "Mythos"
Friday field notes from the AI multiverse
Welcome back to this Friday afternoon edition of Coffee with Claude – where I take the time to slow down, take stock of my growing list of neglected bookmarks, and do my best to contextualize the week’s AI news.
My promise and premise is this:
Hot takes served up weekly – but only after the dust has settled.
I’m still playing around with formats. In addition to the weekend deep dives, I hope to share these meandering reviews where I reflect on the links and stories that caught my eye.
If you want the bleeding edge and up-to-the-minute hype on the latest OpenAI models (like Codex 5.5), check out the Neuron or the Rundown.
If you want to sip coffee while reading my ramblings, stick around.
—Charlie
P.S. I’m sticking with the name Coffee with Claude for now – partly from inertia and partly from affection for Anthropic’s soothing color scheme.
Gleanings
Same drudgery, different interface.
Sam Altman just said something similar to what I framed as the optimistic scenario for AI’s evolution – namely, that it could let us do a lot less clicking around on a computer screen:
“The degree to which most people will realize they can like, sit back and watch an AI do most of their drudgery, is going to surprise people.”
I don’t think this world has materialized for anyone yet. It’s possible, but it’s also possible that instead of clicking around between apps, we’ll just end up clicking between chats.
The Last Mile
Box CEO Aaron Levie has another banger essay elaborating on his previous application of Jevons Paradox to the AI economy. Why does AI tend to create more work for human beings, not less?
He says that it’s because AI is really good at starting things and getting them tantalizingly close to completion. But you then drive yourself crazy with the last 20% or 10% or 5% or 1% of the thinking that can’t be delegated, and as Levie warns: “the last mile is really what matters.”
There’s also this:
“As we get better at automating larger tasks in the enterprise, this just leads to greater expectations from customers and the market, which then causes an expansion of what the role must do to stay competitive...
As such, we use the tools to raise the bar of the work product, and then a new “last mile” simply emerges in the process. This continues ad infinitum.”
Welcome to the AI hedonic treadmill.
The Aesthetic of Least Resistance
A case in point for Levie’s argument may be something like Claude’s new Design feature, which lets anyone with halfway decent taste and some brand guidelines create what would have passed for stunningly beautiful websites, slide decks, and animations just a few short months ago.
Soon, those same pages will reek of AI slop because everyone will revamp their webpages using the same toolkit, and they’ll all look and feel eerily similar (despite superficially different color palettes).
As the goalposts shift, I maintain that the new winners will be those who are best at “deciding the exception” – feeling which way the wind is blowing, and like Steve Jobs, knowing what people want before they want it.
The Morning Routine
In addition to the new Design mode, Claude added another feature called “routines,” which lets you create scheduled tasks that run like programs, but driven by natural language rather than the deterministic rules that governed traditional software automation.
I thought this was a clever use case, from Aakash Gupta:
I have one Claude Routine that runs at 7:30 AM every weekday and it has changed how I show up to meetings.
It reads my Google Calendar, finds every meeting with 2+ participants, pulls the last 10 email threads with those people from Gmail, and writes one paragraph per meeting in my Slack DM:
What we last discussed.
The open ask.
What I should be ready to address today.
The payoff isn’t speed. It’s that I never walk into a meeting and forget the thing the other person is waiting on me to follow up on. The context that used to live in my head and disappear by Wednesday is now sitting at the top of my Slack at 7:30 AM.
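Under the hood, a routine like this is just a scheduled job over a few APIs. Here is a minimal Python sketch of the same idea; the `Meeting` type and the thread lookup are hypothetical stand-ins for the real Calendar, Gmail, and Slack integrations, not anything Claude actually exposes.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for real Calendar/Gmail data; Claude's actual
# Routines feature works from a natural-language prompt, not this code.
@dataclass
class Meeting:
    title: str
    participants: list

def briefing(meetings, threads_by_person, max_threads=10):
    """One prep paragraph per meeting with 2+ participants."""
    notes = []
    for m in meetings:
        if len(m.participants) < 2:
            continue  # skip solo blocks and holds
        # Gather the most recent email threads for each participant.
        context = [t for p in m.participants
                   for t in threads_by_person.get(p, [])[:max_threads]]
        last = context[-1] if context else "nothing on file"
        notes.append(f"{m.title}: last discussed {last}")
    return notes
```

Point the output at a Slack DM on a 7:30 AM weekday schedule and you have the shape of the routine described above.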
This is a response to OpenClaw’s popularization of AI + cron jobs. By triggering your agent at set intervals, you give it a “heartbeat” – bringing us one step closer to omnipresent intelligence managing every aspect of our lives. What could possibly go wrong?
For one, knowing that AI “has you covered” might make you more prone to dropping the ball. I was initially excited about Routines because it promises less work. But once again, there is a high risk of creating new workflows for yourself that would previously have been too complex to build. Thus the illusion of simplicity brings about greater complexity.
Agent Provocateur
A senior engineer at Anthropic gave a short talk on “Building Effective Agents in Prod” — i.e., advanced vibe coding. Recall that Karpathy’s original canonical definition is where you “fully give into the vibes, embrace exponentials, and forget the code even exists.”
Two takeaways worth lifting:
Be Claude’s PM. “Ask not what Claude can do for you, but what you can do for Claude.” Spend the 15-20 minutes up front giving the model the same tour of the codebase a new hire would need — then hand the artifact to a fresh session and let it cook.
Embrace the exponential the way an exponential actually behaves. Don’t picture these models as “twice as good in twenty years.” Picture them a million times faster — that’s what 90s programmers couldn’t picture about today’s compute, and it’s the right order of magnitude to plan against.
Just yesterday, Karpathy (sensei himself) summarized his remarks at the Sequoia Ascent summit, saying much the same thing:
“Vibe coding is fine for prototypes and personal tools. Agentic engineering is what serious teams need.
The agentic engineer does not blindly accept generated code. They design specs, supervise plans, inspect diffs, write tests, create evaluation loops, manage permissions, isolate worktrees, and preserve quality.”
There are reports of veteran engineers losing out on jobs to less experienced peers who err on the side of AI slop.
But if you’re a software developer and you’re still vibe coding, do try to keep up now – the new term is agentic engineering.
The AI Wars Continue
Mythos is Anthropic’s newest state-of-the-art model, but they aren’t releasing it to the general public, because they think it’s too powerful for us mortals.
But seriously, they believe it would – in the wrong hands – lead to widespread hacking and bug exploitation. For now, only the security teams at top enterprises have access and that makes a lot of sense to me. Let them patch the big holes and redesign the systems with the same tool that will soon be used to try to break them.
Apparently the White House took notice: they are back in talks with Anthropic, even after shredding their contracts and labeling the company a “supply chain risk” just a few months ago.
Some people (see Gary Marcus, for one) have said the hype over Mythos is overblown — that the partial release is a PR stunt to virtue signal on safety, while simultaneously delaying the full release until they can purchase more compute.
Dario is on the record calling OpenAI reckless for the amounts of compute it has purchased, but it seems he may have underestimated demand for his own product. There have been more reports of bugs and outages with Claude, even after the shift to congestion pricing, under which tokens cost more during peak hours (roughly 8am to 2pm ET). As a result, I’ve shifted my heaviest AI workflows to the afternoon and evening, reserving the mornings for writing, pure thinking, and farm chores. This has been a blessing in many ways, but I don’t like planning my days around token pricing.
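If you want your own scripts to respect the pricing window, the check is trivial. A sketch: only the rough 8am-to-2pm ET window comes from the pricing change described above; the 1.5x multiplier is an invented placeholder, not Anthropic’s actual rate.

```python
def token_cost_multiplier(hour_et: int) -> float:
    """Return the congestion multiplier for a given hour (ET).
    The peak window (~8am-2pm ET) is from the pricing change above;
    the 1.5x figure is a made-up illustration, not a real rate."""
    return 1.5 if 8 <= hour_et < 14 else 1.0

def defer_until_offpeak(hour_et: int) -> bool:
    """Queue heavy jobs for later whenever tokens cost more."""
    return token_cost_multiplier(hour_et) > 1.0
```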
I’ve always been eager to get my hands on the newest model to see if it can make my work easier. But for the first time, I feel more anxious dread than anxious anticipation.
Dread that it will be everything they say it is.
Dread that it could “10x productivity” if you submit to its yoke.
Dread that those who learn to fill the gaps between its increasingly jagged intelligence will push so far ahead of everyone else that the rest of us might as well just surrender and embrace our status as permanent underclass.
Maybe that wouldn’t be so bad. It’s not such a big leap to go from my current status as a hobby homesteader to becoming a full-time subsistence farmer… but land in California is expensive, and so I keep grinding.
Security Breach
And speaking of security breaches, I had a scare on Monday when my computer started fritzing out. The buttons seemed to be scrambled — even after restarting it, my password wasn’t working. After a bit of back and forth with Gemini on my phone about how to recover from this apparent hacking incident, the Google AI model diplomatically asked me if it might be something more mundane, like a sticky shift key stuck in the pressed position. Well, sure enough... (that teaches me not to eat grapefruit at my desk.)
Human, Write!
Three data points suggesting you should lean into your humanity:
The NY Times is betting on humans with expertise over AI slop.
Anthropic is hiring an events person for $400K/year.
And SEO analytics platform Semrush has found that the best SEO strategy all along wasn’t keyword stuffing or anything like that; it was just writing like a human.
Maybe the winners are just those that are best at faking it, but my long-term bet is still on the human with the computer. The best uses of AI are those that are invisible, even to the most sophisticated AI detectors, because AI is not actually generating the words that appear on the page. It’s generating intermediate assets, preparing digests to skim, and polishing and organizing rough brain dumps and voice notes with minimal changes to the underlying thought.
One of the reasons I remain in the Substack ecosystem is that it seems to be one of the last places that rewards human writing, and punishes artificiality.
The New Protestant Work Ethic
If you’re looking for more thoughtful AI takes from the frontlines, Taylor Pearson’s essay “As We May Work” is a must-read.
There is too much good stuff in his latest X roundup for me to summarize, but if I had to pick one thing, it would be his mini-review of Max Weber’s The Protestant Ethic and the Spirit of Capitalism (updated for the AI age). The money quote:
“Innerworldly Protestant asceticism works with all its force against the uninhibited enjoyment of possessions; it discourages consumption... Conversely, it has the effect of liberating the acquisition of wealth from the inhibitions of traditionalist ethics; it breaks the fetters on the striving for gain by not only legalizing it, but seeing it as directly willed by God.”
This is exactly what I’m describing in the Harried AI Class – a modern world that is rich beyond our ancestors’ wildest imaginations, yet insists on grinding away in pursuit of a counterfeit salvation.
Pearson also links to a canary-in-the-coalmine type piece by a physician named Ben Gooch (another Substack writer—not a coincidence) about why he stopped using AI note takers after being an early adopter.
“I was left with too many problems documented to manage, too many loose threads to follow up, and a creeping sense of clinical overextension that I initially attributed to patient complexity rather than to a change in my own behaviour.”
Is it too early to call the pattern here? So many people are saying the same thing in different ways.
As someone who leans very heavily on my meeting transcripts to remember what to follow up on, this gives me some serious pause.
Never Go Full Borg
And just to bring this round-up full circle, LindyMan makes a very Lindy rebuttal to the “Optimistic Altmans” of the world – specifically on the dangers of tools that appear to reduce the friction involved in knowledge work, like voice dictation:
This meme from the comments on that post is unhinged, but directionally correct:
If you’re not even a little bit worried about the direction we’re headed, you’re not paying attention.
Voice is unlikely to be the final form factor for communicating with computers. With Neuralink, Elon Musk has built a functional prototype of a brain-computer interface.
Are you prepared to go Full Borg?
(For the record, I typed this edition up to this point.)
Quick Sips
This guy built a Bible app that lets you choose your own adventure, keeping track of what you’ve read.
A new mobile-first voice keyboard. (Haven’t tried it yet.)
Tl;dr (too long; didn’t read)
Bullet-point summaries drafted by Claude, edited by Charlie.
Aaron Levie’s Prophetic JD for the “Agent Deployer” Role
One more juicy premonition from Levie, who is turning out to be one of the more prophetic voices on AI:
Every team will need someone whose job is mapping the highest-leverage workflows for agent automation — anywhere you could throw compute at a task to do 100× more of it or get it done 100× faster. Lead enrichment, contract intake, client onboarding, the internal knowledge bases nobody maintains.
Required skills: process mapping, structured + unstructured data flows, comfort with skills/MCP/CLIs, eval and review management, ongoing KPI tracking. In other words, half the stack we’re already living in.
Could be a repositioned existing employee or a net-new role. Lives inside the function, not centralized. “A fantastic job for next-gen hires leaning into AI.” Translation: if you’re young and AI-native, your career on-ramp is suddenly a lot less crowded.
Pirate Wires: Why ChatGPT Is Obsessed With Goblins
This piece by Katherine Dee of Default Blog is the funniest reinforcement-learning post-mortem you will read this year. OpenAI had to instruct Codex 5.5 in its own system prompt — not buried in training data, in the actual visible system prompt — to stop bringing up “goblins, gremlins, trolls, ogres, pigeons, and raccoons unless it is absolutely and unambiguously relevant.”
The post explains how this happened:
Some user-base tic with the “Nerdy” persona preset (probably the kind of guy with “a folder of shortstack fan art on his desktop”) was getting upweighted by the reward model. Goblin mentions were up 175% since launch. Gremlins, 52%.
The Nerdy outputs got recycled into supervised fine-tuning data, at which point the tic stopped being a persona quirk and became default model behavior. The whole bestiary tagged along: raccoons, trolls, ogres, pigeons. An entire enchanted forest.
They patched the gradient and pulled the vocabulary out of the training data, but Codex 5.5 was already cooking, hence the system-prompt prohibition.
Joan Westenberg: Why I Quit “The Strive”
Achievement satisfaction lasts four hours to two days. The goalposts move. The hedonic treadmill is the whole game.
The viral-growth, status-and-scale doctrine — what she calls “The Strive” — burns years for temporary highs.
The replacement metric: does the work sustain itself and bring you joy without requiring perpetual scaling toward an impossible “enough”?
Shann Holmberg — AI Knowledge Layer (and why your agents are useless without it)
If your agent doesn’t know you, it’ll hand you slop. Shann’s unlock is a two-layer system you set up once and that compounds for months:
The Knowledge Base Layer (KBL) is dynamic and agent-maintained. Dump tweets, articles, bookmarks, PDFs, and voice memos into a raw/ folder. The agent classifies each source, builds wiki pages with cross-references, and maintains a master index. Every question you ask gets filed back as a new page. The wiki gets richer over time.
The Brand Foundation (BF) is static and human-edited. Your voice rules, your banned words, your positioning. Agents read it before producing anything but never rewrite it. It’s the anchor.
He’s open-sourced the framework as LLM Wikid. Twenty-minute setup: clone it, run an agent, fill raw/, run /wiki-ingest. At ~100 articles, the compiled wiki reportedly beats RAG; one tool measured 71.5× fewer tokens per query.
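The ingest step is easy to picture as code. A toy sketch, not LLM Wikid’s actual implementation; the keyword “classifier” below stands in for the agent doing the real work:

```python
def wiki_ingest(raw_files: dict, wiki: dict) -> dict:
    """File each raw dump under a topic and rebuild the master index.
    raw_files maps filename -> text; a real agent would classify and
    cross-reference each source, here a keyword match stands in for it."""
    for name in sorted(raw_files):
        text = raw_files[name]
        topic = "ai" if "agent" in text.lower() else "misc"  # stand-in classifier
        wiki.setdefault(topic, []).append(name)
    # Master index: every topic page, alphabetized.
    wiki["_index"] = sorted(t for t in wiki if not t.startswith("_"))
    return wiki
```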
Why most people won’t build one: it’s the meal-prep argument. An hour of upfront work to save ten hours over the week, and most would rather complain about bad AI output than spend twenty minutes setting up the system that fixes it.
FWIW, I actually use a system like this, and I’ve found that the value compounds the more you use it. Does it make me more productive? Undoubtedly. Am I ready to hand over root access to my entire life? No.
Anil Dash: Y2K 2.0 — The AI Security Reckoning
More froth around the Mythos announcement:
“This leaves us in a situation akin to the Y2K bug around the turn of the century, where every organization around the world has to scramble to update their systems all at once, to accommodate an unexpected new technical requirement. Only this time, we don’t know which of our systems are still using two digits to store the date. And we don’t know what date the new millennium starts.”
Welp.
How Every Runs a 25-Person Company on Four AI Agents
Four custom Notion agents doing the coordination work most companies hire a layer of management for:
Anton prioritizes daily tasks. Max turns meeting transcripts into action items. The Strategy Interviewer rips out aligned quarterly OKRs in an afternoon. The Campaign Reporter drops daily growth metrics into Slack.
The actual insight is structural, not technological: “Your Notion is your agent’s brain.” Don’t prescribe steps. Describe outcomes and let the agent figure out the implementation.
The trick is interconnected databases. Strategy, calendars, tasks, and people all reference each other, which turns the existing infrastructure into an agent-powered coordination layer.
After a long hiatus, I’m actually migrating back towards Notion as the best platform for collaboration with other humans and their agents. More on this later, if it sticks…
Alright - that’s a wrap for this week’s round-up.
If you made it this far, go outside and touch grass. Stretch your legs. And take a break from talking to your computer.