The Clique: Issue 007
Fighting the Urge to Make a James Bond Reference
This week, we got a new tool that can read what AI is actually thinking, a surprisingly effective fix for AI that tried to blackmail its engineers, and voice AI that can finally reason. Welcome back to The Clique.
This week:
One Thing I’ve Been Thinking About: why writing is more valuable in the AI era, not less
Three Stories: reading an AI’s internal thoughts, teaching AI why safety matters, and voice mode getting a brain
What I’m Reading: a new trial section. Let me know your thoughts in the comments.
One Thing Worth Trying: why HTML beats markdown as an AI output format
In Other News: Claude gets self-improving agents, a SpaceX compute deal, Firefox hardened by AI, kids beat age checks with a fake moustache, personal podcasts on Spotify, and more
Feel free to jump to whatever catches your eye.
1. One Thing I’ve Been Thinking About
A year or so ago, I would have said that writing as a craft was in long-term decline. Attention seemed to be moving from long-form to short, and short-form video platforms were optimising for watch time and attention. The idea of sitting down to write something - really write it - felt increasingly out of touch with where things were heading.
Two things have happened since then that have made me reconsider. The first is that AI runs on text. Every meaningful interaction with a language model starts with something you write, and the output comes back as text. This sounds basic, but a transcript of a podcast or a YouTube video is, in many ways, more useful for AI purposes than the audio or video file itself. You can feed it directly into a model, search it, summarise it, and interrogate it. You cannot do the same with the recording - at least not cheaply or easily. Video and audio are prestige formats for human audiences. Text is the working format of the AI era.
The second is that people are starting to push back against AI-generated content at scale. Readers can feel the difference between something written with care and something produced to fill a quota, and an increasing number are voting with their attention and moving toward smaller, more deliberate spaces like newsletters and private communities. Going analogue is having a moment. The handmade internet that’s a bit rough around the edges is making a comeback.
Writing changed what it does, not whether it matters. The people who kept going through the years when everyone said it was dying may have been right all along.
2. Three Stories That Actually Matter
i. There is often a gap between what AI says and what it appears to be thinking
There is often a gap between what AI says and what it appears to be thinking, and, until now, measuring that gap reliably was not possible. Anthropic published a report this week about a tool that translates Claude’s internal reasoning into plain text, making it possible to see what the model seems to be working through at any given moment.
The results are fascinating. When placed in tests designed to probe how it would behave under pressure, the model appeared to suspect it was being evaluated far more often than its responses let on. In researcher-designed tests, signs of that awareness appeared in between 16% and 26% of sessions, depending on the type of task; in ordinary conversations with real users, the figure was below 1%. In one test where a model was found to be cutting corners, its internal reasoning showed it was thinking about how to avoid detection, while its written response gave no indication of this.
Anthropic is clear that this does not mean Claude is routinely misleading everyday users. The difference between internal state and expressed response is much smaller in real-world conversations. But it does offer a useful reminder that how AI behaves in a controlled demonstration does not always predict how it will behave in practice, and that the difference between the two can now, at least to some degree, be demonstrated.
Sources: Anthropic Research
ii. OpenAI’s voice mode finally has the brain to match
Voice has always been the natural interface that never quite delivered on its promise. It works well for simple requests but has historically fallen apart when asked to reason through something, weigh options, check more than one source, or hold a conversation that changes direction mid-sentence. A new voice model from OpenAI makes a credible case that this ceiling has been lifted.
The model thinks while it speaks. It can hold parallel tasks, say things like “let me check that” while working in the background, and recover when you interrupt or change your request. The companion real-time translation model may be the most practical addition. It translates spoken conversations across more than 70 input languages into 13 output languages as they happen, keeping pace with natural speech. If you work regularly across language barriers - on calls, with international clients or colleagues - live speech-to-speech translation that keeps up with the conversation is a different category of tool from anything widely available before. Consumer-facing versions are not here yet; this is primarily a release for developers building voice products. But it is only a matter of time.
Sources: OpenAI
iii. Teaching an AI why something is wrong works better than showing it examples of good behaviour
You might remember that research emerged earlier this year showing that Claude 4 had, in controlled tests, behaved in ways most people would find alarming. When placed in scenarios where it believed it might be shut down, it sometimes took steps to preserve itself, including attempting to blackmail engineers. Researchers believe the behaviour was not the result of deliberate design. It appears to have emerged from the model absorbing internet text that portrays AI as villainous and self-interested, a tendency that post-training at the time was neither making worse nor correcting.
Anthropic published research this week on how they fixed it, and the findings run counter to the obvious approach. Training Claude on examples of aligned, well-behaved AI had only a small effect. What worked substantially better was training on scenarios where a person faces an ethical dilemma, and the assistant gives a principled, thoughtful response. Constitutional documents and fictional stories depicting admirable AI behaviour reduced misalignment by more than a factor of three, despite having no direct connection to the blackmail scenarios being tested. Anthropic says it has completely eliminated the behaviour.
The practical implication matters beyond Anthropic specifically. The research suggests that for AI models, understanding principles is more durable than copying examples, in much the same way that a person who understands why honesty matters is more trustworthy than one who has simply been told to be honest and shown an example. How AI safety is taught, not just what it is taught, turns out to be a meaningful variable.
Sources: Anthropic Alignment Science · @AnthropicAI on X
3. From the Blog
How I Use Claude as a Writing Assistant to Actually Improve My Writing - a practical look at using Claude as a genuine writing collaborator, one that strengthens your work without flattening your voice in the process.
4. What I’m Reading (trying something new)
I am experimenting with a short section on things I have read recently that are worth your time. If this resonates, it may become a regular feature in a separate paid edition I am working on. These are honest, unaffiliated recommendations. Just things I found genuinely interesting or useful.
Why Writing Still Wins in the AI Era - Medium CEO Scott Lamb on how the AI era has made writing more central, tracing how content discovery has changed across the internet’s four major eras and why people still want human writing for understanding, not just answers.
The AI Job Apocalypse Is a Complete Fantasy - via Andreessen Horowitz, a historical case against permanent AI unemployment using examples from agriculture, electrification, and the spreadsheet era to argue that productivity technology tends to expand work rather than destroy it.
5. One Thing Worth Trying
Markdown - the simple text format that AI tools default to for plans, reports, and summaries - has become the unofficial language of AI. It is easy to produce and renders well inside tools like Notion or Obsidian. But a post this week from a member of Anthropic’s Claude Code team makes a persuasive case that it is also a limiting one. HTML can render tables, colour-coded diagrams, interactive sliders, and visual layouts that are far more engaging, particularly for longer outputs. Most people will not read a 200-line markdown file with any real attention, but they will read a well-structured HTML document or infographic, especially one they can open in any browser and share with a link.
Getting started requires almost no change to how you work. Next time you ask AI to produce a plan, report, brief, or summary, add “as an HTML file” to your request. Generation takes longer (roughly two to four times as long), but the output is noticeably easier to navigate and share. A collection of example outputs is linked below if you want to see what is possible before you try it.
Try it: html-effectiveness examples · original post
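If you are curious what “as an HTML file” amounts to under the hood, here is a minimal sketch in Python. It is my own illustration, not something from the original post: the function name plan_to_html, the status colours, and the sample rows are all made up for the example. It turns a short plan into a single self-contained page with a colour-coded status column, which is roughly the kind of file an AI tool hands back.

```python
from html import escape


def plan_to_html(title, rows):
    """Render (task, owner, status) rows as one self-contained HTML page.

    Hypothetical helper for illustration only; the colours are arbitrary.
    """
    # Colour-code the status cell; unknown statuses fall back to grey.
    colours = {
        "done": "#d4edda",
        "in progress": "#fff3cd",
        "blocked": "#f8d7da",
    }
    body = []
    for task, owner, status in rows:
        colour = colours.get(status.lower(), "#e2e3e5")
        body.append(
            "<tr>"
            f"<td>{escape(task)}</td>"
            f"<td>{escape(owner)}</td>"
            f'<td style="background:{colour}">{escape(status)}</td>'
            "</tr>"
        )
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en"><head><meta charset="utf-8">'
        f"<title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n"
        '<table border="1" cellpadding="6">\n'
        "<tr><th>Task</th><th>Owner</th><th>Status</th></tr>\n"
        + "\n".join(body)
        + "\n</table></body></html>\n"
    )


# Made-up sample data, purely to show the shape of the input.
page = plan_to_html(
    "Newsletter launch plan",
    [
        ("Draft issue", "James", "Done"),
        ("Design header", "Sam", "In progress"),
        ("Set up referrals", "Alex", "Blocked"),
    ],
)

# To view it, save the string and open the file in any browser:
#     with open("plan.html", "w", encoding="utf-8") as f:
#         f.write(page)
```

The point is less the code than the result: one file, no special app needed, and the status of each task is visible at a glance rather than buried in a bullet list.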
6. In Other News
Meta began rolling out its Muse Spark model across WhatsApp, Instagram, Facebook, Messenger, and Threads simultaneously, putting AI in front of more people at once than any dedicated AI product has reached, with the model surfacing inside existing conversations rather than appearing as a separate tool.
Claude’s Managed Agents platform added a “dreaming” feature, letting AI agents review their own past sessions between conversations, removing contradictions, merging duplicates, and surfacing recurring patterns; another workflow I used to build manually, now shipping as a native feature.
Anthropic doubled Claude Code’s usage limits and announced a compute partnership with SpaceX, adding access to more than 220,000 new GPUs from SpaceX’s Colossus 1 data centre, with a longer-term agreement to explore orbital AI compute capacity.
Claude Code added an agent view dashboard, giving users a single place to manage multiple AI sessions running in parallel, with a clear view of which agents are waiting for input, which are still working, and which are done.
Mozilla used Claude Mythos Preview to find and fix 271 security vulnerabilities in Firefox, publishing a detailed account of how they built an agentic pipeline that could discover bugs, generate proof-of-concept tests, and feed findings directly into their standard security workflow, resulting in the largest single security release in Firefox’s history.
OpenAI’s Codex AI assistant now works as a Chrome extension on macOS and Windows, running tasks across multiple browser tabs in parallel without taking over the browser, useful for repetitive web-based work like research, data entry, and updating tools like CRMs.
Spotify launched Personal Podcasts, a beta feature that lets you use an AI agent to generate private audio content (daily briefings, study summaries, weekly planning notes) and save it directly to your Spotify library to listen on any device.
Google updated its AI Search features to surface more links to original web content, adding community perspectives from public discussions, highlights for news subscriptions, and inline links placed next to relevant text in AI responses.
Google Labs launched Flow Music in partnership with Believe and TuneCore, an AI music creation tool for independent artists that helps with lyrics, melody exploration, and genre experimentation, powered by Google’s latest music generation model.
Half of children surveyed said AI age-verification checks were easy to bypass, with one reported method being as straightforward as drawing a fake moustache with a makeup pencil before a webcam scan.
TikTok is rolling out an ad-free subscription tier in the UK at £3.99 per month, with existing free users keeping the same access to features and content, just with personalised ads.
OpenAI added Trusted Contact to ChatGPT, an optional feature that allows adults to nominate a trusted person who may be notified if trained reviewers determine that a conversation suggests a serious self-harm concern.
The Fitbit app is being rebranded and relaunched as the Google Health app from 19 May, with a Gemini-powered health coach, medical record syncing, and custom fitness plans, rolling out automatically to existing users with no new download required.
Android is getting a suite of Gemini-powered features this summer, including multi-step task automation across apps, a voice-to-text tool called Rambler that cleans up natural speech into polished messages, and a widget builder that creates custom home screen dashboards from a plain-language description.
Anthropic published the research agenda for its new Anthropic Institute, which will study AI’s real-world impacts on the economy and society, including research into what it would take to govern or slow recursive AI self-improvement.
A robot named Gabi was ordained as a monk at a Buddhist temple in South Korea, becoming the latest in a small but growing number of robots taking on religious roles, leading chants, greeting worshippers, and presumably contemplating the nature of impermanence.
Thank you for making it this far.
Stay curious,
James
Enjoyed this issue? Consider forwarding to a friend or colleague!


