Simon's writing is consistently either highly practical or extremely high quality, or both. What's your reference frame for calling it "bad" - your own comments?
Spending thousands of words to essentially say "ChatGPT's search feature works pretty well now" with mundane examples like finding UK cake pop availability or identifying buildings from train windows. This has been done before by less capable models - it's just a rehash. Should we expect newer models to get worse? The breathless "Research Goblin" framing and detailed play-by-play of basic web searches feels like padding to make a now-routine tool use seem revolutionary.
The mundane examples were the point. I didn't pick things to show it in the best possible light; I picked a representative sample of the ways I've been using it.
I called out the terrible scatter plot of the latitude/longitude points because it helped show that this thing has its own flaws.
I know so many people who are convinced that ChatGPT's search feature is entirely useless. This post is mainly for them.
The thing about models getting incrementally better is that occasionally they cross a milestone where something that didn't work before starts being useful.
Those are the kinds of things I look out for and try to write about.
Simon says “what I used to Google, I now try with AI thinking models.”
I didn’t feel that he was framing it as _revolutionary_; it felt more evolutionary.
Simon, for every person miffed about your writing, there is another person like me today who said “ok, I guess I should sign up for Simon’s newsletter.” Keep it up.
It’s easy to be a hater on da internet.
42lux, if you have better articles on AI progress do please link them so we can all benefit.
I wanna know when my research goblin can run on my box with 2x 3090s.
FWIW I take his writings with a hefty pinch of salt these days. It seems incredibly concentrated on OpenAI to the detriment of anything else. This was only cemented when he ended up appearing on some OpenAI marketing video.
This is fine. He is his own person and can write about whatever he wants and work with whoever he wants. But the days when I'd eagerly read his blog to get a finger on the pulse of all the main developments across the major labs/models have passed, as he seems to only really cover OpenAI these days, and major events from non-OpenAI labs/models don't seem to even get a mention, even if they're huge (e.g. nano banana).
That's fine. It's his blog. He can do what he wants. But to me personally he feels like an OpenAI mouthpiece now. But that's just my opinion.
So far in 2025: 106 posts tagged OpenAI, 78 tagged Claude, 58 tagged Gemini, 55 tagged ai-in-china (which includes DeepSeek, Qwen, and suchlike).
I think I'm balancing the vendors pretty well, personally. I'm particularly proud of my coverage of significant model releases - this tag has 140 posts now! https://simonwillison.net/tags/llm-release/
OpenAI did get a lot of attention from me over the last six weeks thanks to the combination of gpt-oss and GPT-5.
I do regret not having written about Nano Banana yet; I've been trying to find a good angle on it that hasn't already been covered to death.
> I think I'm balancing the vendors pretty well, personally.
You are. Pretty much my main source these days to get a filtered down, generalist/pragmatic view on use of LLMs in software dev. I'm stumped as to what the person above you is talking about.
OT: maybe I missed this but is the Substack new and any reason (besides visibility) you're launching newsletters there vs. on your wonderful site? :)
The Substack is literally the exact same content as my blog, just manually copied and pasted into an email once a week or so for people who prefer an email subscription.
As soon as another lab releases an exciting new model (Anthropic and Gemini have both been quiet since GPT-5, with the exception of nano banana, which I do intend to cover) I'll write about what they're up to.
I said the opposite of that: I haven't written about it yet because I didn't have anything interesting to say, and I try to write things that add value.
No, you said that you regret not having written something earlier, before everyone else took the easy angles. That's something completely different from your last response.
That seems a little harsh. But, I felt the same about older blogs I used to read such as CodingHorror. They just aren’t for me anymore after diverging into other topics.
I really liked this article and the coining of the term “Research Goblin”. That is how I use it too sometimes. Which is also how I used to use Google.