> Youtube charges $10 per month and doesn't produce a single video
It is different from Netflix (that pays upfront for production costs), but there's of course a revenue share + the bulk of the revenue for creators is actually from sponsorships (which YT doesn't take a share of).
LLMs generate their output one token at a time. The first thought when you learn this is that it's a huge performance bottleneck, since we are used to highly parallelized systems.
However, a large part of what makes LLMs feel so magical comes from this bottleneck.
The main thing I noticed in the video is that they have heavily sped up all the code generation sections... it seems to be at 5x speed or more (presumably because people got used to how fast and good Sonnet, and especially Gemini 3.0 Flash, are).
I would also add that search has already moved elsewhere.
Fewer and fewer people are using search engines to shop, e.g. Amazon makes >$57B a year from search ads, but also look at Temu and Shein, which are mostly glorified product search platforms.
No one is searching for "funny videos" when you can just open Instagram and Tiktok.
The only truly unique thing search engines can still do is queries that are not directly commercial (e.g. education, information seeking, etc.), and competition there is insanely intense (w/ ChatGPT, Perplexity, etc.).
Honestly, I haven't used either ChatGPT or Perplexity seriously. They haven't performed particularly well when I tested them and, dun-dun-dun, in my use Gemini has been growing on me. Another odd thing at the moment is that Google's search has somehow become better at giving me the results I'm looking for, while DDG is giving a lot of annoying crap.
Basically, when I search for the API of a specific function or for package docs on DDG, I end up with page after page of people blogging about using them, and the actual docs don't show up. So I add "!g" and the same crap is there, but the link to the reference will be somewhere on the first page of results (although Google usually has a link to an old, stale version of the docs).
Do you have specific examples of this behavior that I can look into? Also, curious if you've tried our Assist function (comes up automatically for some searches or click Assist under search box) or duck.ai for stuff like that?
I didn't realize that the context would require so much memory. Is this the KV cache? It would seem like a big advantage if this memory requirement could be reduced.
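It is indeed the KV cache: each transformer layer stores a key and a value vector per token per attention head, so the cache grows linearly with context length. A back-of-the-envelope calculation makes the scale concrete; the numbers below are illustrative (roughly Llama-2-7B-shaped at fp16), not taken from any specific deployment.

```python
# Back-of-the-envelope KV-cache size: per layer, per head, each token
# stores one key vector and one value vector (the factor of 2).
# fp16 => 2 bytes per element. Illustrative numbers only.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

gb = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096) / 1e9
print(f"{gb:.1f} GB per sequence")  # ~2.1 GB
```

This is also why techniques like grouped-query attention (fewer KV heads than query heads) and KV-cache quantization exist: shrinking `n_kv_heads` or `bytes_per_elem` directly shrinks this footprint.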
> I cannot be the first person to think about such possibilities
Differentiable Rendering [1] is the closest thing to what you are describing.
And yes, people have been working on this for the same reason you outline, it is more data/compute efficient and hence should generalize better.
But also:
> While cool, this also seems utterly wasteful. Video games offer known "analytical" solutions for the interactions that the model provides as a "statistical approximation", so to say.
A bit of the same debate as people calling LLMs a "blurry JPEG of the web" and hence useless.
Yes, this is a statistical approximation to an analytical problem... but that's a very reductive framing of what is going on.
To find the symbolic/analytical solution here would require constraining the problem greatly: not everything on the screen has a differentiable representation; for example, complex simulations might involve some kind of custom internal loop/simulation.
You waste compute to get a solution that can just be trained on billions of unlabeled (synthetic) examples, and then generalize to previously unseen prompts/environments.
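The core trick behind differentiable rendering can be shown with a toy: if the "render" is a smooth function of a scene parameter, you can follow analytic gradients of a pixel-wise loss back to that parameter. The 1D Gaussian-blob setup below is a made-up minimal example of this idea, not any particular renderer; real systems apply the same principle to meshes and lighting with far more machinery.

```python
import math

# Toy 1D "differentiable renderer": the image is a row of pixels lit by
# a Gaussian blob centered at position mu. Because each pixel is a
# smooth function of mu, we can recover the blob's position from a
# target image by gradient descent on a pixel-wise squared error.

WIDTH, SIGMA2 = 16, 4.0  # pixel count, Gaussian variance (illustrative)

def render(mu):
    return [math.exp(-(i - mu) ** 2 / (2 * SIGMA2)) for i in range(WIDTH)]

def fit(target_mu=7.0, mu=3.0, lr=0.2, steps=300):
    target = render(target_mu)
    for _ in range(steps):
        img = render(mu)
        # Chain rule: d(pixel_i)/d(mu) = pixel_i * (i - mu) / SIGMA2
        grad = sum(
            2 * (img[i] - target[i]) * img[i] * (i - mu) / SIGMA2
            for i in range(WIDTH)
        )
        mu -= lr * grad  # gradient descent on squared pixel error
    return mu

print(round(fit(), 2))  # recovers a position close to 7.0
```

The statistical world-model approach trades this kind of hand-derived gradient for brute-force learning, which is exactly the waste-vs-generality trade-off being debated above.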
Training code is only useful to people in academia, and the closest thing to "code you can modify" is open weights.
People are framing this as if there were an open-source hierarchy, with "actual" open source requiring all training code to be shared. This is not obvious to me: I don't ask people who share open-source libraries to also share the tools they used to develop them, nor to share all the design documents/architecture discussions behind the software. It's sufficient that I can take the end result and reshape it in any way I desire.
This is coming from an LLM practitioner that finetunes models for a living; and this constant debate about open-source vs open-weights seems like a huge distraction vs the impact open-sourcing something like Llama has... this is truly a Linux-like moment. (at a much smaller scale of course, for now at least)
I dunno — if an open source project required, say, a proprietary compiler, that would diminish its open source-ness. But I agree it's not totally comparable, since the weights are not particularly analogous to machine code. We probably need a new term. Open Weights.